Hi Marcel,
I have modified the TopicContainer class as follows (in order to help
me track down the problem):
private final class TopicContainer {
private volatile TopicHandler topicHandler;
private final boolean fairness = false;
private final ReentrantLock lock = new DebugReentrantLock(fairness);
public TopicContainer(TopicHandler topicHandler) {
this.topicHandler = topicHandler;
}
public TopicHandler getTopicHandler() {
return this.topicHandler;
}
public void erase() {
if (!this.lock.isHeldByCurrentThread())
throw new IllegalStateException("must be locked before
erased");
if (this.topicHandler == null)
return;
synchronized (this.lock) {
this.topicHandler = null;
int c = this.lock.getHoldCount();
for (int i = 0; i < c; i++)
this.lock.unlock();
if (this.lock.isLocked()) log.severe("Must be unlocked!");
}
}
public TopicHandler lock() {
// topicHandler already erased
if (this.topicHandler == null)
return null;
try {
this.lock.tryLock(60, TimeUnit.SECONDS);
} catch (InterruptedException e) {
throw new RuntimeException("Interrupted while locking", e);
}
// topicHandler might have been erased while locking
TopicHandler th = this.topicHandler;
if (th == null) {
this.lock.unlock();
return null;
}
return th;
}
public void unlock() {
synchronized (this.lock) {
if (this.lock.getHoldCount() > 0) // returns 0 if we are
not the
// holder
this.lock.unlock(); // IllegalMonitorStateException if our
// thread is not the holder of the lock,
// never happens because of above if()
}
}
} // class TopicContainer
There are some additional checks as you can see. Additionally, I'm
using tryLock with a timeout of one minute. I also extended
ReentrantLock as DebugReentrantLock which saves the last stacktraces
of the thread which aquired the lock:
public class DebugReentrantLock extends ReentrantLock {
private static final Logger
logger=Logger.getLogger("DebugReentrantLock");
private ArrayList holderTrace=new ArrayList();
public DebugReentrantLock() {
}
public DebugReentrantLock(boolean fair) {
super(fair);
}
public boolean tryLock() {
boolean ret=super.tryLock();
synchronized(this) {
if (ret) {
if (getHoldCount()==1) holderTrace.clear();
holderTrace.add(new Exception());
}
return ret;
}
}
public boolean tryLock(long timeout, TimeUnit unit) throws
InterruptedException {
boolean ret=super.tryLock(timeout, unit);
synchronized(this) {
if (ret) {
if (getHoldCount()==1) holderTrace.clear();
holderTrace.add(new Exception());
}
else {
Thread owner=getOwner();
Exception ownerE=new Exception();
ownerE.setStackTrace(owner.getStackTrace());
logger.log(Level.SEVERE, "trylock failed, locked by", ownerE);
Iterator i=holderTrace.iterator();
while (i.hasNext()) {
logger.log(Level.SEVERE, "lock() called at",
(Exception)i.next());
}
logger.log(Level.SEVERE, "failed to lock() at", new
Exception());
}
return ret;
}
}
public void lock() {
super.lock();
synchronized(this) {
if (getHoldCount()==1) holderTrace.clear();
holderTrace.add(new Exception());
}
}
public void lockInterruptibly() throws InterruptedException {
super.lockInterruptibly();
synchronized(this) {
if (getHoldCount()==1) holderTrace.clear();
holderTrace.add(new Exception());
}
}
}
I hope this will yield some results, I'm trying to stress test the
server now. Unfortunately this bug only occurred on the production
server so far.
Feel free to tell me any idea you might have. :)
regards,
Balázs Póka