Can you please send the server log as well?
Thanks Marcel
I am not sure why xmlBlaster does not respond. Here is the stack trace from when the client hangs waiting for a response.
"DomainCheckTimer" prio=6 tid=0x033a29b0 nid=0x182c in Object.wait() [0x03ebf000..0x03ebfbec]
at java.lang.Object.wait(Native Method)
at EDU.oswego.cs.dl.util.concurrent.Latch.attempt(Latch.java)
- locked <0x198226e0> (a EDU.oswego.cs.dl.util.concurrent.Latch)
at org.xmlBlaster.util.protocol.RequestReplyExecutor.requestAndBlockForReply(RequestReplyExecutor.java:629)
at org.xmlBlaster.client.protocol.socket.SocketConnection.subscribe(SocketConnection.java:469)
at org.xmlBlaster.client.dispatch.ClientDispatchConnection.subscribe(ClientDispatchConnection.java:282)
at org.xmlBlaster.client.dispatch.ClientDispatchConnection.doSend(ClientDispatchConnection.java:150)
at org.xmlBlaster.util.dispatch.DispatchConnection.send(DispatchConnection.java:231)
at org.xmlBlaster.util.dispatch.DispatchConnectionsHandler.send(DispatchConnectionsHandler.java:435)
at org.xmlBlaster.util.dispatch.DispatchWorker.run(DispatchWorker.java:70)
at org.xmlBlaster.util.dispatch.DispatchManager.putPre(DispatchManager.java:561)
at org.xmlBlaster.util.queue.cache.CacheQueueInterceptorPlugin.put(CacheQueueInterceptorPlugin.java:476)
at org.xmlBlaster.util.queue.cache.CacheQueueInterceptorPlugin.put(CacheQueueInterceptorPlugin.java:456)
at org.xmlBlaster.client.XmlBlasterAccess.queueMessage(XmlBlasterAccess.java:820)
at org.xmlBlaster.client.XmlBlasterAccess.subscribe(XmlBlasterAccess.java:862)
at com.orci.datagateway.ClientKit.DGDomain.subscribe(DGDomain.java:327)
at com.orci.datagateway.ClientKit.DGConnection.addDomain(DGConnection.java:146)
- locked <0x17feb2c0> (a com.orci.datagateway.ClientKit.DGConnection)
at com.orci.datagateway.ClientKit.DGDomain.connect(DGDomain.java:195)
at com.orci.datagateway.ClientKit.DGDomain.updateCurrentState(DGDomain.java:264)
at com.orci.datagateway.ClientKit.DGDomain.checkState(DGDomain.java:216)
at com.orci.datagateway.ClientKit.DGDomain$CheckDomainsTask.run(DGDomain.java:404)
at java.util.TimerThread.mainLoop(Timer.java:512)
at java.util.TimerThread.run(Timer.java:462)
The problem seems to occur when we use the SocketSSL protocol and the server dies and comes back up. The client then tries to reconnect and resubscribe to the blaster. This does not happen every time, only occasionally, and unfortunately we have not been able to isolate the exact sequence of events that causes it. We are hoping that a timeout will fix the problem.
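To make clear what kind of timeout we have in mind, here is a minimal sketch of a bounded wait for a reply. It uses the JDK's java.util.concurrent.CountDownLatch rather than the EDU.oswego Latch from the stack trace, and the names BoundedReplyWait and PendingReply are hypothetical stand-ins, not xmlBlaster classes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class BoundedReplyWait {

    /** Hypothetical stand-in for one outstanding request; not an xmlBlaster class. */
    static final class PendingReply {
        private final CountDownLatch latch = new CountDownLatch(1);
        private volatile String reply;

        /** Called by the thread that reads the socket when the reply arrives. */
        void complete(String value) {
            reply = value;
            latch.countDown();
        }

        /** Blocks for at most timeoutMillis, then fails instead of hanging forever. */
        String await(long timeoutMillis) throws InterruptedException, TimeoutException {
            if (!latch.await(timeoutMillis, TimeUnit.MILLISECONDS)) {
                throw new TimeoutException("no reply within " + timeoutMillis + " ms");
            }
            return reply;
        }
    }

    public static void main(String[] args) throws Exception {
        // Case 1: the reply arrives, so the caller gets it back.
        PendingReply answered = new PendingReply();
        new Thread(() -> answered.complete("subscribeReturnQos")).start();
        System.out.println("got: " + answered.await(1000));

        // Case 2: the reply never arrives, so the caller is released by the timeout.
        PendingReply lost = new PendingReply();
        try {
            lost.await(100);
        } catch (TimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
    }
}
```

With a finite timeout the calling thread gets an exception it can handle (for example by dropping the connection and retrying) instead of blocking while it still holds its own locks.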
David
Marcel Ruff wrote:
David Robison wrote:
I noticed that the default response timeout was changed from one minute to MAX_INT in release 1.0.7. Was there a reason for this change? We have run into a situation where we issue a subscribe command but never get a response. With the MAX_INT timeout, the program hangs. I was thinking about resetting the response timeout to one minute but wanted to make sure that this would not cause additional problems.
Thanks, David Robison
Why is the response not returning?
client --> subscribe --> xmlBlaster
       <--- subscribeReturnQos <--
Is it xmlBlaster that is not responding? I think the first step is to track down the reason for the missing response.
We introduced the MAX_INT mainly for callbacks. If the callback carries
a huge message over a modem (taking, for example, half a day),
it may cause a ping to be blocked, which in turn aborts the transfer, and the callback dispatcher goes to polling.
The transfer is then retried, fails again, and so on.
This situation is not as critical anymore, as the current xmlBlaster
detects the bytes transferred and omits the ping if bytes are traveling over the socket.
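As a rough illustration of that idea only (this is not the actual xmlBlaster code, and PingGuard with its methods is a hypothetical name), a ping scheduler can skip the ping whenever the byte counter has moved since the last check:

```java
import java.util.concurrent.atomic.AtomicLong;

public class PingGuard {
    // Total bytes moved over the socket, updated by the I/O path.
    private final AtomicLong bytesSeen = new AtomicLong();
    // Counter value observed at the previous ping interval.
    private long bytesAtLastCheck = 0;

    /** Called by the (hypothetical) socket read/write path for every chunk moved. */
    public void onBytesTransferred(int count) {
        bytesSeen.addAndGet(count);
    }

    /** Called once per ping interval; returns true only if the socket was idle. */
    public synchronized boolean shouldPing() {
        long current = bytesSeen.get();
        boolean idle = (current == bytesAtLastCheck);
        bytesAtLastCheck = current;
        return idle;
    }

    public static void main(String[] args) {
        PingGuard guard = new PingGuard();
        guard.onBytesTransferred(4096);          // a large message is in flight
        System.out.println(guard.shouldPing());  // traffic seen, ping skipped
        System.out.println(guard.shouldPing());  // nothing moved since, ping needed
    }
}
```

The point of the design is that ongoing traffic already proves the connection is alive, so a slow transfer is not aborted by a ping that cannot get through.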
Yes, I would say you can reduce the MAX_INT in your case.
regards, Marcel
-- Marcel Ruff http://www.xmlBlaster.org