Re: [xmlblaster] ClusterNode.isConnected()
Michael Lum wrote:
Hi,
I was working on a custom load balancer plugin for clustering and
message routing. I noticed some behavior that didn't seem intuitive to
me.
I modified line 144 of
src/java/org/xmlBlaster/engine/cluster/simpledomain/RoundRobin.java:
if (log.TRACE) log.trace(ME, "Selected master node id='" +
nodeDomainInfo.getClusterNode().getId() + "' from a choice of " +
nodeDomainInfoSet.size() + " nodes");
to
if (log.TRACE) log.trace(ME, "Selected master node id='" +
nodeDomainInfo.getClusterNode().getId() + "' from a choice of " +
nodeDomainInfoSet.size() + " nodes. isConnected() = " +
nodeDomainInfo.getClusterNode().isConnected() +
", and isPolling() = " + nodeDomainInfo.getClusterNode().isPolling());
I did this to get a better picture of the runtime state of these
variables when I shut down cluster members and simulate node crashes, so
that I could write my own plugin to dynamically reroute messages to
another cluster node of a lower stratum by adding some checks for
isConnected() on the cluster node being evaluated as a master candidate.
My setup was two masters of stratum 0 and one 'relay' as their slave.
I start up all 3 instances and set up a subscriber on both of the
masters. I then publish a message on the relay. All is well, as one of
the masters is chosen and picks up the message. isConnected() is true,
and isPolling() is false for that cluster node, as expected. Next, I
kill the master instance that was receiving the message. Then I
republish to the relay. However, the trace output on the relay says
that isConnected() is still true for that node, and isPolling() is also
true! I dug a little deeper and saw that it's looking at the connection
state of XmlBlasterAccess, but didn't go any deeper than that. My
intuition tells me that isConnected() and isPolling() should be mutually
exclusive, but maybe my understanding of those methods is incorrect.
One potential problem that I see is in this method in ClusterNode.java:
public int getConnectionState() throws XmlBlasterException {
   if (isConnected())
      return 0;
   if (isPolling())
      return 1;
   return 2;
}
which would return '0' even if the node is in the Polling state, because
in my example above, even though I killed the node, it is still
returning 'true' for isConnected(), as well as 'true' for isPolling().
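To illustrate the ordering problem, here is a minimal sketch in which the polling check comes first, so a node that is both "connected" and polling is reported as polling. The NodeState interface and the ALIVE/POLLING/DEAD constant names are hypothetical stand-ins for illustration; only the 0/1/2 return values come from the method quoted above, and this is not necessarily how ClusterNode.java was or should be changed.

```java
// Hypothetical stand-in for the two ClusterNode queries discussed above.
interface NodeState {
    boolean isConnected(); // true once connect() succeeded; not the live link state
    boolean isPolling();   // true while the client is trying to reconnect
}

final class ConnectionStates {
    // Constant names are assumptions; the values mirror the quoted method.
    static final int ALIVE = 0;
    static final int POLLING = 1;
    static final int DEAD = 2;

    // Check isPolling() before isConnected(), because isConnected() can stay
    // true after the remote node has been killed.
    static int getConnectionState(NodeState node) {
        if (node.isPolling()) {
            return POLLING;
        }
        if (node.isConnected()) {
            return ALIVE;
        }
        return DEAD;
    }
}
```

With this ordering, the killed master from the example (isConnected() true, isPolling() true) maps to 1 rather than 0.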
As a workaround, my custom load balancer checks to see if isPolling() is
true, and then skips that node if it is true. If it doesn't find any
nodes that have isPolling() false, it will pick the first one on the
list that isPolling() so that at least the messages are queued for
delivery when it reappears.
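The selection rule of this workaround can be sketched roughly as follows. CandidateNode, FixedNode and PollingAwareSelector are hypothetical names made up for this illustration; the real plugin works with the xmlBlaster load balancer classes (nodeDomainInfoSet, ClusterNode), which are not reproduced here.

```java
import java.util.List;

// Hypothetical, simplified view of a master candidate.
interface CandidateNode {
    String getId();
    boolean isPolling();
}

// Trivial implementation used only to try the selector out.
final class FixedNode implements CandidateNode {
    private final String id;
    private final boolean polling;

    FixedNode(String id, boolean polling) {
        this.id = id;
        this.polling = polling;
    }

    public String getId() { return id; }
    public boolean isPolling() { return polling; }
}

final class PollingAwareSelector {
    // Returns the first candidate that is not polling. If every candidate is
    // polling, returns the first one in the list so that messages are at
    // least queued for it. Returns null for an empty list.
    static CandidateNode select(List<? extends CandidateNode> candidates) {
        for (CandidateNode node : candidates) {
            if (!node.isPolling()) {
                return node;
            }
        }
        return candidates.isEmpty() ? null : candidates.get(0);
    }
}
```

Given candidates "master1" (polling) and "master2" (not polling), select() returns "master2"; if both are polling, it falls back to "master1".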
Mike
The javadoc of I_XmlBlasterAccess states:
/**
* Has the connect() method successfully passed?
* <p>
* Note that this contains no information about the current connection state
* of the protocol layer.
* </p>
* @return true If the connection() method was invoked without exception
* @see I_ConnectionHandler#isAlive()
* @see I_ConnectionHandler#isPolling()
* @see I_ConnectionHandler#isDead()
*/
boolean isConnected();
isConnected() is correctly doing what it should.
But our usage in ClusterNode.java is buggy;
I have fixed and committed it.
Thanks for finding this bug,
Marcel
--
http://www.xmlBlaster.org