[xmlblaster] ClusterNode.isConnected()
Hi,
I was working on a custom load balancer plugin for clustering and message routing. I
noticed some behavior that didn't seem intuitive to me.
I modified line 144 of src/java/org/xmlBlaster/engine/cluster/simpledomain/RoundRobin.java:
    if (log.TRACE) log.trace(ME, "Selected master node id='" +
        nodeDomainInfo.getClusterNode().getId() + "' from a choice of " +
        nodeDomainInfoSet.size() + " nodes");
to
    if (log.TRACE) log.trace(ME, "Selected master node id='" +
        nodeDomainInfo.getClusterNode().getId() + "' from a choice of " +
        nodeDomainInfoSet.size() + " nodes. isConnected() = " +
        nodeDomainInfo.getClusterNode().isConnected() + ", and isPolling() = " +
        nodeDomainInfo.getClusterNode().isPolling());
I did this to get a better picture of the runtime state of these flags when I shut down
cluster members and simulate node crashes. The goal is to write my own plugin that
dynamically reroutes messages to another cluster node of a lower stratum, by checking
isConnected() on each cluster node being evaluated as a master candidate.
My setup is two masters of stratum 0 and one 'relay' as their slave. I start up all 3
instances and set up a subscriber on each of the masters. I then publish a message on the
relay. All is well, as one of the masters is chosen and picks up the message.
isConnected() is true and isPolling() is false for that cluster node, as expected. Next,
I kill the master instance that was receiving the message. Then I republish to the relay.
However, the trace output on the relay says that isConnected() is still true for that
node, and isPolling() is also true! I dug a little deeper and saw that it's looking at
the connection state of XmlBlasterAccess, but I didn't go any deeper than that. My
intuition tells me that isConnected() and isPolling() should be mutually exclusive, but
maybe my understanding of those methods is incorrect.
One potential problem that I see is in this method in ClusterNode.java:
    public int getConnectionState() throws XmlBlasterException {
        if (isConnected())
            return 0;
        if (isPolling())
            return 1;
        return 2;
    }
which would return '0' even if the node is in the Polling state, because in my example
above, even though I killed the node, it is still returning 'true' for isConnected(), as
well as 'true' for isPolling().
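
To make the ordering issue concrete, here is a minimal standalone sketch. It does not use
any xmlBlaster classes: NodeFlags, connectedFirst() and pollingFirst() are hypothetical
names invented for illustration, and only the return codes 0/1/2 are taken from the method
quoted above. It assumes a node that reports true for both flags, as in my test.

```java
public class ConnectionStateSketch {
    /** Hypothetical stand-in for the two flags reported by a cluster node. */
    public interface NodeFlags {
        boolean isConnected();
        boolean isPolling();
    }

    /** Same check order as the quoted method: isConnected() wins. */
    public static int connectedFirst(NodeFlags n) {
        if (n.isConnected()) return 0;
        if (n.isPolling()) return 1;
        return 2;
    }

    /** Alternative order (an assumption, not xmlBlaster code): isPolling() wins. */
    public static int pollingFirst(NodeFlags n) {
        if (n.isPolling()) return 1;
        if (n.isConnected()) return 0;
        return 2;
    }

    /** Builds a fixed pair of flags for experimenting. */
    public static NodeFlags flags(final boolean connected, final boolean polling) {
        return new NodeFlags() {
            public boolean isConnected() { return connected; }
            public boolean isPolling() { return polling; }
        };
    }

    public static void main(String[] args) {
        // The killed master in my test reported true for both flags.
        NodeFlags killed = flags(true, true);
        System.out.println("connectedFirst: " + connectedFirst(killed)); // prints 0
        System.out.println("pollingFirst:   " + pollingFirst(killed));   // prints 1
    }
}
```

Whether swapping the order is the right fix depends on what isConnected() is meant to
report, which I have not verified.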
As a workaround, my custom load balancer skips any node for which isPolling() is true. If
every node is polling, it picks the first polling node on the list, so that at least the
messages are queued for delivery when that node reappears.
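
The selection rule of the workaround looks roughly like the sketch below. This is not the
actual plugin code: Node and select() are hypothetical names, and the polling field stands
in for whatever nodeDomainInfo.getClusterNode().isPolling() returns for each candidate.

```java
import java.util.Arrays;
import java.util.List;

public class PollingAwareSelector {
    /** Hypothetical candidate: an id plus the polling flag read from the cluster node. */
    public static final class Node {
        public final String id;
        public final boolean polling;

        public Node(String id, boolean polling) {
            this.id = id;
            this.polling = polling;
        }
    }

    /**
     * Returns the first candidate that is not polling. If every candidate is
     * polling, falls back to the first one so messages are still queued for it.
     * Returns null for an empty list.
     */
    public static Node select(List<Node> candidates) {
        for (Node n : candidates) {
            if (!n.polling) {
                return n;
            }
        }
        return candidates.isEmpty() ? null : candidates.get(0);
    }

    public static void main(String[] args) {
        List<Node> oneAlive = Arrays.asList(new Node("master1", true), new Node("master2", false));
        List<Node> allDown = Arrays.asList(new Node("master1", true), new Node("master2", true));
        System.out.println(select(oneAlive).id); // prints master2
        System.out.println(select(allDown).id);  // prints master1
    }
}
```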
Mike