REQUIREMENT

client.failsafe

Type NEW
Priority HIGH
Status CLOSED
Topic I_XmlBlasterAccess is the Java client side interface to xmlBlaster which supports polling for xmlBlaster and record/playback of messages during lost connections.
Description

This requirement discusses the strategy clients use to connect to an xmlBlaster server. An xmlBlaster server uses the same strategy to call back to a client.


Standard client connection

Let's have a look at the states of a client connection without any fail-safe behaviour:


Standard lifecycle of a remote connection as seen by a client

On startup a client tries to connect to xmlBlaster. If xmlBlaster is not found, we get an exception. In this case every client developer needs to code her own strategy for polling for xmlBlaster.
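Such a hand-written polling strategy could look like the following minimal sketch. The class `RetryingConnect` and its parameters are hypothetical and not part of the xmlBlaster API; the connect attempt is assumed to signal failure with an unchecked exception.

```java
import java.util.function.Supplier;

// Hypothetical helper, not part of xmlBlaster: repeats a connect attempt
// until it succeeds or the retry budget is used up.
final class RetryingConnect {

    /**
     * @param attempt     the connect call to try (assumed to throw on failure)
     * @param retries     retries allowed after the first attempt, -1 == forever
     * @param delayMillis pause between two attempts in milliseconds
     */
    static <T> T connect(Supplier<T> attempt, int retries, long delayMillis) {
        int failures = 0;
        while (true) {
            try {
                return attempt.get();
            } catch (RuntimeException e) {
                failures++;
                if (retries >= 0 && failures > retries) {
                    throw e; // give up and report the last failure
                }
                sleep(delayMillis);
            }
        }
    }

    private static void sleep(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // keep the interrupt flag set
            throw new IllegalStateException("interrupted while polling", ie);
        }
    }
}
```

Every client would have to repeat this kind of loop, and it does not yet record any messages sent while the server is away.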


Fail-safe mode

In fail-safe mode the state diagram of a client connection (and, similarly, of a callback connection in xmlBlaster) is a bit more complicated. Remember, however, that these problems have to be solved in any distributed environment. If we did not address them here, you would have to invent a solution yourself:


Fail-safe lifecycle of a remote connection as seen by a client

As you can see, a client connection has three states: connected, polling and dead.

On startup a client tries to connect to xmlBlaster and one of the state transitions (1), (2) or (3) happens: the client connects successfully, the client library polls for a connection, or we receive an exception.

After a connection is established we may lose it. Depending on our configuration parameters we then start polling (4) or give up (7). If we get tired of polling we give up as well (6).

If we are in the dead state, the transitions (8) and (9) show recovery possibilities. We could change the configuration in the dead state, for example by supplying another address to find xmlBlaster. Note that this feature is currently NOT implemented (marked in blue).

The method names (marked green in the drawing) are events which the client developer may listen for to do specific handling depending on the situation.
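The decision taken after a failed connect attempt or a lost connection can be sketched as a small function. This is an illustration only: the class `FailSafeStates` is hypothetical, and the sketch assumes that a delay greater than zero enables polling and that a retries value of -1 means polling forever.

```java
// Hypothetical model of the three connection states; not xmlBlaster code.
final class FailSafeStates {

    enum State { CONNECTED, POLLING, DEAD }

    /**
     * Next state after a failed connect attempt or a lost connection.
     *
     * @param delayMillis delay between polls, a value <= 0 means fail-safe mode is off
     * @param retries     maximum number of polls, -1 == poll forever
     * @param failedPolls number of polls that have already failed
     */
    static State onConnectionFailure(long delayMillis, int retries, int failedPolls) {
        if (delayMillis <= 0) {
            return State.DEAD;    // no polling configured: give up immediately
        }
        if (retries >= 0 && failedPolls >= retries) {
            return State.DEAD;    // tired of polling: give up
        }
        return State.POLLING;     // start polling or keep polling
    }
}
```

A successful poll would move the connection back to `CONNECTED`; the not yet implemented recovery from the dead state is left out of the sketch.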


What is a pubSessionId?

Every client which connects to xmlBlaster needs to supply a login name and, optionally, a public session identifier which identifies the login session. The session identifier is called public because everybody may see it. In contrast, there is also a private session ID, which is generated by xmlBlaster. It is secret, since anybody who knows it could hijack the connection.

The following figure shows the session naming conventions.


Session naming convention

A client which wants to log in to xmlBlaster needs at least a subjectId. The subjectId is a login name which must be unique in a cluster. You may not use the characters '_', ':' or '/' in your subjectId.

The public session identifier (pubSessionId) identifies every login session of a specific user. The scope of a pubSessionId is the subjectId, so a client with another login name may have the same public session ID.
The pubSessionId is an integer. Negative numbers are reserved for pubSessionIds generated by xmlBlaster, whereas a client may choose a well-known pubSessionId itself, which must be a positive integer.

Examples for valid relative names:

   joe                  // xmlBlaster generates a negative pubSessionId for us
   jack@xy.com/4        // the user 'jack@xy.com' wants to (re)connect to session 4
   averell/-2           // the second session of 'averell' (pubSessionId was generated
                        // by xmlBlaster)
      

Note that the client administrator needs to take care if he manages the pubSessionId himself. If he mistakenly starts two clients with the same relative name, they operate on the same server side session object instance and may produce unpredictable results.
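The naming rules above can be illustrated with a small parser for relative names. The class `SessionName` is a hypothetical helper for this page and not the xmlBlaster implementation; it only checks the rules stated in this section.

```java
// Hypothetical parser for relative session names such as "joe" or "jack@xy.com/4".
final class SessionName {
    final String subjectId;
    final Integer pubSessionId; // null: none given, xmlBlaster generates a negative one

    private SessionName(String subjectId, Integer pubSessionId) {
        this.subjectId = subjectId;
        this.pubSessionId = pubSessionId;
    }

    static SessionName parse(String relativeName) {
        int slash = relativeName.indexOf('/');
        String subject = slash < 0 ? relativeName : relativeName.substring(0, slash);
        // '/' cannot occur in subject here, so only '_' and ':' need a check
        if (subject.isEmpty() || subject.indexOf('_') >= 0 || subject.indexOf(':') >= 0) {
            throw new IllegalArgumentException("invalid subjectId: '" + subject + "'");
        }
        if (slash < 0) {
            return new SessionName(subject, null);
        }
        try {
            int id = Integer.parseInt(relativeName.substring(slash + 1));
            return new SessionName(subject, id);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException("pubSessionId must be an integer: " + relativeName, e);
        }
    }

    /** Negative public session IDs are reserved for xmlBlaster. */
    boolean isGeneratedByServer() {
        return pubSessionId != null && pubSessionId < 0;
    }
}
```

For example, `SessionName.parse("averell/-2")` yields the subjectId `averell` with a server generated pubSessionId, while `SessionName.parse("joe")` leaves the pubSessionId open.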


Client connection scenarios in a cluster environment

The following figure illustrates typical environments where a client must find its server and recover from lost connections in a fail-safe way.


Fail-safe client connection in cluster environments

Above you see different cluster configurations which we discuss in detail below. The basic idea behind all scenarios is that the client needs to handle lost connections or lost sessions to some degree. When the client reconnects after a failure there are two possibilities:

  1. The client gets a new pubSessionId, which means that the session information of the client is lost in the server.
    In this case the client needs to reinitialize all subscriptions as on startup. This sounds complicated, but usually it is not too difficult to code an initialize method in the client which is invoked on startup and on reconnect.
    If the xmlBlaster server supported mirroring the session information of clients in the cluster environment, the client programmer would not need to remember the current subscriptions.
  2. The client reconnects with the same pubSessionId, which means it found its server side session info object again. This is the case if it found the same xmlBlaster server instance (and the session has not timed out) or if the xmlBlaster server instances have mirrored the session information of the client.
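The two cases can be handled by one initialize method, as in the following sketch. The class `SubscriptionRestorer` is hypothetical; it assumes the client remembers the key oids it subscribed to and learns its session name after each successful connect.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper: re-issues subscriptions when the server side session was lost.
final class SubscriptionRestorer {
    private final List<String> keyOids = new ArrayList<>();
    private String lastSessionName; // null until the first connect

    /** Remember a key oid which must be subscribed after every session loss. */
    void remember(String keyOid) {
        keyOids.add(keyOid);
    }

    /**
     * Call after every successful connect or reconnect.
     *
     * @param sessionName the session name reported for this connection
     * @param subscribe   callback doing the real subscribe for one key oid
     * @return true if the subscriptions were (re)initialized
     */
    boolean onConnected(String sessionName, Consumer<String> subscribe) {
        boolean sameSession = sessionName.equals(lastSessionName);
        lastSessionName = sessionName;
        if (sameSession) {
            return false; // case 2: the server still knows our subscriptions
        }
        for (String keyOid : keyOids) {
            subscribe.accept(keyOid); // case 1, and also the very first startup
        }
        return true;
    }
}
```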

Now we look at the different cluster configurations. If you have difficulty understanding the cluster node names, you should read The Lord of the Rings by J.R.R. Tolkien first:

1. Independent server instance

A client needs to connect to only one xmlBlaster node. If the connection is lost, we poll for the same server as configured on startup.

2. Cluster collective with mirrored nodes

The client polls for two servers; as soon as it has connected to one of them it is happy. If the reconnect changed the server, say from sauron to gollum, the session information is mirrored and the client does not need to reinitialize its startup subscriptions.

In this case the client library must support multiple server addresses.
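A minimal sketch of polling several server addresses in turn is shown below. The class `AddressRotation` is hypothetical and the connect attempt is reduced to a predicate; a real client would additionally wait between two rounds.

```java
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper: tries all known server addresses round by round.
final class AddressRotation {

    /**
     * @param addresses  the configured server addresses, in order of preference
     * @param tryConnect returns true if a connection to the address succeeded
     * @param maxRounds  how often to walk through the whole list
     * @return the first address which accepted the connection, or null if none did
     */
    static String firstReachable(List<String> addresses, Predicate<String> tryConnect, int maxRounds) {
        for (int round = 0; round < maxRounds; round++) {
            for (String address : addresses) {
                if (tryConnect.test(address)) {
                    return address;
                }
            }
        }
        return null;
    }
}
```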

3. High availability (HA) cluster with mirrored nodes

If you have a smart system administrator, she will install high availability software on your servers. This allows the same IP address to be reused if one of your server machines crashes. From the client's view it always connects to the same IP address, even if the hardware behind it has changed. XmlBlaster needs to run mirrored so that all xmlBlaster server nodes have the stateful session information. Usually the HA solution uses the same hard disks (say RAID 5, hot-swappable), which makes it easy for the xmlBlaster cluster nodes to share their persistent data.

An example of a commercial HA solution is ServiceGuard from HP.

4. Master/slave cluster setup

In this setup no stateful session information is mirrored. If a client goes to the other node it needs to reinitialize its subscriptions. Note that all messages are available on both nodes as they operate in master/slave mode.

Note: The protocol used (CORBA, RMI, XMLRPC etc.) is transparently hidden. See I_XmlBlasterAccess.java for a usage description and TestFailSafe.java (testsuite) for a code example of how to use it.


Client side queuing during reconnect-polling

When a client has lost its connection to the server, the Java and C++ client libraries support, to a certain degree, client side queuing of the messages sent by the client. The following table lists which methods can be queued as tailback messages.

| Method | Priority | Comment | Java | C++ |
|---|---|---|---|---|
| connect | MAX | A connection request is queued depending on the client configuration, see the above state charts about the connect() behavior. | yes | yes |
| publish | any | publish() invocation messages are queued as is. If the client gives no key oid, the client library generates one as a combination of the relative address and a JVM/process unique timestamp and returns it for the publish() invocation (faked return). An example of a generated key oid is gen:joe/2:345212004 | yes | yes |
| subscribe | NORM | subscribe() requests can be queued. The subscriptionId is generated by the client library as a combination of the relative address and a JVM/process unique timestamp and returned by the subscribe() invocation (faked return). An example of a generated subscriptionId is __subId:joe/2:345212004. C++ note: The client side queuing is transient only. Java note: Only subscriptions done with a positive sessionId are kept in the queue. If the server is not available and a subscription is done with a negative sessionId, a RESOURCE_TEMPORARY_UNAVAILABLE exception is thrown and the subscription is not queued. | (yes) | (yes) |
| update | any | This is solved on the server side. | yes | yes |
| get | - | get() invocations are synchronous and can't be queued. An exception is thrown if you try to invoke get() without a server connection. | no | no |
| unSubscribe | NORM | unSubscribe() requests can only be queued if they contain a subscriptionId which we can use for the faked return values. If the unSubscribe contains an XPath query or a message key oid, an exception is thrown. Note that you can't trust the returned information in this case. | (yes) | no |
| erase | MIN | erase() requests can only be queued if they contain exact key oids which we can use for the faked return values. If the erase contains an XPath query, an exception is thrown. Note that you can't trust the returned information in this case. | (yes) | no |
| disconnect | MIN | A disconnect can be queued as it returns void. Note that disconnect messages are never persistent. This avoids an unexpected disconnect if a client restarts (with the same public session ID) and there is a persistent disconnect in the client side tailback queue. | yes | no |
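The faked return values for publish() and subscribe() could be generated as in the following sketch. The prefixes `gen:` and `__subId:` are taken from the examples in the table; the class `FakedIds` and the way the unique timestamp is produced are assumptions for illustration only.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical generator for client side key oids and subscription IDs.
final class FakedIds {
    private static final AtomicLong LAST = new AtomicLong();

    /** A timestamp in milliseconds which never repeats within this JVM. */
    static long uniqueTimestamp() {
        return LAST.updateAndGet(prev -> Math.max(prev + 1, System.currentTimeMillis()));
    }

    /** For example gen:joe/2:345212004 */
    static String keyOid(String relativeName) {
        return "gen:" + relativeName + ":" + uniqueTimestamp();
    }

    /** For example __subId:joe/2:345212004 */
    static String subscriptionId(String relativeName) {
        return "__subId:" + relativeName + ":" + uniqueTimestamp();
    }
}
```

Because the ID is created on the client, it can be returned immediately while the real request is still waiting in the queue.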

The Java client queuing framework supports persistent messages and supports swapping for large numbers of messages. After a client crash the persistent messages are recovered.

The C++ queuing framework is memory based or, if the sqlite database was compiled in, also persistent. In the first case the tailback messages are lost if the C++ client crashes.

Publish and update messages are sorted in the queue by the priority configured by the publisher. The priority of a connect() request is always the highest, whereas erase() and disconnect() have the lowest priority. The other requests have NORM priority.
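The sorting described above can be sketched with a standard priority queue. The class `TailbackQueue` is hypothetical, and the numeric priority values are an assumption; only their relative order matters here. Entries of equal priority keep their insertion order.

```java
import java.util.Comparator;
import java.util.PriorityQueue;

// Hypothetical model of a client side tailback queue; not the xmlBlaster queue.
final class TailbackQueue {
    static final int MIN = 0;
    static final int NORM = 5;
    static final int MAX = 9;

    private static final class Entry {
        final String method;
        final int priority;
        final long sequence; // insertion counter, keeps FIFO order within a priority

        Entry(String method, int priority, long sequence) {
            this.method = method;
            this.priority = priority;
            this.sequence = sequence;
        }
    }

    // Highest priority first, then oldest entry first.
    private final PriorityQueue<Entry> queue = new PriorityQueue<>(
            Comparator.comparingInt((Entry e) -> -e.priority)
                      .thenComparingLong(e -> e.sequence));
    private long nextSequence;

    void put(String method, int priority) {
        queue.add(new Entry(method, priority, nextSequence++));
    }

    /** Returns the next method name to send, or null if the queue is empty. */
    String take() {
        Entry entry = queue.poll();
        return entry == null ? null : entry.method;
    }
}
```

With this ordering a queued connect() is always flushed before the publishes and subscribes which depend on it, and a queued disconnect() is sent last.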

On demand a client can register a local callback listener to receive the real return values as soon as the connection is established and the tailback messages are flushed to the server, see

I_XmlBlasterAccess.registerPostSend()

Example
Java
      
// Example with default fail-safe settings,
// you can change these on command line or in a property file.
// See configuration section below

import org.xmlBlaster.client.I_ConnectionStateListener;
import org.xmlBlaster.client.qos.ConnectReturnQos;
import org.xmlBlaster.util.dispatch.ConnectionStateEnum;
import org.xmlBlaster.client.I_XmlBlasterAccess;
...

con.registerConnectionListener(new I_ConnectionStateListener() {
      
      public void reachedAlive(ConnectionStateEnum oldState,
                               I_XmlBlasterAccess connection) {
         ConnectReturnQos conRetQos = connection.getConnectReturnQos();
         log.info(ME, "I_ConnectionStateListener: We were lucky, connected to " +
            connection.getGlobal().getId() + " as " + conRetQos.getSessionName());
         // we can access the queue via 'connection' and for example
         // erase the entries:
         //connection.getQueue().clear();
      }
      public void reachedPolling(ConnectionStateEnum oldState,
                                 I_XmlBlasterAccess connection) {
         log.warn(ME, "I_ConnectionStateListener: No connection to " +
                  connection.getGlobal().getId() + ", we are polling ...");
      }
      public void reachedDead(ConnectionStateEnum oldState,
                              I_XmlBlasterAccess connection) {
         log.warn(ME, "I_ConnectionStateListener: Connection to " +
                  connection.getGlobal().getId() + " is DEAD");
      }
   });
 
ConnectReturnQos conRetQos = con.connect(qos, this);

log.info(ME, "Connected to xmlBlaster.");
...
      
   
Example
Java
      
import org.xmlBlaster.client.qos.ConnectQos;
import org.xmlBlaster.util.XmlBlasterException;
import org.xmlBlaster.util.qos.address.Address;

// Example with hard coded fail-safe settings
// Here the callback methods are not shown
try {
   con = glob.getXmlBlasterAccess();

   ConnectQos connectQos = new ConnectQos(glob, loginName, passwd);

   // Setup fail-safe handling ...
   connectQos.getClientQueueProperty().setMaxEntries(1000); // queue up to 1000 messages
   Address addressProp = connectQos.getAddress();
   addressProp.setDelay(4000L);      // retry connecting every 4 sec
   addressProp.setRetries(-1);       // -1 == forever
   addressProp.setPingInterval(0L);  // switched off
   
   con.registerConnectionListener(this);

   // and do the login ...
   con.connect(connectQos, this);  // Login to xmlBlaster, register for updates
}
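catch (XmlBlasterException e) {
   // report why the login failed instead of dropping the exception
   Log.warn(ME, "setUp() - login failed: " + e.getMessage());
   fail("setUp() - login failed: " + e.getMessage());
}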
 
      
   
Example
Java

Here are some command line examples which you can use to play with and test reconnecting to the same session without losing the subscriptions. Messages published while the subscriber is offline are queued and delivered when the subscriber restarts.

// set your CLASSPATH to xmlBlaster.jar

// Start the server, type 'd' to get an internal dump:

java org.xmlBlaster.Main

// Start a publisher in interactive mode (more options are written out to play with):

java javaclients.HelloWorldPublish -interactive true -numPublish 10 -oid Hello
                                   -persistent true -session.name publisher/1 

// Start a subscriber in interactive mode (more options are written out to play with):

java javaclients.HelloWorldSubscribe -session.name joeSubscriber/5 -oid Hello
                               -dispatch/callback/retries -1  -persistentSubscribe true

// - Now you can kill 'HelloWorldSubscribe' (without unSubscribe/disconnect)
// - Publish some messages with the still running 'HelloWorldPublish'
// - Restart 'HelloWorldSubscribe' and you should receive the queued messages without
//   subscribing again.
// - Type 'd' in the server console again and look at the dump
    

Note the command line settings of HelloWorldSubscribe. We have set -dispatch/callback/retries -1, which tells xmlBlaster to poll forever if the client's callback server is not reached. If you don't do that, the default is for xmlBlaster to kill the client's session as soon as the callback server no longer responds to pings.

With -persistentSubscribe true the subscription is persistently saved on the server, so after a subscriber crash or a server crash (or both) the subscription is still there and no messages are left out.

The other important setting is -session.name joeSubscriber/5. We have not only passed the subjectId (loginName) joeSubscriber but also forced the use of the public session ID 5. This way we can always reconnect to the session joeSubscriber/5 as long as this session is still alive.

While the server is not available, the publisher queues its messages in a persistent client side queue.

Configure

These parameters configure the client's fail-safe behavior.

Example:

java HelloWorld4 -queue/callback/maxEntries 20000 -retries -1 -session.name joe/2

| Property | Default / Example | Description | Impl |
|---|---|---|---|
| pingInterval | 10000 | Interval in milliseconds at which the client pings xmlBlaster to check that the connection is OK. | yes |
| retries | -1 (forever) | How often to retry to establish a new connection to xmlBlaster on failure. | yes |
| delay | 4000 | Delay between connection retries in milliseconds. A delay value > 0 switches fail-safe mode on, 0 switches it off. | yes |
| queue/connection/maxEntries | 1000 | The maximum allowed number of messages in the client side queue. 0 switches recording of invocations (tailback messages) off. -1 sets it to unlimited. | yes |
| dispatch/callback/retries | 0 | How often the xmlBlaster server shall retry to reach our client side callback server on failure. Defaults to 0, which destroys the client's session if the client's callback server is not reachable. -1 sets it to try forever and preserves the client session even if the client is offline. | yes |
| dispatch/callback/delay | 60000 | See delay above, but this value configures the delay of the xmlBlaster server callback framework. | yes |
| queue/callback/maxEntries | 1000 | The maximum allowed number of messages in the server side callback queue for this client. -1 sets it to unlimited. | yes |

NOTE: Configuration parameters are specified on the command line (-someValue 17) or in the xmlBlaster.properties file (someValue=17). See requirement "util.property" for details.
Columns named Impl tell you if the feature is implemented.
Columns named Hot tell you if the configuration is changeable in hot operation.
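The relation between the command line form and the property form can be sketched as follows. The class `FailSafeSettings` is a hypothetical illustration and not the xmlBlaster property framework; the defaults for delay and retries are the ones listed in the table above.

```java
import java.util.Properties;

// Hypothetical helper: maps "-key value" command line pairs to "key=value" properties.
final class FailSafeSettings {

    static Properties fromArgs(String... args) {
        if (args.length % 2 != 0) {
            throw new IllegalArgumentException("expected '-key value' pairs");
        }
        Properties props = new Properties();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-") || args[i].length() < 2) {
                throw new IllegalArgumentException("not a '-key' argument: " + args[i]);
            }
            // "-retries" "-1" becomes retries=-1
            props.setProperty(args[i].substring(1), args[i + 1]);
        }
        return props;
    }

    static long delayMillis(Properties props) {
        return Long.parseLong(props.getProperty("delay", "4000"));
    }

    static int retries(Properties props) {
        return Integer.parseInt(props.getProperty("retries", "-1"));
    }

    /** A delay value > 0 switches fail-safe mode on, 0 switches it off. */
    static boolean failSafeEnabled(Properties props) {
        return delayMillis(props) > 0;
    }
}
```

Calling `FailSafeSettings.fromArgs("-retries", "-1", "-session.name", "joe/2")` yields the same two entries that a property file with the lines `retries=-1` and `session.name=joe/2` would contain.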

See REQ queue
See REQ client.failsafe.ping
See REQ cluster
See REQ util.recorder
See REQ util.recorder.persistence
See API org.xmlBlaster.client.I_XmlBlasterAccess
See TEST org.xmlBlaster.test.authenticate.TestSessionReconnect
See TEST org.xmlBlaster.test.client.TestFailSafe
See TEST org.xmlBlaster.test.client.TestFailSafeAsync

This page is generated from the requirement XML file xmlBlaster/doc/requirements/client.failsafe.xml
