
Re: [xmlblaster] xmlBlaster performance issues



Nicholas Piegdon wrote:
Hello,

We've been using xmlBlaster for over a year now in one of our projects. We've slowly been adding connected clients, increasing the number of messages we send, and growing the size of the messages. Just recently we seem to have hit the limits of the xmlBlaster server. However we're still dramatically under the performance listed on the "performance" page of the xmlBlaster.org site.

Here's our usage model:
- 4-5 clients using XmlBlasterAccessUnparsed in the C API.
- 2-3 clients using the Java API.
- A couple of the C clients actually make more than one connection to the server. (The system architecture for our project dictates a few "conceptual" components that don't match 1:1 with our actual running processes.)


We have 4 high-end machines (2-4 GB RAM, dual-core, dual-processor Xeon/64-bit Opteron depending) running both Linux and Windows. All 4 have gigabit Ethernet connected to a single switch. We spread our applications between the machines relatively evenly.

The messages are all point-to-point save for a single "monitoring" component that just starts up and subscribes to '*' with a graphical front end to verify the correctness of the information we're passing around. We believe we're using the Socket protocol in all of our components.

The messages have a minimal key but contain an average of maybe 1k-4k in their content section (maxing out around maybe 10k).
10k * 10 clients * 300 mps = 30 MByte/sec = 240 MBit/sec
so I expect the GBit Ethernet to become a bottleneck sooner or later,
and Ethernet latency will have an impact as well.
Using zlib compression will help, and using publishArr will help.
Using publishOneway will help (if you can afford to have no ACK).
For the subscribers, using updateOneway will help and burst mode should help.
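
The estimate above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes "10k" means 10,000 bytes (the reported maximum message size, not the 1k-4k average), ignores protocol and XML overhead, and the function name `required_bandwidth` is made up for illustration.

```python
def required_bandwidth(msg_bytes, clients, msgs_per_sec):
    """Return (MByte/sec, MBit/sec) for the given message load.

    Assumption: every message has the full size and is carried once
    per client, as in the estimate quoted above.
    """
    bytes_per_sec = msg_bytes * clients * msgs_per_sec
    mbytes_per_sec = bytes_per_sec / 1_000_000
    mbits_per_sec = mbytes_per_sec * 8
    return mbytes_per_sec, mbits_per_sec


mbytes, mbits = required_bandwidth(10_000, 10, 300)
print(f"{mbytes:.0f} MByte/sec = {mbits:.0f} MBit/sec")

# Share of a nominal 1 GBit link (1000 MBit/sec), before any overhead.
print(f"share of a 1 GBit link: {mbits / 1000:.0%}")
```

With the 1k-4k average message size the same formula gives 24 to 96 MBit/sec, so the worst case is what pushes the link toward being a bottleneck.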

- We send messages in the C client using publish (vs. publishMsgArr).
- We altered xmlBlaster.properties by uncommenting the first 7 lines, and we saw a large performance gain.
Yes, the hard disk is the biggest bottleneck; if transient messages are OK for you, this will help.
- We defined XB_USE_PTHREADS and saw a performance gain.
Here I wouldn't expect any change.

- We tried to enable burst mode on a few of the C clients but didn't see much of a difference. (The "Client features" page says that burst mode isn't implemented in the C client though.)
It is a feature of the xmlBlaster server: the callback will collect messages for the given millis and
send them in a bunch. This saves a lot of overhead, but failure handling is all or nothing. Again,
it's up to the use case whether you can afford this.
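
To illustrate the idea of collecting messages for a given number of milliseconds, here is a toy model in Python. It is not xmlBlaster's implementation and uses no xmlBlaster API; the function name `group_into_bursts`, the parameter `collect_ms`, and the sample arrival times are all hypothetical.

```python
def group_into_bursts(arrivals, collect_ms):
    """Group (arrival_ms, message) pairs into bursts.

    A burst opens with the first message that arrives while no burst is
    open and closes collect_ms later; every message arriving in between
    is delivered together in one callback.
    """
    bursts = []
    window_end = None
    for arrival_ms, message in sorted(arrivals):
        if window_end is None or arrival_ms >= window_end:
            bursts.append([])                     # open a new burst
            window_end = arrival_ms + collect_ms
        bursts[-1].append(message)
    return bursts


arrivals = [(0, "a"), (5, "b"), (19, "c"), (20, "d"), (70, "e")]
bursts = group_into_bursts(arrivals, collect_ms=20)
print(bursts)                     # [['a', 'b', 'c'], ['d'], ['e']]
print(len(arrivals), "messages in", len(bursts), "callbacks")
```

In this model five messages need only three callbacks, which is where the saved overhead comes from. The trade-off mentioned above also shows up: if delivery of the first burst fails, "a", "b" and "c" fail together.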



We ran the first test you have listed on the performance page (org.xmlBlaster.test.stress.LoadTestSub):
You: (600 MHz K7) 672 mps.
Us: (3.0 GHz dual-Xeon) 33 mps.
Us: (after uncommenting 7 lines in xmlBlaster.properties) 300 mps.
Yes, the LoadTestSub code and the performance page are outdated; we should urgently update
them to avoid misleading figures.

The 7 lines did help, but in our actual system, we're still not seeing anything close to that 300 mps. In particular, the C client responsible for most of the messages in the system appears to sleep (CPU usage near 0) whenever delivering a lot of messages back-to-back (even with XB_USE_PTHREADS). This causes the GUI in that app to refresh at only a frame every second or two.


We suspected the single "subscribe '*'" client might have been the culprit for hitting the performance, but taking it out of the equation doesn't help much.

I was wondering if you have a list of tips for squeezing more performance out of the client and server:
- Is there a way to enable asynchronous send?
- Should we use "publish message array" in the C client?
- Should we use "publish one way"?
- Does compiling in release mode speed things along significantly? (We tried on the C clients, but didn't see much difference.)
- Will compression help? (I suspect the message-send overhead is worse than any kind of bandwidth limitations we might be running into.)
- Is there anything else we're missing?


Thanks for any assistance you can offer.
I believe the bottleneck is the xmlBlaster server and probably the network.
The C clients don't use significant CPU time, so I wouldn't spend time on C for performance issues.
Additionally, the Java-based subscribers may consume quite a lot of CPU.


For the xmlBlaster server, the last years were spent adding more and more features,
but this is more or less done now (most are handled by specific plugins which only slow
things down if switched on), and the server is becoming more and more stable (currently
there are some 4 or 5 important open issues I know of, which occur in specific configuration
combinations).


I think the next round is to keep an eye on performance again. A performance analysis and tuning
of the major bottlenecks in the server should significantly
increase message throughput, but I doubt that we can reach the former numbers again.


regards
Marcel