Stability/Performance Benchmarks

Hi,

We are running a large ejabberd installation (10+ distributed nodes, several tens of thousands of online users). We are currently dealing with some severe stability and performance issues which are very difficult to track down (if you have ever used Erlang and ejabberd, you know what I’m talking about).

Is there any Openfire installation with similar userbase dimensions? It would be great to hear about your experience in matters of stability and performance.

Thanks,

Helge

Hi Helge,

I assume you have read http://www.igniterealtime.org/about/OpenfireScalability.pdf and taken a look at http://www.igniterealtime.org/community/poll.jspa?pollID=1006 (whether the poll means concurrent or registered users is not clear).

Clustering for Openfire is not yet available, and when it is, it will be an Enterprise feature.

So I guess you may be the first one to run a server with 30K+ users if you give this a try. Actually, I think you will want to ask Jivesoftware for some support before trying this.

LG

Hey Helge,

We have been working on adding clustering to Openfire for the last few months. The work is already done and things are looking great. If you need to handle 10K concurrent users you may be fine even when not running in cluster mode. That means that you can even use the open source version to handle that load (note that clustering is a feature available in Openfire Enterprise). In our load tests we were able to hit 150K concurrent users in a single JVM and stopped because we ran out of users in the system.

However, if you want to avoid having a single point of failure then you are correct and you should use a cluster solution. Distributing your eggs across many baskets is a good way to reduce risk. Anyway, I guess that having 2 nodes is enough for 10K. Since installing Openfire is a breeze, I’m happy to say that setting up clustering is just as easy as installing the standalone version. Just install Openfire on as many nodes as you want using the standalone installer, drop in the enterprise plugin and enable clustering from the admin console.

In a few weeks we are going to release the first beta of Openfire 3.4.0 and Openfire Enterprise 3.4.0 (including clustering). FYI, we have been using Openfire Enterprise 3.4.0 on jivesoftware.com, igniterealtime.org and many other public servers and things are working just fine. Let me know if you want to help us test the new clustering feature.

Regards,

– Gato

LG, Gaston,

Thank you very much for your replies.

Our company’s product management has plans to scale our instant messaging service up to our complete user base within the next 12-18 months. That means we’re going to have to deal with millions of IM users, so a large cluster solution surely will be the only way to go.

Gaston, I would be interested in getting some more detailed information about the clustering you’ve built into Openfire. Did you use your own clustering framework, or did you use an existing framework like Terracotta? What about the gateways to other IM services? We’re currently running gateway instances together with ejabberd nodes on all of our servers, but we’re thinking about setting up dedicated gateway nodes to separate the gateways from core ejabberd. Would this also be a configuration scenario with Openfire Enterprise? Next question: How do you handle database connectivity? Are you using database clustering middleware like e.g. C-JDBC?

Regards,

Helge

Hey Helge,

stahlmann wrote:

Gaston, I would be interested in getting some more detailed information about the clustering you’ve built into Openfire. Did you use your own clustering framework, or did you use an existing framework like Terracotta?

Building our clustering framework from scratch would have been a Pharaonic task. We instead decided to go with Coherence from Tangosol (now part of Oracle). As I said in the clustering podcast, having a clustering framework was just the beginning since a huge amount of refactoring work was still required.

What about the gateways to other IM services? We’re currently running gateway instances together with ejabberd nodes on all of our servers, but we’re thinking about setting up dedicated gateway nodes to separate the gateways from core ejabberd. Would this also be a configuration scenario with Openfire Enterprise?

Gateways are just like any other plugin for Openfire. Plugins may provide zero, one or many XMPP services, and may be installed on one, many or all cluster nodes. When you install the plugin on all cluster nodes, the services it provides are hosted on every node. If you install it on only some cluster nodes, the service is still available to the whole cluster, but nodes that do not host it have to forward requests to the closest node that does. So let’s say that you have a cluster of 10 nodes and the plugin is installed on only 2. End users connected to any cluster node will be able to use the service. However, for users connected to one of the 8 nodes that do not host the service, their node will need to forward the traffic to the closest cluster node hosting it.
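To illustrate the forwarding rule described above, here is a minimal Python sketch. It is not Openfire code: the function names, the numbering of nodes from 0 to 9, and the "ring distance" used to decide which node is closest are all assumptions made up for the example, since the thread does not say how closeness is measured.

```python
# Hypothetical model of the forwarding rule; none of these names come from Openfire.

def ring_distance(a, b, size=10):
    """Assumed closeness metric: hops between two nodes arranged on a ring."""
    d = abs(a - b) % size
    return min(d, size - d)


def pick_service_node(user_node, hosting_nodes, distance=ring_distance):
    """Return the cluster node that serves a request coming from user_node."""
    if not hosting_nodes:
        raise ValueError("service is not installed on any cluster node")
    if user_node in hosting_nodes:
        return user_node  # served locally, nothing is forwarded
    # Otherwise forward to the closest node that hosts the service.
    return min(sorted(hosting_nodes), key=lambda n: distance(user_node, n))


# A cluster of 10 nodes with the plugin installed on only 2 of them.
hosting = {2, 7}
routes = {node: pick_service_node(node, hosting) for node in range(10)}
forwarding_nodes = [node for node, target in routes.items() if node != target]
```

With this layout, 8 of the 10 nodes end up in `forwarding_nodes`, which is the extra traffic the paragraph above refers to.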

In the case of gateways I think that it makes sense to have the gateway plugin installed on all cluster nodes. That is the best way to reduce the load on each cluster node, since each local user will use the local gateway service, thus distributing the load across the cluster. Even if you prefer to move the gateway service to a separate service grid, I think that might not be the most efficient way of taking advantage of the hardware. Let me give you an example to illustrate what I’m saying.

Let’s say that you have 10 machines available. You use 5 machines to run the XMPP server and the other 5 machines to run the gateway plugin. Now let’s assume that your 1M users use the gateway service a lot. In this architecture you end up with 5 XMPP servers doing a lot of remote calls to a service grid that has only 5 machines to attend to the requests of your users. On the other hand, if you use all 10 machines to host both the XMPP server and the gateway plugin, then you avoid lots of remote calls (i.e. less traffic and ultimately better throughput) and you now also have 10 machines doing gateway work. Moreover, afaik some legacy networks (e.g. AOL, MSN, etc.) place a limit on the number of connections that a single IP address may generate. Having more cluster nodes hosting the gateway service is another way of avoiding that limitation.
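The arithmetic behind this example can be sketched in a few lines of Python. This is only a back-of-the-envelope model under assumptions that are not in the thread: every user is treated as a gateway user, load spreads evenly across machines, and the function and field names are invented for illustration.

```python
# Rough comparison of the two layouts; a simplified model, not a benchmark.

def split_layout(machines, gateway_machines, users):
    """Dedicated gateway grid: every gateway request crosses the network."""
    return {
        "xmpp_machines": machines - gateway_machines,
        "gateway_machines": gateway_machines,
        "gateway_users_per_machine": users // gateway_machines,
        "users_needing_remote_calls": users,
    }


def colocated_layout(machines, users):
    """Gateway plugin on every XMPP node: gateway requests stay local."""
    return {
        "xmpp_machines": machines,
        "gateway_machines": machines,
        "gateway_users_per_machine": users // machines,
        "users_needing_remote_calls": 0,
    }


split = split_layout(machines=10, gateway_machines=5, users=1_000_000)
colocated = colocated_layout(machines=10, users=1_000_000)
```

Under these assumptions each gateway machine in the split layout serves 200,000 users over remote calls, while the co-located layout serves 100,000 users per machine locally.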

Next question: How do you handle database connectivity? Are you using database clustering middleware like e.g. C-JDBC?

We handle DB connectivity just like you handle it with any application server. So let’s say that in a standard web application you have 3 layers: a front layer (the browsers), a middle layer (an application server like WebSphere) and a last layer (the DB). You can enable clustering in the DB layer and not in the application layer if your bottleneck is the DB. Or you can enable clustering in the application server layer and not in the DB (because your DB does not support it or it is too expensive for your project), in which case you are still offering a better experience to the end users if an application server goes down. Or you can go with the whole package and enable clustering in both the application server layer and the DB layer.

Having said all that, Openfire does not care if your DB is a single machine or if there is a cluster of machines. If your DB does not have native support for clustering you can then use C-JDBC with Openfire.

Regards,

– Gato

Hi Gato,

Will it be possible to run two different Openfire versions within a cluster? I’m thinking of updates without downtime, especially if one has a large user base.

LG

Hi,

I’m a bit concerned by the news of clustering only available in Openfire Enterprise edition. Clustering generally makes sense when you have a lot of users (and you care about availability of course).

At $15/user/year, if you have say 10k+ users, that will add up to quite a sum. The prices of Clearspace X are more reasonable, but as I understand it, Clearspace is not an XMPP RTC server.

Will there be a specific offering for Openfire + clustering for a lot of users?

We are also very concerned about the $15/user/year fee for the Enterprise edition. I can see this pricing model is targeted at people who have fewer than 100 or so users, but we have at least 1500 concurrent users at any given time, peaking at about 2200 on most days. We have 50,000+ registered users on our system (and that number is growing rapidly). Are we really expected to pay $750,000 a year for Openfire? We have no problem paying for clustering, but the per-user-per-year pricing seems like a deal-breaker for our purposes.
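For reference, the numbers above work out as follows. This is just the multiplication spelled out in Python, using the $15/user/year figure quoted in this thread; whether the fee would apply to registered or to concurrent users is not stated here, so all three counts are shown as an assumption-free comparison rather than an actual quote.

```python
# Per-user-per-year arithmetic; the price is the figure quoted in this thread.
PRICE_PER_USER_PER_YEAR = 15  # USD


def annual_cost(users, price=PRICE_PER_USER_PER_YEAR):
    """Yearly fee in USD if every counted user is charged the same price."""
    return users * price


cost_registered = annual_cost(50_000)  # all registered users
cost_peak = annual_cost(2_200)         # daily peak of concurrent users
cost_typical = annual_cost(1_500)      # typical concurrent users
```

Counting registered users gives the $750,000 mentioned above, while counting only concurrent users would give $33,000 at peak or $22,500 at the typical load.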

Hey guys,

No need to be concerned. I’m sure we can work something out so that you (installations with a large number of users) can be happy with Openfire Enterprise. Feel free to send me a private message and I will put you in touch with someone who can help, or just send an email to sales at jivesoftware.com and we will see what we can do.

Regards,

– Gato

Awesome, thanks for the quick response, we’ll be in contact with your sales department! We are really looking forward to clustering!