Introducing Hazelcast ... a new way to cluster Openfire!

@Dele Olajide; @Alex: Please tell me what you did to make this plugin work. I installed it on my 2 Openfire servers and it doesn’t show any cluster members other than the current host. Thanks in advance!

The default cluster configuration uses multicast (UDP) to discover member nodes, which can be problematic in certain deployments. If you are unable to use multicast due to your network configuration, you can configure the unicast (TCP) settings as documented in the Hazelcast plugin’s readme file:


The Hazelcast plugin uses the XML configuration builder to initialize the cluster from the XML configuration file (hazelcast-cache-config.xml). By default the cluster members will attempt to discover each other via multicast at the following location:

  • IP Address: 224.2.2.3
  • Port: 54327

Note that these values can be overridden in the plugin’s /classes/hazelcast-cache-config.xml file (via the multicast-group and multicast-port elements). Many other initialization and discovery options exist, as documented in the Hazelcast configuration docs noted above. For example, to set up a two-node cluster using well-known DNS name/port values, try the following alternative:

...
<join>
   <multicast enabled="false"/>
   <tcp-ip enabled="true">
     <hostname>of-node-a.example.com:5701</hostname>
     <hostname>of-node-b.example.com:5701</hostname>
   </tcp-ip>
   <aws enabled="false"/>
</join>
...
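Since the TCP member list is static, it may be easier to generate the join block from a host list than to edit it by hand on every node. The following is only a rough sketch: the function names are made up for illustration, and it reproduces just the elements shown in the readme excerpt above, nothing else from the Hazelcast schema.

```python
import xml.etree.ElementTree as ET


def build_join_element(members):
    """Build a <join> element with multicast off and TCP discovery on.

    `members` is a list of (hostname, port) pairs. The element names mirror
    the readme excerpt; this helper is hypothetical, not part of the plugin.
    """
    join = ET.Element("join")
    ET.SubElement(join, "multicast", enabled="false")
    tcp_ip = ET.SubElement(join, "tcp-ip", enabled="true")
    for host, port in members:
        ET.SubElement(tcp_ip, "hostname").text = f"{host}:{port}"
    ET.SubElement(join, "aws", enabled="false")
    return join


def render(element):
    """Serialize the element to a string, indented when the runtime supports it."""
    if hasattr(ET, "indent"):  # ET.indent exists on Python 3.9 and newer
        ET.indent(element, space="  ")
    return ET.tostring(element, encoding="unicode")


nodes = [("of-node-a.example.com", 5701), ("of-node-b.example.com", 5701)]
print(render(build_join_element(nodes)))
```

The printed fragment would still have to be pasted into the network section of hazelcast-cache-config.xml on each node.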

Hope that helps!


Thank you very much for your reply! I made the configuration change you suggested in the hazelcast-cache-config.xml file on both servers. Unfortunately it still doesn’t work for me. This is weird though, because there is no firewall in between, and from both servers I can ping the hostnames and connect with telnet to TCP port 5701.

Later edit: finally got the plugin working by enabling multicast on the local network.
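For anyone who wants to script the telnet check described above, here is a minimal sketch. The host names are the placeholders from the readme example and the function names are made up. Note that a successful TCP connect only shows the port is reachable; as this post illustrates, it does not prove that the nodes will actually join each other.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_members(members, timeout=3.0):
    """Map each 'host:port' string to its reachability."""
    return {f"{host}:{port}": can_connect(host, port, timeout) for host, port in members}


# Placeholder host names taken from the readme example; replace with real nodes.
MEMBERS = [("of-node-a.example.com", 5701), ("of-node-b.example.com", 5701)]

# Example usage (run on each node in turn):
#   for member, ok in check_members(MEMBERS).items():
#       print(member, "reachable" if ok else "NOT reachable")
```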

I was thinking about something we could check out and integrate into a build process. I have added the JAR for now, hopefully we won’t have to look at the source code…

I have a problem with that: the configuration file must be inside the plugin JAR file. Thus, whenever nodes are added or removed, we have to edit the file inside the JAR. Openfire exposes a property to specify a different configuration file, but this seems to be useless, because the plugin class loader will not look outside the JAR in the first place.

Am I missing something here? Is it possible to use a file outside the JAR?

I installed the plugin this morning for my domain. When I enable the plugin, users on my server don’t see online contacts from the other servers. Besides, they can’t join MUC rooms on other servers. Finally, PubSub doesn’t work: we can’t fetch messages from our own network, but we can see publications from other servers.

I am wondering how the traffic is distributed. For now, all connections go to one server, the first one I installed, and never to the second one.

Memory usage more than doubles when I enable the plugin.

I was in a similar situation, but was able to fix it by doing two things:

  1. enable clustering from OF console (it’s disabled by default, even if you have the plugin installed)

  2. make sure all servers are using the same DB (cluster), as some session info is saved there (e.g. offline status)

There are a few options for configuring Hazelcast:

  1. The default configuration for the Hazelcast plugin allows cluster members to join and leave the cluster dynamically, using multicast messages (UDP) to announce these membership changes. However, as described above, this approach may not be ideal for every deployment.
  2. After the Hazelcast plugin is installed, a custom configuration file can be copied into the plugin’s /classes/ directory. Note that the “hazelcast.config.xml.filename” system property should be set to the name of this file.
  3. The Hazelcast clustering plugin uses a custom class loader that searches the combined classpaths of all installed plugins for classes and other resources. As such, a simple plugin that includes a custom cluster configuration could be deployed. This is in fact the approach we use in our environment. Our plugin further customizes its plugin class loader to add an external directory to the classpath where we manage our various configurations (e.g. test vs. prod).

Hope that helps!

I tried #2, but the folder is erased and recreated on each restart. Copying a configuration file inside the JAR isn’t very promising either. I should probably take a closer look at #3, but it still looks like the configuration file must be placed inside a JAR (or not, if the plugin can add a folder to the classpath). Maybe it makes more sense to patch Openfire to add the conf folder to the plugins’ classpath? It would certainly be much cleaner than placing configuration files inside archives.

Edit: It seems Hazelcast already offers an alternative: set the configuration file by setting the hazelcast.config system property. Unfortunately, this is bypassed by the clustering plugin, which goes directly to ClasspathXmlConfig.

The plugin is enabled on both OF nodes and they use the same DB: the list of users is the same on both servers.

Any other ideas?

Edit: it’s better without French in the message ^^

Not sure what ‘plugin is enabled’ means, but I was talking about going into OF’s console (the one that runs on ports 9094/9095) and making sure clustering is enabled. Simply copying the JAR under plugins folder won’t do it.

Don’t worry, I understand a little French.

I meant I uploaded the jar file then enabled the clustering in the tab “Clustering” (By default, OF admin panel runs on ports 9090 and 9091).

Check nohup.out. You should find something like this:

INFO: [10.46.37.118]:5701 [OF37-cluster]

Members [1] {
    Member [AA.BB.CC.DD]:5701 this
}

If nodes can see each other, there should be more than one member in the group.
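If you would rather not eyeball the log, something like the following could pull the member list out of it. This is only a sketch: it assumes the log lines look like the excerpt above, and the function name and regular expressions are my own, not anything provided by Openfire or Hazelcast.

```python
import re

# Patterns modelled on the log excerpt above (assumed format).
MEMBERS_HEADER = re.compile(r"Members\s*\[(\d+)\]\s*\{")
MEMBER_LINE = re.compile(r"Member\s*\[([^\]]+)\]:(\d+)(\s+this)?")


def parse_members(log_text):
    """Return (declared_count, members) for the last Members block, or None.

    `members` is a list of (address, port, is_local) tuples, where is_local
    is True for the entry marked with 'this'.
    """
    headers = list(MEMBERS_HEADER.finditer(log_text))
    if not headers:
        return None
    last = headers[-1]
    end = log_text.find("}", last.end())
    block = log_text[last.end():end if end != -1 else len(log_text)]
    members = [
        (m.group(1), int(m.group(2)), m.group(3) is not None)
        for m in MEMBER_LINE.finditer(block)
    ]
    return int(last.group(1)), members


# Example usage, assuming the output was redirected to nohup.out:
#   with open("nohup.out") as fh:
#       print(parse_members(fh.read()))
```

A declared count of 1 with only the entry marked "this" means the node has not found any peers.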

I don’t find this file.

Strange, if you start openfire with service openfire start, this gets created under /opt/openfire/logs

Alternatively, the info we’re looking for should be on stdout. Check whether you have this redirected.

I installed the .deb provided by the OF team.

I’ve got the logs in /var/log/openfire: debug.log, error.log, info.log and warn.log. But none of these files has an entry like that.

Where is stdout?

Well, stdout is the console. On Fedora/CentOS/RHEL, it goes to nohup.out. But I don’t know about Debian. Maybe you should remove --quiet from the start command line?

Hi,

I have been looking for a clustering plugin for some time now, and it was really good news to see this post the other day.

2 questions:

So I have 2 Openfire servers with exactly the same setup. I have installed the clustering plugin on both and I see the local node on each, but I don’t see a second node on either of them!

What else do I need to configure to see the other Openfire node?

Currently they use the same database that is pointing to the same data node in a MySQL Cluster Replication (synchronized - Galera).

But at some point I want one openfire server to point to one data node and another openfire server to point at another data node so in case the data node fails I am still in business.

Is this kind of setup with a database cluster setup possible with the clustering plugin?

I really appreciate any feedback.

Thank you!

To ensure that the member nodes in an Openfire cluster are able to find each other, try using TCP-based discovery in lieu of the default UDP/multicast configuration (described above) and see if you get a better result. The upside of this approach is a reliable point-to-point communication path between servers; the main drawback of using TCP is the need for static configuration of the well-known cluster member(s).
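Before switching to TCP, it may be worth checking whether multicast traffic passes between the nodes at all. Below is a minimal probe sketch using the default group and port quoted earlier in this thread. The function names and the probe payload are made up, and the sketch assumes Openfire is stopped on the listening node (or that you pass a spare port), since otherwise Hazelcast’s own packets may show up as well.

```python
import socket
import struct

GROUP = "224.2.2.3"  # default multicast group quoted earlier in the thread
PORT = 54327         # default multicast port quoted earlier in the thread


def membership_request(group):
    """Build the ip_mreq payload: group address followed by INADDR_ANY."""
    return socket.inet_aton(group) + socket.inet_aton("0.0.0.0")


def listen_once(group=GROUP, port=PORT, timeout=10.0):
    """Join the group and wait for one datagram; return (sender_ip, data) or None."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        membership_request(group))
        sock.settimeout(timeout)
        try:
            data, sender = sock.recvfrom(1024)
        except socket.timeout:
            return None
        return sender[0], data
    finally:
        sock.close()


def send_probe(message=b"openfire-multicast-probe", group=GROUP, port=PORT, ttl=1):
    """Send one datagram to the group; ttl=1 keeps it on the local subnet."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL,
                        struct.pack("b", ttl))
        sock.sendto(message, (group, port))
    finally:
        sock.close()
```

Run listen_once() on one node and send_probe() on the other. If the listener times out and returns None, multicast is probably being filtered somewhere in between, and TCP discovery is the safer choice.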

As for your database cluster, in theory it should work just fine, but I personally have not used a multi-master setup for MySQL. Perhaps you could give it a whirl and report back with your findings.

Please also note that the new Hazelcast clustering plugin is only compatible with Openfire 3.7.2 Beta (not yet released) and newer. If you would like to try it using an older Openfire, try using this custom build which has been back-ported by Dele (one of our helpful @community advocates).

I ran into a critical bug while using the Hazelcast-based cluster.

I always get the following exception, after which the Openfire server hangs: the process still exists but it no longer does any work.

I use Openfire 3.7.2 beta with Hazelcast 2.3.1 (Hazelcast 2.4 still has this error).

2012.10.29 14:57:47 org.jivesoftware.util.cache.CacheFactory - Hazelcast Instance is not active!
java.lang.IllegalStateException: Hazelcast Instance is not active!
    at com.hazelcast.impl.FactoryImpl.initialChecks(FactoryImpl.java:711)
    at com.hazelcast.impl.MProxyImpl.beforeCall(MProxyImpl.java:102)
    at com.hazelcast.impl.MProxyImpl.access$000(MProxyImpl.java:49)
    at com.hazelcast.impl.MProxyImpl$DynamicInvoker.invoke(MProxyImpl.java:64)
    at $Proxy0.getLocalMapStats(Unknown Source)
    at com.hazelcast.impl.MProxyImpl.getLocalMapStats(MProxyImpl.java:258)
    at com.jivesoftware.util.cache.ClusteredCache.getCacheSize(ClusteredCache.java:140)
    at org.jivesoftware.util.cache.CacheWrapper.getCacheSize(CacheWrapper.java:73)
    at com.jivesoftware.util.cache.ClusteredCacheFactory.updateCacheStats(ClusteredCacheFactory.java:344)
    at org.jivesoftware.util.cache.CacheFactory$1.run(CacheFactory.java:636)

Can anybody give me some advice?