Introducing Hazelcast ... a new way to cluster Openfire!

Amazing work! Thanks.

Very nice! Glad that Openfire is gaining traction again.

I might have a look at it again… If I don’t run into the memory issues I had the last time I tried, it might be a good alternative.

I just deployed a second server for my domain (forumanalogue.fr). I had already tried the previous plugin, but it wasn’t simple and didn’t work very well.

This new plugin is easy to install and works immediately, as long as you don’t forget to use the same database for the two nodes ^^

The two servers run Openfire 3.7.1 on Ubuntu 12.04 LTS with a MySQL database. Both are on my local network but are open servers (free registration; cf. http://xmpp.net/). I’ve got over 460 users.

Edit: It doesn’t work that well afterwards; cf. http://community.igniterealtime.org/message/225398#225398

I ran into a critical bug while using a Hazelcast-based cluster.

I always get the following exception, after which the Openfire server hangs: the process still exists but no longer does any work.

I use Openfire 3.7.2 beta with Hazelcast 2.3.1 (Hazelcast 2.4 still has this error).

2012.10.29 14:57:47 org.jivesoftware.util.cache.CacheFactory - Hazelcast Instance is not active!
java.lang.IllegalStateException: Hazelcast Instance is not active!
at com.hazelcast.impl.FactoryImpl.initialChecks(FactoryImpl.java:711)
at com.hazelcast.impl.MProxyImpl.beforeCall(MProxyImpl.java:102)
at com.hazelcast.impl.MProxyImpl.access$000(MProxyImpl.java:49)
at com.hazelcast.impl.MProxyImpl$DynamicInvoker.invoke(MProxyImpl.java:64)
at $Proxy0.getLocalMapStats(Unknown Source)
at com.hazelcast.impl.MProxyImpl.getLocalMapStats(MProxyImpl.java:258)
at com.jivesoftware.util.cache.ClusteredCache.getCacheSize(ClusteredCache.java:140)
at org.jivesoftware.util.cache.CacheWrapper.getCacheSize(CacheWrapper.java:73)
at com.jivesoftware.util.cache.ClusteredCacheFactory.updateCacheStats(ClusteredCacheFactory.java:344)
at org.jivesoftware.util.cache.CacheFactory$1.run(CacheFactory.java:636)

Can anybody give me some advice?

I tried this on AWS EC2. Everything works, but when I place the instances behind an AWS load balancer, I am unable to connect to the nodes through it. I opened all the necessary ports on the load balancer. If anyone has worked with AWS instances and a load balancer, please help.

I have not used the AWS load balancer, but here are a few things to keep in mind for load balancing in general:

  • Depending on which protocol you are using, you will need to configure the load balancer to use TCP (5222), HTTP (7070), or HTTPS (7443) to allow XMPP clients to connect, plus additional port(s) for the admin console (9090/9091) or S2S connections (5269), etc. as needed for your particular deployment.
  • You will need to have a valid health check for each clustered member. In our case we use a simple index.html on the BOSH port for this purpose (served from the {openfire_home}/resources/spank directory). Without a valid health check all the members will be marked as unavailable.
  • Configure your application to send traffic to the Openfire cluster via the DNS name assigned by AWS to your load balancer.
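
The health-check point above can be sketched as a small probe that requests a static page on the BOSH port of each cluster member and treats an HTTP 200 as healthy. This is only an illustration in Python: the /index.html path, the default port argument, and the function names are assumptions chosen to match the description, not something taken from the Openfire or AWS documentation.

```python
import urllib.error
import urllib.request


def health_check_url(host, port=7070, path="/index.html"):
    """Build the URL a load balancer would poll (the path is an assumption)."""
    return f"http://{host}:{port}{path}"


def is_member_healthy(host, port=7070, path="/index.html", timeout=5.0,
                      opener=urllib.request.urlopen):
    """Return True if the member answers the health-check URL with HTTP 200."""
    url = health_check_url(host, port, path)
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout, or an HTTP error status.
        return False
```

Running is_member_healthy against each member's own address (not the load balancer's DNS name) shows whether the page the balancer polls is actually reachable on that node.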

Hope that helps … let us know how it works out.

We have two Openfire 3.7.1 (base version) servers running, both configured with the domain xmppserver. Both servers run on Windows Server, and we have modified the hosts file (C:\windows\System32\drivers\etc\hosts) to include 127.0.0.1 xmppserver. We have clustered the two servers using Hazelcast and their IP addresses, and they are working fine. We have our own developed components, such as conference, which attach to the domain, and they all work well. They work because they only provide custom entries such as search, which do not depend on the members. However, when we try to run group chat, which uses conference.xmppserver, we encounter many issues, such as "not authorised" errors. We will post the actual details later.

Issue -

We think that there are two conference servers running and that each server is still managing its own conference. Is there some configuration that we have done wrongly, or should we upgrade to Openfire 3.7.2 beta? Is this a known issue that has been fixed? We would appreciate it if anyone could point us in the right direction.

Thanks for any ideas. Happy new year guys.

Below is the error - if we uncluster the servers it starts to work again -
Below are the three scenarios we get very consistently.

When we restart the openfire server - we get this

SEND at 2:26:58 PM :

SEND at 2:26:58 PM :

RECV at 2:27:01 PM :This">ituser1@xmppserver/ Messenger">This room is now unlocked.

SEND at 2:27:06 PM :01101000011xmppserverAdmins1text

RECV at 2:27:06 PM :

01101000011xmppserverAdmins1textroom2@conference.xmppserver

However, once we have created a room after a restart, the second and later attempts behave as follows:


Trying to create room2 again

SEND at 2:20:55 PM :

SEND at 2:20:55 PM :

RECV at 2:20:57 PM :

When we tried to create another room1 by ituser1 - we got an error on presence and then the error on room creation


SEND at 2:21:06 PM :

SEND at 2:21:06 PM :

RECV at 2:21:06 PM :

RECV at 2:21:06 PM :

Also we encountered an error with the openfire admin when looking at the client session -

Exception:

java.lang.IllegalStateException: Requested node [B@7916bd not found in cluster
at com.jivesoftware.util.cache.CoherenceClusteredCacheFactory.doSynchronousClusterTask(CoherenceClusteredCacheFactory.java:325)
at org.jivesoftware.util.cache.CacheFactory.doSynchronousClusterTask(CacheFactory.java:538)
at com.jivesoftware.openfire.session.RemoteSession.doSynchronousClusterTask(RemoteSession.java:171)
at com.jivesoftware.openfire.session.RemoteSession.isSecure(RemoteSession.java:130)
at org.jivesoftware.openfire.admin.session_002dsummary_jsp._jspService(session_002dsummary_jsp.java:362)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:530)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1216)
at com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:118)
at com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:52)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)
at org.jivesoftware.util.LocaleFilter.doFilter(LocaleFilter.java:74)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)
at org.jivesoftware.util.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:50)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)
at org.jivesoftware.admin.PluginFilter.doFilter(PluginFilter.java:78)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)
at org.jivesoftware.admin.AuthCheckFilter.doFilter(AuthCheckFilter.java:165)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1187)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:425)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:494)
at org.eclipse.jetty.server.session.SessionHandler.handle(SessionHandler.java:182)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:933)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:362)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:867)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:245)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:113)
at org.eclipse.jetty.server.Server.handle(Server.java:334)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:559)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:992)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:541)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:203)
at org.eclipse.jetty.server.HttpConnection.handle(HttpConnection.java:406)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:462)
at org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:436)
at java.lang.Thread.run(Unknown Source)

Based on the messages listed, it appears you are getting a system error (500). In this case, the best source of information would be the error log from the corresponding cluster member. You would expect to see a stack trace detailing the specific error - this would be helpful for isolating the root cause.

Also, you can confirm whether the cluster members have correctly identified one another in two ways. First, using the “Clustering” tab in the admin console, you will see a list of all the members known to be a part of the cluster. If you only see one server listed, the cluster is not properly configured. The second/alternative approach is to view the system console log (typically nohup.out) where you can see the system messages emitted by the clustering component as members join/leave the cluster.

The error you observed via the “Sessions” page in the admin console is a known issue (fixed in the 3.7.2 nightly build) that may occur if a cluster member has recently been restarted. A possible workaround is to view the client sessions from the admin console running on the other cluster member.

Hi Tom,

Thank you for the great plugin. I am using the plugin version for Openfire 3.7.1 in production. The cluster works well, but I see this error in the logs:

2013.01.08 06:32:56 org.jivesoftware.openfire.interceptor.InterceptorManager - Error in interceptor: org.jivesoftware.openfire.plugin.BroadcastingPlugin@7dbd9d76 while intercepting:

java.lang.NullPointerException

Any idea what might cause this?

Thanks in advance !

Hi Don -

The best bet for troubleshooting an NPE would be to look at the stack trace that follows the given message in your error log. I’m not too familiar with the BroadcastingPlugin, but at first glance it appears to expect the “to” attribute, which is missing in the given inbound presence packet.
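
To illustrate the kind of failure described here, the following is a minimal Python sketch (not the plugin’s actual Java code) of a handler that reads the “to” attribute of a presence stanza. The stanza contents and function names are made up for the example; the point is only that a missing attribute comes back as a null value, which has to be checked before use.

```python
import xml.etree.ElementTree as ET


def presence_recipient(stanza_xml):
    """Return the 'to' address of a presence stanza, or None if it is absent."""
    element = ET.fromstring(stanza_xml)
    # Element.get returns None for a missing attribute; code that assumes a
    # value is always present is where a null dereference would occur.
    return element.get("to")


def describe(stanza_xml):
    """Handle both directed presence and presence without a 'to' attribute."""
    recipient = presence_recipient(stanza_xml)
    if recipient is None:
        return "broadcast presence (no 'to' attribute)"
    return "directed presence to " + recipient
```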

You can also try disabling/removing the broadcasting plugin (if possible) as a workaround.

Cheers,

Tom

Thank you very much for answering. Indeed removing broadcasting plugin eliminated the error. But now I am facing another issue. Conferencing isn’t working on my cluster.

sending: 4

async recv:

Hi

I just tried out the hazelcast plugin this afternoon.

I’ve already added the “hazelcast.config.xml.filename” system property. I also edited the file and added the specific addresses to the known host list.

I am getting an error that some of you might be familiar with. Can someone give me some advice on what I am doing wrong?

com.jivesoftware.util.cache.ClusteredCacheFactory - Unable to start clustering - continuing in local mode.
java.lang.NullPointerException
at com.jivesoftware.util.cache.ClusterClassLoader.getResource(ClusterClassLoader.java:79)
at java.lang.ClassLoader.getResourceAsStream(ClassLoader.java:1159)
at com.hazelcast.config.ClasspathXmlConfig.<init>(ClasspathXmlConfig.java:39)
at com.hazelcast.config.ClasspathXmlConfig.<init>(ClasspathXmlConfig.java:33)
at com.jivesoftware.util.cache.ClusteredCacheFactory.startCluster(ClusteredCacheFactory.java:121)
at org.jivesoftware.util.cache.CacheFactory.startClustering(CacheFactory.java:622)
at org.jivesoftware.openfire.cluster.ClusterManager.startup(ClusterManager.java:285)
at org.jivesoftware.openfire.cluster.ClusterManager$1.xmlPropertySet(ClusterManager.java:65)
at org.jivesoftware.util.PropertyEventDispatcher.dispatchEvent(PropertyEventDispatcher.java:98)
at org.jivesoftware.util.XMLProperties.setProperty(XMLProperties.java:460)
at org.jivesoftware.util.JiveGlobals.setXMLProperty(JiveGlobals.java:435)
at org.jivesoftware.openfire.cluster.ClusterManager.setClusteringEnabled(ClusterManager.java:324)
at org.jivesoftware.openfire.admin.system_002dclustering_jsp._jspService(system_002dclustering_jsp.java:103)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1359)
at com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:118)
at com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:52)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.util.LocaleFilter.doFilter(LocaleFilter.java:74)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.util.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:50)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.admin.PluginFilter.doFilter(PluginFilter.java:78)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.admin.AuthCheckFilter.doFilter(AuthCheckFilter.java:164)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:478)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:520)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
at org.eclipse.jetty.server.Server.handle(Server.java:349)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:441)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.content(HttpConnection.java:936)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:801)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:224)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:51)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:586)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:44)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:598)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:533)
at java.lang.Thread.run(Thread.java:636)

Does it work if you remove the hazelcast.config.xml.filename system property and just edit the file in plugins/hazelcast/classes/?

I’ve not tried using the system property - Will try it today and make sure it works.
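
One way to rule out a missing or malformed config file is to check it outside of Openfire. The sketch below, in Python, verifies that a config file parses as XML and lists whatever entries its tcp-ip section declares. It assumes the usual Hazelcast XML layout (a tcp-ip element nested under network/join); the function names are hypothetical, and a missing "enabled" attribute is treated as disabled, which is an assumption rather than documented behaviour.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    """Strip an XML namespace prefix such as '{http://...}tcp-ip'."""
    return tag.rsplit("}", 1)[-1]


def tcp_ip_entries(config_xml):
    """Return (enabled, entries) for the first tcp-ip element, or None."""
    root = ET.fromstring(config_xml)  # raises ParseError on malformed XML
    for element in root.iter():
        if _local(element.tag) == "tcp-ip":
            # Assumption: anything other than enabled="true" means disabled.
            enabled = element.get("enabled") == "true"
            entries = [
                (_local(child.tag), (child.text or "").strip())
                for child in element
            ]
            return enabled, entries
    return None
```

To use it, read the config file from plugins/hazelcast/classes/ into a string and pass it to tcp_ip_entries; a None result or an empty entry list on either node would suggest that the file being edited is not the one describing the cluster members.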

Hi David,

I tried it without the property and I get this error as well. I only added the property after reading about it in this thread, and it didn’t seem to help.

Best Regards,

Stevenson Lee

Hello,

I would like to request that the plugin ship with hazelcast-cloud.jar for AWS integration. I’ve tested with the 2.5.1 hazelcast.jar and hazelcast-cloud.jar under Openfire 3.8.2, and it seems to work well enough.

http://www.hazelcast.com/docs/1.9.4/manual/multi_html/ch11s02.html

Hi David -

Thanks for the suggestion … we have had a few inquiries about running Openfire+Hazelcast on AWS but did not have a good answer. Beyond the need for the JAR file and AWS configuration settings, were there any other steps you needed to take to get clustering working on AWS?

I will update the plugin to include the hazelcast-cloud.jar file in the next revision.

Cheers,

Tom

Hi Tom,

Thanks very much for including the .jar. There were no other steps required other than ensuring that you create a distinct security group for the EC2 cluster and permit ports required for the cluster nodes to communicate.
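
A quick way to confirm that a security group really permits node-to-node traffic is to attempt a TCP connection from one node to the other’s cluster port. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, and the port to test depends on your Hazelcast configuration (5701 is Hazelcast’s default).

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

Calling can_connect with the other node’s private address and cluster port from each node in turn checks both directions, since a rule that allows traffic one way does not guarantee the reverse.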

I did note one issue: when I enable a cluster node from the Openfire admin UI, it appears to hang; I let it go for a few minutes. The Amazon API can be very slow, so perhaps a timeout in your plugin code is affected. I did see the cache successfully instantiated in the logs, and restarting Openfire after enabling clustering made everyone happy. I’m using both the security group and tag options in the Hazelcast config, which may add to the delay. Otherwise I’ve seen no errors running this way, with a simple 2-node deployment.

I’d be happy to communicate setup docs once I work out any issue with the setup for the ELB and do more thorough testing if you’d like.

Hi all,

I am trying to use Hazelcast plugin v1.2.2 with Openfire v3.9.3, hosted in the Rackspace cloud.

The config file is using TCP to identify the cluster members.

Server1 lists both itself and Server2 in the Clustering panel, but Server2 lists only itself and only intermittently lists Server1. Whenever it fails to list Server1, the log shows the error stack below. Can anybody please help me get this resolved?

2014.12.06 12:12:46 com.jivesoftware.util.cache.ClusteredCacheFactory - Failed to execute cluster task within 30 seconds
java.util.concurrent.TimeoutException
at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.resolveResponse(InvocationImpl.java:466)
at com.hazelcast.spi.impl.InvocationImpl$InvocationFuture.get(InvocationImpl.java:314)
at com.hazelcast.util.executor.DelegatingFuture.get(DelegatingFuture.java:66)
at com.jivesoftware.util.cache.ClusteredCacheFactory.doSynchronousClusterTask(ClusteredCacheFactory.java:334)
at org.jivesoftware.util.cache.CacheFactory.doSynchronousClusterTask(CacheFactory.java:586)
at org.jivesoftware.openfire.admin.system_002dclustering_jsp._jspService(system_002dclustering_jsp.java:123)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:97)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:547)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1359)
at com.opensymphony.module.sitemesh.filter.PageFilter.parsePage(PageFilter.java:118)
at com.opensymphony.module.sitemesh.filter.PageFilter.doFilter(PageFilter.java:52)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.util.LocaleFilter.doFilter(LocaleFilter.java:74)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.util.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:50)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.admin.PluginFilter.doFilter(PluginFilter.java:78)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.jivesoftware.admin.AuthCheckFilter.doFilter(AuthCheckFilter.java:164)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1330)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:478)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:119)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:520)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:227)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:941)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:409)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:875)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:117)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:250)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:149)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:110)
at org.eclipse.jetty.server.Server.handle(Server.java:349)
at org.eclipse.jetty.server.HttpConnection.handleRequest(HttpConnection.java:441)
at org.eclipse.jetty.server.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:919)
at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:582)
at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:218)
at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:51)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:586)
at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:44)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:598)
at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:533)
at java.lang.Thread.run(Thread.java:745)

Best Regards,

SM