External component and shutting down one Openfire node of Hazelcast cluster

Hello.

I’m using a Hazelcast cluster of two Openfire nodes with HAProxy as the load balancer (you can see my configuration in Re: java.lang.NullPointerException when user logging in to one of two clustered servers). I also have an external component that serves client requests.

HAProxy is configured to balance both nodes: port 5222 on each node for c2s connections, and port 5275 on each node for external component connections. Both use the roundrobin balancing strategy.

While both nodes are running, clients are served through the external component without problems (via either node), and the external component is shown on the component-session-summary.jsp page on both nodes.

But when I stop the node to which the LB had connected the external component:

  • On the remaining node, the external component disappears from component-session-summary.jsp.
  • The external component loses its connection and tries to reconnect. The LB redirects it to port 5275 of the live node, but that Openfire node replies to the registration request with a stream:error containing conflict. That seems strange, because *component-session-summary.jsp* does not show the component as registered. Logs from the Openfire console (produced by the *xmldebugger* plugin):
ExComp - RECV (23020725): <stream:stream xmlns="jabber:component:accept" xmlns:stream="http://etherx.jabber.org/streams" to="myExtComp.myExtCompDomain">
ExComp - SENT (23020725): <?xml version='1.0' encoding='UTF-8'?><stream:stream xmlns:stream="http://etherx.jabber.org/streams" xmlns="jabber:component:accept" from="myExtComp.myExtCompDomain"><stream:error xmlns:stream="http://etherx.jabber.org/streams"><conflict xmlns="urn:ietf:params:xml:ns:xmpp-streams"/></stream:error>
ExComp - SENT (23020725): </stream:stream>
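
For anyone reproducing this, here is a minimal Python sketch of how a component could recognise the conflict condition in that reply, for example to log it or to wait before retrying. It is an illustration only, not code from my component: the function name `stream_error_condition` is hypothetical, and the `reply` string is simply the two SENT lines from the log above joined into one document.

```python
import xml.etree.ElementTree as ET

STREAMS_NS = "http://etherx.jabber.org/streams"
ERRORS_NS = "urn:ietf:params:xml:ns:xmpp-streams"


def stream_error_condition(reply: str):
    """Return the stream error condition name (e.g. 'conflict'), or None."""
    # Parse from bytes so the XML declaration's encoding is honoured.
    root = ET.fromstring(reply.encode("utf-8"))
    error = root.find(f"{{{STREAMS_NS}}}error")
    if error is None:
        return None
    for child in error:
        if not child.tag.startswith(f"{{{ERRORS_NS}}}"):
            continue
        local_name = child.tag.split("}", 1)[1]
        # An optional <text/> element shares the namespace; skip it.
        if local_name != "text":
            return local_name
    return None


# The two SENT lines from the log above, joined into one document.
reply = (
    "<?xml version='1.0' encoding='UTF-8'?>"
    '<stream:stream xmlns:stream="http://etherx.jabber.org/streams" '
    'xmlns="jabber:component:accept" from="myExtComp.myExtCompDomain">'
    '<stream:error xmlns:stream="http://etherx.jabber.org/streams">'
    '<conflict xmlns="urn:ietf:params:xml:ns:xmpp-streams"/>'
    "</stream:error>"
    "</stream:stream>"
)

print(stream_error_condition(reply))
```

This only classifies the error. It does not explain why the remaining node still considers the component address taken.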

After restarting the node that had been stopped:

  • The external component successfully connects back to the **restarted** node! Again, *component-session-summary.jsp* shows my external component on both nodes.
  • When a client connects to the restarted node, it is successfully served by the external component.
  • When a client connects to the other node, the connection from Openfire to the external component fails.

So, the questions are:

  • Does sharing of external components work correctly in a Hazelcast cluster?
  • Is redistribution of an external component on node shutdown expected to work correctly?
  • Is there a way to prevent the stream:error *conflict* when the external component connects?
  • What are the likely mistakes in my cluster + LB configuration?
  • Is there a working example of an LB configuration (preferably HAProxy) for serving c2s connections through an external component, including the case where a node is stopped?

We are experiencing the same issue, albeit with some differences…

We operate HAProxy and two clustered Openfire 3.10.2 nodes with the latest (Sep 2015) Hazelcast build. The nodes operate correctly when we attach clients or components to them directly.

However, when we use HAProxy, either load-balanced with sticky sessions or in an active/passive [backup] server scenario, we have these issues:

(1) Clients connect, then disconnect after about 10 seconds and try to reconnect. We cannot maintain an active XMPP client session.

(2) External components that are disconnected due to a load balancer node failure in the second scenario (i.e. we make the backup node active) do not reconnect. When we use load-balanced sticky sessions for external components, they simply cannot attach to the Openfire cluster.

Does anyone have a working HAProxy configuration? Have the issues from the previous post been addressed?