I have created a simple cluster of two Openfire instances, which works fine, but I have realized that the xmpp.fqdn of the second server has been changed to that of the first server, so both servers now have the same xmpp.fqdn. My questions are:
How can I use the cluster with this configuration? I used to connect to each server using its IP address. Would it now be enough to connect to just one of the servers, and will the load then be shared across the cluster automatically?
I use SSL certificates. My certificates are bound to the xmpp.fqdn and the real IP address of each server. Now that the xmpp.fqdn of one server no longer matches its IP address, I am not sure how to use the certificate for that server. Should each server use a certificate that matches the cluster configuration?
Do I need a load balancer to share the load across all Openfire servers in a cluster, or is this done automatically? If it is automatic, does it matter which server in the cluster I connect to?
a) There was a bug whereby every server in a cluster incorrectly shared the same FQDN. That’s now fixed: the FQDN is stored in the node-specific openfire.xml rather than in the shared cluster database. https://issues.igniterealtime.org/browse/OF-1548
b) Every node in the cluster should share the same XMPP domain (xmpp.domain).
c) Ideally you should connect using the XMPP domain, and have DNS set up appropriately. A single TLS certificate for xmpp.domain would then suffice for all servers.
d) Given a decent client, and appropriate DNS setup (i.e. xmpp.domain listing all servers in the cluster) there should be no need for a load balancer.
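To make (d) a bit more concrete: if the DNS setup is done with SRV records, a client that follows RFC 2782 tries the lowest priority value first and, among records with equal priority, picks targets at random in proportion to their weight. The sketch below only illustrates that ordering rule; it does not do any DNS lookups, and the host names, priorities and weights are made up for the example rather than taken from any real setup.

```python
import random
from typing import List, NamedTuple, Optional


class SrvRecord(NamedTuple):
    priority: int  # lower value is tried first
    weight: int    # relative share among records with the same priority
    port: int
    target: str


def order_srv_records(records: List[SrvRecord],
                      rng: Optional[random.Random] = None) -> List[SrvRecord]:
    """Return records in the order a client would try them (RFC 2782 rule)."""
    rng = rng or random.Random()
    ordered: List[SrvRecord] = []
    for priority in sorted({r.priority for r in records}):
        group = [r for r in records if r.priority == priority]
        # Zero-weight records go first so they are only rarely selected.
        group.sort(key=lambda r: r.weight != 0)
        while group:
            total = sum(r.weight for r in group)
            pick = rng.randint(0, total)  # inclusive on both ends
            running = 0
            for i, rec in enumerate(group):
                running += rec.weight
                if running >= pick:
                    ordered.append(group.pop(i))
                    break
    return ordered


if __name__ == "__main__":
    # Hypothetical records for _xmpp-client._tcp.mydomain.com
    records = [
        SrvRecord(10, 50, 5222, "node1.mydomain.com"),
        SrvRecord(10, 50, 5222, "node2.mydomain.com"),
        SrvRecord(20, 0, 5222, "backup.mydomain.com"),
    ]
    for rec in order_srv_records(records):
        print(rec.priority, rec.weight, rec.target, rec.port)
```

With equal priority and weight on both nodes, each new connection has roughly an even chance of landing on either one, and the priority 20 record is only tried after both priority 10 targets. Whether a given client library actually honours priority and weight is something to check per client.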
I use a single wildcard certificate for all nodes in the cluster and it seems to work. Item (d), however, is interesting. I have set up SRV DNS records for all nodes and there are no warnings about them, so I assume the DNS records are correct. I have added the following DNS SRV records:
_xmpp-client._tcp.mydomain.com
_xmpps-client._tcp.mydomain.com
_xmpp-server._tcp.mydomain.com
_xmpps-server._tcp.mydomain.com
I am using Openfire 4.4.3 and the problem is that Hazelcast does not work (I could easily set up a cluster with older versions of Openfire). I am wondering: if the cluster were set up correctly, would it be enough for my clients to connect via _xmpps-client._tcp.mydomain.com for a secure connection instead of connecting directly to a server's IP address?
If so, would that be as effective as having a load balancer in front of all the servers? An NLB seems very effective at sharing load compared with SRV records, which as far as I know use a simple round-robin scheme. Is that correct?
You really need to fix that to get it working. What’s your config - are you using the (preconfigured) multicast setup? Are all your nodes on the same LAN? Have you tried setting up the unicast config?
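Before digging into the multicast versus unicast settings, it may help to confirm that the nodes can reach each other over TCP at all. The following is a minimal sketch of such a check. It assumes the cluster uses Hazelcast's default port 5701, and the node addresses are placeholders; adjust both to whatever your configuration actually uses.

```python
import socket


def can_connect(host: str, port: int = 5701, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder addresses: replace with the other nodes in your cluster.
    # 5701 is assumed here as the Hazelcast default port.
    for node in ["192.0.2.11", "192.0.2.12"]:
        status = "reachable" if can_connect(node) else "NOT reachable"
        print(f"{node}:5701 is {status}")
```

Run it on each node against the others. A node that is not reachable points to a firewall or routing issue rather than an Openfire setting. Note that this only tests TCP, so it says nothing about whether multicast discovery works on your network.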