You should start by verifying that both servers (case 1 and case2) are joined to a single cluster. The easiest way to confirm this is using the “Clustering” page on the admin console. You should see two servers listed on this page. You can also see this information in the system log file (nohup.out) for each of the servers.
If the servers are not joined as a single cluster, you may need to modify the default Hazelcast configuration file to use TCP (unicast) rather than UDP (multicast) for the discovery protocol, depending on your network configuration.
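For illustration, a TCP/IP join section might look roughly like the sketch below. The two IP addresses are placeholders for your own nodes. The file location (typically hazelcast-cache-config.xml in the plugin's classes directory) and the exact element names are assumptions that can vary with the plugin and Hazelcast version, so compare against the file shipped with your install before changing anything.

```xml
<!-- Sketch only: disable multicast discovery and list cluster members explicitly. -->
<network>
  <join>
    <!-- Turn off UDP multicast discovery -->
    <multicast enabled="false"/>
    <!-- Use TCP/IP (unicast) discovery with a fixed member list -->
    <tcp-ip enabled="true">
      <!-- Placeholder addresses: replace with the addresses of your two servers -->
      <member>192.168.1.10</member>
      <member>192.168.1.11</member>
    </tcp-ip>
  </join>
</network>
```

The same member list would normally go on both servers, and each server needs a restart (or a plugin reload) before the change takes effect.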
If the servers are properly joined in a single cluster, you should check the error and warning log files to determine whether you have some other type of configuration problem. Feel free to post further details here.
We have exactly the same problem here (2 nodes with Openfire 3.8.1, Oracle database 11.2, Hazelcast 1.0.4 with TCP/IP configuration).
All the users are distributed across the 2 nodes and chat works perfectly, but pubsub items are received in real time only by users on the same node as the publisher. Users on the second node receive nothing, and when these users disconnect and reconnect to the same “bad” node they receive the last 2 missed messages.
It seems that Hazelcast transmits the items to the second node (we can see it in the console under the “Clustering” item), but the second node does not seem to “send” the signal to the “pubsub engine” to treat them as if they had just been published…
It isn’t supposed to. The second node is simply supposed to deliver the message to the user.
This obviously doesn’t fix your problem, but I thought I would clarify that for you. The node that receives the publish does all the processing and routes to all the subscribers. The Hazelcast messages you see would be that routing to users logged in to the second node. The second node in this case would not need to do any pubsub processing, just message delivery, which in your case doesn’t appear to be happening.
We have just retried the same configuration with OF 3.8.2 and HA 1.0.6 and we still have the same problem (if we publish on #1, the clients on #2 receive no messages).
For information, we have no errors in any of the log files and HA sees the cluster correctly.
Has anybody tried OF clustering with PUBSUB items?