Session tab hangs and doesn't respond, and “cluster” window shows N/A in the overview

We have two Openfire nodes running in a cluster, using the Hazelcast Clustering plugin.
After a random amount of time spent viewing the sessions tab, either refreshing it every once in a while or with “auto refresh” enabled, the tab stops responding or takes a long time to load. The same goes for the “cluster” window: it takes a long time to respond, and when it does, the overview details show N/A instead of numbers.
While this happens, users get disconnected and/or it takes some time for them to receive messages. I can also see the following errors in the log files:
ERROR [hz.openfire.cached.thread-10]: org.jivesoftware.openfire.plugin.util.cache.ClusteredCacheFactory - Failed to execute cluster task within org.jivesoftware.util.SystemProperty@62654355 seconds

java.util.concurrent.TimeoutException: MemberCallableTaskOperation failed to complete within 30 SECONDS. Invocation{op=com.hazelcast.executor.impl.operations.MemberCallableTaskOperation{serviceName='hz:impl:executorService', identityHash=1047552760, partitionId=-1, replicaIndex=0, callId=5436057, invocationTime=1608577817524 (2021-01-11 13:10:17.524), waitTimeout=-1, callTimeout=30000, name=openfire::cluster::executor}, tryCount=250, tryPauseMillis=500, invokeCount=1, callTimeoutMillis=30000, firstInvocationTimeMs=1608577817525, firstInvocationTime='2021-01-11 13:10:17.525', lastHeartbeatMillis=1608577844979, lastHeartbeatTime='2021-01-11 13:10:44.979', target=[]:5701, pendingResponse={VOID}, backupsAcksExpected=0, backupsAcksReceived=0, connection=null}
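
The timing fields in the exception above can be compared directly. The following is a minimal Python sketch, not part of Openfire or Hazelcast, that copies three values from the log line and computes how long after the first invocation the last heartbeat was recorded. The variable names are made up for illustration, and reading the heartbeat as a sign that the target node was still reachable is an assumption about Hazelcast's behavior, not something the log states.

```python
import re

# Values copied verbatim from the Invocation{...} text in the exception above.
excerpt = (
    "callTimeoutMillis=30000, "
    "firstInvocationTimeMs=1608577817525, "
    "lastHeartbeatMillis=1608577844979"
)

# Parse "name=number" pairs into a dict of integers (all values are milliseconds).
fields = {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", excerpt)}

# Time between the first invocation and the last recorded heartbeat.
gap_ms = fields["lastHeartbeatMillis"] - fields["firstInvocationTimeMs"]

# How much of the call timeout was left when that heartbeat arrived.
remaining_ms = fields["callTimeoutMillis"] - gap_ms

print(f"last heartbeat arrived {gap_ms} ms after the first invocation")
print(f"{remaining_ms} ms of the call timeout remained at that point")
```

With the values from this log, the last heartbeat was recorded about 27.5 seconds after the task was invoked, shortly before the 30 second timeout expired. If the heartbeat does indicate a live target, this would point toward a slow or blocked task on the other node rather than a lost connection, but that remains a guess.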

We performed the following steps to try to resolve the issue (none of them worked):

  1. Changed the OPENFIRE_OPTS.
  2. Set the “max size” of the following caches to unlimited:
  • Client Session Info Cache
  • Roster
  • Offline Presence
  3. Updated Openfire to version 4.5.4.

System details:
OS: CentOS 7
Openfire version: 4.5.2 (with 6GB xmx/xms configured on each node)
Hazelcast version: 2.5.0 (configured to use tcp-ip, with the servers' IPs listed)