We’re doing some tests with clustering Openfire with the hazelcast plugin. Our setup has ~11k connected clients and when running non-clustered, the Sessions tab in the admin UI loads in less than a second. When we activate clustering the Sessions tab takes minutes to load. Is this normal or is it indicative of other problems?
We’re running Openfire v4.6.6 in Docker, which in turn runs on Amazon linux on t3a.large instances on AWS. Hazelcast version is 2.6.0. Load is moderate.
We have tried updating to the latest version (which is 4.7.1 at time of writing) but the problem persists.
I do not expect that to happen. It sounds like you uncovered an inefficiency in the code.
Could you please provide a Java thread dump taken at the time when the session page is loading. This likely will help us pinpoint the source of the issue.
There are many ways to create a thread dump. For convenience, an Openfire plugin that is named “thread dump plugin” is made available to allow you to do this through the Openfire admin console.
Thank you. I suspect that this is the culprit. I’ve created a new issue in our issue tracker for us to have a look at this: [OF-2439] - Ignite Realtime Jira
"Jetty-QTP-AdminConsole-7201" #7201 prio=5
java.lang.Thread.State: WAITING
at sun.misc.Unsafe.park(Native Method)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
at com.hazelcast.spi.impl.AbstractInvocationFuture.get(AbstractInvocationFuture.java:161)
at com.hazelcast.map.impl.proxy.MapProxySupport.containsKeyInternal(MapProxySupport.java:662)
at com.hazelcast.map.impl.proxy.MapProxyImpl.containsKey(MapProxyImpl.java:299)
at org.jivesoftware.openfire.plugin.util.cache.ClusteredCache.containsKey(ClusteredCache.java:106)
at org.jivesoftware.util.cache.CacheWrapper.containsKey(CacheWrapper.java:117)
at org.jivesoftware.openfire.spi.RoutingTableImpl.isAnonymousRoute(RoutingTableImpl.java:866)
at org.jivesoftware.openfire.SessionManager.isAnonymousRoute(SessionManager.java:794)
at org.jivesoftware.openfire.plugin.session.RemoteClientSession.isAnonymousUser(RemoteClientSession.java:85)
at org.jivesoftware.openfire.SessionResultFilter$SessionComparator.compare(SessionResultFilter.java:235)
at org.jivesoftware.openfire.SessionResultFilter$SessionComparator.compare(SessionResultFilter.java:214)
at java.util.TimSort.mergeLo(TimSort.java:717)
at java.util.TimSort.mergeAt(TimSort.java:514)
at java.util.TimSort.mergeCollapse(TimSort.java:441)
at java.util.TimSort.sort(TimSort.java:245)
at java.util.Arrays.sort(Arrays.java:1512)
at java.util.ArrayList.sort(ArrayList.java:1464)
at java.util.Collections.sort(Collections.java:177)
at org.jivesoftware.openfire.SessionManager.getSessions(SessionManager.java:865)
at org.jivesoftware.openfire.admin.session_002dsummary_jsp._jspService(session_002dsummary_jsp.java:240)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1452)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:791)
at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.jivesoftware.util.LocaleFilter.doFilter(LocaleFilter.java:73)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.jivesoftware.util.SetCharacterEncodingFilter.doFilter(SetCharacterEncodingFilter.java:49)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.jivesoftware.admin.PluginFilter.doFilter(PluginFilter.java:226)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.jivesoftware.admin.AuthCheckFilter.doFilter(AuthCheckFilter.java:234)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:201)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:548)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:602)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1435)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1350)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
at org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:191)
at org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:146)
at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
at org.eclipse.jetty.server.Server.handle(Server.java:516)
at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:388)
at org.eclipse.jetty.server.HttpChannel$$Lambda$205/865376010.dispatch(Unknown Source)
at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:633)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:380)
at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:273)
at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:336)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:313)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:171)
at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:129)
at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:375)
at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:773)
at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:905)
at java.lang.Thread.run(Thread.java:748)
I don’t know when this bug will be addressed. It’s now on a heap of issues that is being tackled by volunteers whenever they are available and willing to invest some time.
Assuming that this is indeed the cause of the slowdown: the inefficiency appears to be with looking up a characteristic of a session that ‘lives’ on a different cluster node than the one on which the lookup is performed. In theory, that affects many actions, but in practise I suspect that it’s noticeable only when many sessions are being evaluated at the same time (such as when a admin console is used to show a listing of all sessions).