High CPU thread on 5.0.1

This thread shows in high CPU consuming threads from thread dumps.
However, it is not easy to find out the issue since there is no log printed. Also, instead of throw/catch the UserNotFoundException - we directly check the if the node is null and return false. That could avoid the thread stack creation.

Jetty-QTP-BOSH-281” #281 prio=5 os_prio=0 cpu=104818.44ms elapsed=66582.30s tid=0x00007f39bc9f4df0 nid=0xfb7b runnable  \[0x00007f396d924000\]

java.lang.Thread.State: RUNNABLE
    at java.lang.Throwable.fillInStackTrace(java.base@17.0.10/Native Method)
    at java.lang.Throwable.fillInStackTrace(java.base@17.0.10/Throwable.java:798)
    - locked <0x00000007a3745af0> (a org.jivesoftware.openfire.user.UserNotFoundException)
    at java.lang.Throwable.<init>(java.base@17.0.10/Throwable.java:271)
    at java.lang.Exception.<init>(java.base@17.0.10/Exception.java:67)
    at org.jivesoftware.openfire.user.UserNotFoundException.<init>(UserNotFoundException.java:36)
    at org.jivesoftware.openfire.user.UserManager.getUser(UserManager.java:273)
    at org.jivesoftware.openfire.user.UserManager.isRegisteredUser(UserManager.java:438)
    at org.jivesoftware.openfire.IQRouter.handle(IQRouter.java:371)
    at org.jivesoftware.openfire.IQRouter.route(IQRouter.java:106)
    at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:74)
    at org.jivesoftware.openfire.net.StanzaHandler.processIQ(StanzaHandler.java:392)
    at org.jivesoftware.openfire.net.ClientStanzaHandler.processIQ(ClientStanzaHandler.java:90)
    at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:334)
    at org.jivesoftware.openfire.net.StanzaHandler.processStanza(StanzaHandler.java:222)
    at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:114)
    at org.jivesoftware.openfire.websocket.WebSocketClientConnectionHandler.onTextMethod(WebSocketClientConnectionHandler.java:153)
    at java.lang.invoke.DirectMethodHandle$Holder.invokeVirtual(java.base@17.0.10/DirectMethodHandle$Holder)
    at java.lang.invoke.LambdaForm$MH/0x00007f3a68007c00.invoke(java.base@17.0.10/LambdaForm$MH)
    at java.lang.invoke.LambdaForm$MH/0x00007f3a68775c00.invoke_MT(java.base@17.0.10/LambdaForm$MH)
    at org.eclipse.jetty.websocket.core.messages.StringMessageSink.accept(StringMessageSink.java:67)
    at org.eclipse.jetty.ee8.websocket.common.JettyWebSocketFrameHandler.acceptMessage(JettyWebSocketFrameHandler.java:297)
    at org.eclipse.jetty.ee8.websocket.common.JettyWebSocketFrameHandler.onTextFrame(JettyWebSocketFrameHandler.java:360)
    at org.eclipse.jetty.ee8.websocket.common.JettyWebSocketFrameHandler.onFrame(JettyWebSocketFrameHandler.java:212)
    at org.eclipse.jetty.websocket.core.WebSocketCoreSession$IncomingAdaptor.lambda$onFrame$0(WebSocketCoreSession.java:649)
    at org.eclipse.jetty.websocket.core.WebSocketCoreSession$IncomingAdaptor$$Lambda$680/0x00007f3a68777238.run(Unknown Source)
    at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1524)
    at org.eclipse.jetty.server.handler.ContextHandler$ScopedContext.run(ContextHandler.java:1511)
    at org.eclipse.jetty.websocket.core.server.internal.AbstractHandshaker$1.handle(AbstractHandshaker.java:179)
    at org.eclipse.jetty.websocket.core.WebSocketCoreSession$IncomingAdaptor.onFrame(WebSocketCoreSession.java:649)
    at org.eclipse.jetty.websocket.core.ExtensionStack.onFrame(ExtensionStack.java:113)
    at org.eclipse.jetty.websocket.core.WebSocketCoreSession.onFrame(WebSocketCoreSession.java:465)
    at org.eclipse.jetty.websocket.core.WebSocketConnection.onFrame(WebSocketConnection.java:255)
    at org.eclipse.jetty.websocket.core.WebSocketConnection.fillAndParse(WebSocketConnection.java:448)
    at org.eclipse.jetty.websocket.core.WebSocketConnection.onFillable(WebSocketConnection.java:333)
    at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:322)
    at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:99)
    at org.eclipse.jetty.io.SelectableChannelEndPoint$1.run(SelectableChannelEndPoint.java:53)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.runTask(AdaptiveExecutionStrategy.java:480)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.consumeTask(AdaptiveExecutionStrategy.java:443)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.tryProduce(AdaptiveExecutionStrategy.java:293)
    at org.eclipse.jetty.util.thread.strategy.AdaptiveExecutionStrategy.run(AdaptiveExecutionStrategy.java:201)
    at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:311)
    at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:981)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.doRunJob(QueuedThreadPool.java:1211)
    at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1166)
    at java.lang.Thread.run(java.base@17.0.10/Thread.java:840)

Hi Dhina, thanks for your report. I’m sorry that you’re running into this issue.

From your data:

cpu=104818.44ms
at java.lang.Throwable.fillInStackTrace

I think this means:

  • A single thread has burned ~105 seconds of CPU
  • Not waiting
  • Not blocked
  • Just building stack traces

That is wild.

As so for this single stack trace shows, a culprit is an exception-based control flow in the check if a particular value represents a registered user. Exception-based control flow adds some overhead: creating the exception takes some resources. This is somewhat of a code-smell, but it has been in Openfire forever. You’re the first one calling this out.

Given that this is apparently causing you real-world problems, I think that this inefficiency should be addressed in Openfire. I’ve created the following ticket for the issue: OF-3164.

That said: this kind of issue is often not the original problem, but an amplifier. The real upstream cause might be something like:

  • Misbehaving clients
  • Bots / scanners
  • Federation misconfiguration
  • Unexpected traffic patterns

It might be worth looking into this, to see if that affects your environment.