Openfire 4.3.2 CPU goes to 100% after a few hours on Linux

I have logged an issue with Apache MINA: https://issues.apache.org/jira/browse/DIRMINA-1111

Openfire 4.4.0 has yet to be released. Which exact nightly or master build are you using?

Hi !

I’m a MINA committer. A few comments:

  • A RUNNABLE thread may actually be doing nothing if it is executing a native method. The JVM has no way to know what is happening inside a native call, so it just marks the thread as RUNNABLE, even if the thread is only waiting for a resource.
  • It would be useful to run 'top -H -p <pid>' followed by 'jstack <pid>' (using the process ID of the Openfire JVM) to get more information about the thread burning your CPU.
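
To match the two outputs: top -H lists native thread IDs in decimal, while the thread dumps from the Java 8-era HotSpot JVMs print the same ID as a hexadecimal nid=0x... value (newer JDKs may format this differently, so treat that as an assumption to check against your own dump). A minimal sketch of the conversion, with a hypothetical class name and a made-up thread ID:

```java
public class ThreadIdToNid {

    /** Converts a decimal thread ID (as listed by top -H) to the hex "nid" notation. */
    public static String toNid(long tid) {
        return "0x" + Long.toHexString(tid);
    }

    /** Converts a hex "nid" value from a thread dump back to a decimal thread ID. */
    public static long fromNid(String nid) {
        String hex = nid.startsWith("0x") ? nid.substring(2) : nid;
        return Long.parseLong(hex, 16);
    }

    public static void main(String[] args) {
        // 12345 is only an illustrative value; pass the ID from your own top output.
        long tid = args.length > 0 ? Long.parseLong(args[0]) : 12345L;
        System.out.println(toNid(tid)); // prints 0x3039 for 12345
    }
}
```

Searching the jstack output for the resulting nid value then shows the name and stack trace of the thread that top reports as busy.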

MINA 2.1 is pretty much the same as MINA 2.0; we have just added an event method to the API and fixed an issue caused by the presence of a compression filter in the chain.

I’m following this thread and the MINA JIRA thread. Side note: Netty is facing the exact same problem, and it’s frequently due to external causes.

The master branch. I build from source because I have extended parts of the Openfire core, for example:

  • Persisting the room creator, so that other owners cannot remove the room creator or change the creator's affiliation.
  • Brute-force checking, to prevent password guessing.
  • Limiting how many sessions each IP address can have open at once (mainly to prevent flooding).
  • Setting a minimum interval between the MUC messages an occupant can send (for example, an occupant cannot send two messages within a 500 ms period).

Hi,
I followed this guide because I first thought that the Java GC was causing this problem, but all the threads that were burning CPU were NioProcessor threads.

@suf126a Just because the NioProcessor threads are burning CPU does not mean MINA is at fault. NioProcessor executes the non-MINA application code. Please generate a flame graph to help us determine which methods are using the CPU time: https://github.com/brendangregg/FlameGraph
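
Not a replacement for a flame graph, but as a rough first look the standard java.lang.management API can rank threads by cumulative CPU time and show their top stack frames, which hints at whether a busy NioProcessor thread is sitting in MINA or in application code. This is only a sketch: the class name is hypothetical, and it inspects the JVM it runs in, so it would have to be executed inside the Openfire process (for example from a plugin) to be useful.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class BusyThreads {

    /** Returns one entry per live thread in this JVM, highest cumulative CPU time first. */
    public static List<String> report(int maxFrames) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        boolean cpuSupported = mx.isThreadCpuTimeSupported();

        // Collect {threadId, cpuNanos}; -1 means the CPU time is unavailable.
        List<long[]> usage = new ArrayList<>();
        for (long id : mx.getAllThreadIds()) {
            long cpuNanos = cpuSupported ? mx.getThreadCpuTime(id) : -1L;
            usage.add(new long[] {id, cpuNanos});
        }
        usage.sort((a, b) -> Long.compare(b[1], a[1]));

        List<String> lines = new ArrayList<>();
        for (long[] u : usage) {
            ThreadInfo info = mx.getThreadInfo(u[0], maxFrames);
            if (info == null) {
                continue; // the thread ended between the two calls
            }
            StringBuilder sb = new StringBuilder();
            sb.append(info.getThreadName())
              .append(" state=").append(info.getThreadState())
              .append(" cpuMs=").append(u[1] < 0 ? "n/a" : String.valueOf(u[1] / 1_000_000));
            for (StackTraceElement frame : info.getStackTrace()) {
                sb.append("\n    at ").append(frame);
            }
            lines.add(sb.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        report(5).forEach(System.out::println);
    }
}
```

Because the CPU time is cumulative, two reports taken a few seconds apart are more telling than a single one: the thread whose value grows fastest is the one currently burning CPU.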

I’ve worked with @Emmanuel_Lecharny and @johnnyv from the MINA team. A fix that should prevent the 100% CPU issue from occurring has been applied to the 2.1.X branch.

@suf126a would you mind testing this fix? The commit of interest is 9274ddad3edce5b8796d98fdb0a9ccbe487a9b9e. I have built these MINA libraries that include this fix (but feel free to build your own, if you prefer):

mina-core-2.1.3-SNAPSHOT.jar (651.7 KB)
mina-filter-compression-2.1.3-SNAPSHOT.jar (13.0 KB)
mina-integration-beans-2.1.3-SNAPSHOT.jar (40.5 KB)
mina-integration-jmx-2.1.3-SNAPSHOT.jar (28.6 KB)
mina-integration-ognl-2.1.3-SNAPSHOT.jar (15.8 KB)

I’ve observed that a very similar fix prevents the 100% CPU issue in another environment that suffers from the same problem. Of interest is that the bug that causes the 100% CPU usage is triggered primarily when an irregular situation occurs (in the other environment, it appears to be triggered by events timing out, although we’re still investigating). In other words, it is not unthinkable that with this fix applied another issue will surface, which would be the ‘root cause’ of the problems that you’re seeing.

Thanks for your efforts to fix this bug. I will test this solution today and tell you the results.

Hello,
I have tested the fix and can confirm that the problem is solved. The server has now been running for 8 hours with more than 800 active users, and CPU usage is normal.
Thank you for your efforts.

Excellent! Thanks for testing.

I have filed a ticket for this: https://issues.igniterealtime.org/browse/OF-1786

The new version of MINA (2.1.3) has been released. Openfire 4.4.0 will use this version.
