Re: Openfire 3.9.3 severe memory leak?!

Hello,

I upgraded (Linux i386 RPM, CentOS 5.6) last night and immediately noticed a memory leak in Openfire. At first I thought it was because I was using Oracle’s JRE 1.7 u55; however, changing to Openfire’s default didn’t yield any better results. I also tried the new 1.8 u5 JRE, and that didn’t ameliorate the issue either.

What log files can I provide?

Thanks,

Ryan


Screen shots of the behavior:

https://sesp.box.com/s/tkcb8sxb23epyq9ilwo5

https://sesp.box.com/s/37k3qoepbwmtc64munqu

In my experience, memory leaks are among the hardest bugs to find :frowning:

Useful info:

  • What was your previous OF version, that worked without leaks?

  • Did you / the users do some specific stuff? Like a lot of MUC or PubSub?

If possible provide a heap dump, maybe it’s useful.

Previously we were on Openfire 3.9.1. I’ve reverted all the way back to Openfire 3.8.2 for now, just because that was the last version I had handy. I also upgraded the plug-ins in case those were the culprits in the memory leak, but upgrading those didn’t help either. Just FYI, Openfire 3.8.2 seems to work just fine with Java 1.8 u5.

To the best of my knowledge my users were not doing anything with MUC or PubSub, but then again, I’m not even sure I know what those are.

That Openfire uses 900MB after two minutes is odd.

Which plugins are you using?

Are you using an external database and how big is the database? Are there a lot of archived messages?

In my experience, memory leaks are among the hardest bugs to find :frowning:

They are sometimes hard to understand, but most JVMs make it pretty easy to find the objects that use up the memory: Simply create a heap dump and analyze the heap.

I am aware that this is maybe too much to expect from the average user, but on the other hand it’s just a matter of calling

jmap -dump:format=b,file=<outfile> <pid>

and making the resulting file available to us.
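Roughly like this, assuming a JDK is installed on the server (the output path and the way you look up the PID are only examples):

# find the PID of the Openfire Java process
jps -l
# or: ps -ef | grep -i openfire

# write a binary heap dump; make sure the target filesystem has enough free space
jmap -dump:format=b,file=/tmp/openfire-heap.bin <pid>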

Obtaining (and maybe even analyzing) a heap dump would be a great topic for a community document.

In addition to the standard plug-ins, I was using the Jitsi plug-in, but I disabled it. The problem persisted, even after disabling Jitsi and updating the others.

We’re using MySQL 5.0.5 on localhost and we don’t archive messages.

Sounds good. I’ll have to do this over Memorial Day Weekend or some other non-critical time.

You could also enable JMX via the admin console and then monitor the running server via the JConsole application (which ships with the JDK). This will give you realtime feedback on number of threads, memory/GC patterns, etc. for your Openfire instance.
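For a quick local check you can attach JConsole directly to the process (a minimal sketch, assuming a JDK on the same machine; the port below is only an example):

# find the PID and attach the JConsole GUI to it
jps -l
jconsole <pid>

# on a headless server you would instead expose JMX remotely with the standard
# JVM flags (shown here without authentication, so only for a trusted network):
# -Dcom.sun.management.jmxremote.port=9010
# -Dcom.sun.management.jmxremote.authenticate=false
# -Dcom.sun.management.jmxremote.ssl=false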

This will give you realtime feedback on number of threads, memory/GC patterns, etc. for your Openfire instance.
Right, but the important information about which object instances consume the heap cannot be retrieved via JConsole.

Here’s another similar, more visual (and relatively easy) approach to check for memory leaks:

  1. Start Openfire.

  2. Start VisualVM (on the same machine), which you can find in the JDK/bin folder (jvisualvm.exe on Windows). You should see the Openfire process on the left-hand side. Double-click it to see more info.

  3. Click “Sampler”, click “Memory”, then click “Heap Dump” (maybe also click “Perform GC” first, just to be sure). Do this right after the start of Openfire.

  4. Wait some time, until the memory has increased, and repeat the step (make another heap dump).

  5. Click on the second heap dump, click on “Classes” and click “Compare with another heap dump”. In the dialog, select the first heap dump.

You now see the “delta” between both heap dumps and which classes have increased. Maybe also sort by name and look for org.jivesoftware classes.

Any suspicious increase is visible through the red line in the Instances [%] column; more generally, any unusually high number of instances of one class is suspicious for a leak.
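If you prefer to stay on the command line, roughly the same comparison can be done with jhat, which also ships with the JDK (the file names below are only placeholders):

# take two heap dumps some time apart
jmap -dump:format=b,file=dump1.bin <pid>
# ... wait until memory has grown ...
jmap -dump:format=b,file=dump2.bin <pid>

# objects already present in dump1 are treated as the baseline; new instances
# show up separately in the web UI that jhat starts on port 7000
jhat -baseline dump1.bin dump2.bin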

Not sure if this is a leak or something else, but I usually had similar issues because of low JVM memory before. 3.9.1 was running fine for a month. I upgraded to 3.9.3 six days ago. Today all clients were disconnected and I had to reboot the server. The Admin Console was also unavailable. Can’t say now what the problem was. Will watch JVM memory now.

On a side note, logs are still not working for me (all log files are zero KB). I thought Daryl had fixed this with https://igniterealtime.org/issues/browse/OF-640

This is exactly what I saw. Has to be a memory leak of some sort.

Yep. After a few days the JVM now peaks at 400 MB (out of 740 available). A few more days and it will hang again. I haven’t updated or added anything else, just Openfire. Will have to downgrade to 3.9.1.

It’s a command-line-only Linux box. Not sure what I can check. I can probably generate a dump or something before downgrading, if someone gives exact instructions.

jmap -dump:format=b,file=/1gb-fs/file.bin openfire-pid

Maybe you need to run this as the daemon user.

openfire-pid would be the Java PID? I’m running Openfire as a daemon (openfired), which runs /bin/openfire.sh

jmap is not recognized as a command. Anyway, I wasn’t able to take a dump. Downgraded already.

jmap is included in the JDK. I assume you use the Openfire JRE.
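If jmap isn’t on the PATH, you can call it via a JDK’s bin directory; the install path below is only an example:

# jmap is part of the JDK, not of a plain JRE
/usr/java/jdk1.7.0_55/bin/jmap -dump:format=b,file=/tmp/openfire-heap.bin <pid>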

Actually I’m using the one bundled with the Spark .tar.gz version. Arch Linux removed Oracle’s Java from its repositories, so I decided to update my Java manually by just replacing the old Java folder.

I believe the change for OF-764 may be adversely affecting memory utilization in some circumstances. Can you confirm whether you have significant MUC traffic (or history) with rooms that have conversation logging enabled?

The change I made was to load full history for MUC rooms by default rather than limiting it to two days, and also provided a new system property (“xmpp.muc.history.reload.limit”) to optionally limit the history to a given number of days. I am planning to modify this slightly for the next release to restore the original limit of two days by default, while still allowing an override via the system property.

Note that this issue would only affect deployments that use MUC rooms with conversation logging enabled.