Tracking down a memory leak in 3.7.0

I’m trying to track down a memory leak in Openfire 3.7.0. As I am not a Java expert, I’m looking for advice on the tools you all use for this, specifically ways to compare two heap dumps and see what changed between them.

VisualVM seems to be pretty nice, but I don’t see a way to directly compare between two different heap dumps – this seems to be possible by profiling and taking snapshots, but I’d prefer to rely on heap dumps if possible (in my brief tests, the profiling sometimes seems to be disruptive).
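
One rough way to get a delta without profiling is to diff class histograms rather than full heap dumps. The sketch below assumes two text files saved from "jmap -histo <pid>" some hours apart, and lists the classes whose byte totals grew the most. The class name HistoDiff and the two file arguments are made up for illustration. It also assumes Java 8 or later on whichever machine runs the comparison (that need not be the server's 1.6 JRE). Note that a histogram only shows what grew, not what is holding on to it, so a heap dump is still needed to find the GC root.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class HistoDiff {
    /** Parses "jmap -histo" style text into a map of class name to total bytes. */
    public static Map<String, Long> parse(List<String> lines) {
        Map<String, Long> bytesByClass = new HashMap<>();
        for (String line : lines) {
            String[] f = line.trim().split("\\s+");
            // Data rows look like "1:  12345  6789012  [C"; the header,
            // separator and "Total" rows do not start with "<num>:".
            if (f.length < 4 || !f[0].endsWith(":")) {
                continue;
            }
            try {
                bytesByClass.put(f[3], Long.parseLong(f[2]));
            } catch (NumberFormatException e) {
                // Not a data row after all; ignore it.
            }
        }
        return bytesByClass;
    }

    /**
     * Returns (class name, byte delta) for every class in "after" whose size
     * changed, largest growth first. Classes that vanished are not reported.
     */
    public static List<Map.Entry<String, Long>> growth(Map<String, Long> before,
                                                       Map<String, Long> after) {
        List<Map.Entry<String, Long>> deltas = new ArrayList<>();
        for (Map.Entry<String, Long> e : after.entrySet()) {
            long delta = e.getValue() - before.getOrDefault(e.getKey(), 0L);
            if (delta != 0) {
                deltas.add(new AbstractMap.SimpleEntry<>(e.getKey(), Long.valueOf(delta)));
            }
        }
        deltas.sort((a, b) -> Long.compare(b.getValue(), a.getValue()));
        return deltas;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java HistoDiff <before.txt> <after.txt>");
            return;
        }
        Map<String, Long> before = parse(Files.readAllLines(Paths.get(args[0])));
        Map<String, Long> after = parse(Files.readAllLines(Paths.get(args[1])));
        List<Map.Entry<String, Long>> deltas = growth(before, after);
        // Print the 20 biggest changes, in bytes.
        for (int i = 0; i < Math.min(20, deltas.size()); i++) {
            System.out.printf("%+12d  %s%n", deltas.get(i).getValue(), deltas.get(i).getKey());
        }
    }
}
```

If a class such as [C (char[]) keeps topping this list across several pairs of histograms, that is a reasonable hint about where to start looking in the next heap dump.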

FWIW, my top memory consumer appears to be char[] objects. If I use VisualVM to find a GC root for a randomly selected one of these char[] objects, a TaskQueue object shows up as the GC root, and I see PEPService objects in its vicinity.

I’ll probably look to disable PEP at my next Openfire restart and see how this impacts my heap usage, but looking for thoughts on the above.

Openfire 3.7.0 on RHEL 5.7 running on Sun JRE 1.6.0_26. Only about 100-200 clients connected, but am eating through 1.5GB of heap space in 48 hours or so.

I have attached a 12H view of heap usage on our Openfire server. It should bear out a little more clearly after another 12H or so, but I do believe this is showing a classic memory leak with utilization slowly climbing between GC periods.
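
To put a number on the climb, one option is to note the heap-in-use figure right after each major GC and fit a straight line through those points. The sketch below is only that arithmetic. HeapTrend is a made-up name, and the sample readings in main are invented placeholders, not values read off the attached graph.

```java
class HeapTrend {
    /** Least-squares slope of heap-in-use (MB) against time (hours), in MB per hour. */
    public static double slopeMbPerHour(double[] hours, double[] usedMb) {
        int n = hours.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) {
            meanX += hours[i];
            meanY += usedMb[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0, sxx = 0;
        for (int i = 0; i < n; i++) {
            sxy += (hours[i] - meanX) * (usedMb[i] - meanY);
            sxx += (hours[i] - meanX) * (hours[i] - meanX);
        }
        return sxy / sxx;
    }

    /** Hours left until maxMb is reached at the given growth rate. */
    public static double hoursUntilFull(double currentMb, double maxMb, double slope) {
        if (slope <= 0) {
            return Double.POSITIVE_INFINITY; // not growing, so no projected exhaustion
        }
        return (maxMb - currentMb) / slope;
    }

    public static void main(String[] args) {
        // Hypothetical post-GC readings taken every three hours.
        double[] hours = {0, 3, 6, 9, 12};
        double[] usedMb = {200, 290, 395, 480, 580};
        double slope = slopeMbPerHour(hours, usedMb);
        System.out.printf("growth: %.1f MB/hour%n", slope);
        System.out.printf("hours until a 1536 MB heap is full: %.1f%n",
                hoursUntilFull(usedMb[usedMb.length - 1], 1536, slope));
    }
}
```

A post-GC floor that keeps rising at a steady rate points to a leak; a floor that levels off suggests the heap is simply still warming up.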

I also notice a relatively small number of int objects (13K, or 0.3% of the objects in the heap) taking up 13.2% of my heap space.

These int objects appear to have no references (per VisualVM), though it’s probably only the top 35-50 of them that are consuming most of the space, with the rest being fairly small.

PEP could be the reason. You probably didn’t notice this announcement on top of the forums:

Openfire versions up to and including 3.6.4 (and, it looks like, 3.7.0 too) suffer from a memory leak in the PEP component. If your Openfire server is crashing with OutOfMemoryError, you might be having this problem.

As a workaround, you can disable PEP by setting the Openfire property xmpp.pep.enabled to false.

More information can be found in this discussion: Openfire 3.6.4 memory leak with Empathy

Thanks, I’d missed that announcement! D’oh.

It actually seemed that once I set that property to false, the issue went away, even without restarting the server.

In any case, things seem more stable now. Will keep an eye on the other thread and this bug.

(xposted to two other discussions)

I figured out how to capture a memory leak report on the latest build (September 30, 2011). Memory usage was heavily concentrated in PEPService, TaskQueue, and the language modules. 490/leak_report_09302011.pdf