powered by Jive Software

Memory Leak in Openfire 4.6.0-beta

Running Openfire service for a long time, with clients connecting and disconnecting, I realized that the memory was increasing up to the point the garbage collector could not do anything more to release memory (the CPU load was on top!).

I got a dump and analysed in JProfile and saw that the object EntityCapabilitiesManager.allUserCapabilitiesListeners was with about 7000 listeners. The instance of this listeners was PEPService.

Looking more deeply in code, I saw the PEPService are created/loadedFromDB and inserted in PEPServiceManager.pepServices cache. In the creation/loadFromDB the call pubSubEngine.start(pepService) inserts the PEPService in the EntityCapabilitiesManager.allUserCapabilitiesListeners.

However, the PEPService is evicted from cache every 30 minutes, but it still remains in EntityCapabilitiesManager. In the next call of PEPServiceManager.getPEPService() another instance of PEPService is inserted in EntityCapabilitiesManager.allUserCapabilitiesListeners for the same user.

A solution I did was to implement equals and hashCode in the class PEPService. This because EntityCapabilitiesManager.allUserCapabilitiesListeners is a Set. In the next call PEPServiceManager.getPEPService() the old PEPService is replaced out from the set and allowed to be collected by GC.

The code:

 @Override
    public boolean equals(Object obj) {

        if (obj == this)
            return true;

        if (!(obj instanceof PEPService))
            return false;

        PEPService other = (PEPService) obj;
        return Objects.equals(serviceOwner, other.serviceOwner);
    }

    @Override
    public int hashCode() {

        int result = 17;
        if (serviceOwner != null) {
            result = 31 * result + serviceOwner.hashCode();
        }
        return result;
    }

Thanks, would you kindly submit this as a Pull Request against our github repository?

cc: @guus

This is a good find, thanks. I’ve created this issue to track the problem: https://issues.igniterealtime.org/browse/OF-2092

I wonder if we can improve on your fix. When we keep a reference to the PEP Service in the EntityCapabilitiesManager.allUserCapabilitiesListeners collection, then it can’t be garbage collected, and will remain in memory. That makes the eviction policy in that cache rather pointless, doesn’t it? Maybe we should turn the cache into an infinite cache?

One thing to note is that we then should probably add some code to clean up PEP service instances when they’re no longer used (when the owner drops offline, maybe? Or some kind of reaper that checks for activity?)

Totally agree with you.

The fact the listeners to be garbage collected is a wrong in my opinion.
The logic of PEPService as a listener only makes sense for online users (connected or detached).

So I think the PEPService cache should be infinity (consequently not purgeable) and that the PEPService should be removed from cache and EntityCapabilitiesManager.allUserCapabilitiesListeners once the user is completely offline.

My fix with equals and hashCode only fixes the leak, but the logic still remains bad.

Haven given this more thought: I don’t think that it’s really appropriate to keep a reference to PEPService in EntityCapabilitiesManager at all. It is not true that when its owner drops offline, a PEPService instance goes unused. Subscribers to the service (typically: subscribers to the presence of its owner) might still use it.

I’m working on a change in the implementation, that removes the EntityCapabilitiesListener interface from the PEPService implementation. I’m thinking of moving it to PEPServiceManager, which can then delegate to the appropriate PEPService instances. As far as the PEP functionality is concerned, this replaces many listeners with just one. Almost as a byproduct, this will prevent the memory leak that started all of this.

A change as described in my last comment has been merged, and will be part of the 4.6.0 release. That should resolve the memory leak.

With 4.6.0 now released, are you able to verify the leak is now gone?