Slow roster loading

Ever since upgrading to Openfire 3.7.0, our users' rosters take about 15 minutes to show after logging on. If they log on earlier in the day, before 30+ users are on the server, the roster appears instantly. How can we track down the cause of this performance issue?

How many users do you have and how much memory on the server?

Also, read this and maybe disable the PEP service:

Openfire versions up to and including 3.6.4 (and apparently 3.7.0 too) suffer from a memory leak in the PEP component. If your Openfire domain is crashing with OutOfMemoryError, you might be having this problem.

As a workaround, you can disable PEP by setting the Openfire property xmpp.pep.enabled to false.

More information can be found in this discussion: Openfire 3.6.4 memory leak with Empathy

We have at most 40 users on the system throughout the day. The issue doesn’t occur when there are fewer than 25 or so. The server currently has 512 MB assigned to it, but it’s a VM, so I will reboot it and bump it to 1GB. The following issues are occurring.

  1. Slow logon during peak business hours. It takes approximately 15 minutes for the roster to load, and if you log off, the user stays active in the console for 5 minutes.

  2. Users cannot connect to chat rooms (conference rooms) unless they are invited

  3. Lag when messaging users. Sometimes a message takes about 30 seconds to be delivered.

Also, xmpp.pep.enabled is not in my system properties.

I set xmpp.pep.enabled = false, upgraded the VM to 1GB of memory and 2 processors, and the problem still occurs.

What’s in your error log? Any out of memory errors? Though I doubt this is a memory-related issue. Looks more like some network issue. Maybe there are some other VMs on that server taking all the bandwidth?

Unfortunately the logs aren’t being written, so I can’t tell what errors may be occurring. If someone knows how to fix that, I’d appreciate that as well. As for bandwidth, there are no resource issues for the VM.

I’ve since built another Jabber server and experienced the same issue on it. We have traced the issue to a specific user logging onto the server. His account has an odd “testaccount” object associated with it, and he happens to also be the first user alphabetically in the list. I’ve verified that when the user is not signed on, people log on and instantaneously receive a roster and conferences tab. If the user is on, it takes about 10 minutes.

Where do you see that “testaccount” object? Can you delete and create that user again? Maybe this will remove all the odd objects.

This is the error we were experiencing.

2011.08.09 16:29:17 Groups ([PitSparkUsers]) include non-existent username (testandy)

One of our AD administrators removed the above-mentioned object, and then I cleared the roster and vCard caches on the server. I haven’t seen the problem recur since clearing the caches.
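
For anyone hitting the same thing, a small script along these lines can scan the error log for this message and list which groups reference missing usernames. It is only a rough sketch: the function name find_dangling_members is my own, and the pattern is based solely on the single log line quoted above, so the wording may differ in other Openfire versions. The pattern accepts the message with or without a space between "include" and "non-existent".

```python
import re
from collections import defaultdict

# Pattern derived from the one log line in this thread (an assumption about
# the general format). \s* allows an optional space after "include".
PATTERN = re.compile(
    r"Groups \(\[(?P<groups>[^\]]*)\]\) include\s*non-existent username "
    r"\((?P<user>[^)]+)\)"
)


def find_dangling_members(lines):
    """Map each non-existent username to the set of groups that reference it."""
    dangling = defaultdict(set)
    for line in lines:
        match = PATTERN.search(line)
        if match is None:
            continue
        # The bracketed list may hold several comma-separated group names.
        groups = [g.strip() for g in match.group("groups").split(",") if g.strip()]
        dangling[match.group("user")].update(groups)
    return dict(dangling)


if __name__ == "__main__":
    sample = [
        "2011.08.09 16:29:17 Groups ([PitSparkUsers]) includenon-existent username (testandy)",
        "2011.08.09 16:30:02 Groups ([PitSparkUsers]) include non-existent username (testandy)",
    ]
    for user, groups in find_dangling_members(sample).items():
        print(user, sorted(groups))
```

In practice you would pass it the lines of your own error log file instead of the sample list, then have your directory administrator remove or fix each username it reports and clear the roster cache afterwards.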