100% CPU usage and very high load average (Linux)

Hello,

We have a serious problem with Wildfire and also with the new version, Openfire 3.3.0, on Linux (Debian and Gentoo). On Gentoo we're using Java 1.5.11 and on Debian 1.6.1 (6u1).

CPU usage has been at 100% for hours now and the load average shows 3.30, 3.77, 3.77. The web interface is not responding on ports 9090 and 9091. The logs do not show anything critical; I've tailed them while the CPU usage is extremely high. Connecting to the Jabber service still works, but the system is very laggy.

On Wildfire we're using the pyicq and pymsn transports.

On Openfire we're using the IM Gateway plugin, the Broadcast plugin and the Search extension.

The problem has existed for 6 months now, and unfortunately nothing has changed with 3.3.0.

I hope you have an idea.

Kind regards,

Claus

EDIT: Connecting with a client is no longer possible. No answer from Jabber.

Message was edited by: Cloonix

Cloonix,

In this situation, thread dumps would be extremely helpful for determining which part of Openfire is consuming all the CPU. Also, if you could take some thread dumps while the machine is behaving normally, they would be good for comparison with the high-load thread dumps.
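For anyone following along: on a HotSpot JVM under Linux, sending SIGQUIT to the Java process (`kill -3 <pid>`) makes it print a thread dump to its standard output; where that output ends up depends on how Openfire was started. Comparing a normal dump with a high-load dump could be sketched roughly as below. This is only an illustration, assuming the usual dump layout in which each thread begins with a line holding its quoted name and each stack frame is an indented `at ...` line. The function names are made up for this example and are not part of Openfire or the JDK.

```python
import re
from collections import Counter

# Assumed layout: a thread entry starts with its quoted name at column 0.
HEADER = re.compile(r'^"(?P<name>[^"]+)"')
# Assumed layout: stack frames are indented lines like "at pkg.Class.method(...)".
FRAME = re.compile(r'^\s+at\s+(?P<frame>[^(\s]+)')


def top_frames(dump_text):
    """Map each thread name to its topmost stack frame (None if it has no stack).

    Threads that share a name overwrite each other, which is acceptable
    for a rough comparison.
    """
    result = {}
    current = None
    for line in dump_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group("name")
            result[current] = None
            continue
        frame = FRAME.match(line)
        # Only the first frame after a header is the top of that thread's stack.
        if frame and current is not None and result[current] is None:
            result[current] = frame.group("frame")
    return result


def frame_counts(dump_text):
    """Count how many threads are sitting in each top frame."""
    return Counter(f for f in top_frames(dump_text).values() if f)


def suspicious_frames(normal_dump, busy_dump):
    """Top frames seen in more threads under load than in the normal dump."""
    normal = frame_counts(normal_dump)
    busy = frame_counts(busy_dump)
    return {f: n - normal[f] for f, n in busy.items() if n > normal[f]}
```

With the two dumps saved to files (say the hypothetical `normal.txt` and `busy.txt`), `suspicious_frames(open("normal.txt").read(), open("busy.txt").read())` returns the methods that more threads are executing during the high-load period. A frame that keeps showing up across several high-load dumps is a reasonable first place to look, though it is a hint rather than proof.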

Off the top of my head, I've heard of people experiencing a pegged CPU while using transports. You might have some luck searching the issue tracker for problems related to this:

http://www.igniterealtime.org/issues/secure/Dashboard.jspa

Cheers,

Nate

Hi,

This could be a resource problem.

How many users were connected, and how much memory did you allow Openfire to use?

Does a restart fix this problem, and if yes, for how long?

LG

PS: I wonder why you waited 6 months to post that you have a problem.

Hello,

Only 10 users. It is a virtual machine with 256MB of RAM.

A restart fixes it, but only for 24-72 hours.

We will do a thread dump as natep mentioned.

We waited because we thought the next version would fix it. We also tried to locate the problem on our own.

Thank you,

Claus

I think the keyword is “virtual machine”. That may be why it can't handle the load well. Check its settings for processes and such.

I don't know if the virtual machine has much to do with it. We're running it in-house on FreeBSD 6.1-REL inside VMware ESX and it has been running for months without any hiccups whatsoever. Processor loads show all zeroes all the time.

Well, the OP didn't specify what kind of “virtual machine”. If it's a VPS, I have had the experience of one not being able to handle Java and Wildfire…

It's a User Mode Linux (UML) on kernel 2.6.19. At the moment we are watching the Gentoo system mentioned above. We've disabled the built-in IM Gateway, and Wildfire has now been running for 3 days with pyicq and pymsn.

After we started the service three days ago, the load average and CPU usage were high for 24 hours. Now everything is clean.

We have not looked further into the problem with Openfire and the IM Gateway for now, but we will.

I had a similar problem with the Yahoo transport in the IM Gateway. I'm using Ubuntu in a VM. Try disabling it and see if all is happy.