Timer Threads never killed

Hello there,

Openfire 3.6.3 (563 subscriptions)

IMGateway 1.2.4d (MSN, AIM, Yahoo enabled)

After a few days of monitoring, I've noticed that thread dumps of the Openfire JVM show Timer threads that never seem to be cleaned up.

"Timer-10217" daemon prio=10 tid=0x0a550400 nid=0xaef in Object.wait() [0xc6848000…0xc6848f30]
   java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        at java.util.TimerThread.mainLoop(Unknown Source)
        - locked <0xed3d33e8> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Unknown Source)

I monitored for 6 days, and daily thread dumps produced:

43 Timers on the first day

383 … second day

635 … third day

765 … fourth day

1672 … fifth day

1782 … sixth day

I have confirmed that the timers are indeed accumulating, i.e. all timers that existed on day n still exist on day n + m. Total thread counts in these dumps show that 97% are Timer threads.
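For anyone who wants to repeat the day-to-day comparison, here is a minimal sketch of how it could be automated. It assumes the dumps were saved as plain text files from the same JVM run, and that the leaked timers carry the JDK default name "Timer-<n>" as in the dump above; the class and method names are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TimerDumpDiff {
    // Thread header lines in a dump start with the quoted thread name.
    // Assumption: the timers of interest use the default name "Timer-<n>".
    private static final Pattern HEADER = Pattern.compile("^\"(Timer-\\d+)\"");

    // Collects the names of all Timer threads found in one dump.
    static Set<String> timerNames(List<String> dumpLines) {
        Set<String> names = new TreeSet<>();
        for (String line : dumpLines) {
            Matcher m = HEADER.matcher(line);
            if (m.find()) {
                names.add(m.group(1));
            }
        }
        return names;
    }

    // Returns the timers present in the earlier dump but absent from the later one.
    // An empty result means every earlier timer is still alive.
    static Set<String> missingFrom(Set<String> earlier, Set<String> later) {
        Set<String> gone = new TreeSet<>(earlier);
        gone.removeAll(later);
        return gone;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 2) {
            System.err.println("usage: java TimerDumpDiff <earlier-dump> <later-dump>");
            return;
        }
        Set<String> earlier = timerNames(Files.readAllLines(Paths.get(args[0])));
        Set<String> later = timerNames(Files.readAllLines(Paths.get(args[1])));
        System.out.println("timers in earlier dump: " + earlier.size());
        System.out.println("timers in later dump:   " + later.size());
        System.out.println("gone since earlier dump: " + missingFrom(earlier, later));
    }
}
```

Timer serial numbers restart at 0 when the JVM restarts, so the comparison only makes sense between dumps taken without a restart in between.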

The target is an internal test machine, so there are seldom more than 3-4 concurrent users and only about 30 or so active accounts, used sparingly. We use the gateway with the above networks enabled, plus group chat.

I'm not an Openfire or Gateway developer / expert, so I have not yet isolated which part of the code or which action is responsible for this. From looking through the code, I can see that the gateway and Openfire sources contain a few Timers, but there is also a host of imported libraries, so the culprit may lie in there instead.

In isolated testing, I exercised the gateway in the way we typically use it, but have not been able to reproduce the problem.

Any info or suggestions much appreciated,

M

About the IM Gateway: you can try its newest version, which is now called Kraken, and maybe ask in its forums: http://kraken.blathersource.org/

Hey Wroot,

Thanks for that, it's a good point, though I should have referenced the (same) thread I raised at Kraken.

http://kraken.blathersource.org/node/197

The reason why the upgrade wasn't done is mentioned in there.

Cheers,

M

So I did a little experiment with the Timer class. The thread state in the stack trace (TIMED_WAITING in the dump above) gives an indication of the Timer's state. A Timer that has no tasks scheduled dumps like this:

"Marks Timer" daemon prio=10 tid=0xb5b17400 nid=0x7f5d in Object.wait() [0xb55e6000…0xb55e6fb0]
   java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(Native Method)
        - waiting on <0x8c10e6f0> (a java.util.TaskQueue)
        at java.lang.Object.wait(Object.java:485)
        at java.util.TimerThread.mainLoop(Timer.java:483)
        - locked <0x8c10e6f0> (a java.util.TaskQueue)
        at java.util.TimerThread.run(Timer.java:462)

Here TIMED_WAITING indicates that the Timer still has at least one TimerTask scheduled for execution (single or repeating), while WAITING indicates that it has no tasks scheduled.
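A small sketch of that kind of experiment, in case anyone wants to repeat it. The timer names and the 60-second period are arbitrary choices for illustration, not anything taken from Openfire or the gateway.

```java
import java.util.Timer;
import java.util.TimerTask;

class TimerStateDemo {
    // Returns the state of the first live thread with the given name, or null if none.
    static Thread.State stateOf(String name) {
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.getName().equals(name)) {
                return t.getState();
            }
        }
        return null;
    }

    // Creates one idle timer and one timer with a repeating task,
    // then reports the state of each timer thread: {idle, busy}.
    static Thread.State[] observe() {
        Timer idle = new Timer("demo-idle-timer", true);
        Timer busy = new Timer("demo-busy-timer", true);
        // Repeating no-op task: first run in 60 s, then every 60 s.
        busy.schedule(new TimerTask() {
            @Override
            public void run() {
                // intentionally empty
            }
        }, 60_000L, 60_000L);
        try {
            // Give both timer threads time to reach their wait() call.
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        Thread.State[] states = {
            stateOf("demo-idle-timer"),
            stateOf("demo-busy-timer")
        };
        idle.cancel();
        busy.cancel();
        return states;
    }

    public static void main(String[] args) {
        Thread.State[] states = observe();
        System.out.println("idle timer thread: " + states[0]);
        System.out.println("busy timer thread: " + states[1]);
    }
}
```

The idle timer thread blocks in an untimed wait() on its task queue, while the one with a pending task waits with a timeout until the next execution time, which is where the two different states come from.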

Given that these threads hang around for days, I presume we are looking for a Timer that has task(s) scheduled for repeated execution.

Also, I should mention that CPU usage is about 5%, so I guess the tasks either are never run or do very little.
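To make the suspicion concrete, here is a purely hypothetical sketch of the kind of pattern that would produce this symptom: one Timer per session with a repeating task, and a logout path that never calls cancel(). The Session class is invented for illustration and is not taken from the Openfire or gateway source.

```java
import java.util.Timer;
import java.util.TimerTask;

class TimerLeakSketch {
    // Hypothetical per-user session that owns its own Timer.
    static class Session {
        private final Timer keepAlive;

        Session(String timerName) {
            keepAlive = new Timer(timerName, true);
            // Repeating no-op task standing in for a keep-alive ping.
            keepAlive.schedule(new TimerTask() {
                @Override
                public void run() {
                    // intentionally empty
                }
            }, 60_000L, 60_000L);
        }

        // Leaky logout: the Timer is dropped without cancel(),
        // so its thread keeps waiting for the repeating task.
        void logoutLeaky() {
        }

        // Clean logout: cancel() empties the queue and lets the thread exit.
        void logoutClean() {
            keepAlive.cancel();
        }
    }

    // Counts live threads whose name starts with the given prefix.
    static int liveThreads(String prefix) {
        int count = 0;
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            if (t.isAlive() && t.getName().startsWith(prefix)) {
                count++;
            }
        }
        return count;
    }

    // Logs 5 sessions out each way and returns {leaky survivors, clean survivors}.
    static int[] run(String tag) {
        for (int i = 0; i < 5; i++) {
            new Session(tag + "-leaky-" + i).logoutLeaky();
            new Session(tag + "-clean-" + i).logoutClean();
        }
        try {
            // Give the cancelled timer threads time to terminate.
            Thread.sleep(500);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new int[] {
            liveThreads(tag + "-leaky-"),
            liveThreads(tag + "-clean-")
        };
    }

    public static void main(String[] args) {
        int[] survivors = run("demo");
        System.out.println("threads left after leaky logout: " + survivors[0]);
        System.out.println("threads left after clean logout: " + survivors[1]);
    }
}
```

Per the Timer Javadoc, a timer thread only terminates on its own once the last reference to the Timer is gone and all outstanding tasks have completed, so a repeating task keeps the thread alive indefinitely unless cancel() is called.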

Update: I reckon this is the gateway, in particular the AIM transport.

See http://kraken.blathersource.org/node/197 for more details

Related: http://www.igniterealtime.org/community/message/196555

I ran a load test overnight against OF, which logged 4 AIM and 2 MSN users in, sent a couple of messages, then logged out. The interval was 4 minutes, so I got about 230 runs in. The OF server's baseline stats were:

107MB physical memory used; 341MB virtual memory; 61 total threads; 11 timer threads.

after load test:

143MB physical memory used; 462MB virtual memory; 464 total threads; 409 timer threads.

Tonight's run will exclude MSN, and after that I will upgrade from 1.2.4d to K1.1.2.

Let me know if you would be keen to run this simple load test and I will post a runnable jar.

(in the graph attached, the x-axis is hours)