Nagios monitoring? (memory leak?)

Has anyone implemented Nagios monitoring of Wildfire? I am doing a simple TCP check of port 5222, but this seems to create a memory leak due to the connections (very similar to JM-558?) and writes the following to the error log:

2006.02.22 09:41:24 [org.jivesoftware.wildfire.net.SocketReader.run(SocketReader.java:159)
] Connection closed before session established
Socket[addr=/xx.xx.xx.xx,port=54068,localport=5222]

My two questions:

  1. Is there a simple string I can send on port 5222 to check if the service is alive without throwing an error? I looked through the Smack code, but I am no programmer.

  2. Does the memory leak in JM-558 still exist? Over a period of a few days the RAM usage kept growing until eventually no clients could connect and the admin page was throwing a heap error. Nagios creates a socket connection every minute.

Much thanks!

Hey Aaron,

Which version of Wildfire are you running? Have you tried increasing the max memory of the JVM and are you still having the memory problem?

Regards,

– Gato

Hi Aaron,

This really does cause a memory leak. Connecting to port 5222 every second makes it visible very quickly.

  1. Do it like this, probably every 10 minutes as long as this bug exists.

  2. This could be the case.

LG

Which version of Wildfire are you running? Have you tried increasing the max memory of the JVM and are you still having the memory problem?

Wildfire version 2.5.0 on Windows 2003 Enterprise with 2GB RAM. I have not adjusted the memory, but it looks like 493.06 MB is the current max. With 20 clients, usage usually hovers around 50 MB when Nagios isn't checking it. Increasing the max would probably delay the problem, but probably not fix it.

Something else of note: There was an incompatibility with Gaim a while back which was causing clients to time out. The recommended fix was to set xmpp.client.idle to a very high number. Would this affect these connections?

  1. Do it like this, probably every 10 minutes as long as this bug exists.

Was there some XML in there that got stripped out?

Message was edited by: azink

Hi Aaron,

A simple TCP port check is the right way to check if a server is still available.

If Nagios supports more than this, you may send

(4. wait for close)

Maybe it is also possible to send 1+3 together in one packet without waiting for a reply.
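For readers who want to see what a plain TCP port check amounts to, here is a minimal Python sketch. The function name `tcp_port_open` and the default timeout are assumptions made for illustration, not anything Nagios or Wildfire ships. It only opens and closes a connection, so against an affected Wildfire build it would presumably trigger the same "Connection closed before session established" log entry quoted earlier in the thread.

```python
import socket


def tcp_port_open(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and connects; the `with` block
        # closes the socket again as soon as the connection has been made.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

Calling `tcp_port_open("localhost", 5222)` (the host name is a placeholder) returns `True` when something is listening on the client port and `False` otherwise.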

LG

Thanks, this helped a lot. I've set Nagios to only check every 5 minutes until the memory leak can be solved.

Hey guys,

The problem JM-573 was fixed last Thursday. You can use the latest nightly build, which includes the bug fix.

Thanks,

– Gato

For those interested, here's what I'm using to monitor Wildfire with Nagios. Rather than just checking port 5222 for a connection, it sends a version request and looks for the string "not-authorized". With the versatility of the check_tcp program, you could work up more complex login routines to check the Jabber server even further, but this at least makes sure the server is alive and talking XMPP. (Thanks to it2000 for providing the base.)

define command {
    command_name check_tcp_value
    command_line libexec/check_tcp -H $HOSTADDRESS$ -p $ARG1$ -s $ARG2$ -e $ARG3$ -M crit -j
}

define service {
    (clip)
    check_command check_tcp_value!5222!""!"not-authorized"
}
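To make the send/expect idea behind `check_tcp -s ... -e ...` concrete, here is a rough Python equivalent. It is a sketch under stated assumptions: `send_expect` is a hypothetical helper name, and the payload to send is left as a parameter because the string used in the command above is not shown.

```python
import socket


def send_expect(host: str, port: int, send: bytes, expect: bytes,
                timeout: float = 5.0) -> bool:
    """Connect, send `send`, and return True if `expect` shows up in the reply."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.settimeout(timeout)
            sock.sendall(send)
            received = b""
            # Keep reading until the expected text arrives or the peer closes.
            while expect not in received:
                chunk = sock.recv(4096)
                if not chunk:  # empty read: the server closed the connection
                    break
                received += chunk
            return expect in received
    except OSError:
        # Connection refused, timed out, or reset: treat as a failed check.
        return False
```

A wrapper script could call `send_expect(host, 5222, payload, b"not-authorized")` and map the result to Nagios exit codes (0 for OK, 2 for CRITICAL); `host` and `payload` here are placeholders to be filled in by whoever adapts the sketch.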

Message was edited by: azink