Wildfire 3.0.1 Locking Up After 30-40 Min

We've been running Wildfire 3.0.0 for quite some time on a CentOS Linux server with no issues. I upgraded to 3.0.1 by doing rpm -U wildfire.blah.rpm. The server seems to work fine for about 30-40 minutes, and then people start complaining that they can't log in. I can't even access the admin webpage, though users who are already online can still send messages. I'm forced to kill the process and relaunch the server.

I'll admit I'm not going to be able to help much in figuring this out, since I was forced to downgrade back to 3.0.0, so I don't have access to log files to trace the error. I was just hoping this was an issue other people have seen.

Thanks—

I'm having similar issues. I'm thinking I should just downgrade as well, rather than try to figure out what's going on.

Hey guys,

Interesting problem. One thing I would like to know is the list of plugins that you are running. When the server is not responding, could you take a thread dump of the Java Virtual Machine so we can see what the server is doing? You can get a thread dump by executing kill -3 against the Wildfire process. You can then send those thread dumps to me by email. If nothing suspicious is found, then you may want to monitor memory consumption. To do that, just add -verbose:gc to the startup script. The information will be printed to stdout.
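The thread dump request described above can be sketched as a small shell helper. The function name request_thread_dump is hypothetical and not part of Wildfire, and the PID has to be looked up separately (for example with ps). This is only a sketch, assuming a HotSpot JVM on Linux.

```shell
# Hypothetical helper (not shipped with Wildfire): ask a JVM for a thread dump.
# Usage: request_thread_dump <pid-of-the-wildfire-java-process>
request_thread_dump() {
    pid="$1"
    if [ -z "$pid" ]; then
        echo "usage: request_thread_dump <pid>" >&2
        return 1
    fi
    # Signal 3 is SIGQUIT. A HotSpot JVM responds by printing a thread dump
    # to its stdout and keeps running. Only send this to a Java process:
    # most other programs terminate on SIGQUIT.
    kill -3 "$pid"
}
```

Taking two or three dumps a few seconds apart while the server is hung makes it easier to see which threads are stuck rather than merely busy.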

Thanks,

– Gato

Hi,

It would help to know how much memory Wildfire consumed and how much memory was free inside the JVM. One could add

INSTALL4J_ADD_VM_PARAMS="-XX:+PrintGCDetails -Xloggc:…/logs/gc.log"

to bin/wildfire to write a log of the JVM garbage collector to track this.
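The GC log covers memory inside the JVM. For the other half of the question, how much memory the Wildfire process takes from the operating system, a rough sketch using ps could look like this. The helper name process_rss_kb is hypothetical, and the PID in the commented example is a placeholder.

```shell
# Hypothetical helper (not part of Wildfire): print the resident set size,
# in KiB, of the process with the given PID.
process_rss_kb() {
    ps -o rss= -p "$1" | tr -d ' '
}

# Example: sample a PID once a minute (replace 12345 with the Wildfire PID).
# while sleep 60; do echo "$(date) $(process_rss_kb 12345)"; done
```

Comparing these samples with the GC log should show whether the process keeps growing or whether memory use is flat when the hang occurs.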

LG

Okay, thanks. I will have to upgrade the server after hours and wait for it to lock up. We're using all of the non-enterprise plugins.

At the time of these lock-ups, though, I hadn't completely configured my Asterisk server to allow Wildfire to connect. Now that I've got Asterisk's manager configured, I will try to upgrade the server and see if that was the issue.

Thanks–

I quickly upgraded back to 3.0.1 and noticed my Asterisk plugin was acting odd. When I clicked on the phone mapping link, it went to an empty frame, and then hung the server. I restarted the server, and then tried again, and the server hung again.

I had to downgrade back to 3.0.0 and the Asterisk plugin worked fine.

I will do more testing tonight, once we are closed.

Thanks–

I tried to do a kill -3 while the process was hung, but it didn't kill the process. Should it produce some kind of dump somewhere anyway?

We were having memory problems before, where Wildfire would take up all the memory on the whole server and take the server down, but I thought we had fixed them. Now Wildfire just hangs after about a day of use and you can't connect to it. However, memory use looked normal just before it hung. Let me know if there's anything else I can do.

Hey Derek,

kill -3 will not terminate the process; it sends a signal to the process so that it dumps information about the running threads in the Java Virtual Machine. The dumped information can be found in nohup.out or stdout.log, or wherever you are redirecting stdout.
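To check whether a dump actually landed in one of those files, one option is to search for the header line that HotSpot JVMs print at the start of each dump ("Full thread dump ..."). The helper name count_thread_dumps is hypothetical, and the file to search depends on how stdout is redirected on your install.

```shell
# Hypothetical helper: count the thread dumps recorded in a stdout capture file.
# HotSpot starts each dump with a line beginning "Full thread dump".
count_thread_dumps() {
    grep -c "Full thread dump" "$1"
}

# Example (path is an assumption; use wherever your stdout is redirected):
# count_thread_dumps nohup.out
```

A count of 0 after sending kill -3 suggests stdout is going somewhere else, or is being discarded by the startup script.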

Regards,

– Gato

Okay, good news: I've upgraded to 3.0.1 and have been running for about 24 hours without any lock-ups. I found a patched version of the Asterisk plugin that was supposed to fix a couple of issues. This patched JAR seems to be working.

I will let you know when I consider my server completely stable.

Link to Posting: http://www.jivesoftware.org/community/message.jspa?messageID=126416

Link to Patched JAR: http://couplet.be/temp/status_fix/asterisk-im.jar

Link to Patch Text: http://couplet.be/temp/status_fix/patch.txt