Openfire 10.0.0 and 10.0.1 beta stop working after about 1-2 days on Linux

Hi,

Since I upgraded to Openfire 10.0.0, Openfire keeps hanging on Gentoo Linux. Every time, I have to kill it with a “kill -9” command, because it does not react to the normal start/stop script or to a normal kill command.

When I restart it, everything runs fine for maybe 1-2 days, then the same problem occurs again. I already increased the memory size in the java command line in the init script from 256M to 512M; it now seems to run a bit longer (before that, it died in less than 1 day).
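
One way to check whether the process really grows over time is to sample its resident memory at intervals and compare the values. A minimal sketch, assuming a Linux /proc filesystem; the function names and the sample text are illustrative, and resident memory is only a rough proxy for the Java heap:

```python
import re
from typing import Optional

def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Return the VmRSS value in kB from the text of /proc/<pid>/status, or None."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None

def read_rss_kb(pid: int) -> Optional[int]:
    """Read the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as f:
        return parse_vm_rss_kb(f.read())

# Made-up sample in the layout of /proc/<pid>/status, for illustration only.
SAMPLE = "Name:\tjava\nVmPeak:\t 1200000 kB\nVmRSS:\t  612345 kB\nThreads:\t87\n"
```

Calling read_rss_kb with the pid of the Openfire java process every few minutes and writing the result to a file would show whether memory climbs steadily before the hang.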

Upgrading to the latest 10.0.1 beta didn’t solve the problem either; I still have this issue.

To me it looks like a memory leak. Is it a known bug, and when will it be fixed? If not, how can I provide feedback to the devs, and where should I look for the cause when Openfire is dead and I have no access to the admin console either? It’s a bit frustrating, as Openfire ran just fine with pre-10.x versions.

The log file also has some strange errors. It seems that Openfire has problems accessing the database for some reason, but why does this start only after 1-2 days? It also seems to retry over and over again and at some point just stops working (maybe there’s a memory leak in this retry loop), or maybe it stops working because it cannot access the database. These are the last few lines from the error.log file; as you can see, the last message is nearly 24 hours old now. I left it running on purpose to see whether it does anything anymore, but it seems it’s just dead.

The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2015.06.09 07:54:46 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_outgoing
java.io.FileNotFoundException: Could not open muc_traffic [non existent]
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:54:49 org.jivesoftware.openfire.reporting.stats.RrdSqlBackend - Error while accessing information in database: java.sql.SQLException: ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2015.06.09 07:54:49 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic server_sessions
java.io.FileNotFoundException: Could not open server_sessions [non existent]
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:54:50 org.jivesoftware.openfire.plugin.gojara.database.DatabaseManager - java.sql.SQLException: ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2015.06.09 07:54:50 org.jivesoftware.xmpp.workgroup.search.ChatSearchManager - ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
java.sql.SQLException: ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
        at org.jivesoftware.database.DbConnectionManager.getConnection(DbConnectionManager.java:152)
        at org.jivesoftware.xmpp.workgroup.search.ChatSearchManager.rebuildIndex(ChatSearchManager.java:717)
        at org.jivesoftware.xmpp.workgroup.search.ChatSearchManager.rebuildIndex(ChatSearchManager.java:454)
        at org.jivesoftware.xmpp.workgroup.search.ChatSearchManager.updateIndex(ChatSearchManager.java:472)
        at org.jivesoftware.xmpp.workgroup.WorkgroupManager$5.run(WorkgroupManager.java:564)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:54:51 org.jivesoftware.openfire.reporting.stats.RrdSqlBackend - Error while accessing information in database: java.sql.SQLException: ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2015.06.09 07:54:51 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_rooms
java.io.FileNotFoundException: Could not open muc_rooms [non existent]
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:54:54 org.jivesoftware.openfire.reporting.stats.RrdSqlBackend - Error while accessing information in database: java.sql.SQLException: ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2015.06.09 07:54:54 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic conversations
java.io.FileNotFoundException: Could not open conversations [non existent]
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:54:57 org.jivesoftware.openfire.reporting.stats.RrdSqlBackend - Error while accessing information in database: java.sql.SQLException: ConnectionManager.getConnection() failed to obtain a connection after 11 retries. The exception from the last attempt is as follows: com.mysql.jdbc.exceptions.jdbc4.CommunicationsException: Communications link failure The last packet sent successfully to the server was 0 milliseconds ago. The driver has not received any packets from the server.
2015.06.09 07:54:57 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_users
java.io.FileNotFoundException: Could not open muc_users [non existent]
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jrobin.core.RrdDb.<init>(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:362)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:55:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_outgoing
org.jrobin.core.RrdException: Bad sample timestamp 1433829300. Last update time was 1433829300, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:55:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic sessions
org.jrobin.core.RrdException: Bad sample timestamp 1433829300. Last update time was 1433829300, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:55:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic packet_count
org.jrobin.core.RrdException: Bad sample timestamp 1433829300. Last update time was 1433829300, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:55:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic proxyTransferRate
org.jrobin.core.RrdException: Bad sample timestamp 1433829300. Last update time was 1433829300, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 07:55:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_occupants
org.jrobin.core.RrdException: Bad sample timestamp 1433829300. Last update time was 1433829300, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 11:33:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic proxyTransferRate
org.jrobin.core.RrdException: Bad sample timestamp 1433842380. Last update time was 1433842380, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
2015.06.09 11:33:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_occupants
org.jrobin.core.RrdException: Bad sample timestamp 1433842380. Last update time was 1433842380, at least one second step is required
        at org.jrobin.core.RrdDb.store(Unknown Source)
        at org.jrobin.core.Sample.update(Unknown Source)
        at org.jivesoftware.openfire.reporting.stats.StatsEngine$SampleTask.run(StatsEngine.java:395)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
        at java.util.concurrent.FutureTask.run(FutureTask.java:262)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
        at java.lang.Thread.run(Thread.java:745)
[root@shadowstux ~]$
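
To see which component produces most of these errors and when the last entry was written, the log can be tallied per logger. A minimal sketch, assuming every entry starts with a "date time logger - message" line as in the excerpt above; the function name is illustrative and the sample lines are copied (one of them shortened) from that excerpt:

```python
import re
from collections import Counter

# Matches the first line of an entry, e.g.
# "2015.06.09 07:54:46 org.jivesoftware...StatsEngine - Error sampling ..."
ENTRY = re.compile(r"^(\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}) (\S+) - ")

def count_by_logger(lines):
    """Count entries per logger class and return the time of the last entry."""
    counts = Counter()
    last_time = None
    for line in lines:
        m = ENTRY.match(line)
        if m:  # stack trace and exception lines do not match and are skipped
            last_time = m.group(1)
            counts[m.group(2).rsplit(".", 1)[-1]] += 1
    return counts, last_time

SAMPLE = [
    "2015.06.09 07:54:46 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_outgoing",
    "java.io.FileNotFoundException: Could not open muc_traffic [non existent]",
    "        at org.jrobin.core.RrdDb.<init>(Unknown Source)",
    # message shortened for the example
    "2015.06.09 07:54:49 org.jivesoftware.openfire.reporting.stats.RrdSqlBackend - Error while accessing information in database",
    "2015.06.09 11:33:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Error sampling for statistic muc_occupants",
]

counts, last_time = count_by_logger(SAMPLE)
print(counts.most_common(), last_time)
```

Passing an open file object for the real error.log instead of SAMPLE works the same way, since the function only iterates over lines.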

The very last log entries I could find were in warn.log:

2015.06.09 11:33:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Sample time of 1433842380 for statistic proxyTransferRate is invalid.

2015.06.09 11:33:43 org.jivesoftware.openfire.reporting.stats.StatsEngine - Sample time of 1433842380 for statistic muc_occupants is invalid.

which correlates with the error.log. Apparently the server was dead after 11:33:43, for reasons totally unclear to me.
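
The sample times in these messages are Unix epoch seconds, so they can be compared directly with the log timestamps. A minimal sketch of that conversion; the UTC+2 offset is an assumption about the server's time zone (it is not stated in the logs), and the function names are illustrative:

```python
from datetime import datetime, timedelta, timezone

def sample_time_utc(epoch_seconds: int) -> str:
    """Format an epoch value from the log as a UTC date and time."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).strftime(
        "%Y-%m-%d %H:%M:%S"
    )

def log_minute_as_epoch(log_time: str, utc_offset_hours: int) -> int:
    """Convert a log timestamp such as '2015.06.09 11:33:43' to epoch seconds,
    truncated to the whole minute. utc_offset_hours is an assumed server offset."""
    local = datetime.strptime(log_time, "%Y.%m.%d %H:%M:%S").replace(
        second=0, tzinfo=timezone(timedelta(hours=utc_offset_hours))
    )
    return int(local.timestamp())

print(sample_time_utc(1433829300))                    # 2015-06-09 05:55:00
print(sample_time_utc(1433842380))                    # 2015-06-09 09:33:00
print(log_minute_as_epoch("2015.06.09 11:33:43", 2))  # 1433842380
```

Under that assumption, both rejected sample times are simply the log time truncated to the full minute (07:55 and 11:33 local time), which matches the pairing of warn.log and error.log entries described above.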