Openfire Server Crash

Hi All,

On Friday our Openfire server crashed. I checked the log and found an error, which I have pasted below. We run about 400 users on the service, who use both IM and group chat rooms for quick meetings. Admittedly, it has run smoothly for the past year and a half and very seldom goes offline. However, I was away on Friday, and the outage had a severe impact on users. I would like to get an idea of what caused the issue and what can be done to avoid it in the future.

I have outlined the beginning of the error in red (about halfway down). I am wondering whether it was caused by the invalid group warning below repeating so many times that Java crashed. I have corrected the group error and the logs are clean now.

Thanks

2011.09.16 15:07:34 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:34 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:34 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:42 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:43 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:49 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:50 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:58 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:07:58 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:01 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:02 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:02 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:02 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:03 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:07 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:10 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:24 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:45 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:08:47 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:08 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:15 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:18 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:35 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:35 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:35 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:39 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:40 [org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:258)

] Error trying to connect to remote server: ims(DNS lookup: ims:5269)

java.net.ConnectException: Connection refused: connect

at java.net.PlainSocketImpl.socketConnect(Native Method)

at java.net.PlainSocketImpl.doConnect(Unknown Source)

at java.net.PlainSocketImpl.connectToAddress(Unknown Source)

at java.net.PlainSocketImpl.connect(Unknown Source)

at java.net.SocksSocketImpl.connect(Unknown Source)

at java.net.Socket.connect(Unknown Source)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:253)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.authenticateDomain(LocalOutgoingServerSession.java:144)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.sendPacket(OutgoingSessionPromise.java:239)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.run(OutgoingSessionPromise.java:216)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at java.lang.Thread.run(Unknown Source)

2011.09.16 15:09:41 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:44 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:45 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:45 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:46 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:46 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:46 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:47 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:48 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:48 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:49 [org.jivesoftware.openfire.nio.ConnectionHandler.exceptionCaught(ConnectionHandler.java:110)

]

2011.09.16 15:09:49 [org.jivesoftware.openfire.nio.ConnectionHandler.exceptionCaught(ConnectionHandler.java:110)

]

java.lang.OutOfMemoryError

at java.io.RandomAccessFile.readBytes(Native Method)

at java.io.RandomAccessFile.read(Unknown Source)

at java.io.RandomAccessFile.readFully(Unknown Source)

at net.sourceforge.jtds.util.BlobBuffer.read(BlobBuffer.java:280)

at net.sourceforge.jtds.util.BlobBuffer$BlobInputStream.read(BlobBuffer.java:610)

at java.io.InputStream.read(Unknown Source)

at net.sourceforge.jtds.util.BlobBuffer.getBytes(BlobBuffer.java:966)

at net.sourceforge.jtds.jdbc.ClobImpl.getSubString(ClobImpl.java:124)

at net.sourceforge.jtds.jdbc.Support.convert(Support.java:298)

at net.sourceforge.jtds.jdbc.JtdsResultSet.getString(JtdsResultSet.java:935)

at org.jivesoftware.openfire.vcard.DefaultVCardProvider.loadVCard(DefaultVCardProvider.java:77)

at org.jivesoftware.openfire.ldap.LdapVCardProvider.loadAvatarFromDatabase(LdapVCardProvider.java:276)

at org.jivesoftware.openfire.ldap.LdapVCardProvider.loadVCard(LdapVCardProvider.java:215)

at org.jivesoftware.openfire.vcard.VCardManager.getOrLoadVCard(VCardManager.java:222)

at org.jivesoftware.openfire.vcard.VCardManager.getVCard(VCardManager.java:215)

at org.jivesoftware.openfire.handler.IQvCardHandler.handleIQ(IQvCardHandler.java:108)

at org.jivesoftware.openfire.handler.IQHandler.process(IQHandler.java:49)

at org.jivesoftware.openfire.IQRouter.handle(IQRouter.java:351)

at org.jivesoftware.openfire.IQRouter.route(IQRouter.java:101)

at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:68)

at org.jivesoftware.openfire.net.StanzaHandler.processIQ(StanzaHandler.java:319)

at org.jivesoftware.openfire.net.ClientStanzaHandler.processIQ(ClientStanzaHandler.java:79)

at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:284)

at org.jivesoftware.openfire.net.StanzaHandler.process(StanzaHandler.java:176)

at org.jivesoftware.openfire.nio.ConnectionHandler.messageReceived(ConnectionHandler.java:133)

at org.apache.mina.common.support.AbstractIoFilterChain$TailFilter.messageReceived(AbstractIoFilterChain.java:570)

at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)

at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)

at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)

at org.apache.mina.common.IoFilterAdapter.messageReceived(IoFilterAdapter.java:80)

at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)

at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)

at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)

at org.apache.mina.filter.codec.support.SimpleProtocolDecoderOutput.flush(SimpleProtocolDecoderOutput.java:58)

at org.apache.mina.filter.codec.ProtocolCodecFilter.messageReceived(ProtocolCodecFilter.java:185)

at org.apache.mina.common.support.AbstractIoFilterChain.callNextMessageReceived(AbstractIoFilterChain.java:299)

at org.apache.mina.common.support.AbstractIoFilterChain.access$1100(AbstractIoFilterChain.java:53)

at org.apache.mina.common.support.AbstractIoFilterChain$EntryImpl$1.messageReceived(AbstractIoFilterChain.java:648)

at org.apache.mina.filter.executor.ExecutorFilter.processEvent(ExecutorFilter.java:239)

at org.apache.mina.filter.executor.ExecutorFilter$ProcessEventsRunnable.run(ExecutorFilter.java:283)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at org.apache.mina.util.NamePreservingRunnable.run(NamePreservingRunnable.java:51)

at java.lang.Thread.run(Unknown Source)

2011.09.16 15:09:49 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:09:57 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:10:05 [org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:258)

] Error trying to connect to remote server: proxy.eu.jabber.org(DNS lookup: hermes.jabber.org:5269)

java.net.ConnectException: Connection timed out: connect

at java.net.PlainSocketImpl.socketConnect(Native Method)

at java.net.PlainSocketImpl.doConnect(Unknown Source)

at java.net.PlainSocketImpl.connectToAddress(Unknown Source)

at java.net.PlainSocketImpl.connect(Unknown Source)

at java.net.SocksSocketImpl.connect(Unknown Source)

at java.net.Socket.connect(Unknown Source)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:253)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.authenticateDomain(LocalOutgoingServerSession.java:144)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.sendPacket(OutgoingSessionPromise.java:239)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.run(OutgoingSessionPromise.java:216)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at java.lang.Thread.run(Unknown Source)

2011.09.16 15:10:07 [org.jivesoftware.openfire.roster.Roster.(Roster.java:179)

] Groups ([…Technology Operations]) include non-existent username (unlock_user_accounts)

2011.09.16 15:10:08 [org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:258)

] Error trying to connect to remote server: eu.jabber.org(DNS lookup: eu.jabber.org:5269)

java.net.UnknownHostException: eu.jabber.org

at java.net.PlainSocketImpl.connect(Unknown Source)

at java.net.SocksSocketImpl.connect(Unknown Source)

at java.net.Socket.connect(Unknown Source)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:253)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.authenticateDomain(LocalOutgoingServerSession.java:185)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.sendPacket(OutgoingSessionPromise.java:239)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.run(OutgoingSessionPromise.java:216)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at java.lang.Thread.run(Unknown Source)
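
To check whether the repeated group warning really dominates the log, and to see when the OutOfMemoryError shows up relative to it, the entries can be tallied with a short script. This is only a minimal Python sketch, assuming every entry starts with a timestamp in the form seen above (for example 2011.09.16 15:07:34); the function names and the three categories are illustrative choices, not anything provided by Openfire.

```python
import re
from collections import Counter

# Matches the timestamp that begins each log entry, e.g. "2011.09.16 15:07:34".
ENTRY_START = re.compile(r"^\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}:\d{2}")


def split_entries(text):
    """Group log lines into entries; an entry runs until the next timestamp."""
    entries, current = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines between pasted log lines
        if ENTRY_START.match(line):
            if current:
                entries.append(current)
            current = [line]
        elif current:
            # Continuation line (message text or stack frame) of the open entry.
            current.append(line)
    if current:
        entries.append(current)
    return entries


def summarize(text):
    """Count entries per category and collect OutOfMemoryError timestamps."""
    counts = Counter()
    oom_times = []
    for entry in split_entries(text):
        body = " ".join(entry)
        stamp = body[:19]  # the timestamp is the first 19 characters
        if "OutOfMemoryError" in body:
            counts["OutOfMemoryError"] += 1
            oom_times.append(stamp)
        elif "non-existent username" in body:
            counts["group warning"] += 1
        else:
            counts["other"] += 1
    return counts, oom_times
```

Calling summarize(open("error.log").read()) on the server's log (the file name here is an assumption) returns the counts and the OutOfMemoryError timestamps. A high count of group warnings would only show that they were frequent, not that they caused the crash; the trace above shows memory running out while a vCard was being loaded from the database.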

I am still hoping there is a fix or workaround for this issue, aside from just ensuring that no AD groups contain invalid users.

Sounds like an issue I had a couple of versions back, where Java was using up the entire system memory and causing the crash. We resolved this by modifying the “<openfire_root>/bin/openfire” file and specifying a Java heap size smaller than the maximum RAM on the server (we have 1 GB on it, so we set it to 512 MB; Openfire’s Java memory rarely goes over 60 MB now and is usually around 30-40 MB).

  1. Open “<openfire_root>/bin/openfire” and add -Xmx512m to the line that launches Java:

nohup "$app_java_home/bin/java" -Xmx512m -server -Dinstall4j.jvmDir="$app_java_home"

Thanks Dweez. I checked my openfire-service.vmoptions file and noticed my memory settings are:

-Xms256m

-Xmx512m

The server has 1 GB of memory. I presume these settings should be sufficient for that amount of system memory? I believe I had adjusted this in the past.
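
As a rough sanity check of those values against the installed RAM, the heap flags can be parsed and compared with the physical memory. The following is a minimal Python sketch, assuming the vmoptions file contains plain -Xms and -Xmx flags like the ones quoted above; the helper names are made up for illustration. Keep in mind that the JVM also uses memory outside the heap (thread stacks, class metadata, native buffers), so the real process footprint can exceed -Xmx.

```python
import re

# Multipliers for the size suffixes accepted by the -Xms and -Xmx flags.
UNITS = {"k": 1024, "m": 1024 ** 2, "g": 1024 ** 3}


def parse_heap_options(text):
    """Return a dict such as {'Xms': bytes, 'Xmx': bytes} for flags found in text."""
    found = {}
    for flag, number, unit in re.findall(r"-(Xms|Xmx)(\d+)([kKmMgG]?)", text):
        # A missing suffix means the value is already in bytes.
        found[flag] = int(number) * UNITS.get(unit.lower(), 1)
    return found


def heap_headroom(vmoptions_text, physical_ram_bytes):
    """Bytes left for the OS and non-heap JVM memory once the heap is at -Xmx."""
    options = parse_heap_options(vmoptions_text)
    return physical_ram_bytes - options["Xmx"]
```

With -Xms256m and -Xmx512m on a 1 GB machine, heap_headroom reports 512 MB left for the operating system, the database driver's native allocations, and everything else on the box.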

I wanted to add that, as a workaround, I have configured a scheduled task to restart the Openfire service every three days at 0300. It has not been long enough to know for sure whether this will at least prevent the problem, but it's a workaround for the time being.

Unfortunately our users now ‘rely’ on this system, so it's imperative that it remains reliable, especially when they start setting up conferences with outside users and contractors to whom we allow access specifically to this tool.

Sorry that didn’t help any. Our users here quickly made our IM environment a “critical service” as well.

I’m not adept at interpreting Java logs (I’m trying to learn myself, as our setup still has some lingering issues), so I’m sorry I can’t assist you more. I hope you get it figured out.

-dweez

Hey, no problem at all! I really appreciate you at least trying to help and provide some suggestions!!