Users keep going offline

I’ve been having a problem for about 3 days.

I had Openfire 3.6.0, and users were going offline after being connected to the server for a while. They appear connected under Sessions in the Openfire Console, but when I open their user properties they appear offline. If a user reconnects to the server, he comes online and sees the others as offline. I’ve tried the plugin that corrects a bug in Openfire 3.6.0 (http://www.igniterealtime.org/community/message/178993), but it didn’t work. Yesterday I also added the xmpp.client.idle property and set it to -1. For about a day it appeared to work well.

Today I realized that it was happening again… I tried to downgrade to Openfire 3.5.2, but installing the Monitoring plugin didn’t work (it wasn’t able to create its table in the database), so I decided to use Openfire 3.6.0a. After installing it and using the plugin, everything worked well for about an hour, but now the problem is happening again.

I’ve set xmpp.client.idle to -1 again (about 5 minutes ago) to see if this gets better, but I would like to know if anyone has ever experienced this kind of trouble.

Thanks in advance (and sorry for the poor English).

Here is what I found strange in the logs (from today’s fresh install).

In error.log (this happened the very first time I started the server):

2008.09.12 12:15:26 [org.jivesoftware.util.log.util.CommonsLogFactory$1.error(CommonsLogFactory.java:88)] Line=19 The content of element type “dwr” must match “(init?,allow?,signatures?)”.
2008.09.12 12:20:30 [org.jivesoftware.util.log.util.CommonsLogFactory$1.error(CommonsLogFactory.java:88)] Line=19 The content of element type “dwr” must match “(init?,allow?,signatures?)”.
2008.09.12 13:01:26 [org.jivesoftware.util.log.util.CommonsLogFactory$1.error(CommonsLogFactory.java:88)] Line=19 The content of element type “dwr” must match “(init?,allow?,signatures?)”.

In warning.log:

2008.09.12 12:50:24 Error or result packet could not be delivered


In debug.log:

2008.09.12 13:04:23 JettyLog: EXCEPTION
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcher.read0(Native Method)
at sun.nio.ch.SocketDispatcher.read(Unknown Source)
at sun.nio.ch.IOUtil.readIntoNativeBuffer(Unknown Source)
at sun.nio.ch.IOUtil.read(Unknown Source)
at sun.nio.ch.SocketChannelImpl.read(Unknown Source)
at org.mortbay.io.nio.ChannelEndPoint.fill(ChannelEndPoint.java:122)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:282)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:205)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)
2008.09.12 13:04:23 JettyLog: EOF
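Since the entries above are spread over three files, it may help to merge them into one timeline and compare it with the moments users drop. The following is only a minimal sketch: it assumes every entry starts with the `YYYY.MM.DD HH:MM:SS` stamp seen in these excerpts, and the function names `parse_entry` and `merge_logs` are made up for illustration, not part of Openfire.

```python
from datetime import datetime

# Timestamp layout seen in the excerpts above, e.g. "2008.09.12 12:50:24".
STAMP_FORMAT = "%Y.%m.%d %H:%M:%S"
STAMP_LENGTH = 19  # length of "2008.09.12 12:50:24"


def parse_entry(line):
    """Return (timestamp, message) for a log line, or None for other lines."""
    try:
        stamp = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
    except ValueError:
        return None  # e.g. a stack-trace line starting with "at ..."
    return stamp, line[STAMP_LENGTH:].strip()


def merge_logs(named_lines):
    """Merge {log_name: [lines]} into one list sorted by timestamp."""
    entries = []
    for name, lines in named_lines.items():
        for line in lines:
            parsed = parse_entry(line)
            if parsed is not None:
                entries.append((parsed[0], name, parsed[1]))
    return sorted(entries)
```

Feeding it the lines of error.log, warning.log and debug.log (read with `open(...).read().splitlines()`) gives a single ordered list of `(timestamp, log_name, message)` tuples; stack-trace lines are skipped rather than attached to their entry.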

Since yesterday the server has been running without a problem, but only with a maximum of 3 or 4 users for most of the time (I had 10 connected for an hour). If anyone has ever seen anything related to my issue, please tell me, because on Monday the server will start working with many more users…

(I had added the xmpp.client.idle property, btw.)

It appears that nobody knows (or “cares” to know) about this issue.

Well, today I tested with fewer users than planned and they got disconnected again, even with xmpp.client.idle set to “-1”. I’m not sure how long they remained connected (I wasn’t around), but I believe it was about 6 hours (they were disconnected almost at the end of their break time; is that relevant?).

After searching the forums I’ve found some possible explanations:

  • Presence plugin, as suggested at http://www.igniterealtime.org/community/message/145309#145309: not the cause, because I don’t have it installed.

  • Gateways (I’ve seen it mentioned around here): I do have the Gateway plugin installed, but none of the affected users were using it (only one user on the server uses it, and he was disconnected all day). Anyway, I uninstalled it to see what happens.

  • RAM: there were only 6 or 7 users and the RAM limit is set to 485 MB…

  • VCards: I have private data storage enabled (it would suck if I had to disable it…)

Since I’ve had no answer, I would like to know what I need to do to get support from Jive Software.

I’ve had somewhat similar issues that appear to be related to the caching of… well, whatever it is that gets cached. If I clear out all the cache this situation seems to creep up. If you haven’t been tinkering with the cache, however… I have no idea.

Well, I’ve already reinstalled Openfire and this bug persisted… (I erased it completely, even the db.)

I restarted the server a few hours ago and until now it seems to be working well… I’ll have more details in the morning, to know whether it’s working or not.

I find it hard to believe that no Openfire developer has even replied to this thread…

Anyway, it’s not a client-side problem (as I was expecting, but I tested it anyway) because I’ve tested with Spark, my own client (which is Spark-based, so I would get the same result…), Pandion and Pidgin. The same problem occurs with all of them.

I’ve also spent about 3 hours in a row searching the forums, but the similar problems I found were all either unsolved or turned out to be other things (like a wrong JID, which is not the cause here).

I’ve tried something now. My server’s name wasn’t a “real” domain. I mean, I have example.com and a JID was, for example, joel@srv.example.com (I didn’t have that subdomain created).

After reading these logs carefully, I created the subdomain because of the 404 error code they show. Let’s see…

org.jivesoftware.openfire.PacketException: Cannot route packet of type IQ or Presence to bare JID:

I don’t know if I can be of any help, but it aggravates the hell out of me when no one at least attempts a reply to an issue posted here. That being said, it sounds like you may be using Connection Manager? If so, have you tried NOT using it? I also gave my Jabber server (the service) the same name as my physical server’s DNS name. I noticed that when I didn’t, it caused problems, though they were not the same as yours: I simply lost all access to my server. I hope this jogs some ideas. Good luck.

Thanks for the support Septimusx.

No, I’m not using Connection Manager, I thought about that when I made this fresh install.

About your server access, how was that? I didn’t quite understand what you meant…

Note: I still have this bug after creating the subdomain, so that isn’t it.

OK, now I’ve checked error.log and found this new entry (I hadn’t checked it since the first time this happened):

Last packet sent to the server was 1 ms ago.
at org.jivesoftware.database.DbConnectionManager.getConnection(DbConnectionManager.java:124)
at org.jivesoftware.openfire.privacy.PrivacyListProvider.loadDefaultPrivacyList(PrivacyListProvider.java:205)
at org.jivesoftware.openfire.privacy.PrivacyListManager.getDefaultPrivacyList(PrivacyListManager.java:141)
at org.jivesoftware.openfire.roster.Roster.broadcastPresence(Roster.java:575)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.broadcastUpdate(PresenceUpdateHandler.java:283)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:137)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:112)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:176)
at org.jivesoftware.openfire.PresenceRouter.handle(PresenceRouter.java:134)
at org.jivesoftware.openfire.PresenceRouter.route(PresenceRouter.java:70)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:76)
at org.jivesoftware.openfire.SessionManager.removeSession(SessionManager.java:1070)
at org.jivesoftware.openfire.SessionManager.removeSession(SessionManager.java:1023)
at org.jivesoftware.openfire.SessionManager$ClientSessionListener.onConnectionClose(SessionManager.java:1138)
at org.jivesoftware.openfire.nio.NIOConnection.notifyCloseListeners(NIOConnection.java:202)
at org.jivesoftware.openfire.nio.NIOConnection.close(NIOConnection.java:185)
at org.jivesoftware.openfire.nio.NIOConnection.systemShutdown(NIOConnection.java:192)
at org.jivesoftware.openfire.spi.LocalRoutingTable.stop(LocalRoutingTable.java:126)
at org.jivesoftware.openfire.spi.RoutingTableImpl.stop(RoutingTableImpl.java:757)
at org.jivesoftware.openfire.XMPPServer.shutdownServer(XMPPServer.java:899)
at org.jivesoftware.openfire.XMPPServer.access$600(XMPPServer.java:97)
at org.jivesoftware.openfire.XMPPServer$ShutdownHookThread.run(XMPPServer.java:850)

I noticed that the trace refers to a connection manager (DbConnectionManager). Septimusx, is that what you were talking about (is that the interaction with MySQL)?

I thought you were talking about the Connection Managers that are enabled in Server Settings.

I had the SIP plugin installed (but not running; it was for future use) and I’ve just noticed this in debug.log:

2008.09.16 19:02:57 InternalComponentManager: PACKET SENT:

2008.09.16 19:02:57 InternalComponentManager:

I believe this has nothing to do with the problem itself, but I’ve disabled the SIP plugin anyway…

Yes, I was talking about the connection manager you set up through the server settings. If you are not using it, then it must not be your issue. I think Openfire can use it when you have a lot of users, like more than 1000 or so.

As for the server name, this has always been a confusing issue to me. For example, I have a Linux server and its name is jabber.llbean.com. That is the name defined in DNS. That is the name I can ping. Then there is the Jabber server name (I am not sure of my terminology here, so maybe someone can correct me). That is the name of the Openfire server (service?) running on the physical server. In the past I would just call that jabber, so that my JID would be septimusx@jabber, and that is who would chat with other users, etc. When I set up Openfire I found that things worked better if I gave my Jabber server the same name as my Linux server. So both of them are currently jabber.llbean.com. That means my JID is septimusx@jabber.llbean.com. When I changed it to just jabber, so that my JID was septimusx@jabber, I couldn’t connect with anything. Even my admin console was hosed up. I do not know why this happened, but it could have been other problems that I was having at the time. Now that my Openfire server appears to be running well I should try it the other way too, just to see if it works.

I think my approach initially would be to strip the server down to just the basics, with no plugins or fancy things. Make sure all the ports match up and your MySQL database is using the port that your server is expecting. If you are able to connect at all, then it seems like ports might not be the issue. But hey, who knows? I am also wondering if there is a timeout variable somewhere that may be causing disconnections if the client is idle for a time. I haven’t had that issue, but I am just thinking out loud, if you will. Good luck.

About the variable, I believe this would be solved by setting xmpp.client.idle to “-1”, but I already have it set.

As for plugins, I have the following installed:

Broadcast

Client Control

Dropper (as I explained in the first post)

Monitoring Service

MotD (the least useful one I have, but it would be nice to keep it)

Presence Service

(As I said previously, I deleted the SIP plugin.)

About the ping issue, I created the srv subdomain just to fix that. I can now ping srv.myserver.com without problems.

As for the firewall, it’s not the problem for sure (I’ve even tried disabling it, and the bug persists…).

I will try later to use only user@myserver.com. Btw, my FQDN is srv.myserver.com.
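To compare how the two candidate server names resolve before switching JIDs, a small check like the one below may be enough. It is a sketch only: `resolved_addresses` is a hypothetical helper, 5222 is the standard XMPP client port (an assumption about this setup), and the check says nothing about which name Openfire itself is configured with.

```python
import socket

XMPP_CLIENT_PORT = 5222  # standard XMPP client-to-server port (assumed here)


def resolved_addresses(hostname, port=XMPP_CLIENT_PORT):
    """Return the sorted IP addresses a name resolves to (empty list if none)."""
    try:
        infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    except socket.gaierror:
        return []  # the name does not resolve from this machine
    # Each result is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling `resolved_addresses("srv.myserver.com")` and `resolved_addresses("myserver.com")` from a client machine shows whether both names point at the server; an empty list for the name used after the @ in the JIDs would be worth fixing first.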

I notice you have the Client Control plugin installed. You may want to check this out just for kicks. Under ‘server’ you should now have a client management tab. Within this there is a ‘permitted clients’ link. It is supposed to default to allowing all XMPP clients, but if you have it set to allow only specific clients, it could be that your client is not one of them. Or, if there is a weird bug, it may be preventing users from staying on. Check it out, or remove it entirely and see if this makes a difference. You never know…

I’ve removed it to see how it goes.

Strangely, I have had two clients connected for about two hours without a problem. They are running two different clients: SparkWeb and a modified version of Spark. I’ve now connected a third one on Pandion (as soon as I removed the Client Control plugin), a 4th on the modified version of Spark and a 5th on Pidgin.

Let’s see what happens now.

Here’s some “literature”:

http://www.igniterealtime.org/community/message/171937 > It appears to be the same problem I have (but I don’t have Clearspace integration)

http://www.igniterealtime.org/community/message/174071#174071 > It would suck if my boss couldn’t log in on his iPhone, since he uses beejive ( :X )

http://www.igniterealtime.org/community/message/126937 > This is about an s2s connection, but it appears to be something like what I have.

Here is what’s new:

I believe it is something on the office network. At this office we are having some problems with the Internet; it’s too slow. If only the users in the office went offline, that would be the problem for sure, but others go offline too, so I can’t be sure.

I successfully connected 5 users on the office network (one by one, about 2 minutes apart) and they all stayed online (I had about 5 other users connected from outside it). When I connected the 6th, all users went offline except one that I had connected at home with Pidgin, which managed to remain connected all day even though I reproduced the bug over and over. That is, it looks like it affects Spark, Pandion and SparkWeb (these were the clients I had connected) but not Pidgin. I’m now going to try connecting more and more users on the network using different programs (right now I have 3 connected at the office for about 2 hours without a problem).

Btw, I have removed Client Control plugin.

I believe I have the “answer”.

Openfire probably puts users offline if their connection is too slow and, when that happens, there must be a bug that puts them all offline instead of just that one user.

I think that’s it because, as I said previously, we’re having some problems with our Internet at that office, and I managed to reproduce the bug by sending big files to our website: all Openfire users got disconnected. However, I tried connecting 15 users when no one was in the office and they stayed connected without a problem.

If someone with more knowledge of Openfire’s source could say whether what I supposed above is right or wrong, it would be great.
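One way to test the slow-connection theory without waiting for users to drop is to log how long a plain TCP connection to the server takes while a big upload is running. This is only a rough sketch under assumptions: port 5222 is the standard XMPP client port, `connect_latency` and `watch` are names invented here, and a slow TCP connect only hints at congestion; it does not show what Openfire does with slow sessions.

```python
import socket
import time


def connect_latency(host, port=5222, timeout=5.0):
    """Return the seconds needed to open a TCP connection, or None on failure."""
    start = time.monotonic()
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return time.monotonic() - start
    except OSError:  # refused, timed out, unreachable, or name not resolved
        return None


def watch(host, port=5222, interval=10.0, rounds=6):
    """Probe the server repeatedly and print one timestamped line per attempt."""
    for _ in range(rounds):
        latency = connect_latency(host, port)
        stamp = time.strftime("%Y.%m.%d %H:%M:%S")
        status = "unreachable" if latency is None else f"{latency * 1000:.0f} ms"
        print(stamp, host, status)
        time.sleep(interval)
```

Running `watch("srv.myserver.com")` from an office machine during a large upload, and again when the network is quiet, gives timestamps in the same layout as the Openfire logs, so spikes or failures can be compared with the moments everyone went offline.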