Users can't connect after 8 days of uptime

Hello all,

My server supports 1500 users, growing each day, across 12 sites. At one site, after 8 days of Openfire running, I have 4 users who cannot connect to the server anymore. They log in only to be booted out right away. If I restart Openfire, they are good for another 8 days.

I am running Openfire 3.9.1 on Ubuntu 12.04 x64, and I have no idea what is causing this issue only for these 4 people, on the same VLAN at the same site, and only after 8 days. There are no firewalls between them and the server. There are switch-based ACLs, but if those were the issue, why would restarting Openfire fix it for 8 days?

The server is on a VM with 8 cores and 12GB of RAM assigned to it. The CPUs on average sit at 20-40%, depending on the time of day.

There is no mechanism in Openfire to boot users after 8 days. If it is happening exactly every 8 days, then it is most probably something on your side, either on the host (Ubuntu, some scheduled backups, etc.) or in the network.

So here we are again: day 8, people can’t log in, and my server-to-server links have gone down and won’t reconnect.

Am I looking at an OS issue or a Java issue? Those are the only two things I can think of that would release the connections when the service restarts.

I understand your frustration, but as I said, I can’t think of anything on the Openfire side doing this on purpose. I was mad when, after upgrading to 3.9.3, Openfire couldn’t run half a day without running out of memory and halting. Then I installed it on a Windows Server box (it was on Arch Linux) with the current Java 8 (it was an old Java 7), and it runs smoothly now. Not sure what the cause was. You should check the Openfire logs. Maybe you will find Out of Memory errors there.

Also, to clarify: are you running one central server, with all users from all the sites connecting to this one server? And when the 4 users from the affected site can’t log in, what about the others? If they can still log in, then I would think it is something on the network side of that site.

1200 users in 10 sites, all to one server.

Those 4 people are the only ones having issues. In addition, our server-to-server connection to our overseas site disconnects and won’t come back up.

When the 4 people try to log in, it kicks them out right away, as if they were already logged in, but the server shows no session for them.
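One way to narrow down whether old connections are lingering at the OS level is to count TCP socket states on the XMPP ports. The sketch below is only a diagnostic idea, not a confirmed cause of this problem. It assumes a Linux host and the standard XMPP ports (5222 for clients, 5269 for server-to-server); `count_xmpp_states` is a hypothetical helper name, and it parses text in the format of /proc/net/tcp (Java often listens on IPv6 sockets, so /proc/net/tcp6 is worth checking as well).

```python
from collections import Counter

# State codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    "01": "ESTABLISHED", "02": "SYN_SENT", "03": "SYN_RECV",
    "04": "FIN_WAIT1", "05": "FIN_WAIT2", "06": "TIME_WAIT",
    "07": "CLOSE", "08": "CLOSE_WAIT", "09": "LAST_ACK",
    "0A": "LISTEN", "0B": "CLOSING",
}


def count_xmpp_states(proc_net_tcp_text, ports=(5222, 5269)):
    """Count sockets per (port, state) where the local or remote port is in `ports`."""
    counts = Counter()
    for line in proc_net_tcp_text.splitlines()[1:]:  # first line is the header
        fields = line.split()
        if len(fields) < 4:
            continue
        # Addresses look like HEXIP:HEXPORT; the port follows the last colon.
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        remote_port = int(fields[2].rsplit(":", 1)[1], 16)
        port = local_port if local_port in ports else remote_port
        if port in ports:
            state = TCP_STATES.get(fields[3].upper(), fields[3])
            counts[(port, state)] += 1
    return counts
```

To use it on the server, read /proc/net/tcp (and /proc/net/tcp6) into a string and pass it in, for example `count_xmpp_states(open("/proc/net/tcp").read())`. A count of CLOSE_WAIT or ESTABLISHED sockets that keeps growing over the 8 days, without matching sessions in the admin console, would point toward the OS or Java side rather than the network.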

Which logs should I look in for out of memory errors? I have looked in all 3 in the web interface (not debug) and found no out of memory errors.
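For checking the log files on disk rather than through the web interface, a small script can scan every log file in a directory for the string Java uses when it runs out of memory, `OutOfMemoryError`. This is a minimal sketch: `find_oom_lines` is a hypothetical helper name, and the log directory depends on how Openfire was installed, so it is passed in rather than assumed.

```python
from pathlib import Path


def find_oom_lines(log_dir, pattern="OutOfMemoryError"):
    """Return (file name, line number, text) for each log line containing `pattern`."""
    hits = []
    # Match current and rotated logs, e.g. error.log and error.log.1
    for path in sorted(Path(log_dir).glob("*.log*")):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going past any badly encoded bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if pattern in line:
                    hits.append((path.name, number, line.rstrip()))
    return hits
```

Calling `find_oom_lines("/path/to/openfire/logs")` (the path here is a placeholder) returns an empty list if no log line mentions the error, which would agree with what the web interface shows.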

Those are the only logs that I know about. I’m out of ideas.

I swapped out the Java tonight, from Oracle to OpenJDK. Let’s see if that does anything. Now it mirrors our overseas server, which doesn’t seem to have the issue.

Seems like switching from Oracle Java to OpenJDK may have been the answer. We can now search between servers, and conferences work as well. Just waiting to see if the connections hold longer than the 8 days they have been.

Interesting. I had Oracle Java on a Linux box and had problems; now I use Oracle Java on a Windows box and it’s fine. Maybe Linux works better with OpenJDK.