Openfire crashes too often

Hi,

I have a very big problem with my Openfire installation. I will try to explain it, because I am looking for a solution.

I have an Openfire server with 3000 users in the DB and 230k rows in ofRoster.

I use two different clients on my website (SparkWeb for one-to-one chat and Muckl for MUC chat).

My Openfire installation crashes when I have 200-300 sessions in the admin console.

I don’t understand why, because my hardware is more than adequate.

My Openfire 3.8.2 server has:

JVM Version and Vendor:
1.7.0_51 Oracle Corporation – Java HotSpot™ 64-Bit Server VM
Application Server:
jetty/7.x.y-SNAPSHOT
Host Name:
openfire
OS / Hardware:
Linux / amd64
Locale / Timezone:
fr / Central European Time (1 GMT)
Java Memory:
3078.99 MB of 10923.00 MB (28.2%) used, for 66 active sessions

/usr/java/jre1.7.0_51/bin/java -Xms6144m -Xmx12288m -server -DopenfireHome=/usr/share/openfire -Dopenfire.lib.dir=/usr/share/openfire/lib -classpath /usr/share/openfire/lib/startup.jar -jar /usr/share/openfire/lib/startup.jar

I have configured limits.d with these lines:

*           soft    nofile          65535
*           hard    nofile          65535

Added on 02/06/2014:

openfire soft nofile 524288
openfire hard nofile 524288

and /etc/sysctl.conf with this line:

fs.file-max = 524288
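
A side note on the nofile limits above: as far as I know, files under limits.d are applied by pam_limits when a session is opened, so a daemon launched from an init script may still be running with the default limit. A minimal sketch to check what the running process really got, assuming a Linux /proc filesystem; the function name max_open_files is only an illustration, and the pgrep pattern assumes the Openfire JVM is the only process matching "startup.jar":

```shell
# Print the soft and hard "Max open files" limits from a /proc/<pid>/limits file.
# Usage: max_open_files /proc/<pid>/limits
max_open_files() {
    awk '/^Max open files/ { print $4, $5 }' "$1"
}

# Example (assumption: only the Openfire JVM matches "startup.jar"):
#   max_open_files "/proc/$(pgrep -f startup.jar | head -n 1)/limits"
```

If the two numbers printed are lower than the values configured in limits.d, the limit is probably not reaching the Openfire process and would have to be raised in whatever script starts it.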

I can give more details to anyone who wants to help me fix this problem. I cannot keep stopping and restarting the server several times every day.

OS: Linux SERVER_NAME 2.6.32-5-amd64 #1 SMP Mon Oct 3 03:59:20 UTC 2011 x86_64 GNU/Linux

In the Openfire error.log I found a lot of:

org.jivesoftware.openfire.muc.cluster.MUCRoomTask - Room not found: ROOM_NAME
java.lang.IllegalArgumentException: Room not found: ROOM_NAME
at org.jivesoftware.openfire.muc.cluster.MUCRoomTask.getRoom(MUCRoomTask.java:66)
at org.jivesoftware.openfire.muc.cluster.MUCRoomTask.execute(MUCRoomTask.java:83)
at org.jivesoftware.openfire.muc.cluster.BroadcastPresenceRequest.run(BroadcastPresenceRequest.java:62)
at org.jivesoftware.openfire.muc.spi.LocalMUCRoom.broadcastPresence(LocalMUCRoom.java:1071)
at org.jivesoftware.openfire.muc.spi.LocalMUCRoom.leaveRoom(LocalMUCRoom.java:799)
at org.jivesoftware.openfire.muc.spi.LocalMUCUser.process(LocalMUCUser.java:549)
at org.jivesoftware.openfire.muc.spi.LocalMUCUser.process(LocalMUCUser.java:197)
at org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl.processPacket(MultiUserChatServiceImpl.java:308)
at org.jivesoftware.openfire.component.InternalComponentManager$RoutableComponents.process(InternalComponentManager.java:587)
at org.jivesoftware.openfire.spi.RoutingTableImpl.routeToComponent(RoutingTableImpl.java:356)
at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:238)
at org.jivesoftware.openfire.PresenceRouter.handle(PresenceRouter.java:170)
at org.jivesoftware.openfire.PresenceRouter.route(PresenceRouter.java:84)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.broadcastUnavailableForDirectedPresences(PresenceUpdateHandler.java:487)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:161)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:135)
at org.jivesoftware.openfire.handler.PresenceUpdateHandler.process(PresenceUpdateHandler.java:199)
at org.jivesoftware.openfire.PresenceRouter.handle(PresenceRouter.java:148)
at org.jivesoftware.openfire.PresenceRouter.route(PresenceRouter.java:84)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:84)
at org.jivesoftware.openfire.SessionManager$ClientSessionListener.onConnectionClose(SessionManager.java:1164)
at org.jivesoftware.openfire.net.VirtualConnection.notifyCloseListeners(VirtualConnection.java:214)
at org.jivesoftware.openfire.net.VirtualConnection.close(VirtualConnection.java:190)
at org.jivesoftware.openfire.http.HttpSession.close(HttpSession.java:198)
at org.jivesoftware.openfire.handler.IQBindHandler.handleIQ(IQBindHandler.java:139)
at org.jivesoftware.openfire.handler.IQHandler.process(IQHandler.java:65)
at org.jivesoftware.openfire.IQRouter.handle(IQRouter.java:374)
at org.jivesoftware.openfire.IQRouter.route(IQRouter.java:121)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:76)
at org.jivesoftware.openfire.SessionPacketRouter.route(SessionPacketRouter.java:108)
at org.jivesoftware.openfire.SessionPacketRouter.route(SessionPacketRouter.java:69)
at org.jivesoftware.openfire.http.HttpSession.sendPendingPackets(HttpSession.java:645)
at org.jivesoftware.openfire.http.HttpSessionManager$HttpPacketSender.run(HttpSessionManager.java:419)
at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
at java.lang.Thread.run(Unknown Source)

In warn.log, a lot of:

2014.02.13 16:22:58 org.jivesoftware.openfire.http.HttpBindServlet - Client provided invalid session: 83fbef9b. [193.54.115.93]

2014.02.13 16:22:59 org.jivesoftware.openfire.http.HttpBindServlet - Client provided invalid session: 83fbef9b. [193.54.115.93]

2014.02.13 16:31:53 org.jivesoftware.openfire.http.HttpSession - Deliverable unavailable for 884659

2014.02.13 16:32:03 org.jivesoftware.openfire.http.HttpBindServlet - Client provided invalid session: 2dc6a64c. [193.54.115.93]

And in info.log:

2014.02.13 16:29:57 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: d3be721e4547db6aef7ac5f5479f3596ae8a28db0@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:29:57 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: da4b9237bacccdf19c0760cab7aec4a8359010b00@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:29:57 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: 83e6d84bd1181d61c202289b2055bc6cfacd19a40@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:31:27 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: ce717558c2f5fb24a32efbf9142b8e1a734f9f720@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:31:57 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: 356a192b7913b04c54574d18c28d46e6395428ab0@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:32:27 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: 83e6d84bd1181d61c202289b2055bc6cfacd19a40@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:32:27 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: 83e6d84bd1181d61c202289b2055bc6cfacd19a40@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:32:27 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: 356a192b7913b04c54574d18c28d46e6395428ab0@SERVEUR_DOMAINE/ressource_name

2014.02.13 16:32:27 org.jivesoftware.openfire.http.HttpSessionManager - Closing idle session: SERVEUR_DOMAINE

2 plugins enabled:

Monitoring Service 1.3.2-beta1

Presence Service 1.5.1

I need help, and thank you to everyone who helps me.

PS: sorry for my poor English.

A few questions:

Does the JVM crash or does Openfire stop processing packets?

Is Openfire idle then or using 1/2/all cores?

Can you create a thread dump when this issue occurs (kill -3 pid-of-openfire) before you kill the process?

Does it help if you restart only SparkWeb?

Looking at the logs you have a lot of issues with SparkWeb / HTTP binding. Finding experts for mina/nio may be hard.
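
On the thread dump: one dump is a snapshot, so a few dumps taken some seconds apart usually make it easier to see which threads stay stuck. A rough sketch, where the function name thread_dumps and the defaults (3 dumps, 10 seconds apart) are only illustrative choices. kill -3 sends SIGQUIT, which makes the HotSpot JVM print a thread dump to its own standard output without stopping; where that output ends up depends on how Openfire is started on your machine.

```shell
# Ask a running JVM for several thread dumps, a few seconds apart.
# kill -3 sends SIGQUIT; HotSpot reacts by writing a thread dump to the
# JVM's standard output and keeps running.
# Usage: thread_dumps <pid> [count] [pause-seconds]
thread_dumps() {
    pid=$1
    count=${2:-3}
    pause=${3:-10}
    i=0
    while [ "$i" -lt "$count" ]; do
        kill -3 "$pid" || return 1
        i=$((i + 1))
        # Wait between dumps, but not after the last one.
        if [ "$i" -lt "$count" ]; then
            sleep "$pause"
        fi
    done
    return 0
}
```

For example, thread_dumps pid-of-openfire would request three dumps over roughly twenty seconds; only restart Openfire after that.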

Hello LG,

I will try to answer.

  • Openfire is simply idle; it stops processing the received packets.

  • At the next crash, I will take a thread dump before restarting Openfire.

As for SparkWeb, it is a web app deployed on my web server, not on my Openfire server.

I have two virtual servers.

The first hosts my website with SparkWeb and Muckl (WEBAPP_SERVER).

The second hosts my Openfire installation (OPENFIRE_SERVER).

Muckl and SparkWeb on the first server send their requests to WEBAPP_SERVER/http-bind/

WEBAPP_SERVER has a proxy configuration that forwards /http-bind/ requests to OPENFIRE_SERVER:HTTP_BIND_PORT/http-bind/

I don’t know how to restart only SparkWeb.

In the web server’s logs, I found a lot of:

[Thu Feb 13 14:13:06 2014] [error] proxy: HTTP: disabled connection for (OPENFIRE_SERVER)

[Thu Feb 13 14:13:19 2014] [error] proxy: HTTP: disabled connection for (OPENFIRE_SERVER)

This message was edited by: Rodrigue

I don’t know how to restart only SparkWeb.

I thought of restarting the HTTP binding within Openfire / dropping all HTTP connections. If you can still reach the admin console, try to disable it there, though I have no idea whether the connections will then really be forcefully closed.

Setting up iptables to block traffic from $webapp to Openfire could also be an option; however, the connections may be kept open for a while.
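
A rough sketch of what such an iptables rule could look like, to be run as root on the Openfire host. The function names are made up for illustration, and the address and port in the example are placeholders: 192.0.2.10 stands for the web server's IP, and 7070 is Openfire's default HTTP-bind port, which may differ on your install.

```shell
# Reject TCP traffic from one source address to one local port.
# Usage: block_webapp <source-ip> <port>    (needs root)
block_webapp() {
    iptables -I INPUT -p tcp -s "$1" --dport "$2" -j REJECT --reject-with tcp-reset
}

# Remove the same rule again.
# Usage: unblock_webapp <source-ip> <port>  (needs root)
unblock_webapp() {
    iptables -D INPUT -p tcp -s "$1" --dport "$2" -j REJECT --reject-with tcp-reset
}

# Example with placeholder values:
#   block_webapp 192.0.2.10 7070
#   unblock_webapp 192.0.2.10 7070
```

The rule is inserted at the top of the INPUT chain, so it also cuts packets on connections that are already established; Openfire itself may still list those sessions until they time out.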

Hopefully someone with cluster experience can share some thoughts about setting up another Openfire instance on $webapp and connecting the web clients to this instance.

Hi LG,

This weekend I will try closing the http-bind communication between WEBAPP_SERVER and OPENFIRE_SERVER, to see whether OPENFIRE_SERVER still crashes.

I will post my results here on Monday.

I understand why I cannot have more than 200 concurrent users with my hardware.

Do you think I would get better results if I changed my XMPP clients?

I see there are Jappix-Mini, which I can use, and ofMeet, which provide the same functions as SparkWeb and Muckl.