We’re seeing an issue where MUC users are unable to post to or receive messages from the chat rooms they’re in, and it appears to correlate directly with a large volume of messages like this in the Openfire error log:
2017.07.18 10:40:18 org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Internal server error
java.lang.NullPointerException
at org.jivesoftware.openfire.group.ConcurrentGroupMap.includesKey(ConcurrentGroupMap.java:49)
at org.jivesoftware.openfire.muc.spi.LocalMUCRoom.joinRoom(LocalMUCRoom.java:642)
at org.jivesoftware.openfire.muc.spi.LocalMUCUser.process(LocalMUCUser.java:471)
at org.jivesoftware.openfire.muc.spi.LocalMUCUser.process(LocalMUCUser.java:177)
at org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl.processPacket(MultiUserChatServiceImpl.java:366)
at org.jivesoftware.openfire.component.InternalComponentManager$RoutableComponents.process(InternalComponentManager.java:606)
at org.jivesoftware.openfire.spi.RoutingTableImpl.routeToComponent(RoutingTableImpl.java:407)
at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:249)
at org.jivesoftware.openfire.PresenceRouter.handle(PresenceRouter.java:166)
at org.jivesoftware.openfire.PresenceRouter.route(PresenceRouter.java:80)
at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:88)
at org.jivesoftware.openfire.SessionPacketRouter.route(SessionPacketRouter.java:122)
at org.jivesoftware.openfire.SessionPacketRouter.route(SessionPacketRouter.java:73)
at org.jivesoftware.openfire.http.HttpSession.sendPendingPackets(HttpSession.java:641)
at org.jivesoftware.openfire.http.HttpSession$HttpPacketSender.run(HttpSession.java:1284)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
I didn’t see anything obvious at ConcurrentGroupMap.java:49 that would trigger this. What could cause an NPE to be logged by org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl, and how can we best debug / troubleshoot this?
We are using clustering, if that could affect things. Running Openfire v4.1.4 on Ubuntu Linux.
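One guess, which I have not verified against the 4.1.4 source: if ConcurrentGroupMap is backed by java.util.concurrent.ConcurrentHashMap, then a null key passed to includesKey would be enough to produce this exception, since ConcurrentHashMap rejects null keys. The sketch below only demonstrates that JDK behaviour; the class name NullKeyDemo, the helper lookupThrowsNpe, and the sample JID are made up for illustration and are not Openfire code.

```java
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyDemo {

    // Hypothetical helper: reports whether a key lookup throws a NullPointerException.
    static boolean lookupThrowsNpe(ConcurrentHashMap<String, String> map, String key) {
        try {
            map.containsKey(key);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a room's member map; keys and values are placeholders.
        ConcurrentHashMap<String, String> members = new ConcurrentHashMap<>();
        members.put("user@example.org", "member");

        // A normal key is looked up without error.
        System.out.println("non-null key throws NPE: "
                + lookupThrowsNpe(members, "user@example.org"));

        // ConcurrentHashMap does not permit null keys, so this lookup throws.
        System.out.println("null key throws NPE: "
                + lookupThrowsNpe(members, null));
    }
}
```

If that assumption holds, the question becomes where joinRoom could end up with a null key (for example a JID that is not set), and whether clustering makes that more likely.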