This installation has been running for a month with no issues. Now users report that their clients (e.g. Pidgin or Spark) are continuously dropped from a chat room and then reconnect.
Which messages or logs should I look at on the server?
After reading the update below, please provide feedback on:
If this is a known bug, what is the workaround or the time frame for a fix?
Do you need any further information?
How can we prevent this in the future?
Here is an update to our problem:
After we analyzed the client logs and worked with the folks who support Pidgin, they concluded that our problem is a known bug in Openfire.
Bug: The Openfire server will forward malformed XML 1.0 messages to clients. In this case, the malformed message was sent by someone in the TSN-FL room and contained the following invalid XML code: �
This caused the Pidgin client to report a number of errors and disconnect from the server (all group rooms and unicast IMs were disconnected).
This problem is specific to group-chat in our environment. It should self-correct for unicast IM.
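To make the bug description concrete, here is a rough sketch of how a message body could be checked against the character range that XML 1.0 allows. It is only an illustration in Python, not Openfire or Pidgin code. The function names are made up, and the escape character used in the usage note is a hypothetical example because the actual offending character did not survive in this post.

```python
import re

# Matches any character outside the XML 1.0 "Char" production:
#   #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
_INVALID_XML10 = re.compile(
    "[^\u0009\u000A\u000D\u0020-\uD7FF\uE000-\uFFFD\U00010000-\U0010FFFF]"
)


def find_invalid_xml_chars(text):
    """Return (index, code point) pairs for characters XML 1.0 forbids.

    Hypothetical helper for illustration; not part of any real library.
    """
    return [(m.start(), hex(ord(m.group()))) for m in _INVALID_XML10.finditer(text)]


def strip_invalid_xml_chars(text):
    """Return text with every XML 1.0-forbidden character removed."""
    return _INVALID_XML10.sub("", text)
```

For example, calling find_invalid_xml_chars("ok\x1bbad") reports the escape character at index 2, while tabs and newlines pass. A check like this on pasted text (such as an e-mail notification) before it is sent to a room would flag the kind of message described above.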
Through the Pidgin debug logs, the problem was traced to a message coming from the ‘TSN-FL’ room. The offending message appears to be a cut-and-paste of a Utopia e-mail notification message.
Further (and part of our issue), every time a user enters the ‘TSN-FL’ room, the last 25 messages in ‘history’ are streamed out, including the message with the bad XML code. My Pidgin client was configured to auto-login to a number of rooms. Every time I (re)started Pidgin, the problem repeated when the client auto-entered the TSN-FL room. In addition, Pidgin tries to log in again at regular intervals after the initial ‘XML Parse error’, which is why we see the repeating ‘enter/leave’ status messages.
In hindsight, had I typed 25 messages in the ‘TSN-FL’ room, I could have cleared out the offending message. More messages in the FL room from other users would eventually have accomplished the same thing.
FYI: Pidgin support ticket:
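The reasoning about the 25-message history can be sketched as a small model. This assumes the room history behaves like a fixed-size buffer that drops the oldest entry when a new one arrives. That is my assumption about the behavior described above, not Openfire's actual implementation, and the function name is hypothetical.

```python
from collections import deque

OFFENDING = "<message with bad XML>"  # placeholder for the bad message


def history_after(new_messages, history_size=25):
    """Model the room history replayed to a joining user.

    Assumes the offending message is currently the newest entry and that
    the history is a fixed-size buffer (25 entries, per the post).
    """
    history = deque([OFFENDING], maxlen=history_size)
    history.extend(new_messages)  # oldest entries fall off the left side
    return list(history)


# With 24 newer messages, the bad one is still replayed on join.
still_there = OFFENDING in history_after([f"msg {i}" for i in range(24)])

# With 25 newer messages, it has been pushed out of the history.
cleared = OFFENDING not in history_after([f"msg {i}" for i in range(25)])
```

Under this model, both still_there and cleared are True, which matches the observation that 25 new messages from anyone in the room would have stopped the reconnect loop.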