OF 4.7.1 presence message repeats status code entry over 6000 times; 100% cpu usage

Openfire version 4.7.1

Debug message: “org.jivesoftware.openfire.muc.MUCRole - Send packet to nickname null and userJid null”
Reason: User seems to be unreachable (didn’t respond to a ping request).

The presence message contains the same status element over 6000 times.

This presence message is sent for every user that is online, so the system slows down and CPU usage reaches 100%.

TaskEngine-pool-6032 org.jivesoftware.openfire.muc.MUCRole - Send packet <presence from="XXX" to="YYY" type="unavailable"><x xmlns="http://jabber.org/protocol/muc#user"><item jid="ZZZ" affiliation="member" role="none"><reason>User seems to be unreachable (didn't respond to a ping request).</reason></item><status code="307"/><status code="110"/><status code="110"/> ... <status code="110"/><status code="110"/><status code="110"/></x></presence> to nickname null and userJid null

This is concerning. The nickname value can be null in certain circumstances, but I would not expect the userJid to ever be null.

Can you reproduce this issue?

Can you tell us more about the nature of your setup? Do you run Openfire without any modifications, plugins, etc? What clients are you using?

This issue has happened to us too.

Unfortunately, we have no evidence apart from over 1GB of logs repeating this kind of message, each with over 10k <status code="110"/> elements.

The only difference is that we have OF 4.7.4.

We will be monitoring it in order to gather more info about it.

In the meantime, anything we can check?

Regards

Can you identify the ‘from’ and ‘to’ addresses in the stanza? If they refer to a user, what client is that user using? Can you reproduce the problem at will?

<presence from="room@SERVICEID.SERVER/RESOURCE|NICKNAMEX|" to="room@SERVICEID.SERVER/RESOURCE|NICKNAMEX|" type="unavailable"><x xmlns="http://jabber.org/protocol/muc#user"><item jid="username@SERVER/RESOURCE" affiliation="member" role="none"><reason>User seems to be unreachable (didn't respond to a ping request).</reason></item><status code="307"/><status code="110"/>~10K more 110's

Our room names are generated using UDIDs, and we pack extra info into the nicknames in order to show more about a given user (real name, avatar, our main system user’s ID to request a full profile, etc.)

The user is using a custom client that we developed ourselves as an Angular app using Strophe.JS.

I can’t reproduce it right now. It has been working fine since April 28th, and some minor tests don’t provoke it.

The original author of this thread mentioned having null references for user and nickname values. That’s not apparent from the information that you’re providing.

As a single instance, this seems to be a regular removal of a MUC occupant, because Openfire believes the occupant’s client is no longer connected to Openfire. To determine this, Openfire sends an IQ stanza to the client, which the client must respond to (even if it’s just with an error that it cannot understand the request - this is mandated by the XMPP specification). Maybe your client does not respond to all IQ requests?

If you have many active users, and/or clients rejoin automatically, then this perhaps explains why you see a lot of these messages - unless they’re all happening for the same user within the same few seconds - that’s indicative of some kind of looping error.
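To check whether the removals really pile up on one occupant, the log can be tallied per occupant JID. The following is only a rough sketch in Python: it assumes each debug line holds one complete presence stanza, as in the excerpts quoted in this thread, and the function name `summarize` is made up for illustration.

```python
import re
from collections import Counter

# Matches the occupant JID inside the MUC <item/> element of a logged stanza.
ITEM_JID = re.compile(r'<item jid="([^"]*)"')
STATUS_110 = '<status code="110"/>'


def summarize(lines):
    """Return (removals per occupant JID, largest number of 110 codes in one stanza).

    Assumption: one complete presence stanza per log line.
    """
    removals = Counter()
    worst = 0
    for line in lines:
        # Only look at the ping-timeout removals discussed in this thread.
        if 'type="unavailable"' not in line or "respond to a ping request" not in line:
            continue
        match = ITEM_JID.search(line)
        if match is None:
            continue
        removals[match.group(1)] += 1
        worst = max(worst, line.count(STATUS_110))
    return removals, worst
```

Feeding it the lines of the debug log (for example via `open(path, encoding="utf-8")`, with whatever path your installation uses) and printing `removals.most_common(5)` shows whether a single JID accounts for most of the entries.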

Sorry, you’re correct. I misinterpreted the original message, because we have this line several times too: "Nickname is null, assuming room role"

What is similar is having almost 10k <status code="110"/> for the same presence.

I’ve checked the logs and this happens hundreds of times (500+) at nearly the same moment for the same user.

So we have hundreds of lines for the same user at the same time, each containing thousands of <status code="110"/> elements.

Do you mean that it could be related to us not responding to that IQ stanza? Either way, hundreds of lines with thousands of those status codes suggest that we have to look somewhere else, don’t they?

Regards.

Yes and no. The Openfire code that generates this error clearly has some kind of bug if it triggers 100s of times for the same user. You might be able to work around the issue by preventing that code from being triggered by Openfire in the first place. You might be able to do that by making sure that the client responds to every IQ request (even if it’s just with an error indicating that the client does not understand the request). This behavior is mandatory in XMPP anyway.
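To illustrate what "responds to every IQ request" means on the wire, here is a rough sketch in Python. It is not Strophe.JS code, and the function name `build_iq_reply` is hypothetical. It assumes the ping is a XEP-0199 ping (namespace `urn:xmpp:ping`), which is answered with an empty result. Any other get/set request the client does not understand gets a `service-unavailable` error.

```python
import xml.etree.ElementTree as ET
from typing import Optional

PING_NS = "urn:xmpp:ping"  # XEP-0199 ping payload namespace
STANZA_ERR_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"  # RFC 6120 stanza errors


def build_iq_reply(raw_iq: str) -> Optional[str]:
    """Build a reply for an incoming IQ of type get/set, or None if no reply is due."""
    request = ET.fromstring(raw_iq)
    # Strip a namespace such as {jabber:client} from the tag, if present.
    if request.tag.split("}")[-1] != "iq":
        return None
    if request.get("type") not in ("get", "set"):
        return None  # results and errors are never answered

    reply = ET.Element("iq", {"id": request.get("id", "")})
    if request.get("from"):
        reply.set("to", request.get("from"))  # send the answer back to the requester

    payload = next(iter(request), None)
    if payload is not None and payload.tag == "{%s}ping" % PING_NS:
        reply.set("type", "result")  # a ping is acknowledged with an empty result
    else:
        reply.set("type", "error")
        error = ET.SubElement(reply, "error", {"type": "cancel"})
        # Writing xmlns as a plain attribute keeps the output free of ns0: prefixes.
        ET.SubElement(error, "service-unavailable", {"xmlns": STANZA_ERR_NS})
    return ET.tostring(reply, encoding="unicode")
```

In a Strophe.JS client, the equivalent would be a catch-all IQ handler that sends such a reply whenever no more specific handler has dealt with the request.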