
Smack thread indefinite freeze when joining MUC

While developing a SINT extension against the current main branch of Smack (4.5.0-alpha), one (of many) test runs suddenly seemed to grind to a halt. After a couple of minutes, there was still no progress. I took a thread dump of the process, which gave me the following:

"main" #1 prio=5 os_prio=0 cpu=1008,36ms elapsed=400,35s tid=0x00007f19dc018800 nid=0xae0f in Object.wait()  [0x00007f19e1f68000]
   java.lang.Thread.State: WAITING (on object monitor)
	at java.lang.Object.wait(java.base@11.0.11/Native Method)
	- waiting on <0x00000004525a9238> (a org.jivesoftware.smackx.muc.MultiUserChat$3)
	at java.lang.Object.wait(java.base@11.0.11/Object.java:328)
	at org.jivesoftware.smackx.muc.MultiUserChat.enter(MultiUserChat.java:423)
	- waiting to re-lock in wait() <0x00000004525a9238> (a org.jivesoftware.smackx.muc.MultiUserChat$3)
	at org.jivesoftware.smackx.muc.MultiUserChat.join(MultiUserChat.java:720)
	- locked <0x00000004525a8e30> (a org.jivesoftware.smackx.muc.MultiUserChat)
	at org.jivesoftware.smackx.muc.MultiUserChat.join(MultiUserChat.java:648)
	at org.jivesoftware.smackx.muc.MultiUserChatRolesAffiliationsPrivilegesIntegrationTest.mucTestPersistentAffiliation(MultiUserChatRolesAffiliationsPrivilegesIntegrationTest.java:659)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(java.base@11.0.11/Native Method)
	at jdk.internal.reflect.NativeMethodAccessorImpl.invoke(java.base@11.0.11/NativeMethodAccessorImpl.java:62)
	at jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(java.base@11.0.11/DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(java.base@11.0.11/Method.java:566)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.lambda$runTests$0(SmackIntegrationTestFramework.java:405)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework$$Lambda$120/0x0000000840213440.execute(Unknown Source)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.runConcreteTest(SmackIntegrationTestFramework.java:480)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework$PreparedTest.run(SmackIntegrationTestFramework.java:681)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.runTests(SmackIntegrationTestFramework.java:468)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.run(SmackIntegrationTestFramework.java:229)
	- locked <0x00000004154001f0> (a org.igniterealtime.smack.inttest.SmackIntegrationTestFramework)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.main(SmackIntegrationTestFramework.java:107)

It seems that the thread is waiting indefinitely for the reflected self-presence that is sent as part of a user joining a MUC room.

Line 423 is the one that invokes wait() in the snippet below:

synchronized (presenceListener) {
    // Only continue after we have received *and* processed the reflected self-presence. Since presences are
    // handled in an extra listener, we may return from enter() without having processed all presences of the
    // participants, resulting in a e.g. to low participant counter after enter(). Hence we wait here until the
    // processing is done.
    while (!processedReflectedSelfPresence) {

I’m not ruling out that the server is to blame here. I cannot verify if it actually sent the self-presence. I did check that according to the server, the user had entered the room when this thread dump was made.

Should Smack guard against misbehaving servers, by adding some kind of time-out here?
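To make the time-out idea concrete, here is a minimal sketch of a bounded wait. It is not Smack code: the class ReflectedPresenceGate and its methods are hypothetical names, and it only assumes the same general pattern of a volatile flag combined with wait()/notify() on a monitor.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the wait in enter(); not part of Smack. */
final class ReflectedPresenceGate {
    private final Object lock = new Object();
    private volatile boolean processed;

    /** Called by the listener once the self-presence has been handled. */
    void markProcessed() {
        processed = true;
        synchronized (lock) {
            lock.notifyAll();
        }
    }

    /**
     * Waits until markProcessed() has been called or the timeout elapsed.
     *
     * @return true if the flag was set, false if the wait timed out.
     */
    boolean awaitProcessed(long timeoutMillis) throws InterruptedException {
        final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMillis);
        synchronized (lock) {
            // Loop to cope with spurious wakeups; re-check the deadline each time.
            while (!processed) {
                long remainingNanos = deadline - System.nanoTime();
                if (remainingNanos <= 0) {
                    return false;
                }
                // wait(0) means "wait forever", so never pass less than 1 ms.
                long remainingMillis = Math.max(1, TimeUnit.NANOSECONDS.toMillis(remainingNanos));
                lock.wait(remainingMillis);
            }
            return true;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        ReflectedPresenceGate gate = new ReflectedPresenceGate();
        // Nobody has signalled yet, so this call gives up after the timeout.
        System.out.println("first wait: " + gate.awaitProcessed(100));
        // Signal from another thread, as a listener would.
        new Thread(gate::markProcessed).start();
        System.out.println("second wait: " + gate.awaitProcessed(1000));
    }
}
```

With something along these lines, a false return could be turned into an exception instead of leaving the caller blocked forever. Whether that is the right behaviour for Smack is a separate question.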

Could this be caused by an issue in Smack? Is the setting and reading of the value in the processedReflectedSelfPresence field (which is defined as being volatile) thread safe?

} else if (mucUser.getStatus().contains(MUCUser.Status.PRESENCE_TO_SELF_110)) {
    processedReflectedSelfPresence = true;
    synchronized (this) {
} else {

A misbehaving server is unlikely, otherwise I would have put a timeout here. :wink:

The code in question waits for the processing of the reflected self-presence to complete. Line 404

reflectedSelfPresence = selfPresenceCollector.nextResultOrThrow(conf.getTimeout());

already ensures that the self-presence was received, but not that it was processed (at least, that is the intention of the code).

With the information available at the moment, my guess is that something blocks the execution of the presenceListener, so that it is not able to notify about the processed self-presence.
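The kind of blocking I have in mind can be illustrated in isolation. The sketch below is purely hypothetical and does not use any Smack classes. It assumes, for the sake of the example only, that callbacks are dispatched one after another on a single thread. Under that assumption, one earlier callback that never returns is enough to keep a later listener from ever running.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical demo; the single-threaded dispatcher is an assumption, not Smack's design. */
final class BlockedListenerDemo {

    /** Returns whether the queued "listener" ran within the timeout while an earlier task was blocked. */
    static boolean listenerRanWhileBlocked(long timeoutMillis) throws InterruptedException {
        ExecutorService dispatcher = Executors.newSingleThreadExecutor();
        CountDownLatch release = new CountDownLatch(1);
        CountDownLatch listenerDone = new CountDownLatch(1);
        try {
            // An earlier callback that does not return until it is released.
            dispatcher.execute(() -> {
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            // The listener that would set the flag and notify is queued behind it.
            dispatcher.execute(listenerDone::countDown);
            // The waiting side only sees the listener finish if the dispatcher gets to it.
            return listenerDone.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } finally {
            // Unblock the first task so the executor can drain and shut down.
            release.countDown();
            dispatcher.shutdown();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("listener ran: " + listenerRanWhileBlocked(200));
    }
}
```

A thread dump that includes the threads executing the listeners, not just "main", would show whether something like this is happening here.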