Synchronicity issue in Smack

I suspect some kind of synchronicity issue in Smack.

When executed against Openfire (based on HEAD of its main branch, which at the time of writing is 4.9.0-SNAPSHOT), the following Smack integration test (from smack-integration-test-4.5.0-alpha4-20240509.011046-9.jar) fails consistently:

  • MultiUserChatOccupantIntegrationTest.mucChangeNicknameInformationTest (Normal)

As part of the test, a MUC occupant changes its nickname. This, internally, uses Smack’s org.jivesoftware.smackx.muc.MultiUserChat.changeNickname. This method sends a presence stanza and blocks until the nickname change is reflected.

This is a snippet of Smack’s implementation of the changeNickname method:

// We change the nickname by sending a presence packet where the "to"
// field is in the form "roomName@service/nickname"
// We don't have to signal the MUC support again
Presence joinPresence = connection.getStanzaFactory().buildPresenceStanza()
        .to(jid)
        .ofType(Presence.Type.available)
        .build();

// Wait for a presence packet back from the server.
StanzaFilter responseFilter =
    new AndFilter(
        FromMatchesFilter.createFull(jid),
        new StanzaTypeFilter(Presence.class));
StanzaCollector response = connection.createStanzaCollectorAndSend(responseFilter, joinPresence);
// Wait up to a certain number of seconds for a reply. If there is a negative reply, an
// exception will be thrown
response.nextResultOrThrow();

The test fails with a timeout, as the reflection does not arrive.

This stack trace is logged by the test execution.

java.lang.reflect.InvocationTargetException
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:77)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.base/java.lang.reflect.Method.invoke(Method.java:568)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.lambda$runTests$0(SmackIntegrationTestFramework.java:476)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.runConcreteTest(SmackIntegrationTestFramework.java:551)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework$PreparedTest.run(SmackIntegrationTestFramework.java:759)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.runTests(SmackIntegrationTestFramework.java:539)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.run(SmackIntegrationTestFramework.java:277)
	at org.igniterealtime.smack.inttest.SmackIntegrationTestFramework.main(SmackIntegrationTestFramework.java:115)
Caused by: org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 5000ms (~5s). StanzaCollector has been cancelled. Waited for response using: AndFilter: (FromMatchesFilter (full): smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org/two-new-C91X19, StanzaTypeFilter: Presence).
	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:124)
	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:104)
	at org.jivesoftware.smack.StanzaCollector.nextResultOrThrow(StanzaCollector.java:281)
	at org.jivesoftware.smack.StanzaCollector.nextResultOrThrow(StanzaCollector.java:228)
	at org.jivesoftware.smackx.muc.MultiUserChat.changeNickname(MultiUserChat.java:1314)
	at org.jivesoftware.smackx.muc.MultiUserChatOccupantIntegrationTest.mucChangeNicknameInformationTest(MultiUserChatOccupantIntegrationTest.java:981)
	... 10 more

Smack’s debug logging shows the following (note that I have configured a 5000 millisecond timeout, rather than the default 1 minute):

Connection 2 sends the nickname change:

16:50:38.806 SENT (2): 
<presence to='smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org/two-new-C91X19' id='CXEGH-31'>
  <priority>
    0
  </priority>
  <c xmlns='http://jabber.org/protocol/caps' hash='sha-1' node='https://igniterealtime.org/projects/smack' ver='lWpWNgqtfynY+MxupmhOr2LZqOg='/>
</presence>

Instantly, another occupant of the MUC room (connection 1) as well as the user itself (on connection 2) receives the presence ‘unvailable’ of the old nickname:

16:50:38.850 RECV (1): 
<presence to="smack-inttest-one-2gd5a@example.org/one-2gd5a" id="CXEGH-30" from="smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org/two-original-C91X19" type="unavailable">
  <priority>
    0
  </priority>
  <c xmlns="http://jabber.org/protocol/caps" hash="sha-1" node="https://igniterealtime.org/projects/smack" ver="lWpWNgqtfynY+MxupmhOr2LZqOg=">
  </c>
  <x xmlns="http://jabber.org/protocol/muc#user">
    <item jid="smack-inttest-two-2gd5a@example.org/two-2gd5a" affiliation="none" role="participant" nick="two-new-C91X19"/>
    <status code="303"/>
  </x>
</presence>
16:50:38.850 RECV (2): 
<presence to="smack-inttest-two-2gd5a@example.org/two-2gd5a" id="CXEGH-30" from="smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org/two-original-C91X19" type="unavailable">
  <priority>
    0
  </priority>
  <c xmlns="http://jabber.org/protocol/caps" hash="sha-1" node="https://igniterealtime.org/projects/smack" ver="lWpWNgqtfynY+MxupmhOr2LZqOg=">
  </c>
  <x xmlns="http://jabber.org/protocol/muc#user">
    <item jid="smack-inttest-two-2gd5a@example.org/two-2gd5a" affiliation="none" role="participant" nick="two-new-C91X19"/>
    <status code="303"/>
    <status code="110"/>
  </x>
</presence>

The other occupant (on connection 1) also receives the presence ‘available’ that communicates the new nickname, but the user itself (on connection 2) does not at this time:

16:50:38.851 RECV (1): 
<presence to="smack-inttest-one-2gd5a@example.org/one-2gd5a" id="CXEGH-31" from="smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org/two-new-C91X19">
  <priority>
    0
  </priority>
  <c xmlns="http://jabber.org/protocol/caps" hash="sha-1" node="https://igniterealtime.org/projects/smack" ver="lWpWNgqtfynY+MxupmhOr2LZqOg=">
  </c>
  <x xmlns="http://jabber.org/protocol/muc#user">
    <item jid="smack-inttest-two-2gd5a@example.org/two-2gd5a" affiliation="none" role="participant"/>
  </x>
</presence>

After the 5000ms timeout, the test is being teared down as a result of the NoResponseException having happened. At the exact same timestamp, the stanza that was being waited on is logged (as if it just arrived) on connection 2:

16:50:43.807 SENT (1): 
<iq to='smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org' id='BBUK3-145' type='set'>
  <query xmlns='http://jabber.org/protocol/muc#owner'>
    <destroy>
      <reason>
        test fixture teardown
      </reason>
    </destroy>
  </query>
</iq>
16:50:43.807 RECV (2): 
<presence to="smack-inttest-two-2gd5a@example.org/two-2gd5a" id="CXEGH-31" from="smack-inttest-changenickname-2gd5a-v6rj88@conference.example.org/two-new-C91X19">
  <priority>
    0
  </priority>
  <c xmlns="http://jabber.org/protocol/caps" hash="sha-1" node="https://igniterealtime.org/projects/smack" ver="lWpWNgqtfynY+MxupmhOr2LZqOg=">
  </c>
  <x xmlns="http://jabber.org/protocol/muc#user">
    <item jid="smack-inttest-two-2gd5a@example.org/two-2gd5a" affiliation="none" role="participant"/>
    <status code="110"/>
  </x>
</presence>

The timing of the arrival of this missing stanza is curious, more so because it’s very reproducible.

I have used packet sniffing. This shows that the stanza that is registered by Smack only after the 5 second timeout was actually sent by the server 5 seconds before. This is illustrated by the screenshot of the packet dump below. The dump is filtered for data belonging to the client that changes its nickname only. In this screenshot:

  • packet 20395 is the nickname change request sent by the client
  • packet 20399 is the server sending the client the presence ‘unavailable’ for the old nickname
  • packet 20401 is the server sending the client the presence ‘available’ for the new nickname

Note that all these packets are exchanged at the 16:50:38 timestamp, comfortably 5 seconds before the timeout is triggered by Smack.

Smack seems to be stalling the processing of the stanza, which causes the test to fail. What’s going on here?

For reference: here are the TCP dump and full debug log of the test:
tcpdump.pcapng (86.3 KB)
MultiUserChatOccupantIntegrationTest.mucChangeNicknameInformationTest (Normal).log (9.6 KB)

Please take a thread dump at the 5 second time window (so maybe increase it back to 60 seconds when attempting to take the thread dump).

For posterity: Flow created a thread dump that I’ll attach to this message.

It shows an issue with threads #1 and #19, suggesting that Smack has a bug with the implementation of org.jivesoftware.smackx.muc.MultiUserChat#changeNickname that attempts to use a monitor that is simultaneously used by the stanza collector that’s processing stanzas received by the server that are part of the nickname change.

threaddump.txt (28.9 KB)

This got fixed and tracked under [SMACK-942] - Ignite Realtime Jira

1 Like