Deadlock between the reader and writer with Stream Management

We’re running Smack 4.2.4 and we recently enabled Stream Management. We observed a deadlock between the XMPPTCPConnection Reader and Writer. The reader is blocked trying to send an SM acknowledgement:

"Smack Reader (0)" #19 daemon prio=5 os_prio=0 tid=0x00007f513400a800 nid=0x4af6 waiting on condition [0x00007f516d6ef000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x000000070041b100> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at org.jivesoftware.smack.util.ArrayBlockingQueueWithShutdown.put(ArrayBlockingQueueWithShutdown.java:251)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.sendStreamElement(XMPPTCPConnection.java:1343)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection.sendSmAcknowledgementInternal(XMPPTCPConnection.java:1695)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection.access$2900(XMPPTCPConnection.java:151)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.parsePackets(XMPPTCPConnection.java:1200)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.access$300(XMPPTCPConnection.java:1000)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader$1.run(XMPPTCPConnection.java:1016)
    at java.lang.Thread.run(Thread.java:748)

The writer is blocked trying to add to the unacknowledged stanzas list:

"Smack Writer (0)" #18 daemon prio=5 os_prio=0 tid=0x00007f5134009800 nid=0x4af5 waiting on condition [0x00007f516ddfc000]
   java.lang.Thread.State: WAITING (parking)
    at sun.misc.Unsafe.park(Native Method)
    - parking to wait for  <0x00000007004cad70> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)
    at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
    at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:2039)
    at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:353)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.maybeAddToUnacknowledgedStanzas(XMPPTCPConnection.java:1546)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.writePackets(XMPPTCPConnection.java:1446)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.access$3300(XMPPTCPConnection.java:1264)
    at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter$1.run(XMPPTCPConnection.java:1312)
    at java.lang.Thread.run(Thread.java:748)

I have a heap dump that I can consult if you need details, though I can’t share it as it’s from a machine serving users. Both the writer’s packet queue and the unacknowledged stanzas queue are full.
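To make the cycle concrete, here is a minimal sketch of the situation using two plain `ArrayBlockingQueue`s. It is not Smack code: the class name, the queue names and the timeouts are assumptions made for illustration, and timed `offer()` calls stand in for the blocking `put()` calls so that the program terminates instead of hanging.

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical model of the reader/writer cycle; not taken from Smack.
class SmDeadlockModel {

    /** Returns {readerQueuedAck, writerRecordedStanza}. */
    static boolean[] run(int capacity, int freeUnackedSlots, long timeoutMs) {
        BlockingQueue<String> writerQueue = new ArrayBlockingQueue<>(capacity);
        BlockingQueue<String> unacked = new ArrayBlockingQueue<>(capacity);
        for (int i = 0; i < capacity; i++) writerQueue.add("queued-" + i);
        for (int i = 0; i < capacity - freeUnackedSlots; i++) unacked.add("sent-" + i);

        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            // Reader: must enqueue its own ack before it can go on to process
            // the server's ack, which is what would empty the unacked queue.
            Future<Boolean> reader = pool.submit(() -> {
                boolean queued = writerQueue.offer("<a/>", timeoutMs, TimeUnit.MILLISECONDS);
                if (queued) unacked.clear();
                return queued;
            });
            // Writer: must record the stanza it just wrote as unacknowledged
            // before it can take the next element from its own queue.
            Future<Boolean> writer = pool.submit(() -> {
                boolean recorded = unacked.offer("in-flight", timeoutMs, TimeUnit.MILLISECONDS);
                if (recorded) writerQueue.poll();
                return recorded;
            });
            return new boolean[] { reader.get(), writer.get() };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // Both queues full: neither thread makes progress.
        System.out.println(Arrays.toString(run(2, 0, 100))); // [false, false]
        // One free slot in the unacked queue lets the writer proceed.
        System.out.println(Arrays.toString(run(2, 1, 500)));
    }
}
```

With a blocking `put()` in place of the timed `offer()`, the first case would park both threads indefinitely, which matches the two stack traces above.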

This commit seems to address a similar issue, but maybeAddToUnacknowledgedStanzas in master still performs a blocking put().

Regards,
Boris

Thanks for your bug report. It appears the issue exists in the current Smack master too.

We could

  • replace the unacknowledgedStanzas.put() with a throwing add()
  • make the unacknowledgedStanzas queue unbounded
  • increase the size of the bounded queue
  • and/or lower the threshold at which we request SM acks (currently at 80% of the queue capacity)
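For reference, the difference between the first two options can be seen with plain `java.util.concurrent` queues. This is only a sketch of the queue semantics, not Smack code; the class and method names are made up for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical demo class; only the JDK queue behaviour is shown.
class FullQueueBehaviour {

    /** add() on a full bounded queue throws instead of parking the caller. */
    static boolean addThrowsWhenFull() {
        BlockingQueue<String> bounded = new ArrayBlockingQueue<>(2);
        bounded.add("stanza-1");
        bounded.add("stanza-2");
        try {
            bounded.add("stanza-3"); // put() would block here until space appears
            return false;
        } catch (IllegalStateException queueFull) {
            return true;
        }
    }

    /** An unbounded queue accepts every element, at the cost of memory. */
    static int unboundedSizeAfter(int count) {
        BlockingQueue<Integer> unbounded = new LinkedBlockingQueue<>();
        for (int i = 0; i < count; i++) {
            unbounded.add(i);
        }
        return unbounded.size();
    }

    public static void main(String[] args) {
        System.out.println(addThrowsWhenFull());      // true
        System.out.println(unboundedSizeAfter(1000)); // 1000
    }
}
```

A throwing add() turns the silent hang into an exception that the writer has to handle, whereas the unbounded queue removes the blocking point entirely and shifts the risk to memory usage.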

The underlying problem is that the client does not know how many inbound stanzas are between the current one and the ack. The number could potentially be very large…

On first thought, I’d go with unbounded queues by default, maybe making other options available through configuration. For applications where, by design, this kind of capacity is needed, developers can be expected to worry about scenarios revolving around resource contention.

Created SMACK-881 to track this.

BTW @Boris_Grozev since I think I saw that you disabled SM in Jitsi: You may be able to work around this issue with

connection.addRequestAckPredicate(ForEveryStanza.INSTANCE)

I am not sure whether SM provides any benefit in Jitsi’s case, though; I do not have enough insight into how Smack is used there.

What is even more troublesome is that Smack has always had this code:

if (requestAckPredicates.isEmpty()) {
    // Assure that we have at lest one predicate set up that so that we request acks
    // for the server and eventually flush some stanzas from the unacknowledged
    // stanza queue
    requestAckPredicates.add(Predicate.forMessagesOrAfter5Stanzas());
}

So if there are no other predicates configured, Smack will, by default, request an ack for every message or after 5 outgoing stanzas.
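As a rough illustration of what "every message or after 5 outgoing stanzas" means for a given sequence, here is a small self-contained model. It is not Smack's implementation: the `Kind` enum and the class are hypothetical, and I am assuming that a message resets the stanza counter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the default predicate; not Smack's classes.
class DefaultAckPredicateModel {
    enum Kind { MESSAGE, PRESENCE, IQ }

    private int sinceLastRequest = 0;

    /** True if an ack should be requested after sending a stanza of this kind. */
    boolean shouldRequestAck(Kind kind) {
        if (kind == Kind.MESSAGE) {
            sinceLastRequest = 0; // assumption: a message restarts the count
            return true;
        }
        if (++sinceLastRequest < 5) {
            return false;
        }
        sinceLastRequest = 0;
        return true;
    }

    /** Indices (0-based) of the outgoing stanzas that trigger an ack request. */
    static List<Integer> ackRequestPositions(List<Kind> outgoing) {
        DefaultAckPredicateModel predicate = new DefaultAckPredicateModel();
        List<Integer> positions = new ArrayList<>();
        for (int i = 0; i < outgoing.size(); i++) {
            if (predicate.shouldRequestAck(outgoing.get(i))) {
                positions.add(i);
            }
        }
        return positions;
    }

    public static void main(String[] args) {
        Kind p = Kind.PRESENCE;
        Kind m = Kind.MESSAGE;
        // The message at index 2 triggers a request; the fifth stanza after it does too.
        System.out.println(ackRequestPositions(Arrays.asList(p, p, m, p, p, p, p, p))); // [2, 7]
    }
}
```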

I have to think about how, with this active, the unacked stanza queue could become full, assuming the server actually sends a response to those ack requests. One could speculate that, if the server’s outgoing stanza queue is well filled and the ack responses are enqueued at the back rather than put at the front of the queue, then the response is delayed for so long that the unacked stanza queue hits its capacity.
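To put rough numbers on that speculation, here is a toy simulation. Everything in it is an assumption: the class is hypothetical, acks are requested after every fifth stanza, and the server's answer is modelled as arriving only after the client has sent a fixed number of further stanzas. It ignores blocking and simply reports the peak number of unacknowledged stanzas.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical toy model of delayed ack answers; not Smack code.
class AckDelayModel {

    /**
     * Peak number of unacknowledged stanzas when an ack is requested after
     * every requestEvery-th stanza and each answer arrives once ackDelay
     * further stanzas have been sent.
     */
    static int peakUnacked(int totalStanzas, int requestEvery, int ackDelay) {
        Deque<int[]> inFlight = new ArrayDeque<>(); // {dueWhenSentReaches, stanzasCovered}
        int acked = 0;
        int peak = 0;
        for (int sent = 1; sent <= totalStanzas; sent++) {
            // Stanza number `sent` has been written and recorded as unacknowledged.
            peak = Math.max(peak, sent - acked);
            if (sent % requestEvery == 0) {
                inFlight.add(new int[] { sent + ackDelay, sent });
            }
            // Process every answer that has arrived by now.
            while (!inFlight.isEmpty() && inFlight.peek()[0] <= sent) {
                acked = inFlight.poll()[1];
            }
        }
        return peak;
    }

    public static void main(String[] args) {
        System.out.println(peakUnacked(1000, 5, 0));  // 5
        System.out.println(peakUnacked(1000, 5, 95)); // 100
    }
}
```

In this model the peak is simply the request interval plus the delay, so a bounded queue of any fixed size can be filled if the answers are delayed by enough outgoing stanzas.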

Thanks for the suggestion! I don’t think we need SM for the connections we use Smack on, so we’ll just keep it disabled for now.
