Potential deadlock in PacketWriter

(I’m a new user to the smack community and not sure if I shall post this to the bug tracker. Let me know if yes.)

This deadlock is unlikely to happen with a large queue, say, 500 (PacketWriter.java:61), but pretty repeatable when the queue’s capacity is small. Here is a scenario (there might be more):

Suppose the queue is empty.

  • At time slice 1: Thread 1, the Packet Writer, just enters the while loop of nextPacket() but haven’t called queue.wait() yet.
  • At time slice 2: Thread 2 calls sendPacket() N+1 times where N is the capacity of the queue, so it blocks on queue.put() as the queue is full.
  • At time slice 3: Thread 1 calls queue.wait()
    Both threads wait forever.

Here is an actual deadlock on my Ubuntu machine with queue capacity 1:

“Smack Packet Writer (0)” daemon prio=10 tid=0xb4689000 nid=0x7e6f in Object.wait() [0xb2fb3000…0xb2fb3f84]

java.lang.Thread.State: WAITING (on object monitor)

at java.lang.Object.wait(Native Method)

  • waiting on <0x1962f7b8> (a java.util.concurrent.ArrayBlockingQueue)

at java.lang.Object.wait(Object.java:502)

at org.jivesoftware.smack.PacketWriter.nextPacket(PacketWriter.java:173)

  • locked <0x1962f7b8> (a java.util.concurrent.ArrayBlockingQueue)

at org.jivesoftware.smack.PacketWriter.writePackets(PacketWriter.java:189)

at org.jivesoftware.smack.PacketWriter.access$000(PacketWriter.java:40)

at org.jivesoftware.smack.PacketWriter$1.run(PacketWriter.java:76)

“x” prio=10 tid=0x08241000 nid=0x7e69 waiting on condition [0xb3193000…0xb3193e84]

java.lang.Thread.State: WAITING (parking)

at sun.misc.Unsafe.park(Native Method)

  • parking to wait for <0x1962fe20> (a java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject)

at java.util.concurrent.locks.LockSupport.park(LockSupport.java:186)

at java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(Abs tractQueuedSynchronizer.java:1978)

at java.util.concurrent.ArrayBlockingQueue.put(ArrayBlockingQueue.java:280)

at org.jivesoftware.smack.PacketWriter.sendPacket(PacketWriter.java:95)

at org.jivesoftware.smack.XMPPConnection.sendPacket(XMPPConnection.java:480)

at org.jivesoftware.smackx.muc.MultiUserChat.sendMessage(MultiUserChat.ja

… …

And here is my proposed fix (also attached):

Index: E:/Users/weihan/Documents/_ws/smack/source/org/jivesoftware/smack/PacketWriter. java

===================================================================

— E:/Users/weihan/Documents/_ws/smack/source/org/jivesoftware/smack/PacketWriter. java (revision 11693)

+++ E:/Users/weihan/Documents/_ws/smack/source/org/jivesoftware/smack/PacketWriter. java (working copy)

@@ -98,9 +98,6 @@

ie.printStackTrace();

return;

}

  •        synchronized (queue) {
    
  •            queue.notifyAll();
    
  •        }
    

// Process packet writer listeners. Note that we’re using the sending

// thread so it’s expected that listeners are fast.

@@ -146,9 +143,6 @@

*/

public void shutdown() {

done = true;

  •    synchronized (queue) {
    
  •        queue.notifyAll();
    
  •    }
    

}

/**

@@ -165,19 +159,14 @@

  • @return the next packet for writing.

*/

private Packet nextPacket() {

  •    Packet packet = null;
    
  •    // Wait until there's a packet or we're done.
    
  •    while (!done && (packet = queue.poll()) == null) {
    
  •    while (true) {
    

try {

  •            synchronized (queue) {
    
  •                queue.wait();
    
  •            }
    
  •            return done ? null : queue.take();
    

}

catch (InterruptedException ie) {

// Do nothing

}

}

  •    return packet;
    

}

private void writePackets(Thread thisThread) {

Note that I removed the notification in shutdown(). If one really wants to notify the Writer thread on shutdown, I suggest to 1) remove the field done, 2) change the queue’s template type from Packet to a more general type, and insert a special-typed element to the queue whenever shutdown() is called. 3) writePackets() drain the queue and exits on receiving this special-typed element.
diff.txt.zip (814 Bytes)

Hi Weihan,

thanks for finding this bug.

Since Object.wait() and Object.notifyAll() shouldn’t be used anyway your solution is much better and makes use of the features of a blocking queue. But the Writer should still try to send all packet left in the queue on shutdown.

What do you think about this: Remove all .wait() and notifyAll() from the code and change the nextPacket() method to this:

private Packet nextPacket() {
    Packet packet = null;
    // Wait until there's a packet or we're done.
    try {
        while (!done && (packet = queue.poll(500, TimeUnit.MILLISECONDS)) == null);
    }
    catch (InterruptedException e) {
        // Do nothing
    }
    return packet;
}

Shouldn’t this fix the problem too without removing the shutdown functionality?

Do you have any use-case where it is useful to decrease/change the capacity of the queue? If so, maybe the capacity should be configurable, for example via the ConnectionConfiguration class.

If you have time you could also write a unit-test to verify the bug or your patch. Probably this libraries may help: Thread Weaver, Mockito and PowerMock.

I can file the bug for you in the bugtracker, if you want.

Best regards,

Henning

Hi Henning,

Yeah, queue.poll(…) also works, except it wastes a tiny bit cycles every half seconds

In my case I use xmpp as a transport layer, and I need packet-level QoS, i.e. high-priority packets may preempt low-priority ones and get sent first. So I have to set the queue capacity to 1 and implement packet priorities in a separate priority queue. It would be perfect if we could configure the capacity through ConnectionConfiguration.

I’ll write some unit tests when free–pretty busy lately. Lemme try…

Thanks,

Weihan