Deadlock in HttpSession

Today, we ran into a deadlock that involved an HTTP Binding session. I've attached the deadlock report from the thread dump.

The bottom two threads are the most interesting ones. Of those two, the first thread holds a lock on an object that the second wants to lock, and vice versa. Deadlocked. The two classes involved are the HttpSession class (used to implement HTTP Binding) and BaseTransport (used in the gateway plugin).

The HttpSession sends out a packet to the gateway plugin. This plugin acquires a lock during processing.

While the gateway plugin is still processing the packet, the HttpSession gets its closeConnection() method called. This method is synchronized, which causes the HttpSession to lock on itself. As part of its execution, the closeConnection() method causes another packet (presence unavailable) to be sent to the plugin. Before the plugin can start processing this packet, it has to wait for the first lock to be released.

In the meantime, the first packet causes the plugin (which is still holding its lock) to send a packet back to the user. This packet gets delivered to HttpSession's deliver() method, which is synchronized. As the HttpSession instance was just locked by the closeConnection() method, a deadlock occurs.
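To make the cycle above concrete, here is a minimal, self-contained sketch. It is not Openfire code: sessionLock and pluginLock are hypothetical stand-ins for the HttpSession monitor and the lock held by the gateway plugin, and the class and method names are invented for illustration. Two threads take the two locks in opposite order, and the JVM's ThreadMXBean is used to confirm that they end up deadlocked.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;
import java.util.concurrent.CountDownLatch;

/** Illustration only: reproduces a two-lock cycle like the one described above. */
final class LockCycleSketch {

    /**
     * Starts two daemon threads that deadlock on each other and returns the
     * number of deadlocked threads the JVM reports (0 if none within the timeout).
     */
    static int reproduceAndCount(long timeoutMillis) {
        // Hypothetical stand-ins for the HttpSession monitor and the plugin's lock.
        final Object sessionLock = new Object();
        final Object pluginLock = new Object();
        final CountDownLatch bothHoldFirstLock = new CountDownLatch(2);

        // Plays the plugin: holds its own lock, then needs the session (deliver()).
        Thread pluginThread = new Thread(() -> {
            synchronized (pluginLock) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (sessionLock) {
                    // never reached: the other thread holds sessionLock
                }
            }
        }, "sketch-plugin");

        // Plays closeConnection(): holds the session, then needs the plugin's lock.
        Thread closeThread = new Thread(() -> {
            synchronized (sessionLock) {
                arriveAndWait(bothHoldFirstLock);
                synchronized (pluginLock) {
                    // never reached: the other thread holds pluginLock
                }
            }
        }, "sketch-close");

        // Daemon threads, so the stuck pair does not keep the JVM alive.
        pluginThread.setDaemon(true);
        closeThread.setDaemon(true);
        pluginThread.start();
        closeThread.start();

        // Poll the JVM's own monitor-deadlock detection until it sees the cycle.
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long deadline = System.currentTimeMillis() + timeoutMillis;
        while (System.currentTimeMillis() < deadline) {
            long[] ids = threads.findMonitorDeadlockedThreads();
            if (ids != null) {
                return ids.length;
            }
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return 0;
            }
        }
        return 0;
    }

    /** Signals that this thread holds its first lock, then waits for the other one. */
    private static void arriveAndWait(CountDownLatch latch) {
        latch.countDown();
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

The latch only makes the timing deterministic for the sketch; in the real server the same interleaving happens by chance, which is why the problem shows up only occasionally.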

We think that HttpSession uses a locking strategy that's very (too) generic: the entire object is locked by a number of methods. This prevents other methods from running simultaneously. Deadlocks are waiting to happen if one synchronized method causes another to be executed (SessionListeners that get fired in one method can be expected to generate a call to another method, for example). The locking mechanism of HttpSession might be improved by replacing synchronized methods with synchronized blocks that lock on a (very) specific object.
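As a rough illustration of that suggestion, assuming a much simplified session: the sketch below guards only its own state with a private lock object and fires listeners after that lock has been released. NarrowLockSession, CloseListener and stateLock are names invented for this sketch; none of them exist in Openfire, and the real HttpSession has far more state to consider.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Illustration only: a simplified session that avoids synchronized methods. */
final class NarrowLockSession {

    /** Hypothetical callback, standing in for a session listener. */
    interface CloseListener {
        void sessionClosed(NarrowLockSession session);
    }

    // Private lock object: it guards only the two fields below and is
    // never visible to (or lockable by) code outside this class.
    private final Object stateLock = new Object();
    private final List<String> pending = new ArrayList<>();
    private boolean closed = false;

    private final List<CloseListener> listeners = new CopyOnWriteArrayList<>();

    void addListener(CloseListener listener) {
        listeners.add(listener);
    }

    /** Queues a packet; the lock is held only while the queue is touched. */
    boolean deliver(String packet) {
        synchronized (stateLock) {
            if (closed) {
                return false;
            }
            pending.add(packet);
            return true;
        }
    }

    /** Marks the session closed, then notifies listeners without holding the lock. */
    void close() {
        synchronized (stateLock) {
            if (closed) {
                return;
            }
            closed = true;
        }
        // Listeners may call back into this session from any thread;
        // no session lock is held here, so they cannot block on it.
        for (CloseListener listener : listeners) {
            listener.sessionClosed(this);
        }
    }

    boolean isClosed() {
        synchronized (stateLock) {
            return closed;
        }
    }

    int pendingCount() {
        synchronized (stateLock) {
            return pending.size();
        }
    }
}
```

The important design choice is that no foreign code (listeners, plugins, routers) runs while stateLock is held, so another component's lock can never be acquired "inside" the session's lock.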

Please note that we run an adjusted version of the gateway plugin, so line numbers in the attached file might be off. We do use a locking mechanism similar to that of the "official" plugin, though. But this is beside the point, as the synchronization used in HttpSession can cause problems like these in any plugin that interacts with the client by sending/receiving packets.

We just found another HTTP-related deadlock in our system. This time no plugins were involved. Please have a look at the thread dump below.

Regards, Lars

Found one Java-level deadlock:
=============================
"pool-openfire1502":
  waiting to lock monitor 0x00002aaafd6ab5b0 (object 0x00002aaabeb95550, a org.jivesoftware.openfire.http.HttpSession),
  which is held by "pool-openfire182"
"pool-openfire182":
  waiting to lock monitor 0x00002aaaf4dd4280 (object 0x00002aaabeb957f8, a org.jivesoftware.openfire.http.HttpSession$HttpVirtualConnection),
  which is held by "pool-9-thread-546"
"pool-9-thread-546":
  waiting to lock monitor 0x00002aaafd6ab5b0 (object 0x00002aaabeb95550, a org.jivesoftware.openfire.http.HttpSession),
  which is held by "pool-openfire182"

Java stack information for the threads listed above:
===================================================
"pool-openfire1502":
        at org.jivesoftware.openfire.http.HttpSession.getLastActivity(HttpSession.java:312)
        - waiting to lock <0x00002aaabeb95550> (a org.jivesoftware.openfire.http.HttpSession)
        at org.jivesoftware.openfire.http.HttpSessionManager$HttpSessionReaper.run(HttpSessionManager.java:291)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
"pool-openfire182":
        at org.jivesoftware.openfire.net.VirtualConnection.close(VirtualConnection.java:139)
        - waiting to lock <0x00002aaabeb957f8> (a org.jivesoftware.openfire.http.HttpSession$HttpVirtualConnection)
        at org.jivesoftware.openfire.http.HttpSession.close(HttpSession.java:130)
        - locked <0x00002aaabeb95550> (a org.jivesoftware.openfire.http.HttpSession)
        at org.jivesoftware.openfire.http.HttpSessionManager$HttpSessionReaper.run(HttpSessionManager.java:293)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:441)
        at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
        at java.util.concurrent.FutureTask.run(FutureTask.java:138)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)
"pool-9-thread-546":
        at org.jivesoftware.openfire.http.HttpSession.closeConnection(HttpSession.java:636)
        - waiting to lock <0x00002aaabeb95550> (a org.jivesoftware.openfire.http.HttpSession)
        at org.jivesoftware.openfire.http.HttpSession.access$200(HttpSession.java:42)
        at org.jivesoftware.openfire.http.HttpSession$HttpVirtualConnection.closeVirtualConnection(HttpSession.java:707)
        at org.jivesoftware.openfire.net.VirtualConnection.close(VirtualConnection.java:144)
        - locked <0x00002aaabeb957f8> (a org.jivesoftware.openfire.http.HttpSession$HttpVirtualConnection)
        at org.jivesoftware.openfire.handler.IQAuthHandler.login(IQAuthHandler.java:211)
        at org.jivesoftware.openfire.handler.IQAuthHandler.handleIQ(IQAuthHandler.java:141)
        at org.jivesoftware.openfire.handler.IQHandler.process(IQHandler.java:48)
        at org.jivesoftware.openfire.IQRouter.handle(IQRouter.java:300)
        at org.jivesoftware.openfire.IQRouter.route(IQRouter.java:104)
        at org.jivesoftware.openfire.spi.PacketRouterImpl.route(PacketRouterImpl.java:67)
        at org.jivesoftware.openfire.SessionPacketRouter.route(SessionPacketRouter.java:110)
        at org.jivesoftware.openfire.SessionPacketRouter.route(SessionPacketRouter.java:67)
        at org.jivesoftware.openfire.http.HttpSession.sendPendingPackets(HttpSession.java:429)
        - locked <0x00002aaabebc3698> (a java.util.LinkedList)
        at org.jivesoftware.openfire.http.HttpSessionManager$HttpPacketSender.run(HttpSessionManager.java:311)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:885)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:907)
        at java.lang.Thread.run(Thread.java:619)

Found 1 deadlock.

Thanks for reporting this one:

http://www.igniterealtime.org/issues/browse/JM-1105

I should have a fix checked in for 3.3.0 momentarily.

Thanks again,

Alex