As the problem only occurs in production, we’‘re not comfortable running a beta JVM that includes a questionable fix. Instead, we’‘d ideally like to see the entire Semaphore mechanism to be removed. As the same functionality is implementable by ordinary Java synchronization, I’‘d opt for that. I think it’'s less complex anyway.
What struck me as odd was that The HttpSessionManager (through the work thread), first polls the session for remaining packets (doing tricky semaphore stuff here to make sure only one consumer is active at a time). After that, the HttpSessionManager creates a router object based on that same Session (from which it just extracted data), which in the end is used to send any queued packets (again, the stuff from the Session).
It seemed more logical to me to make the HttpSession itself responsible for routing any queued packets. By doing that, you can remove the method that queries the Session for pending packets and the method that releases the lock that’'s set by that operation. Instead, one new ‘‘sendPendingPackets’’ method is added, that simply sends any remaining packets itself. By making that method synchronized, we make sure that no two worker threads act on that specific queue simultaneously. Separate Sessions can still be handled concurrently though.
I’'ve included our suggested modifications in the attachment. Please note that their are based on the 3.2.3 tag in SVN.
I’‘ve tried to change as little as possible to other code that isn’‘t directly related. I think there’‘s a couple of other modifications that could be considered, but that’‘s something for a possible step 2. I’'ve made some comments in the code that reflect some of these changes.
This is my proposal for the Runnable in HttpSessionManager. I’'ve commented out the original code.
/** * A runner that gurantees that the packets per a session will be sent and * processed in the order in which they were received. */
private class HttpPacketSender implements Runnable
{
private HttpSession session; HttpPacketSender(HttpSession session)
{
this.session = session;
} public void run()
{
// Collection<Element> elements = null;
// try
// {
// elements = session.getPacketsToSend(20, TimeUnit.MILLISECONDS);
// }
// catch (Throwable t)
// {
// /** Do nothing * */
// }
// if (elements == null)
// {
// this.init();
// return;
// }
//
// SessionPacketRouter router = new SessionPacketRouter(session);
//
// try
// {
// for (Element packet : elements)
// {
// try
// {
// router.route(packet);
// }
// catch (UnsupportedEncodingException e)
// {
// Log.error(
// "Client provided unsupported encoding type in auth request",
// e);
// }
// catch (UnknownStanzaException e)
// {
// Log.error("Client provided unknown packet type", e);
// }
// }
// }
// finally
// {
// session.releasePacketsToSend();
// }
session.sendPendingPackets();
} /*
* TODO: the inital call to init() if elements == null seems to encourage never-ending loops. I removed
* it in my alternative, but we should check if this is safe. If it is not, we can easily adjust
* HttpSession.sendPendingPackets() to return a boolean value instead.
*/
private void init()
{
sendPacketPool.execute(this);
}
}
Next, the addition I made to HttpSession. Note that much of this code is directly taken from the code that I commented out in the previous listing.
/** * This methods sends any pending packets in the session. If no packets are * pending, this method simply returns. The method is declared synchronized * to avoid simultanious sending operations on this Session object. If two * threads try to run this method simultaniously, the first one will trigger * the pending packets to be sent, while the second one will simply return * (as there are no packets left to send). */
protected synchronized void sendPendingPackets()
{
if (packetsToSend.size() <= 0)
{
return;
}
final Collection<Element> elements = packetsToSend.remove(); /*
* TODO: This is a result of copy/pasting the code from
* HttpSessionManager. Can we make this SessionPacketRouter a class
* attribute? That would save us from having to recreate it for each
* batch of packets that''s being sent.
*/
final SessionPacketRouter router = new SessionPacketRouter(this); for (Element packet : elements)
{
try
{
router.route(packet);
}
catch (UnsupportedEncodingException e)
{
Log.error(
"Client provided unsupported encoding type in auth request",
e);
}
catch (UnknownStanzaException e)
{
Log.error("Client provided unknown packet type", e);
}
}
}
Note that the getPacketsToSend() and releasePacketsToSend() methods in HttpSession are no longer used.
We’‘ve successfully deployed an Openfire patched like this to our test domain. We haven’'t tried stress or load tests, but normal client behavior (using http binding) seems to be ok.