Ok, I've done more debugging, and some learning about TLS (the GnuTLS library, specifically). Gaim is waiting for a reply from a closed socket connection. I'm not yet ready to say whose fault that is (Wildfire? Java? Gaim? GnuTLS?)
In the Gaim backtrace, we are getting hung up on the GnuTLS library call to gnutls_bye(…), which has the following documentation:
/**
 * gnutls_bye - This function terminates the current TLS/SSL connection.
 * @session: is a &gnutls_session structure.
 * @how: is an integer
 *
 * Terminates the current TLS/SSL connection. The connection should
 * have been initiated using gnutls_handshake().
 * @how should be one of GNUTLS_SHUT_RDWR, GNUTLS_SHUT_WR.
 *
 * In case of GNUTLS_SHUT_RDWR then the TLS connection gets terminated and
 * further receives and sends will be disallowed. If the return
 * value is zero you may continue using the connection.
 * GNUTLS_SHUT_RDWR actually sends an alert containing a close request
 * and waits for the peer to reply with the same message.
 *
 * In case of GNUTLS_SHUT_WR then the TLS connection gets terminated and
 * further sends will be disallowed. In order to reuse the connection
 * you should wait for an EOF from the peer.
 * GNUTLS_SHUT_WR sends an alert containing a close request.
 *
 * This function may also return GNUTLS_E_AGAIN or GNUTLS_E_INTERRUPTED; cf.
 * gnutls_record_get_direction().
 **/
This seemed interesting, so I did some debugging of the session itself. Here is an ssldump of a normal SSL connection closing:
1 12 14.2167 (0.0007) S>CV3.0(32) application_data
1 13 14.2167 (0.0000) S>CV3.0(32) application_data
1 14 30.6220 (16.4052) C>SV3.0(32) Alert
1 15 30.6224 (0.0004) S>CV3.0(32) Alert
1 30.6244 (0.0020) S>C TCP FIN
1 30.6250 (0.0005) C>S TCP FIN
Notice that an alert gets sent from the client to the server, then from the server to the client, THEN the connection gets closed.
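To make that ordering concrete, here is a rough sketch in C that walks a list of records and reports how far the close exchange got. The `record` struct, the enums and `classify_close` are all made up for illustration; they are not part of ssldump, Gaim or GnuTLS. It also assumes every Alert is a close request and that the client starts the shutdown, which a dump of encrypted alerts cannot actually confirm.

```c
#include <stddef.h>

/* Direction of a record, as shown in the ssldump output (C>S or S>C). */
enum direction { CLIENT_TO_SERVER, SERVER_TO_CLIENT };

/* Hypothetical, simplified view of one ssldump line. */
struct record {
    enum direction dir;
    int is_alert;            /* nonzero if the record type is Alert */
};

enum close_state {
    CLOSE_NOT_STARTED,       /* no Alert seen from the client yet */
    CLOSE_WAITING,           /* client sent its Alert, nothing came back */
    CLOSE_COMPLETE           /* server answered with its own Alert */
};

/* Assumption: every Alert is a close request and the client goes first. */
static enum close_state classify_close(const struct record *recs, size_t n)
{
    enum close_state state = CLOSE_NOT_STARTED;
    size_t i;

    for (i = 0; i < n; i++) {
        if (!recs[i].is_alert)
            continue;        /* application_data does not change the state */
        if (recs[i].dir == CLIENT_TO_SERVER && state == CLOSE_NOT_STARTED)
            state = CLOSE_WAITING;
        else if (recs[i].dir == SERVER_TO_CLIENT && state == CLOSE_WAITING)
            state = CLOSE_COMPLETE;
    }
    return state;
}
```

Fed the four records from the dump above (two application_data, then the two Alerts), this should end in CLOSE_COMPLETE. A trace that stops after the client's Alert would stay in CLOSE_WAITING.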
Now, here it is with Gaim and Wildfire:
1 30 1.4051 (0.0008) S>CV3.1(288) application_data
1 31 5.5092 (4.1041) C>SV3.1(320) application_data
1 32 5.5098 (0.0005) C>SV3.1(176) application_data
1 33 5.5098 (0.0000) C>SV3.1(96) Alert
The client sends its Alert, then nothing. The client is waiting for a reply. Now, Gaim shouldn't wait forever; that's bad. I don't know if that's Gaim or GnuTLS, but the client should be able to tolerate this. On the server side, however, Wildfire should be completing this conversation.
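For what "tolerate this" could look like on the client side, here is a minimal sketch of a bounded wait using plain POSIX poll(). This is not how Gaim or GnuTLS actually do it; `wait_for_peer` is a hypothetical helper and the timeout is whatever the caller picks.

```c
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper, not part of Gaim or GnuTLS.
 * Waits up to timeout_ms milliseconds for the peer to do something on fd.
 * Returns 1 if poll() reported an event (data, EOF or an error condition),
 *         0 if the timeout expired with no activity,
 *        -1 if poll() itself failed (for example, interrupted by a signal). */
static int wait_for_peer(int fd, int timeout_ms)
{
    struct pollfd pfd;
    int rc;

    pfd.fd = fd;
    pfd.events = POLLIN;
    pfd.revents = 0;

    rc = poll(&pfd, 1, timeout_ms);
    if (rc < 0)
        return -1;
    if (rc == 0)
        return 0;   /* peer never answered: give up instead of hanging */
    return 1;
}
```

The idea would be to send the close Alert, call something like `wait_for_peer(sock, 5000)`, and close the socket anyway if it returns 0. Going by the gnutls_bye documentation quoted above, another option might be GNUTLS_SHUT_WR, which sends the close request without waiting for the peer to reply with the same message.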
In the course of debugging this, I also noticed that if you fire up two Gaim clients, connect with one, then try to disconnect (Gaim hangs), then connect with the other, the “hung” client cleans itself up. There is no network traffic from the hung client, just the new session from the second client. The second client doesn't have to be Gaim, and doesn't have to have the same resource; it seems any new connection for the same JID will do.