powered by Jive Software

ReconnectionManager + XMPP tcp connection leaking sockets

We experienced a problem where we saw the reconnection manager stopped reconnecting.
And it turned out it was due to received closing </stream> element. Server wants to terminate the connection, calling disconnect().
Further looking at the code we saw that the socket was never closed and just a new one is created when ReconnectionManager called connection.connect().
So there is one problem here where this means the logic is leaking sockets as it never closes those.
And our problem came from there, that while trying to reconnect, close element was received on the old socket as the server detected the client not responding and was trying to close the session.

My question is what is the best way to fix that? Closing the socket is internal to the xmpp tcp implementation, at the same time ReconnectionManager is common for all types of connections?
Or is it fine the ReconnectionManager to do connection.disconnect(false) on reconnectionFailed … so the tcp socket is closed?
Can you help with hints?

Here are some logs from the problem (we are using smack 4.2.4 branch).

JVB 2020-06-18 04:03:07.303 WARNING: [11052] org.jivesoftware.smack.AbstractXMPPConnection.callConnectionClosedOnErrorListener: Connection XMPPTCPConnection[jvb@auth.jvb.example.com/cnA7VGg-] (0) closed with error
java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
	at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
	at sun.security.ssl.InputRecord.read(InputRecord.java:532)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:990)
	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:948)
	at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.read1(BufferedReader.java:210)
	at java.io.BufferedReader.read(BufferedReader.java:286)
	at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:2992)
	at org.xmlpull.mxp1.MXParser.more(MXParser.java:3046)
	at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1144)
	at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.parsePackets(XMPPTCPConnection.java:1248)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.access$300(XMPPTCPConnection.java:1000)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader$1.run(XMPPTCPConnection.java:1016)
	at java.lang.Thread.run(Thread.java:748)
JVB 2020-06-18 04:03:07.304 WARNING: [11052] [hostname= id=shard] MucClient$1.connectionClosedOnError#295: Closed on error:
java.net.SocketException: Connection reset
	at java.net.SocketInputStream.read(SocketInputStream.java:210)
	at java.net.SocketInputStream.read(SocketInputStream.java:141)
	at sun.security.ssl.InputRecord.readFully(InputRecord.java:465)
	at sun.security.ssl.InputRecord.readV3Record(InputRecord.java:593)
	at sun.security.ssl.InputRecord.read(InputRecord.java:532)
	at sun.security.ssl.SSLSocketImpl.readRecord(SSLSocketImpl.java:990)
	at sun.security.ssl.SSLSocketImpl.readDataRecord(SSLSocketImpl.java:948)
	at sun.security.ssl.AppInputStream.read(AppInputStream.java:105)
	at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:284)
	at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:326)
	at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:178)
	at java.io.InputStreamReader.read(InputStreamReader.java:184)
	at java.io.BufferedReader.read1(BufferedReader.java:210)
	at java.io.BufferedReader.read(BufferedReader.java:286)
	at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:2992)
	at org.xmlpull.mxp1.MXParser.more(MXParser.java:3046)
	at org.xmlpull.mxp1.MXParser.nextImpl(MXParser.java:1144)
	at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.parsePackets(XMPPTCPConnection.java:1248)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.access$300(XMPPTCPConnection.java:1000)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader$1.run(XMPPTCPConnection.java:1016)
	at java.lang.Thread.run(Thread.java:748)
JVB 2020-06-18 04:03:22.687 WARNING: [24211] [hostname= id=shard] MucClient$1.reconnectionFailed#326: Reconnection failed: 
org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 5000ms (~5s). While waiting for establishing TLS
	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:93)
	at org.jivesoftware.smack.SynchronizationPoint.checkForResponse(SynchronizationPoint.java:272)
	at org.jivesoftware.smack.SynchronizationPoint.checkIfSuccessOrWait(SynchronizationPoint.java:157)
	at org.jivesoftware.smack.SynchronizationPoint.checkIfSuccessOrWaitOrThrow(SynchronizationPoint.java:128)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection.connectInternal(XMPPTCPConnection.java:908)
	at org.jivesoftware.smack.AbstractXMPPConnection.connect(AbstractXMPPConnection.java:383)
	at org.jivesoftware.smack.ReconnectionManager$2.run(ReconnectionManager.java:289)
	at java.lang.Thread.run(Thread.java:748)
JVB 2020-06-18 04:03:33.088 INFO: [24216] org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketReader.parsePackets: XMPPTCPConnection[jvb@auth.jvb.example.com/cnA7VGg-] (0) received closing </stream> element. Server wants to terminate the connection, calling disconnect()


JVB 2020-06-18 04:03:38.088 WARNING: [24211] [hostname= id=shard] MucClient$1.reconnectionFailed#326: Reconnection failed: 
org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 5000ms (~5s). While waiting for establishing TLS
	at org.jivesoftware.smack.SmackException$NoResponseException.newWith(SmackException.java:93)
	at org.jivesoftware.smack.SynchronizationPoint.checkForResponse(SynchronizationPoint.java:272)
	at org.jivesoftware.smack.SynchronizationPoint.checkIfSuccessOrWait(SynchronizationPoint.java:157)
	at org.jivesoftware.smack.SynchronizationPoint.checkIfSuccessOrWaitOrThrow(SynchronizationPoint.java:128)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection.connectInternal(XMPPTCPConnection.java:908)
	at org.jivesoftware.smack.AbstractXMPPConnection.connect(AbstractXMPPConnection.java:383)
	at org.jivesoftware.smack.ReconnectionManager$2.run(ReconnectionManager.java:289)
	at java.lang.Thread.run(Thread.java:748)
JVB 2020-06-18 04:03:38.089 INFO: [24216] [hostname= id=shard] MucClient$1.connectionClosed#289: Closed.
JVB 2020-06-18 04:03:38.089 WARNING: [24215] org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.writePackets: Exception flushing queue during shutdown, ignore and continue
java.lang.NullPointerException
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.writePackets(XMPPTCPConnection.java:1474)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.access$3300(XMPPTCPConnection.java:1264)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter$1.run(XMPPTCPConnection.java:1312)
	at java.lang.Thread.run(Thread.java:748)
JVB 2020-06-18 04:03:38.091 WARNING: [24215] org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.writePackets: Exception writing closing stream element
java.lang.NullPointerException
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.writePackets(XMPPTCPConnection.java:1484)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter.access$3300(XMPPTCPConnection.java:1264)
	at org.jivesoftware.smack.tcp.XMPPTCPConnection$PacketWriter$1.run(XMPPTCPConnection.java:1312)
	at java.lang.Thread.run(Thread.java:748)

@Flow any thoughts on this?
Hum, I see that actually after “received closing element. Server wants to terminate the connection, calling disconnect()” disconnect is called …
Not sure what had gone wrong …

I’m not sure what was the path and timings of events this to happen … but I see that the code around notifications and errors on packetReader error is changed in the master code so I suppose we are on our own with this problem :frowning:

ReconnectionManager starts its work based on the connectionClosedOnError() callback. Whenever this callbacked is invoked the connection, and all related open handles of it, should be closed. If a connection implementation does not do this, then this would be an issue with that connection implementation.

I am a little bit in a hurry right now, so I do not have time to have a close look at Smack 4.2 code.

That is right. Especially considering that you are one release (4.2) behind the current release (4.3). You are not alone though, I am happy to help if I find that time. That said, I am mostly interesting in fixing issues in the release branch (‘4.3’ as of writing this) or the ‘master’ branch. I am unable to support older releases as part of my spare time work.