Wildfire consumes all Java memory, then swap, kills server

After upgrading to 3.0.1, the server seems to slowly eat up more and more memory until it uses up all the memory assigned to Java, then takes down the entire server rendering it useless. It does this on two different boxes. At first I thought maybe I upgraded incorrectly (because everything worked fine before…the two boxes were using 3.0 and 2.6.5 or whatever the last 2 release was), so I uninstalled the rpm, wiped the directories it created, and installed 3.0.1 from scratch. But it’'s still happening. This is only with about 3-4 test users on at a time…it takes about 5 hours to use up 128 megs of ram. If anyone could help it would be greatly appreciated.

I’‘m also getting lots of errors in the logs that I don’'t really know how to decipher. Let me know what to post that would help.

Error.log

2006.07.25 19:15:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

a76306[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2538,localport=5223]]

2006.07.25 19:25:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

cbbdf3[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2543,localport=5223]]

2006.07.25 19:25:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

83fff9[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2539,localport=5223]]

2006.07.25 19:27:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

20b3a3[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2540,localport=5223]]

2006.07.25 19:27:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

5ebc6e[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2544,localport=5223]]

2006.07.25 19:29:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

145132[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2541,localport=5223]]

2006.07.25 19:29:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

6076c4[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2545,localport=5223]]

2006.07.25 19:31:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

bb0257[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2542,localport=5223]]

2006.07.25 19:31:09 org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java:1 04) Connection closed before session established

1dac8a5[TLS_RSA_WITH_AES_128_CBC_SHA: Socket[addr=/69.54.15.69,port=2546,localport=5223]]

…etc.

debug.log

at org.jivesoftware.wildfire.net.SocketReader.createSession(SocketReader.java:432)

at org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java: 53)

at org.jivesoftware.wildfire.net.SocketReader.run(SocketReader.java:123)

at java.lang.Thread.run(Unknown Source)

2006.07.25 20:03:09 Error creating session

java.net.SocketException: Connection reset

at java.net.SocketInputStream.read(Unknown Source)

at com.sun.net.ssl.internal.ssl.InputRecord.readFully(Unknown Source)

at com.sun.net.ssl.internal.ssl.InputRecord.read(Unknown Source)

at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readRecord(Unknown Source)

at com.sun.net.ssl.internal.ssl.SSLSocketImpl.readDataRecord(Unknown Source)

at com.sun.net.ssl.internal.ssl.AppInputStream.read(Unknown Source)

at org.jivesoftware.wildfire.net.ServerTrafficCounter$InputStreamWrapper.read(Serv erTrafficCounter.java:183)

at sun.nio.cs.StreamDecoder$CharsetSD.readBytes(Unknown Source)

at sun.nio.cs.StreamDecoder$CharsetSD.implRead(Unknown Source)

at sun.nio.cs.StreamDecoder.read(Unknown Source)

at java.io.InputStreamReader.read(Unknown Source)

at org.xmlpull.mxp1.MXParser.fillBuf(MXParser.java:2992)

at org.xmlpull.mxp1.MXParser.more(MXParser.java:3046)

at org.xmlpull.mxp1.MXParser.parseProlog(MXParser.java:1410)

at org.jivesoftware.wildfire.net.MXParser.nextImpl(MXParser.java:331)

at org.xmlpull.mxp1.MXParser.next(MXParser.java:1093)

at org.jivesoftware.wildfire.net.SocketReader.createSession(SocketReader.java:432)

at org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java: 53)

at org.jivesoftware.wildfire.net.SocketReader.run(SocketReader.java:123)

at java.lang.Thread.run(Unknown Source)

2006.07.25 20:03:13 SSL Connect 1382897[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.54.15.69,port=2575,localport=5223]]

2006.07.25 20:03:33 SSL Connect 13a6057[SSL_NULL_WITH_NULL_NULL: Socket[addr=/216.217.20.12,port=3392,localport=5223]]

2006.07.25 20:03:57 SSL Connect 18ada25[SSL_NULL_WITH_NULL_NULL: Socket[addr=/67.82.34.33,port=3076,localport=5223]]

2006.07.25 20:04:06 SSL Connect 7af3e0[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.200.94.34,port=3579,localport=5223]]

2006.07.25 20:04:31 SSL Connect 1a418a3[SSL_NULL_WITH_NULL_NULL: Socket[addr=/24.225.253.117,port=2231,localport=5223]]

2006.07.25 20:05:13 SSL Connect 1d8e0f3[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.54.15.69,port=2576,localport=5223]]

2006.07.25 20:05:33 SSL Connect 3ae8ac[SSL_NULL_WITH_NULL_NULL: Socket[addr=/216.217.20.12,port=3401,localport=5223]]

2006.07.25 20:05:56 SSL Connect 4b6a6c[SSL_NULL_WITH_NULL_NULL: Socket[addr=/67.82.34.33,port=3077,localport=5223]]

2006.07.25 20:06:06 SSL Connect 1076683[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.200.94.34,port=3584,localport=5223]]

2006.07.25 20:06:31 SSL Connect d63b5b[SSL_NULL_WITH_NULL_NULL: Socket[addr=/24.225.253.117,port=2232,localport=5223]]

2006.07.25 20:07:13 SSL Connect 15d7e8e[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.54.15.69,port=2577,localport=5223]]

2006.07.25 20:07:33 SSL Connect acf892[SSL_NULL_WITH_NULL_NULL: Socket[addr=/216.217.20.12,port=3404,localport=5223]]

2006.07.25 20:07:56 SSL Connect 1cd1217[SSL_NULL_WITH_NULL_NULL: Socket[addr=/67.82.34.33,port=3078,localport=5223]]

2006.07.25 20:08:06 SSL Connect 13cae7[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.200.94.34,port=3589,localport=5223]]

2006.07.25 20:08:31 SSL Connect 1c65f7[SSL_NULL_WITH_NULL_NULL: Socket[addr=/24.225.253.117,port=2233,localport=5223]]

2006.07.25 20:09:13 SSL Connect 1e0123b[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.54.15.69,port=2578,localport=5223]]

2006.07.25 20:09:33 SSL Connect 79ca2e[SSL_NULL_WITH_NULL_NULL: Socket[addr=/216.217.20.12,port=3408,localport=5223]]

2006.07.25 20:09:56 SSL Connect 174aee5[SSL_NULL_WITH_NULL_NULL: Socket[addr=/67.82.34.33,port=3079,localport=5223]]

2006.07.25 20:10:06 SSL Connect 19849e5[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.200.94.34,port=3594,localport=5223]]

2006.07.25 20:10:31 SSL Connect 70bdc6[SSL_NULL_WITH_NULL_NULL: Socket[addr=/24.225.253.117,port=2234,localport=5223]]

2006.07.25 20:11:13 SSL Connect 120ab60[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.54.15.69,port=2579,localport=5223]]

2006.07.25 20:11:32 SSL Connect 182b18d[SSL_NULL_WITH_NULL_NULL: Socket[addr=/216.217.20.12,port=3413,localport=5223]]

2006.07.25 20:11:56 SSL Connect 196e397[SSL_NULL_WITH_NULL_NULL: Socket[addr=/67.82.34.33,port=3080,localport=5223]]

2006.07.25 20:12:06 SSL Connect 18fcb66[SSL_NULL_WITH_NULL_NULL: Socket[addr=/69.200.94.34,port=3599,localport=5223]]

I’‘ve also noticed the Active Client Sessions on the Sessions tab does not match up with the number of sessions shown. I currently have 3 users online, but it’‘s showing 16 active sessions. That doesn’'t seem good to me…

I’'ve stopped most of the other errors, but my warning log keeps filling up with this:

2006.07.26 13:01:19 SaslException

javax.security.sasl.SaslException: DIGEST-MD5: IO error acquiring password Caused by java.io.IOException: org.jivesoftware.wildfire.user.UserNotFoundException: 2014

at com.sun.security.sasl.digest.DigestMD5Server.validateClientResponse(Unknown Source)

at com.sun.security.sasl.digest.DigestMD5Server.evaluateResponse(Unknown Source)

at org.jivesoftware.wildfire.net.SASLAuthentication.handle(SASLAuthentication.java :248)

at org.jivesoftware.wildfire.net.SocketReadingMode.authenticateClient(SocketReadin gMode.java:117)

at org.jivesoftware.wildfire.net.BlockingReadingMode.readStream(BlockingReadingMod e.java:136)

at org.jivesoftware.wildfire.net.BlockingReadingMode.run(BlockingReadingMode.java: 62)

at org.jivesoftware.wildfire.net.SocketReader.run(SocketReader.java:123)

at java.lang.Thread.run(Unknown Source)

Caused by: java.io.IOException: org.jivesoftware.wildfire.user.UserNotFoundException: 2014

at org.jivesoftware.wildfire.net.XMPPCallbackHandler.handle(XMPPCallbackHandler.ja va:73)

… 8 more

I think this is related to the extra sessions shown since my debug log will show it’'s killing idle sessions after getting these errors.

Hi,

how much memory does your server have and did you specify -Xmx for Wildfire? With 16 users you should not see out of memory errors and taking a look a your logs you did not get OutOfMemory errors.

One must set Xmx to a value which makes sure that the server does not swap if the java memory is completely used, If you have a server with 96 MB this may be 32 MB or even less as your server may be running a lot of other services. As soon as the java memory or even a small part of it gets swapped out you’'ll encounter a lot of timeout errors.

LG

It looks like we’‘ve fixed the problem. We did 2 things, but I’'m not sure which fixed it. We already had the java memory settings set.

  1. We changed from the default ports because apparently we were getting a lot of hits that weren’'t from our client (though these did not show in the debug logs or anything).

  2. We solved the problem that was causing all of the UserNotFound exceptions. But shouldn’'t it handle not finding a user gracefully? I would think the person should just fail authorization rather than the server throwing an exception.

Either way it’'s been up for about a day now, sticking around 20megs used…so everything looks good.

Hi,

I just experienced the same problem with active client sessions. Sometimes this number becomes negative and in other tests it was much more larger then real session number.

Thx,

Tim

Hey Tim,

The issue JM-793 has been fixed for Wildfire 3.1. The bug fix is available in the nightly build version.

Thanks,

– Gato

Hey Gato,

I’'m testing the latest nightly build available and you are talking about 20 days ago. The problem still not fixed.

Thx,

Tim

Hey Gato,

Your fix doesn’'t resolve the problem. As I understood, you forgot to decrement session counter on line 916 in SessionManager.java. I think, that it is better to move decrement of session counter to removeSession function. Here is a diff of my changes:

Index: D:/Temp/wildfire/src/java/org/jivesoftware/wildfire/SessionManager.java

===================================================================

— D:/Temp/wildfire/src/java/org/jivesoftware/wildfire/SessionManager.java (re vision 5102)

+++ D:/Temp/wildfire/src/java/org/jivesoftware/wildfire/SessionManager.java (wo rking copy)

@@ -1401,7 +1401,14 @@

offline.setType(Presence.Type.unavailable);

router.route(offline);

}

  •    return auth_removed || preauth_removed;
    
  •    if (auth_removed || preauth_removed) {
    
  •        // Decrement the counter of user sessions
    
  •        usersSessionsCounter.decrementAndGet();
    
  •        return true;
    
  •    }
    
  •    else
    
  •         return false;
    

}

public void addAnonymousSession(ClientSession session) {

@@ -1455,10 +1462,7 @@

}

finally {

// Remove the session

  •                if (removeSession(session)) {
    
  •                    // Decrement the counter of user sessions
    
  •                    usersSessionsCounter.decrementAndGet();
    
  •                }
    
  •                removeSession(session);
    

}

}

catch (Exception e) {

Hey Tim,

Thanks for pointing that out. The fix is now in SVN.

Regards,

– Gato