Problems after connectivity loss. How can I debug them?

I'm using JM 2.2.0. I've not yet upgraded to 2.2.1 since I've seen a lot of people having problems recently on the forums and I don't want to open myself up to anything I'm not having to deal with now.

I've had this happen once or twice before, but I've yet to figure out how to fix it. Whenever my server happens to lose connectivity, everything in JM just comes to a standstill for the users connected to it after connectivity is restored. Logging out and back in does not fix anything. Transports are unavailable or don't work, rosters aren't downloaded, presence information isn't updated, etc. I actually have to kill -9 the java process for JM and restart it to get anything to come back.

So how can I fix this? If I can't fix it myself (I'm not a coder), what can I do to provide information to developers to diagnose and hopefully correct this problem?

How long do you wait before you try to restore the connection? 10 minutes? More? I'd wager that the connections are being held open on the JM side and not timing out. A quick look at the source suggests this is the case: the socket just sits there and waits for some input from the client. I'm sure the folks at Jive will set me straight if this isn't the case.
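
To show what I mean by a socket that never times out, here is a minimal Java sketch. It is not taken from the JM source; the class and method names are made up for illustration, and it only assumes the standard java.net API. A server-side socket waits for input from a client that stays silent. With no read timeout set, the read would block indefinitely; with setSoTimeout, it gives up after the stated interval.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.net.SocketTimeoutException;

// Hypothetical demo class; it does not reflect how JM handles its connections.
class ReadTimeoutSketch {

    /**
     * Accepts a loopback connection from a client that never sends anything,
     * then tries to read one byte from it.
     * Returns true if the read gave up with a timeout, false if it returned data or EOF.
     */
    static boolean readTimesOut(int timeoutMillis) {
        try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
             Socket silentClient = new Socket(server.getInetAddress(), server.getLocalPort());
             Socket serverSide = server.accept()) {
            // The default timeout is 0, which means "wait forever" on read().
            serverSide.setSoTimeout(timeoutMillis);
            InputStream in = serverSide.getInputStream();
            try {
                in.read();
                return false;
            } catch (SocketTimeoutException e) {
                // The peer stayed silent for timeoutMillis, so the read was abandoned.
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("read timed out: " + readTimesOut(500));
    }
}
```

A server that wants to drop dead connections would typically close the socket in the timeout branch, or send some kind of keepalive first. Whether JM does either is exactly what I'm unsure about.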

Where is the network break happening? Are the clients connecting directly to the server (i.e. on the same network segment, with no router between user and server), or are they coming across a few subnets to get there? Can you isolate where the fault is happening on the network, or is it out of your hands (beyond the ISP, out in the wild)?

Noah

Next time it happens, give us a thread dump.
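
On a Unix-like system with Sun's JVM, sending SIGQUIT to the java process (kill -3 with the process ID, not kill -9) makes the JVM print the stack of every thread to its standard output without stopping the process. Where that output ends up depends on how JM was launched. To illustrate what a thread dump contains, here is a small Java sketch (Java 5 or later). The class and method names are hypothetical, and it dumps its own JVM rather than a running JM instance.

```java
import java.util.Map;

// Hypothetical helper; it only illustrates the kind of information a thread dump holds.
class ThreadDumpSketch {

    /** Builds a plain-text listing of every live thread, its state, and its current stack. */
    static String dumpAllThreads() {
        StringBuilder out = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            out.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
            out.append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

In a dump taken while the server is stuck, the interesting threads are the ones that have been sitting in the same socket read or lock wait for a long time.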

Anywhere from about 10-30 minutes. It just never seems to recover. The loss of connectivity is between my server (which is in a colo facility in California) and the Internet. I'm guessing they've had a router die somewhere upstream from them. It's not like it's a common problem, but it's happened once or twice and Jive just goes wonky. My client connects directly to the server as I run CJC in an ssh session. Other clients are connecting over the Internet.