Keep alive not sent at fixed rate

Keep alive whitespaces are not sent at fixed rate, e.g. if your setting is 30s, it’s not sent 30s after the last packet you send.

Instead it is sent between an interval between 30s and 60s (the double) due to the way the code is written.

I fixed that:

public void run() {
            try {
                // Sleep 15 seconds before sending first heartbeat. This will give time to
                // properly finish TLS negotiation and then start sending heartbeats.
                Thread.sleep(15000);
            }
            catch (InterruptedException ie) {
                // Do nothing
            }
            while (!done && keepAliveThread == thread) {
                long timeToWait = delay - (System.currentTimeMillis() - lastActive);
                synchronized (writer) {
                    // Send heartbeat if no packet has been sent to the server for a given time
                    if (timeToWait <= 0) {
                        try {
                            writer.write(" ");
                            writer.flush();
                        }
                        catch (Exception e) {
                            if (!done) {
                                done = true;
                                connection.packetReader.notifyConnectionError(e);
                            }
                        }
                        lastActive = System.currentTimeMillis();
                        timeToWait = delay;
                    }
                }
                try {
                    // Sleep until we should write the next keep-alive.
                    Thread.sleep(timeToWait);
                }
                catch (InterruptedException ie) {
                    // Do nothing
                }
            }
        }

By the way my patch also reports a connection error if we can’t write the whitespace.

The patch file is attached to this discussion.
keepalive.patch (2764 Bytes)

Hi gperrot,

That looks like a good catch, I’m wondering if it would help with a reported problem that after a network failure, it takes 15-20 minutes to detect the outage. I found this with my program using Smack 3.1.0 and on Spark as well.

Presumably, this code would fire ConnectionListener.connectionClosedOnError and we could detect the network drop-out sooner.

In your code excerpt, what class is “writer”, is it ObservableWriter?

Thanks!

Phil