Login and Threading issue

Smack 4.2.0
Android application

11-21 14:47:48.715 D/SMACK: SENT (0): <stream:stream xmlns='jabber:client' to='backend' xmlns:stream='http://etherx.jabber.org/streams' version='1.0' from='user@backend' xml:lang='en'>
11-21 14:47:48.746 D/SMACK: RECV (0): <?xml version='1.0'?><stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='15E774FAF5FA6617' from='backend' version='1.0' xml:lang='en'><stream:features><starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'/><mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'><mechanism>PLAIN</mechanism><mechanism>DIGEST-MD5</mechanism><mechanism>SCRAM-SHA-1</mechanism></mechanisms><register xmlns='http://jabber.org/features/iq-register'/><amp xmlns='http://jabber.org/feature/amp'/><sm xmlns='urn:xmpp:sm:3'/></stream:features>
11-21 14:47:48.784 D/SMACK: SENT (0): <starttls xmlns='urn:ietf:params:xml:ns:xmpp-tls'></starttls>
11-21 14:47:48.808 D/SMACK: RECV (0): <proceed xmlns='urn:ietf:params:xml:ns:xmpp-tls'/>
11-21 14:47:49.551 D/SMACK: SENT (0): <stream:stream xmlns='jabber:client' to='backend' xmlns:stream='http://etherx.jabber.org/streams' version='1.0' from='user@backend' xml:lang='en'>
11-21 14:47:49.631 D/SMACK: RECV (0): <?xml version='1.0'?><stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='DB711ED103AEC73A' from='backend' version='1.0' xml:lang='en'>
11-21 14:47:49.636 D/SMACK: RECV (0): <stream:features><mechanisms xmlns='urn:ietf:params:xml:ns:xmpp-sasl'><mechanism>PLAIN</mechanism><mechanism>DIGEST-MD5</mechanism><mechanism>SCRAM-SHA-1</mechanism></mechanisms><register xmlns='http://jabber.org/features/iq-register'/><amp xmlns='http://jabber.org/feature/amp'/><sm xmlns='urn:xmpp:sm:3'/></stream:features>
11-21 14:47:49.770 D/SMACK: SENT (0): <auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='SCRAM-SHA-1'>biwsbj0zMjQ4NjkwOTI1NyxyPU8tTXtnV29mfGRFaG02RFJvKX5vJCVncHJ3THkuVU4m</auth>
11-21 14:47:49.809 D/SMACK: RECV (0): <challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>cj1PLU17Z1dvZnxkRWhtNkRSbyl+byQlZ3Byd0x5LlVOJmxkaDkvNHg2dVFPbWtOdCszeDNKWmc9PSxzPSs1K2hKdHFtUUlGMEt4cmo2OU9SVWc9PSxpPTQwOTY=</challenge>\

11-21 14:47:52.999 V/FA: Inactivity, disconnecting from the service
11-21 14:47:54.535 D/SyncContactsTask: Init persisted items finished
11-21 14:47:54.536 D/SyncContactsTask: Checking changes
11-21 14:47:55.299 I/zygote64: Compiler allocated 8MB to compile void android.widget.TextView.<init>(android.content.Context, android.util.AttributeSet, int, int)
11-21 14:47:56.412 D/SyncContactsTask: Initialized RawContacts. Updated contacts: 0
11-21 14:47:56.413 D/SyncContactsTask: Contacts are up to date. Aborting

11-21 14:47:59.796 W/AbstractXMPPConnection: Connection XMPPTCPConnection[not-authenticated] (0) closed with error
                                             org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 10000ms (~10s). While waiting for successful SASL authentication
                                                 at org.jivesoftware.smack.SASLAuthentication.authenticate(SASLAuthentication.java:223)
                                                 at org.jivesoftware.smack.tcp.XMPPTCPConnection.loginInternal(XMPPTCPConnection.java:385)
                                                 at org.jivesoftware.smack.AbstractXMPPConnection.login(AbstractXMPPConnection.java:491)
                                                 at org.jivesoftware.smack.AbstractXMPPConnection.login(AbstractXMPPConnection.java:448)
                                                 at com.app.AppService.login(AppService.java:254)
                                                 at android.os.AsyncTask$2.call(AsyncTask.java:333)
                                                 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
                                                 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1162)
                                                 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:636)
                                                 at java.lang.Thread.run(Thread.java:764)

11-21 14:48:00.228 D/SMACK: SENT (0): <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>Yz1iaXdzLHI9Ty1Ne2dXb2Z8ZEVobTZEUm8pfm8kJWdwcndMeS5VTiZsZGg5LzR4NnVRT21rTnQrM3gzSlpnPT0scD0wZWUxZDhXTEdwMkpzUE9pQ3I4eUZWUWJOT289</response>
11-21 14:48:00.252 D/SMACK: RECV (0): <success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>dj1WWjl2MzZjNlVmRWRSZWt2cGNtS2ttOEYxRGs9</success>
11-21 14:48:00.271 D/SMACK: SENT (0): <stream:stream xmlns='jabber:client' to='backend' xmlns:stream='http://etherx.jabber.org/streams' version='1.0' from='user@backend' id='DB711ED103AEC73A' xml:lang='en'>
11-21 14:48:00.354 D/SMACK: RECV (0): <?xml version='1.0'?><stream:stream xmlns='jabber:client' xmlns:stream='http://etherx.jabber.org/streams' id='8EFD75CEC577B6E7' from='backend' version='1.0' xml:lang='en'>
11-21 14:48:00.362 D/SMACK: RECV (0): <stream:features><bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'/><session xmlns='urn:ietf:params:xml:ns:xmpp-session'/><register xmlns='http://jabber.org/features/iq-register'/><amp xmlns='http://jabber.org/feature/amp'/><sm xmlns='urn:xmpp:sm:3'/></stream:features>

In our application, from time to time we get this kind of error while logging in. There is no exact way to reproduce it.
After digging the code/debugging the app, I have noticed the following:
Since 11-21 14:47:49.809 till 11-21 14:48:00.228 the application was doing nothing with the connection. Basically, method org.jivesoftware.smack.SASLAuthentication#authenticate was in a waiting loop, and org.jivesoftware.smack.SASLAuthentication#challengeReceived(java.lang.String) was never invoked after the 11-21 14:47:49.809 D/SMACK: RECV (0): <challenge stanza was received. And the challengeReceived method was only invoked after the authenticate method has finished it’s waiting loop with an exception.
Taking into an account this, I assume that somehow PacketReader is getting blocked by some of the Threads causing the impossibility of logging in. As it is happening during the application launch, there are some other AsyncTasks running in the background, so it might be the case that one of them is causing the problem.

I am having big problems with debugging it and troubleshooting this. Do you have any clues, ideas and reasons of why PacketParser can be locked?

Dmitry

It looks more like PacketWriter is blocked. Notice that the SASL <response/> element is delayed. A thread dump taken within that (5 second) window would be very helpful! Have a look at https://stackoverflow.com/a/13592951/194894

Flow, thanks for response.

Some more investigation. I have put some logging into method org.jivesoftware.smack.sasl.SASLMechanism#challengeReceived, so that it looks like:

Log.w("WTF", "challengeReceived.start");
        byte[] challenge = Base64.decode((challengeString != null && challengeString.equals("=")) ? "" : challengeString);
        Log.w("WTF", "challengeReceived.evaluate");
        byte[] response = evaluateChallenge(challenge);
        if (finalChallenge) {
            return;
        }

        Log.w("WTF", "challengeReceived.responseBuild");
        Response responseStanza;
        if (response == null) {
            responseStanza = new Response();
        }
        else {
            responseStanza = new Response(Base64.encodeToString(response));
        }

        Log.w("WTF", "challengeReceived.sendNonza");
        // Send the authentication to the server
        connection.sendNonza(responseStanza);
        Log.w("WTF", "challengeReceived.finish");

Then after launching the app, I have got the following logs:

11-21 18:14:52.423 D/SMACK: SENT (0): <auth xmlns='urn:ietf:params:xml:ns:xmpp-sasl' mechanism='SCRAM-SHA-1'>biwsbj0zMjQ4NjkwOTI1NyxyPTlPV28zem5pW2xlZT02TE1YL0JYW2g6KWV3RzUtWUZd</auth>
11-21 18:14:52.461 D/SMACK: RECV (0): <challenge xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>cj05T1dvM3puaVtsZWU9NkxNWC9CWFtoOilld0c1LVlGXTlEYytPNE1uQVc4TVZwOE50MkFMUnc9PSxzPW1vdG5JWVF6ZHRWSjBzMXNsaUhQS1E9PSxpPTQwOTY=</challenge>
11-21 18:14:53.393 E/WTF: challengeReceived.start
11-21 18:14:53.393 W/WTF: challengeReceived.start
11-21 18:14:53.393 W/WTF: challengeReceived.evaluate
... Nothing valuable here.
11-21 18:15:03.571 W/WTF: challengeReceived.responseBuild
11-21 18:15:03.572 W/WTF: challengeReceived.sendNonza
11-21 18:15:03.573 W/WTF: challengeReceived.finish
11-21 18:15:03.579 D/SMACK: SENT (0): <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>Yz1iaXdzLHI9OU9XbzN6bmlbbGVlPTZMTVgvQlhbaDopZXdHNS1ZRl05RGMrTzRNbkFXOE1WcDhOdDJBTFJ3PT0scD1SbWNYNzQ3dyt5WTI4V1RPOFBqVzU0WXNXMTQ9</response>
11-21 18:15:03.602 D/SMACK: RECV (0): <success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>dj03a3lCa0JNVWpXZzBTUFVYMkQ5RWd5VnFJYWM9</success>

So, evaluate method, which is doing nothing (see below) is blocking? And are there any tricks with this method that actually re-implement it? I could not find in the code anything.

protected byte[] evaluateChallenge(byte[] challenge) throws SmackException {
        return null;
    }

Notice that the SASL element is delayed

True, but according to the logs below, it is working fine.

11-21 18:15:03.572 W/WTF: challengeReceived.sendNonza
11-21 18:15:03.573 W/WTF: challengeReceived.finish
11-21 18:15:03.579 D/SMACK: SENT (0): <response xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>Yz1iaXdzLHI9OU9XbzN6bmlbbGVlPTZMTVgvQlhbaDopZXdHNS1ZRl05RGMrTzRNbkFXOE1WcDhOdDJBTFJ3PT0scD1SbWNYNzQ3dyt5WTI4V1RPOFBqVzU0WXNXMTQ9</response>
11-21 18:15:03.602 D/SMACK: RECV (0): <success xmlns='urn:ietf:params:xml:ns:xmpp-sasl'>dj03a3lCa0JNVWpXZzBTUFVYMkQ5RWd5VnFJYWM9</success>

I can provide you thread dumps, let me know at which point to make them, so that I provide you the most detailed information

If this is some sort of synchronization issue, then using printf style debugging is likely going to change the timings and make the issue possibly disappear. If you are able to reliable reproduce the issue, then taking a thread dump is the way to go. It’s easy, just follow the link I gave above.

SCRAM-SHA-1 is implemnted in the ScramMechanism class. Look there for evaluateChallenge().

Here comes the Thread-Dump
threads_report.txt (40.9 KB)
So far I don’t really see there any errors.
I have taken the thread dump at the moment of throw NoResponseException.newWith(connection, "successful SASL authentication"); at org.jivesoftware.smack.SASLAuthentication#authenticate

What does “at the moment” mean? It is crucial that the threaddump is taken after the <challenge/> is received, but before the exception is thrown, i.e., at the time window where no progress is made.

At the moment, means:
I have taken it right before the Exception was about to be thrown. I have put the breakpoint on that line and taken it at that moment. It is right after the wait loop is finished.
There’s a thread "Smack Packet Reader (0)@7168" daemon prio=5 tid=0x7f14 nid=NA runnable which shows that there’s challenge is being processed.

Ok, now please try to get a threaddump while the thread calling login is blocking in SASLAuthentication.authenticate() wait(). And avoid breakpoints.

Maybe increase the reply timeout from the default of 5 seconds, obverve the point in time when the <challenge/> is received, and take, possibly multiple threaddumps, right after that.