No Response within reply timeout - 4.3.1

Hello,
we are using Smack 4.3.1, which seems to be the latest release version on Maven.

When creating a query to message archive management (see code below) we receive that no-response exception every few requests.

Running on Android, every version from API 21 (5.0) up to API 28 (9.0) produces the same exceptions.
The app is then blocked for 5 seconds before it continues. We added code to retry the last query, which works around the problem, but fetching the full archive takes forever with that approach.

I found a note somewhere saying that this seems to be fixed in some branch and that the cause was a lock that stayed deadlocked until a timeout, but I can’t find that page anymore.

MamManager.MamQueryArgs args = new MamManager.MamQueryArgs.Builder()
        .limitResultsSince(new Date(lower))  // start
        .limitResultsBefore(new Date(upper)) // end
        .setResultPageSize(100)
        .build();

MamManager.MamQuery query = mamManager.queryArchive(args);
processArchiveMessages(query.getMessages());

while (!query.isComplete()) {
    processArchiveMessages(query.pageNext(100));
}

We put this in a try/catch with a retry counter, and normally a query is successful within 3 retries. But this means up to 15 seconds(!!) of waiting time for a single page…
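
For readers who want to see the shape of such a workaround, here is a minimal sketch of a retry wrapper. It is not the code from the app: `RetrySketch` and `withRetry` are hypothetical names, the example uses only the Java standard library, and the simulated failure merely stands in for Smack's `NoResponseException`. In a real client the catch would presumably be narrowed to that exception type instead of every `Exception`.

```java
import java.util.concurrent.Callable;

final class RetrySketch {

    /**
     * Hypothetical helper: runs the task and retries it when it throws,
     * giving up after maxAttempts attempts and rethrowing the last failure.
     */
    static <T> T withRetry(Callable<T> task, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return task.call();
            } catch (Exception e) {
                // Assumption: every failure is retryable. Real code would
                // likely retry only on the no-response timeout.
                last = e;
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated page request that fails twice before succeeding.
        String page = withRetry(() -> {
            if (++calls[0] < 3) {
                throw new IllegalStateException("simulated timeout");
            }
            return "page";
        }, 3);
        System.out.println(page + " fetched after " + calls[0] + " attempts");
    }
}
```

Note that each failed attempt still costs the full reply timeout, which is why this only masks the problem: with a 5000 ms timeout and three attempts, a single page can take up to 15 seconds.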

If there is something we can do about that, please help.

Thanks, Mike

Unfortunately your post does not contain enough information to provide any assistance. Please have a look at https://github.com/igniterealtime/Smack/wiki/How-to-ask-for-help,-report-an-issue-and-possible-solve-the-problem-yourself

Hey there,
sorry for the incomplete post.

When looping through the pages of the query, 2 out of 3 requests throw that exception.

Here is a stack trace

10-23 07:06:28.486 6152-6231/com.xxx.yyy W/XmppAccount: Got no response within timeout. Retrying day "2018-10-18" (1/3).
    (+) org.jivesoftware.smack.SmackException$NoResponseException: No response received within reply timeout. Timeout was 5000ms (~5s). Waited for response using: IQReplyFilter: iqAndIdFilter (AndFilter: (OrFilter: (IQTypeFilter: type=error, IQTypeFilter: type=result), StanzaIdFilter: id=F67Cj-42)), : fromFilter (OrFilter: (FromMatchesFilter (full): null, FromMatchesFilter (ignoreResourcepart): user@server.at, FromMatchesFilter (full): development.server.at)).
    (+) No response received within reply timeout. Timeout was 5000ms (~5s). Waited for response using: IQReplyFilter: iqAndIdFilter (AndFilter: (OrFilter: (IQTypeFilter: type=error, IQTypeFilter: type=result), StanzaIdFilter: id=F67Cj-42)), : fromFilter (OrFilter: (FromMatchesFilter (full): null, FromMatchesFilter (ignoreResourcepart): user@server.at, FromMatchesFilter (full): development.server.at)).
    (+) org.jivesoftware.smack.StanzaCollector.nextResultOrThrow()#265
    (+) org.jivesoftware.smack.StanzaCollector.nextResultOrThrow()#219
    (+) org.jivesoftware.smackx.mam.MamManager.queryArchivePage()#915
    (+) org.jivesoftware.smackx.mam.MamManager.queryArchive()#898
    (+) org.jivesoftware.smackx.mam.MamManager.queryArchive()#653
    (+) com.ourapp.im.xmpp.XmppAccount.archiveThreadLoop()#1382
    (+) com.ourapp.im.xmpp.XmppAccount.lambda$17U2zsBS4wuM0a31kAeIWHEMSH8()#-1
    (+) com.ourapp.im.xmpp.-$$Lambda$XmppAccount$17U2zsBS4wuM0a31kAeIWHEMSH8.run()#-1
    (+) java.lang.Thread.run()#818

It’s hard to provide more information than that, as there really is no more code on our side. Only those few lines of code are involved… while (!complete) { process(nextpage) }

It happens on every device with every Android version, every time.
Maybe I misunderstood something conceptual about the message archive?

Any help or hint on how to hunt this one down would be greatly appreciated.
Thanks

Is that so? The linked page also mentions an XMPP trace, which would be helpful, as it would provide timing information (when the request is sent, when the response is received, etc.).

I’d use good ol’ systematic debugging. :slight_smile: For example, if there is a deadlock (which is possibly resolved after a timeout), then a thread dump taken at the time of the deadlock would be helpful.
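
One possible way to capture such a thread dump from inside the app is sketched below. This is only an illustration using the standard `Thread.getAllStackTraces()` call; `ThreadDumpSketch` and `dumpAllThreads` are hypothetical names, and where the dump is triggered (for example from the catch block of the timeout exception) is an assumption left to the reader.

```java
import java.util.Map;

final class ThreadDumpSketch {

    /** Renders the name, state and stack trace of every live thread as text. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            sb.append('"').append(thread.getName()).append('"')
              .append(" state=").append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // In an app this would be logged at the moment the timeout is hit.
        System.out.println(dumpAllThreads());
    }
}
```

Threads shown as BLOCKED or WAITING inside library code at the time of the timeout would be the first place to look.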

Please note that I also provide professional services.

We found it.

We had to disable stream management on the connection:

connection.setUseStreamManagement(false);
connection.setUseStreamManagementResumption(false);

Then the deadlock/timeouts are gone.

Is it an Openfire server? There was a bug with stream management, which should be fixed in Openfire 4.3.0. You can try enabling stream management again once you update to Openfire 4.3.0.

Hey,
No, it is not Openfire.

I’m not ruling out that it might be a problem with the server (similar to the Openfire issue).
For us the important part is that it works now, as it is a production issue.
We are entering the test phase now.

thanks