
Bug in roster when reconnecting: All/most roster entries are missing

Hi Flow and all, sometimes I lose the XMPP connection, either because the network disconnects or for a reason I don't understand, as in the log below:

SENT (0): </stream:stream>

Exception writing closing stream element

java.net.SocketException: sendto failed: EPIPE (Broken pipe)

at libcore.io.IoBridge.maybeThrowAfterSendto(IoBridge.java:499)

at libcore.io.IoBridge.sendto(IoBridge.java:468)

at java.net.PlainSocketImpl.write(PlainSocketImpl.java:507)

at java.net.PlainSocketImpl.access$100(PlainSocketImpl.java:46)

at java.net.PlainSocketImpl$PlainSocketOutputStream.write(PlainSocketImpl.java:269)

at java.io.OutputStreamWriter.flushBytes(OutputStreamWriter.java:167)

at java.io.OutputStreamWriter.flush(OutputStreamWriter.java:158)

at java.io.BufferedWriter.flush(BufferedWriter.java:124)

at org.jivesoftware.smack.util.ObservableWriter.flush(ObservableWriter.java:44)

at org.jivesoftware.smack.PacketWriter.writePackets(PacketWriter.java:190)

at org.jivesoftware.smack.PacketWriter.access$000(PacketWriter.java:40)

at org.jivesoftware.smack.PacketWriter$1.run(PacketWriter.java:77)

Caused by: libcore.io.ErrnoException: sendto failed: EPIPE (Broken pipe)

at libcore.io.Posix.sendtoBytes(Native Method)

at libcore.io.Posix.sendto(Posix.java:155)

at libcore.io.BlockGuardOs.sendto(BlockGuardOs.java:177)

at libcore.io.IoBridge.sendto(IoBridge.java:466)

But when the connection is re-established, it seems to clear the whole existing roster and causes RosterListener's entriesDeleted to be called. I'm using rc2-SNAPSHOT-2014-05-06. In the addUpdateEntry method of Roster.java, the roster entries (items) received after the reconnection are identical to the ones from before it, so they are added to neither Collection addedEntries nor Collection updatedEntries. As a result, in the addEntries method they all end up in List toDelete and are deleted.

What should I do to handle reconnection better? Also, EPIPE doesn't cause a disconnection very often, but how can I avoid it? (Please tell me if I should post more logs.) Thanks!

EPIPE doesn't cause a disconnection very often, but how can I avoid it?

A broken pipe (EPIPE) usually occurs when the network that carries the current connection goes down. You can't avoid that on Android, as it happens all the time due to WiFi <-> Mobile switches.

But when the connection is re-established, it seems to clear the whole existing roster and causes RosterListener's entriesDeleted to be called. I'm using rc2-SNAPSHOT-2014-05-06. In the addUpdateEntry method of Roster.java, the roster entries (items) received after the reconnection are identical to the ones from before it, so they are added to neither Collection addedEntries nor Collection updatedEntries. As a result, in the addEntries method they all end up in List toDelete and are deleted.

Roster should call Roster.reload() once the connection has been (re-)established. But most roster operations are asynchronous, i.e. it may take a while until the Roster instance for the connection represents the actual state of the roster.

I understand that we have a small language barrier here, but could you try to explain again what you think happens with the roster on reconnect and what exactly goes wrong?

Thanks Flow, please correct me if I'm making any mistakes. Assume that before the reconnection my Roster has 2 entries, oldA and oldB, which belong to no groups and both have subscription="both".

After the connection is re-established, Roster.reload is called and a new RosterResultListener is added to handle the response to the IQ get roster packet. Assume the IQ roster response includes 2 items, newA and newB. In RosterResultListener, we then call

nonemptyResult((RosterPacket) packet, addedEntries, updatedEntries, deletedEntries);

This method first checks whether each RosterPacket.Item (newA and newB) is valid, then calls

addEntries(addedEntries, updatedEntries, deletedEntries, version, validItems);

In this method, a RosterEntry is obtained for each of the validItems (newA and newB) and passed to

addUpdateEntry(addedEntries, updatedEntries, item, entry);

In this method we first get RosterEntry oldEntry = entries.put(item.getUser(), entry); and then compare oldEntry (oldA) with the entry parameter (newA). Since oldA and newA are equal according to RosterEntry.equalsDeep and the groups are unchanged, newA's getUser() is added to neither addedEntries nor updatedEntries.

The method then returns to

addEntries(addedEntries, updatedEntries, deletedEntries, version, validItems);

Neither addedEntries nor updatedEntries contains newA or newB. toDelete is initialized to include both newA and newB, and then calling

toDelete.removeAll(addedEntries);

toDelete.removeAll(updatedEntries);

will do nothing, since both addedEntries and updatedEntries are empty. Then

for (String user : toDelete) {

deleteEntry(deletedEntries, entries.get(user));

}

newA and newB are deleted from the Roster, and RosterListener's entriesDeleted is fired. So after the reconnection, we have an empty Roster.
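The sequence described above can be reproduced with plain collections. This is only a minimal sketch, not Smack's actual code: the class RosterDiffBugSketch and the method applyRosterResult are hypothetical, a roster entry is reduced to a user-to-subscription mapping, and it is assumed that toDelete starts out containing every user currently in the roster.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the diff logic described above; not Smack's actual code.
class RosterDiffBugSketch {

    // entries: current roster (user -> subscription); received: items of the roster result.
    // Returns the users that end up being deleted.
    static List<String> applyRosterResult(Map<String, String> entries,
                                          Map<String, String> received) {
        List<String> addedEntries = new ArrayList<>();
        List<String> updatedEntries = new ArrayList<>();
        List<String> deletedEntries = new ArrayList<>();

        for (Map.Entry<String, String> item : received.entrySet()) {
            String oldEntry = entries.put(item.getKey(), item.getValue());
            if (oldEntry == null) {
                addedEntries.add(item.getKey());
            } else if (!oldEntry.equals(item.getValue())) {
                updatedEntries.add(item.getKey());
            }
            // An identical entry is recorded in neither list.
        }

        // Assumption: toDelete starts with every user currently in the roster.
        List<String> toDelete = new ArrayList<>(entries.keySet());
        toDelete.removeAll(addedEntries);
        toDelete.removeAll(updatedEntries);
        for (String user : toDelete) {
            entries.remove(user);
            deletedEntries.add(user);
        }
        return deletedEntries;
    }

    public static void main(String[] args) {
        Map<String, String> roster = new LinkedHashMap<>();
        roster.put("a@example.org", "both");
        roster.put("b@example.org", "both");
        // The server sends back exactly the same items after the reconnect.
        Map<String, String> sameItems = new LinkedHashMap<>(roster);

        List<String> deleted = applyRosterResult(roster, sameItems);
        System.out.println("deleted=" + deleted + " roster=" + roster);
    }
}
```

With identical items on both sides, both users are reported as deleted and the roster map ends up empty, which matches the behaviour seen after a reconnect.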

I think in

addUpdateEntry(Collection addedEntries,

Collection updatedEntries, RosterPacket.Item item,

RosterEntry entry)

if (!oldEntry.equalsDeep(entry) || !item.getGroupNames().equals(oldItem.getGroupNames())) {

updatedEntries.add(item.getUser());

}

could we add a boolean to indicate the reconnection situation and add newA and newB to updatedEntries? I haven't thought this through deeply. Thanks!

Thanks for your detailed description. I prepared a commit that needs some testing. Instead of using a boolean I've added a new List 'unchangedEntries' to prevent those entries from ending up in 'toDelete': https://github.com/Flowdalic/Smack/commit/f1f7713513cf0662e56416407843c984f8f2b8b6
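The idea of an 'unchangedEntries' list can be sketched with plain collections. This is not the code from the linked commit, only an illustration of the approach as described: the class RosterDiffFixSketch and the method applyRosterResult are hypothetical, and a roster entry is reduced to a user-to-subscription mapping.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of the 'unchangedEntries' idea; not the actual commit.
class RosterDiffFixSketch {

    // entries: current roster (user -> subscription); received: items of the roster result.
    // Returns the users that end up being deleted.
    static List<String> applyRosterResult(Map<String, String> entries,
                                          Map<String, String> received) {
        List<String> addedEntries = new ArrayList<>();
        List<String> updatedEntries = new ArrayList<>();
        List<String> unchangedEntries = new ArrayList<>();
        List<String> deletedEntries = new ArrayList<>();

        for (Map.Entry<String, String> item : received.entrySet()) {
            String oldEntry = entries.put(item.getKey(), item.getValue());
            if (oldEntry == null) {
                addedEntries.add(item.getKey());
            } else if (!oldEntry.equals(item.getValue())) {
                updatedEntries.add(item.getKey());
            } else {
                // Identical entries are now tracked instead of being forgotten.
                unchangedEntries.add(item.getKey());
            }
        }

        // Only users that the server did not mention at all remain in toDelete.
        List<String> toDelete = new ArrayList<>(entries.keySet());
        toDelete.removeAll(addedEntries);
        toDelete.removeAll(updatedEntries);
        toDelete.removeAll(unchangedEntries);
        for (String user : toDelete) {
            entries.remove(user);
            deletedEntries.add(user);
        }
        return deletedEntries;
    }

    public static void main(String[] args) {
        Map<String, String> roster = new LinkedHashMap<>();
        roster.put("a@example.org", "both");
        roster.put("b@example.org", "both");
        roster.put("c@example.org", "both");

        Map<String, String> received = new LinkedHashMap<>();
        received.put("a@example.org", "both"); // unchanged
        received.put("b@example.org", "to");   // updated
        // c@example.org is missing from the result, so it should be deleted.

        List<String> deleted = applyRosterResult(roster, received);
        System.out.println("deleted=" + deleted + " roster=" + roster);
    }
}
```

In this sketch only the user missing from the server's result is deleted, while unchanged and updated users stay in the roster.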

Those changes are included in today's snapshot of Smack and aSmack. It would be greatly appreciated if you could test them and report back whether the issue is solved. Thank you.

Friendly reminder: feedback on whether the commit fixes the issue would be greatly appreciated.

Hi Flow, I have tested with the new 4.0.0-rc2-SNAPSHOT-2014-05-17 and my code works, thanks. Sorry for not responding in time; I was on vacation. Thanks very much again.

No problem. It was just the last issue that blocked the release of 4.0.0-rc2 which I want to happen as soon as possible.

Looking forward to rc2 even more now, and ready to update. Thank you very much, Flow.