Case-sensitive Roster issue, again?

So, in the past, the issue of case-sensitivity in XMPP areas, especially the roster, has come up:

SMACK-30

SMACK-22

http://www.jivesoftware.org/forums/thread.jspa?threadID=13128

http://www.jivesoftware.org/forums/thread.jspa?threadID=12929

I don't think it's been fixed. Check out this code:

    final String fullJID = "someUser@mymachine.com";
    roster.createEntry(fullJID, fullJID, null);

    // Look up presence using the JID exactly as it was added
    Presence presence = roster.getPresence(fullJID);
    if (presence == null) {
        System.err.println("Presence is null.");
    }

    // Look up presence again using the lowercased JID
    presence = roster.getPresence(fullJID.toLowerCase());
    if (presence == null) {
        System.err.println("Lowercase presence is null.");
    }

Can you guess what the output should be? The intuitive answer would be that the first check should pass and the second should fail. Actually, the output is:

    Presence is null.

Weird, huh? I think the following prior posting is the most relevant to the issue:

http://www.jivesoftware.org/forums/thread.jspa?threadID=13128

I can write a JUnit test case for you guys, if you tell me the details, so this issue doesn't come up again. That's if it actually is an issue and I'm not just plain wrong.

Thanks, and keep up the great work!

Smack 2.0.0

Jive Messenger 2.2.0

Looking at the source (http://tinyurl.com/bu9e6) and then where it's actually used (http://tinyurl.com/d8yoo), you'll see that this is expected behavior.

However, I can't find any statement in the spec (http://www.faqs.org/rfcs/rfc3920.html) that this is incorrect behavior. They don't discuss equality, at least not that I could find. I would wager that since JIDs look like email addresses, most people would expect them to act like email addresses and be case insensitive.

Noah

Message was edited by:

noahcampbell - long URLs don't work in the forum, I guess.

Thanks, Noah. After doing some research (online and in O'Reilly's “Programming Jabber”), it appears Jabber usernames are case-insensitive. The example given in the O'Reilly book is that if a user created an account with the name 'dj', then another user can't come along and create an account called 'DJ'. Also, server-side operations are case-insensitive: if the user 'dj' decided to log in as 'DJ', then the user should be allowed and will be known as 'DJ' for the remainder of the login.

How this applies to my original posting is that if an account named 'someUser' was created, then you should be able to check the presence of 'someUser' by checking the presence of 'someuser' (note the lowercase 'u') AND 'someUser'. Just like sending an email. The latter doesn't work in Smack.

It's very counter-intuitive for a developer to have a 'static final String' designating the user name of some client, but not be able to use that constant to check that client's presence.
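As a stopgap on the caller's side, the constant can be normalized once before it is handed to the roster. The sketch below is only an illustration: `JidCase` and `normalize` are hypothetical names, not part of Smack, and plain lowercasing is assumed to be a good-enough stand-in for the server's normalization when the node and domain are ASCII. The resource part is left alone because it is case-sensitive.

```java
import java.util.Locale;

/** Hypothetical helper for illustration; not part of Smack. */
final class JidCase {
    private JidCase() {}

    /**
     * Lowercases the node@domain part of a JID and keeps the resource as is.
     * Assumption: lowercasing approximates server-side normalization for ASCII JIDs.
     */
    static String normalize(String jid) {
        int slash = jid.indexOf('/');
        String bare = slash < 0 ? jid : jid.substring(0, slash);
        String resource = slash < 0 ? "" : jid.substring(slash);
        // Locale.ROOT avoids locale-specific surprises such as the Turkish dotless i
        return bare.toLowerCase(Locale.ROOT) + resource;
    }

    public static void main(String[] args) {
        System.out.println(normalize("someUser@mymachine.com"));        // someuser@mymachine.com
        System.out.println(normalize("someUser@mymachine.com/Smack"));  // someuser@mymachine.com/Smack
    }
}
```

With a helper like this, the constant from the original snippet would be passed through `normalize` before being used for both the roster entry and the presence lookup.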

Hey Karl,

JIDs in XMPP are not case insensitive. Each part of a JID must conform to a set of rules as defined in the spec (http://www.xmpp.org/specs/rfc3920.html#addressing). Basically, usernames must be in lowercase and must not contain any white space.

So when you add a contact to your roster whose JID is invalid, the server may not validate the contact's JID but be sure that the intended target user will not get an authorization request from you. Therefore, Smack will never get a presence of the other user.

Regards,

– Gato

Hey Karl,

JIDs in XMPP are not case insensitive.

Yes, I understand. The resource identifier is the perfect example. Smack itself uses "Smack" as the default resource.

Each part of a JID must conform to a set of rules as defined in the spec. Basically, usernames must be in lowercase and must not contain any white space.

This is the part I think I'm missing: where does it say usernames must be in lowercase? (I'm honestly not trying to sound pretentious; I'm genuinely curious.) Is it RFC3454 (“stringprep”)? Lower case usually makes sense in Latin-based alphabets, but loses its meaning as you move into more complex languages.

The Jive server automatically lowercases usernames when you create the account. iain's post a while back makes sense to me (http://www.jivesoftware.org/forums/thread.jspa?messageID=69981):

"Servers have the option of preserving an address or converting it to a 'stringprep' version (for most latin languages this means essentially lowercase, for asian languages it can be a bit more complex). Clients can't and shouldn't rely on the addresses to maintain their case (just as domain names are not case sensitive).

Messenger plays it safe and always sends the stringprep'd version. I can understand the desire to preserve case and will add it to the wish list. I think some client developers may find it better that the addresses are stringprep'd so that they can do a simple binary comparison of addresses to check for equality without having to worry about stringprep'ing (it can be an intensive operation). As a client developer though, you need to be prepared for stringprep'd addresses since messages pass from server to server, and there is no way to guarantee the addresses will maintain their case in transit."

So when you add a contact to your roster whose JID is invalid, the server may not validate the contact's JID but be sure that the intended target user will not get an authorization request from you. Therefore, Smack will never get a presence of the other user.

OK, so a username with capital letters is invalid. Then why can you send messages using Smack to recipients with capital letters (e.g., Packet.setTo("someUser@somemachine.com")) and have them received on the other end? Message and IQ packets work this way. Perhaps I should post to the Jive forum instead.

I'm just genuinely interested at this point as to why I can properly send Messages and IQs to “someUser”, but checking presence of “someUser” will fail.

Thanks ahead of time to anyone willing to reply.

This is the part I think I'm missing: where does it say usernames must be in lowercase? (I'm honestly not trying to sound pretentious; I'm genuinely curious.) Is it RFC3454 (“stringprep”)?

Yep, the stringprep profiles defined by the XMPP RFC are what apply. It's not really “lowercase”, it's what's called “case folding”. For some Unicode characters that means the same thing as “lowercase”. For others, it has no effect.
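To make the difference between lowercasing and case folding concrete, here is a small sketch using only the JDK. Under full Unicode case folding, "ß" (U+00DF) is treated as "ss", which plain lowercasing does not do. The `roughFold` helper is a hypothetical approximation (uppercase, then lowercase) chosen only because the JDK has no built-in case-folding call; it is not a Nodeprep implementation.

```java
import java.util.Locale;

/** Illustration only: contrasts lowercasing with an approximate case fold. */
final class FoldDemo {
    private FoldDemo() {}

    /** Rough stand-in for case folding: upper-case first, then lower-case. Not Nodeprep. */
    static String roughFold(String s) {
        return s.toUpperCase(Locale.ROOT).toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        String a = "Stra\u00dfe";           // "Strasse" spelled with U+00DF (sharp s)
        String b = "STRASSE";
        String cjk = "\u65e5\u672c\u8a9e";  // characters with no upper or lower case

        // Lowercasing alone keeps the sharp s, so the two strings differ
        System.out.println(a.toLowerCase(Locale.ROOT).equals(b.toLowerCase(Locale.ROOT))); // false

        // The rough fold maps both to "strasse"
        System.out.println(roughFold(a).equals(roughFold(b))); // true

        // Caseless characters pass through unchanged
        System.out.println(roughFold(cjk).equals(cjk)); // true
    }
}
```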

I'm just genuinely interested at this point as to why I can properly send Messages and IQs to “someUser”, but checking presence of “someUser” will fail.

Perhaps we should just apply Stringprep to presence checks? It sounds like that might solve the issue.

Regards,

Matt

Looking at this line:

A node identifier MUST be formatted such that the Nodeprep profile of [STRINGPREP] (Hoffman, P. and M. Blanchet, “Preparation of Internationalized Strings ("stringprep")”, December 2002) can be applied without failing.

I'd say that uppercase is allowed on the server as long as the profile won't fail when it is applied. This would mean that equality checks should always be done after both strings have been folded. Currently Smack just folds the entity when it's created and leaves it to the caller to fold the JID before passing it to Smack.

Maybe I'm missing something.

Noah
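The idea of folding both sides before comparing can be sketched with a small lookup table that normalizes the key on every write and every read. This is not Smack's actual implementation: `FoldedPresenceMap` is a hypothetical stand-in for the roster's presence table, presence is reduced to a plain string, and lowercasing is assumed as an ASCII-oriented substitute for Nodeprep.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stand-in for a roster's presence table; not Smack code. */
final class FoldedPresenceMap {
    private final Map<String, String> presenceByJid = new HashMap<>();

    /** Assumption: lowercasing approximates Nodeprep for ASCII bare JIDs. */
    private static String key(String bareJid) {
        return bareJid.toLowerCase(Locale.ROOT);
    }

    /** Stores presence under the folded form of the JID. */
    void put(String bareJid, String presence) {
        presenceByJid.put(key(bareJid), presence);
    }

    /** Folds the caller's JID the same way before looking it up. */
    String get(String bareJid) {
        return presenceByJid.get(key(bareJid));
    }

    public static void main(String[] args) {
        FoldedPresenceMap presences = new FoldedPresenceMap();
        // The server reports the folded address
        presences.put("someuser@mymachine.com", "available");

        System.out.println(presences.get("someUser@mymachine.com")); // available
        System.out.println(presences.get("someuser@mymachine.com")); // available
    }
}
```

Because the fold happens inside the table, callers can pass the JID in whatever case they hold it, which is the behavior the original snippet expected.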

Yes, I agree that string equality checks should be done after the strings have been folded / stringprep'ed. Smack (or Jive Messenger? who does the folding?) appears to fold node identifiers when sending Message and IQ packets, but not when dealing with the Roster. This would explain why I can send a Message or custom IQ packets to “someUser”, as we've been doing for months now in our software, while I can't get the Presence of “someUser”.

Applying stringprep to presence checks, as Matt has said, would fix the problem, I believe, and make Smack consistent throughout. I don't see a reason for Messages and IQs to be stringprep'd, but not presence checks.

If I'm wrong somewhere, someone please do correct me.

I agree, it should be applied everywhere.

If you read the requirement, it doesn't have to be applied everywhere as long as you can guarantee that if it were applied it wouldn't fail.

OK, I'm glad there's a general consensus as to what to do. Are there any plans to implement this in a future release?

I've filed this as SMACK-86.

Regards,

Matt