Google S2S and SRV records

When Google tries to subscribe to a user@jabber.mydomain.com account, I get dialback errors. It looks like Wildfire is not using SRV records for the dialback.

I saw a few other (related?) issues in the forums but no solutions. Even though I use @jabber.domain.com, I added an SRV record pointing back at itself, but that had no effect. (And it shouldn’t be necessary according to the RFC, but I’m grasping here…)

The server can see the SRV records:

dis@jabber:logs $ host -t srv _jabber._tcp.gmail.com
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server1.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server2.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server3.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server4.l.google.com.
_jabber._tcp.gmail.com has SRV record 5 0 5269 xmpp-server.l.google.com.

It’s Wildfire 2.6.2, Linux x86, and java -version reports:

java version “1.5.0_06”

Java™ 2 Runtime Environment, Standard Edition (build 1.5.0_06-b05)

Java HotSpot™ Client VM (build 1.5.0_06-b05, mixed mode, sharing)

And here are the log entries when google requests the subscription:

==> debug.log <==

2006.05.02 08:49:56 Connect Socket[addr=/216.239.36.129,port=17813,localport=5269]

2006.05.02 08:49:56 RS - Received dialback key from host: gmail.com to: jabber.mydomain.com

2006.05.02 08:49:56 RS - Trying to connect to Authoritative Server: gmail.com:5269

==> error.log <==

2006.05.02 08:50:16 org.jivesoftware.wildfire.net.SocketReader.run(SocketReader.java:161) Connection closed before session established

Socket[addr=/216.239.36.129,port=17813,localport=5269]

==> warn.log <==
2006.05.02 08:50:16 Error verifying key of remote server: gmail.com
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:507)
at org.jivesoftware.wildfire.server.ServerDialback.verifyKey(ServerDialback.java:518)
at org.jivesoftware.wildfire.server.ServerDialback.validateRemoteDomain(ServerDialback.java:455)
at org.jivesoftware.wildfire.server.ServerDialback.createIncomingSession(ServerDialback.java:338)
at org.jivesoftware.wildfire.server.IncomingServerSession.createSession(IncomingServerSession.java:98)
at org.jivesoftware.wildfire.net.ServerSocketReader.createSession(ServerSocketReader.java:208)
at org.jivesoftware.wildfire.net.SocketReader.createSession(SocketReader.java:607)
at org.jivesoftware.wildfire.net.SocketReader.run(SocketReader.java:110)
at java.lang.Thread.run(Thread.java:595)

Going the other way (wildfire -> google) results in:

2006.05.02 08:56:02 OS - Going to try connecting using server dialback
2006.05.02 08:56:02 OS - Trying to connect to gmail.com:5269
2006.05.02 08:56:22 Error connecting to the remote server: gmail.com(DNS lookup: gmail.com:5269)
java.net.SocketTimeoutException: connect timed out
at java.net.PlainSocketImpl.socketConnect(Native Method)
at java.net.PlainSocketImpl.doConnect(PlainSocketImpl.java:333)
at java.net.PlainSocketImpl.connectToAddress(PlainSocketImpl.java:195)
at java.net.PlainSocketImpl.connect(PlainSocketImpl.java:182)
at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:366)
at java.net.Socket.connect(Socket.java:507)
at org.jivesoftware.wildfire.server.ServerDialback.createOutgoingSession(ServerDialback.java:149)
at org.jivesoftware.wildfire.server.OutgoingServerSession.createOutgoingSession(OutgoingServerSession.java:350)
at org.jivesoftware.wildfire.server.OutgoingServerSession.authenticateDomain(OutgoingServerSession.java:140)
at org.jivesoftware.wildfire.server.OutgoingSessionPromise.createSessionAndSendPacket(OutgoingSessionPromise.java:126)
at org.jivesoftware.wildfire.server.OutgoingSessionPromise.access$300(OutgoingSessionPromise.java:37)
at org.jivesoftware.wildfire.server.OutgoingSessionPromise$1$1.run(OutgoingSessionPromise.java:91)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
at java.lang.Thread.run(Thread.java:595)
2006.05.02 08:56:22 Error sending packet to remote server:
java.lang.Exception: Failed to create connection to remote server
at org.jivesoftware.wildfire.server.OutgoingSessionPromise.createSessionAndSendPacket(OutgoingSessionPromise.java:139)
at org.jivesoftware.wildfire.server.OutgoingSessionPromise.access$300(OutgoingSessionPromise.java:37)
at org.jivesoftware.wildfire.server.OutgoingSessionPromise$1$1.run(OutgoingSessionPromise.java:91)
at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:650)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:675)
at java.lang.Thread.run(Thread.java:595)

Help!

Just an update, adding “216.239.37.125 gmail.com” to /etc/hosts and restarting wildfire worked. (Well… works for gmail only and might break other things)

So it looks like Wildfire doesn’t use SRV records. Any suggestions on fixing that?

Yes, I get the same debug output from our jive server as well, when adding googletalk contacts.

Hmmmmmmmmmm.

Wonder what is going on here? Is Jive completely ignoring DNS?

Cheers.

xmpp:jason@sjobeck.com

Hey Dis,

Some of the info being printed might be confusing. A message like “RS - Trying to connect to Authoritative Server: gmail.com:5269” does not mean that a socket was actually opened to gmail.com; gmail.com there is the XMPP domain, not necessarily the resolved host. I will enhance those messages to include DNS info so they show the actual host/port values.

However, something weird is going on here. When you get this message: “Error connecting to the remote server: gmail.com(DNS lookup: gmail.com:5269)” it means that the server is trying to connect to gmail.com when it should be xmpp-server1.l.google.com or any other of the valid addresses.

The server first runs a lookup for _xmpp-server._tcp.gmail.com; if that fails it retries with _jabber._tcp.gmail.com, and if that fails too it uses gmail.com. We are connecting to gmail.com from jivesoftware.com and also from my local server, and in both cases it is working fine. Is there any chance you could debug the server and see why the DNS lookup is failing? The class to debug would be org.jivesoftware.wildfire.net.DNSUtil, line 65.
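The lookup order described above can be sketched roughly as follows. This is only an illustration, not Wildfire's actual DNSUtil code: the class name, the HostPort holder and the srvLookup callback are all hypothetical. In DNS, the SRV owner names carry underscore-prefixed labels (_xmpp-server._tcp and _jabber._tcp), and that is the form used here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Sketch of the fallback order only; all names here are hypothetical. */
class SrvFallbackSketch {

    /** Hypothetical holder for the host and port that end up being used. */
    static final class HostPort {
        final String host;
        final int port;

        HostPort(String host, int port) {
            this.host = host;
            this.port = port;
        }

        @Override
        public String toString() {
            return host + ":" + port;
        }
    }

    /** SRV owner names to try, in order, for an XMPP domain. */
    static List<String> srvNames(String domain) {
        List<String> names = new ArrayList<>();
        names.add("_xmpp-server._tcp." + domain); // tried first
        names.add("_jabber._tcp." + domain);      // tried second
        return names;
    }

    /**
     * srvLookup is an assumed callback: it returns the SRV target for a name,
     * or null when the lookup fails or no record exists.
     */
    static HostPort resolve(String domain, int defaultPort,
                            Function<String, HostPort> srvLookup) {
        for (String name : srvNames(domain)) {
            HostPort hit = srvLookup.apply(name);
            if (hit != null) {
                return hit;
            }
        }
        // Both SRV lookups failed: fall back to the bare domain.
        return new HostPort(domain, defaultPort);
    }
}
```

With a callback that always returns null, resolve("gmail.com", 5269, ...) ends up at gmail.com:5269, which is the value shown in the “DNS lookup: gmail.com:5269” log message when both SRV lookups fail.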

BTW, are you executing “host -t srv _jabber._tcp.gmail.com” from the same machine that is running Wildfire?

Regards,

– Gato

I tried that and it worked fine. (I’m running maradns on the gateway, so I initially suspected it was screwing up the SRV records. But no such “luck”.)

dis@floyd:~$ host -t SRV _xmpp-server._tcp.gmail.com
_xmpp-server._tcp.gmail.com has SRV record 20 0 5269 xmpp-server4.l.google.com.
_xmpp-server._tcp.gmail.com has SRV record 5 0 5269 xmpp-server.l.google.com.
_xmpp-server._tcp.gmail.com has SRV record 20 0 5269 xmpp-server1.l.google.com.
_xmpp-server._tcp.gmail.com has SRV record 20 0 5269 xmpp-server2.l.google.com.
_xmpp-server._tcp.gmail.com has SRV record 20 0 5269 xmpp-server3.l.google.com.

dis@floyd:~$ host -t SRV _jabber._tcp.gmail.com
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server3.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server4.l.google.com.
_jabber._tcp.gmail.com has SRV record 5 0 5269 xmpp-server.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server1.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server2.l.google.com.

Any other suggestions?

Hey guys, I have the same problem here. Running Wildfire 2.6.2 on FreeBSD.

In my log directory I got:

2006.05.23 14:22:27 org.jivesoftware.wildfire.server.OutgoingServerSession.createOutgoingSession(OutgoingServerSession.java:259) Error trying to connect to remote server: gmail.com(DNS lookup: gmail.com:5269)

When I run “host -t srv _jabber._tcp.gmail.com” I get the following:

starbuck# host -t srv _jabber._tcp.gmail.com
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server2.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server3.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server4.l.google.com.
_jabber._tcp.gmail.com has SRV record 5 0 5269 xmpp-server.l.google.com.
_jabber._tcp.gmail.com has SRV record 20 0 5269 xmpp-server1.l.google.com.

Anyone know how we can fix this? A lot of users are starting to use Gmail and it would be nice to chat with them!

P.S. I also noticed this in my warn log:

2006.05.23 14:37:30 Error verifying key of remote server: gmail.com

Message was edited by: travisbell

I have exactly the same issue. I wonder why there’s no bug report on that one yet, I’d say it’s a pretty huge problem…

I’ll tell you something that’s probably not good, ’cause it means we’ve got a legitimate bug… I restarted Wildfire and it started working again for me.

So the question remains, what caused it to stop responding to begin with?

I’m having the exact same problems on temp123.org. I have investigated this quite a bit, and here are my results:

o) wildfire 2.6.2 on my Debian SID(unstable) server; no-go, exhibits the exact problems described in this thread. Java2 SE 1.5.0-6 installed using make-jpkg.

o) wildfire 2.6.2 on a WindowsXP Professional system on the same network; works perfectly. Java installed by bundled Wildfire installer.

o) wildfire 2.6.2 on a Fedora Core 5 system on the same network; works perfectly. Java2 SE 1.5.0-6 installed by hand.

o) ejabberd 1.0.0 on the above-mentioned Debian SID server; works perfectly.

Conclusions:

o) Nothing wrong in my configuration or in the global environment, i.e. my DNS records are fine.

o) Possibly a bug in make-jpkg on Debian SID.

o) Possibly a bug in Java2 SE 1.5.0-6 that shows up on Debian SID but does not show up on Fedora Core 5.

o) Possibly a bug in Wildfire 2.6.2 that shows up on Debian SID but does not show up on Fedora Core 5.

I’d be very interested in how other people’s results fit into this framework. I think we can at least winnow this thing down to which component makes it show up.

Thanks!

I am running FreeBSD 5.4 with a personal compiled version of Java. Java version “1.5.0-p2” to be exact.

Ran into the issue again last night, but a restart fixed it again.

Cheers,

Hm… restarting definitely does not clear up the problems I’ve seen… they’re terminal on Debian SID. Still could be dodgy interaction with Java I suppose. I’m probably going to reimage my server this weekend with Sarge, and we’ll see if that makes a difference.

FYI, I’m having those issues on a very custom Gentoo box running Java 1.5.0_06.

I’ve now isolated that bug in Smack, too. It seems that the API used by jivesoftware to look up SRV records is simply broken on most systems and generates invalid DNS queries. The only possible solution would be to switch to another SRV-resolving library (I know there’s a BSD-licensed one for Java).

Hi Andy,

Could you provide a short code example? That will make it much easier to create a JM and Smack issue.

LG

Here’s the code I used. I got it from Matt when I complained that Smack doesn’t do SRV lookups for connecting to the server:

http://paste.lisp.org/display/20732

I compared Smack’s DNS queries to dig’s via tcpdump.

Smack: “[udp sum ok] 1+ ANY ANY? _xmpp-client._tcp.gmail.com. (45)”

Reply: “[udp sum ok] 1 Refused 0/0/0 (12)”

dig: “[udp sum ok] 46805+ SRV? _xmpp-client._tcp.gmail.com. (45)”

Reply: “46805 q: SRV? _xmpp-client._tcp.gmail.com. 5/0/1 _xmpp-client._tcp.gmail.com.[|domain]”

The following happens when I do an ANY lookup:

dig: “[udp sum ok] 6459+ ANY? _xmpp-client._tcp.gmail.com. (45)”

Reply: “6459 q: ANY? _xmpp-client._tcp.gmail.com. 5/0/0 _xmpp-client._tcp.gmail.com.[|domain]”

anlumo,

Great work.

Have you reported back to Matt & Jive your findings so that Jive can resolve in v2.6.3?

Thank you.

Jason

“Have you reported back to Matt & Jive your findings so that Jive can resolve in v2.6.3?”

Yes, I’ve pointed Matt to this thread, and the tcpdump lines I pasted here were actually forwarded to him about a week ago. Jivesoftware just seems to be a bit reluctant to file bug reports based on user input.

I don’t know if they allow 3rd parties to open bug reports; if they do, I could do it myself.

The two ANY queries are not the same.

dig will do search queries in the IN class by default, since 99% of queries are there. Smack is doing the query with the ANY class. I can’t even get dig to let me do a query with the ANY class. It appears that some DNS servers don’t handle this query well.

So, there are a few solutions here. First, use a different DNS server (like BIND, which seems to handle it OK). PowerDNS looks like the one that has the problem, but I don’t know of any PowerDNS servers to check with. The other solution is to change the code so the query is more specific. Since we know we want SRV record types (and thus the IN class), it should just ask for that instead of doing a wildcard search.
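To make the difference between the two queries concrete, here is a minimal sketch that builds the raw DNS query bytes by hand. It is not what Smack or dig do internally; the class and method names are made up for illustration, and the numeric codes are the standard ones (SRV type 33, IN class 1, wildcard type and class 255). The owner name is assumed to be _xmpp-client._tcp.gmail.com with underscore-prefixed labels, which gives a 45-byte message and appears to match the “(45)” in the tcpdump lines.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;

/** Builds a bare DNS query message; illustration only. */
class DnsQuerySketch {
    static final int TYPE_SRV = 33;   // SRV record type
    static final int TYPE_ANY = 255;  // wildcard QTYPE
    static final int CLASS_IN = 1;    // Internet class
    static final int CLASS_ANY = 255; // wildcard QCLASS

    static byte[] buildQuery(int id, String name, int qtype, int qclass) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        // 12-byte header
        writeShort(out, id);
        writeShort(out, 0x0100); // flags: recursion desired
        writeShort(out, 1);      // one question
        writeShort(out, 0);      // no answer records
        writeShort(out, 0);      // no authority records
        writeShort(out, 0);      // no additional records
        // Question name: length-prefixed labels, ended by a zero byte
        for (String label : name.split("\\.")) {
            if (label.isEmpty()) {
                continue;
            }
            byte[] bytes = label.getBytes(StandardCharsets.US_ASCII);
            out.write(bytes.length);
            out.write(bytes, 0, bytes.length);
        }
        out.write(0);
        // The only fields that differ between the two queries
        writeShort(out, qtype);
        writeShort(out, qclass);
        return out.toByteArray();
    }

    private static void writeShort(ByteArrayOutputStream out, int value) {
        out.write((value >> 8) & 0xFF);
        out.write(value & 0xFF);
    }
}
```

Calling buildQuery once with TYPE_SRV and CLASS_IN and once with TYPE_ANY and CLASS_ANY gives two messages that are identical except for the last four bytes, so a server that refuses the wildcard class is reacting to those bytes alone.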

Gato said he would make the change, so that will fix this end of things, but if it can be verified a bug report should go to PowerDNS.

PS: since dig won’t do it, you can use nslookup to generate such a query:

nslookup
set class=ANY
set type=ANY
_xmpp-server._tcp.gmail.com

Hey guys,

Both Wildfire and Smack were modified to search for records of type SRV and class IN. This should work with all DNS servers, so the problem should be fixed now. The bug fixes will be available with the next nightly build.
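For anyone who wants to try a type-SRV lookup from Java themselves, one option is the JNDI DNS provider that ships with the JDK. The sketch below is an assumption about how such a lookup can be written, not the code that went into Wildfire or Smack; the class name and the Srv holder are made up. Passing the attribute ID "SRV" asks only for SRV records, and as far as I know the provider then uses the IN class by default. Picking the record with the lowest priority number and ignoring weights is a simplification.

```java
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

/** SRV lookup via the JDK's JNDI DNS provider; illustration only. */
class SrvLookupSketch {

    /** Parsed form of a value such as "5 0 5269 xmpp-server.l.google.com." */
    static final class Srv {
        final int priority;
        final int weight;
        final int port;
        final String target;

        Srv(int priority, int weight, int port, String target) {
            this.priority = priority;
            this.weight = weight;
            this.port = port;
            this.target = target;
        }
    }

    /** Splits "priority weight port target" and drops the trailing dot. */
    static Srv parse(String record) {
        String[] f = record.trim().split("\\s+");
        if (f.length != 4) {
            throw new IllegalArgumentException("Not an SRV value: " + record);
        }
        String target = f[3].endsWith(".") ? f[3].substring(0, f[3].length() - 1) : f[3];
        return new Srv(Integer.parseInt(f[0]), Integer.parseInt(f[1]),
                Integer.parseInt(f[2]), target);
    }

    /** Needs network access. Returns the lowest-priority-number record, or null. */
    static Srv lookup(String srvName) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        try {
            // Ask explicitly for SRV instead of requesting every attribute.
            Attributes attrs = ctx.getAttributes(srvName, new String[] {"SRV"});
            Attribute srv = attrs.get("SRV");
            if (srv == null) {
                return null;
            }
            Srv best = null;
            for (int i = 0; i < srv.size(); i++) {
                Srv candidate = parse((String) srv.get(i));
                if (best == null || candidate.priority < best.priority) {
                    best = candidate;
                }
            }
            return best;
        } finally {
            ctx.close();
        }
    }
}
```

A call such as SrvLookupSketch.lookup("_xmpp-server._tcp.gmail.com") would go through the system resolver, so the result depends on the local DNS setup.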

BTW, people using pdns_recursor 3.1 may want to check out this thread:

http://mailman.powerdns.com/pipermail/pdns-users/2006-June/003584.html

Regards,

– Gato