I'm attempting to get S2S working reliably. I've got 2.3.0 (alpha) running on Linux at both ends. One server has a conventional public IP; the other is behind a NAT and uses port forwarding. There are full DNS entries for both boxes. The public IP of the NAT network is on the router, and the host itself is configured with the private IP 192.168.1.2.
When both servers are started, the system initially works. I can see MUC rooms on each server and communicate OK. The problem seems to crop up when the s2s connection goes down. The firewall between the two boxes is pretty notorious for terminating TCP connections violently and without warning. Once the s2s connection on port 5269 goes down, it does not seem to come back up.
First of all, the dialback operation seems to be failing (DNS names changed to protect the innocent):
2005.10.06 06:32:22 org.jivesoftware.messenger.server.ServerDialback.createOutgoingSession(ServerDialback.java:194) Error creating outgoing session to remote server: cs.spn.edu (DNS lookup: cs.spn.edu)
java.net.ConnectException: Connection timed out
This is failing because the server is dialing back to the wrong host: it should be trying to dial back to "xmpp.cs.spn.edu", but the first part of the FQDN is being removed. Any idea why this may be happening? It's strange, since as I say it initially works. It's only after some period of time, and I think after a violent death of the 5269 s2s connection, that this crops up.
I also see this on the box behind the NAT:
This makes me worried for two reasons. First, just as before, it looks like Jive lops off the first part of the FQDN. Second, it looks like the server is trying to establish a connection to itself based on the DNS name. Does this do an actual, guaranteed DNS lookup, or will it go through the nsswitch process? Since this host is behind a NAT, I've got an /etc/hosts entry that points to 192.168.1.2 for that name, while DNS points to the public IP.
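
For anyone wanting to check the two lookup paths on their own box, here is a rough standalone sketch, not Messenger's actual code. It assumes the JVM's `InetAddress` goes through the system resolver (which on Linux normally honors nsswitch and /etc/hosts), while the JNDI DNS provider queries the name servers directly. Whether Messenger uses either path for dialback is an assumption on my part. The class name `ResolveCheck` and the helper `srvName` are made up for illustration; `_xmpp-server._tcp` is the SRV name XMPP defines for server-to-server connections.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Hashtable;
import javax.naming.NamingException;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

class ResolveCheck {
    // Hypothetical helper: SRV owner name for XMPP s2s on a given domain.
    static String srvName(String domain) {
        return "_xmpp-server._tcp." + domain;
    }

    public static void main(String[] args) {
        // Placeholder names taken from the post; pass your own as arguments.
        String domain = args.length > 0 ? args[0] : "cs.spn.edu";
        String host = args.length > 1 ? args[1] : "xmpp.cs.spn.edu";

        // Path 1: system resolver (expected to see /etc/hosts entries).
        try {
            for (InetAddress a : InetAddress.getAllByName(host)) {
                System.out.println("system resolver: " + host + " -> " + a.getHostAddress());
            }
        } catch (UnknownHostException e) {
            System.out.println("system resolver: " + host + " not found");
        }

        // Path 2: direct DNS queries through the JDK's JNDI DNS provider,
        // using the name servers the JVM picks up from the OS configuration.
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        try {
            DirContext ctx = new InitialDirContext(env);
            Attributes srv = ctx.getAttributes(srvName(domain), new String[] {"SRV"});
            System.out.println("DNS SRV for " + domain + ": " + srv.get("SRV"));
            Attributes a = ctx.getAttributes(host, new String[] {"A"});
            System.out.println("DNS A for " + host + ": " + a.get("A"));
            ctx.close();
        } catch (NamingException e) {
            System.out.println("DNS query failed: " + e.getMessage());
        }
    }
}
```

If the first path prints 192.168.1.2 and the second prints the public IP, the /etc/hosts override is only visible to code that uses the system resolver. A missing SRV record in the second path would also be worth noting, since a server that finds no SRV entry may fall back to connecting to the bare domain.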