Openfire 3.8 problems

I am wondering if anyone else might be having the same problems as me, or if I have an unrelated problem! Here is the problem…

I upgraded to 3.8 on Friday and ran into the LDAP / Groups problems that several folks have mentioned, so I installed the 3.8.1 nightly build this morning and got that back up and running fine. I can see all of my local users, etc. as normal now. The next problem is that we have another site running 3.7? that had been working fine until now. I can no longer add or message any users at the other site, and I never see that server in the sessions screen as I did before. The s2s configuration has not changed.

The reason I am unsure is that there were some strange problems over the weekend with the web filter at the remote site (and a firmware upgrade) that caused some other config problems, so I cannot say for sure that Openfire is the problem! Is anyone else experiencing any issues with s2s connections? Could this possibly be a version problem now between the two servers?

Thanks,

Drew Wood

Any S2S errors in the logs?

Yes, it appears from looking at this that it is trying to resolve the server name, but it has cut off the name. The servername (that I can ping, and telnet to on 5269) is chat.rvmc.local, but it is apparently trying to resolve only “local”? I have tried adding and removing the servername in the s2s config, and also tried changing it to “allow all servers to connect” as well with the same result.

Ideas??

2013.02.11 13:17:55 org.jivesoftware.openfire.session.LocalOutgoingServerSession - Error trying to connect to remote server: local(DNS lookup: local:5269)

java.net.UnknownHostException: local

at java.net.PlainSocketImpl.connect(Unknown Source)

at java.net.SocksSocketImpl.connect(Unknown Source)

at java.net.Socket.connect(Unknown Source)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.createOutgoingSession(LocalOutgoingServerSession.java:278)

at org.jivesoftware.openfire.session.LocalOutgoingServerSession.authenticateDomain(LocalOutgoingServerSession.java:208)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.sendPacket(OutgoingSessionPromise.java:261)

at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.run(OutgoingSessionPromise.java:238)

at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source)

at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)

at java.lang.Thread.run(Unknown Source)

To my knowledge, no S2S issues related to the 3.8.0 release have been reported by others.

When Openfire tries to resolve a remote domain through DNS and fails, it retries with a less specific domain name on each iteration. In your example, this is the order in which the names would be tried (until one allows for a successful connection, obviously):

  1. chat.rvmc.local
  2. rvmc.local
  3. local
    The last one is what you observed in your logs. It looks a bit odd, but it isn’t that uncommon.
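To make the retry order concrete, here is a minimal sketch of how such a candidate list could be built. The class and method names are made up for illustration; this is not Openfire's actual code, only the "drop the leftmost label each time" idea described above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not taken from Openfire: lists the names that would be
// tried in turn, dropping the leftmost (most specific) label on each step.
class DomainFallback {

    static List<String> candidates(String domain) {
        List<String> names = new ArrayList<>();
        String current = domain;
        while (!current.isEmpty()) {
            names.add(current);
            int dot = current.indexOf('.');
            if (dot < 0) {
                break; // only one label left, nothing more to strip
            }
            current = current.substring(dot + 1);
        }
        return names;
    }

    public static void main(String[] args) {
        // Prints chat.rvmc.local, rvmc.local and local, one per line.
        for (String name : candidates("chat.rvmc.local")) {
            System.out.println(name);
        }
    }
}
```

If every candidate fails, the last name attempted is the bare "local", which would explain why that is the one showing up in the UnknownHostException.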

Did you try restarting Openfire? When a Java process performs DNS lookups, the results often linger in a cache indefinitely. If your network did have hiccups, and Openfire did some lookups during that time, it might be suffering from stale results.
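For reference, the JVM-wide lookup cache is controlled by two standard security properties, networkaddress.cache.ttl and networkaddress.cache.negative.ttl. The sketch below shows how they can be set programmatically. The class name and the 60 / 10 second values are only assumptions for illustration, and I have not verified that this addresses the problem in this thread; a restart clears the cache either way.

```java
import java.security.Security;

// Illustrative only: caps how long the JVM caches DNS lookup results.
// The class name and the TTL values used below are assumptions.
class DnsCacheSettings {

    static void apply(int positiveTtlSeconds, int negativeTtlSeconds) {
        // How long successful lookups are cached (-1 means forever).
        Security.setProperty("networkaddress.cache.ttl",
                Integer.toString(positiveTtlSeconds));
        // How long failed lookups are cached.
        Security.setProperty("networkaddress.cache.negative.ttl",
                Integer.toString(negativeTtlSeconds));
    }

    public static void main(String[] args) {
        // The JVM reads these once, so set them before the first lookup.
        apply(60, 10);
        System.out.println(Security.getProperty("networkaddress.cache.ttl"));
        System.out.println(Security.getProperty("networkaddress.cache.negative.ttl"));
    }
}
```

The same properties can also be set in the JRE's java.security file instead of in code.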

I looked around a bit more and enabled dialback, and now I see this…

2013.02.11 14:17:23 org.jivesoftware.openfire.net.ServerTrustManager - Accepting self-signed certificate of remote server: [*.chat.rvmc.local]

2013.02.11 14:17:23 org.jivesoftware.openfire.server.ServerDialback - ServerDialback: OS - Ignoring unexpected answer in validation from: chat.rvmc.local id: 55a22d3c for domain: chat.esh.local answer:<stream:features xmlns:stream="http://etherx.jabber.org/streams"></stream:features>

The only references I can find to this particular error are from back in version 3.7, where apparently 3.7 broke this functionality. Is there something else I am missing?

Drew

Guus der Kinderen wrote:

When Openfire tries to resolve a remote domain through DNS (but fails) it will retry, using a less specific domain name each iteration. In your example, this would be the order in which resolving would take place (until one allows for successful connection, obviously):

  1. chat.rvmc.local
  2. rvmc.local
  3. local
    The last one is what you observed in your logs. It looks a bit odd, but it isn’t that uncommon.

To my knowledge, name resolution should work the other way around, cutting off one level of the domain name at a time, starting from the outermost:

  1. chat.rvmc.local
  2. chat.rvmc
  3. chat

I’m not following you. The most generic domain identifier is furthest to the right. When no connection can be made to “host.example.org”, a DNS SRV lookup at “example.org” might save the day. Resolving “host.example” would result in an obvious error.
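For anyone who wants to check what a DNS SRV lookup returns for a domain, here is a rough sketch using the JDK's built-in JNDI DNS provider. It queries the standard _xmpp-server._tcp name for server-to-server XMPP. The class and method names are made up, "example.org" is just the placeholder domain from above, and this is not how Openfire itself performs the lookup.

```java
import java.util.Hashtable;
import javax.naming.NamingEnumeration;
import javax.naming.NamingException;
import javax.naming.directory.Attribute;
import javax.naming.directory.Attributes;
import javax.naming.directory.DirContext;
import javax.naming.directory.InitialDirContext;

// Illustrative sketch, not Openfire code: looks up XMPP s2s SRV records.
class XmppSrvLookup {

    // SRV name used for XMPP server-to-server connections.
    static String srvName(String domain) {
        return "_xmpp-server._tcp." + domain;
    }

    // Turns "priority weight port target" into "target:port".
    static String hostAndPort(String srvRecord) {
        String[] fields = srvRecord.trim().split("\\s+");
        String target = fields[3];
        if (target.endsWith(".")) {
            target = target.substring(0, target.length() - 1); // drop root dot
        }
        return target + ":" + fields[2];
    }

    static void printTargets(String domain) throws NamingException {
        Hashtable<String, String> env = new Hashtable<>();
        env.put("java.naming.factory.initial", "com.sun.jndi.dns.DnsContextFactory");
        DirContext ctx = new InitialDirContext(env);
        try {
            Attributes attrs = ctx.getAttributes(srvName(domain), new String[] {"SRV"});
            Attribute srv = attrs.get("SRV");
            if (srv == null) {
                System.out.println("No SRV records for " + domain);
                return;
            }
            NamingEnumeration<?> values = srv.getAll();
            while (values.hasMore()) {
                System.out.println(hostAndPort(values.next().toString()));
            }
        } finally {
            ctx.close();
        }
    }

    public static void main(String[] args) {
        try {
            printTargets("example.org"); // placeholder domain
        } catch (NamingException e) {
            System.out.println("SRV lookup failed: " + e);
        }
    }
}
```

A domain without SRV records simply fails the lookup here, which is the point at which a server would fall back to connecting to the name directly on port 5269.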