Openfire 4.8.1 S2S connection only goes outbound

Hi,

I’m trying to set up a basic Openfire server setup with 3 different instances:

  • xmpp.server1.com (xmpp domain: server1.com): Openfire 4.8.1 server connected to an XMPP client running on 192.168.75.128
  • xmpp.server2.com (xmpp domain: server2.com): Openfire 4.8.1 server connected to an XMPP client running on 192.168.75.129
  • gateway.com (xmpp domain: gateway.com): Openfire 4.8.1 gateway server to connect server1 and server2 running on 192.168.75.1

I’ve tested several S2S scenarios, which all worked perfectly fine. Each server has its own self-signed certificate, which is trusted by the servers that contact it directly, and the servers allow self-signed certificates in the S2S configuration.

  • Direct connection from server1 to server2
  • Direct connection from server1 to gateway
  • Direct connection from server2 to gateway

Now I’m trying to set up the servers to use the XMPP gateway functionality based on the official Openfire documentation.

I’ve configured the gateway server as follows:

xmpp.gateway.enabled=true
xmpp.gateway.domains=server1.com,server2.com

along with a dnsOverride that points each XMPP domain to the actual hostname of its server:

- dnsutil.dnsOverride={server1.com,xmpp.server1.com:5269},{server2.com,xmpp.server2.com:5269}

The regular servers have a dns override configured to point to the gateway instead:

- dnsutil.dnsOverride={server2.com,gateway.com:5269}
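To double-check which host each server will actually contact for a given XMPP domain, the override values above can be parsed with a small script. This is only a sketch of the `{domain,host:port}` format as it appears in this post, not Openfire's own parser; `parse_dns_override` and `next_hop` are hypothetical helper names, and entries without an explicit port are not handled.

```python
import re

def parse_dns_override(value):
    """Parse '{domain,host:port},{domain,host:port}' into {domain: (host, port)}."""
    pattern = r"\{\s*([^,{}\s]+)\s*,\s*([^:{}\s]+)\s*:\s*(\d+)\s*\}"
    overrides = {}
    for domain, host, port in re.findall(pattern, value):
        overrides[domain] = (host, int(port))
    return overrides

def next_hop(overrides, domain):
    # Hypothetical helper: returns the override target, or None when no
    # override exists and a regular DNS lookup would be needed instead.
    return overrides.get(domain)

# Property values copied from the configuration above.
server1 = parse_dns_override("{server2.com,gateway.com:5269}")
gateway = parse_dns_override(
    "{server1.com,xmpp.server1.com:5269},{server2.com,xmpp.server2.com:5269}"
)

print(next_hop(server1, "server2.com"))  # ('gateway.com', 5269)
print(next_hop(gateway, "server2.com"))  # ('xmpp.server2.com', 5269)
```

If a domain comes back as `None` on a server, that server has no override for it and would fall back to a normal lookup, which matches the symptom of server1.com attempting a direct DNS SRV lookup.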

According to the Openfire logs in the S2S connection test, everything seems to work. However, the S2S connection socket is only outgoing from server1 to server2. When investigating the logs, I see that server1.com is attempting a DNS SRV lookup directly for server2.com, which fails.

Any assistance or ideas are greatly appreciated.

Here are my openfire logs:
openfire-log.txt (22.1 KB)

Here is my S2S connection test log:
xmpp-test.txt (9.4 KB)

Gateway/trunking is a difficult thing to tackle, as the logs are often quite contradictory. I’m not immediately spotting the source of the problem that you’re having.

One thing that might cause an issue is authentication of server-to-server connections.

In a gateway setup, the gateway server often offers certificates that are valid for its own domain, but not for the domains that are ‘behind’ it. Thus, when a server connects to the gateway thinking it is connecting to one of the servers ‘behind’ it, it fails TLS authentication when the certificate that is offered by the gateway does not cover the name of the server that’s ‘behind it’.

Your logs suggest that TLS is used for authentication, then fails. They also suggest that Dialback is then going to be used. And although I see some indication that it started, I think I do not see it complete (although the aforementioned contradictory logs might simply make this hard to correctly identify):

2024.05.22 10:12:47.584 DEBUG [nioEventLoopGroup-10-1]: org.jivesoftware.openfire.nio.NettyOutboundConnectionHandler - TLS negotiation with 'server2.com' was successful, but peer's certificates are not valid for its domain.
2024.05.22 10:12:47.584 WARN [nioEventLoopGroup-10-1]: org.jivesoftware.openfire.nio.NettyOutboundConnectionHandler - As peer's certificates are not valid for its domain ('server2.com'), the SASL EXTERNAL authentication mechanism cannot be used. The Server Dialback authentication mechanism is available.

Since Openfire 4.8.0, the ‘fallback’ authentication mechanism called “Server Dialback” is no longer attempted when the original authentication mechanism (based on TLS) was attempted and found a certificate on the other server that it explicitly rejected. This behavior can be controlled in the admin console, under Server > Server Settings > Server to Server. Make sure that the checkbox in the “Mutual Authentication” section is not checked.

If switching off the above-mentioned setting does not work, I would suggest making two changes to your test setup, to make analysis easier (you can revert those changes after the issue has been identified):

  • Make your XMPP domain be equal to the hostname of the server that’s running Openfire. This takes away lookups, which takes away something that muddies the water.
  • Make sure that the first authentication mechanism that is used for s2s succeeds. You can do this in one of two ways:
    – Ensure that the gateway offers a certificate that identifies not only its own XMPP domain, but also the domains of the servers that are ‘behind it’.
    – Disable TLS authentication (either disable TLS completely, or set ‘mutual authentication’ to ‘disabled’)

This should reduce the amount of noise in the logs and on the wire, which hopefully makes the problem easier to spot!

Hi Guus,

I managed to get a basic gateway working by completely disabling TLS. After that I tested a direct connection to the gateway by marking TLS as required and authentication as needed. This also worked in both directions.

However, the gateway functionality does not seem to validate the certificate for server2.com even though I import both the certificates of gateway.com and server2.com into the server1.com truststore.

CertificateManager: Subject Alternative Name Mapping returned [*.gateway.com, gateway.com]
TLS negotiation with 'server2.com' was successful, but peer's certificates are not valid for its domain.
As peer's certificates are not valid for its domain ('server2.com'), the SASL EXTERNAL authentication mechanism cannot be used. The Server Dialback authentication mechanism is available.

My test scenario is as follows:

  • I start an S2S connection test on server1.com to server2.com. This results in the following logs on server1:
    openfire-server1.txt (21.8 KB)
    I notice that the gateway server catches a timeoutexception while creating the outgoing connection to server2:
    openfire-gateway.txt (40.4 KB)

Using TLS-based authentication in context of gatewaying/trunking is notoriously hard.

I believe that the following is going on:

  1. server1.com is connecting to server2.com ‘through’ gateway.com. Thus, server1.com creates a direct connection to gateway.com.
  2. gateway.com presents its TLS certificate (from its identity store) to server1.com during TLS-based s2s authentication
  3. server1.com thinks it’s connecting to server2.com (but it’s really connecting to gateway.com)
  4. server1.com gets a certificate from gateway.com that includes only the identity of gateway.com
  5. as server1.com expected to be connecting to server2.com, but gets a certificate that does not have an identity for server2.com, it rejects the authentication.

For TLS authentication to work, the identity store of the gateway must contain a private key/certificate combination that identifies the gateway as gateway.com and also as server2.com (and, for the reverse communication path to be possible, also as server1.com).

To make things more complex: I do not believe that you can have more than one certificate in an identity store. Openfire will use just one of the certificates in the store, ignoring all others. All identities therefore need to be covered by a single certificate, which can be done by including multiple Subject Alternative Names in the identity certificate offered by gateway.com.
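As a rough way to see which identities are still missing, the Subject Alternative Names reported in the log above can be compared against the domains the gateway has to represent. This is a simplified sketch: `san_matches` and `uncovered` are hypothetical helpers, the wildcard rule assumed here covers a single leftmost label only, and Openfire's real certificate check may differ (for example, by also considering XMPP-specific name types).

```python
def san_matches(pattern, domain):
    """Return True if one SAN entry covers the domain (single-label wildcard only)."""
    pattern, domain = pattern.lower(), domain.lower()
    if pattern.startswith("*."):
        suffix = pattern[1:]              # e.g. '.gateway.com'
        head = domain[: -len(suffix)]     # part in front of the suffix
        # The wildcard must stand for exactly one non-empty label.
        return domain.endswith(suffix) and head != "" and "." not in head
    return pattern == domain

def uncovered(sans, expected_domains):
    """List the expected domains that no SAN entry covers."""
    return [d for d in expected_domains
            if not any(san_matches(s, d) for s in sans)]

# SAN list as reported by CertificateManager in the log above.
sans_from_log = ["*.gateway.com", "gateway.com"]
# Domains the gateway certificate would need to identify.
needed = ["gateway.com", "server1.com", "server2.com"]

print(uncovered(sans_from_log, needed))  # ['server1.com', 'server2.com']
```

With the certificate from the log, both server1.com and server2.com come back as uncovered, which is consistent with the "not valid for its domain" messages.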

This indeed seems to be the issue!

Is there some way where I can configure the gateway to pass all packets to server2 without performing any checks?

I’m not sure what ‘checks’ you want to avoid. A form of authentication is always needed, as far as I know.

The easiest way to configure things is to disable TLS-based authentication for server-to-server (set ‘Mutual Authentication’ to ‘disabled’) and rely on Server Dialback authentication. That does away with a lot of the TLS complexity.

Hi,

I tried configuring the gateway to disable mutual authentication and using Dialback instead. However the gateway reports that there is no authentication method available for the gateway to use. The StanzaHandler also is not able to process the ‘features’ tag, causing the session to close prematurely.

In a scenario where I try to connect server1 to server2 via the gateway, I see the following logs in the gateway:
openfire-gateway.txt (3.0 KB)

All servers use a certificate that is derived from the same root certificate. Server1 and Server2 are running Openfire 4.7.1 while the gateway is running 4.8.1. Could this cause incompatibility issues, given that the gateway reports that server2 is not offering dialback?

I’m not sure if using older versions of Openfire has an effect.

It is curious that the servers on each endpoint are not offering the Dialback authentication mechanism. Try setting the system property xmpp.server.dialback.enabled to true.

I’ve updated this property in all of the servers but this didn’t seem to resolve the issue:

gateway.com:

2024.05.28 14:50:51.383 DEBUG [nioEventLoopGroup-12-1]: org.jivesoftware.openfire.net.RespondingServerStanzaHandler - Check if both us as well as the remote server have enabled STARTTLS and/or dialback ...
2024.05.28 14:50:51.383 DEBUG [nioEventLoopGroup-12-1]: org.jivesoftware.openfire.net.RespondingServerStanzaHandler - Remote server is offering dialback: false, EXTERNAL SASL: false
2024.05.28 14:50:51.383 DEBUG [nioEventLoopGroup-12-1]: org.jivesoftware.openfire.net.RespondingServerStanzaHandler - No authentication mechanism available.

When checking the open sessions, I have a bidirectional session to the gateway and an outgoing connection to server2.

You’ll need to enable dialback at both endpoints. Two servers will only agree to use a particular authentication mechanism when they both support it. According to your logs, gateway.com found that its peer did not support dialback.
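The "both sides must support it" rule can be pictured as a simple set intersection. The sketch below is illustrative only and is not Openfire's implementation; `pick_mechanism`, the mechanism labels, and the preference order (SASL EXTERNAL before Dialback) are assumptions made for the example.

```python
def pick_mechanism(local, remote, preference=("EXTERNAL", "DIALBACK")):
    """Return the first preferred mechanism that both sides offer, else None."""
    common = set(local) & set(remote)
    for mechanism in preference:
        if mechanism in common:
            return mechanism
    return None

# Situation from the gateway log: the gateway has dialback enabled, but the
# remote server offers neither dialback nor SASL EXTERNAL.
print(pick_mechanism({"DIALBACK"}, set()))         # None

# Once the remote endpoint also offers dialback, a mechanism can be agreed on.
print(pick_mechanism({"DIALBACK"}, {"DIALBACK"}))  # DIALBACK
```

The first call mirrors the "No authentication mechanism available" line: enabling dialback on only one side leaves the intersection empty.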