Problems after upgrade to 4.6.6

Hello,
I upgraded my instance first to 4.6.5 on 15 Dec, and then to 4.6.6 on 17 Dec.
Today I got reports that the system is not working. Users could not log in, and those logged in did not receive messages.
I restarted the service and everything was back to normal.
This is already the second time it has happened. The first was after updating to version 4.6.5. Before this upgrade, there was no such problem.
Unfortunately, the logs were overwritten after restarting the service. I only have error.log, info.log and warn.log.

Could I ask for help on where to look for the problem?

The most obvious place to look for issues would be the log files. Sadly, little other information that is useful for debugging persists anywhere.

It would also help if you could find a way to reproduce the problem.

Thanks.

When this problem occurs again, I will make a copy of the logs before restarting the service.
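A minimal sketch of how the logs could be saved before a restart, so that they are not overwritten. The function name `snapshot_logs` and the example paths are assumptions for illustration, not part of Openfire; adjust them to wherever your installation writes its logs.

```python
import shutil
import time
from pathlib import Path


def snapshot_logs(log_dir: Path, backup_root: Path) -> Path:
    """Copy every log file from log_dir into a new timestamped folder.

    Matches names such as error.log or error.log.1 and leaves other
    files alone. Returns the folder that received the copies.
    """
    stamp = time.strftime("%Y%m%d-%H%M%S")
    target = backup_root / f"openfire-logs-{stamp}"
    target.mkdir(parents=True, exist_ok=True)
    for log_file in sorted(log_dir.glob("*.log*")):
        if log_file.is_file():
            # copy2 keeps the modification time, which helps when
            # lining the files up with the time of the outage.
            shutil.copy2(log_file, target / log_file.name)
    return target


# Example call (both paths are hypothetical):
# snapshot_logs(Path("/opt/openfire/logs"), Path("/var/backups"))
```

Running this right before the restart keeps a copy of the state at the moment of the failure.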

Hello,

Since upgrading to 4.6.6, I have been facing disconnections with some XMPP clients such as Pidgin and Spark. This was not observed with Openfire 4.6.3 and earlier. Once it happens, it is impossible to reconnect with Pidgin or Spark unless Openfire is restarted.
As a result, I need to restart Openfire, which fixes it, but the issue has occurred 4 times in 2 days and I have no idea what the root cause could be.

When this happens, Openfire remains up and running, but Pidgin or Spark cannot connect to the server unless I restart Openfire. Could this be related to the xmpp_client SRV records I use to redirect port 5222 to the server port?
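One way to separate a DNS/SRV problem from a server-side problem would be to connect directly to the server's host and port, bypassing SRV lookup, and see whether it still answers an XMPP stream header. The following is only a sketch under that assumption; `stream_header` and `probe` are hypothetical helper names, and the host, port and domain are whatever applies to your setup.

```python
import socket


def stream_header(domain: str) -> bytes:
    """Build the opening XMPP stream header for a client connection."""
    return (
        "<?xml version='1.0'?>"
        f"<stream:stream to='{domain}' xmlns='jabber:client' "
        "xmlns:stream='http://etherx.jabber.org/streams' version='1.0'>"
    ).encode("utf-8")


def probe(host: str, port: int, domain: str, timeout: float = 5.0) -> bytes:
    """Connect over TCP, send a stream header, return the first reply bytes.

    Raises OSError (which includes timeouts) if the server does not
    accept the connection or does not answer within `timeout` seconds.
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(stream_header(domain))
        return sock.recv(4096)


# Example call (host and domain are placeholders):
# print(probe("192.0.2.10", 5222, "example.org"))
```

If the direct probe also hangs or times out while the problem is occurring, the SRV records are unlikely to be the cause.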

This behaviour has clearly appeared since 4.6.6. I ran many instances up to version 4.6.3 that never had such a problem. It looks similar to yours.

I hope this helps, but this issue is weird and annoying, as it happens randomly.

Thanks a lot in advance if you find something on that one :slight_smile:

Hi Claude. I think this sounds like a different problem from the one reported by @Chrees.

Your issue sounds more similar to https://discourse.igniterealtime.org/t/problem-with-openfire-4-6-4/ - No-one has been able to provide a way to reproduce this problem, which makes the problem hard to diagnose.

@guus

You are probably right. It looks very similar to that issue. However, it is very difficult to reproduce and to find a root cause. I have 2 client Docker instances with the same problem, as they both use Pidgin. It happens once a day, but so far it has been very difficult to isolate.
I am running a Pidgin client (the same happens with Spark); when it freezes, I know we have just hit this issue. I will try to collect logs at that moment, before restarting, and share them.

++ Thanks for your reply :slight_smile:

In my environment, Spark is mainly used; I use Miranda NG myself. The problem occurs for all clients.

Hi guus, I had the problem again last night, for one Docker instance only. After a restart, it was back to normal. I captured the Openfire logs before restarting my Docker instance. I have no clue what is causing this.
The problem occurred at 4:30 PM PST.
It happened on only one instance today. It looks random, and only the Pidgin and Spark clients freeze.
Let me know where I can send you the Openfire logs, if you can check them.

Thanks in advance for your help

The latest messages in error.log (I am unsure whether they are related) are:

2021.12.22 16:30:42 org.jivesoftware.openfire.spi.RoutingTableImpl - Primary packet routing failed
org.jivesoftware.openfire.PacketException: Cannot route packet of type IQ or Presence to bare JID: <iq type="error" id="107-3343" to="im.360bcgroup.com" from="ios13push.monal.im"><pubsub xmlns="http://jabber.org/protocol/pubsub"><publish node="D2F16C41-C45C-4F78-B309-24EB7381BC20"><item><notification xmlns="urn:xmpp:push:0"><x xmlns="jabber:x:data" type="form"><field var="FORM_TYPE" type="hidden"><value>urn:xmpp:push:summary</value></field><field var="message-count" type="text-single"><value>1</value></field><field var="last-message-sender" type="text-single"/><field var="last-message-body" type="text-single"><value>New Message</value></field></x></notification></item></publish><publish-options><x xmlns="jabber:x:data" type="submit"><field var="FORM_TYPE"><value>http://jabber.org/protocol/pubsub#publish-options</value></field><field var="secret"><value>bd115e680889e431a013a597dec670c5409478ce477058a6beac87849654550d</value></field></x></publish-options></pubsub><error code="404" type="cancel"><remote-server-not-found xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/></error></iq>
        at org.jivesoftware.openfire.spi.RoutingTableImpl.routeToLocalDomain(RoutingTableImpl.java:329) ~[xmppserver-4.6.6.jar:4.6.6]
        at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:262) [xmppserver-4.6.6.jar:4.6.6]
        at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.returnErrorToSender(OutgoingSessionPromise.java:346) [xmppserver-4.6.6.jar:4.6.6]
        at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.run(OutgoingSessionPromise.java:245) [xmppserver-4.6.6.jar:4.6.6]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]
2021.12.22 16:30:42 org.jivesoftware.openfire.interceptor.InterceptorManager - Error in interceptor: org.jivesoftware.openfire.plugin.ofmeet.BookmarkInterceptor@24f7003c while intercepting:
<iq type="error" id="864-3345" to="im.360bcgroup.com" from="ios13push.monal.im">
  <pubsub xmlns="http://jabber.org/protocol/pubsub">
    <publish node="D2F16C41-C45C-4F78-B309-24EB7381BC20">
      <item>
        <notification xmlns="urn:xmpp:push:0">
          <x xmlns="jabber:x:data" type="form">
            <field var="FORM_TYPE" type="hidden">
              <value>urn:xmpp:push:summary</value>
            </field>
            <field var="message-count" type="text-single">
              <value>1</value>
            </field>
            <field var="last-message-sender" type="text-single"/>
            <field var="last-message-body" type="text-single">
              <value>New Message</value>
            </field>
          </x>
        </notification>
      </item>
    </publish>
    <publish-options>
      <x xmlns="jabber:x:data" type="submit">
        <field var="FORM_TYPE">
          <value>http://jabber.org/protocol/pubsub#publish-options</value>
        </field>
        <field var="secret">
          <value>bd115e680889e431a013a597dec670c5409478ce477058a6beac87849654550d</value>
        </field>
      </x>
    </publish-options>
  </pubsub>
  <error code="404" type="cancel">
    <remote-server-not-found xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/>
  </error>
</iq>
java.lang.NullPointerException: null
2021.12.22 16:30:42 org.jivesoftware.openfire.spi.RoutingTableImpl - Primary packet routing failed
org.jivesoftware.openfire.PacketException: Cannot route packet of type IQ or Presence to bare JID: <iq type="error" id="864-3345" to="im.360bcgroup.com" from="ios13push.monal.im"><pubsub xmlns="http://jabber.org/protocol/pubsub"><publish node="D2F16C41-C45C-4F78-B309-24EB7381BC20"><item><notification xmlns="urn:xmpp:push:0"><x xmlns="jabber:x:data" type="form"><field var="FORM_TYPE" type="hidden"><value>urn:xmpp:push:summary</value></field><field var="message-count" type="text-single"><value>1</value></field><field var="last-message-sender" type="text-single"/><field var="last-message-body" type="text-single"><value>New Message</value></field></x></notification></item></publish><publish-options><x xmlns="jabber:x:data" type="submit"><field var="FORM_TYPE"><value>http://jabber.org/protocol/pubsub#publish-options</value></field><field var="secret"><value>bd115e680889e431a013a597dec670c5409478ce477058a6beac87849654550d</value></field></x></publish-options></pubsub><error code="404" type="cancel"><remote-server-not-found xmlns="urn:ietf:params:xml:ns:xmpp-stanzas"/></error></iq>
        at org.jivesoftware.openfire.spi.RoutingTableImpl.routeToLocalDomain(RoutingTableImpl.java:329) ~[xmppserver-4.6.6.jar:4.6.6]
        at org.jivesoftware.openfire.spi.RoutingTableImpl.routePacket(RoutingTableImpl.java:262) [xmppserver-4.6.6.jar:4.6.6]
        at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.returnErrorToSender(OutgoingSessionPromise.java:346) [xmppserver-4.6.6.jar:4.6.6]
        at org.jivesoftware.openfire.server.OutgoingSessionPromise$PacketsProcessor.run(OutgoingSessionPromise.java:232) [xmppserver-4.6.6.jar:4.6.6]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130) [?:?]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630) [?:?]
        at java.lang.Thread.run(Thread.java:832) [?:?]

Hi, my instance stopped working today. I copied the logs before restarting the server.
How can I send you these logs?

@guus I have rolled back 2 customer instances to 4.6.3 and there are no more connection problems. With 4.6.6, Pidgin and Spark disconnect and it is impossible to reconnect unless Openfire is restarted. I never observed this with 4.6.3; I hope this helps to understand the problem.

I am staying on 4.6.3 for now as a workaround.

@guus ??

Hi all,

I have been suffering from the same issues since I migrated to 4.6.6. I have tried the 4.7 beta, but no luck: same issues. There were no issues with 4.6.3.

Everything seems to work properly, but after X minutes all of our clients start appearing as: “Invalid session/connection”

I have no cluster, and I use different clients.

I tried disabling everything related to SSL, but no luck.

Errors that I can see in logs:

org.jivesoftware.openfire.nio.ConnectionHandler - Closing connection due to exception in session: (0x00001099: nio socket, server, null => 0.0.0.0/0.0.0.0:5222)
javax.net.ssl.SSLHandshakeException: SSL handshake failed.

org.jivesoftware.openfire.nio.ConnectionHandler - Closing connection due to exception in session: (0x00001996: nio socket, server, null => 0.0.0.0/0.0.0.0:5222)
javax.net.ssl.SSLHandshakeException: SSL handshake failed.

But these errors also occur when the system is working properly.
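To check whether such handshake errors cluster around the outages or are just background noise, the log could be bucketed by minute. This is a rough sketch that assumes each log entry starts with a `YYYY.MM.DD HH:MM:SS` timestamp, as in the excerpts in this thread; `count_per_minute` is a hypothetical helper name.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the leading "2022.01.04 11:26:04" timestamp and captures it
# down to the minute ("2022.01.04 11:26").
TIMESTAMP = re.compile(r"^(\d{4}\.\d{2}\.\d{2} \d{2}:\d{2}):\d{2}\b")


def count_per_minute(lines: Iterable[str], needle: str) -> Counter:
    """Count lines containing `needle`, grouped by minute.

    Exception lines carry no timestamp of their own, so each hit is
    attributed to the most recent timestamped line before it.
    """
    counts: Counter = Counter()
    current_minute = "unknown"
    for line in lines:
        match = TIMESTAMP.match(line)
        if match:
            current_minute = match.group(1)
        if needle in line:
            counts[current_minute] += 1
    return counts


# Example call (the file name is an assumption):
# with open("warn.log", encoding="utf-8", errors="replace") as fh:
#     for minute, n in sorted(count_per_minute(fh, "SSLHandshakeException").items()):
#         print(minute, n)
```

A roughly constant rate before and during an outage would suggest the handshake errors are unrelated to it.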

The only way to resolve it is to change something in the client connections configuration, just to trigger a restart of the ConnectionListener:

2022.01.04 11:26:04 INFO [Jetty-QTP-AdminConsole-6342]: org.jivesoftware.openfire.spi.ConnectionListener[socket_c2s] - Stopped.
2022.01.04 11:26:04 INFO [Jetty-QTP-AdminConsole-6342]: org.jivesoftware.openfire.spi.ConnectionListener[socket_c2s] - Started.
2022.01.04 11:26:04 INFO [Jetty-QTP-AdminConsole-6342]: org.jivesoftware.openfire.spi.ConnectionListener[socket_c2s] - Done restarting…

Once I do this, clients start reconnecting again for another X minutes.

Any ideas about what the problem could be?

Until a solution is found, is there any way to restart the ConnectionListener every X minutes?

Thanks,

@guus My old post seems to be gaining in importance. Note the link to Stack Overflow.

So the issue could have something to do with SSL/TLS in combination with mina…

Meanwhile, I have implemented a small workaround plugin which checks for invalid sessions and
restarts the connection listener for plain connections (5222, for client sessions).

Thanks @totzkotz for this workaround.

I have installed it and configured it as follows:

plugin.sessioncheck.enabled = true
plugin.sessioncheck.interva = 60
plugin.sessioncheck.mintorestart = 5

I will keep you informed about how it works, while waiting for a final solution.

Thanks again.

@totzkotz Could you confirm that it works with the Openfire 4.7 beta?
Thanks.

Nope… I did not test it on the 4.7 beta, but I can't see why it should not work… if you look at the source, it is very simple.

I am not a Java developer, but let me check it.

What I can see is that it tries to run:

2022.01.04 16:44:47 INFO [Timer-7]: org.igniterealtime.openfire.plugin.sessioncheck.SessionCheckPlugin - Checking Sessions…
2022.01.04 16:44:47 WARN [Timer-7]: org.igniterealtime.openfire.plugin.sessioncheck.SessionCheckPlugin - Found enough invalid sessions to restart Connection Listener!
2022.01.04 16:44:47 INFO [Timer-7]: org.igniterealtime.openfire.plugin.sessioncheck.SessionCheckPlugin - done…

But it is not working.

OK, it seems that it needs some more changes to restart the ConnectionListener.

I have checked and reviewed your code, and I realized that it is not the same as the compiled one.

Before doing anything else, it would be better to wait until you have checked the code.

If you provide a new one, I can check it again.

Thanks a lot.

Hi @guus, as you can see, I can reproduce this problem, and @totzkotz is working on a workaround.

It would be really interesting to know whether you have been able to detect where the main issue lies.

Thanks a lot.

---- As I am limited to 3 replies, I am editing this one to respond.

Sorry @totzkotz but I get an error:

Exception:
java.lang.NullPointerException
at org.jivesoftware.openfire.plugin.sessioncheck.sessioncheck_jsp._jspService(sessioncheck_jsp.java:146)
at org.apache.jasper.runtime.HttpJspBase.service(HttpJspBase.java:71)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.jivesoftware.openfire.container.PluginServlet.handleJSP(PluginServlet.java:432)
at org.jivesoftware.openfire.container.PluginServlet.service(PluginServlet.java:122)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1459)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
at org.eclipse.jetty.servlet.ServletHandler$ChainEnd.doFilter(ServletHandler.java:1626)
at org.jivesoftware.admin.PluginFilter.doFilter(PluginFilter.java:226)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at org.jivesoftware.admin.AuthCheckFilter.doFilter(AuthCheckFilter.java:234)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)
at com.opensymphony.sitemesh.webapp.SiteMeshFilter.obtainContent(SiteMeshFilter.java:129)
at com.opensymphony.sitemesh.webapp.SiteMeshFilter.doFilter(SiteMeshFilter.java:77)
at org.eclipse.jetty.servlet.FilterHolder.doFilter(FilterHolder.java:193)
at org.eclipse.jetty.servlet.ServletHandler$Chain.doFilter(ServletHandler.java:1601)

If you confirm that it works with 4.6.5, I can roll back to 4.6.6/7 and try it.

Thanks.