Openfire slow startup 4.6.2 as a service

Hello, we are running Openfire 4.6.2 as a service on Windows Server 2012 R2.

After a reboot, it takes about 5-6 minutes to start up. During this time we cannot access the console or log in to Spark. Is there a way to speed up the process?

My guess is that either the database loading process is very slow, or a DNS lookup or network connection to some resource is blocking until it times out. Are you able to review the logs during startup, or watch the database to see which queries it is running?

Now that you mention DNS, we do have this error on the dashboard. Could it be related?

There appear to be no DNS SRV records at all for this XMPP domain. With the current configuration of Openfire, it is recommended that DNS SRV records are created for this server.

Alternatively, where could we investigate the timeout issue, if that is the case?
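One quick way to check the DNS theory is to time a name lookup from the Openfire host itself. The snippet below is only a rough sketch using the standard `java.net.InetAddress` API; the class name `DnsLookupTimer` is made up for illustration and is not part of Openfire. It only measures a plain forward lookup, not SRV records, so a fast result does not rule out every DNS problem.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

/** Hypothetical helper (not part of Openfire): times one forward DNS lookup. */
final class DnsLookupTimer {

    /** Outcome of a lookup: whether it resolved and how long it took. */
    static final class Result {
        final boolean resolved;
        final long elapsedMillis;

        Result(boolean resolved, long elapsedMillis) {
            this.resolved = resolved;
            this.elapsedMillis = elapsedMillis;
        }
    }

    /** Resolves the host and reports the elapsed time, even when resolution fails. */
    static Result time(String host) {
        long start = System.nanoTime();
        boolean resolved;
        try {
            InetAddress.getByName(host);
            resolved = true;
        } catch (UnknownHostException e) {
            // A failed lookup is still interesting: a long elapsed time here
            // suggests the resolver is waiting on a timeout.
            resolved = false;
        }
        long elapsedMillis = (System.nanoTime() - start) / 1_000_000;
        return new Result(resolved, elapsedMillis);
    }
}
```

Calling `DnsLookupTimer.time(...)` with the XMPP domain, the database host name, and the server's own host name should show whether any of them takes seconds rather than milliseconds to resolve.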

We had “XMPP Domain Name” on the OF console show an error until we configured it correctly in the properties (xmpp.domain), but that did not solve our slow start. We observe that on startup it hangs on this line for quite a while:

03:35:35.239 [main] INFO org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl - Multi User Chat domain: conference.[REDACTED DOMAIN].

I found the code that logs this line in org.jivesoftware.openfire.muc.spi.MultiUserChatServiceImpl.java (line 544 or thereabouts; search for “startup.starting.muc” in the Openfire code base, it appears only once outside of resource files):

    Log.info(LocaleUtils.getLocalizedString("startup.starting.muc", Collections.singletonList(getServiceDomain())));

    // Load all the persistent rooms to memory
    for (final LocalMUCRoom room : MUCPersistenceManager.loadRoomsFromDB(this, this.getCleanupDate(), router)) {
        localMUCRoomManager.addRoom(room.getName().toLowerCase(),room);

        // Start FMUC, if desired.
        room.getFmucHandler().applyConfigurationChanges();
    }

This particular code seems to load all the MUC rooms into memory right at startup. In our case we have 48K rooms, and I believe this is what slows down our startup, which can easily take about 10 minutes.

I should also mention that we have a cluster setup with 3 nodes, and we are on version 4.5.4.

Could this be the problem? And is it possible to defer loading the MUC rooms until they are actually used (on demand), or to add a property that defers this until after startup (for example, in a separate thread)?
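To illustrate what "defer to a separate thread" could look like, here is a minimal sketch in plain Java. It is not Openfire code and does not use any Openfire API: `DeferredRoomLoader` and the `Supplier` passed to it are hypothetical stand-ins for whatever actually loads the rooms from the database. The idea is only that startup continues while the slow load runs in the background, and anything that needs the rooms waits for it to finish.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Supplier;

/** Hypothetical sketch (not Openfire code): runs a slow room load off the startup thread. */
final class DeferredRoomLoader {

    private final CompletableFuture<List<String>> rooms;

    /** Starts the slow load in the background; the constructor returns right away. */
    DeferredRoomLoader(Supplier<List<String>> slowLoader) {
        // supplyAsync runs the supplier on a background thread (the common pool by default).
        this.rooms = CompletableFuture.supplyAsync(slowLoader);
    }

    /** True once the background load has completed. */
    boolean isLoaded() {
        return rooms.isDone();
    }

    /** Blocks until the rooms are available, then returns them. */
    List<String> awaitRooms() {
        return rooms.join();
    }
}
```

The obvious catch is that any request for a room that arrives before the load completes has to block or be rejected, and in a cluster the other nodes would need to cope with rooms appearing late, so this is a starting point for discussion rather than a drop-in change.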

Thank you.
DT.

BTW, I tested with 4.6.2 on a standalone r5.2xl instance with Aurora RDS and it still takes 15 minutes to start, still waiting at the step that loads the MUC rooms into memory. I also increased the minimum DB connections in openfire.xml and it still took 14 minutes (essentially the same time). I will keep digging, but if you have any ideas on how to move this out of the startup path or defer it, that would help.
DT.

Any recommendations?