Openfire upgrade to 4.6 causes loss of group chat rooms

After upgrading from 4.5.0 to 4.6.0, the list of group chat rooms is empty.
All the chat rooms are still in the database.
After reverting to 4.5.0, the group chat rooms are displayed again.
I have tried 4.6.1, no difference.

Any suggestions as to what the problem might be?

Setup

OS Windows Server 2016 Datacenter
2 servers running as a cluster using the Hazelcast 2.5.0 plugin
DB is an MS SQL Server running as a cluster

Rooms that have not been used for more than 30 days are not loaded by default.
You can change that behavior in the chat service preferences and restart Openfire.

That was my first thought, but most of the rooms are active with multiple posts every day. I have also changed the value unload.empty_days to 60 so the less-used rooms stay active.

3 posts were split to a new topic: Can't edit rooms after 4.6.1 update

Hey Graham. When you say "the list of group chat rooms", are you referring to the list of chat rooms as displayed on the Openfire admin console, or to some other list of rooms somewhere?

Are there any errors in the Openfire log files that might hint at the cause for this?

Hi Guss,
Yes, they have disappeared from the admin console.

Which logs? I will probably need to sanitize them first.

Is there anything in particular I can look for in the logs?

Graham

Have a look at the directory in which you installed Openfire. In that directory, there's another directory named logs. There are a bunch of files in there, but we're only interested in the ones called all.log.

I'm not sure what to look for. If you can afford to, you might want to shut down Openfire, remove or back up all content in that logs directory, and start Openfire again. That will generate new log files that capture the start of the server (which I assume has a good chance of containing relevant log lines, as rooms should be loaded when the server starts up).

As I don't know what's causing this, it is hard to predict what you need to look for. I'm hoping that you'll know when you see it.
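If the fresh all.log is long, a rough first pass is to pull out only the lines that mention a problem. The following is a minimal Python sketch, not an Openfire tool: the function name find_problem_lines is made up for illustration, the path logs/all.log is an assumption about where the file lives relative to where you run it, and matching on the words ERROR and WARN is a guess at what is worth reading first.

```python
from pathlib import Path


def find_problem_lines(log_text, levels=("ERROR", "WARN")):
    """Return (line_number, line) pairs for lines that mention any given level."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(level in line for level in levels):
            hits.append((number, line))
    return hits


if __name__ == "__main__":
    # Assumed location: the logs directory inside the Openfire install directory.
    log_file = Path("logs/all.log")
    if log_file.exists():
        text = log_file.read_text(errors="replace")
        for number, line in find_problem_lines(text):
            print(f"{number}: {line}")
```

Anything it prints near the top of the file is from server start-up, which is where room loading and database upgrades happen.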

all.log (39.3 KB)

I haven't been able to do this on the production cluster, so I have built a new standalone server that uses the same MS SQL database. I installed 4.5.0 first and checked that the group chat rooms were listed; they were. I then updated it to 4.6.0 and the group chat rooms disappeared, so it is the same problem.

I have attached the all.log from the test server. The procedure I used was:

  1. Stop the service
  2. Delete the all.log file
  3. Start the service
  4. Log on to the server and check the group chat rooms (missing)
  5. Log off
  6. Stop the service
  7. Take a copy of the all.log file (attached)

Looking at the log file, there is an error right at the beginning about a database upgrade. Could this be the problem?
Also, as I'm running a 2 node Hazelcast cluster, will there be any issues with the database during updates when one server is running 4.5.0 and the other 4.6.0?

Graham


What in the world?! It seems that the comment block in an SQL update script is causing this.

Can you execute this query on your database please?

SELECT * FROM ofVersion WHERE name = 'openfire';

The script that seems to be causing your issue is this one: https://github.com/igniterealtime/Openfire/blob/master/distribution/src/database/upgrade/31/openfire_sqlserver.sql

If, and only if, the result of the query above is '30', you can try to execute that script manually on your database (the usual disclaimer of 'create backups first' applies). You should do this after Openfire has been shut down.
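The "if, and only if" check can be written down as a small guard. This is only an illustrative Python sketch: it uses the built-in sqlite3 module with an in-memory stand-in table instead of a real SQL Server connection, the function name safe_to_apply_upgrade is made up, and it assumes that ofVersion keeps the number in a column named version (check this against your own schema).

```python
import sqlite3

# The value the query must return before upgrade script 31 is run by hand.
EXPECTED_VERSION = 30


def safe_to_apply_upgrade(conn):
    """Return True only if the stored 'openfire' schema version is exactly 30."""
    # Assumption: the version number is stored in a column named 'version'.
    row = conn.execute(
        "SELECT version FROM ofVersion WHERE name = 'openfire'"
    ).fetchone()
    return row is not None and row[0] == EXPECTED_VERSION


# In-memory stand-in for the real database, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ofVersion (name TEXT PRIMARY KEY, version INTEGER NOT NULL)")
conn.execute("INSERT INTO ofVersion (name, version) VALUES ('openfire', 30)")

print(safe_to_apply_upgrade(conn))
```

Any other value, or a missing row, means the manual script should not be run.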

I'd be very interested in learning if you have issues if you manually execute that script (specifically, including that comment block).

I have raised a bug for this issue in our tracker, here: https://igniterealtime.atlassian.net/browse/OF-2197

I've run the query and it returns 30.

I'm not going to be able to take the Openfire servers offline to test this. So what I'll do is run a full backup of the database and restore it as a new database on the SQL server. Then I'll point my test Openfire server at the "new" database, run the script and see what happens.

I'll put the results up once I've run the script, but it will probably be tomorrow before I can do this.

Graham

This is correct. It's the block comment in the SQL update script. I suggest removing the block comment and changing it to a single-line comment on each line. This is a very simple fix. The schema manager code cannot handle a block-comment section. It's set up to parse single-line comments only. Running the script manually in SQL manager executes just fine.
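To make the suggestion concrete, here is a minimal Python sketch of turning a block comment into single-line comments. It is not the change that went into Openfire (which is written in Java); the function name is made up, the sample script is invented rather than copied from the real upgrade file, and the regex is naive in that it would also rewrite a /* that appears inside a string literal.

```python
import re


def block_comments_to_line_comments(sql):
    """Rewrite each /* ... */ block comment as one '--' comment per line."""

    def convert(match):
        # Drop the surrounding whitespace, then prefix every line with '--'.
        lines = [line.strip() for line in match.group(1).strip().splitlines()]
        return "\n".join(f"-- {line}".rstrip() for line in lines)

    # DOTALL lets '.' span newlines; '.*?' stops at the first closing '*/'.
    return re.sub(r"/\*(.*?)\*/", convert, sql, flags=re.DOTALL)


# Invented sample, not the content of the real upgrade script.
script = """/*
 Hypothetical comment block,
 spanning two lines.
*/
SELECT 1;
"""
print(block_comments_to_line_comments(script))
```

The statements themselves are left untouched, so a parser that only understands single-line comments can read the result.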

Fixes along those lines are already in the works, thanks! It can take some time before they'll be available in a release though. In the meantime, the issue can be worked around by manually executing the database upgrade script. Be sure to have Openfire shut down when you do so, and create a full backup before you start applying the fix!

For posterity:

A fix has now been applied to the code. The fix will be released as part of Openfire 4.6.2 and 4.7.0.

Which has been released moments ago.

Just installed it on my test rig and it works OK: the group chat rooms were there after the update.
Waiting for clearance to put it on the live system.

Thanks very much for the very quick turnaround.


I had precious little to do with it. Blame @danc and @akrherz.


Happy to report the live system is now done, all good.

Current setup is now 2 x Openfire 4.6.2 with Hazelcast 2.5.0 and a 2 node MS SQL Cluster


Out of interest: how did you set up the SQL clustering? Also, how did you configure Openfire to make use of that cluster?

Not sure how much detail to put, so I'll just put the basics and hopefully not insult anybody. The cluster is, in principle, a standard MS 2 node SQL cluster. So, the process is:

Configure the 2 servers to have access to common storage for the required number of disks. We have 4: Quorum, Data, Logs and Backups.

Install the Failover Cluster role on both servers and then create a cluster. There is no need to create a role in the cluster.

Install MS SQL on both servers making sure to select the cluster installation option on the first server and then the Add node to cluster option on the 2nd server. The actual install is not that much different from a standard install.

Once the cluster is configured, you access it over the network in just the same way as any other MS SQL server, so the connection in Openfire is just the same as for any other MS SQL server. If you want, I can post the string.

From memory, I think I used the guide (link below) written by Starwind Software on setting up a 2 node SQL cluster when I set up my first one. The guide is getting on a bit now as it's for 2012, but it gives all the basic information needed. You just need to understand the differences with the versions of OS and SQL you are going to use and what your shared storage is.

Starwind 2 node SQL Cluster guide

So the back end is an SQL cluster with a 2 node Openfire cluster, and in front of that we have a loadbalancer.org Enterprise 1G to spread the load between the 2 Openfire servers.

Hope this is enough detail.


Hi guys, I have a problem with the latest update of Openfire (4.6.*). The problem is similar to the one in this thread; I don't know if it is necessary to make a new one:

CURRENT SETTINGS

  • OS: Debian 10 (buster) (updated)
  • Version: Openfire 4.5.4
  • Database: Internal (don't remember the name)
  • Users: LDAP (+1000)
  • Groups: LDAP (±30)

Problem: After upgrading, I can see all users in the administration web interface, but all groups have disappeared (only one remains).
This happened with 4.6.1; I tried again with 4.6.2 and have the same problem.