@wroot @guus @akrherz
Currently we have a single Openfire server with an external database (Oracle), and it is working fine. We need to set up DR in case of server failure or any other outage.
What would be the best approach to follow? Below are two approaches we are thinking of trying.
1. Install Openfire on the DR server, with the same version as in production, and connect it to the same DB instance. (In this case, will the data be cleared/overwritten when we install the DR instance against the same DB?)
2. Install and set up clustering using the Hazelcast plugin with the production and DR instances. Once done, keep only the production instance active and stop the DR instance. If there is any failure, activate the DR instance.
Please let us know which of the two approaches above would be better, or whether there is any other approach.
Appreciate your usual support.
I’m assuming here that DR refers to “disaster recovery” or a similar concept.
By far most of the state that is important to retain after a disaster is maintained in the database used by Openfire. There is some on-disk data (notably, the TLS certificates, the openfire.xml configuration file and possibly plugin-specific directories) to take into account too, but that data typically does not change a lot. If “disaster recovery” in your definition is “being able to quickly re-establish functionality” rather than “end-users should not notice the disaster at all”, then dismissing the in-memory state (which is primarily routing tables, cached data, etc.) is probably acceptable.
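As a rough illustration of capturing that on-disk data, here is a minimal Python sketch that packs it into a tarball which can then be copied to the DR server. The relative paths (`conf/openfire.xml`, `resources/security` for the keystores, `plugins`) are assumptions about a typical Openfire home directory, and the function name is made up for this example, so adjust both to your installation.

```python
import tarfile
from pathlib import Path

# Assumed locations inside the Openfire home directory (not taken from this thread):
# the main config file, the TLS keystores, and the plugin directories.
DEFAULT_ITEMS = ("conf/openfire.xml", "resources/security", "plugins")


def archive_on_disk_state(openfire_home, archive_path, items=DEFAULT_ITEMS):
    """Pack the listed items that exist into a gzip tarball.

    Returns the relative paths that were actually included, so the caller
    can notice when an expected item is missing.
    """
    home = Path(openfire_home)
    included = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in items:
            src = home / rel
            if src.exists():
                # arcname keeps paths relative, so the archive can be
                # unpacked directly into the DR server's Openfire home.
                tar.add(str(src), arcname=rel)
                included.append(rel)
    return included
```

Since this data rarely changes, running something like this from a scheduled job (or after each certificate or configuration change) is probably enough. The database itself still needs its own backup or replication.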
Any disaster recovery setup that depends on the database content to survive the disaster that is to be recovered from is flawed. A new Openfire instance is typically spun up very quickly. If you’re strictly looking for disaster recovery solutions, I wouldn’t immediately look at a clustering solution. Clustering adds a lot of caveats to the setup (which might hurt your normal production), and will still depend on that singular database instance.
I would suggest first looking at some kind of hot-standby solution that depends on the database being recovered in some way, shape or form, as supported by your database vendor.
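A standby setup needs some way to decide when the production server is really gone before the standby is activated. The following is a minimal Python sketch of such a check, assuming the production server accepts client connections on the standard XMPP port 5222. The function names and the retry values are made up for this example, and a plain TCP probe only shows that the port answers, not that Openfire is healthy.

```python
import socket
import time


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def primary_is_down(host, port=5222, attempts=3, delay=5.0, timeout=3.0):
    """Report the primary as down only if every probe fails.

    Retrying avoids activating the standby because of one short network blip.
    """
    for i in range(attempts):
        if port_is_open(host, port, timeout=timeout):
            return False
        if i < attempts - 1:
            time.sleep(delay)
    return True
```

A monitoring job could call `primary_is_down("prod.example.org")` (a placeholder hostname) and alert an operator, or start the standby instance, when it returns `True`.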
To enable DR and business continuity for Openfire, my best advice is to consider Hazelcast clustering. Of course, as the database is shared by the two instances, you also need HA and DR at the database level.
I have recently successfully implemented several Openfire Docker instances with clustering enabled. This gives you failover across multiple nodes that share the same database instance.
To get the best performance and scalability, I would investigate using replicated MySQL cluster instances in Docker, but Oracle can do the job as well.
You also need to keep the filesystems containing Openfire's files synced or replicated. In other words, you need a common HA/DR filesystem/volume between nodes.
Implementing DR/HA is always costly, as it requires strong expertise at many levels, plus a testing plan and procedures. You need to weigh the advantage of clustering/HA against a cold standby instance that could rapidly be started in case of problems (cheaper).
The budget is not the same, and neither are the results.
I do provide expertise in consultancy mode for this type of DR/HA project. There is software to install and configure correctly, there are network and performance concerns to address, and a complete DR/HA test plan with operating guide and instructions is needed. It’s a complete job/project.
Hope this helps.
Feel free to contact me if you need help and technical expertise to build your HA/DR infra. I could probably help out if you have a budget for it.
Thanks for posting this question. I hope the answers from @guus and myself help you and others with the same concerns.
@guus Thank you.
We tried installing the same version of Openfire on the DR server and connecting it to the same DB instance as production. We were able to connect and retrieve the messages that were sent on the prod server. We did not install the cluster plugin.
@claude_stabile Thank you.
We tried without the cluster plugin, and it seems to be working when the DR server is connected to the same DB.