
Openfire and CentOS huge performance problems

Same performance problems with nightly build 2019-04-18.

Is there something more I can try to improve performance?

I think this relates to Openfire trying to retrieve the Roster of users. The problem might be caused by Openfire doing many, many LDAP requests (to construct the roster) that each take too much time to complete. A proper fix needs thorough investigation, but perhaps we can reduce the problem.

Openfire, by default, caches a Roster for 30 minutes. After that, the Roster of a user that has already been constructed in the past is reconstructed, which takes valuable resources.
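To illustrate why the lifetime matters, here is a minimal sketch of a cache with a maximum entry lifetime. It is a toy model, not Openfire's actual cache code; the class name `ExpiringCache`, the `loader` callback and the injectable `clock` are all made up for this example.

```python
import time


class ExpiringCache:
    """Toy cache whose entries expire after max_lifetime_ms (not Openfire's code)."""

    def __init__(self, max_lifetime_ms, loader, clock=time.monotonic):
        self.max_lifetime_ms = max_lifetime_ms
        self.loader = loader   # expensive call, e.g. building a roster from LDAP
        self.clock = clock     # injectable so the example can run without sleeping
        self.entries = {}      # key -> (value, time stored, in seconds)
        self.loads = 0         # number of times the expensive loader ran

    def get(self, key):
        now = self.clock()
        hit = self.entries.get(key)
        if hit is not None and (now - hit[1]) * 1000 < self.max_lifetime_ms:
            return hit[0]      # still fresh: no LDAP work needed
        # Missing or expired: pay the full construction cost again.
        self.loads += 1
        value = self.loader(key)
        self.entries[key] = (value, now)
        return value
```

With a 30-minute lifetime, a second lookup after 31 minutes calls the loader again, while a 6-hour lifetime would still serve the cached value.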

Try increasing the cache entry lifetime for the roster cache to something that is a lot larger than 30 minutes (for example, 6 hours). Note that this means that changes to your rosters/groups made in LDAP will take longer to be detected by Openfire!

Try setting the property cache.username2roster.maxLifetime to 21600000 (6 hours, in milliseconds). I think you need to restart Openfire for this to be applied. This might prevent Openfire from trying to construct Rosters that it already has.

If this helps, then I suggest that you change the value to a period that is longer than what it takes for all of your users to get online after a server restart.
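Since the property takes milliseconds, a quick conversion helps avoid an off-by-a-factor mistake. This is just arithmetic; the helper name `lifetime_ms` is invented for the example, and only the property name comes from the post above.

```python
from datetime import timedelta

# Property name as given in the post above.
PROPERTY = "cache.username2roster.maxLifetime"


def lifetime_ms(hours):
    """Convert a cache lifetime in hours to milliseconds."""
    return int(timedelta(hours=hours).total_seconds() * 1000)


print(PROPERTY, "=", lifetime_ms(6))
# prints: cache.username2roster.maxLifetime = 21600000
```

If it takes your users, say, 8 hours to all come back online after a restart, `lifetime_ms(8)` would be the lower bound to aim for.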

Recently, https://github.com/igniterealtime/Openfire/pull/1069 was merged. It should be available in tomorrow’s nightly build. There’s a small chance that this impacts the issue you’re having.

Hi.

I’ll try the new build with this PR and give feedback.

Thank you.

Trying to download the latest nightly build gives a 403 Forbidden error: You don’t have permission to access /openfire/dailybuilds/openfire_2019-05-07-noarch.rpm on this server.

Used URL: http://download.igniterealtime.org/openfire/dailybuilds/openfire_2019-05-07-noarch.rpm

The deb package is available, but the rpm has not been downloadable from 30 Apr until today. Can you help with the rpm, please?

Thank you.

Hey, how did that happen? I’ve fixed the downloads. Thanks for reporting that issue.

I’ve got the latest nightly, thanks for the help!

I’ll try it and report here.

Hi!

Sorry for the time it took to test the new builds.

Here are the results:
Openfire 4.3.2 - 60 users online in 10 minutes, CPU usage 10-25%.
Openfire 2019-05-07 - 750 users online in 10 minutes, CPU usage 80-100%.
Openfire 2019-05-14 - 51 users online in 10 minutes, CPU usage 10-25%.

The test was performed with jmeter-5.1.1 with 1100 connections.
Before each run the Openfire cache was cleared.
Java version used with Openfire 2019-05-07 and 2019-05-14: 1.8.0_212; with 4.3.2: the bundled one.
OS: CentOS 7.6

What can I try next?

Thanks!

I also tried ejabberd-19.02: all 1100 users are online in 3 minutes.
I used the same server, the same Java version and the same LDAP server.

Maybe this info can help.
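To put the reported numbers on a common scale, they can be converted to logins per minute. The sketch below is only arithmetic on the figures posted above; the function names are made up, and the extrapolation to 1100 users assumes a constant login rate, which the tests did not establish.

```python
# (users online, minutes elapsed) as reported in the posts above.
RESULTS = {
    "Openfire 4.3.2": (60, 10),
    "Openfire 2019-05-07": (750, 10),
    "Openfire 2019-05-14": (51, 10),
    "ejabberd 19.02": (1100, 3),
}


def logins_per_minute(users, minutes):
    """Average login rate over the measured window."""
    return users / minutes


def minutes_to_reach(target_users, users, minutes):
    """Linear extrapolation (assumption: the rate stays constant)."""
    return target_users / logins_per_minute(users, minutes)


for name, (users, minutes) in RESULTS.items():
    rate = logins_per_minute(users, minutes)
    eta = minutes_to_reach(1100, users, minutes)
    print(f"{name}: {rate:.1f} logins/min, ~{eta:.0f} min for 1100 users")
```

At the 4.3.2 rate of 6 logins per minute, getting all 1100 test connections online would take roughly three hours, which is far longer than the default 30-minute roster cache lifetime mentioned earlier.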

I think this issue is about how we search and retrieve LDAP objects more than anything else. We’ll probably need to seek out some additional resources on this…

For now we’re staying on version 3.7. We’ll keep trying new releases for some time.

Thanks for the help!

I’m hopeful that https://github.com/igniterealtime/Openfire/pull/1391 has a positive impact on this issue.

@vern It looks like this may have been introduced in 4.3.x. You may want to give 4.2.4 a spin… it will at least bring you up to a more current version than 3.7 https://github.com/igniterealtime/Openfire/releases/tag/v4.2.4

Looks like 4.4.0 does not improve the situation.
We have 2000 users (LDAP) and 100+ groups.
Sharing 1 group took 20+ minutes.
When the server goes down, the users cannot all come back online at all, maybe because the roster cache has a 20-minute lifetime.
The last version that can handle a simultaneous login of 1500 LDAP users is 4.0.2; they came online in 10 minutes.
Group sharing is very slow on 4.0.2 too :(
Some cache stats from an almost empty server (2 users online, 2000 in total).
Trying to share one group (70 members) with all users:

The SQL profiler shows a lot of strange queries during sharing: an insert into ofroster with the values of domain users, immediately followed by a delete of the same record.

Can you give 4.2.4 a try, please?

I’ve tried 4.2.3 and 4.3.2 before; the mass login and group sharing problems were reproduced on both.