
Openfire and CentOS: huge performance problems

For now we’re on version 3.7. We’ll be trying newer releases for some time.

Thanks for the help!

I’m hopeful that https://github.com/igniterealtime/Openfire/pull/1391 has a positive impact on this issue.

@vern It looks like this may have been introduced in 4.3.x. You may want to give 4.2.4 a spin… it will at least bring you up to a more current version than 3.7 https://github.com/igniterealtime/Openfire/releases/tag/v4.2.4

Looks like 4.4.0 does not improve the situation.
We have 2000 users (LDAP) and 100+ groups.
Sharing one group took 20+ minutes.
When the server goes down, users cannot come back online at all, possibly because of the roster cache’s 20-minute lifetime.
The last version that can handle a simultaneous login of 1500 LDAP users is 4.0.2; they came online in 10 minutes.
Group sharing is very slow on 4.0.2 too :frowning:
Some cache stats from an empty server (2 users online out of 2000 total).
Trying to share one group (70 members) with all users:

The SQL profiler shows a lot of strange queries during sharing: an insert into ofRoster with values for domain users, immediately followed by a delete of the same record.
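Roughly, the pattern looks like the sketch below (an illustration, not the literal profiled statements; the column list is assumed from the stock Openfire ofRoster table and the values are made up):

```sql
-- Illustrative only: a roster row for a shared-group contact is inserted...
INSERT INTO ofRoster (rosterID, username, jid, sub, ask, recv, nick)
VALUES (1001, 'someuser', 'groupmember@example.com', 3, -1, -1, 'Group Member');

-- ...and the same row is deleted again almost immediately
DELETE FROM ofRoster WHERE rosterID = 1001;
```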

Can you give 4.2.4 a try, please?

I’ve tried 4.2.3 and 4.3.2 before; the mass login and group sharing troubles were reproduced on both.

Any updates on this? Will it be fixed? Are any logs/dumps needed?
Thanks.

Same question again…

@vern @magic
Thanks to @Dele_Olajide, this should be fixed in the next release. If you can’t wait until the next release, you can test things out with the nightly build.

Thanks a lot, I hope to try the nightly build soon.

Openfire 4.4.3 is out the door now, please try it and report back your findings.

Hi!

I’ve tested 4.4.3-1 with LDAP auth (the same LDAP server as in April) and 500 bots in JMeter - it looks great. All 500 were online in under 1 minute.

Group sharing works much faster now: some groups of 10-20 users are shared instantly, and groups of 100-150 users in under a minute, but I need to test it a little more with more shared groups.

Tested on a fresh CentOS 7 VM with 2 vCPUs and 4 GB RAM, running MariaDB 5.5.64.

I will test this version in production, but I don’t know exactly when I can do it.

Thanks for the release, support and feedback!


Thanks for the feedback! Note that things have changed (have been optimized) in 4.4.4 - you might want to test using that instead of 4.4.3!


Hi!

I’ll definitely try version 4.4.4 and give feedback.

Thanks!

unbearable


Hi!

I’ll try to test 4.4.4 this week; last week was very busy.

Hi @guus! Sorry for the suspense.

Tried 4.4.4 with default settings, including caches.

  • 1000 JMeter bots were online in under 1 minute; 4.4.3 took 3-5 minutes for 1000 bots and under 1 minute for 500.
  • Big groups (100+ users each) were shared in 5-10 seconds, small ones (10-20 users) instantly; 4.4.3 shared big groups in 1-2 minutes (which is already very fast!).
  • Lower RAM usage: 400 of 900 MB was used (1 GB is the default maximum for the Java process); 4.4.3 used the maximum.
  • Huge drop in database connections: from hundreds of thousands down to 6-7K (after all 1000 bots were online), even fewer than on 4.4.3, which was at 10-12K.

All stats were taken while 1000 bots were online.
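For anyone who wants to compare their own numbers, connection totals like these can be read from the MySQL/MariaDB status counters, for example:

```sql
-- Cumulative connection attempts since the database server started
SHOW GLOBAL STATUS LIKE 'Connections';

-- Connections currently open
SHOW GLOBAL STATUS LIKE 'Threads_connected';
```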

OS: CentOS 7
CPU: 2 cores
RAM: 4 GB
DB: MySQL 8.0 / MariaDB 5.5 on the same server as Openfire
Auth via LDAP. Same LDAP server as in April, no changes.

A very impressive release and a big improvement over 4.4.3, although 4.4.3 was already a very good release.

Going to test it with live users and migrate production after.

Thanks a lot!


Thanks for the elaborate tests! Happy to see things improved in the field!


So I’ve migrated to 4.4.4 in production. Performance has been good for two days now.

I’ve adjusted the caches and the RAM limit for Java. With live users, load and cache usage were higher.

1100 live users were online within 3-5 minutes after a restart with the tuned caches, tested multiple times.
Group sharing times with live users are a little worse, 1-2 minutes, but that’s still very fast compared to all previous versions.

Current settings:
Java RAM limit = 2 GB

MySQL:
innodb_buffer_pool_size = 1024M
innodb_buffer_pool_instances = 4

Openfire caches:
group metadata = 10 MB
ldap userdn = 100 MB - will probably change this to 10 MB, but occasionally it can reach 90-100 MB in use, especially when all users log in simultaneously; current usage - 1 MB
roster = 400 MB - current usage 300 MB
vcard = 100 MB - will change to 50 MB; the default size was too small, but 100 MB is too big; current usage 10-20 MB
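For reference, settings like these would typically live in the MariaDB server config and in the Openfire service environment file; the file paths and the OPENFIRE_OPTS variable below are assumptions based on a standard RPM install on CentOS 7, so adjust them to your own setup:

```ini
# /etc/my.cnf.d/server.cnf (path assumed; depends on the MySQL/MariaDB package)
[mysqld]
innodb_buffer_pool_size = 1024M
innodb_buffer_pool_instances = 4
```

```sh
# /etc/sysconfig/openfire (path and variable name assumed for an RPM install)
# Raises the maximum JVM heap for the Openfire process to 2 GB
OPENFIRE_OPTS="-Xmx2048m"
```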

Thanks for your work!


Thanks for sharing this information. Looks like things are under control.

As a side note: I don’t think that there’s much of a downside to having caches that are ‘too big’. Having them ‘too small’, however, can seriously affect performance. I suggest you err on the side of caution.