
Several running processes called /opt/openfire/lib/startup.jar (version 4.6.4)

After I updated Openfire from 4.6.3.1 to 4.6.4.1, users are complaining that Spark is too slow to start a conversation. The Spark version is 2.9.4.

=> The Openfire server runs as a virtual machine on VMware 7.0.

  • Operating system: CentOS 8
  • Database: MySQL 8.0.21
  • Memory: 52 GB
  • vCPU: 6
  • Integrated with Active Directory
  • Active Directory groups: 183
  • Total users: 1,700
  • Simultaneous connections: on average 700 to 900 users on Spark

The /etc/sysconfig/openfire configuration file contains OPENFIRE_OPTS="-Xmx36864m", to raise the heap toward this total memory. If you leave it smaller or undefined, the Openfire panel eventually shows a low-memory alert.
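As a side note, it's worth confirming the heap ceiling the JVM actually received, not just what the file says. A minimal sketch, assuming the JDK's `jcmd` is installed on the server and the Openfire process matches "startup.jar" (the offline part only parses the configured string; GNU grep is assumed for `-P`):

```shell
# On the server, against the live JVM:
#   jcmd $(pgrep -f startup.jar) VM.flags | tr ' ' '\n' | grep MaxHeapSize
# Offline sanity check of the configured value:
OPENFIRE_OPTS="-Xmx36864m"
heap_mb=$(echo "$OPENFIRE_OPTS" | grep -oP '(?<=-Xmx)\d+')
echo "Configured max heap: ${heap_mb} MiB ($((heap_mb / 1024)) GiB)"
```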
I noticed in htop that it shows a lot of processes called /opt/openfire/lib/startup.jar, as in the attached screenshot.
The CPU load is concentrated in the first process, which has reached 80% usage or even more, while the others stay at a low percentage.

=> I would like to better understand this startup.jar process. What is it?
=> If users are slow to start a conversation, is there some additional adjustment to be made in Openfire to improve this performance?

Also attached all the settings and parameters of the openfire server.
If anyone can help me, I appreciate it.

Regards, Antonio
(screenshots attached)

These screenshots are too small for me to see.

I’m not sure about the multiple processes. It is perfectly reasonable to assume that this is your OS showing the individual threads of a single Java process. Probably nothing to worry about.

Do the log files contain any clues?

On larger instances, it is often desirable to reconfigure the cache sizes of Openfire. Was that done on your instance? Can you show a screenshot of the Cache summary page of Openfire?

Do you have monitoring enabled in your database? Does that show increased usage?

If all else fails, you could try to install the Thread Dump plugin, and use that to generate a couple of thread dumps while the system is under load. If we’re lucky, that gives an indication as to what is causing the delays.
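As an alternative to the plugin, thread dumps can also be taken from the shell with the JDK's `jstack`. A rough sketch, where the "startup.jar" process pattern and the /tmp paths are my assumptions:

```shell
# Take three dumps ~10 s apart while the slowness is happening, then compare
# them for threads that stay stuck in the same stack trace across dumps.
pid=$(pgrep -f startup.jar | head -n1)
if [ -z "$pid" ]; then
  echo "no Openfire process found"
else
  for i in 1 2 3; do
    jstack -l "$pid" > "/tmp/openfire-threads-$i.txt"
    sleep 10
  done
  echo "dumps written to /tmp/openfire-threads-{1,2,3}.txt"
fi
```

Run it as the same user that owns the Openfire process, otherwise `jstack` will refuse to attach.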

→ Actually, it was htop's "Hide kernel threads" setting; with it enabled, htop now looks OK.
→ The log files only have some alerts, nothing to worry about, for example "org.jivesoftware.openfire.ldap.LdapManager - Using unencrypted connection to LDAP service!", because I'm not using TLS with Active Directory. Image of the caches attached.
→ Database monitoring is not enabled.
I'm going to try some Java-related variable tweaks and see how it behaves.


Thanks.

It appears that your roster cache has been resized (which is good), but your vCard cache has not, even though its efficiency is low. You can probably gain performance by increasing that cache.
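For what it's worth, Openfire cache sizes can also be set as system properties (Admin Console → Server → Server Manager → System Properties). The property name below follows Openfire's `cache.<name>.size` convention, but the internal name for the vCard cache is my assumption from memory, so confirm it against your Cache Summary page before applying anything:

```properties
# Hypothetical values; tune against the hit ratio shown on the Cache Summary
# page. size is in bytes, maxLifetime in milliseconds (-1 = never expire).
cache.vcardCache.size=10485760
cache.vcardCache.maxLifetime=21600000
```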

Can you tell me which of the cache options would be good to increase?
Analyzing Openfire's behavior, I identified that even after increasing OPENFIRE_OPTS="-Xmx36864m" in /etc/sysconfig/openfire, htop showed it using a maximum of 8 GB of memory alongside high CPU usage. In the Openfire console's Java Memory view, according to the screen, consumption was also very low, around 8 to 12 GB at most.
So, talking to an internal analyst who works with Java configuration, he suggested the parameters below in the /etc/sysconfig/openfire file to improve Java performance.
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+UseG1GC"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+ParallelRefProcEnabled"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+UseStringDeduplication"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+ExplicitGCInvokesConcurrent"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:MaxGCPauseMillis=200"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+AlwaysPreTouch"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+UseCompressedOops"
OPENFIRE_OPTS="$OPENFIRE_OPTS -XX:+UseCompressedClassPointers"
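A list like the one above can be sanity-checked with a small shell loop before restarting; this only catches format slips (such as a stray "=XX:" where "-XX:" was meant), not whether the JVM actually accepts each flag:

```shell
# Rebuild the same option string, then verify every token looks like a JVM
# option (starts with -X). A malformed token would be flagged here.
OPENFIRE_OPTS="-Xmx36864m"
for flag in -XX:+UseG1GC -XX:+ParallelRefProcEnabled \
            -XX:+UseStringDeduplication -XX:+ExplicitGCInvokesConcurrent \
            -XX:MaxGCPauseMillis=200 -XX:+AlwaysPreTouch \
            -XX:+UseCompressedOops -XX:+UseCompressedClassPointers; do
  OPENFIRE_OPTS="$OPENFIRE_OPTS $flag"
done
bad=0
for opt in $OPENFIRE_OPTS; do
  case "$opt" in
    -X*) ;;                              # looks like a well-formed JVM option
    *) echo "malformed option: $opt"; bad=1 ;;
  esac
done
[ "$bad" -eq 0 ] && echo "all options look well-formed"
```

One caveat worth knowing: with -Xmx36864m (36 GiB), the JVM silently disables compressed oops, because -XX:+UseCompressedOops only takes effect for heaps up to roughly 32 GiB. Setting the heap just below 32 GiB may actually reduce real memory usage.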
From the moment I configured these parameters, htop started to show 30 GB of memory usage, and in the Openfire panel, with 931 users connected via Spark, it uses up to 55% of the heap, varying downward.
Even access to the Openfire console is much faster.
Until yesterday at the end of the day, there were users complaining that Spark closed or that they couldn't connect. I applied the parameters above last night, and so far no user has complained about Spark problems.
I believe it has normalized with these adjustments.
Some of the parameters mentioned can be found in the Garbage-First Garbage Collector Tuning guide.
(screenshot attached)