I just solved it an hour ago. I also solved it a year ago, which is why I’m so pissed.
Unix has a per-process file-descriptor limit. On my machine, it happened to be 1024. At about 220 users (i.e., not very many), it would run out and start kicking people out. Since they would just try to reconnect, the problem spiralled up until the CPU was pegged.
The solution, for those who care (us lonely few who use Unix-based machines and have more than a handful of users), is simple. As superuser,
ulimit -n 5000 # or whatever
sh bin/openfire start
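If you want to see how close you are before it bites you, here is a rough sketch. It assumes a Linux box with /proc mounted, and the function names fd_limits and fd_in_use are just ones I made up for illustration, not anything Openfire ships. Pass the Openfire process ID to fd_in_use; with no argument it reports on the current shell.

```shell
# Print the soft and hard open-file limits that this shell hands to its children.
fd_limits() {
    echo "soft=$(ulimit -S -n) hard=$(ulimit -H -n)"
}

# Count the descriptors a process has open right now.
# Assumption: Linux, where /proc/<pid>/fd holds one entry per open descriptor.
# $1 is the process ID; it defaults to the current shell.
fd_in_use() {
    ls "/proc/${1:-$$}/fd" 2>/dev/null | wc -l
}

fd_limits
fd_in_use
```

Keep in mind that ulimit only changes the limit for the shell you run it in and whatever that shell starts, which is why the two commands above have to be run in the same session.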
Since I brought down my last company’s system in 2008 as well as this one’s yesterday, I for one hope this issue is made more prominent in the control console and the documentation.
When is 3.7 going into production?
“Experience is that marvelous thing that enables you to recognize a mistake when you make it again.”
I am seeing this same problem with Openfire 3.6.4 using http-bind with Sparkweb. Is this issue fixed in 3.6.4 trunk or is the recommended solution to upgrade to 3.7 beta?