While load testing a clustered Openfire deployment, I had a worrisome experience. Too much load caused one node to run out of memory and become unresponsive. The Openfire process for that node was then forcibly killed. After that, a variety of problems appeared. When the node was brought back up, its log showed errors related to database unique constraints (e.g. duplicate key…). In addition, all nodes in the cluster threw a NullPointerException whenever Server->Statistics was accessed in the administration console.
This leads me to believe that if a node in the cluster dies suddenly, the persistent state (e.g. the database) may be left corrupted. Can anyone share information or experience with this issue?