I am writing this short post to share what I observed while running some tests with Shared Groups and Rosters.
First of all some general information:
I created two databases with 10K users each and “large rosters” of 300 items (users) per roster. In the wftestR database, rosters were stored in the jiveroster and associated tables, while in the wftestG database, rosters were handled as Shared Groups with the jivegroup and associated tables.
I first started some Spark clients to evaluate the connection response time. It behaved much better with rosters than with shared groups (a few seconds vs. around 1 minute). Repeating the operation didn’t change anything for rosters but improved the Shared Groups response time (down to a few seconds). I think this improvement is due to caching.
Someone at my company who had studied shared groups told me that at the beginning all shared groups are loaded, which is what makes the response time so long. I haven’t had the time to check this, but to improve things I started to write a small plugin whose only purpose is to “load” all users, rosters and groups when the server starts. This improved the behavior significantly, but it also required studying the caching.
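To make the idea of the "load everything at startup" plugin more concrete, here is a minimal sketch of cache warm-up. It is written in Python only to keep it short, and every name in it (`SlowStore`, `Cache`, `warm_up`) is hypothetical; it does not use or mirror the real Wildfire plugin or cache API.

```python
class SlowStore:
    """Stand-in for a database-backed lookup (hypothetical, not a Wildfire class)."""

    def __init__(self, data):
        self.data = data
        self.reads = 0  # counts how often the "database" is actually hit

    def load(self, key):
        self.reads += 1
        return self.data[key]


class Cache:
    """Very small read-through cache with no size limit or eviction."""

    def __init__(self, store):
        self.store = store
        self.entries = {}

    def get(self, key):
        if key not in self.entries:
            self.entries[key] = self.store.load(key)
        return self.entries[key]


def warm_up(cache, keys):
    """Touch every key once at startup so later logins hit the cache."""
    for key in keys:
        cache.get(key)


# 100 fake users with 3 roster items each (made-up data for illustration)
store = SlowStore({f"user{i}": [f"contact{j}" for j in range(3)] for i in range(100)})
cache = Cache(store)

warm_up(cache, list(store.data))
reads_after_warm_up = store.reads

roster = cache.get("user42")  # served from the cache, no extra store read
```

The point of the sketch is only that the expensive loads are paid once at startup instead of during the first logins. A real cache has a maximum size, so warming only helps if that size is large enough to hold what was loaded.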
In both cases (Rosters or Shared Groups), what needed the largest cache size was the rosters. I imagine that there is a trade-off between shared groups and rosters: rosters in the database need little processing but a lot of memory, while shared groups need more CPU inside the WF server to be “converted” into rosters for the clients. I suppose the generated rosters corresponding to shared groups are also stored as rosters in the roster cache.
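To illustrate why shared groups cost CPU, here is a simplified sketch of deriving a user's roster from group membership. This is an assumption about the general idea, not Wildfire's actual algorithm; the real visibility rules for shared groups are richer, and the function name and sample data are made up.

```python
def roster_from_groups(user, groups):
    """Return the sorted contacts that share at least one group with `user`.

    `groups` maps a group name to the set of its members. Every group has
    to be inspected for every user, which is where the CPU time goes.
    """
    contacts = set()
    for members in groups.values():
        if user in members:
            contacts |= members
    contacts.discard(user)  # a user is not a contact of themselves
    return sorted(contacts)


# Made-up sample data
groups = {
    "dev": {"alice", "bob"},
    "ops": {"alice", "carol"},
    "sales": {"dave"},
}

alice_roster = roster_from_groups("alice", groups)  # ['bob', 'carol']
dave_roster = roster_from_groups("dave", groups)    # []
```

With stored rosters the result of this computation is simply read from the database, while with shared groups something like it has to run on the server before the roster can be sent to the client.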
I haven’t yet tuned the roster cache precisely, but for the same number of users I observed a few things:
With shared groups, the User cache seems to be bigger than with rosters: 1.8M instead of 0.8M (not really critical).
The Group cache for shared groups only needed around 16M.
The MetaGroup cache for shared groups only needed 0.9M.
The Roster cache is, as said, the largest. With rosters I was at around 100M and it was filled to around 94%, so I suppose I should go larger; with shared groups it was filled to around 85%. My first impression, which may be wrong, is that the roster cache can be smaller with shared groups.
Does anyone have remarks or advice related to this?
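For reference, the roster cache fill levels above can be turned into absolute numbers. This small sketch assumes that the 85% figure belongs to the shared groups setup and that both percentages refer to the same 100M limit, which is not stated explicitly; the helper name is made up.

```python
def cache_usage(max_mb, fill_percent):
    """Return (used, free) in MB for a cache of max_mb filled to fill_percent."""
    used = max_mb * fill_percent / 100
    return used, max_mb - used


rosters_used, rosters_free = cache_usage(100, 94)  # plain rosters
groups_used, groups_free = cache_usage(100, 85)    # shared groups (assumed same limit)

difference = rosters_used - groups_used  # MB saved by shared groups under these assumptions
```

Under these assumptions the difference is about 9M, so the saving is real but small compared to the total roster cache size.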
Then I started some load testing to see how the server behaves in such a situation. Connecting around 6000 users went well with rosters: load was quite high on the server and on the database, but the system worked fine. With shared groups everything also went well at first, but the connection delay and the time needed to get the rosters put a much higher load on the server. Then suddenly, at around 6K connected users, the system seemed “saturated”: the load became really high, users could not connect anymore and got timeouts.
I used simple scenarios where users connect at a given rate, synchronize together and then start to chat (light scenario) and change status (light scenario) and then disconnect.
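The shape of such a scenario can be sketched as follows. This is only a simulation of the phases (connect at a fixed rate, wait until everyone is connected, chat, change status, disconnect) using Python's asyncio; it does not talk XMPP, it is not the tool used for the tests, and all names and parameters are hypothetical.

```python
import asyncio


async def client(user_id, start_delay, all_connected, connected, total, log):
    await asyncio.sleep(start_delay)       # staggered start gives a fixed connection rate
    log.append(("connect", user_id))
    connected.append(user_id)
    if len(connected) == total:
        all_connected.set()                # the last client releases everyone
    await all_connected.wait()             # synchronization point
    log.append(("chat", user_id))          # light chat phase
    log.append(("status", user_id))        # light presence-change phase
    log.append(("disconnect", user_id))


async def run_scenario(total=5, rate_per_second=1000):
    all_connected = asyncio.Event()
    connected, log = [], []
    await asyncio.gather(*(
        client(i, i / rate_per_second, all_connected, connected, total, log)
        for i in range(total)
    ))
    return log


log = asyncio.run(run_scenario())
```

In the resulting log all `connect` entries come first, because no client starts chatting before the last one has connected. In a real test the log entries would be replaced by actual client operations and timing measurements.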
I haven’t yet had time to study the origin of the load on the server in detail, and I expect to have more details in the coming days.
I don’t know if this information is helpful, but large rosters can really increase the load on the server. If someone has advice or suggestions to reduce the load and improve the response time, I am interested.
By the way, I heard about a caching project developed by Apache/Jakarta - JCS, if I am right. Have you heard about it? Do you think it could be interesting?