Bug: duplicate roster items in database (+fix)

During a database survey, we discovered a lot of duplicate entries in the ‘jiveRoster’ table. The table contained a lot of rows where the values for ‘username’ and ‘jid’ are equal. One of our users had 10,000 items on his roster this way (of which ‘only’ 800 are unique).

At least two things seem to be going wrong:

  • Some of our custom code is causing these duplicates to be created in the database;

  • The checks in Openfire that should prevent duplicates to be created are failing.

I won’t bother you guys with the first part - it’s custom gateway code that appears to be messing things up. In this thread, I’d like to discuss two things: a) how to change Openfire in such a way that it no longer allows for duplicates; and b) how to clean up the current mess.

So, what’s going on in Openfire?

the RosterItemProvider class is the responsible class for creating new RosterItems. For this, the method createItem(String username, RosterItem item) is used. Looking at the method signature and javadocs, the method appears to be checking for duplicate roster items, as it declares to throw UserAlreadyExistsException instances. Obviously, this is failing.

If you look at the internals of the method, you’ll notice that the implementation relies on a single database query that adds a new row in the database. If an SQLException occurs, the UserAlreadyExistsException is thrown. This would suggest that the database should not allow for duplicate rows - in other words, a UNIQUE constraint should be put on the table. This constraint currently does not exist. This causes a weird situation in which a UserAlreadyExistsException can be caused by any random database related problem, EXCEPT for a situation involving duplicate users!

The most obvious way of fixing this is by ensuring that a UNIQUE constraint is added to the database table. This however doesn’t feel right to me. Any database problem will be blindly translated into a duplicate-user situation. It hides actual database problems, which is messy.

Instead, I suggest to add a new query, that checks if the roster item to-be-created already exists.

SELECT 1 FROM jiveRoster WHERE username = ? AND jid = ?

If the resultset of this query includes an actual result, the UserAlreadyExistsException should be thrown. The resultset will not include any results if no other rosteritem currently exists. Using this method, we retain the functionality of the original code, add the ‘missing’ functionality, and improve on the reporting of actual database errors. The new code for the createItem() method would look something like this:

public RosterItem createItem(String username, RosterItem item)
    throws UserAlreadyExistsException {
Connection con = null;
PreparedStatement pstmt = null;
try {
    con = DbConnectionManager.getConnection();
    pstmt = con.prepareStatement(CHECK_FOR_ROSTER_ITEM);
    pstmt.setString(1, username);
    pstmt.setString(2, item.getJid().toBareJID());
    if (pstmt.executeQuery().next()) {
        throw new UserAlreadyExistsException(item.getJid().toBareJID());
    }
    pstmt.close();     long rosterID = SequenceManager.nextID(JiveConstants.ROSTER);
    pstmt = con.prepareStatement(CREATE_ROSTER_ITEM);
    pstmt.setString(1, username);
    pstmt.setLong(2, rosterID);
    pstmt.setString(3, item.getJid().toBareJID());
    pstmt.setInt(4, item.getSubStatus().getValue());
    pstmt.setInt(5, item.getAskStatus().getValue());
    pstmt.setInt(6, item.getRecvStatus().getValue());
    pstmt.setString(7, item.getNickname());
    pstmt.executeUpdate();     item.setID(rosterID);
    insertGroups(rosterID, item.getGroups().iterator(), con);
}
catch (SQLException e) {
    Log.error("Error trying to insert a new row in jiveRoster", e);
}
finally {
    DbConnectionManager.closeConnection(pstmt, con);
}
return item;
}

The changes that I described up until now should prevent new duplicates from being inserted into the database. Simultaneously, we should also remove existing duplicates from the database. An added difficulty here is the table jiveRosterGroups, which maintains a foreign key constraint.

As with any direct database update in Openfire, the XMPP server itself should be shut down first. This is the only way to ensure that memory caches and database states don’t get misaligned.

This is the code that I’d like to use to purge the database from the duplicates. It makes sure that:

  • corresponding values in the jiverostergroups table are deleted;

  • the database structures are eventually back in their old state;

  • the query runs fast enough not to delay the restart of Openfire by an insane amount of time.

If anyone can improve on this SQL, please let me know.

ALTER TABLE jiveRosterGroups DROP CONSTRAINT jiverostergroups_rosterid_fk;
ALTER TABLE jiveRosterGroups ADD CONSTRAINT jiverostergroups_rosterid_fk FOREIGN KEY (rosterid)
      REFERENCES jiveRoster (rosterid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE CASCADE DEFERRABLE INITIALLY DEFERRED; CREATE TEMPORARY TABLE tmp_unique_rosterids AS SELECT DISTINCT ON (username, jid) rosterid FROM jiveroster;
CREATE INDEX myindex ON tmp_unique_rosterids USING btree (rosterid);
DELETE FROM jiveRoster WHERE rosterid in (SELECT rosterid FROM jiveRoster EXCEPT SELECT rosterid FROM tmp_unique_rosterids); ALTER TABLE jiveRosterGroups DROP CONSTRAINT jiverostergroups_rosterid_fk;
ALTER TABLE jiveRosterGroups ADD CONSTRAINT jiverostergroups_rosterid_fk FOREIGN KEY (rosterid)
      REFERENCES jiveRoster (rosterid) MATCH SIMPLE
      ON UPDATE NO ACTION ON DELETE NO ACTION DEFERRABLE INITIALLY DEFERRED;

Note that I’m using Postgresql - other DBMSses might require variations of this code.

Hi Guus,

I did create JM-1176. The HSQLDB scripts do not contain the foreign key, so there is also the problem with the quality of the scripts.

As Openfire uses foreign key constraints it should also use unique constraints, there is no database available which does not support them and it makes things much more easy. But I guess that a redesign of the database is something which will not happen soon.

LG

Note that adding a unique constraint in an database upgrade script will

cause problems for Openfire instances that already have duplicate

entries. That’s another reason why I think that the final solution to this problem shouldn’t depend on such a constraint.

I prefer to have this solved without depending on a unique contraint on the database. By detecting ‘duplicates’ without using the constraint (but with the help of an additional query instead, as I explained above) Openfire will be able to distinguish between ‘normal’ database exceptions.

Hi,

I assume that one wants to increase the database version when running your SQL ALTER TABLE fix script, so one could create the new tables for all databases with foreign keys and also with a proper unique key.

LG