Improvements to the ICU Rating Algorithm
by Mark Orr
created September 18, 2012; last updated February 14, 2014

The algorithm used by the ICU to perform rating calculations remained unchanged from 1999 to 2012, and this web site, initially at least, implemented the same algorithm. However, the detailed breakdown of calculations available from the web site revealed some previously unknown issues that required minor adjustments.

The old algorithm is deemed to be version 1.0. Subsequent improvements, as detailed below, are referred to as versions 1.1, 1.2, 1.3 and so on in the order they were implemented. There will only ever be a version 2.x series if there are major changes to the algorithm.

Version 1.1: Provisional rating calculations after bonuses

In the standard algorithm, provisional ratings are calculated before bonus points are awarded. If any bonus points are awarded, provisional ratings are recalculated. These calculations are performed by an iterative algorithm controlled by two parameters:

  • a maximum number of iterations (a safety measure to prevent an infinite loop if convergence does not occur)
  • a threshold rating change between successive iterations which, if not exceeded, signifies that a stable solution has been reached

For the first calculation (before bonuses) the maximum number of iterations is set to 30 and the threshold change to 0.5. In practice, the iterations tend to converge to a solution well before the maximum. For example, in the 126 tournaments in the 2011-12 season the largest number of iterations required was only 16 and the average was about 5.
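The loop described above can be sketched as follows. This is a minimal illustration rather than the site's actual code: the names iterate_until_stable and halfway are hypothetical, and the toy update rule merely stands in for the real performance rating calculation, which is not reproduced here.

```python
from typing import Callable, Dict, Tuple

Ratings = Dict[str, float]


def iterate_until_stable(
    update: Callable[[Ratings], Ratings],
    initial: Ratings,
    max_iterations: int = 30,
    threshold: float = 0.5,
) -> Tuple[Ratings, int]:
    """Apply `update` repeatedly until no rating moves by more than `threshold`.

    Returns the stable ratings and the number of iterations used.
    Raises RuntimeError if the cap is reached first (the safety measure).
    """
    ratings = dict(initial)
    for iteration in range(1, max_iterations + 1):
        new_ratings = update(ratings)
        largest_change = max(abs(new_ratings[p] - ratings[p]) for p in ratings)
        ratings = new_ratings
        if largest_change <= threshold:
            return ratings, iteration
    raise RuntimeError("performance rating estimation did not converge")


def halfway(ratings: Ratings) -> Ratings:
    """Toy update (an assumption): move every estimate halfway towards 1200."""
    return {player: (value + 1200.0) / 2 for player, value in ratings.items()}


stable, iterations_used = iterate_until_stable(halfway, {"new player": 0.0})
```

Returning the iteration count alongside the result makes it easy to collect the kind of per-season statistics quoted above (largest and average number of iterations).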

An issue occurred only when bonuses were awarded. In that case, when provisional ratings were calculated for the second time, the maximum number of iterations used (in version 1.0) was only 1. One iteration is usually not enough to reach a stable solution and this was causing a problem.

The symptom was provisional ratings which were slightly different from what one would expect from the standard formulae involving performance ratings. For the 2011-12 season about 20% of provisional ratings calculated were out by at least one point. The largest difference was 13 but most were small (the average being 2.2).

The solution is simply to allow more time for the second set of iterations to converge by using the same maximum as the first set, namely 30 (instead of 1). When the 2011-12 tournaments were rerated after this change and the September 2012 rating list republished, 59 out of 893 published ratings changed. No change was more than 10 points and only one was a decrease.
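A toy example may help show why a single pass can leave provisional ratings a few points off. Everything in it is an assumption made for illustration: the linear performance formula (average opponent rating plus 400 times wins minus losses, divided by games), the players A, B, X and Y, the 20 point bonus, and the function name refine. It is not the ICU's actual calculation.

```python
# Toy model only: formula, players and numbers are illustrative assumptions.

def refine(estimates, rated, games, max_iterations, threshold=0.5):
    """Re-estimate provisional ratings until stable or until the cap is hit."""
    current = dict(estimates)
    for _ in range(max_iterations):
        lookup = {**rated, **current}
        updated = {}
        for player, results in games.items():
            n = len(results)
            average_opponent = sum(lookup[opp] for opp, _ in results) / n
            score = sum(points for _, points in results)
            # 2 * score - n equals wins minus losses (draws count 0.5 each).
            updated[player] = average_opponent + 400 * (2 * score - n) / n
        change = max(abs(updated[p] - current[p]) for p in current)
        current = updated
        if change <= threshold:
            break
    return current


# A and B are unrated and played each other, so their estimates are coupled.
games = {
    "A": [("B", 1.0), ("X", 0.0)],  # A beat B, lost to X
    "B": [("A", 0.0), ("Y", 1.0)],  # B lost to A, beat Y
}
# Stable estimates from the first calculation, with X at 1400 and Y at 1600.
before_bonus = {"A": 1466.67, "B": 1533.33}
# Assume a bonus then lifts X from 1400 to 1420.
rated_after_bonus = {"X": 1420.0, "Y": 1600.0}

one_pass = refine(before_bonus, rated_after_bonus, games, max_iterations=1)
settled = refine(before_bonus, rated_after_bonus, games, max_iterations=30)
```

In this toy, a single pass leaves A about 3 points and B about 7 points short of the values the loop settles on when it is allowed up to 30 passes, which is the same order of discrepancy as described above.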

Version 1.2: Recalculating bonuses

There is a second issue related to the recalculation of provisional ratings after bonuses. The problem is that bonuses depend on provisional ratings and provisional ratings depend on bonuses. The last step in versions 1.0 and 1.1 was to recalculate bonuses (for a second time). That means bonuses will be correct relative to provisional ratings but the converse may not be true.

To resolve this problem, that last step, recalculating bonuses a second time, was removed. The only drawback is that there may now be occasions where a player might get a slightly different bonus to the expected one if they have any provisionally rated opponents. However, this was deemed the lesser evil.
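The change in the order of steps can be summarised in a short sketch. The function name rating_steps and the step labels are hypothetical, and the sketch assumes that the final bonus recalculation in versions 1.0 and 1.1 only happened when bonuses had been awarded.

```python
from typing import List


def rating_steps(version: str, bonuses_awarded: bool) -> List[str]:
    """List the calculation steps, in order, for a given algorithm version."""
    steps = ["calculate provisional ratings", "calculate bonuses"]
    if bonuses_awarded:
        steps.append("recalculate provisional ratings")
        # Versions 1.0 and 1.1 finished by recalculating bonuses;
        # version 1.2 removed that final step.
        if version in ("1.0", "1.1"):
            steps.append("recalculate bonuses")
    return steps


old_order = rating_steps("1.1", bonuses_awarded=True)
new_order = rating_steps("1.2", bonuses_awarded=True)
```

From version 1.2 on, the sequence ends with provisional ratings, so they are consistent with the bonuses actually awarded, at the cost of those bonuses being based on the earlier provisional ratings.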

In December 2012 this modified algorithm was used to rerate all tournaments from September 2011. This successfully eliminated nearly all inaccurate provisional ratings while having a limited overall impact on players' ratings. Of the published ratings for September 2012, 87% were unchanged; of those that did change, none did so by more than 10 points and most changed by only 1 point.

The fraction of inaccurate provisional ratings remaining after this change was only 4% with a maximum error of just 1 point. This is negligible and into the territory of rounding errors.

Version 1.3: The maximum iterations in general

In August 2013 the Connaught U16 Championship, a tournament in which the majority of players were unrated, failed to rate. The error message from the system was "performance rating estimation did not converge". On further investigation it was discovered that this particular tournament needed one more iteration than the maximum of 30 to reach convergence. Since one or two previous tournaments had already needed close to 30 iterations to converge, it was apparent that 30 was too low, and it was therefore bumped up to 50 for both phases (before and after bonuses).
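As a summary, the iteration limits in each version can be written down as a small table. The names MAX_ITERATIONS and fits_within_cap are hypothetical; the numbers are those given in the text, with version 1.2 assumed to inherit the limits of version 1.1 since it did not change them.

```python
from typing import Dict, Tuple

# version -> (limit for the first calculation, limit after bonuses)
MAX_ITERATIONS: Dict[str, Tuple[int, int]] = {
    "1.0": (30, 1),
    "1.1": (30, 30),
    "1.2": (30, 30),  # unchanged from 1.1
    "1.3": (50, 50),
}


def fits_within_cap(version: str, iterations_needed: int,
                    after_bonuses: bool = False) -> bool:
    """Would a calculation needing this many iterations finish under the cap?"""
    first, second = MAX_ITERATIONS[version]
    cap = second if after_bonuses else first
    return iterations_needed <= cap


# A tournament needing 31 iterations fails under 1.2 but not under 1.3.
failed_before = not fits_within_cap("1.2", 31)
succeeds_now = fits_within_cap("1.3", 31)
```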

Since no previous tournaments had exceeded the old maximum of 30, it was unnecessary to rerate any of them after this change. Only the Connaught U16 tournament was rerated (this time successfully) and all subsequent tournaments will use the new maximum number of iterations.