First off, if you checked this before, know that I improved it. I'm currently on the third version, and I think it gives excellent results!
Introduction
There have been more and more suggestions these last months to create a consistent and reliable ranking of players that uses competitive results instead of ladder. We all agree that ladder is completely irrelevant (Liereyy is 51st on ladder as I write this, that's enough proof). There are enough competitive games on DE to create granularity without needing ladder, especially at the highest level.
I personally find the work done by aoe-elo fascinating and that's definitely the basis of my own work. I still have some issues with the way they operate and I think there are a few flaws within their system. I've been working on a system I find more accurate, even if you could probably find problems with what I designed.
In the end, it's not so much intended as a pure official way to seed players, and more as a way to discuss elo and rankings, and also to delve into DE history.
What I kept
I use the elo system, as it's universally accepted as the best way to rank players. I think everyone understands elo, but here is a basic summary.
Basically, you gain points for each win, and lose points for each loss. You gain/lose as many points as your opponent loses/gains. That's exactly how it works on ladder.
That number of points won/lost depends on the relative strength of the two players: if you are a heavy favourite, you will win fewer points for a win and lose more points for a loss.
The exact number of points depends on something called K-factor. A K-factor of 32 means up to 32 points are distributed: two even opponents gain/lose 16 points, and then it scales up to a theoretical +32/-32.
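For anyone who prefers to see it as code, here is a minimal sketch of that summary. It assumes the standard Elo curve (base 10, 400-point scale), which is not spelled out above, and the function names are only for illustration.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Expected score (between 0 and 1) of `rating` against `opponent`.

    Assumes the standard Elo logistic curve with a 400-point scale.
    """
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def update(rating: float, opponent: float, won: bool, k: float = 32) -> float:
    """Return the new rating after a single game."""
    score = 1.0 if won else 0.0
    return rating + k * (score - expected_score(rating, opponent))


# Two even players with K = 32: the winner gains 16, the loser drops 16.
print(update(2000, 2000, won=True))   # 2016.0
print(update(2000, 2000, won=False))  # 1984.0
```

Whatever one player gains, the other loses, so the total number of points stays the same.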
Just like aoe-elo, I consider games individually, and not series. This way, a 3-0 win is more rewarding than a 3-1 win or a 3-2 win. This makes it more accurate and rewards players who hold their ground against a theoretically stronger opponent. The consequence is that, just like on aoe-elo, a heavy favourite sometimes wins a set 3-1 or 3-2 but still loses points, because that's treated as 3 wins and 1 or 2 losses. Once you're very high, it becomes extremely hard to keep climbing, even if you win almost every game.
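To make the "winning the set but losing points" case concrete, here is a rough sketch. The ratings (2200 vs 2000) and K = 32 are made-up numbers, the curve is the standard 400-point Elo curve, and ratings are held fixed for the whole set, which is a simplification.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def set_delta(rating: float, opponent: float, wins: int, losses: int, k: float) -> float:
    """Sum of per-game Elo changes over a set, ratings held fixed during the set."""
    e = expected_score(rating, opponent)
    return k * (wins * (1 - e) - losses * e)


favourite, underdog = 2200, 2000  # hypothetical ratings
for wins, losses in [(3, 0), (3, 1), (3, 2)]:
    change = set_delta(favourite, underdog, wins, losses, k=32)
    print(f"{wins}-{losses}: {change:+.1f}")
```

With a 200-point gap the favourite is expected to score about 76% of the games, so a 3-1 (75%) is already slightly below expectation and costs about a point, while a 3-2 costs a lot more.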
K-factor and importance of games
An important difference with aoe-elo is that I only used results from DE. There are enough games on DE to have accurate ratings, and the data is also easier to find. I'm also doing that alone ^^
One of the main issues I have with aoe-elo is that all games are weighted equally. That should not be the case: the first round of a qualifier is not as important as the final of an S-tier tournament. Series have no weighting either, so the length of the series actually indirectly decides their importance: a 6-4 set is not necessarily twice as important as a 3-2 set, even if that tends to be the case. When that's the case, that should be because it's a final of a big tournament, not because it has more games.
Thus, my most important change is that I gave a different K-factor to sets based on the round, the number of games of the set, the tier of the tournament, and some other factors in specific circumstances.
That also allows me to consider showmatches, which is not the case in aoe-elo. Their K-factor is kept low, but I think it's clear that a 1000$ showmatch is treated more seriously by the players than the first round of a 100$ tournament, so including them makes sense.
I obviously don't consider for-fun showmatches and extremely weird settings.
I also take into account every game, but I average the games of a set. This way, a 2-1 and a 4-2 result are equally impactful (but BO7s tend to be played in important sets, so they naturally have a higher K-factor).
It's easier to look at an example with the first game of DE:
TaToH and DauT played a BO9 showmatch.
That kind of showmatch without an especially high prizepool has a K-factor of 8 in my ranking.
TaToH and DauT both start at 2000 elo, like every player: this means they each have a 50% chance of winning each game and can gain a maximum of 4 elo.
By winning 5-1, TaToH gains 4-(1/5*4), which comes out to about 3 points. As you see, it's not that impactful and it's safe to include showmatches.
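Here is a rough sketch of one way to read "I average the games of a set": take the mean of the per-game changes. This is only my interpretation for illustration, with the standard 400-point Elo curve assumed. A strict mean gives about 2.67 for the 5-1 above, while the formula in the text gives 3.2; both round to 3.

```python
def expected_score(rating: float, opponent: float) -> float:
    """Standard Elo expected score (400-point scale assumed)."""
    return 1.0 / (1.0 + 10 ** ((opponent - rating) / 400))


def averaged_set_delta(rating: float, opponent: float, wins: int, losses: int, k: float) -> float:
    """Mean of the per-game Elo changes over a set (hypothetical reading of the averaging)."""
    e = expected_score(rating, opponent)
    win_rate = wins / (wins + losses)
    return k * (win_rate - e)


# The BO9 showmatch example: K = 8, both players at 2000, TaToH wins 5-1.
print(round(averaged_set_delta(2000, 2000, 5, 1, k=8), 2))  # 2.67

# A sweep between even players gives the maximum of K / 2 = 4 points.
print(averaged_set_delta(2000, 2000, 5, 0, k=8))  # 4.0
```

With this reading, only the win rate matters, so a 2-1 and a 4-2 move the ratings by the same amount for a given K-factor.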
Extra points added
There's another potentially controversial change I made. The issue is that top players tend to only face other top players. They don't match up often enough with lower players. There's a real issue with qualifiers: taking part in a qualifier is a huge boost to your aoe-elo ratings, even if you lose. That's something I definitely wanted to change.
The best thing I came up with is that I gave a boost to players who are invited or who autoqualify for an event, based on the K-factor of the qualifier, to compensate for the fact that they did not have the opportunity to gain points by qualifying. That breaks the strict rule of the elo system where the sum of points should not change, but that gives me extremely satisfying results. That compensates for the fact that some players almost exclusively play in big tournaments, and it avoids overrating players who take part in numerous smaller tournaments and qualifiers. Before adding that, I had run it over several months and Max was way too low (he was invited to NAC3 and HC3 but did not really perform, even if he proved he deserved his top 16 spot by qualifying for RBW1) and a player like Vinchester was 6th (invited nowhere, but started qualifying everywhere).
That also creates ladder inflation over time, which I would argue is a good thing here. With everyone starting at 2000, I aim for the average of the top players to be at least 2100, which takes time and is significantly helped if I inject points before events. I'll probably decrease the number of points I inject after some time, or entirely remove it, but I found that it really helped early ratings.
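A tiny sketch of why the injection creates inflation. The bonus size here is invented: `bonus_share` is a hypothetical parameter, since all that is fixed is that the boost is based on the K-factor of the qualifier. Player names and the K value are placeholders too.

```python
from statistics import mean


def apply_invite_bonus(ratings: dict, invited: set, qualifier_k: float, bonus_share: float = 0.5) -> dict:
    """Return new ratings where each invited player gains bonus_share * qualifier_k.

    bonus_share is a made-up knob for illustration, not the real formula.
    """
    bonus = bonus_share * qualifier_k
    return {player: (rating + bonus if player in invited else rating)
            for player, rating in ratings.items()}


ratings = {"invited_a": 2000.0, "invited_b": 2000.0, "qualifier_c": 2000.0}  # placeholder names
boosted = apply_invite_bonus(ratings, invited={"invited_a", "invited_b"}, qualifier_k=16)

print(boosted)
print(sum(boosted.values()) - sum(ratings.values()))  # 16.0 points added to the pool
print(mean(boosted.values()) > mean(ratings.values()))  # True
```

Normal games only move points between players, so the average can only rise through this kind of injection.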
Rankings
I'm honestly pretty happy with the early results, you can find them in "Archives" just below.
I give my first real ranking right after. There are still flaws to discuss and like any system it has its outliers, but this looks very promising. I think the basis is very good. I'll run it until at least the end of 2020 and see if I'm satisfied with that version.
Here is my first provisional ranking, after NAC3:
I only included the 9 players who took part in NAC3, but I think it's extremely interesting to see how quickly players moved even if everyone started at 2000 one month earlier.
Viper is the first player to break 2100.
Here is the second provisional ranking, with the 16 players qualified for HC3 right when qualifiers end:
Hera was 1 point over Viper for one day, but otherwise Viper was firmly ahead. Hera still has a lot of points; he traded wins with Viper during the many A-B tier events of January 2020 like Bonjwa Fight Club, e-Paradise, Fair Civs Cup or Empires Showdown.
I think everything else looks really accurate. 7 of my top 8 will reach quarters in HC3.
Here is the third provisional ranking, with the same 16 players, but after HC3 (and a few events/showmatches):
Viper completely dominates the competition, he is the first player to break 2200. The top 8 is the top 8 of HC3, with Villese and Nicov being the best players outside of that top 8 which looks pretty accurate. Yo is a bit low because he did not take part in smaller events and had unlucky brackets, but nothing extraordinary.
Here is the fourth provisional ranking, including the 16 players who will take part in RBW1 and the 2 players who took part in HC3 but won't participate in RBW1:
This is extremely provisional as some players will quickly lose points during the first round of RBW1, but it's interesting to remember that BacT's start to DE was terrible. Viper also has an absolutely insane lead of 99 points over Hera, meaning it's almost impossible for him to maintain his rating without extraordinary performances.
Here is the fifth provisional ranking, displaying a change of dynamic:
Viper climbed to 2265 elo after his 3-0 win over Hera in semis, with a 114 points gap to Hera and a 119 points gap to Yo. But that quickly changed in the seismic final of RBW1 where Yo gained 42 points and Viper lost 42 points, the single most impactful set in DE.
That gives an extremely interesting and accurate rating. Viper is still first, Yo is clearly second, Hera clearly third, Liereyy clearly fourth, TaToH clearly fifth, MbL and DauT sixth and seventh but close. This honestly looks very similar to what I remember from that time.
dogao is still eighth from Hidden Cup, LaaaaaN is ninth from reaching top 8 in RBW1, and Nicov is tenth as the strongest player who has not yet made it into a top 8.
After those provisional ratings, I give my first real ranking at the end of Visible Cup, in early June 2020. It includes 32 players who either took part in an S-tier event or played at least 10 competitive games on DE.
There's a lot to unpack.
I talked about the top players in "archives", but overall I think it's hard to argue with that top 8. It's simply HC3 top 8, with Viper still considered the best despite his loss of RBW1, Yo looking like a clear second, Hera like a clear third, Liereyy like a clear fourth, TaToH like a clear fifth, MbL and DauT pretty close at sixth and seventh, and dogao the most questionable inclusion because he reached top 8 in HC3 but not in RBW1.
Things start changing from ninth place, with the rise of Vinchester. It is entirely deserved: he qualified for RBW1 by beating Vivi, he won Visible Cup convincingly (4-1 in the finals) and he even beat TaToH in the finals of a Hun War tournament. You could maybe argue he should be around 12th, but he clearly deserves a solid spot.
If we look at top 20 (all the players above 2k), it's actually almost perfect. It includes all the players who qualified to the first three S-tier events of DE, with one exception. Vinchester only took part in one event but I explained his momentum so ninth is ok-ish, otherwise the top 14 is 13 players who took part in both HC3 and RBW1.
Among them, LaaaaaN ranks first (so 10th) after reaching QFs in RBW1, Nicov is 11th as the strongest player who did not reach a QF, then the others are pretty close to one another. 15th is Vivi who missed RBW1, 16th is Max who qualified for RBW1 but in the second qualifier and who had slightly worse performances overall, so those make sense. Slam qualified for RBW1 but got swept first round and barely played, so he deserves to be on the lower end of those players. That leaves Barles as 17th who is the Visible Cup finalist, Belgium as 18th who reached semis, and the other semifinalist was Kasva who only played nine games so isn't included, but is currently at 2004.
The outlier there is repard: he is 19th mostly thanks to RusAOC. I lowered their K-factor over and over to compensate for the uneven games and regional restrictions, but they still have some impact. I don't want to just not include them, that seems ridiculous.
That's one of the many reasons why I think some ladder inflation from S-tier events is good: you'll never gain 100 points from RusAOC alone, so if the average of top players is at 2100, you need to actually beat top players to reach that. I expect that to be fixed after a few more months.
dench being twenty-third comes from RusAOC too.
The one who should probably be in that top 20 instead of repard is BacT. The main issue here is that he only played 9 sets, and he lost a lot of them. He failed to qualify in both RBW1 qualifiers, losing to Luca and to Lyx who are far from the biggest names. Even if he qualified for HC3, he simply can't have a high rating with 2 huge underperformances over only 9 sets. He just needs to play more and slowly regain elo.
I changed quite a lot of things since the first version, so I'm more than happy to receive feedback.