Voobly uses a modified version of the ELO rating system for its RM/DM Team Games. Instead of distributing the points equally among all team members, the higher rated players win fewer and lose more points than the lower rated players.
This modification is called “Fair team balancer (lower rated players get more points when game is won)” (https://www.voobly.com/pages/view/65/Rating-System).
The website’s “Team Fairness” section claims that this modification prevents noob bashing by awarding players fewer points for playing in lower rated games.
Before discussing the direct impact of this rating modification, I want to show what is, in my opinion, a harmful effect on the overall ELO economy.
The numbers I use are not completely identical to the actual numbers, but they represent Voobly’s algorithm well enough.
Let’s say we have 4 players with the following ratings: 1800, 1600, 1600 and 1400.
These ratings provide two different possible matchups.
First, we have the “fair” matchup in which the 1800 and the 1400 team up. Together they have the same average rating as the two 1600’s, so in theory they should win 50% of the time.
TEAM 1
50% chance to lose:
1800 --> (-10) --> 1790
1400 --> (-6) --> 1394
50% chance to win:
1800 --> (+6) --> 1806
1400 --> (+10) --> 1410
We are in a situation where the 1800 loses 10 points 50% of the time and wins 6 points 50% of the time. Hence his expected rating after one match is two points lower than his original rating. The reverse happens to the 1400:
1800 + 50% * (-10) + 50% * (+6) = 1798
1400 + 50% * (-6) + 50% * (+10) = 1402
As you can see the 1800 gets punished just for playing in a “fair” game where he teams up with the 1400.
TEAM 2 doesn’t get their ratings changed for playing this matchup (because they win or lose equal amounts of points):
TEAM 2
50% chance to win:
1600 --> (+8) --> 1608
1600 --> (+8) --> 1608
50% chance to lose:
1600 --> (-8) --> 1592
1600 --> (-8) --> 1592
1600 + 50% * (-8) + 50% * (+8) = 1600
1600 + 50% * (-8) + 50% * (+8) = 1600
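The expected changes for this fair matchup can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it uses the approximate point values from the example (not Voobly’s exact numbers), and the function name `expected_change` is my own.

```python
def expected_change(p_win, gain, loss):
    """Expected rating change for one game: p_win * gain + (1 - p_win) * loss."""
    return p_win * gain + (1 - p_win) * loss

# (points for a win, points for a loss), taken from the example above.
fair_matchup = {
    "1800 (TEAM 1)": (6, -10),
    "1400 (TEAM 1)": (10, -6),
    "1600 (TEAM 2)": (8, -8),
}

# Both teams have the same average rating, so each side wins 50% of the time.
changes = {
    name: expected_change(0.5, gain, loss)
    for name, (gain, loss) in fair_matchup.items()
}

for name, delta in changes.items():
    print(f"{name}: {delta:+.2f} points per game")
```

The 1800 comes out at -2 per game, the 1400 at +2, and the 1600’s at 0.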
If TEAM 1 plays enough games against TEAM 2 (and the higher rated player always gets awarded fewer points than the lower rated player), the 1800 and the 1400 would both eventually end up with a rating of 1600 – leaving the 1800 massively underrated and the 1400 massively overrated.
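To get a feeling for how fast this happens, here is a rough sketch. It assumes, purely for illustration, that the expected drift stays at a constant 2 points per game until both ratings meet. How Voobly’s split actually changes as the gap shrinks is not modelled, and `games_until_equal` is a hypothetical helper, not anything from Voobly.

```python
def games_until_equal(high, low, drift_per_game=2.0):
    """Count games until the two teammates' expected ratings meet.

    Assumption: the higher rated player loses `drift_per_game` points per game
    in expectation and the lower rated player gains the same amount.
    """
    games = 0
    while high > low:
        high -= drift_per_game
        low += drift_per_game
        games += 1
    return games, high, low

games, high, low = games_until_equal(1800, 1400)
print(games, high, low)
```

Under this simplified assumption, both players would sit at 1600 after 100 games.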
Now let’s look at the other matchup: 1800 & 1600 vs 1400 & 1600.
For the average ratings of 1700 and 1500, I consulted the Voobly rating calculator (https://www.voobly.com/pages/view/46/Rating-Calculator), which still uses the original ELO rating system for TGs. It gives the 1700 player 8 points for winning and takes 24 points for losing.
If we take these numbers and assume ratings are exact and stable, we can calculate the theoretical winning chance of the 1700 team:
1700 + p * (+8) + (1-p) * (-24) = 1700
--> p = 75%
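The same break-even probability can be checked in Python. For comparison, the sketch also evaluates the standard ELO expected-score formula on the 400-point scale. That Voobly uses exactly this formula is my assumption, and both function names are my own.

```python
def break_even_win_probability(gain, loss):
    """Win probability p that solves p * gain - (1 - p) * loss = 0.

    `loss` is passed as a positive number of points.
    """
    return loss / (gain + loss)

def elo_expected_score(rating_a, rating_b):
    """Standard ELO expected score of A against B (assumed 400-point scale)."""
    return 1 / (1 + 10 ** ((rating_b - rating_a) / 400))

p_calculator = break_even_win_probability(gain=8, loss=24)
p_formula = elo_expected_score(1700, 1500)

print(f"from calculator points: {p_calculator:.3f}")
print(f"from ELO formula:       {p_formula:.3f}")
```

The calculator’s rounded points give exactly 75%, while the formula gives roughly 76%, so the two agree up to rounding.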
Now we can calculate the expected ratings after one game:
TEAM 1
25% chance to lose:
1800 --> (-14) --> 1786
1600 --> (-10) --> 1590
75% chance to win:
1800 --> (+3) --> 1803
1600 --> (+5) --> 1605
1800 + 25% * (-14) + 75% * (+3) = 1798.75
1600 + 25% * (-10) + 75% * (+5) = 1601.25
TEAM 2
75% chance to lose:
1400 --> (-3) --> 1397
1600 --> (-5) --> 1595
25% chance to win:
1400 --> (+14) --> 1414
1600 --> (+10) --> 1610
1400 + 75% * (-3) + 25% * (+14) = 1401.25
1600 + 75% * (-5) + 25% * (+10) = 1598.75
Again, we see each team’s higher rated player getting punished just for playing, while the lower rated players get rewarded just for playing. If this matchup were played often enough, TEAM 1 would end up with two 1700’s and TEAM 2 with two 1500’s.
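The expected ratings for this stacked matchup can be recomputed the same way. Again this is just a sketch with the approximate point values from the example, and `expected_rating` is a name I made up.

```python
P_TEAM1_WIN = 0.75  # derived above from the calculator's +8 / -24

# (rating, points if own team wins, points if own team loses)
team1 = [(1800, 3, -14), (1600, 5, -10)]
team2 = [(1400, 14, -3), (1600, 10, -5)]

def expected_rating(rating, gain, loss, p_win):
    """Expected rating after one game."""
    return rating + p_win * gain + (1 - p_win) * loss

team1_after = [expected_rating(r, g, l, P_TEAM1_WIN) for r, g, l in team1]
team2_after = [expected_rating(r, g, l, 1 - P_TEAM1_WIN) for r, g, l in team2]

print("TEAM 1:", team1_after)
print("TEAM 2:", team2_after)

# The total number of points in the game stays the same; they only get moved around.
total_after = sum(team1_after) + sum(team2_after)
```

The sum of all four ratings stays at 6400. The points are not created or destroyed, they are shifted from each team’s higher rated player to the lower rated one.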
We can also calculate the expected rating changes assuming random teams (no fmt):
1800: 33.3% * ( - 2 ) + 66.7% * ( - 1.25 ) = - 1.5 --> 1798.5
1400: 33.3% * ( +2 ) + 66.7% * ( +1.25 ) = +1.5 --> 1401.5
1600: 33.3% * ( +0 ) + 33.3% * ( +1.25 ) + 33.3% * ( - 1.25 ) = 0 --> 1600
1600: 33.3% * ( +0 ) + 33.3% * ( - 1.25 ) + 33.3% * ( +1.25 ) = 0 --> 1600
Here we see how the “Fair team balancer” compresses the ratings towards the average rating – not only in stacked and “noob bashing” games, but also in random team games.
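The random-team numbers are just the average over the three possible ways to split four players into two teams. A small sketch, where the labels `1600a` and `1600b` are only there to tell the two 1600 players apart:

```python
# Expected change per game for each player in each of the three possible pairings,
# using the per-matchup results from the examples above.
pairings = [
    # 1800 + 1400 vs 1600a + 1600b
    {"1800": -2.0, "1400": 2.0, "1600a": 0.0, "1600b": 0.0},
    # 1800 + 1600a vs 1400 + 1600b
    {"1800": -1.25, "1400": 1.25, "1600a": 1.25, "1600b": -1.25},
    # 1800 + 1600b vs 1400 + 1600a
    {"1800": -1.25, "1400": 1.25, "1600a": -1.25, "1600b": 1.25},
]

# With random teams each pairing is equally likely, so we take the plain average.
average_change = {
    player: sum(matchup[player] for matchup in pairings) / len(pairings)
    for player in pairings[0]
}

for player, delta in average_change.items():
    print(f"{player}: {delta:+.2f} points per game")
```

This gives -1.5 for the 1800, +1.5 for the 1400 and 0 for both 1600’s, matching the numbers above.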
One well-known result is how difficult it is to reach very high or very low TG ratings on Voobly. Lots of high-end players have higher 1v1 ratings than TG ratings. If we compare the current Voobly ladders, we see that the highest RM 1v1 rating is 2667, while the highest RM TG rating is only 2206. In my eyes, the “Fair team balancer” causes top-end TG players to be systematically underrated and moves the RM TG ladder away from its original purpose of giving the best possible approximation of the players’ skill.
The effect of underrated players
The argument for the “fair team balancer” on https://www.voobly.com/pages/view/65/Rating-System is, if I understand correctly, the assumption that it keeps “noob bashers” away from lower rated games, because they get fewer points for playing in these games than in games with equally rated players. Or, if they play regardless, the “noob bashers” at least get fewer points than they originally would.
Most people don’t want to play against smurfs, because smurfs are (usually) underrated and stronger than their rating suggests. Thus, the smurfs get more points for winning than they should, and people who play vs them effectively donate points.
The problem with smurfs is not the number of games they play, but that they are underrated and therefore steal points from “well rated” players.
It looks to me like the ones who actually get punished by the “fair team balancer” are the ones playing against the “noob bashers”. They get punished in the sense that they lose more games than they should, given everyone’s ratings. Sure, the noob bashers get a lower rating, but this also allows them to continue playing in lower rated games where they win more often than their ELO suggests.
Noob bashing should not be a problem in the first place. If ratings were accurate, the bashers would win most of the time, but they would also get rewarded accordingly, leaving them with the same ELO as before. If the bashers get too many points in these games, this only means their rating was too low compared to the other players’ ratings, and everyone’s ELO is now more likely to be accurate.
I know that with the original ELO system some players would go ahead and point trade their way up to a foolish rating (by exclusively playing vs overrated players), but ratings aren’t supposed to be framed and shown to your grandparents; they are supposed to represent your true skill. They are nothing but an indicator. Your true skill doesn’t depend on your rating anyway.
All in all, my point is that the “fair team balancer” squeezes the ELO economy towards the average rating, decreasing the average rating differences in the process and ultimately making it harder to judge a player’s true skill by their rating.