When will the devs fix the Elo calculation?!

:arrow_forward: GAME INFORMATION

  • BUILD #: All
  • PLATFORM: All
  • OS: All

:arrow_forward: ISSUE EXPERIENCED

:question: DESCRIBE THE ISSUE IN DETAIL (below). Limit to ONE issue per thread.

There are lots of issues with the current elo rating system.
For more details:

:arrow_forward: FREQUENCY OF ISSUE

:question: How often does the issue occur? CHOOSE ONE; DELETE THE REST.

  • 100% of the time / matches I play (ALWAYS)

:arrow_forward: REPRODUCTION STEPS

:question: List the DETAILED STEPS we can take to reproduce the issue… Be descriptive!

For more details:

:arrow_forward: IMAGE & ATTACHMENTS

:question: Attach a relevant PICTURE (.jpg, .png, .gif), VIDEO (.mp4, YouTube), DXDIAG FILE (.txt), or CRASH/GAME LOGS (.aoe2record, .txt) below.

For more details:

@GMEvangelos When will this be fixed?

This is a real issue. I have almost stopped playing TGs. Why? Matchmaking matches the ELOs, and because the TG elo is so inflated, people who play more often climb higher in ELO at the same skill level. So if you are in the lobby and see that all your teammates have 500 TGs played, you already know they will be hardly any help, whereas a player with “only” 100 TGs played will be waaaaaay better. If this were fixed, TGs would be much better balanced. I really hope this one gets fixed.

I believe the mistake that causes this RM tg inflation is using the highest elo of the enemy team to do the MMR calculations, instead of just using the average.

This is indeed the root of the inflation. If they fix only this, I will be happy. So the fix is known, but it isn’t high on the devs’ priority list. @GMEvangelos never replies to this kind of issue. The bug has been in the game from the start, so it has existed for over a year. Still no fix.

There are some other minor things, but they are more ‘nice to have’ than a ‘must’.

I don’t think there is an issue here that can be easily fixed, at least to players’ satisfaction.

There’s another understated effect of how ELO should work for team games. Even if players only join team game queues with other players of exactly the same skill, TG ELOs will need to span a much larger range than 1v1 ELOs: two people who are only 100 elo apart in 1v1 should be much further apart in TG elo. Intuitively, it’s because if you have 4 players on a team who are each 100 1v1 elo higher than their opponents, they are a lot less likely to lose the game than a single player who is 100 1v1 elo higher than his opponent. Let me try to explain the math of why, for example, someone with 1400 ELO 1v1 should have a much higher elo in 4v4:

The elo system assumes a distribution with a fixed standard deviation in terms of ELO, and that is reasonably good for a lot of skill-based games and sports like AoE2. From the Wikipedia page: “Two players with equal ratings who play against each other are expected to score an equal number of wins. A player whose rating is 100 points greater than their opponent’s is expected to score 64%; if the difference is 200 points, then the expected score for the stronger player is 76%.” What this means is, a player who is 100 ELO above another player in 1v1 should win 64% of games against them. That is, a 1100 1v1 elo player should win 64% against someone else that’s 1000 elo, in 1v1s. Intuitively, I think of a game as a sum of individual decisions with somewhat-random outcomes; I’ll call this sum the “performance” of the players. By chance, a player with a lower elo can win against a player with a higher elo (by “performing” better by chance), and the ELO system measures how likely that is to happen.
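
To make the quoted percentages concrete, here is a minimal Python sketch of the standard Elo expected-score curve (the logistic formula with a 400-point scale given on the Wikipedia page). The function name `expected_score` is only an illustrative choice, and nothing here is confirmed to be the exact formula the game uses internally.

```python
def expected_score(rating_diff: float) -> float:
    """Expected score for a player rated `rating_diff` points above the opponent.

    Standard Elo logistic curve with a 400-point scale (assumed, not taken
    from the game's code).
    """
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


# Print the expected score for the rating gaps mentioned in the quote.
for diff in (0, 100, 200):
    print(f"{diff:>3} Elo ahead: {expected_score(diff):.0%} expected score")
```

Evaluating the curve at 100 and 200 points gives values that round to the 64% and 76% from the quote.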

Similarly, the system attempts to do the same for team games. Let’s assume a system where we only have 4v4s. Four 1100-TG-ELO players should win 64% of games against four 1000-TG-ELO players, and four 1200-TG-ELO players should win 76%. However, if you take four players who are each 1100 1v1 ELO and match them against four players who are each 1000 1v1 ELO, the probability of the lower-skilled team winning drops significantly below 36%, to something much closer to 24%. This is because, roughly, you have to average the “performance” of each team, and when all four players have a 100 1v1-ELO advantage over their opponents, it is much less likely that their average performance falls below the opposing team’s. In particular, you can think of the outcomes of TG matches as governed by the standard error of the averaged performance of 4 players instead of the standard deviation of the performance of 1 player. The standard error of a team’s average performance goes as Sqrt(1/N), where N is the number of players on each team, so the same per-player skill gap is worth Sqrt(N) times as many ELO points, and the higher-skilled players will need to settle at an even higher TG ELO to compensate. In summary (tl;dr): players who are 100 ELO apart in 1v1s should be 200 ELO apart in 4v4s, and this follows from how the ELO system works. This assumes players only queue together with others of their own skill level, their 1v1 elos are accurate, and they ONLY queue for 4v4s.
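
The Sqrt(N) argument can be sketched in a few lines of Python. This is only an illustration of the reasoning in the paragraph above, under its stated assumptions (equal-skill teammates, accurate 1v1 ratings, a single team size). The names `expected_score` and `team_win_probability` are hypothetical, not anything from the game.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Standard Elo logistic curve with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def team_win_probability(per_player_diff: float, team_size: int) -> float:
    """Win chance when every player on a team is `per_player_diff` 1v1 Elo
    above their counterpart.

    Assumes the Sqrt(N) scaling from the post: averaging N performances
    shrinks the noise by Sqrt(N), so the gap counts Sqrt(N) times as much.
    """
    effective_diff = per_player_diff * math.sqrt(team_size)
    return expected_score(effective_diff)


# Four players each 100 1v1 Elo ahead, compared with a single 1v1.
favourite_4v4 = team_win_probability(100, 4)
print(f"1v1 underdog wins {1 - expected_score(100):.0%}")
print(f"4v4 underdogs win {1 - favourite_4v4:.0%}")
```

Under these assumptions a 100-point gap per player in a 4v4 behaves like a 200-point gap in a 1v1, which is where the drop from roughly 36% to roughly 24% comes from.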

The same argument can be applied for 3v3s, in which case every 100 ELO in 1v1s should be 173 ELO in 3v3s (because sqrt(3) = 1.73…), and for 2v2s, in which case 100 ELO difference in 1v1 should be 141 ELO difference in 2v2s. This means, if we center elos on 1000 ELO, then theoretically, a 1400 ELO 1v1 player should have a TG elo of 1566 ELO in 2v2s, 1693 ELO in 3v3s, and 1800 ELO in 4v4s. This is why it’s normal for TG elos to be way more spread out than 1v1 elos: if you are 200 ELO below someone in 1v1s, you should expect to be ~400 TG ELO below him if you both play mostly 4v4s. Similarly, someone who has 2000 ELO TG and only plays 2v2s should be more skilled than someone who has 2000 ELO TG and only plays 4v4s, assuming people never mix queues.
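
As a quick check of the arithmetic, the conversion can be written as a tiny Python helper. This is a sketch under the same idealised assumptions (ratings centered on 1000, players never mixing queues). `equivalent_tg_elo` and its `center` parameter are made-up names for illustration.

```python
import math


def equivalent_tg_elo(elo_1v1: float, team_size: int, center: float = 1000.0) -> float:
    """Map a 1v1 Elo to the TG Elo it would settle at for one team size.

    Assumes the distance from the center of the ladder is stretched by
    Sqrt(team_size), as argued in the post.
    """
    return center + (elo_1v1 - center) * math.sqrt(team_size)


# A 1400 1v1 player in each team-game size.
for size in (2, 3, 4):
    print(f"{size}v{size}: {round(equivalent_tg_elo(1400, size))}")
```

Rounded to whole points, this gives the 1566, 1693, and 1800 figures quoted above.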

However, if the game tracked 2v2s, 3v3s, and 4v4s separately, it would be way too many stats to keep track of (a player would need to play 40 games just to initialize all 4 rankings), so the developers decided simply to combine all TG ELOs into one statistic. Furthermore, players intentionally queue with friends of varying skill, and other players get randomly combined with each other, so you can expect a lot of unpredictability in how TG elos change. There’s also the problem that over time, players who are lower ELO are more likely to either abandon the game or just play less often, while players who have just joined the game start off at 1000 ELO. This last factor contributes to TG elos drifting higher (the “inflation”), and the first few factors exacerbate this drift. But I still roughly believe the overall math: for example, looking at the plot from the GitHub link, since the median TG ELO is around 1400 and the median 1v1 ELO is around 1000, you can expect a ~1500 ELO 1v1 player (500 ELO above the median) to be ~2400 ELO in 4v4s (500 * Sqrt(4) = 1000 ELO above the median), and they both indeed map to the same percentile (95%).

This is why TG elos can vary so much compared to a player’s performance, and I always take them with a grain of salt. I don’t think the ELO system needs to be revamped – I can’t imagine a system that avoids all the systemic issues I mentioned in the previous paragraph while still being useful to players, and I think it’s intended to be just a rough estimate of a player’s TG skill. I just expect that if I face an opponent who is 200 ELO higher or lower than me in a 4v4 TG, he’d feel about 100 ELO higher or lower than me in 1v1s, and that has felt accurate on average.

@StrawCube487203 Thanks for your input. I think you have posted some valuable points. Some of them might also be posted in the thread I refer to in my first post.

There is just one little note I want to make:
In your post you assume that you must compare the average rating of both sides. But you can also look at the sum of all elos and use that as the rating for calculating the win rates. 4x 1000 vs 4x 1100 will have a 400 total elo difference, which means the higher-rated team is much more likely to win the game. That is something else than the 100 elo which you used in your post for this case. This is a major flaw in your post.

Overall I do agree there are some other real issues, like having a 2k+1k team face 2x 1.5k players. The sum / average on both sides is equal, but is this really a balanced game? The 2k player can probably lead his team to victory.

In the end it is much more difficult to get well-balanced team games than well-balanced 1v1s. My suggested easy fix (compare your elo to the average of the opposing team instead of the max) would solve the inflation for the most part, but won’t solve all the other issues.

I agree with the suggested MMR fix of taking the teams’ average instead of max in the updates. I think that needs to be done. The net points transferred / given out after a match should be close to zero.
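
To illustrate why the choice of reference rating matters for inflation, here is a small Python sketch. It is a guess at the mechanism, not the game's actual code: it assumes each player's update compares their own rating to a single reference rating for the enemy team (either the average or the max), with a hypothetical K-factor of 32. All function names are invented for the example.

```python
def expected_score(rating: float, opponent_rating: float) -> float:
    """Standard Elo expected score with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** ((opponent_rating - rating) / 400.0))


def average(ratings):
    """Mean rating of a team."""
    return sum(ratings) / len(ratings)


def net_points(winners, losers, reference, k: float = 32.0) -> float:
    """Total rating points created (positive) or destroyed (negative) by one match.

    `reference` turns the enemy team's ratings into the single rating each
    player is compared against (assumed update rule, K-factor is hypothetical).
    """
    gained = sum(k * (1.0 - expected_score(r, reference(losers))) for r in winners)
    lost = sum(k * expected_score(r, reference(winners)) for r in losers)
    return gained - lost


# The 2k+1k versus 2x 1.5k match-up from this thread, with either side winning.
team_a = [2000, 1000]
team_b = [1500, 1500]
for name, reference in (("average", average), ("max", max)):
    a_wins = net_points(team_a, team_b, reference)
    b_wins = net_points(team_b, team_a, reference)
    print(f"{name:>7}: net {a_wins:+.2f} if A wins, {b_wins:+.2f} if B wins")
```

In this sketch the average-based update hands out and takes away the same number of points for this match-up, while the max-based update creates extra points whichever team wins, which is the kind of drift the inflation complaint describes.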

I was pointing out (possibly like many in the earlier thread) that we can expect issues to still remain after that. Using a sum won’t fix it either, because summing their elos will also distort the scale, just in the opposite direction: a sum multiplies the per-player gap by N, which is more than the Sqrt(N) from my argument. We can use other calculations to get TG elo differences to match 1v1 elo differences, but there’s no reason to make it this complicated. The community could adapt to the idea of the TG elo scale being more spread out: you can still roughly think of a player with much higher TG elo as better; if they don’t have 1v1 ratings, this is the best estimate you can use.
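
The three ways of combining ratings discussed in this thread can be compared side by side. The Python sketch below is only illustrative: it assumes equally sized teams and feeds each candidate team gap into the standard Elo curve, and the names `expected_score` and `implied_win_chance` are hypothetical.

```python
import math


def expected_score(rating_diff: float) -> float:
    """Standard Elo logistic curve with a 400-point scale."""
    return 1.0 / (1.0 + 10.0 ** (-rating_diff / 400.0))


def implied_win_chance(team_a, team_b):
    """Win chance for team_a implied by three ways of measuring the team gap.

    Assumes both teams have the same number of players.
    """
    n = len(team_a)
    avg_diff = (sum(team_a) - sum(team_b)) / n
    diffs = {
        "average": avg_diff,                        # gap of team averages
        "sqrt(N) scaling": avg_diff * math.sqrt(n),  # gap from the Sqrt(N) argument
        "sum": avg_diff * n,                        # gap of team totals
    }
    return {name: expected_score(d) for name, d in diffs.items()}


# 4x 1100 versus 4x 1000, the example used in this thread.
for name, chance in implied_win_chance([1100] * 4, [1000] * 4).items():
    print(f"{name:>15}: {chance:.0%}")
```

For this match-up the average implies the smallest edge for the stronger team, the sum implies the largest, and the Sqrt(N) scaling sits in between.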

I actually think if the 1k, 1.5k, and 2k players are accurately rated (a big if), the 2v2 should be decently balanced. If by chance the 1.5k players 2v1 the 2k player instead of the 1k player, they have a much better chance of winning (while if they 2v1 the 1k player, they’ll likely be set back). The reason we see the 2k+1k team win so often is because oftentimes the 1k player just hasn’t had his TG elo settle to a true rating yet.