Analyses of the ratings - Spotting the issues

The problem with the exponential average is that it isn’t consistent. It depends heavily on the divisor applied to the Elo and gives different relative results for different absolute values.
Exp average of 1000/500/500/500, times 2, ≠ exp average of 2000/1000/1000/1000 (except for a single specific divisor/exponent combination).

But you can make a polynomial average, like Elo^4 or something like this. This way it would weight higher Elo much more but still be consistent.

Formula, for example: Team Elo = (avg(individual_elo^4))^(1/4)
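
A minimal sketch of the polynomial (power) average, assuming the exponent 4 from the formula above. The function name `power_mean` and the sample ratings are only illustrative. It shows the "consistency" being claimed here: doubling every rating doubles the team rating.

```python
def power_mean(ratings, p=4):
    """Team rating as the p-th root of the mean of ratings**p."""
    return (sum(r ** p for r in ratings) / len(ratings)) ** (1 / p)


small = power_mean([1000, 500, 500, 500])
large = power_mean([2000, 1000, 1000, 1000])

# Scaling all ratings by 2 scales the result by 2.
print(small, large, large / small)
```

Note that this kind of average is consistent under multiplication, not under adding a constant to every rating.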

I think this could solve the smurf issue.

Exp average of 1000/500/500/500 = exp average of 2000/1500/1500/1500 - 1000. Add any number to the rating of all players and the exponential average increases by exactly the same amount. So it is perfectly consistent.
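
A quick numerical check of this, as a sketch. The divisor of 500 and the function name `exp_average` are assumptions for illustration only; subtracting the maximum before exponentiating is just a numerical-stability trick and does not change the result.

```python
import math


def exp_average(ratings, divisor=500.0):
    """divisor * ln(mean(exp(r / divisor))), computed in a stable way."""
    top = max(ratings)
    mean_exp = sum(math.exp((r - top) / divisor) for r in ratings) / len(ratings)
    return top + divisor * math.log(mean_exp)


base = exp_average([1000, 500, 500, 500])
shifted = exp_average([2000, 1500, 1500, 1500])   # every rating + 1000
doubled = exp_average([2000, 1000, 1000, 1000])   # every rating * 2

print(shifted - base)      # the shift carries through unchanged
print(doubled - 2 * base)  # doubling does not carry through
```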

I was playing with the numbers on Team Elo and how the points are distributed.
For me the 1v1 Elo system works great (though I would argue about the K-factor of 32, which is fine but could be better: K = 32 for newcomers with fewer than 10 games, then K = 16 or so after that).
Team Elo, on the other hand, is a disaster, with inflation caused by the non-uniform distribution of points between the teams and between the players of each team.
My proposal is to keep calculating the points at stake in a team game as in 1v1 Elo, but to distribute them unevenly among the players:
-If your team wins, distribute the points by weighted average so that the player with the lowest Team Elo gets more points than the other players.
-If your team loses, take the points by weighted average so that the player with the highest Team Elo loses more points than the rest of the team.
The weights are based on each player’s Team Elo within the team.

Example 4v4:

• Team A:
• Player 1: 2200 Team Elo
• Player 2: 1500 Team Elo
• Player 3: 800 Team Elo
• Player 4: 300 Team Elo
• Average Team A: 1200
• Team B:
• Player 1: 1650 Team Elo
• Player 2: 1350 Team Elo
• Player 3: 1250 Team Elo
• Player 4: 1000 Team Elo
• Average Team B: 1312
Team A has a 34% chance of beating Team B based on the average Team Elo.
• If Team A wins, then 21.01 points must be transferred from Team B to Team A (as in 1v1):
-Team A:
• Player 1: 8% of the points → +1.67 → New Team Elo = 2202 Team Elo (was 2200)
• Player 2: 12% of the points → +2.46 → New Team Elo = 1502 Team Elo (1500)
• Player 3: 22% of the points → +4.60 → New Team Elo = 805 Team Elo (800)
• Player 4: 58% of the points → +12.28 → New Team Elo = 312 Team Elo (300)
• Average New Team A: 1205 (5 point increase)
-Team B:
• Player 1: 31% of the points → -6.60 → New Team Elo = 1643 Team Elo (was 1650)
• Player 2: 26% of the points → -5.40 → New Team Elo = 1345 Team Elo (1350)
• Player 3: 24% of the points → -5.00 → New Team Elo = 1245 Team Elo (1250)
• Player 4: 19% of the points → -4.00 → New Team Elo = 996 Team Elo (1000)
• New team B average: 1307 (decrease of 5 points)
• Now if Team A loses, 10.99 points must be transferred from Team A to Team B (as in 1v1):
-Team A:
• Player 1: 46% of the points → -5.04 → New Team Elo = 2195 Team Elo (was 2200)
• Player 2: 31% of the points → -3.44 → New Team Elo = 1497 Team Elo (1500)
• Player 3: 17% of the points → -1.83 → New Team Elo = 798 Team Elo (800)
• Player 4: 6% of the points → -0.69 → New Team Elo = 299 Team Elo (300)
• New team average A: 1197 (decrease of 3 points)
-Team B:
• Player 1: 19% of the points → +2.12 → New Team Elo = 1652 Team Elo (was 1650)
• Player 2: 24% of the points → +2.59 → New Team Elo = 1353 Team Elo (1350)
• Player 3: 25% of the points → +2.79 → New Team Elo = 1253 Team Elo (1250)
• Player 4: 32% of the points → +3.49 → New Team Elo = 1003 Team Elo (1000)
• New team B average: 1315 (3 point increase)

This weighted distribution model takes the most points from the highest-rated player on the losing team and gives the most points to the lowest-rated player on the winning team. The points taken and the points awarded are the same, so no inflation enters the system.
What do you think? What are its weak points?
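
The 4v4 example can be reproduced with a short sketch. The weighting rule is inferred from the numbers in the example and is an assumption about the intended method: winners share the points in inverse proportion to their Team Elo, losers in direct proportion. K = 32 and the function names are likewise illustrative.

```python
def expected_score(team_avg, opponent_avg):
    """Standard Elo win expectancy from the two team averages."""
    return 1.0 / (1.0 + 10 ** ((opponent_avg - team_avg) / 400.0))


def split_points(ratings, total, winners):
    """Split `total` points among the players of one team.

    Winners are weighted by 1/rating, losers by rating.
    Ratings must be positive for the inverse weights to make sense.
    """
    weights = [1.0 / r for r in ratings] if winners else list(ratings)
    weight_sum = sum(weights)
    return [total * w / weight_sum for w in weights]


K = 32
team_a = [2200, 1500, 800, 300]
team_b = [1650, 1350, 1250, 1000]

avg_a = sum(team_a) / len(team_a)
avg_b = sum(team_b) / len(team_b)
chance_a = expected_score(avg_a, avg_b)

# Points at stake if Team A wins, computed as in a 1v1 game.
pot = K * (1.0 - chance_a)
gains_a = split_points(team_a, pot, winners=True)
losses_b = split_points(team_b, pot, winners=False)

print(round(chance_a, 2), round(pot, 2))
print([round(g, 2) for g in gains_a])
print([round(l, 2) for l in losses_b])
```

Because both teams split the same pot, the total number of points in the system stays constant.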

Example 2v2:

• Team A:
• Player 1: 2200 Team Elo
• Player 2: 1000 Team Elo
• Average Team A: 1600
• Team B:
• Player 1: 1650 Team Elo
• Player 2: 1550 Team Elo
• Average Team B: 1600
Both teams have a 50% chance of beating the other based on the average Team Elo.
• If Team A wins, then 16 points must be taken from Team B to Team A (as 1v1):
-Team A:
• Player 1: 31% of the points → +5 → New Team Elo = 2205 Team Elo (was 2200)
• Player 2: 69% of the points → +11 → New Team Elo = 1011 Team Elo (1000)
• Average New Team A: 1608 (8 point increase)
-Team B:
• Player 1: 52% of the points → -8.25 → New Team Elo = 1642 Team Elo (was 1650)
• Player 2: 48% of the points → -7.75 → New Team Elo = 1542 Team Elo (1550)
• New team B average: 1592 (decrease of 8 points)
• Now if team A loses, the same 16 points must be awarded from team A to team B (as 1v1):
-Team A:
• Player 1: 69% of the points → -11 → New Team Elo = 2189 Team Elo (was 2200)
• Player 2: 31% of the points → -5 → New Team Elo = 995 Team Elo (1000)
• Average New Team A: 1592 (8 point decrease)
-Team B:
• Player 1: 48% of the points → +7.75 → New Team Elo = 1658 Team Elo (was 1650)
• Player 2: 52% of the points → +8.25 → New Team Elo = 1558 Team Elo (1550)
• New team B average: 1608 (increase of 8 points)

The weighted distribution within a team is based on each player’s Team Elo: the percentage of points a player loses is directly proportional to their Team Elo, and the percentage a player wins is inversely proportional to it (simple mathematics, without complicating the calculation too much and without weighting each game or each player against each opposing player).

You’re right. This is the way it should be: consistent under adding Elo, not under multiplying it, because the Elo system is designed exactly this way. A specific absolute Elo difference is equivalent to a specific win/loss chance.
It would also make it harder and harder to push the Team Elo by smurfing, which is another advantage of this system.

The question is just which divisor to choose for the exponential average, as each divisor gives very different results.
As the base I would choose either e (the natural exponential) or 2. I don’t know which is faster for the computer to calculate for high exponents. I think base 2 might be easier to calculate, but I read somewhere that implementations actually use the Taylor series with a table, which would mean the natural exponential is the way to go, at least as long as the exponent doesn’t exceed a specific value.

My proposal:

Team elo = ln(avg(exp(ind.elo/500)))*500
This would mean the Team Elo for a 3000/3000/1000/1000 team would be about 2663, as opposed to the current 2000.
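
A direct transcription of the proposed formula as a sketch, for anyone who wants to try other team compositions. The function name `team_elo` is illustrative; the divisor of 500 comes from the proposal.

```python
import math


def team_elo(ratings, divisor=500.0):
    """Team Elo = ln(avg(exp(elo / divisor))) * divisor."""
    mean_exp = sum(math.exp(r / divisor) for r in ratings) / len(ratings)
    return divisor * math.log(mean_exp)


mixed = team_elo([3000, 3000, 1000, 1000])
even = team_elo([2000, 2000, 2000, 2000])

print(mixed)  # well above the plain average of 2000
print(even)   # a team of equal players keeps its plain average
```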

There is also another method that could possibly be viable, though it would need some calculation. It would work like this:
If Team A wins against Team B, then EACH team member of A gets Elo as if they had won against a single player with the Team Elo of Team B. The members of Team B lose Elo individually in the same manner.
If it is done like this, the Team Elos need to be calculated in a very specific way so that the system is “balanced”. It must be possible, but it will take time to figure out the formula for this.
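
A rough sketch of the per-player update described above, leaving open how the opposing Team Elo itself is computed. The opposing Team Elo is simply passed in as a number here, and K = 32, the function names and the sample ratings are assumptions for illustration.

```python
def expected_score(rating, opponent_rating):
    """Standard Elo win expectancy of `rating` against `opponent_rating`."""
    return 1.0 / (1.0 + 10 ** ((opponent_rating - rating) / 400.0))


def individual_updates(team, opposing_team_elo, won, k=32):
    """Rating change of each player, as if each had played a 1v1
    against a single opponent rated `opposing_team_elo`."""
    score = 1.0 if won else 0.0
    return [k * (score - expected_score(r, opposing_team_elo)) for r in team]


winners = [2200, 1500, 800, 300]
gains = individual_updates(winners, opposing_team_elo=1312.5, won=True)

for rating, gain in zip(winners, gains):
    print(rating, round(gain, 2))
```

With these numbers the 2200 player gains only a fraction of a point, while the 300 player gains almost the full K.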

With that method, increasing the Elo of the highest-ranked players would be almost impossible, as they would almost always fight against teams with comparatively low Elo and therefore only get the minimum increase even if they won all their games. So smurfing to push Elo would be practically impossible with that calculation method.

I think this would be the optimal solution, but I don’t know how much work it would be to figure out the calculation of these “balanced” Team Elos.

Edit: I just made some tests with it. It looks like it is possible, but then the Team Elos would depend on the individual Elos of the opposing team, too… Which makes sense, as they would be put into the formula… I had hoped they would cancel out somehow, but it seems that isn’t the case. Maybe this approach isn’t achievable with Elo, and a different ranking method would be necessary to make it work. I’m not saying it’s impossible, but the calculation would be so complicated that you couldn’t explain it to anybody anymore, which would cause even more problems if new shenanigans occur.

So the best I have read so far is the “exponential average” approach. Though I didn’t think it was consistent initially, I now think it is the best consistent approach, as it fits perfectly with the characteristics of the Elo distribution.

I hope the devs will have a look at it and decide to use one of the various forms of that Team Elo calculation, one that is most representative of the actual impact high-Elo players have on low-Elo teams. That way the TG smurf (and probably even alt-F4) issue can hopefully be solved.

I have never played SC2, so I have no idea how this really works. Based on your statement I have some questions about it from a theoretical point of view. These are just meant to get a better understanding of this solution:

Does having a separate Elo for every premade mean that you have to play about 10-20 games with every team before you more or less reach your true Elo? That would mean that before that point you only have unbalanced games: you wipe your opponent or you get wiped. That doesn’t really sound like a good idea at all. How does SC2 deal with this?

How does this system deal with solo queueing? Does solo queueing have its own ranking? Will you also get a different rating for each random team? In that case everyone seems to be 1k when queueing solo. The probability of getting the same team twice while queueing solo is really, really small.

How does this system deal with partial premades? Like 2 friends queue up for 4v4 and end up with 2 other players. Is this seen as a premade, and will this team get a premade Elo?

Again, this post isn’t meant to say the SC2 system is terrible. But hearing this idea raises some questions for me. I would love to hear your answers, so I can form an opinion about this idea afterwards.

@casusincorrabil Thank you, and it’s nice to see someone spending time and effort on this too.

I mostly agree with you and will reply to the parts of your post where I have commentary.

They should use whatever base is most convenient for them and change the number they divide the individual Elo by to compensate. There is actually only one degree of freedom in the exponential average calculation. Since (x^a)^b = x^(a * b), any change in the base can be translated into a change in the exponent and vice versa. For example, since exp(x) = 2^(x / ln 2), ln(avg(exp(ind.elo/500))) * 500 is the same as log_2(avg(2^(ind.elo/(500 * ln 2)))) * (500 * ln 2). (Switching the base from e to 2 rescales the ‘500’ by ln 2, to roughly 346.6.)
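
A quick numerical check of the base-change argument, as a sketch. It relies only on the identity exp(x) = 2^(x / ln 2), so moving from base e to base 2 multiplies the divisor by ln 2. The function names and the sample team are illustrative.

```python
import math


def team_elo_base_e(ratings, divisor):
    """divisor * ln(mean(e ** (r / divisor)))"""
    mean_exp = sum(math.exp(r / divisor) for r in ratings) / len(ratings)
    return divisor * math.log(mean_exp)


def team_elo_base_2(ratings, divisor):
    """divisor * log2(mean(2 ** (r / divisor)))"""
    mean_pow = sum(2.0 ** (r / divisor) for r in ratings) / len(ratings)
    return divisor * math.log2(mean_pow)


ratings = [3000, 3000, 1000, 1000]
natural = team_elo_base_e(ratings, 500.0)
binary = team_elo_base_2(ratings, 500.0 * math.log(2))  # about 346.6

print(natural, binary)  # the two agree up to floating-point error
```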

In my first post where I suggested to use the exponential average (Analyses of the ratings - Spotting the issues - #93 by Mercy9545), I chose to put my degree of freedom (and I called it the parameter ∆) in the exponent in a specific form such that it conveys intuitive meaning: ∆ is the rating difference we think you need to have with other people in order for a 2v1 match with them to be fair. If we agree on ∆, then there are no more degrees of freedom.

This is how the old system worked (only there they would get Elo as if they had beaten the highest-rated player on the other team, while at the same time they would get matched based on average Elo, a combination that did not make sense). In this approach, if a group of players play together for a while, their ratings will converge. I don’t like that. It means that if two players with different ratings play together, the higher-rated player will start to smurf unintentionally and the lower-rated player will be boosted unintentionally.
I remember that during the times when the old calculation was still in place, someone told the following story: ‘My lower-rated friend and I regularly play together and when we do, we have a 50% winrate. However, whenever my friend plays without me, he almost never wins, causing him to have an abysmal winrate overall. How is this possible? Should the rating system not make sure everyone gets a winrate that is close to 50%?’ I explained how I thought the story he sketched was consistent with a rating system where players who play together converge in rating, and I ended up advising him to use a smurf account to play games with his friend, which would paradoxically result in fairer matches (!).

This point was not about a specific rating system that I support, but nonetheless I think it is an interesting one: aren’t we making things too complicated for people to follow? On the one hand, yes: many people will not be able to fully understand our discussion. But on the other hand, no: I don’t think it is bad for us to make things complicated. The 1v1 Elo system is already very complicated. Almost no one understands the mathematics and assumptions behind it. Yet it is not difficult for people to get an intuitive grasp of how it works. People know that if they beat someone stronger than them, they get lots of points, and if they beat someone weaker than them, they get fewer points. How the mathematics ensures this happens does not need to concern them. Similarly, though some people will not be able to follow our discussion on the exponential average, they can understand that we want the higher-rated players in the team to have a bigger impact on the matchmaking. In fact, if Microsoft were to adopt the ‘exponential average’ formula, they could just write in the patch notes something like “The higher rated players in a team now have a bigger influence than the lower rated players in determining what other team they will be matched against.” and leave it at that.

Even me: I was initially tricked by my intuition into thinking that the exponential average would be inconsistent. But it is actually the most consistent of the averaging models we have when paired with the Elo system, as the Elo system is effectively a logarithmic projection of the distribution.

And I’m glad you made me rethink it, because otherwise I wouldn’t have come to that conclusion myself.

I just hope that the devs understand what we are talking about and see that they can improve the team matchmaking and Elo gains/losses by implementing the exponential average.

True. I wasn’t sure about it initially because there are sums involved and exp doesn’t distribute over sums. I didn’t want to call it too early. To be honest, I had never worked with the exponential average before.

But I think here it would fit perfectly.

Pretty sure he will not explain, since I do not believe that system can work at all 11
Imagine the number of possible teams… if we have 50k players, there are 50k^4 possible teams (yeah yeah, 50,000 × 49,999 × 49,998 × 49,997, who cares…)
That alone would be about 6.25 × 10^6 terabytes of ratings (assuming we store each rating in 1 byte)
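
The back-of-the-envelope estimate can be checked with a few lines, as a sketch. It assumes 1 byte per rating and 1 terabyte = 10^12 bytes; the variable names are illustrative. It also shows the smaller count you get if the order of players within a team does not matter.

```python
import math

players = 50_000
bytes_per_rating = 1          # assumption from the estimate above
terabyte = 10 ** 12           # decimal terabyte

rough_teams = players ** 4                 # the quick 50k^4 estimate
ordered_teams = math.perm(players, 4)      # 50,000 * 49,999 * 49,998 * 49,997
unordered_teams = math.comb(players, 4)    # same four players in any order

rough_tb = rough_teams * bytes_per_rating / terabyte
unordered_tb = unordered_teams * bytes_per_rating / terabyte

print(rough_tb)       # millions of terabytes
print(unordered_tb)   # 24 times fewer, still far too many to store
```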

OK, but yeah, you don’t need all the ratings, just the ones that differ from the start value. I get that, but smurfing is the easiest thing you could do with this system: one friend changes account and the rating is completely reset for all of them, which is absurd.

You have a separate solo rating for 2v2, another for 3v3 and another for 4v4.

So you can be diamond in solo 2v2, plat in solo 3v3, plat in solo 4v4, master in 2v2 in a premade with one friend you play with a lot, and gold in a rare 4v4 premade.

When I played 2v2 with my friend, we were really strong at the beginning thanks to a good number of all-ins, so we played against players who were individually stronger than us (better APM but less coordination).

How does this system deal with partial premades? Like 2 friends queue up for 4v4 and end up with 2 other players. Is this seen as a premade, and will this team get a premade Elo?

It uses the solo rating in the match. So if you play a 4v4 with 1 friend and 2 randoms, both your friend and you will use your solo 4v4 ratings.

Well, you play another match until you reach balanced games. In my experience it reaches balance way earlier, in 5-8 games. I even suspect it initially uses the solo 4v4 Elo as a starting point. So if you are diamond and your friend is gold, then your team starts with a hidden MMR as if it were platinum.

Bump since it’s still broken with 5k elo smurf abusers

Any news about a fix yet? Probably not…

I think bothering them on Twitter about the issues would get a faster response.

I have no Twitter…

I don’t either but the only time I’ve seen the devs actually work was when T-West exposed them on twitter with the wallhack bugs.

He will have a large group of followers. That will also make a difference. An account with no followers won’t make them sweat on Twitter…

Any news about a fix yet? Probably not…

Honestly, I think the Elo system will never get fixed at this point. After 1000 games I still have around 125 more losses than wins. Someone like me should be lower on the scale and not be lumped in with people who have 220 wins and 175 losses. It’s as unfair for them as it is for me. It’s getting to the point where people are alt-F4’ing me because of my loss-to-win ratio. Your rating system contributes to the alt-F4 problem just as much as the people who do it because they hate certain maps.

idk why they’re still using this archaic elo system when it takes 500 games to place people where they belong and still fails even after people have played thousands of games

half my games now involve some exploiters bringing along a smurf with 85-100% winrate

anyone who has played aoe2 would know that the community is full of toxic players who do not respect the concept of fair play. but the developers chose to cater to them instead of catering to the legitimate players who just stick to 1 account without abusing the system

it’s a shame how poorly the game is managed.

the last rating patch just made the problem worse.
the map script updates usually make the maps worse.
the settings updates (eg. forcing position-picking on every map, letting people see the map before picking their civ) kill most of the strategy/variety in the game and create so many balance issues.
and the balance patches usually make teamgames worse because they’re just knee-jerk reactions to complaints from 1v1 players.

Honestly, I don’t know why, and will never understand, people can’t just stick to one account. Do they do it because they want to make themselves look like the best with a 92 percent win rate on their new account? As for me, I don’t care about that. What I do care about is an honest, fair, and balanced game, but in the 2 years since this game was released we haven’t gotten that. Plus I’m sick of being told to “get good” when there’s an obvious problem with the rating system. Joining a clan is a whole other toxic hell, so that’s also out for me, because again my win-to-loss ratio gets me well ratioed, to the point where the higher-Elo players within the clan don’t even play with the low-Elo players out of fear of “losing their own Elo”, which in the end makes the clan function utterly pointless.

Just tagging some random dev accounts in the hope they will have a look at this issue. TG ratings are in need of a quick fix. They have been broken since the release of AoE II DE and still haven’t been fixed. I really hope this will get a really high priority. Otherwise the devs are killing the TG ladder. This thread is already full of solutions, so they can just pick the one they think is best and implement it in the game.
