Triple Metagame Fortnight #6: Did "bad teams" win more often in 6.83?

Metagame Fortnight is back with a triple-issue! Get caught up on the last SIX weeks of draft and metagame changes. After being under the weather, Gorgon has returned to bring us the final Metagame Fortnight ahead of the new patch. With 6.84 rapidly approaching, Gorgon asks an important question: did bad teams win more often in 6.83? Find out inside.

After six weeks of metagame information, I tackle a question I've seen raised by everyone from message-board fans to pro players, especially in light of last week's Starladder upsets:

are teams beating their betters more often than they used to?



I could only answer this question thanks to the tremendous resources provided to me by Ben "Noxville" Steenhuisen. Either of us can be reached on Twitter with any questions or comments. Nox didn't, however, participate in my interpretations, so please don't spam him if you disagree with my conclusions.



For months now I’ve been seeing some interesting complaints about the legitimacy of competitive Dota. Although it’s not strictly a Metagame topic, I wanted to investigate anyway.




Is there something about this patch, this meta, these heroes, or these tournament structures that is making it uniquely more common for good teams to lose?



This is a very different question than asking if a team that is behind is expected to make a comeback. We know that this is more the case in 6.83 than in patches 6.81 and earlier. We are really asking, “Is there something about the changes made after 6.81 which made rankings less representative of skill because something other than skill is allowing teams to win more often?”

It’s hard to answer this question without a reliable way to rank teams. At the moment, the best rankings available for Dota 2 are team Elos, based on the rating system used in chess since 1960. In Elo, teams gain and lose points based on results against their opponents, weighted by those opponents' strength: in short, good teams gain few points for beating bad teams, while bad teams gain many points for beating good teams.
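The core Elo update can be sketched in a few lines. This is a generic sketch of the standard formula, not Noxville's exact implementation; the 400-point scale and the K-factor of 32 are conventional chess defaults, and his system additionally applies decay:

```python
def expected_score(r_a, r_b):
    """Elo's predicted probability that team A beats team B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))

def update(r_a, r_b, a_won, k=32):
    """Return the new ratings after one game.

    The winner gains exactly what the loser gives up, and the size of
    the exchange depends on how surprising the result was.
    """
    delta = k * ((1.0 if a_won else 0.0) - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta
```

An upset transfers far more points than an expected win: with these parameters, a 1000-rated team beating a 1200-rated one gains roughly 24 points, while the 1200-rated team beating the 1000-rated one gains only about 8.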

Understanding Elo



I use the Elos curated by Ben “Noxville” Steenhuisen for many reasons, but two reasons stand out in this situation:


Firstly, Nox’s Elos only include games counted by datdota, which is the database I use to do most of my own research. Using his Elos allows me to cross-reference seamlessly (as opposed to using our own joinDOTA ranking system or some other website’s ranking system which would be likely to create data conflicts).
Secondly, Nox’s Elo has built-in decay. This matters because it preserves the average score over time and forces teams to remain active in order to maintain their position. The rating system has remained very stable across time, which makes it easy for us to compare across different patches, years, or seasons.

Why do I say Elos are the best rankings available? Because they’re the most predictive (of the rankings that I’ve seen): for teams calibrated with at least fifty prior games,

Elo predicts the correct winner in 85% of games and an even greater percentage of matches.



I removed all scores from teams with fewer than 50 prior games for all the data in this research, because I do not want to know if new teams are beating established teams more often—

I want to know if low ranked teams are beating high ranked teams more often.



Fifty games might seem a bit extreme, but I chose such a large number so that every team is likely to have played its most skilled opponents at least a few times and to have played some opponents from outside its region. By waiting so long, we get very stable scores for each team. When a lower-ranked team beats a higher-ranked team in this data set, it’s much more likely a genuine upset than a miscalibration.

Even with the 50 game calibration period, we still have a sample size of over 9000 games… so we can afford to be picky.



Defining Skill



Skill is the ability to win consistently against a variety of opponents. Even executing a strategy which requires opponents to make mistakes is skillful if it is consistent. What isn’t skillful is winning based on factors outside of player control--

in other words, winning based on chance.

So how can we tell if a patch or meta has increased the influence of randomness in victory?

For the remainder of this article, I will be referring to winning in terms of single games, not matches.

Matches (best of three is a common format) are a fantastic way to control for the influence of randomness in victory, but because there isn't a standard match format from league to league and stage to stage, we can't really base the Elo on match wins. Bear in mind, though, that when using a game-based Elo to predict matches, the correct prediction rate is much higher than the rate for predicting games (which is already fairly high).

Compressing Ranks



Let's talk about a few core concepts:

If teams win games due to non-skill factors, they also win Elo due to non-skill factors.
The odds of winning a game based on chance benefit all teams equally (in an individual game).
Over time, worse teams benefit more from chance than better teams (because better teams have more to lose).

The second and third point might seem to contradict each other, but they don't.

Good teams are more likely to win games without the influence of chance, so they are more vulnerable to the redistribution of wins that randomness provides.

If I own 8 apples and you own 2, the odds that I'll find a worm in a specific apple of mine are the same as the odds that you'll find a worm in one of yours. But the odds of me finding more worms than you across all of my apples are pretty high.

Think of it as a tax: if a team wins 75 of 100 games on skill, but the great socialist hand of Icefrog reaches in and declares “30% of games will be evenly distributed between teams,” then that team loses 30% of its expected wins (a total of 22.5). Its opponent also loses 30% of its expected wins (7.5). Once the 30 redistributed games are split evenly, the better team ends at 67.5 wins and the worse team at 32.5, so chance has narrowed the gap by 15 games (22.5 - 7.5) across the series of 100. While this seems extreme, it's basically the complaint being made when somebody states that bad teams are regularly beating good teams: if that's happening, teams must be getting wins through redistribution by luck.
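The arithmetic of that "tax" can be made explicit. A minimal sketch, assuming the same 75/25 skill split and 30% of games redistributed by chance:

```python
skill_wins_good, skill_wins_bad = 75.0, 25.0  # wins by skill alone, out of 100
luck_share = 0.30                             # fraction of games decided by chance

luck_games = luck_share * (skill_wins_good + skill_wins_bad)  # 30 coin-flip games
# Each team keeps 70% of its skill-based wins, then splits the luck games evenly.
good = skill_wins_good * (1 - luck_share) + luck_games / 2  # 52.5 + 15 = 67.5
bad = skill_wins_bad * (1 - luck_share) + luck_games / 2    # 17.5 + 15 = 32.5
```

The gap shrinks from 50 games to 35: the better team has more skill-based wins exposed to the tax, so on net, chance always transfers wins downward.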



In addition,

when Red team wins games it widens the gap just a little while when White team wins it closes the gap quite a bit

That's because ranking is based on odds, and the odds of White winning here are very low. For a very simplified example using the above scenario, Red earns 10 points per win but loses 30 points per loss. Based on skill games alone, the teams don't move relative to each other (each earns 750 points), which reflects Red being three times better than White. But if 30% of games are won based on chance, even though Red wins some games it otherwise wouldn't, the overall shift is hugely in White's favor.



The result is that teams across the board are compressed to closer Elos

, because climbing or dropping ranking is harder to achieve--the further away from the center you rise or fall, the stronger random chance pulls you back toward the middle.
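This pull toward the middle is easy to demonstrate with a toy simulation. The sketch below is my own illustration, not the article's actual data: it pits a team with a 75% "true" win rate against a weaker one and lets some fraction of games be decided by a coin flip instead. The larger that fraction, the smaller the Elo gap the ratings settle at:

```python
import random

def simulate_gap(luck, n_games=20000, k=20, seed=7):
    """Average settled Elo gap between a strong and a weak team.

    With probability `luck` a game is a pure coin flip; otherwise the
    stronger team wins with its true 75% skill edge. The gap is averaged
    over the second half of the games, after the ratings have settled.
    """
    rng = random.Random(seed)
    r_strong, r_weak = 1000.0, 1000.0
    gaps = []
    for i in range(n_games):
        p_win = 0.5 if rng.random() < luck else 0.75
        strong_wins = rng.random() < p_win
        expected = 1.0 / (1.0 + 10 ** ((r_weak - r_strong) / 400.0))
        delta = k * ((1.0 if strong_wins else 0.0) - expected)
        r_strong += delta
        r_weak -= delta
        if i >= n_games // 2:
            gaps.append(r_strong - r_weak)
    return sum(gaps) / len(gaps)
```

With `luck=0` the gap settles near 400·log10(3) ≈ 191 Elo (the gap at which Elo predicts a 75% win rate); with `luck=0.6` the effective win rate drops to 60% and the gap compresses to roughly 70. More chance, tighter ratings.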

If we were seeing even a much smaller increase in the influence of chance on victories than in that example, we would still end up with compression. Elo would plateau around the average and drop steeply at the edges--the steeper the drop, the greater the percentage of wins that are luck-based rather than skill-based. Here's what the Elo distribution looks like in reality:





Interesting note about team rankings: there is a second “bump” around 1250—that “bump” is what we would call the average for “top tier” teams. It is where Elo tends to gravitate for teams who are good enough that they consistently beat most teams but not good enough to consistently beat each other.

Back on point, is there compression here? Yes, a very small amount. Does this mean that 6.83 is dominated by chance and all pro teams that aren’t (insert your favorite team here) are skill-less hacks not suited to stock water fridges at League of Legends tournaments?

Not really. Notice that this compression has been happening over time; 6.81 is more compressed than 6.79.

This trend keeps across every patch

back to 2012. There's another explanation for why teams might group toward the average.

Because of how minimal, gradual, and long-lasting this process has been,

the compression is more likely due to the fact that teams are becoming more competitive.

Dota’s tremendous growth creates more skillful players and bigger prize pools; all of this means that compression will naturally occur over time as fewer teams are able to dominate the entire world across entire patches.

This explanation is also consistent with the fact that Elo compression is much more notable in the "tier 1 bump" than near the average. If teams were winning by a factor other than skill, in most cases we would expect compression to pull all teams toward the overall average.



We can check this hypothesis by looking at the standard deviation of the ratings: if this number is large, there is a wide spread of rankings; if it is smaller, rankings are more compressed.



Standard deviation has been slowly dropping since 2013, but this is the first era we’ve heard widespread complaints that chance plays too much of a role. The standard deviation is basically the same going all the way back through 6.81.

The compression, where teams are less likely to dominate each other in the rankings, is probably in small part due to design across the last six patches (at least). The game was made more open, more heroes were made viable, and early-game execution was made less decisive. To a VERY small extent, this has led to a more random distribution of our winners and losers…

but this patch is not unique in creating more opportunity than previous patches.

Good teams are still winning enough to create the same range of scores as on 6.81 and the same spread of points.

So with regard to distribution, we could only say that good teams are losing to bad teams more consistently if good teams were improving while bad teams stayed at the same skill level, yet the spread of scores remained basically unchanged. That isn't very likely, even if top players may think it is the case.

Winning Against the Odds



We can check this hypothesis against another figure: the percentage of games which are taken by an underdog. Like Elo compression, this has been steadily on the rise throughout the last several patches; also like compression, this growth has been decelerating slowly over time. This figure is valid to look at because we know the Elo distribution hasn't changed its course (if the distribution had been sufficiently compressed, then there would be fewer underdog matches in the first place since rankings wouldn't be accurate).
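As a sketch of how such a figure is computed (using made-up records, since I'm not reproducing the underlying dataset here): for each game, take the Elo gap between the teams, bucket it, and measure the underdog's win rate per bucket. In the real analysis the buckets are standard deviations of the gap distribution:

```python
# Hypothetical game records: (favorite_elo, underdog_elo, underdog_won).
games = [
    (1250, 1100, False), (1300, 1000, True), (1150, 1120, True),
    (1400, 1050, False), (1210, 1190, False), (1350, 1150, True),
]

def underdog_rate(games, min_gap=0, max_gap=float("inf")):
    """Fraction of games won by the underdog, for Elo gaps in [min_gap, max_gap)."""
    in_bucket = [won for fav, dog, won in games if min_gap <= fav - dog < max_gap]
    return sum(in_bucket) / len(in_bucket) if in_bucket else None
```

`underdog_rate(games, 0, 100)` covers near-even matchups and `underdog_rate(games, 100)` the larger advantages, so comparing buckets across patches shows where upsets actually increased.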



Interestingly, where this patch has really shown an increase in underdog victories is when there is a substantial advantage to one team. Each category below represents an additional standard deviation above the mean.



Notably, if we were expecting to see randomness truly impact games, we’d expect the largest increase to happen when there is the biggest advantage, because that is where there is the most uneven redistribution of wins.

"But teams are more likely to come back, and bad teams are more likely to fall behind!" I can already hear the collective screams of Dota fans. Not true:

lower-ranked teams were much less likely to win against a team even slightly higher ranked than them in 6.82

(the "comeback patch"), while higher-ranked teams were more likely to upset close opponents and equally likely to upset other opponents. The comeback mechanics aren't comeback mechanics at all: good teams knew them, planned around them, and used them as a standard game mechanic, counting themselves ahead even when net worth or experience implied they were behind.

Did 6.83, like 6.82, have mechanics which allowed teams to win from behind? Most definitely, and in a way relatively unique among the stages of Dota's development. Yet lower-ranked teams were the least likely to beat higher-ranked opponents (by about 4%).

There’s no reason to believe that “bad” teams were more able to win on 6.83 than on any prior patch.

However, we do have very good reason to believe that the upper echelon of professional Dota 2 teams are more equivalently skilled than ever before… which is good news for this year’s International!

Special thanks to Noxville for his continued work on team rankings and Datdota for access to the information I always use to investigate metagame. Thanks to Nahaz for the methods edit and MalystryxGDS for his continued work helping make this series readable.
