This Regressing entry is brought to you by our clever friends at the Harvard College Sports Analysis Collective. Today: how Premier League diving might be a cultural phenomenon.
If you spend a Saturday afternoon in an English pub, you might hear someone say that foreign players fall too easily in order to win fouls. Last month, Stoke City striker Michael Owen made the same criticism to the BBC, complaining that diving in the Premier League is "worse than ten years ago with the influence of players coming from South America, Spain and Italy."
Owen isn't alone in his sentiments. Manchester United manager Sir Alex Ferguson has agreed that "there are plenty of players diving…particularly foreign players." The argument crosses sports and continents: European players in the NBA are often, and falsely, blamed for its flopping epidemic.
Are Owen and the others right? They are suggesting that diving is a cultural phenomenon. Until recently, there was no way to address the question with any kind of rigor. Pub arguments are influenced not only by Guinness but by confirmation bias too: If you already believe that international players dive more often, you may interpret every dive by a foreign player as proof of those beliefs, ignoring the times when Ashley Young and his British colleagues do exactly the same.
That all changed this season when Opta (a football analytics company) and Manchester City announced that they would be releasing Opta's database for public consumption and analysis. They cited Bill James and Moneyball as their inspiration for bringing American "sabermetrics" across the pond. Now we can use data to evaluate Owen's claim.
I used Opta statistics covering every game in the 2011-2012 season. If South American, Italian, and Spanish players (from here on, "SAIS" players) really do fall more easily than players of other nationalities, we might expect to see them win more fouls per minute.
But that alone would not be enough to answer the question. It's possible that certain players are simply fouled more than others. If it happens to be that SAIS players are fouled more, for whatever reason, then the fact that they receive more fouls per minute doesn't demonstrate they go to ground more easily.
To deal with this, I use a statistical technique called linear regression. Using regression, I control for the number of situations where a player is likely to be fouled in each game by finding indicators in the Opta data. Specifically, I control for the number of tackles a player wins and loses, the number of touches they have on the ball, the number of "duels" they win and lose (where a duel is a "50-50 contest between two players of opposing sides"), and their position.
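To make the idea of "controlling for" concrete, here is a minimal sketch in Python. It is not my actual analysis and it does not use Opta data: the numbers are simulated, the variable names are hypothetical, and only two of the controls (touches and duels) are included. The simulation assumes SAIS players see more of the ball, so the raw gap in fouls won overstates the nationality effect until touches are accounted for.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000  # simulated player-games, NOT real Opta data

# Hypothetical inputs: 1.0 if the player is from a SAIS country, else 0.0.
sais = rng.integers(0, 2, size=n).astype(float)
# Assumption for illustration: SAIS players get about 5 more touches per game.
touches = rng.normal(50, 10, size=n) + 5 * sais
duels = rng.normal(10, 3, size=n)

# The "true" nationality effect built into the simulation.
TRUE_EFFECT = 0.3
fouls_won = (0.02 * touches + 0.05 * duels + TRUE_EFFECT * sais
             + rng.normal(0, 0.5, size=n))


def ols(y, *columns):
    """Least-squares coefficients for y ~ 1 + columns (intercept first)."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Raw comparison: mixes the nationality effect with the extra touches.
naive_gap = fouls_won[sais == 1].mean() - fouls_won[sais == 0].mean()
# Regression: the coefficient on `sais` holds touches and duels fixed.
adjusted_gap = ols(fouls_won, sais, touches, duels)[1]

print(f"raw gap: {naive_gap:.2f}, gap after controls: {adjusted_gap:.2f}")
```

In this toy setup the raw gap comes out larger than the adjusted one, because part of it is really a "more touches" effect. The same logic is why the real regression includes tackles, touches, duels, and position.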
My regression found that SAIS players do win more fouls per minute. On average, a player from South America, Italy or Spain receives 28 percent more fouls than players of other nationalities. A team with three extra players from those countries would receive about one extra foul per game. If there were really no difference between the groups, random variation alone would produce a gap this large less than one percent of the time.
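One way to build intuition for that significance claim is a permutation test, sketched below. This is a simpler stand-in, not the test my regression used, and it ignores the controls. The data are simulated: the group sizes, the baseline of 1.0 fouls per 90 minutes, and the noise level are all assumptions, chosen so that a gap of 0.28 corresponds to 28 percent.

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated players (NOT real data): 200 SAIS, 800 others.
sais = np.array([1] * 200 + [0] * 800)
# Assumed baseline of 1.0 fouls won per 90 minutes, plus 0.28 for SAIS players.
fouls_per_90 = rng.normal(1.0, 0.5, size=sais.size) + 0.28 * sais


def mean_gap(values, group):
    """Difference in average fouls won: SAIS minus everyone else."""
    return values[group == 1].mean() - values[group == 0].mean()


observed = mean_gap(fouls_per_90, sais)

# Shuffle the nationality labels many times. If nationality did not matter,
# the shuffled gaps show how big a gap chance alone tends to produce.
n_perm = 2000
perm_gaps = np.array([
    mean_gap(fouls_per_90, rng.permutation(sais)) for _ in range(n_perm)
])

# Share of shuffles with a gap at least as extreme as the observed one.
p_value = (np.sum(np.abs(perm_gaps) >= abs(observed)) + 1) / (n_perm + 1)
print(f"observed gap: {observed:.2f}, permutation p-value: {p_value:.4f}")
```

A small p-value here means that shuffled labels almost never reproduce a gap as large as the observed one, which is the sense in which a result is unlikely to be down to chance.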
There could exist alternative explanations for these results. Perhaps, by pure chance, SAIS players happened to play more often in games against teams that are more prone to fouling. But I ran the numbers, controlling for the team a player plays against, and the results still stood.
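Controlling for the opponent can be sketched by giving each opposing team its own indicator column in the regression. Again, this is an illustration on simulated data with hypothetical names, not my actual specification. The simulation assumes five opponents of increasing roughness and assumes SAIS players happen to face the rougher ones more often.

```python
import numpy as np

rng = np.random.default_rng(1)
n, n_opponents = 3000, 5  # simulated player-games, NOT real Opta data

opponent = rng.integers(0, n_opponents, size=n)
# Assumed extra fouls conceded by each opponent (team 4 is the roughest).
opponent_roughness = np.array([0.0, 0.2, 0.4, 0.6, 0.8])
# Assumption: the rougher the opponent, the likelier the player is SAIS.
sais = (rng.random(n) < 0.2 + 0.15 * opponent).astype(float)

TRUE_EFFECT = 0.3
fouls_won = (1.0 + opponent_roughness[opponent] + TRUE_EFFECT * sais
             + rng.normal(0, 0.5, size=n))

ones = np.ones(n)

# Without opponent controls, roughness leaks into the SAIS coefficient.
X_naive = np.column_stack([ones, sais])
naive_coef = np.linalg.lstsq(X_naive, fouls_won, rcond=None)[0][1]

# One indicator column per opponent, dropping team 0 as the reference
# category so the columns are not collinear with the intercept.
dummies = (opponent[:, None] == np.arange(1, n_opponents)).astype(float)
X_controlled = np.column_stack([ones, sais, dummies])
controlled_coef = np.linalg.lstsq(X_controlled, fouls_won, rcond=None)[0][1]

print(f"without opponent controls: {naive_coef:.2f}")
print(f"with opponent controls:    {controlled_coef:.2f}")
```

If the SAIS coefficient barely moves once the opponent columns are added, as happened in my real analysis, the schedule is not what is driving the result.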
Another alternative explanation I have not ruled out is ability. It might just be that South American, Italian and Spanish players are better than players of other nationalities, and that better players receive more fouls (it is harder to tackle a good player fairly without fouling them). To try to deal with this, I added controls for a wide set of indicators that measure a player's ability. Since the indicators of a defender's ability differ from those of a midfielder or striker, I control for different indicators in each case. For both groups, I control for goals, assists, key passes, unsuccessful and successful passes, and dispossessions. For the defenders, I additionally control for indicators such as clearances and blocks. For strikers: shots on target, dribbles, and through balls.
After adjusting for ability, SAIS defenders no longer draw more fouls than defenders of other nationalities. But the earlier results still stand for midfielders and strikers. South American, Italian, and Spanish midfielders and strikers appear to receive significantly more fouls per minute than players of other nationalities even after controlling for a wide set of indicators of ability.
This adds up over the course of a season. Suppose a team had three SAIS midfielders/strikers play for all 90 minutes of all 38 games (in Manchester City's case, three from Mario Balotelli, Sergio Aguero, Carlos Tevez and David Silva). My results suggest you could expect them to win about 40 more fouls per season than their counterparts at a club like Stoke City, which has no midfielders or strikers of these nationalities. That might partly explain why Owen's manager, Tony Pulis, has been the loudest in demanding bans for players found guilty of diving.
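To put the season-long figure on a per-game scale, the arithmetic can be unpacked as follows. The inputs are the ones quoted above (three players, 38 games, 40 extra fouls); the variable names are just for illustration.

```python
players = 3
games = 38
extra_fouls_per_season = 40  # figure quoted in the text above

# Three players each completing all 38 matches is 114 full appearances.
player_matches = players * games

# Extra fouls won per player per 90 minutes (roughly a third of a foul).
extra_per_player_match = extra_fouls_per_season / player_matches

# Extra fouls won per game by the three players combined (roughly one).
extra_per_team_game = extra_fouls_per_season / games

print(f"{extra_per_player_match:.2f} extra fouls per player per 90 minutes")
print(f"{extra_per_team_game:.2f} extra fouls per game for the trio")
```

So the headline number amounts to about one extra free kick per match, shared among three players.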
I based my study on Owen's comments, which separated SAIS players instead of making a point about international players in general. Owen might have been on to something: I couldn't find evidence of a significant difference between English and non-English players. However, Scottish, Welsh, and Northern Irish players receive significantly fewer fouls per minute, suggesting players from these countries may be more prone to stay on their feet.
There are many possible limitations to my study. First, it considers only a single season; a completely different pattern may exist in other seasons. Second, it may be that referees are, for some reason, more likely to award fouls in favor of players from the SAIS countries. There is also the simple chance that my methodology could be improved upon. Most obviously, the set of ability indicators I have included may not be sufficient to fully control for ability. For now, with the data currently available, Owen's claims do not appear inconceivable.