By Dan Brown
A less technical version of this post appears on Deadspin.
Earlier this month, Stoke City striker Michael Owen was quoted by the BBC as stating that diving in the Premier League is “worse than 10 years ago with the influence of foreign players coming from South America, Italy and Spain.” The argument that certain foreign players go down too easily can be heard in most pubs, but for a current player to single out particular countries as responsible for creating a culture of diving is quite striking. Sir Alex Ferguson has agreed that “there are plenty of players diving…particularly foreign players,” although Owen claims that nowadays “English players are as guilty as foreign players of doing it.” So what does the data have to say? Do foreign players, particularly those from the countries Owen highlights, go to ground too easily?
In the following, I use Opta statistics for the 2011/12 season publicly released by Manchester City, with foreign player nationalities taken from http://www.myfootballfacts.com and nationalities of players from the British Isles taken from http://www.football-lineups.com. I group together all South American, Italian and Spanish players (the nationalities Owen singles out) and test whether there is a significant difference in the number of fouls won per minute between them and players of all other nationalities. The unit of analysis in my regressions is the ‘player-game’: one observation for each game in which each outfield player appeared during the 2011/12 season. Considering only outfield players, I have a total sample of 9,603 observations.
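As a rough illustration of the ‘player-game’ layout described above, the sketch below builds a few invented rows and derives the SAIS indicator and the fouls-won-per-minute outcome. The class, field names, country list and numbers are all assumptions made for illustration; the post does not list exactly which nationalities were coded as South American, and the actual Opta column names are not shown here.

```python
from dataclasses import dataclass

# Assumed country list: the ten CONMEBOL nations stand in for "South America".
SOUTH_AMERICA = {
    "Argentina", "Bolivia", "Brazil", "Chile", "Colombia",
    "Ecuador", "Paraguay", "Peru", "Uruguay", "Venezuela",
}
SAIS_COUNTRIES = SOUTH_AMERICA | {"Italy", "Spain"}


@dataclass
class PlayerGame:
    """One observation: a single outfield player in a single game."""
    player: str
    nationality: str
    minutes: int
    fouls_won: int

    @property
    def sais(self) -> int:
        # 1 if the player is from South America, Italy or Spain, else 0.
        return int(self.nationality in SAIS_COUNTRIES)

    @property
    def fouls_won_per_minute(self) -> float:
        # The dependent variable: fouls won divided by minutes played.
        return self.fouls_won / self.minutes


# Invented rows, purely to show the shape of the data.
rows = [
    PlayerGame("Player A", "Argentina", 90, 2),
    PlayerGame("Player A", "Argentina", 45, 0),
    PlayerGame("Player B", "England", 90, 1),
]

for r in rows:
    print(r.player, r.sais, round(r.fouls_won_per_minute, 4))
```

Note that the same player contributes several rows, one per game, which is why the standard errors later need to account for within-player correlation.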
Given that some players may just be genuinely fouled more than others, I try to control for the number of situations in which a player is likely to be fouled. To do this, I control for the number of touches, duels won and duels lost (a duel is defined as a ‘50-50 contest between two players of opposing sides’), and tackles won and lost by a player in a game. In my basic regression I also control for a player’s position. I run simple linear regressions using OLS. Since the disturbance term is likely to be correlated across a given player’s observations in different games, not least because some independent variables (such as position) are the same across all of that player’s games, I cluster standard errors at the player level.
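To make the idea of clustering at the player level concrete, here is a deliberately small sketch in plain Python. It is not the regression used in this post: it has only an intercept and a single regressor (a stand-in for the SAIS dummy), no other controls, and it omits the finite-sample correction that statistical packages usually apply. The function name and the data are invented for illustration.

```python
import math
from collections import defaultdict


def ols_one_regressor_clustered(y, d, cluster):
    """OLS of y on an intercept and d, with a cluster-robust SE for the slope.

    Uses the sandwich form (X'X)^-1 [sum_g s_g s_g'] (X'X)^-1, where s_g is
    the sum of x_i * residual_i over the observations in cluster g.
    No small-sample adjustment is applied.
    """
    n = len(y)
    s01 = sum(d)
    s11 = sum(v * v for v in d)
    det = n * s11 - s01 * s01
    # Inverse of the 2x2 matrix X'X = [[n, s01], [s01, s11]].
    inv = [[s11 / det, -s01 / det], [-s01 / det, n / det]]
    xty = [sum(y), sum(di * yi for di, yi in zip(d, y))]
    intercept = inv[0][0] * xty[0] + inv[0][1] * xty[1]
    slope = inv[1][0] * xty[0] + inv[1][1] * xty[1]

    # Accumulate the score vector within each cluster (here: each player).
    scores = defaultdict(lambda: [0.0, 0.0])
    for yi, di, g in zip(y, d, cluster):
        resid = yi - intercept - slope * di
        scores[g][0] += resid
        scores[g][1] += di * resid

    # "Meat" of the sandwich: sum over clusters of the outer product s_g s_g'.
    meat = [[0.0, 0.0], [0.0, 0.0]]
    for s in scores.values():
        for r in range(2):
            for c in range(2):
                meat[r][c] += s[r] * s[c]

    # Slope variance is the [1][1] element of inv * meat * inv.
    row = [sum(inv[1][k] * meat[k][c] for k in range(2)) for c in range(2)]
    var_slope = sum(row[k] * inv[k][1] for k in range(2))
    return intercept, slope, math.sqrt(var_slope)


# Invented data: four players, two games each; d = 1 marks the dummy group.
y = [0.02, 0.03, 0.01, 0.01, 0.02, 0.00, 0.01, 0.02]
d = [1, 1, 0, 0, 0, 0, 1, 1]
player = ["A", "A", "B", "B", "C", "C", "D", "D"]

intercept, slope, se_slope = ols_one_regressor_clustered(y, d, player)
print(intercept, slope, se_slope)
```

With a single dummy regressor the slope is simply the difference in group means; the clustering only changes the standard error, by summing residuals within each player before squaring rather than treating every player-game as independent.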
The coefficient on the ‘South America-Italy-Spain’ (SAIS) dummy variable in this first regression (1) (see table below) is positive and significant beyond the 1% level (p-value 0.002). The magnitude of the effect is substantial too. A player from South America, Italy or Spain on average wins more fouls than other players, by an amount equal to 28% of the mean number of fouls received in the sample. Since this represents 0.0036 more fouls per minute (against a sample average of 0.0129), it would take on average approximately three South American, Italian or Spanish players playing a full 90 minutes for a team to gain an extra foul per game as a result of the nationality of their players (0.0036 x 90 x 3 = 0.972).
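The arithmetic behind those two figures can be checked in a few lines. The coefficient and the sample mean are the values quoted above; the variable names are just labels chosen for this sketch.

```python
SAIS_COEF = 0.0036      # extra fouls won per minute, from regression (1)
SAMPLE_MEAN = 0.0129    # average fouls won per minute in the sample

# Effect size relative to the sample mean (roughly 28%).
relative_effect = SAIS_COEF / SAMPLE_MEAN

# Extra fouls per game for a team fielding three SAIS players for 90 minutes.
extra_fouls_per_game = SAIS_COEF * 90 * 3

print(f"relative effect: {relative_effect:.1%}")
print(f"extra fouls per game with three players: {extra_fouls_per_game:.3f}")
```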
To see if the effect is being driven by only South Americans, or only Europeans within the SAIS countries, in (2) I replace the SAIS dummy variable with a dummy variable for South Americans (SOAM), and a dummy variable for Spanish and Italian players (SPAITA). I find the effect is statistically significant and of similar magnitude for both groups.
1. It could be that SAIS players play more often in games against teams more prone to foul. In (3), in addition to the variables controlled for in (1), I include a full set of 19 dummy variables indicating the identity of the opposition. The size and statistical significance of the coefficient on SAIS remain virtually unchanged.
2. The clearest criticism of regression (1) is that it does not control for ability. By ability, here, I mean the quality of a player’s performance in a specific game. If SAIS players in the Premier League are on average better than players of other nationalities, and if better players are more prone to be fouled (as it is harder to tackle them fairly), we would expect an upward bias in the coefficient on SAIS. Because the indicators of ability differ greatly between positions, I split the full sample into two, (4a) defenders and (4b) midfielders and strikers, and in each control for a set of indicators of the ability of a player of that type in a game.
(4a) For the sample of defenders I include all variables (obviously other than position dummies) in (1), as well as: total clearances, clearances off the line, last man tackles, blocks, interceptions, errors leading to a goal, errors leading to an attempt, successful ball touches, whether the player kept a clean sheet, how often they were dispossessed, recoveries, total successful and unsuccessful passes, goals, key passes and assists. I find that the size of the coefficient on the SAIS dummy variable falls to less than a quarter of its size in (1) and (3), and is no longer statistically significant.
(4b) For the sample of midfielders and strikers I include all variables in (1) (including a position dummy for strikers), as well as: goals, key passes, assists, shots on target, total successful passes, total unsuccessful passes, successful dribbles, unsuccessful dribbles, how often they were dispossessed, through balls and successful ball touches. I observe a coefficient on SAIS very similar in size and statistical significance to (1) and (3).
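The opposition controls in (3) are ordinary indicator variables: with 20 clubs in the league, one club is left out as the baseline and each of the other 19 gets its own 0/1 column. A minimal sketch of that encoding follows; the helper name and the placeholder club names are invented, and which club serves as the baseline is an arbitrary choice here (the post does not say which was omitted).

```python
def opposition_dummies(opponent, clubs):
    """Return 0/1 indicators for every club except the omitted baseline.

    The alphabetically first club is used as the baseline (an arbitrary,
    assumed choice); a game against it is represented by all zeros.
    """
    baseline, *others = sorted(clubs)
    return {f"opp_{club}": int(club == opponent) for club in others}


# Placeholder names standing in for the 20 Premier League clubs.
clubs = [f"Club{i:02d}" for i in range(1, 21)]

row = opposition_dummies("Club07", clubs)
print(len(row), sum(row.values()))
```

Leaving one club out avoids perfect collinearity with the intercept; the coefficients on the remaining 19 are then read relative to games against the baseline club.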
Evidence for a more general domestic/foreign difference?
Finally, I wanted to understand whether there is a more general difference between domestic and foreign players in their propensity to win fouls. In (5a), I extend (4a) with two additional dummy variables: one indicating whether a player is English (ENG), and the other whether they are Scottish, Welsh or Northern Irish (SWN), i.e. British non-English. In (5b), I add the same two dummy variables to (4b). There appears to be no significant difference between English players and the base category (all nationalities other than those explicitly controlled for), for either defenders or midfielders/strikers. However, British non-English midfielders and strikers appear to receive significantly fewer fouls per minute (significant beyond the 1% level: p-value 0.002), with a magnitude not dissimilar to the gain for SAIS midfielders/strikers in (4b).
To summarise, there is some evidence that South American, Spanish and Italian midfielders and strikers tend to receive more fouls, and British non-English midfielders and strikers fewer. This might be construed as evidence that players from SAIS countries ‘go down more easily’, if the reason they receive more fouls is that they are more prone to go to ground, though that is just one plausible explanation. From the estimates in (4b), a team with three SAIS midfielders or strikers playing 90 minutes in each of the 38 games of the season (for example, if Manchester City played three of Mario Balotelli, Sergio Aguero, Carlos Tevez and David Silva for the whole of every game) would expect to receive 40 more fouls in that season than a team with no SAIS midfielders or strikers.