Who gets credit for a goal? This is not, I think, a simple question. The guy who took the shot surely deserves some credit. And the guy who passed it to him. That's easy. We have goal and assist numbers for all that. But what about the fullback whose clever cross-field ball opened up the defense and started the action that led to that pass and shot? How about the striker who never touched the ball but whose intelligent run to the left of the box dragged the opposing center back out of position and created space for the eventual goal-scorer? And further, what about the midfielder whose tireless work forced the opposing manager to bring in another central player, opening up the wide area into which that cross-field ball was sent?
The joy of football is that it is a relentlessly dynamic game. You can never fully separate out discrete actions. Instead, every action on the pitch has knock-on effects twenty, thirty, even one hundred yards away. Any goal, or goal-scoring opportunity, is a result of a huge array of actions. Some of these are categorized statistically, but even Opta misses most of them. So what is a statistician to do?
The answer that's always made the most sense to me—besides getting your head out of your spreadsheets, moving out of your grandma's basement and watching a game some time you nerd—is plus-minus. The method behind plus-minus is simple. Because we don't know who deserves credit or blame within a match, we give the credit equally to everyone. You log who's on the pitch, you track the basic stats. How well did the team play when Kyle Walker was on the pitch? How about Tommy Carroll?
So what I've done is take the minute-by-minute database and code every player on the pitch for Tottenham Hotspur. I look at shots in the box on target and shots outside of the box on target to create an expected goals scored and expected goals allowed number. I compare these numbers to the average goals scored and goals allowed by Spurs' opponent, adjusted for home or road games. That gives me simple plus-minus.
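For the curious, here is a rough Python sketch of that calculation. It is an illustration under assumptions, not the actual spreadsheet: the `Segment` record layout and the two per-shot conversion rates (`XG_IN_BOX`, `XG_OUT_BOX`) are made-up placeholders. It also assumes that attack is measured against what the opponent usually allows and defense against what the opponent usually scores.

```python
from dataclasses import dataclass

# Hypothetical chance that a shot on target becomes a goal.
# Placeholder values for illustration, not the rates used in this article.
XG_IN_BOX = 0.30
XG_OUT_BOX = 0.10


@dataclass
class Segment:
    """A stretch of a match with a fixed set of Spurs players on the pitch."""
    players: frozenset
    minutes: float
    sot_in_box_for: int
    sot_out_box_for: int
    sot_in_box_against: int
    sot_out_box_against: int
    opp_avg_allowed_per90: float  # opponent's usual goals allowed, home/road adjusted
    opp_avg_scored_per90: float   # opponent's usual goals scored, home/road adjusted


def expected_goals(in_box, out_box):
    """Turn shots on target into expected goals using the placeholder rates."""
    return in_box * XG_IN_BOX + out_box * XG_OUT_BOX


def simple_plus_minus(segments, player):
    """Return (xG+, xG-) per 90 minutes for the time `player` was on the pitch."""
    minutes = xg_for = xg_against = base_for = base_against = 0.0
    for s in segments:
        if player not in s.players:
            continue
        minutes += s.minutes
        xg_for += expected_goals(s.sot_in_box_for, s.sot_out_box_for)
        xg_against += expected_goals(s.sot_in_box_against, s.sot_out_box_against)
        # Scale the opponent's per-90 averages to the length of this segment.
        base_for += s.opp_avg_allowed_per90 * s.minutes / 90
        base_against += s.opp_avg_scored_per90 * s.minutes / 90
    if minutes == 0:
        return None  # player never appeared, so there is nothing to rate
    per90 = 90 / minutes
    return (xg_for - base_for) * per90, (xg_against - base_against) * per90
```

Every player in a segment gets the same credit or blame for what happened during it, which is the whole idea of plus-minus.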
As I'll explain, there are quite a few problems with using simple plus-minus. The data should not be taken at face value. Indeed, the problems in the data are so significant that for a bunch of important Spurs players, I don't have plus-minus numbers that I think are particularly useful one way or the other. But I think the best way to explain the inherent problems in simple plus-minus is in reference to the plus-minus numbers. So here they are. The numbers are expected goals plus or minus per 90 minutes.
A good xG+ number is positive, a good xG- number is negative. So per minute, the best plus-minus figure on the club is, yup, Tom Carroll's. (The sample is too small to really be meaningful, but come on, that's fun.) You can also see here the humongous effects of the shift from Brad Friedel to Hugo Lloris in goal. No two players at the same position show such a wide disparity in plus-minus.
There are two basic problems with plus-minus statistics. Their technical names are collinearity and sampling. I'm going to try to explain them here using examples from the above table.
Nerdery I: Gareth Bale, Tom Carroll and Sampling
As you can see, Tom Carroll has the best plus-minus numbers per 90 minutes of any player with 100 or more minutes played. On the other hand, Gareth Bale's plus-minus stats are roughly the same as the club average. Roughly team average doesn't seem all that good for a player supposed to be worth £80m or whatever. And if a statistic can't pick up the importance of Gareth Bale to Tottenham Hotspur, what good is it anyway?
The problem here is one of sampling. Obviously we have only a tiny sample of Tom Carroll's time on the pitch, and the club's strong performance could have been caused by any number of factors other than the profound psychic effect Carroll's mere presence has on all the people nearby. You can't gauge much on 100 minutes. We have a great sample of Gareth Bale's performance on the pitch, but a plus-minus statistic also requires a sample of the club's performance when he was off the pitch, and here the other side of the sampling issue arises. Bale missed only six games, and Walker and Vertonghen missed even fewer. There is too much variance in team performance in a sample of just a handful of games to draw any conclusions here.
This is a problem that I didn't exactly solve. There's no way to make a bad sample into a good sample. To even out any extreme effects, I have employed a method of regularized regression, regressing plus-minus numbers to the team mean based on the size of the sample involved. This should prevent really crazy numbers, but it won't help us identify the value of guys like Vertonghen, Walker, and Bale. I think I'm a bit stuck here.
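One common way to regress a rating toward the mean based on sample size looks like the Python sketch below. Treat it as an assumption about the general technique rather than the exact formula used here: `prior_minutes` is a hypothetical tuning knob that sets how many minutes a player needs before his own number counts as much as the team average.

```python
def regress_to_mean(player_rate, sample_minutes, team_rate, prior_minutes=900.0):
    """Blend a player's per-90 rating with the team mean.

    The smaller the sample, the closer the result sits to `team_rate`.
    `prior_minutes` is a hypothetical constant, not a value from the article.
    """
    weight = sample_minutes / (sample_minutes + prior_minutes)
    return weight * player_rate + (1 - weight) * team_rate
```

With these placeholder settings, a player with 100 minutes keeps only a tenth of his raw difference from the team mean. For someone like Bale, the limiting sample would be his minutes off the pitch rather than on it.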
Nerdery II: Sandro, Scott Parker and Collinearity
I'm guessing a lot of eyes went straight for Scott Parker's name on the list, and you saw his significantly above average defense rating of -.37. You might even have compared this rating to Sandro's pretty blah -.25. What's up with that?
The issue is collinearity. The biggest effect the simple plus-minus table shows is the effect of the keeper switch. With Brad Friedel in goal, Spurs were a solid attacking team and no more than an average defensive side. The switch to Hugo Lloris only mildly improved the attack, but the club's defense shot up to being one of the best in the league. This fits perfectly with the observations of basically everyone. A sweeper keeper like Lloris is integral to defending the high line that Spurs manager Andre Villas-Boas prefers, and there may be no keeper in the world better at sweeping up behind the defense than Lloris. So what is collinearity? Basically it refers to the many cross-correlations between different players' plus-minus numbers. The more you played with Lloris in goal, the better your defensive numbers, and vice versa. Scott Parker and Michael Dawson both played almost exclusively with Lloris, while Sandro and William Gallas played much more with Friedel. Collinearity prevents us from determining if the underlying cause of a player's plus-minus is his own performance or the performance of the players with whom he happened to share time on the pitch.
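A quick way to see this kind of overlap is to check what share of a player's minutes came alongside a given teammate. The Python sketch below assumes a hypothetical record format, a list of `(players_on_pitch, minutes)` pairs, which is not necessarily how the minute-by-minute database is laid out.

```python
def share_of_minutes_with(segments, player, teammate):
    """Fraction of `player`'s minutes that were played alongside `teammate`.

    `segments` is a list of (set_of_players, minutes) pairs; this layout is
    an assumption made for illustration.
    """
    on_pitch = sum(m for players, m in segments if player in players)
    together = sum(
        m for players, m in segments if player in players and teammate in players
    )
    if on_pitch == 0:
        return None  # player never appeared
    return together / on_pitch
```

If one player's share with Lloris is close to 1 and another's is close to 0, their raw defensive numbers are mostly telling you about the keeper.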
For this problem, I have devised a solution, or at least I have improvised a way around it. The first thing I did was to adjust every player's plus-minus based on the plus-minus numbers of their teammates, weighted by how many minutes they spent on the pitch with every other player, but this did not solve the collinearity problem related to the keepers. Too many players had collinear relations in their playing time, so one blanket adjustment didn't get rid of this effect. So I tried sort of a hack. I gave each player two plus-minus ratings, one with Friedel in goal and one with Lloris in goal. I regressed both of these ratings to the mean, as described above, and took a weighted average based on playing time. This seems to have mostly worked.
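The keeper-split hack can be sketched in Python roughly as follows. This is one reading of the steps described, not the exact calculation. It assumes each split rating is taken relative to the team's number with that keeper in goal, and `prior_minutes` is a hypothetical shrinkage constant.

```python
def shrink(relative_rating, minutes, prior_minutes=900.0):
    """Pull a team-relative rating toward 0 (the team mean) when minutes are few."""
    return relative_rating * minutes / (minutes + prior_minutes)


def keeper_split_rating(splits, prior_minutes=900.0):
    """Combine per-keeper ratings into one number.

    `splits` holds one (player_rate, team_rate, minutes) tuple per keeper,
    where both rates are per 90 with that keeper in goal. Each split is
    regressed toward the team mean, then averaged by playing time.
    """
    total_minutes = sum(m for _, _, m in splits)
    if total_minutes == 0:
        return None
    weighted = sum(shrink(p - t, m, prior_minutes) * m for p, t, m in splits)
    return weighted / total_minutes
```

Because each rating is measured against the team's number under the same keeper, a defender no longer gets credit just for having Lloris behind him.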
Bill James, the great baseball writer, once argued that there's a certain aesthetic quality to a working statistic. It should present you with information that looks reasonable, with ratings that mostly fit your expectations. If you know the sport, you know who's good and who's bad. If the statistic gives you ratings that look unlike anything you've ever seen before, it's more likely the problem is with the statistic than with you. But if the statistic does nothing but confirm your prejudices, then probably you haven't created anything interesting. I'm pretty happy with the results, on this account. There are a bunch I think aren't worth much—the sampling problem players above—and a few I'm a bit skeptical of for reasons I'll describe below—Dawson and Gallas particularly. But I am at least intrigued by how much this stat likes Kyle Naughton and Steven Caulker, how much it dislikes Benoit Assou-Ekotto, Clint Dempsey and Gylfi Sigurdsson, and how totally it worships at the feet of Moussa Dembele. I think there might be something here.
One other problem of collinearity before I get to the table. Adjusted plus-minus is affected not only by the players with whom you play, but also by the players by whom you're replaced. If a club has a really awful regular sub, his terrible performances will make the guys he replaces look better. If a club is very deep at a position, then the various good players who replace each other won't get credit for playing well because their replacements are equally good. I think that we're seeing a bit of this issue at central midfield, where some of Dembele's excellent numbers have to do with how poor Huddlestone and Livermore were. I think there might be a similar issue in central defense, where Vertonghen's numbers (in the small sample of his time off the pitch) are muted by the quality of replacements like Steven Caulker and Michael Dawson. Ultimately, that's not a problem I can solve when looking at just one club, so I haven't tried.
Adjusted Plus-Minus Table
These ratings, then, should be understood as relative to the team and relative to the quality of replacements rather than relative to the whole league. A below average rating doesn't mean the player is below some theoretical league average, it means his numbers are below the team average for the season.
This means the numbers in the adjusted plus-minus table are quite small. A player who is .1 G / 90 min better than his teammates has very strong plus-minus numbers.
So what have we got here? First, there are the numbers I don't trust. I haven't listed Lloris and Friedel because they are the basis for the split I'm using in the first place. I think the raw plus-minus numbers tell the story of the keepers reasonably well. Second, sampling makes it very hard to say much of anything about Vertonghen, Bale and Walker. It does appear that Aaron Lennon had a moderately positive effect on the attack balanced by some weaknesses in defense. Finally, I also don't think that the numbers for William Gallas and Michael Dawson are worth much of anything, because Gallas played literally every minute of the club's defensively poor first twelve matches, while Dawson played only in the defensively strong second half. So the split season that I've used to clear out the collinearity effects makes it impossible to truly evaluate them. I think it is likely that Dawson and Gallas both deserve some of the credit (and blame) that went to Lloris and Friedel, as players who were responsible for the club's defensive struggles and defensive turnaround.
| Tier | Players |
|---|---|
| The Best | Moussa Dembele, Sandro, Hugo Lloris |
| The Good | Steven Caulker, Tom Carroll, Kyle Naughton, (Michael Dawson) |
| The OK | Aaron Lennon, Emmanuel Adebayor, Jermaine Defoe, Lewis Holtby |
| The Not Good | Scott Parker |
| The Bad | Clint Dempsey, Jake Livermore, Gylfi Sigurdsson, (William Gallas) |
| The Worst | Tom Huddlestone, Benoit Assou-Ekotto, Brad Friedel |
| The Insufficient Data | Kyle Walker, Gareth Bale, Jan Vertonghen, Tom Carroll (if I'm being honest) |
I've summarized the numbers in the handy text table above. Assou-Ekotto, love him to death, but the club just played better football when he wasn't around. Steven Caulker's numbers are very impressive, while Kyle Naughton's (a significant loss in attack more than balanced by strong defensive results) make quite a bit of sense. One of the main lessons I've drawn from this is that Steven Caulker deserves some real opportunities, and I would not be surprised at all to see Benny sold.
I think these numbers also might help us understand the pursuit of winger Nacer Chadli. If Clint Dempsey isn't in the club's plans as a wide forward on the left, then it makes all the sense in the world to add one more good piece there. I think it's reasonable to hope that Gylfi Sigurdsson, who is still very young and was clearly excellent for Swansea a year ago, can settle in to help the club. But Dempsey, at his age, I worry about committing to him for another season in the hopes that these numbers were fluky.
The plus-minus analysis here has focused on individual players. The same methodology can work also for pairs or groups of players. I am planning on doing the analysis and writing for that in the next week or so. If there are pairs or groups of players you'd like to see plus-minus numbers for their time together on the pitch, let me know in the comments. I will try to do as many as I can next time around.