There was a big piece by Michael Bertin about expected goals on Deadspin's "Regressing" blog. This is a pretty major place for people to be talking about ideas from the world of soccer analytics. The article was critical of expected goals, and it was critical in particular of my work and the methodology I published on this website. I figured this would be a good chance to walk through some issues in the field with expected goals.
1) What is expected goals (xG)?
It's a method for estimating the quality of chances that a football team creates or concedes in a match. This is the thing I like a lot about expected goals. It may take a lot of data crunching to create specific xG values, but the underlying idea makes football sense.
How many good chances did a team create? How many half-chances? Just how "good" were they? How many good chances did they concede, and so on? These are intuitive football questions. When you're following a match, you're watching for the creation of chances, getting excited when it appears a scoring chance might be conjured for your team or getting worried when the other team is building one. We all watch for something like "expected goals." Managers and players create tactics aimed at creating good chances and preventing them for their opponents.
Right now my model evaluates shot attempts across a variety of axes—where was the shot attempted from? What sort of pass assisted the shot? With what body part was the shot taken? Did the attacker dribble past his defender before trying the shot? How fast was the attacking move that led to the shot? Was the shot off a rebound or from a set play? All of these factors clearly influence the likelihood of scoring a goal. By aggregating this information into a model, I can estimate the likelihood of scoring different shooting chances in a match or over a season.
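As a rough illustration of the aggregation step only: once each shot has an estimated scoring probability, a match or season total is just the sum of those probabilities. The sketch below assumes a hypothetical `Shot` record and invented per-shot values. It is not my actual model, which derives those values from the factors listed above.

```python
from dataclasses import dataclass


@dataclass
class Shot:
    team: str
    xg: float  # hypothetical per-shot scoring probability from some model


def team_xg(shots):
    """Sum per-shot probabilities into an expected-goals total per team."""
    totals = {}
    for shot in shots:
        totals[shot.team] = totals.get(shot.team, 0.0) + shot.xg
    return totals


# Invented example values, purely for illustration.
match_shots = [
    Shot("Home", 0.35),
    Shot("Home", 0.05),
    Shot("Away", 0.10),
    Shot("Home", 0.08),
    Shot("Away", 0.02),
]
totals = team_xg(match_shots)
```

Here the home side's three chances add up to 0.48 expected goals and the away side's two add up to 0.12, even though the shot counts alone (3 to 2) look much closer.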
I like my model, and I think the results it spits out usually look pretty good. I've shown that xG outperforms other shot-based systems in a variety of ways, and studies by Sander of 11tegen11 have demonstrated similarly the superiority of xG. In the wake of the article, Colin Trainor ran a good study showing how xG outperforms simple shot models in a new way. (Some "shot-plus" models not based on xG have also done well, such as James Grayson's team rating.)
Further, expected goals can be useful for descriptive analyses that exceed the capabilities of the simpler shot methods. It can help break down, by type or by value, the kinds of chances teams create: how many come from fast attacks, how many from crosses, and to what degree a team just peppers shots at goal rather than carefully picking its opportunities to shoot. Good recent analytics writing like Trainor's article on Borussia Dortmund and Mike Goodman's recent piece on Manchester City, as well as a lot of my own work, depends on these descriptive capacities of expected goals.
But of course the current incarnation of expected goals falls short of that ideal of beautifully and clearly describing the quality of every chance and its likelihood of leading to a goal. The model does not perfectly evaluate the quality of chances, and it does not account for chances that do not lead to shots. So this is all still very much a work in progress.
But the underlying concept of expected goals, to me, remains sound. It's the intuitive, common-football-sense quality of expected goals that makes me want to keep working within its confines.
2) Ok. So what did Bertin argue about expected goals?
Bertin must have done a lot of data acquisition and data crunching in putting this article together, and I can vouch that that's one heck of a lot of work. He created his own expected goals system based on location, game state, shot type (header/footed) and a few other factors. That's cool.
In the article, he uses this data set to test how well expected goals works, using r-squared. He writes:
I started with the shots from two seasons' worth of games in England, Germany, and Spain (sorry Serie A, nothing personal) and made a very basic model using just two variables: the distance from the shot to the center of the goal (technically the log of the distance), and the visible angle of goal (the one formed from each of the goalposts to the shooter). After removing penalties and own-goals, it's about 50,000 shots.
Running that regression gives you an r-squared of .121. That's solidly in the poor end of the scale*. From a prediction standpoint, it's nothing to get super jazzed about. The coefficients themselves are really significant. In other words, the probability you make a shot absolutely does decrease as you move away from the goal—no shit, right, but at least the math agrees with what seems obvious—it's just there is a ton of deviance that we're not explaining any better than just drawing straight line through an average number of goals scored.
He then built an improved expected goals model that included game state and an adjustment for headers, among other things. These additions improved the r-squared up to about 0.165.
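For anyone who wants to see what the two variables in Bertin's basic model look like, here is a minimal sketch of how they could be computed from shot coordinates. The coordinate convention (`x` yards out from the goal line, `y` yards sideways from the center of the goal) and the function name are my assumptions, not anything taken from his code.

```python
import math

GOAL_WIDTH_YARDS = 8.0  # regulation goal width (7.32 m)


def shot_features(x, y):
    """Return (log distance, visible angle in radians) for a shot.

    x: yards from the goal line (assumed convention)
    y: lateral yards from the center of the goal (assumed convention)
    The shot must not be taken from the exact center of the goal line,
    where the distance is zero and the log is undefined.
    """
    half = GOAL_WIDTH_YARDS / 2
    distance = math.hypot(x, y)
    # Angle formed at the shooter by the lines to each goalpost.
    angle = abs(math.atan2(y + half, x) - math.atan2(y - half, x))
    return math.log(distance), angle


penalty_spot = shot_features(12, 0)   # central, 12 yards out
wide_shot = shot_features(12, 10)     # same depth, 10 yards off center
```

A shot from a wide position sees a much narrower slice of the goal than a central shot from the same depth, which is the sort of thing the angle variable captures and raw distance does not.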
As Bertin acknowledges in his footnotes, the type of logistic regression he did here rarely produces very high r-squared numbers. Predicting either 0 ("no goal") or 1 ("goal") from a fractional number between 0 and 1 is hard. Further, we know that sometimes a mediocre Andros Townsend cross will end up in the back of the net and sometimes Roberto Soldado will smash a sitter into row Z from four yards out. It's football: sometimes a 1% shot is scored and sometimes an 80% shot is missed horrifically. We should expect a lot of these "errors" no matter how good the model is.
So Bertin says these are low numbers, but are they meaningfully low? What are they low in comparison to? Does a simple shot model which uses 0.095 as the value for every shot produce a similar or better r-squared, for instance? What should be expected from a model within this framework? I don't see in the piece any specific argument explaining why this r-squared is meaningful. If this is the primary critique of expected goals, I'm looking for an explanation of why these two particular r-squared values suffice as a critique.
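To make the "low compared to what?" question concrete, here is a small simulation sketch. Everything in it is invented: the mix of shot probabilities is made up (chosen so the average is 0.095), and the r-squared used is the plain 1 - SSE/SST version, which may not be the variant Bertin computed. The point is only that a model which knows the true probability of every single shot still scores poorly on this kind of measure.

```python
import random


def r_squared(outcomes, predictions):
    """Plain 1 - SSE/SST r-squared on 0/1 outcomes."""
    mean_y = sum(outcomes) / len(outcomes)
    sse = sum((y - p) ** 2 for y, p in zip(outcomes, predictions))
    sst = sum((y - mean_y) ** 2 for y in outcomes)
    return 1 - sse / sst


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Invented mix of true scoring probabilities, averaging 0.095 per shot.
true_p = [rng.choice([0.02, 0.05, 0.08, 0.23]) for _ in range(50000)]
goals = [1 if rng.random() < p else 0 for p in true_p]

# A "perfect" model that knows each shot's true probability.
r2_perfect = r_squared(goals, true_p)

# A flat model that values every shot at 0.095.
r2_flat = r_squared(goals, [0.095] * len(goals))
```

With these made-up numbers the perfect model should land somewhere under 0.1, while the flat model sits at or just below zero. So a low-looking r-squared on individual shots does not by itself tell you a model is bad. You need a baseline and a ceiling to read it against.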
3) And Bertin wrote some stuff specifically about how I developed my xG method.
The discussion of my work in the article deals with one chart in my methodology article. It's a graph showing the regression line of an exponential decay model plotted against adjusted distance to goal. Here's how I explained it in the piece:
The basis of my model is an "exponential decay" formula. Obviously as you move further from goal, your chance of scoring decreases. But at what rate? How much better is it to be three yards from goal, compared to six, twelve or eighteen yards away? Exponential decay models suggest that your chance of scoring decreases non-linearly as you move away from goal, with larger drops occurring in moves from three to six yards, and relatively smaller drops coming at twelve to sixteen yards. This is what an exponential decay curve looks like. The data points are non-headed non-cross assisted shots bucketed by adjusted distance from goal (0-6 yards, 6-9, 9-12, 12-15 and so on).
The R-Squared on this is a comically high 0.997. That's just the luck of the bucketing to some degree, but I think it shows that exponential decay is the right model.
Here are the key points that Bertin makes about that graph:
First, an r-squared that high should be suspicious in itself. Complex systems will almost never have such a great fit with just one variable doing the prediction. Near perfect? In soccer? With 12 data points? No.
Finally, and most importantly, if it really were the case that just the distance by itself almost perfectly predicts (.997) the probability that a shot goes in, then you have reached the Singularity. You don't need to bother with any of the subsequent conditions (was it from a cross, what's the speed of the attack, etc.). Your work is done. Close up the laptop, ask for a raise, order drinks.
First, as I say in the article, this is not a graph of all shots. This plot excludes headers and shots assisted by crosses, and it does not include any adjustment for speed of attack. There are no particular claims being made here about the specific r-squared other than that it's high and it's kind of funny. My conclusion is merely that an exponential decay model will be a good equation to use as I continue elaborating the system. As best as I can tell, Bertin has simply misread what I was doing and what I used that 0.997 r-squared to signify.
So after coming up with that one exponential decay curve on a bucketed subset of shots, I did not close up the laptop or order drinks. (I did ask Graham for a raise. He said no.) Instead, I continued elaborating the expected goals system, building the model from shot type, assist type, speed of attack, dribbles before the shot and a few other adjustments. Why didn't I just stop when I got a super high r-squared number?
Because the r-squared I published was never meant to be determinative. I used it merely as an indicator that I had a good curve onto which to fit my data, so that I could proceed with building a more elaborate and specific model based on that equation type. I did not use r-squared to prove my point because it requires much more than a bare r-squared number to make a strong argument.
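A quick sketch of why a bucketed fit and a shot-by-shot fit can tell such different stories. The data here is synthetic: shots are generated from an invented exponential decay curve (the constants `A` and `B` are made up, as are the uniform distances and the even 3-yard buckets, which differ from the buckets in my chart). This is not my data or my fitting procedure, just an illustration of the bucketing effect.

```python
import math
import random

rng = random.Random(1)  # fixed seed so the sketch is repeatable
A, B = 0.6, 0.1         # invented decay parameters: p = A * exp(-B * distance)
WIDTH, N_BUCKETS = 3, 12

# Synthetic shots: (distance in yards, 1 if scored else 0).
shots = []
for _ in range(60000):
    d = rng.uniform(0, WIDTH * N_BUCKETS)
    shots.append((d, 1 if rng.random() < A * math.exp(-B * d) else 0))


def r_squared(actual, fitted):
    mean_a = sum(actual) / len(actual)
    sse = sum((a - f) ** 2 for a, f in zip(actual, fitted))
    sst = sum((a - mean_a) ** 2 for a in actual)
    return 1 - sse / sst


# Bucket shots by distance and compute the conversion rate per bucket.
counts = [0] * N_BUCKETS
scored = [0] * N_BUCKETS
for d, g in shots:
    k = min(int(d // WIDTH), N_BUCKETS - 1)
    counts[k] += 1
    scored[k] += g
mids = [(k + 0.5) * WIDTH for k in range(N_BUCKETS)]
rates = [s / n for s, n in zip(scored, counts)]

# Fit log(rate) = intercept + slope * distance by ordinary least squares.
logs = [math.log(r) for r in rates]
mx, my = sum(mids) / N_BUCKETS, sum(logs) / N_BUCKETS
slope = sum((x - mx) * (y - my) for x, y in zip(mids, logs)) / sum(
    (x - mx) ** 2 for x in mids
)
intercept = my - slope * mx


def curve(d):
    return math.exp(intercept + slope * d)


# Same curve, judged against bucket averages and against individual shots.
r2_bucketed = r_squared(rates, [curve(x) for x in mids])
r2_shots = r_squared([g for _, g in shots], [curve(d) for d, _ in shots])
```

Averaging thousands of shots into a dozen buckets washes out the shot-to-shot randomness, so the curve hugs the bucket averages very closely while explaining only a modest share of individual outcomes. A high bucketed r-squared supports the shape of the curve. It says nothing about having predicted each shot.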
4) There are lots of great critiques of expected goals
The expected goals method is very much a work in progress. It is open to critique from a wide variety of angles. I published my expected goals method openly in order to facilitate critique, as well as to help other analysts who want to use it for their own purposes, to build upon it or edit it.
Here are a couple of my biggest problems with expected goals. I already mentioned its failure to include non-shot chances (at least in most of its incarnations; Daniel Altman has produced a method which is not based on shots in the same way).
A) It doesn't know where the defenders are.
This is a big one. We simply do not have the data. Opta tracks ball actions, not player locations. And we know simply from experience that if you are well-defended and there are players between you and the goal, you are much less likely to score than if you're free one-on-one with the keeper. There are a bunch of ways that my expected goals method attempts to estimate defensive pressure, but it's not the same as actually knowing about it. An excellent paper by Lucey, Bialkowski et al presented at the Sloan conference demonstrated that knowing defender location significantly improves the accuracy of expected goals, just as we would expect.
Perhaps over time defender location numbers even out to some degree, but certainly in looking at single matches defensive location can make a huge difference, and expected goals can miss badly.
B) It doesn't care who's shooting.
As everyone knows, some footballers are better at scoring than other footballers. In its current incarnation, expected goals does not care if Lionel Messi or Chris Brunt took the shot. It doesn't care if a striker or a center back made the attempt. And it is clearly true that the best shooters score more of their shots and that players in attacking positions score more of their shots. I hope to add in player and positional regressions to my next iteration of expected goals, but for now expected goals is missing out on some important data.
C) It may be missing on the very best teams.
This is something that Bertin notes: the big misses for most xG models appear to be the elite clubs of Spain and Germany, namely Barcelona, Real Madrid and Bayern Munich. (My expected goals model has rarely had trouble with Dortmund.) There are a lot of possible causes here, but this finding may reflect some emergent properties of having a whole bunch of great players on the pitch at once, which allow these teams to create better chances than expected goals can account for. It is a topic that requires significant study, and that study has yet to be done publicly as far as I know.
5) I still like expected goals.
We should always be improving our methods and we should always be critiquing them. A minutely descriptive, intuitively articulated, highly accurate expected goals model remains no more than a goal itself. But the only way we're going to get there is by continuing to trudge this road and build and improve and discuss the models we have.
So I'm working on new improvements to the system. I want to adjust for the identity and formation position of the shooter. I want to account for the location from which the assist pass was played—it turns out that assist passes from good locations massively increase the rate at which shots are scored. (And I can explain why I think this makes all sorts of sense.) I want to work on including some non-shot chances—at the very least I want to include last-ditch tackles and particularly tackles or interceptions by the goalkeeper in the model. By adding in more specificity to expected goals, I think we can make progress toward that ideal xG system that really describes clearly the kind and quality of chances created.
The key question for me remains, as I said in my methodology article and in section (1) above: does this make football sense? I only include factors in my expected goals regressions that are in advance intuitively related to the creation of good attacking chances. Expected goals as an idea is all about intuitive football sense. We can continue improving our models, and more importantly we can keep learning more about football, if we maintain that focus.
The statistic is open to critique, but the reasons I like it are pretty straightforward. First, it has been shown clearly and at length to be better than currently existing alternatives for a wide variety of uses. Second, its model design reflects the reality of football and the ways in which players on the field attempt to create or prevent good chances. Because of this, expected goals constitutes an important step along the path toward a truly robust model of scoring chances and of football itself.