We all know what happened yesterday. Tottenham Hotspur hosted Newcastle United. We lost despite an advantage in shots on target of 14-to-4. We also had an advantage in shots on target inside the box of 9-to-3, in close/central SiBoT 8-to-1, and we were 4-to-1 in big chances to boot. We lost one-nil.
In this article, I want to address an argument I saw quite a lot in the reaction thread. Nine goals! From eleven matches! Only six from open play! These are statistics. If you make an argument based in statistics, it's important to follow the basic structures of inquiry that apply in the field.
What is the expected sampling error on goals scored over eleven matches of a season? I have found, in studying goals scored and shooting statistics, that a huge confounding factor in estimating a club's goal-scoring ability is the variation of shot conversion and shot on target conversion. There is very little week-to-week correlation of shot conversion, and so goals scored numbers which are highly dependent on outlying conversion numbers should be expected to regress to the mean. A sample of even a half season is not large enough to confidently isolate expected shot-on-target conversion rates.
So is that what's going on with Spurs? Emphatically yes. The club's problems in goal scoring are all about shot conversion, and in particular our terrible rate of conversion of shots on target inside the box. The following table lists all of the clubs in the EPL in order of their rate of SiBoT conversion.
|Club|Goals from SiBoT|SiBoT|G/SiBoT|
|West Bromwich Albion|10|26|39%|
|West Ham United|7|25|28%|
So your top five in SiBoT conversion are: Arsenal, Newcastle, Sunderland, West Brom, Manchester City. Your top five in total SiBoT are Manchester City, Liverpool, Arsenal, Chelsea, Southampton. I think it's pretty clear which statistic is telling us more about the quality of a club's attack. You can squint and see a bit of a pattern in the above table, with relatively more good teams toward the top, but the effect is faint at best. It's nothing compared to the clear team quality effect seen in total SiBoT.
I actually do think there is real and important variation between teams in shot conversion and SiBoT conversion. I'll be writing more soon using the shot matrix database to identify team average shot quality by location, type of shot and type of pass that sets up the shot. So I am not saying that I think conversion rate is "random" or "luck."
Theory and method in statistics matter to me because I'm a huge dork, and so the following sections get a bit abstract. These are points I care about and I'll belabor a little bit. But if you want to just get to the damn results, you can scroll down past the next two sections and read "Results" and "Discussion."
Theory: Statistics and Uncertainty
The fun thing about statistics is that as a field, it's all about how you deal with the inevitable uncertainty of life in the world. Statistics almost never tells you what is, it tells you what is most likely to be, and it gives you a rough estimate of how likely it is. When you run a statistical analysis and find a low r-squared correlation between two sets of figures, it doesn't mean the two figures are unrelated. It doesn't mean the variance between the two sets of figures is just random chance. What you learn, rather, is that at the sample size being studied, no relational effect can be found. It always leaves open the possibility that with a larger sample, or with better data, such an effect could be found.
So I'm going to spend a bunch of this piece arguing that SiBoT conversion rate over eleven matches lacks predictive utility. This does not mean I think there is no such thing as finishing skill, or that I think there's no difference between teams in the average quality of chance they create. What it means, rather, is that at the sample size being studied, with the data available to us, the particular statistic G/SiBoT does not have significant predictive utility.
If we're going to be talking about Spurs stats through eleven games—and when you say "nine goals in eleven matches" you're making a statistical argument—then these are the questions we need to be asking. How useful a statistic is this, at this sample size?
This is a point that is kind of important to me, so I want to belabor it a bit more. When someone makes the statistical argument that Spurs have a poor attack based on the evidence of nine goals from eleven matches, it's possible to respond in two different ways. One way would be to say that Spurs' attack struggles because of their extremely poor G/SiBoT rate, and G/SiBoT rate is just luck so don't worry about it. I am not saying that. Instead, I am saying that G/SiBoT rate over an 11-game sample is not predictive of future goal scoring rates. This stat, at this sample size, cannot determine whether a team is particularly good or particularly bad at creating better shots inside the box or converting shots inside the box. I can't show, with the limited data that I have, whether G/SiBoT rate is "all luck." There may be meaningful differences between clubs, and I think there are, but you can't identify them using an 11-game sample. So if you want to argue that Spurs suck at goal scoring, you need to use a different statistic than goals scored, since it's so influenced by G/SiBoT, and G/SiBoT is not predictive of future goal scoring.
Method: Shot Matrix Database and Model Testing
This is one of the first studies I've done using my Shot Matrix database. It's still incomplete, but I now have three full seasons plus the partial 2013 season. This gives me 60 team seasons to test, and a sample of 80 team seasons to compare. Here's my method.
I divided each season into two sections, the first 11 matches and the following 27 matches. I collected each club's total SiBoT for both periods as well as their total goals scored from inside the box. (I am removing penalties from the data.) So for all 60 team seasons, I have a G/SiBoT rate for the first 11 matches and a G/SiBoT rate for the next 27.
To test the predictive utility of G/SiBoT, I'm using two methods. The first is R-Squared correlation. This method tells me if two data sets show any level of relationship. The data sets [1,2,3] and [1,2,3] would have a perfect R-Squared correlation of 1.0, because they have a relationship of identity. An R-Squared near 0 means that the data does not demonstrate any evidence of a relationship.
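A minimal Python sketch of the R-Squared idea, assuming R-Squared here means the squared Pearson correlation between the two data sets. The function name `r_squared` and the second pair of lists are made up for illustration and are not taken from the Shot Matrix database.

```python
from math import sqrt

def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, left unnormalised because the 1/n factors cancel.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("R-Squared is undefined when a data set has no variation")
    return (cov / sqrt(var_x * var_y)) ** 2

# The identity example from the text.
identical = r_squared([1, 2, 3], [1, 2, 3])
# A made-up pair with no linear relationship at all.
unrelated = r_squared([1, 2, 3, 4], [2, 1, 1, 2])

print(f"identical data sets: {identical:.2f}")
print(f"unrelated data sets: {unrelated:.2f}")
```

In the study itself, the two lists would be each team season's G/SiBoT rate over the first 11 matches and its rate over the next 27.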
Second, I'm using Root Mean Square Error and Mean Absolute Error. These are the two methods used most often to test predictive models. For this test, I'm predicting goals scored using two different calculations. First, I predict goals scored in the final 27 matches of the season using the G/SiBoT rate seen in the first 11. So if a club scored 16 goals on 38 SiBoT in the first 11 matches (42%) and then put 93 shots on target from inside the box over their next 27 matches, this calculation predicts the club would score 42% * 93 SiBoT = 39 goals.
Then I predict goals scored using the league average SiBoT conversion rate of 35%. This method projects for that club about 33 goals scored. I do this for all 60 team seasons in the database.
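The two projections can be sketched in a few lines of Python. The numbers come from the worked example above; the function and variable names are hypothetical, chosen only for this illustration.

```python
def project_goals(conversion_rate, future_sibot):
    """Projected goals = assumed conversion rate * shots on target inside the box."""
    return conversion_rate * future_sibot

# Figures from the worked example in the text.
early_goals, early_sibot = 16, 38   # first 11 matches
later_sibot = 93                    # next 27 matches
LEAGUE_AVG_RATE = 0.35              # league average G/SiBoT quoted in the text

# Model 1: carry the team's own early-season rate forward.
team_rate = early_goals / early_sibot
team_projection = project_goals(team_rate, later_sibot)

# Model 2: ignore the team's early rate and use the league average.
league_projection = project_goals(LEAGUE_AVG_RATE, later_sibot)

print(f"team rate through 11 matches: {team_rate:.0%}")
print(f"team-rate projection: {round(team_projection)} goals")
print(f"league-average projection: {round(league_projection)} goals")
```

The only difference between the two models is which conversion rate gets multiplied by the same 93 SiBoT, so any gap in accuracy comes down to how informative the 11-match rate is.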
Then I take the RMSE and the MAE, to test which model made better predictions. The club in question actually scored 30 goals from their 93 SiBoT in the final 27 games of the season. So that means I have an error of 9 goals for the first expected goals calculation and an error of 3 goals for the second expected goals calculation. For MAE, I just take the average error for all 60 team seasons. For RMSE, I average the squares of all the errors and then take the square root of that average. RMSE is generally the better method for evaluating a model because you don't want a model that is occasionally really horribly wrong. Taking the squares "punishes" the model for making very large mistakes.
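Here is a small Python sketch of both error measures, using the standard definitions (MAE as the mean of the absolute errors, RMSE as the square root of the mean squared error). Only the first team season in each list comes from the worked example; the other two are invented purely to show how squaring treats one large miss, and they are not results from the database.

```python
from math import sqrt

def mae(predicted, actual):
    """Mean Absolute Error: the average size of the misses."""
    errors = [abs(p - a) for p, a in zip(predicted, actual)]
    return sum(errors) / len(errors)

def rmse(predicted, actual):
    """Root Mean Square Error: square the misses, average them, take the root."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sqrt(sum(squared) / len(squared))

# First entry: the worked example (39 and 33 projected, 30 actual).
# Second and third entries: HYPOTHETICAL team seasons for illustration only.
actual_goals = [30, 25, 40]
team_rate_model = [39, 24, 41]     # one big miss, two near hits
league_avg_model = [33, 28, 37]    # consistently off by three

team_mae = mae(team_rate_model, actual_goals)
team_rmse = rmse(team_rate_model, actual_goals)
league_mae = mae(league_avg_model, actual_goals)
league_rmse = rmse(league_avg_model, actual_goals)

print(f"team-rate model:      MAE {team_mae:.2f}, RMSE {team_rmse:.2f}")
print(f"league-average model: MAE {league_mae:.2f}, RMSE {league_rmse:.2f}")
```

With these toy numbers the two models are fairly close on MAE, but the single nine-goal miss pushes the first model's RMSE well above the second's, which is the "punishing" effect described above.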
Results: G/SiBoT Over an 11-Match Sample Is Not a Predictive Statistic
If G/SiBoT were a usefully predictive statistic over an 11-match sample, then I would expect an R-Squared number close to 1.0 (heck, 0.5 would be pretty good). I would expect the RMSE and the MAE of the team G/SiBoT model to be significantly lower than the RMSE and the MAE of the league average G/SiBoT model. My findings:
Team G/SiBoT Model RMSE: 7.9
Lg Avg G/SiBoT Model RMSE: 4.3
Team G/SiBoT Model MAE: 6.4
Lg Avg G/SiBoT Model MAE: 3.6
I found basically no relationship between a team's G/SiBoT rate over the first 11 games and their G/SiBoT rate over the next 27. I found that I got errors nearly twice as large when I projected future goal scoring using first 11 match G/SiBoT than when I used a dummy projection of league average rate of G/SiBoT.
That 15% G/SiBoT number way up in that table at the top? It's not a number that usefully predicts Spurs' shot on target conversion over the remainder of the season.
Discussion: Some Examples From the Database
In my "method" section, I talked about a club who scored 42% of their SiBoT through their first 11 matches, then only 32% through their next 27. That club was Tottenham Hotspur 2012-2013. We started the season on quite a nice little hot streak of goal conversion from SiBoT, but we fell off to below league average through the end of the season. Last year, our G/SiBoT through 11 matches did not predict future finishing brilliance. It was just a number.
I also want to note just how bad Spurs' 15% G/SiBoT conversion rate is. Over 80 team seasons, it's the second-worst number in my database. I have only two clubs under 20% through 11 matches, and I have no clubs at all which were under 20% over the final 27 matches. The other team that couldn't convert were Wigan Athletic in 2011-2012. They started the season with just 5 points from 11 league matches, converting only two goals from 25 SiBoT, a shocking rate of 8%. Wigan of course added to the "Wiganlona" legend by improving over the season and escaping relegation. Their G/SiBoT from the final 27 matches was a slightly below average 33% (19-for-58), good enough to keep the Latics safe for one more season.
I have two other clubs in the database who were nearly as bad as Spurs at SiBoT conversion: Stoke City 2012-2013 and Liverpool 2011-2012. Stoke began the year at 21% conversion, 5-for-26 on SiBoT. They turned things around entirely in the second half, converting at an excellent 40% rate (22-for-55). Perhaps the cautionary tale here is Kenny Dalglish's Liverpool, who couldn't convert chances in the first half and were still quite bad in the second. They were 20% through 11 matches and just 24% over the final leg.
So, it is possible to suck at conversion over a whole season. The average of the three worst teams in my database for SiBoT conversion in the final 27 matches of the season is 32%. That's a little worse than league average. I don't think I can say, statistically, that the three percentage point difference is meaningful, but it is interesting. Maybe with a larger database I'll be able to find some small effect.
Even if Spurs had shot at the depressing 24% rate of Dalglish's Liverpool from the second half, they would have scored not 5 goals from shots in the box on target but 8, for a total of 12 goals scored. At the Wigan/Stoke/Liverpool average of 32%, we'd have scored 11 goals inside the box for a total of 15 on the season. At a league average rate of 35%, we'd have scored 12 goals from SiBoT and 16 total.
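The arithmetic behind those what-if totals can be sketched in Python. One loud assumption: the article gives 5 goals and a conversion rate of about 15% but not the SiBoT count itself, so the figure of 33 below is inferred from those two numbers and is not a quoted statistic. The names are hypothetical.

```python
SIBOT_GOALS = 5      # goals from shots on target inside the box, from the text
TOTAL_GOALS = 9      # all goals through 11 matches, from the text
ASSUMED_SIBOT = 33   # ASSUMPTION: inferred from 5 goals at roughly 15%, not quoted

# Goals that did not come from SiBoT stay fixed in every scenario.
other_goals = TOTAL_GOALS - SIBOT_GOALS

def what_if(rate, sibot=ASSUMED_SIBOT, extra=other_goals):
    """Return (goals from SiBoT, total goals) at a given conversion rate."""
    sibot_goals = round(rate * sibot)
    return sibot_goals, sibot_goals + extra

scenarios = {
    "Liverpool 2011-2012, final 27 matches": 0.24,
    "Wigan/Stoke/Liverpool average": 0.32,
    "league average": 0.35,
}

results = {label: what_if(rate) for label, rate in scenarios.items()}
for label, (sibot_goals, total) in results.items():
    print(f"{label}: {sibot_goals} from SiBoT, {total} total")
```

Under that assumed shot count, the rounded figures line up with the 8, 11 and 12 SiBoT goals and the 12, 15 and 16 totals given above.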
The clubs most comparable to Tottenham in G/SiBoT conversion over the past several seasons have bounced back nearly to league-average rates of conversion. Spurs themselves, under Andre Villas-Boas, have not shown any previous tendency to convert SiBoT at a poor rate. In the overall statistical analysis, G/SiBoT does not appear to be a statistic with any predictive utility over an 11-game sample.
Honestly, I really think this club is kind of fine. Our attack isn't great, but it's producing good chances at a good rate. (Spurs actually lead the EPL in "big chances" from open play as rated by Opta, with 22 total.) We haven't been converting those chances, but I am confident that if we keep playing this kind of football with these players, the goals will come. There are obviously ways the club can improve, and we've played some real stinkers (West Ham, Hull City), but the baseline is that this club has good attacking numbers and great defensive numbers. We're good. We can be better, but we're good. "Nine goals in eleven matches" is an unconvincing statistical argument.