Expected Assists (xA) - a {THREAD}
During the summer, @Reuser5, @StatOnScout and I were discussing Expected Assist numbers and realised that they could be quite materially different depending on the source we were using - OPTA (via FFS) or Understat.
1/n https://twitter.com/rogue_wee/status/1166250542533959680?s=20
By assigning the xG value of each shot to the player who passed the ball to the shooter, we get an instantaneous xA measure - the expected number of assists generated by a player *given the shots that took place*.
This is the method being employed by, for example, Understat.
2/n
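A minimal sketch of the shot-based idea, to make it concrete. The shot records, player names and xG values are made up for illustration, and this is not Understat's actual code or data format:

```python
from collections import defaultdict

# Hypothetical shot records: (shooter, passer or None, xG of the shot).
shots = [
    ("Striker A", "Winger B", 0.30),
    ("Striker A", "Winger B", 0.05),
    ("Midfielder C", None, 0.10),  # unassisted shot: nobody is credited with xA
    ("Winger B", "Midfielder C", 0.45),
]

def shot_based_xa(shots):
    """Credit each shot's xG to the player who played the pass before it."""
    totals = defaultdict(float)
    for _shooter, passer, xg in shots:
        if passer is not None:
            totals[passer] += xg
    return dict(totals)

xa = shot_based_xa(shots)
```

Note that a pass only counts here if a shot followed it; a great ball that the striker fails to get a shot away from contributes nothing.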
However, the method employed by OPTA is rather different and, from a footballing perspective, perhaps more natural.
Their method is based on the final location of the pass (the point at which the responsibility of the passer ends) - shot, or no shot.
3/n
https://www.optasportspro.com/news-analysis/blog-expected-assists-in-context/
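To show the structural difference only, here is a toy pass-based version. The value function `toy_pass_value`, the 0-100 pitch coordinates and the pass records are all assumptions made up for this sketch; it is not OPTA's model, just an illustration that every pass is valued by where it ends up, whether or not a shot follows:

```python
import math

GOAL_X, GOAL_Y = 100.0, 50.0  # assumed pitch coordinates: 0-100 on both axes

def toy_pass_value(end_x, end_y):
    """Hypothetical stand-in for a pass model: passes ending nearer the goal score higher."""
    distance = math.hypot(GOAL_X - end_x, GOAL_Y - end_y)
    return 0.4 * math.exp(-distance / 15.0)

# Hypothetical passes: (passer, end_x, end_y, led_to_shot).
passes = [
    ("Winger B", 92.0, 48.0, True),
    ("Winger B", 88.0, 55.0, False),  # no shot followed, but the pass still counts
    ("Midfielder C", 60.0, 30.0, False),
]

def pass_based_xa(passes):
    """Sum the value of every pass by its end location, ignoring the shot flag."""
    totals = {}
    for passer, end_x, end_y, _led_to_shot in passes:
        totals[passer] = totals.get(passer, 0.0) + toy_pass_value(end_x, end_y)
    return totals

pass_xa = pass_based_xa(passes)
```

The `led_to_shot` flag is deliberately unused: that is the whole point of the contrast with the shot-based measure.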
Comparing the xA values of certain players between the sources, it doesn't take long to realise that they can be quite different.
Below are the Top 10s from Understat & OPTA (via FFS) for total xA to date in the 19/20 season.
4/n
For #FPL purposes the main reason I am interested in xA stats is from the perspective of how likely they are to predict future assists.
I analysed player-level xA90 (Expected Assists per 90 Minutes) from the first & second halves of the 18/19 season in three different ways:
5/n
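For reference, the per-90 conversion is just a rescaling by minutes played. A small sketch, with `per_90` as an illustrative helper name and made-up numbers:

```python
def per_90(total, minutes):
    """Scale a season (or half-season) total to a per-90-minutes rate."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return total / minutes * 90

# Example: 3.0 xA accumulated over 1350 minutes.
xa90 = per_90(3.0, 1350)
```

The same helper gives A90 when fed actual assists instead of xA.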
1. Descriptiveness: how well does the xA metric describe the assists that happened over the same period?
Using Half 1, GWs 1-19 (H1), I looked at how the xA90 compared to the actual observed assists per 90 mins (A90) for the same period.
6/n
1. Descriptiveness (cont.)
As might be expected, the method used by Understat describes what happened better; the x-coefficient is ~1 and the variance explained (R-Squared) is greater.
This is intuitive as it is using only the events that resulted in actual shots.
7/n
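The x-coefficient and R-Squared here come from a simple straight-line fit of A90 on xA90. A rough sketch of that calculation in plain Python, assuming ordinary least squares; the helper name `fit_line` and the sample numbers are illustrative, not the real 18/19 data:

```python
def fit_line(x, y):
    """Ordinary least squares fit of y = intercept + slope * x, plus R-squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared: share of the variance in y explained by the fitted line.
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical H1 values for five players.
h1_xa90 = [0.05, 0.12, 0.20, 0.31, 0.40]
h1_a90 = [0.00, 0.15, 0.18, 0.35, 0.38]
slope, intercept, r_squared = fit_line(h1_xa90, h1_a90)
```

A slope near 1 means one unit of xA90 lined up with roughly one unit of A90 over the period; R-Squared says how tightly the points hug that line.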
2. Self-prediction
For a metric to be useful for prediction, we expect it to be able to predict itself quite well.
Advanced metrics are known to be much better at this than typical stats: here are 2017/18 H1 vs H2 correlations for Assists & Key Passes.
8/n
2. Self-Prediction (cont.)
Now looking at 2018/19 H1 vs H2 for the two different xA approaches, we see that OPTA's predicts itself better; there's a tighter correlation and less variance.
We should feel more confident about OPTA's xA figures projecting into the future.
9/n
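A sketch of how an H1 vs H2 self-correlation could be computed: pair up each player's value from both halves, then take the Pearson correlation. The helper names and the player values are hypothetical:

```python
import math

def paired_values(h1, h2):
    """Keep only players present in both halves, in a consistent order."""
    players = sorted(set(h1) & set(h2))
    return [h1[p] for p in players], [h2[p] for p in players]

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical xA90 per player in each half; "D" and "E" appear in one half only.
h1_xa90 = {"A": 0.10, "B": 0.22, "C": 0.31, "D": 0.50}
h2_xa90 = {"A": 0.14, "B": 0.19, "C": 0.35, "E": 0.90}
x, y = paired_values(h1_xa90, h2_xa90)
r = pearson_r(x, y)
```

Running the same thing once per data source gives the two numbers being compared; swapping `h2_xa90` for H2 A90 turns it into the assist-prediction check.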
3. Assist Prediction
Finally we look at which H1 xA measure predicted actual assists better in H2.
Whilst noting that the strength of the relationship & the variance explained are both weakened (assists are harder to predict than goals!), OPTA's measure is more successful.
10/n
3. Assist Prediction (cont.)
Reducing our sample to players playing at least 900 minutes in both H1 and H2 of the season gives us some more robust statistics.
Here, the gap in predictive ability between the two methods grows somewhat, in favour of the OPTA method.
11/n
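The 900-minute filter is simple to express. A sketch, assuming minutes are held in one dict per half; the names and numbers are made up:

```python
MIN_MINUTES = 900  # threshold used in the thread: at least 900 minutes in each half

def eligible_players(minutes_h1, minutes_h2, threshold=MIN_MINUTES):
    """Players who reached the minutes threshold in both H1 and H2."""
    return sorted(
        player
        for player, mins in minutes_h1.items()
        if mins >= threshold and minutes_h2.get(player, 0) >= threshold
    )

# Hypothetical minutes played per half.
minutes_h1 = {"A": 1500, "B": 950, "C": 400, "D": 1200}
minutes_h2 = {"A": 1100, "B": 300, "C": 1600}
regulars = eligible_players(minutes_h1, minutes_h2)
```

Per-90 rates from small minute counts are noisy, so dropping those players is what makes the remaining statistics more robust, at the cost of a smaller sample.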
Conclusions
~ Understat's method will tie up better to past data, because it is directly linked to shots that occurred.
~ OPTA's method is somewhat more predictive, both of itself and of future assists.
~ Goals are hard to predict; Assists add another degree of difficulty!
12/n
Note: all of this is looking at conventional, OPTA-defined assists only and doesn't scratch the surface of FPL assists (shots leading to rebounds, fouls leading to penalties, etc). Those are another matter entirely!
Hope you've found this useful, now leave me alone.
(13/13)
