Our A.I. uses Expected Goals (xG) as one of its parameters for success, and I can confidently say that without xG it wouldn’t be anywhere near as good. Of course, there are many more factors at play, and they will take several articles to describe.
However, if you want to know more about how Football Machine works and are interested in joining, you can click here.
Ever since the introduction of xG (expected goals) in 2012, this metric has evolved into one of the most pervasive and illuminating aspects of football analytics.
Expected goals was among the pioneering advanced metrics that found widespread recognition, even among casual football bettors.
Consequently, it has naturally encountered its fair share of skeptics over the years.
This marks a battle between the conventional manner of perceiving the sport and the emerging realm of data analytics. However, before we form our conclusions, it is crucial to grasp the mechanics of this metric and its practical application.
Let’s go through it again: What Is Expected Goals (xG)?
You have probably heard the term before and seen it explained, but I’m going to go through it again for those who need a different perspective.
Expected goals (or xG) measures the quality of a chance by calculating the likelihood that it will be scored by using information on similar shots in the past.
The model draws on nearly one million shots from a historical database to measure xG on a scale between zero and one, where zero represents a chance that is impossible to score, and one represents a chance that a player would be expected to score every single time.
We know that a chance from the halfway line isn’t as likely to result in a goal as a chance from inside the penalty area.
With xG, we can give numbers to these scenarios.
For example, suppose the chance from inside the box is assigned an xG of 0.1. This means that a player would, on average, be expected to score one goal from every ten shots in this situation or 10% of the time.
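To make the arithmetic concrete, here is a minimal Python sketch. The shot values and the function name total_xg are made up for illustration; they are not taken from our model.

```python
def total_xg(shots_xg):
    """Add up the xG of each shot to get the goals we would expect on average."""
    return sum(shots_xg)

# Ten shots from the same situation, each worth 0.1 xG (illustrative values).
ten_similar_shots = [0.1] * 10
expected_goals = total_xg(ten_similar_shots)

# Rounded for display, because floating-point sums are rarely exact.
print(round(expected_goals, 2))
```

Ten chances worth 0.1 xG each add up to about one expected goal, which is the "one goal from every ten shots" reading described above.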
The terminology may be new, but these phrases have been used by football fans and commentators for years before xG was introduced – “he scores that nine times out of ten” or “he should’ve had a hat-trick today”.
How Do We Calculate Expected Goals?
When we watch a football game, our intuition often guides us in assessing the likelihood of different chances resulting in goals. We consider factors like the shooter's proximity to the goal, the angle from which they're shooting, whether it's a one-on-one situation, or even a header.
The challenge arises from the sheer volume of shots in a game, averaging around 25 shots per match. Across a weekend in most competitions, that adds up to 250 shots. Even the most astute eyes in football would struggle to accurately assign scoring probabilities to all these unique scenarios. And, let's face it, who has the time for that?
But here's where artificial intelligence comes to the rescue!
With the aid of a model, A.I. can swiftly calculate the probability of a goal for all 9,609 shots taken during the 2022-23 Premier League season in a matter of seconds. This remarkable efficiency extends to the 45,764 shots from the top five European leagues in the previous season and even the colossal 768,394 shots across all competitions. That's a significant workload.
Our xG model relies on a machine-learning technique called XGBoost, empowered by data drawn from nearly one million shots in historical records. This training dataset spans 40 different competitions from 2018-19 to 2021-22.
The model takes into account various variables leading up to the moment the shot is taken, evaluating how over 20 different factors influence the likelihood of a goal being scored. Some of the key factors include:
Distance to the goal.
Angle to the goal.
Goalkeeper position, providing insight into the likelihood of them making a save.
Shooter's view of the goal mouth, considering the positions of other players.
The level of defensive pressure faced by the shooter from opponents.
The type of shot, whether it's a volley, header, or one-on-one.
The pattern of play, such as open play, fast break, direct free-kick, corner kick, or throw-in.
Information on the preceding action, including the type of assist, like a through ball or cross.
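As a rough illustration of the first two factors, the sketch below derives distance and angle from a shot location. The coordinate system (metres from the goal line and metres sideways from the centre of the goal) and the function name shot_features are assumptions for this example, not a description of our actual feature pipeline. The only fixed number is the 7.32 m width of the goal.

```python
import math

GOAL_WIDTH_M = 7.32  # distance between the posts

def shot_features(x, y):
    """Return (distance, angle) for a shot taken at (x, y).

    Assumed coordinates: x is metres from the goal line, y is metres
    sideways from the centre of the goal.
    distance: straight-line distance to the centre of the goal, in metres.
    angle: how wide the goal mouth looks from the shot location, in radians.
    """
    distance = math.hypot(x, y)
    half = GOAL_WIDTH_M / 2
    # Angle between the lines from the shooter to each post.
    angle = math.atan2(GOAL_WIDTH_M * x, x * x + y * y - half * half)
    return distance, angle

# A shot from the penalty spot, 11 m out and dead centre.
distance, angle = shot_features(11.0, 0.0)
print(f"{distance:.1f} m, {math.degrees(angle):.1f} degrees")
```

A shot from the same distance but out wide would see a much narrower goal mouth, which is why a model can use angle as well as distance.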
One notable and innovative aspect of our xG model is the goalkeeper position feature, which enables the A.I. to estimate the probability of a goalkeeper making a save. It considers factors such as the goalkeeper's proximity to the shot (indicating their reaction time) and their positioning in relation to the shot's line of sight to the goal. This feature also accounts for whether the goalkeeper was inside the penalty box and capable of using their hands.
In addition to the interaction between these variables, our xG model predicts where the shooter is likely to aim in the goal and how this choice affects the likelihood of the shot being saved. These features allow us to assess goalkeeper positioning and identify the optimal location for making a save.
We recognize that certain situations are highly unique, and these are addressed independently. Penalties, for instance, are the most consistent type of shot in football and are attributed a constant value reflective of their historical conversion rate, which is 0.79 xG.
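One way to picture this special-casing is the short sketch below. The shot dictionary, the "pattern" key and the dummy_model stand-in are all hypothetical; only the 0.79 value comes from the text above.

```python
PENALTY_XG = 0.79  # constant value reflecting the historical conversion rate

def shot_xg(shot, model_predict):
    """Return xG for a shot, handling penalties separately.

    `shot` is a hypothetical dict with a "pattern" key, and `model_predict`
    is any callable that maps a shot to a probability.
    """
    if shot["pattern"] == "penalty":
        return PENALTY_XG
    return model_predict(shot)

def dummy_model(shot):
    """Stand-in for a trained model, used only to make the example runnable."""
    return 0.1

print(shot_xg({"pattern": "penalty"}, dummy_model))
print(shot_xg({"pattern": "open_play"}, dummy_model))
```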
Common Misconceptions
Game-Level xG
Common criticisms of expected goals often arise when the metric is misapplied, especially at the game level. It's a misunderstanding to assume that a team with a higher xG total in a match should have necessarily won the game. xG strictly measures the quality of chances, not the expected game outcome.
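A small calculation shows why. The sketch below turns each team's list of shot xG values into win, draw and loss probabilities. It assumes every shot is an independent event, which is a simplification, and the shot values are invented for the example.

```python
def goal_distribution(shots_xg):
    """Probability of scoring 0, 1, 2, ... goals from independent shots."""
    dist = [1.0]  # before any shot, zero goals is certain
    for p in shots_xg:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)   # this shot is missed
            new[goals + 1] += prob * p     # this shot is scored
        dist = new
    return dist

def outcome_probabilities(home_xg, away_xg):
    """Return (home win, draw, away win) probabilities."""
    home = goal_distribution(home_xg)
    away = goal_distribution(away_xg)
    win = draw = loss = 0.0
    for h, ph in enumerate(home):
        for a, pa in enumerate(away):
            if h > a:
                win += ph * pa
            elif h == a:
                draw += ph * pa
            else:
                loss += ph * pa
    return win, draw, loss

home_shots = [0.1] * 15  # 1.5 xG built from many small chances
away_shots = [0.5, 0.5]  # 1.0 xG built from two big chances
win, draw, loss = outcome_probabilities(home_shots, away_shots)
print(f"home win {win:.2f}, draw {draw:.2f}, away win {loss:.2f}")
```

In this made-up match the home side has the higher xG total, yet it wins less than half the time. A higher xG total shifts the odds; it does not decide the result.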
Understanding "Expected" Goals
A common misinterpretation lies in the literal name of the metric. "Expected goals" does not imply that goals are expected to occur exactly as the likelihood predicts. We recognize that goals come in whole numbers, not fractions. The term "expected goals" is drawn from the mathematical concept of "expected value": the average outcome we would see if the same situation were repeated many times.
Think of it like a fair coin toss, where heads and tails each have a 50% likelihood. This doesn't mean we expect exactly half of the tosses to land on each outcome; instead, over a large number of tosses, the share of each outcome should settle close to 50%, that is, regress to the mean. The same principle applies to expected goals. Variance from the expected value is inherent, and it offers valuable insights for analysis in football.
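The size of that variance can be estimated with a few lines of Python. This sketch treats every toss or shot as independent, which is an assumption, and the function name expected_and_spread is only illustrative.

```python
import math

def expected_and_spread(probabilities):
    """Return the expected number of successes and its standard deviation.

    Each entry is the chance of one independent event succeeding: a coin
    landing on heads, or a shot being scored.
    """
    mean = sum(probabilities)
    variance = sum(p * (1 - p) for p in probabilities)
    return mean, math.sqrt(variance)

# 100 fair coin tosses: we expect 50 heads, give or take about 5.
mean_heads, spread_heads = expected_and_spread([0.5] * 100)
print(mean_heads, spread_heads)

# The same function applied to a handful of illustrative shot xG values.
mean_goals, spread_goals = expected_and_spread([0.05, 0.1, 0.3, 0.79])
print(round(mean_goals, 2), round(spread_goals, 2))
```

The spread is why a team's actual goals can sit well above or below its xG over a short run of matches without anything unusual going on.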
Understanding xG Overperformance
When a player or team outperforms their xG, it doesn't mean they are bound to underperform in the future; believing so would be a manifestation of the Gambler's Fallacy. Rather, we expect them to score roughly in line with their expected goals from that point on, while acknowledging that they've already "banked" the overperformance.
For instance, if a player has scored five goals more than their expected goals total in the early part of the season, the most likely outcome is that they still finish the season about five goals above their expected total. It doesn’t mean that they will all of a sudden score fewer goals to “even things out”.
Similarly, if a coin toss lands on heads ten times in a row, future tosses still have an equal chance of landing on heads or tails, but the ten instances of heads have already occurred.
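The "banked" idea can be written as a simple projection. The numbers and the function name projected_season_goals are invented for illustration, and the sketch assumes the player finishes at an average rate for the rest of the season.

```python
def projected_season_goals(goals_so_far, remaining_xg):
    """Goals already scored stay scored; future goals are expected to match future xG."""
    return goals_so_far + remaining_xg

# A player who has 12 goals from 7.0 xG so far (five goals above expectation),
# with chances worth another 10.0 xG expected over the rest of the season.
goals_so_far = 12
xg_so_far = 7.0
remaining_xg = 10.0

projected = projected_season_goals(goals_so_far, remaining_xg)
season_xg = xg_so_far + remaining_xg

# The gap between projected goals and season xG is still the original five.
print(projected - season_xg)
```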
Expected Goals Depth
Football is a relatively low-scoring sport and so our ability to measure the likelihood of a goal being scored is essential context.
With expected goals, we can arm Football Machine with another tool to quantify the stories that every football fan wants to hear. Which striker is struggling with their finishing? Which team’s form suggests they should be higher in the league table?
The unrivalled depth of xG’s data means that we now have over 4.5m shots enriched with xG values for more than 100,000 players, which allows us to compare and understand the performances of teams all over the world.
xG is a metric that goes beyond the traditional shot counts, but it is important to remember that it is still just a metric.
We can use it to evaluate underlying performances, but it is actual goals that are going to win you football matches.
Football is unpredictable and goals can come from any number of unexpected situations, but with expected goals, we can explain just how unlikely these were.