AI predicts performance of Premier League's top strikers

22 March 2019 8 min. read
The future of football will likely see managers and fans alike turn to artificial intelligence to inform decisions on and off the pitch. With the race for the Premier League’s golden boot an intense contest between five Premier League stars, data analytics is being used to predict how prolific the top goal-scorers will be over the coming months.

The world of sport is increasingly becoming augmented by technology. Digital enhancements range from Infosys and ATP’s AI tennis platform – aimed at providing performance insights within seconds of points being played on-court – to Capgemini’s introduction of a collection of insight and statistic tools at the Rugby 7s World Cup, or Atos providing digital broadcast services to the Olympics via the cloud.

Boston Consulting Group even recently tried to predict the outcome of Wimbledon, using an artificial intelligence system that suggested Roger Federer would win the tournament. While this last example showed that the world has some way to go before technology can reliably forecast the results of elite sport, even with regard to all-time greats such as Federer, it still demonstrated the leaps and bounds of AI development in recent years, as well as the potential for future systems to get the call right.

[Figure: Mohamed Salah, goals vs xG, 2018/19]

Following on from these stories, leading football data provider Opta has launched a new AI system, which will seek to predict the expected number of goals (xG) for Premier League footballers. In news that will no doubt please real-life talent scouts and fantasy football managers alike, the xG system uses historical averages to determine how prolific a player may be in the future. Combining numerous factors, including the quality of a shot, assist type, shot angle, distance from goal, and whether it was defined as a big chance, can give an indication as to how often a player should score, on average.
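To make the idea concrete, the sketch below shows one common way such a metric can be assembled: each shot gets a scoring probability, and a player's expected goals is the sum of those probabilities. This is not Opta's model. The `Shot` fields, the logistic form and every coefficient are hypothetical, chosen only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Shot:
    distance_m: float   # distance from goal in metres
    angle_deg: float    # angle of the goal mouth visible to the shooter
    big_chance: bool    # whether the shot was tagged as a big chance


def shot_xg(shot: Shot) -> float:
    """Return a scoring probability between 0 and 1 for a single shot.

    The coefficients are made up for illustration; a real model would
    fit them to many thousands of historical shots.
    """
    z = (
        -1.0
        - 0.10 * shot.distance_m
        + 0.02 * shot.angle_deg
        + (1.5 if shot.big_chance else 0.0)
    )
    return 1.0 / (1.0 + math.exp(-z))  # logistic function


def expected_goals(shots: list[Shot]) -> float:
    """A player's xG is the sum of the probabilities of all their shots."""
    return sum(shot_xg(s) for s in shots)


# Hypothetical shots: a close-range big chance and a long-range effort.
close_chance = Shot(distance_m=8.0, angle_deg=40.0, big_chance=True)
long_shot = Shot(distance_m=25.0, angle_deg=15.0, big_chance=False)
total_xg = expected_goals([close_chance, long_shot])
```

In this toy version the close-range big chance contributes far more to the total than the long shot, which is the intuition behind comparing a player's actual goals with their xG.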

According to Haidar Altaie, a Data Scientist at analytics firm SAS, football has long lagged behind other sports in the use of analytics. While the average football fan often swears by the apparent wisdom that “my eyes tell me everything I need to know”, this is largely a reaction to the long-standing misuse of statistics by other fans. By contrast, the average NBA fan is better equipped than ever with statistics to understand a game: points, assists, blocks, steals, field goal percentages, free throw differentials and much more are widely available and, most importantly, used and discussed in mainstream media.

This shows how football fans and the sport’s leadership have historically missed out on analytics, even as other sports have increasingly shown its potential to anticipate the outcome of investments in certain players. Altaie added that the new xG system allows underlying performance to be compared with the actual results seen with the naked eye, helping fans to comprehend the game. If used correctly, he said, it could predict future outcomes more accurately than any other measure.

The race for the Premier League’s golden boot, a trophy presented annually to the league’s top scorer, is currently led by five Premier League stars. Sergio Agüero holds a slim lead with 18 goals, while Mohamed Salah, Pierre-Emerick Aubameyang, Harry Kane and Sadio Mané follow, all on 17 goals. At the start of the season, this might have surprised a few people, particularly in the case of Mohamed Salah; however, Altaie argues that Salah’s performance since his “slow” start of just three goals by mid-October shows how AI can now be more trustworthy than the ‘gut instinct’ many armchair pundits still rely upon.

“When Mo Salah scored a record breaking 32 goals to become Premier League top scorer of the 2017/18 season, expectations were high for him... However, when he only scored 3 goals in 8 games… many described him as a ‘one-season wonder’... Using SAS for visualization in SAS Visual Analytics, we were able to look at the xG metric. Salah was leading the league with expected goals, yet he wasn’t anywhere near the top scorers, where 8 players had scored more than him, suggesting he was merely underperforming (or, unlucky) rather than in decline.”

[Figure: Jamie Vardy, goals vs xG, 2018/19]

Fast-forward four months and another 17 league games, and Salah is now one of four Premier League scorers on 17 goals. With Liverpool’s talisman now much more closely aligned with his expected goals, this suggests that, over the long run, underlying performance may well tell us more than raw output alone.

At the other end of the spectrum, Leicester City striker Jamie Vardy shows how xG can anticipate a drop-off in performance and so reveal the true value of recruiting a certain player to a team’s squad. Vardy finished the 2017/18 Premier League season with an impressive 20 goals, only four fewer than in the 2015/16 season when he famously helped Leicester win the league, but his numbers suggested this was unlikely to be sustainable the following season.

“[Last season, Vardy] outperformed his xG by nearly 5 goals!” explained Altaie. “xG suggests that a trend like this isn’t sustainable over a long period, despite consistently keeping this up through a 38-game season. If we have a look at this season’s numbers so far below, we see the reverse is true. So the effect of combining this with last season (and extending the sample size), is that we revert more to the norm of xG equals actual goals.”
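The arithmetic behind that reversion can be sketched as follows. Only the 20-goal total for 2017/18 comes from the article; the xG value of 15.2 is an assumption standing in for “nearly 5 goals” of overperformance, and the 2018/19 figures are placeholders, since the article does not report them.

```python
# Each entry: (season label, actual goals, expected goals).
# 20 goals is from the article; 15.2, 8 and 10.5 are illustrative
# placeholders, not reported figures.
seasons = [
    ("2017/18", 20, 15.2),
    ("2018/19 so far", 8, 10.5),
]


def goals_minus_xg(goals: float, xg: float) -> float:
    """Positive means overperforming xG, negative means underperforming."""
    return goals - xg


def goals_per_xg(goals: float, xg: float) -> float:
    """Ratio of actual to expected goals; 1.0 means exactly on expectation."""
    return goals / xg


# Per-season view: one season above expectation, the next below it.
per_season_ratio = {
    label: goals_per_xg(goals, xg) for label, goals, xg in seasons
}

# Combined view: pooling both seasons enlarges the sample.
total_goals = sum(goals for _, goals, _ in seasons)
total_xg = sum(xg for _, _, xg in seasons)
combined_diff = goals_minus_xg(total_goals, total_xg)
combined_ratio = goals_per_xg(total_goals, total_xg)

for label, ratio in per_season_ratio.items():
    print(f"{label}: goals/xG = {ratio:.2f}")
print(f"Combined: goals/xG = {combined_ratio:.2f}")
```

With these placeholder numbers, the pooled ratio sits closer to 1.0 than either season on its own, which is the pattern Altaie describes: a larger sample pulls actual goals back towards xG.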

Nobody’s perfect

It is important to note that the xG system still has blind spots. While it has the potential to open a whole new world where data can tell a story, beyond just ‘reporting’ on the game, some variables will always remain outside its reach. That means players can still buck the trend and either outperform or fall short of their xG predictions.

Altaie explained, “Scoring goals is a skill, and while one could rightfully argue that getting into scoring positions is more important than the finish, it’s still possible for one player to be better than others in putting the ball in the back of the net… Anyone who’s ever watched or played football knows that it is the most open game with more unique possibilities than any other sport (which could perhaps be a reason why it’s behind on ‘useful’ analytics compared to other team sports). It means only accommodating several variables does not tell the full story… and perhaps we will never be able to fully accurately calculate those.”

Despite these limitations, however, Altaie expects that xG will soon help football catch up with the rest of the sporting world in terms of analytics. He added that he believes it will allow teams and fans to scout up-and-coming talent and see whether they are overperforming or the real deal. In the future, he expects the analytics on offer to be reliable enough for coaches to alter tactics to focus on chance creation over the long run, as opposed to focusing purely on the next result.

The Data Scientist concluded, “Finally, it will allow us to open a new chapter in football analytics, where the use of AI can play a valuable role, as it’s already doing with SciSports. The Dutch sports analytics company gives a ‘quality’ rating to each individual player using model-based mathematical algorithms, based on the contribution of a player to the team’s result, all in real-time.”