The data revolution in football

Data analysis is now widely used to improve the performance of football clubs, and there are many success stories. Brentford rose from League One in 2014 to become a solid mid-table Premier League side. Similarly, Toulouse FC won promotion to Ligue 1 and then the 2023 Coupe de France with a squad recruited largely through data. Access to data has become a strategic issue in football. Several providers specialize in this sector, including Wyscout, one of the leaders in the field and the subject of this article.

Wyscout, a rich, international database

Wyscout is an Italian company specializing in football analysis, founded in Genoa in 2004. The solution offers video and data analysis tools to improve recruitment and match analysis. The company provides in-depth data on player performance, team tactics and match patterns for coaches, teams and players. 

It is one of the largest libraries of football videos and data in the world, covering over 600 international competitions. Australia’s A-League, Mexico’s Liga MX, Brazil’s Serie A and Germany’s Bundesliga are all covered for example. A large number of youth tournaments are also available.

Wyscout makes it possible to scout players remotely. It is particularly useful for clubs in regions that are ‘isolated from a football point of view’. Urawa Reds, the Japanese J-League side that won the 2022 AFC Champions League, use Wyscout extensively: the tool lets them scout widely in Europe and South America without leaving Japan.

Videos of players performing specific actions are available on demand. It is possible, for example, to search for all the tackles made by a defender over a given period. By combining data and video, scouting can be carried out entirely remotely, a practice that was further boosted by the COVID pandemic. The savings in time and resources are considerable.

Wolverhampton Wanderers make extensive use of Wyscout for their scouting, as described in the video at the end of this article. The club has become much more efficient in its scouting, and the solution brings three benefits in particular:

  • Saves time: An initial analysis on Wyscout makes it easy to draw up a first shortlist of players to monitor and to avoid unnecessary travel.
  • Saves money: With much of the work done remotely, there is no longer any need to send a recruiter to another country. The video uses the example of a youth competition in Brazil. With Wyscout, the club can analyze every match in the competition without paying the costs of sending a group of scouts.
  • Adds information: With Wyscout’s video and statistical support, recruiters can draw up a complete profile of a player, backed by video, and present it to the club’s coach or management. This was not possible before, and it gives much better information to the people who usually make the final decision, namely the management.

These gains are particularly valuable for smaller clubs. Before this type of solution existed, they were tightly limited by their resources and budget. Now they can do far more with a smaller staff.

The importance of data for analysis

The data is precise and extensive, with over 100 data points collected per player for each match, including:

  • Successful passes – Progressive passes – Passes in the final third 
  • Time-saving fouls – Protest fouls – Simulation fouls – Violent fouls

This level of detail is used to refine scouting. A player who has committed 10 time-saving fouls must be distinguished from a player who has committed 10 violent fouls. Similarly, a very high pass completion rate is more meaningful when read alongside the number of passes played in the final third: a high rate built on safe passes far from goal says little, while the same rate with many final-third passes points to a player who takes risks and succeeds. Without this breakdown of passes, there is no way of knowing a player’s tendency to take risks or the verticality of his game.

Below is an example of the data presented for Real Madrid right-back Dani Carvajal during his match against Manchester City in 2020 (data taken from the Wyscout website).

Carvajal Wyscout data

From all these events, metrics are calculated for each player, in particular expected goals (xG) and expected assists (xA). These metrics no longer record events that occurred; they estimate how many times an event should have happened given the player’s actions. Expected goals are based on a pre-shot model that estimates the probability of a shot resulting in a goal from factors such as the distance and angle of the shot relative to the goal, the type of pass that preceded it and the pressure from the opposition. For example, a penalty kick has an xG of 0.76, meaning that it is estimated to have a 76% chance of being scored.
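To make the idea of a pre-shot model concrete, here is a minimal sketch of a logistic model using only two of the factors mentioned above, distance and angle. The function name `shot_xg`, the coordinate convention and all three coefficients are assumptions made up for illustration; they are not fitted to any data and are not Wyscout’s model. Only the 0.76 penalty value comes from the text.

```python
import math

GOAL_WIDTH_M = 7.32   # distance between the posts
PENALTY_XG = 0.76     # value quoted in the article for a penalty kick

# Hypothetical coefficients, chosen only for illustration (not fitted to data).
INTERCEPT = -1.0
DISTANCE_COEF = -0.1  # per metre: farther shots are less likely to score
ANGLE_COEF = 1.5      # per radian: a wider view of the goal helps


def shot_xg(x: float, y: float, is_penalty: bool = False) -> float:
    """Estimate the probability that a shot results in a goal.

    x: distance from the goal line in metres (x > 0)
    y: sideways offset from the centre of the goal in metres
    """
    if is_penalty:
        return PENALTY_XG
    distance = math.hypot(x, y)
    half_width = GOAL_WIDTH_M / 2
    # Angle between the two posts as seen from the shot location.
    angle = math.atan2(GOAL_WIDTH_M * x, x * x + y * y - half_width ** 2)
    z = INTERCEPT + DISTANCE_COEF * distance + ANGLE_COEF * angle
    # The logistic function maps the score to a probability between 0 and 1.
    return 1.0 / (1.0 + math.exp(-z))


print(f"central shot from 6 m:  {shot_xg(6, 0):.2f}")
print(f"central shot from 25 m: {shot_xg(25, 0):.2f}")
print(f"wide shot (11 m, 15 m): {shot_xg(11, 15):.2f}")
```

A real model would be trained on many thousands of historical shots and would include further features such as the type of pass and the defensive pressure. The sketch only shows the shape of the calculation: shot features go in, a probability comes out, and a player’s xG is the sum of those probabilities over his shots.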

This type of metric gained popularity in the 2010s because it offers a different view of reality while remaining highly correlated with it. It is particularly useful for filtering out randomness, which is significant in football, a low-scoring sport compared with basketball or handball, for example. On 14 September 2024, Nottingham Forest won 1-0 away to Liverpool at Anfield. The expected goals were 1.17 for Liverpool against 0.59 for Forest. Based on this statistic, Liverpool can blame their forwards’ lack of efficiency in front of goal. According to Understat, if the match were replayed a large number of times, Liverpool would win 52% of the time and lose 15% of the time. Liverpool’s tactics should therefore be called into question only to a limited extent after this match, as the ingredients for victory were, on the whole, in place.
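The idea of replaying a match a large number of times can be sketched as a simple Monte Carlo simulation in which each shot is treated as an independent coin flip weighted by its xG. The shot-by-shot values below are invented for illustration and merely add up to the totals quoted above (1.17 and 0.59). The function name `simulate_match` is also an assumption, and this is not Understat’s method, so the output should not be expected to reproduce its 52% and 15% figures.

```python
import random

# Hypothetical shot lists: invented values that only sum to the quoted totals.
LIVERPOOL_SHOTS = [0.35, 0.20, 0.15, 0.12, 0.10, 0.08, 0.07, 0.05, 0.05]  # 1.17 xG
FOREST_SHOTS = [0.25, 0.15, 0.10, 0.05, 0.04]                             # 0.59 xG


def simulate_match(home_shots, away_shots, n_sims=20_000, seed=0):
    """Replay a match n_sims times and return the share of each outcome."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    counts = {"home_win": 0, "draw": 0, "away_win": 0}
    for _ in range(n_sims):
        # Each shot becomes a goal with probability equal to its xG.
        home_goals = sum(rng.random() < p for p in home_shots)
        away_goals = sum(rng.random() < p for p in away_shots)
        if home_goals > away_goals:
            counts["home_win"] += 1
        elif home_goals < away_goals:
            counts["away_win"] += 1
        else:
            counts["draw"] += 1
    return {outcome: n / n_sims for outcome, n in counts.items()}


result = simulate_match(LIVERPOOL_SHOTS, FOREST_SHOTS)
for outcome, share in result.items():
    print(f"{outcome}: {share:.1%}")
```

The outcome shares depend on how the xG is spread across shots as well as on the totals: a few big chances and many small ones with the same total give different win probabilities, which is one reason different providers publish different figures for the same match.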

The same logic applies to individual players: goalscorers can over- or under-perform their expected goals. During the 2022-2023 Ligue 1 season, for example, two strikers broke through: Elye Wahi with 19 goals and Folarin Balogun with 21. However, Wahi accumulated 14.5 xG with Montpellier, compared with 26 xG for Balogun with Reims. Wahi was the more efficient finisher over the season, scoring 4.5 goals more than expected while Balogun scored 5 fewer, but he also had far fewer chances. Depending on a team’s needs, it might be preferable to recruit one or the other.
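The comparison between the two strikers comes down to simple arithmetic on the figures quoted above, shown here as a short sketch. The helper names `xg_overperformance` and `finishing_ratio` are made up for this example and are not Wyscout metrics.

```python
def xg_overperformance(goals: float, xg: float) -> float:
    """Goals scored above (positive) or below (negative) expectation."""
    return goals - xg


def finishing_ratio(goals: float, xg: float) -> float:
    """Goals scored per expected goal; above 1 means over-performance."""
    return goals / xg


# Goals and xG for the 2022-2023 Ligue 1 season, as quoted in the article.
players = {
    "Elye Wahi": (19, 14.5),
    "Folarin Balogun": (21, 26.0),
}

for name, (goals, xg) in players.items():
    diff = xg_overperformance(goals, xg)
    ratio = finishing_ratio(goals, xg)
    print(f"{name}: {diff:+.1f} goals vs xG, ratio {ratio:.2f}")
# Elye Wahi: +4.5 goals vs xG, ratio 1.31
# Folarin Balogun: -5.0 goals vs xG, ratio 0.81
```

The difference measures finishing over the season, while the raw xG total measures how many good chances the player reached. A recruiter would normally look at both over several seasons, since finishing over-performance in a single season can owe a lot to chance.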

So we see that data can bring a new vision to football. It is now essential for making the best possible decisions.

The limits of scouting with data

Although data is very important, there are limits to its use. First of all, certain things can only be observed in person, such as a player’s behavior off the pitch or his attitude on the bench. Secondly, remote scouting can be dehumanizing. Scouting is the basis of recruitment: it is a meeting between a player and a club through its representatives. This part of the process is crucial for gaining a better understanding of a player’s environment, for example by meeting those close to him. It is also easier to convince a player to sign if he has already met members of the club. Finally, the international dimension and the risk of uprooting a player should be taken into account. With this type of solution, it is easy to forget how different a player’s current environment is from that of the club that wants him. With players being recruited at an increasingly young age, it is important to put an appropriate acclimatization process in place.

Artificial intelligence, the next step?

Wyscout improves scouting in a number of ways, and databases this comprehensive also make it possible to develop artificial intelligence models. In a way, certain statistics such as expected goals are already a form of predictive AI. Solutions of this type already exist, such as Sentients Sports, which uses this kind of data to calculate scores for tactical compatibility, chemistry with other players or potential.

Below is an interview with John Marshall, Head of Recruitment at Wolves, on the use of Wyscout at his club.

To find out more about the use of AI in tactics, read our article on the collaboration between TacticAI and Liverpool.

If you have any questions about the use of data or AI in football, please contact our team.
