Football, a breeding ground for machine learning
The football environment abounds in structured data (e.g. player statistics) and unstructured data (e.g. match videos), making it fertile ground for the development of machine learning solutions.
Machine learning is the use of computer programs that learn from experience (the data) to model, predict or control a situation. It is a subfield of artificial intelligence.
Machine learning is particularly effective for classification and prediction tasks.
Classification with machine learning
Let’s take the example of identifying the teams represented in the Premier League photo library. After a learning phase, a machine learning algorithm could classify each new photo submitted to it into one of the 20 club categories. During learning, the algorithm picks out the features that shirts of the same club share and the features that set each club apart. It can then identify which team shirt is in a new image. Having learned from examples, it can distinguish the Arsenal shirt from the Chelsea shirt. It can also tell the Liverpool shirt apart from the Manchester United shirt, even though both are red.
To manage the learning phase, a method must be chosen. The two best known are supervised learning and unsupervised learning:
- Supervised learning consists of putting a label on the data used by the algorithm to learn. This allows the categories sought to be predefined. We tell the algorithm explicitly that this shirt is a West Ham shirt, that shirt is an Aston Villa shirt, and so on.
- Unsupervised learning lets the algorithm identify and create its own categories based on the differences it finds.
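To make the supervised case concrete, here is a minimal sketch of a learning phase followed by a classification. It assumes, purely for illustration, that each photo has already been reduced to the dominant colour of the shirt as an RGB triple, and it uses a nearest-centroid rule, which is only one of many possible algorithms. The colour values and the names `labeled_photos`, `train_centroids` and `classify` are hypothetical. Colour alone would not separate two red shirts, so a real system would need richer features.

```python
from collections import defaultdict
from math import dist

# Hypothetical training data: (dominant shirt colour as RGB, club label).
# The values are illustrative, not measured from real photos.
labeled_photos = [
    ((239, 1, 7), "Arsenal"),
    ((230, 10, 20), "Arsenal"),
    ((3, 70, 148), "Chelsea"),
    ((10, 80, 160), "Chelsea"),
]


def train_centroids(samples):
    """Learning phase: average the feature vectors of each label."""
    grouped = defaultdict(list)
    for features, label in samples:
        grouped[label].append(features)
    return {
        label: tuple(sum(axis) / len(vectors) for axis in zip(*vectors))
        for label, vectors in grouped.items()
    }


def classify(centroids, features):
    """Return the label whose centroid is closest to the new photo."""
    return min(centroids, key=lambda label: dist(centroids[label], features))


centroids = train_centroids(labeled_photos)
print(classify(centroids, (225, 20, 30)))  # a new, unlabeled reddish shirt
```

Because every training photo carries a club label, the classifier can only ever answer with one of the predefined categories.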
In our case, supervised learning will give us 20 categories, one for each Premier League club. Unsupervised learning, on the other hand, will probably give us 40 categories, two per Premier League club, as it will not match the club’s home and away shirts. The choice of the machine learning method used is therefore important and depends on the case.
Here, supervised learning is the better fit. Unsupervised learning may be preferable in other cases, especially when we want to group data without pre-established categories.
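The home and away split described above can be reproduced with a small unsupervised sketch. It assumes the same simplified representation (one RGB colour per photo) and uses plain k-means with a deterministic farthest-point initialisation. The number of clusters `k` is chosen by hand here, which is itself an assumption, and the data and function names are hypothetical.

```python
from math import dist

# Hypothetical unlabeled photos: dominant shirt colour as RGB.
photos = [
    (239, 1, 7), (230, 10, 20),        # red shirts (one club, home)
    (250, 240, 90), (245, 235, 100),   # yellow shirts (same club, away)
    (3, 70, 148), (10, 80, 160),       # blue shirts (another club)
]


def init_centroids(points, k):
    """Start from the first point, then keep adding the farthest point."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist(p, c) for c in centroids))
        )
    return centroids


def k_means(points, k, iterations=10):
    """Return a cluster index for each point; no labels are used."""
    centroids = init_centroids(points, k)
    assignment = []
    for _ in range(iterations):
        # Assign each point to its nearest centroid.
        assignment = [
            min(range(k), key=lambda i: dist(p, centroids[i])) for p in points
        ]
        # Move each centroid to the mean of its members.
        for i in range(k):
            members = [p for p, a in zip(points, assignment) if a == i]
            if members:
                centroids[i] = tuple(
                    sum(axis) / len(members) for axis in zip(*members)
                )
    return assignment


clusters = k_means(photos, k=3)
print(clusters)
```

The red and yellow shirts land in different clusters even though, in this made-up example, they belong to the same club: the algorithm groups by visual similarity and knows nothing about clubs.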
There are many variations on these types of learning, such as semi-supervised learning. If we have a training dataset of 1000 photos, putting a label on each one is a tedious task. We can limit ourselves to putting a label on 10 photos per club, for both home and away shirts. The algorithm will then be able to label the other photos on its own with the semi-supervised method.
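One simple way to picture the semi-supervised idea is self-training with a nearest-neighbour rule: start from a few hand-labeled photos, then repeatedly give the closest unlabeled photo the label of its nearest labeled neighbour. This is a sketch of one possible technique, not necessarily what a production system would use. The RGB values and the names `labeled`, `unlabeled` and `self_train` are hypothetical.

```python
from math import dist

# A few hand-labeled photos (dominant shirt colour as RGB -> club).
labeled = {
    (239, 1, 7): "Arsenal",      # home shirt
    (250, 240, 90): "Arsenal",   # away shirt
    (3, 70, 148): "Chelsea",     # home shirt
}

# The rest of the training set has no labels yet.
unlabeled = [
    (200, 30, 40),
    (225, 15, 20),
    (240, 230, 110),
    (20, 90, 170),
    (40, 100, 180),
]


def self_train(labeled, unlabeled):
    """Label the remaining photos one by one, most confident first."""
    labeled = dict(labeled)      # work on copies, leave the inputs intact
    remaining = list(unlabeled)
    while remaining:
        # Find the unlabeled photo that sits closest to any labeled photo.
        photo, neighbour = min(
            ((u, l) for u in remaining for l in labeled),
            key=lambda pair: dist(*pair),
        )
        labeled[photo] = labeled[neighbour]   # adopt the neighbour's label
        remaining.remove(photo)
    return labeled


result = self_train(labeled, unlabeled)
for photo in unlabeled:
    print(photo, result[photo])
```

Because the home and away examples of a club share the same label from the start, the newly labeled photos are attached to the club rather than to a colour group.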
Prediction with machine learning
Machine learning is particularly useful for predictive tasks. For a football club, a predictive model could be applied to the club’s official online store. We aim to find, at the time of purchase, the best product to offer the customer. This should boost satisfaction and increase the average basket size. For example, if a customer buys tickets for the first time on the site, a club scarf can be offered for purchase.
The algorithm will learn from some of the store’s historical data. For example, this data could include:
- Day and time of order
- Proximity of the order to a home match
- Location, age and gender of customer
- Initial basket value
- Items in basket
Once this data has been analyzed, a model can be fitted to it. For each new order, the model then predicts the most suitable product to suggest to the customer.
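As a rough illustration of learning from order history, the sketch below counts which add-on customers accepted most often for each basket item and suggests the most frequent one. It deliberately uses only the "items in basket" signal; a real model would also draw on order time, match proximity, customer profile and basket value. The order history, product names and the functions `train` and `suggest` are all made up for the example.

```python
from collections import Counter, defaultdict

# Hypothetical order history: (items in basket, add-on the customer accepted).
history = [
    ({"match ticket"}, "scarf"),
    ({"match ticket"}, "scarf"),
    ({"match ticket"}, "programme"),
    ({"home shirt"}, "shorts"),
    ({"home shirt", "match ticket"}, "scarf"),
    ({"home shirt"}, "name printing"),
    ({"home shirt"}, "shorts"),
]


def train(history):
    """Count, for each basket item, how often each add-on was accepted."""
    counts = defaultdict(Counter)
    for basket, add_on in history:
        for item in basket:
            counts[item][add_on] += 1
    return counts


def suggest(counts, basket):
    """Suggest the add-on most often accepted with these basket items."""
    scores = Counter()
    for item in basket:
        scores.update(counts.get(item, {}))
    for item in basket:
        scores.pop(item, None)   # never suggest something already in the basket
    return scores.most_common(1)[0][0] if scores else None


model = train(history)
print(suggest(model, {"match ticket"}))
print(suggest(model, {"home shirt"}))
```

When the basket contains only items the model has never seen, `suggest` returns `None`, and the store could fall back to a default offer.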
Find out more in this article: Liverpool’s use of AI to optimise its corners.
The image illustrating this article was created using an artificial intelligence image generator called Leonardo.ai. This application does not specialise in football-related content, yet it quickly produces interesting results.