Embedding Football Players: A Deep Learning Approach to Data-Driven Scouting
TRASFORINI, FRANCESCO MARIA
2023/2024
Abstract
The identification of talent is a central problem in football. Traditional scouting is prone to subjective bias and is constrained by limited observation time and geographical reach. Football clubs increasingly seek to overcome these limitations by adopting data-driven player recruitment strategies. This thesis develops a deep learning framework that embeds football players' statistical profiles into low-dimensional vectors. Leveraging event data from over 26,000 matches played across 16 leagues from 2020 to 2024, the methodology constructs heatmaps of player actions, segmented into key attributes such as passing, carrying, defending, and attacking. These heatmaps are processed by convolutional neural network (CNN) autoencoders to generate compact embeddings, which are subsequently aggregated into holistic player representations. This segmented approach improves interpretability and allows granular analysis of individual player abilities. The quality of the embeddings is evaluated through a Player Retrieval Task, which measures how reliably a query embedding retrieves another embedding of the same player. Performance on the task is assessed with metrics such as Mean Reciprocal Rank (MRR) and Top-k accuracy. Additionally, similarity searches and clustering reveal meaningful relationships between players and support role identification. Despite limitations such as reliance on event data and computational constraints, the findings highlight the potential of embedding techniques in football scouting. Future work could explore integrating tracking data and enhancing interpretability to further improve player evaluation.
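The Player Retrieval Task metrics named in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity measure or the query/gallery protocol, so cosine similarity, the two-set split, and every name below (`cosine`, `first_match_rank`, `retrieval_metrics`, the toy embeddings) are assumptions made for illustration only, not the thesis's implementation.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def first_match_rank(query_vec, query_id, gallery):
    """1-based rank of the first gallery item with the query's player id, or None."""
    ranked = sorted(gallery, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    for rank, (player_id, _) in enumerate(ranked, start=1):
        if player_id == query_id:
            return rank
    return None

def retrieval_metrics(queries, gallery, k=1):
    """Return (MRR, Top-k accuracy) over all queries; a miss contributes 0 to both."""
    ranks = [first_match_rank(vec, pid, gallery) for pid, vec in queries]
    mrr = sum(1.0 / r for r in ranks if r is not None) / len(ranks)
    top_k = sum(1 for r in ranks if r is not None and r <= k) / len(ranks)
    return mrr, top_k

# Hypothetical 2-D embeddings: (player id, vector).
gallery = [("A", [1.0, 0.0]), ("B", [0.0, 1.0]), ("C", [1.0, 1.0])]
queries = [("A", [0.9, 0.1]), ("B", [0.6, 0.5])]

mrr, top1 = retrieval_metrics(queries, gallery, k=1)
_, top3 = retrieval_metrics(queries, gallery, k=3)
print(f"MRR={mrr:.3f}  Top-1={top1:.2f}  Top-3={top3:.2f}")
```

In this toy case player A is retrieved at rank 1 and player B at rank 3, so MRR averages the reciprocal ranks 1 and 1/3, while Top-1 accuracy counts only the first query as a hit.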
File | Size | Format
---|---|---
Embedding Football Players - A Deep Learning Approach to Data-Driven Scouting.pdf (not available) | 8.42 MB | Adobe PDF
Documents in UNITESI are protected by copyright and all rights are reserved, unless otherwise indicated.
https://hdl.handle.net/20.500.14247/24428