This repository contains the code and data used for my Master's thesis, "Predicting the future performance of soccer players". The thesis investigates how various factors affect the prediction of the future performance of soccer (football) players, using statistical, machine learning, and deep learning models. The applied models significantly outperformed random and naive predictors and provide insight into which factors are predictive of future player performance.
The repository is organized into the following folders and files:
- code/data_collection: Contains code for collecting relevant player data.
- code/create_input: Contains code for enriching the dataset with additional player/match attributes.
- code/modeling: Contains code for data analysis, model development, and evaluation.
- dataset: Contains the final complete CSV file with all player data.
- thesis: Contains the complete dissertation (PDF) and a presentation that gives a quick overview.
- requirements: Contains a list of Python packages required to run the code, provided in both txt and yaml formats.
- instructions.md: Contains step-by-step instructions to reproduce the analysis and modeling results.
To reproduce the analysis and modeling results, please refer to the instructions provided in instructions.md.
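For a quick look at the data before working through the full instructions, the sketch below previews the CSV file in the dataset folder using only the Python standard library. It is an illustration rather than part of the thesis code: the function name preview_dataset is hypothetical, the CSV file name is not stated here (so the first .csv file found in the folder is used), and UTF-8 encoding with a header row is assumed.

```python
import csv
from pathlib import Path


def preview_dataset(folder="dataset", n_rows=5):
    """Return (column_names, first n_rows rows) of the first CSV file in folder.

    Hypothetical helper: the actual file name is not assumed, so the
    folder is searched for any *.csv file. UTF-8 encoding and a header
    row are assumptions.
    """
    csv_files = sorted(Path(folder).glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No CSV file found in {folder!r}")

    with csv_files[0].open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        # zip stops after n_rows, so the rest of the file is never read.
        rows = [row for _, row in zip(range(n_rows), reader)]
        return reader.fieldnames, rows
```

Calling preview_dataset("dataset") from the repository root returns the column names and the first few rows as dictionaries, which is enough to check that the file is present and readable before running the modeling code.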
If you have any questions or comments about this thesis or the code, please contact me.
This code is licensed under the Apache License. See the LICENSE file for more details.