indrag49/football-analysis-project
An Introduction to Football (Soccer) Data Analysis with Python

This repository is a hands-on introduction to football (soccer) data analysis for those who want to start working with football data and run their own analyses on it.

Summary:


This project introduces the following concepts:

  1. How to access open event data from the StatsBomb API using statsbombpy {Notebook},
  2. How to draw and visualize a soccer pitch using mplsoccer {Notebook},
  3. How to visualize a pass network map for a particular team in a particular game {Notebook},
  4. How to use the NetworkX module to analyse the pass network (e.g. finding the degree distribution of passes, clustering coefficient, centrality, etc.) {Notebook},
  5. How to apply computational geometry concepts such as convex hulls, Voronoi diagrams and Delaunay triangulations to understand and visualize football tracking data (using scipy.spatial and mplsoccer) {Notebook},
  6. How to analyse Expected Goals (xG) using open data from StatsBomb {Notebook},
  7. How to use radar charts to compare and evaluate players' per-90 stats using the soccerplots package {Notebook},
  8. How to use Linear Regression, with the help of the scikit-learn module, to model the relationship between goals scored and shots on goal {Notebook},
  9. How to use Elastic Net to find the relationship between the number of shots taken and the number of goals scored {Notebook},
  10. How to use Logistic Regression to predict whether a pass is successful or not (given some features of the pass) {Notebook},
  11. How to use a Decision Tree Classifier to build a model for predicting the outcome of a shot by a particular team {Notebook},
  12. How to use Random Forest to predict whether a pass is successful or not {Notebook},
  13. How to use a Naive Bayes Classifier to predict a pass outcome {Notebook},
  14. How to use K-means clustering to cluster shot outcomes for Barcelona in La Liga {Notebook}
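
To give a feel for the pass network idea in items 3 and 4, here is a minimal sketch in plain Python. It is not the code from the notebooks, which use NetworkX; the player labels and the pass list are made up for illustration, and the helper names `build_pass_network` and `weighted_degrees` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical completed passes as (passer, receiver) pairs. Real data would
# come from an event feed such as the StatsBomb open data.
passes = [
    ("A", "B"),
    ("A", "B"),
    ("B", "C"),
    ("C", "B"),
    ("B", "A"),
    ("A", "C"),
]


def build_pass_network(passes):
    """Count passes per (passer, receiver) pair, giving weighted directed edges."""
    return Counter(passes)


def weighted_degrees(edges):
    """Return (out_degree, in_degree): passes made and received per player."""
    out_degree = defaultdict(int)
    in_degree = defaultdict(int)
    for (passer, receiver), count in edges.items():
        out_degree[passer] += count
        in_degree[receiver] += count
    return dict(out_degree), dict(in_degree)


edges = build_pass_network(passes)
out_degree, in_degree = weighted_degrees(edges)

for player in sorted(out_degree):
    print(player, "made", out_degree[player], "and received", in_degree[player])
```

Each edge weight is the number of passes between a pair of players, and the weighted degrees show who distributes and who receives the ball most often. A graph library such as NetworkX builds on the same edge list to compute clustering coefficients and centrality measures.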

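The regression idea in item 8 can be sketched without any third-party package. The snippet below fits an ordinary least-squares line by hand and computes the Pearson correlation; it is only an illustration under assumed toy numbers, not the scikit-learn code from the notebook, and the names `fit_line` and `pearson_r` are hypothetical.

```python
import math

# Toy, made-up values for illustration only (not taken from the FBREF files):
# shots on goal and goals scored for five hypothetical teams.
shots_on_goal = [3, 5, 4, 7, 6]
goals = [1, 2, 1, 3, 3]


def fit_line(x, y):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def pearson_r(x, y):
    """Pearson correlation coefficient between x and y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / math.sqrt(sxx * syy)


slope, intercept = fit_line(shots_on_goal, goals)
r = pearson_r(shots_on_goal, goals)
print(f"goals ~ {slope:.2f} * shots_on_goal + {intercept:.2f} (r = {r:.2f})")
```

The slope estimates how many extra goals go with one extra shot on goal, while the correlation coefficient measures how tightly the points follow that line. The two are related but answer different questions.
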
References:


Resources that helped me start with football data analysis:

  1. The Friends of Tracking YouTube channel, usually hosted and maintained by Dr. David Sumpter,
  2. The book Soccermatics by Dr. Sumpter,
  3. The YouTube channel maintained by McKay Johns,
  4. The FC Python blog,
  5. Graph Theory and Complex Networks: An Introduction by Dr. Maarten van Steen,
  6. Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow: Concepts, Tools, and Techniques to Build Intelligent Systems by Aurélien Géron.

Miscellaneous data files:


  1. La Liga 2020-21 shot stats - Sheet1.csv, exported from FBref.
  2. 2020-21 La Liga player stats (per 90s) - Sheet1.csv, exported from FBref.
