The origins of goals in the German Bundesliga

J Sports Sci. 2021 Nov;39(22):2525-2544. doi: 10.1080/02640414.2021.1943981. Epub 2021 Jul 25.

Abstract

We propose to analyse the origin of goals in professional football (soccer) in a purely data-driven approach. Based on positional and event data of 3,457 goals from two seasons of the German Bundesliga and 2nd Bundesliga (2018/2019 and 2019/2020), we devise a rich set of 37 features that can be extracted automatically and propose a hierarchical clustering approach to identify group structures. The results consist of 50 interpretable clusters revealing insights into scoring patterns. The hierarchical clustering found eight stand-alone clusters (penalties, direct free kicks, kick and rush, one-twos, assisted by header, assisted by throw-in) and nine categories (e.g., corners) combining more granular patterns (e.g., five subcategories of corner goals). We provide a thorough discussion of the clustering and show its relevance for practical applications in opponent analysis, player scouting and for long-term investigations. All stages of this work have been supported by professional analysts from clubs and federation.
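
As a rough illustration of how hierarchical clustering can yield both broad categories and finer subcategories, the following minimal sketch runs a generic agglomerative clustering with average linkage on toy data. It is not the authors' method: the abstract does not state the linkage criterion, the distance measure, or the 37 features, so the three features used here (set-piece flag, scaled pass count, scaled shot distance), the six example goals, and the function name `agglomerative` are all hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def agglomerative(points, n_clusters):
    """Repeatedly merge the two closest clusters until n_clusters remain.

    Closeness is average linkage: the mean pairwise Euclidean distance
    between the members of two clusters. This choice is an assumption
    made for illustration only.
    """
    clusters = [[i] for i in range(len(points))]

    def avg_link(a, b):
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None  # (distance, index of first cluster, index of second)
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = avg_link(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical feature vectors, one per goal:
# (set-piece flag, scaled number of passes, scaled shot distance)
goals = [
    (1.0, 0.00, 0.11),  # penalty-like
    (1.0, 0.00, 0.11),  # penalty-like
    (1.0, 0.10, 0.06),  # corner-like
    (1.0, 0.10, 0.08),  # corner-like
    (0.0, 0.80, 0.15),  # open play
    (0.0, 0.75, 0.18),  # open play
]

fine = agglomerative(goals, 3)    # granular patterns
coarse = agglomerative(goals, 2)  # broader categories
print(fine)
print(coarse)
```

Cutting the same hierarchy at different levels is what allows a broad category (here, set pieces) to be split into more granular patterns (penalty-like and corner-like goals), in the spirit of the categories and subcategories described above.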

Keywords: Hierarchical Clustering; Professional football (Soccer); Sports analytics; Tactical Analysis.

MeSH terms

  • Seasons
  • Soccer*
  • Standing Position