4.1 Article

Predicting matches in international football tournaments with random forests

Journal

STATISTICAL MODELLING
Volume 18, Issue 5-6, Pages 460-482

Publisher

SAGE PUBLICATIONS LTD
DOI: 10.1177/1471082X18799934

Keywords

random forests; football; FIFA World Cups; Poisson regression; regularization

Many approaches that analyse and predict results of international football matches are based on statistical models incorporating several covariates that potentially influence a national team's success, such as the bookmakers' ratings or the FIFA ranking. Based on all matches from the four previous FIFA World Cups 2002-2014, we compare the predictive performance of the most common regression models that are based on the teams' covariate information with that of an alternative modelling class, the so-called random forests. Random forests can be seen as a mixture of machine learning and statistical modelling and are known for their high predictive power. Here, we consider two different types of random forests depending on the choice of response. One type predicts the precise numbers of goals, while the other considers the three match outcomes (win, draw and loss) using special algorithms for ordinal responses. To account for the specific data structure of football matches, in particular at FIFA World Cups, the random forest methods are slightly altered compared to their standard versions and adapted to the needs of the FIFA World Cup data.
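
The abstract distinguishes forests that predict goal counts from forests that predict the three match outcomes directly. As a rough illustration of how the two responses relate, the sketch below converts a pair of predicted expected goal counts into win, draw and loss probabilities. It assumes the two teams' goals follow independent Poisson distributions, which is an assumption made here for illustration and is not stated in the abstract. The function names, the example rates and the truncation at 15 goals are likewise hypothetical.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k goals under a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def outcome_probabilities(lam_a, lam_b, max_goals=15):
    """Return (win, draw, loss) probabilities for team A against team B.

    lam_a and lam_b are expected goal counts, e.g. the predictions of some
    goal-count model. Goals are ASSUMED independent and Poisson distributed;
    the score grid is truncated at max_goals and renormalised.
    """
    win = draw = loss = 0.0
    for goals_a in range(max_goals + 1):
        p_a = poisson_pmf(goals_a, lam_a)
        for goals_b in range(max_goals + 1):
            p = p_a * poisson_pmf(goals_b, lam_b)
            if goals_a > goals_b:
                win += p
            elif goals_a == goals_b:
                draw += p
            else:
                loss += p
    total = win + draw + loss  # slightly below 1 because of the truncation
    return win / total, draw / total, loss / total


# Hypothetical expected goals for two teams (not taken from the paper).
win, draw, loss = outcome_probabilities(1.8, 0.9)
print(f"win={win:.3f} draw={draw:.3f} loss={loss:.3f}")
```

Under these assumptions a higher expected goal count for team A always yields a win probability above its loss probability, and equal rates yield symmetric win and loss probabilities.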

Recommended

Article Statistics & Probability

A regularized hidden Markov model for analyzing the 'hot shoe' in football

Marius Oetting, Andreas Groll

Summary: The study presents a penalized likelihood approach for automated variable selection in hidden Markov models (HMMs), considering a large number of potentially correlated covariates. By quadratically approximating the non-differentiable penalty, the likelihood can be maximized numerically. The feasibility of the approach is assessed through simulation experiments and applied to investigate the 'hot shoe' effect in football penalty takers.

STATISTICAL MODELLING (2022)

Article Sport Sciences

The Interval Between Matches Significantly Influences Injury Risk in Field Hockey

Joel Mason, Anna Lina Rahlf, Andreas Groll, Kai Wellmann, Astrid Junge, Astrid Zech

Summary: The study found that in field hockey, a congested fixture schedule increases the risk of injuries. Matches played within 24 hours after a previous match showed significantly higher injury rates compared to matches played 3-7 days later, while higher match exposure in the preceding 7 and 14 days was associated with reduced injury rates.

INTERNATIONAL JOURNAL OF SPORTS MEDICINE (2022)

Editorial Material Orthopedics

Artificial intelligence and machine learning: an introduction for orthopaedic surgeons

R. Kyle Martin, Christophe Ley, Ayoosh Pareek, Andreas Groll, Thomas Tischer, Romain Seil

Summary: The application of artificial intelligence and machine learning in orthopaedic surgery is increasing rapidly, but the statistical jargon and techniques associated with AI may be unfamiliar to many clinicians. In order to bridge this knowledge gap and make these novel techniques more accessible to orthopaedic surgeons, we introduce the concepts of AI and machine learning and provide examples of their impact on clinical practice and patient care.

KNEE SURGERY SPORTS TRAUMATOLOGY ARTHROSCOPY (2022)

Article Statistics & Probability

Introducing LASSO-type penalisation to generalised joint regression modelling for count data

Hendrik Van der Wurp, Andreas Groll

Summary: In this work, we propose an extension of the versatile joint regression framework for bivariate count responses by incorporating an (adaptive) LASSO-type penalty. The method enables variable selection and guarantees shrinkage and sparsity, making it particularly useful in high-dimensional count response settings. The proposal's empirical performance is investigated in a simulation study and an application on FIFA World Cup football data.

ASTA-ADVANCES IN STATISTICAL ANALYSIS (2023)

Editorial Material Orthopedics

Machine learning and conventional statistics: making sense of the differences

Christophe Ley, R. Kyle Martin, Ayoosh Pareek, Andreas Groll, Romain Seil, Thomas Tischer

Summary: This editorial discusses the application of machine learning in orthopaedic surgery, addressing the differences between ML techniques and traditional statistics. It aims to familiarize readers with the new opportunities offered by the ML approach.

KNEE SURGERY SPORTS TRAUMATOLOGY ARTHROSCOPY (2022)

Article Health Care Sciences & Services

Can Machine Learning from Real-World Data Support Drug Treatment Decisions? A Prediction Modeling Case for Direct Oral Anticoagulants

Andreas D. Meid, Lucas Wirbka, Andreas Groll, Walter E. Haefeli

Summary: Estimating individual treatment effects can assist in reducing the impact of strokes, major bleeding events, and their composite through model-assisted recommendations. The study findings suggest that model-assisted recommendations may improve treatment decisions and decrease adverse outcomes.

MEDICAL DECISION MAKING (2022)

Article Computer Science, Information Systems

Predicting Hospital Readmissions from Health Insurance Claims Data: A Modeling Study Targeting Potentially Inappropriate Prescribing

Alexander Gerharz, Carmen Ruff, Lucas Wirbka, Felicitas Stoll, Walter E. Haefeli, Andreas Groll, Andreas D. Meid

Summary: This study developed prediction models for readmissions based on routine data, specifically focusing on potentially inappropriate prescribing (PIP). The results showed that PIP effectively predicted readmissions for most diseases, suggesting the possibility for interventions to improve modifiable risk factors.

METHODS OF INFORMATION IN MEDICINE (2022)

Review Hospitality, Leisure, Sport & Tourism

Sex differences in injury rates in team-sport athletes: A systematic review and meta-regression analysis

Astrid Zech, Karsten Hollander, Astrid Junge, Simon Steib, Andreas Groll, Jonas Heiner, Florian Nowak, Daniel Pfeiffer, Anna Lina Rahlf

Summary: A systematic review and meta-analysis comparing injury rates between female and male team-sport players found that male players had higher overall injury rates, while female players had a higher rate of anterior cruciate ligament injuries. No significant sex-specific differences were found for match, training, severe injuries, concussions, or ankle sprains.

JOURNAL OF SPORT AND HEALTH SCIENCE (2022)

Article Rehabilitation

Maximum isometric torque at individually-adjusted joint angles exceeds eccentric and concentric torque in lower extremity joint actions

Andreas Stotz, Ebrahem Maghames, Joel Mason, Andreas Groll, Astrid Zech

Summary: This study highlights the importance of optimal joint angles in isometric strength assessment. Isometric contractions at the strongest joint angles can produce higher muscle torques than eccentric contractions in the lower body.

BMC SPORTS SCIENCE MEDICINE AND REHABILITATION (2022)

Article Psychology, Multidisciplinary

Interactions of Scores Derived From Two Groups of Variables: Alternating Lasso Regularization Avoids Overfitting and Finds Interpretable Scores

Philipp Doebler, Anna Doebler, Philip Buczak, Andreas Groll

Summary: Regression models with interaction terms are commonly used to model moderated relationships. The hierarchical score model reduces the dimensionality of the interaction model and ensures interpretability. Regularization and a residualization procedure help avoid spurious interactions. The ALOA algorithm with lasso penalty is an interpretable statistical learning technique for moderation.

PSYCHOLOGICAL METHODS (2023)

Article Statistics & Probability

Editorial special issue: Statistics in sports

Andreas Groll, Dominik Liebl

Summary: The advances in data gathering technologies have led to a growing interest in the use of statistical analysis, predictions, and modeling techniques in sports. This special issue aims to foster the development of statistics and its applications in sports, addressing various statistical problems and investigating the impacts of the SARS-CoV-2 pandemic on the sports framework.

ASTA-ADVANCES IN STATISTICAL ANALYSIS (2023)

Article Sport Sciences

Predictive modeling of lower extremity injury risk in male elite youth soccer players using least absolute shrinkage and selection operator regression

Mathias Kolodziej, Andreas Groll, Kevin Nolte, Steffen Willwacher, Tobias Alt, Marcus Schmidt, Thomas Jaitner

Summary: The purpose of this study was to identify neuromuscular and biomechanical injury risk factors in elite youth soccer players and assess the predictive ability of a machine learning approach. Through various tests and measurements, it was found that knee extensor peak torque, hip transversal plane moment in the single-leg drop landing task, and center of pressure sway in the single-leg stance test are the three most important predictors for injury. However, the final model showed poor predictive performance and needs to be evaluated in larger samples.

SCANDINAVIAN JOURNAL OF MEDICINE & SCIENCE IN SPORTS (2023)

Review Health Care Sciences & Services

Regularization approaches in clinical biostatistics: A review of methods and their applications

Sarah Friedrich, Andreas Groll, Katja Ickstadt, Thomas Kneib, Markus Pauly, Joerg Rahnenfuhrer, Tim Friede

Summary: This article reviews regularization approaches in data science for overcoming overfitting and improving prediction, and discusses their limited application in medical research. The authors suggest increased use of regularization approaches in medicine, despite the added complexity they bring to analyses. Proper investments in computing facilities and educational resources can help overcome these challenges.

STATISTICAL METHODS IN MEDICAL RESEARCH (2023)

Article Mathematical & Computational Biology

A tree-based modeling approach for matched case-control studies

Gunther Schauberger, Luana Fiengo Tanaka, Moritz Berger

Summary: Conditional logistic regression (CLR) is the standard method for matched case-control studies, but it has limitations in including non-linear effects and interactions of confounding variables. A novel tree-based modeling method is proposed to address this issue and provide a flexible framework for a more complex confounding structure. The proposed machine learning model is fitted within the CLR framework, allowing for the consideration of matched strata. Simulation results demonstrate the effectiveness of the method, and it is applied to a cervical cancer case-control study for illustration.

STATISTICS IN MEDICINE (2023)

Article Cardiac & Cardiovascular Systems

Correlation of Walking Activity and Cardiac Hospitalizations in Coronary Patients for 1 Year Post Cardiac Rehabilitation: The More Steps, the Better!

Sinann Al Najem, Andreas Groll, Axel Schmermund, Bernd Nowak, Thomas Voigtlaender, Ulrike Kaltenbach, Peter Dohmann, Dietrich Andresen, Juergen Scharhag

Summary: The study found that the number of steps taken by cardiac patients post-rehabilitation is related to the risk of cardiac hospitalization, with increased walking activity reducing the risk. Patients with lower EF values had higher risks.

CLINICAL MEDICINE INSIGHTS-CARDIOLOGY (2022)
