Olympic Medal Prediction Based on Multi-Task Hybrid Modelling
DOI: 10.25236/iiicec.2025.012
Author(s)
Kaibing Yang, Jiahao Zhang, Yuefeng Chen
Corresponding Author
Kaibing Yang
Abstract
With global investment in elite sport increasing, accurate forecasting of Olympic medal outcomes has become an important area of research. This study develops a hybrid prediction framework for estimating medal counts at the 2028 Los Angeles Olympics. Combining statistical techniques with machine learning models, the framework addresses four tasks: predicting overall medal counts, identifying the first medal-winning country, assessing event-specific contributions to medal totals, and evaluating the impact of elite coaching. For medal count forecasting, we construct a stacked ensemble integrating Elastic Net Regression (ENR), XGBoost, LightGBM, and CatBoost, with clustering and multi-criteria decision analysis used to enhance feature representation. The ensemble achieves a mean squared error of 1 and an R² of 0.963, and projects the U.S. to lead with 45 gold medals and 132 medals in total. A two-stage random forest model is used to predict the first medal-winning country and suggests Luxembourg as a top contender. Gray relational analysis reveals strong positive correlations among the number of events, the number of participating nations, and medal counts, while synthetic control methods confirm the significant impact of top-tier coaching on national performance. This integrated approach improves predictive accuracy and offers actionable insights to help national Olympic committees optimize resource allocation and strategic planning. The study underscores the importance of combining data-driven modeling with domain-specific knowledge in complex, high-stakes forecasting tasks.
Keywords
Olympic Medal Prediction, Hybrid Model, Elastic Net Regression, XGBoost, Random Forest
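As an illustration of the gray relational analysis mentioned in the abstract, the following is a minimal sketch of Deng's gray relational grade in plain Python. It is not the authors' implementation: the function name `grey_relational_grades`, the mean-normalization step, the distinguishing coefficient `rho = 0.5`, and the toy series are all assumptions chosen for illustration, and the numbers are not data from the study.

```python
def grey_relational_grades(reference, factors, rho=0.5):
    """Deng's gray relational grade of each factor series against a reference.

    reference: list of numbers (e.g. medal counts per Games; hypothetical here).
    factors:   dict mapping a factor name to a list of the same length.
    rho:       distinguishing coefficient, conventionally 0.5 (an assumption).
    """
    def normalize(series):
        # Mean-normalize so that series on different scales are comparable.
        mean = sum(series) / len(series)
        return [value / mean for value in series]

    ref = normalize(reference)
    normalized = {name: normalize(series) for name, series in factors.items()}

    # Absolute difference between the reference and each factor at every point.
    deltas = {
        name: [abs(r - v) for r, v in zip(ref, series)]
        for name, series in normalized.items()
    }
    all_deltas = [d for series in deltas.values() for d in series]
    d_min, d_max = min(all_deltas), max(all_deltas)
    if d_max == 0:
        # Every factor matches the reference exactly after normalization.
        return {name: 1.0 for name in deltas}

    grades = {}
    for name, series in deltas.items():
        # Gray relational coefficient at each point, then averaged into a grade.
        coeffs = [(d_min + rho * d_max) / (d + rho * d_max) for d in series]
        grades[name] = sum(coeffs) / len(coeffs)
    return grades


# Toy, made-up series (not from the paper): a grade closer to 1 means the
# factor's shape tracks the reference more closely.
toy_medals = [1, 2, 3]
toy_factors = {"events": [2, 4, 6], "other": [3, 2, 1]}
grades = grey_relational_grades(toy_medals, toy_factors)
```

In this toy case, `events` is proportional to the reference, so its grade is 1, while `other` moves in the opposite direction and receives a lower grade. Ranking factors by grade is how the technique is typically used to compare their association with the reference series.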