A switching dynamic generalized linear model to detect abnormal performances in Major League Baseball


This paper develops a novel statistical method to detect abnormal performances in Major League Baseball. The career trajectory of each player’s yearly home run total is modeled as a dynamic process that randomly steps through a sequence of ability classes as the player ages. Performance levels associated with the ability classes are also modeled as dynamic processes that evolve with age. The resulting switching Dynamic Generalized Linear Model (sDGLM) models each player’s career trajectory by borrowing information over time across a player’s career and locally in time across all professional players under study. Potential structural breaks from the ability trajectory are indexed by a dynamically evolving binary status variable that flags unusually large changes to ability. We develop an efficient Markov chain Monte Carlo algorithm for Bayesian parameter estimation by augmenting a forward filtering backward sampling (FFBS) algorithm commonly used in dynamic linear models with a novel Polya-Gamma parameter expansion technique. We validate the model’s ability to detect abnormal performances by examining the career trajectories of several known PED users and by predicting home run totals for the 2006 season. The method is capable of identifying Alex Rodriguez, Barry Bonds and Mark McGwire as players whose performance increased abnormally, and the predictive performance is competitive with a Bayesian method developed by Jensen et al. (2009) and two other widely utilized forecasting systems.
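To make the forward filtering backward sampling step concrete, the following is a minimal sketch of FFBS for the simplest Gaussian dynamic linear model, a local-level model. It is not the paper's sDGLM: there is no switching between ability classes and no Pólya-Gamma augmentation here. Such augmentation is generally used to make a non-Gaussian likelihood conditionally Gaussian, which is what allows a Gaussian FFBS step of this kind to be reused inside a larger sampler. The function name `ffbs_local_level`, the variances `V` and `W`, the prior `m0`, `C0`, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def ffbs_local_level(y, V, W, m0=0.0, C0=1.0, rng=None):
    """Draw one state path from p(theta_1..T | y_1..T) for the model
    y_t = theta_t + v_t, v_t ~ N(0, V); theta_t = theta_{t-1} + w_t, w_t ~ N(0, W).
    Hypothetical helper for illustration only."""
    rng = np.random.default_rng() if rng is None else rng
    y = np.asarray(y, dtype=float)
    T = y.size
    a = np.empty(T)  # prior means      E[theta_t | y_1..t-1]
    R = np.empty(T)  # prior variances
    m = np.empty(T)  # filtered means   E[theta_t | y_1..t]
    C = np.empty(T)  # filtered variances

    # Forward filtering (Kalman recursions).
    m_prev, C_prev = m0, C0
    for t in range(T):
        a[t] = m_prev
        R[t] = C_prev + W
        gain = R[t] / (R[t] + V)
        m[t] = a[t] + gain * (y[t] - a[t])
        C[t] = R[t] * (1.0 - gain)
        m_prev, C_prev = m[t], C[t]

    # Backward sampling: draw theta_T, then theta_t given theta_{t+1}.
    theta = np.empty(T)
    theta[-1] = rng.normal(m[-1], np.sqrt(C[-1]))
    for t in range(T - 2, -1, -1):
        B = C[t] / R[t + 1]
        h = m[t] + B * (theta[t + 1] - a[t + 1])
        H = C[t] * (1.0 - B)
        theta[t] = rng.normal(h, np.sqrt(H))
    return theta

# Toy series (made-up numbers, not data from the paper).
y_toy = np.array([1.0, 1.4, 1.3, 2.1, 2.0, 1.6])
draw = ffbs_local_level(y_toy, V=0.5, W=0.1, rng=np.random.default_rng(0))
```

Calling the function repeatedly yields independent draws of the whole latent trajectory given fixed variances; in a full MCMC scheme those draws would alternate with updates of the remaining parameters.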
