Abstract: In this paper we show how machine learning can be applied to generate a model that could lead to better on-field decisions by predicting a pitcher’s performance in the next inning. Specifically, we show how to use regularized linear regression to learn pitcher-specific predictive models that estimate whether a starting pitcher will surrender a run if allowed to start the next inning.
For each season, we trained on the first 80% of the games and tested on the remaining 20%. The results suggest that using our model would frequently lead to different decisions late in games than those made by major league managers. There is no way to evaluate what would have happened when a manager lifted a pitcher that our model would have allowed to continue. However, from the 5th inning on in close games, in those games in which a manager left in a pitcher that our model would have removed, the pitcher surrendered at least one run in that inning 60% of the time (compared to 43% overall).
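
The setup summarized above can be illustrated with a minimal sketch. It is not the implementation used in the paper: the function names, the gradient-descent solver, the L2 (ridge) form of the penalty, and the 0.5 decision threshold are all assumptions made for illustration, and the feature vectors stand in for whatever per-inning pitcher statistics a real model would use.

```python
from typing import List, Sequence, Tuple


def chronological_split(games: Sequence, train_frac: float = 0.8) -> Tuple[list, list]:
    """Split one season's games (already sorted by date) into train and test sets."""
    cut = int(len(games) * train_frac)
    return list(games[:cut]), list(games[cut:])


def predict(x: Sequence[float], w: Sequence[float], b: float) -> float:
    """Linear prediction: intercept plus weighted sum of the features."""
    return b + sum(wj * xj for wj, xj in zip(w, x))


def fit_ridge(
    X: Sequence[Sequence[float]],
    y: Sequence[float],
    lam: float = 1.0,      # L2 penalty strength (hypothetical default)
    lr: float = 0.01,      # gradient-descent step size (hypothetical default)
    epochs: int = 2000,
) -> Tuple[List[float], float]:
    """Fit L2-regularized linear regression by batch gradient descent.

    Minimizes (1/2n) * sum(err^2) + (lam/2n) * sum(w^2).
    The intercept b is not penalized.
    """
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            err = predict(xi, w, b) - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        # Update each weight with the data gradient plus the ridge term.
        for j in range(d):
            w[j] -= lr * (grad_w[j] + lam * w[j]) / n
        b -= lr * grad_b / n
    return w, b


def should_remove(
    x: Sequence[float], w: Sequence[float], b: float, threshold: float = 0.5
) -> bool:
    """Assumed decision rule: pull the pitcher if the predicted value exceeds threshold."""
    return predict(x, w, b) > threshold
```

In this sketch, one model would be fit per pitcher on the training portion returned by `chronological_split`, and `should_remove` would then be compared against the manager's actual decision on the held-out games. Splitting by date rather than at random keeps later games out of the training set.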