A Data-driven Method for In-game Decision Making in MLB

In this work, we show how machine learning can be used to build a model that could lead to better on-field decisions by predicting a pitcher’s performance in the next inning. Specifically, we show how to use multi-task machine learning to build pitcher-specific predictive models that estimate whether a starting pitcher will surrender a run if allowed to start the next inning.
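To make the idea of pitcher-specific models trained jointly more concrete, here is a minimal sketch of one common multi-task formulation: each pitcher's weights are the sum of a shared component and a pitcher-specific component, fit together with L2-regularized logistic regression. This summary does not describe the paper's actual features, learner, or regularization, so everything here is an assumption for illustration: the two features (a bias term and a scaled pitch count), the toy data, and the helper names `augment`, `fit`, and `predict` are all hypothetical.

```python
import math

def augment(x, pitcher, n_pitchers):
    """Copy x into a shared slot and into the slot owned by this pitcher.

    The resulting vector is [x, 0, ..., x, ..., 0], so one weight vector
    holds shared weights followed by per-pitcher weights.
    """
    d = len(x)
    z = list(x) + [0.0] * (d * n_pitchers)
    start = d * (pitcher + 1)
    z[start:start + d] = x
    return z

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def fit(rows, n_pitchers, l2=0.01, lr=0.5, steps=500):
    """Gradient descent on L2-regularized logistic loss.

    rows: list of (features, pitcher_index, gave_up_run) tuples.
    """
    data = [(augment(x, p, n_pitchers), y) for x, p, y in rows]
    w = [0.0] * len(data[0][0])
    for _ in range(steps):
        grad = [l2 * wj for wj in w]
        for z, y in data:
            err = sigmoid(sum(wj * zj for wj, zj in zip(w, z))) - y
            for j, zj in enumerate(z):
                grad[j] += err * zj / len(data)
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w

def predict(w, x, pitcher, n_pitchers):
    """Estimated probability that this pitcher gives up a run next inning."""
    z = augment(x, pitcher, n_pitchers)
    return sigmoid(sum(wj * zj for wj, zj in zip(w, z)))

# Toy data (invented): features are [bias, scaled pitch count].
# Pitcher 0 gives up runs at high pitch counts; pitcher 1 does not.
rows = [
    ([1.0, 0.2], 0, 0), ([1.0, 0.4], 0, 0), ([1.0, 0.8], 0, 1), ([1.0, 1.0], 0, 1),
    ([1.0, 0.2], 1, 0), ([1.0, 0.4], 1, 0), ([1.0, 0.8], 1, 0), ([1.0, 1.0], 1, 0),
]
w = fit(rows, n_pitchers=2)
p_a = predict(w, [1.0, 1.0], 0, 2)  # pitcher 0 at a high pitch count
p_b = predict(w, [1.0, 1.0], 1, 2)  # pitcher 1 at the same pitch count
print(f"pitcher 0: {p_a:.2f}, pitcher 1: {p_b:.2f}")
```

In this formulation the shared weights let pitchers with few innings borrow strength from the whole league, while the pitcher-specific weights capture individual tendencies. A decision rule would then compare the predicted probability against a threshold, which is also an assumption rather than something stated here.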

The results suggest that using our model would frequently lead to different late-game decisions than those made by major league managers. In close games from the 5th inning on, when a manager left in a pitcher whom our model would have removed, the pitcher surrendered at least one run in that inning 60% of the time, compared to 43% overall.

We also look at the model’s predictions for the Red Sox in the 2013 postseason. Red Sox starters pitched 96 innings, of which 33 were beyond the 4th inning. In 24 of those innings, our model would have agreed with the manager’s decision to keep the starter in, and the starter gave up a run in 3 (12.5%) of them. In the other 9 innings, the manager kept the starter in but our model would not have, and the starter gave up a run in 5 (56%) of them.
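The postseason counts above can be checked with a few lines of arithmetic. The snippet below only restates the numbers given in the text; the variable names are ours.

```python
# Counts reported for Red Sox starters in the 2013 postseason.
innings_beyond_4th = 33
agreed_keep, agreed_runs = 24, 3        # model agreed with keeping the starter in
disagreed_keep, disagreed_runs = 9, 5   # model would have removed the starter

# The two groups account for every inning beyond the 4th.
assert agreed_keep + disagreed_keep == innings_beyond_4th

agree_rate = agreed_runs / agreed_keep
disagree_rate = disagreed_runs / disagreed_keep

print(f"Run allowed when model agreed:    {agree_rate:.1%}")     # 12.5%
print(f"Run allowed when model disagreed: {disagree_rate:.1%}")  # 55.6%
```

With only 9 and 24 innings in the two groups, these rates are a small-sample illustration rather than a precise estimate.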
