If a batter can correctly anticipate the next pitch type, he is in a better position to attack it. That is why batteries worry about having their signs stolen or becoming too predictable in their pitch selection. In this paper, we present a machine-learning-based predictor of the next pitch type. The predictor uses information that is available to a batter, such as the count, the current game state, and the pitcher’s tendency to throw a particular type of pitch. We use a linear support vector machine with a soft margin to build a separate predictor for each pitcher, and we use the weights of the linear classifier to interpret the importance of each feature. We evaluated our method on the STATS Inc. pitch dataset, which contains a record of each pitch thrown in both the regular season and the postseason. Our classifiers predict the next pitch more accurately than a naïve classifier that always predicts the pitch most commonly thrown by that pitcher. When trained on data from 2008 and tested on data from 2009, our classifiers provided a mean improvement of 12.5% and a maximum improvement of 50% in predicting fastballs. The most useful features for predicting the next pitch were the Pitcher/Batter prior, the Pitcher/Count prior, the previous pitch, and the score of the game.
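To make the setup concrete, the following is a minimal sketch of the comparison described above: a soft-margin linear SVM trained for a single pitcher, scored against the naïve most-common-pitch classifier, with the learned weights ranked by magnitude. It is not the authors' implementation. The pitch records are invented toy data, the two feature names (`hitters_count`, `prev_pitch_fastball`) are hypothetical stand-ins for the paper's features, and the hand-written subgradient trainer is an assumption chosen only to keep the example dependency-free.

```python
from collections import Counter

# Toy pitch records for ONE hypothetical pitcher (invented for illustration).
# Features: [hitters_count, prev_pitch_fastball], each encoded as +1 or -1.
# Label: +1 = fastball, -1 = any other pitch.
TRAIN = [
    ([1, 1], 1), ([1, 1], 1), ([1, 1], 1),
    ([1, -1], 1), ([1, -1], 1), ([1, -1], 1),
    ([-1, 1], -1), ([-1, 1], -1),
    ([-1, -1], -1), ([-1, -1], -1),
]
TEST = [([1, 1], 1), ([1, -1], 1), ([-1, -1], -1), ([1, -1], 1), ([-1, 1], -1)]
FEATURE_NAMES = ["hitters_count", "prev_pitch_fastball"]  # hypothetical names


def train_linear_svm(data, C=1.0, lr=0.05, epochs=100):
    """Soft-margin linear SVM fitted by per-sample subgradient descent on
    0.5 * ||w||^2 + C * sum(max(0, 1 - y * (w.x + b)))."""
    n = len(data)
    w = [0.0] * len(data[0][0])
    b = 0.0
    for _ in range(epochs):
        for x, y in data:
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            # Regularizer step: shrink the weights (the bias is not penalized).
            w = [wj * (1.0 - lr / n) for wj in w]
            # Hinge-loss step: only samples inside the margin push the model.
            if y * score < 1.0:
                w = [wj + lr * C * y * xj for wj, xj in zip(w, x)]
                b += lr * C * y
    return w, b


def predict(w, b, x):
    """Return +1 (fastball) or -1 (other) from the sign of the linear score."""
    return 1 if sum(wj * xj for wj, xj in zip(w, x)) + b >= 0 else -1


def accuracy(predictions, data):
    """Fraction of pitches whose predicted label matches the true label."""
    return sum(p == y for p, (_, y) in zip(predictions, data)) / len(data)


# Naive classifier: always predict the pitch seen most often in training.
most_common_pitch = Counter(y for _, y in TRAIN).most_common(1)[0][0]
baseline_acc = accuracy([most_common_pitch] * len(TEST), TEST)

# Per-pitcher SVM: trained on one season's toy data, scored on held-out pitches.
w, b = train_linear_svm(TRAIN)
svm_acc = accuracy([predict(w, b, x) for x, _ in TEST], TEST)

# Interpret the model: a larger |weight| means a more influential feature.
ranked = sorted(zip(FEATURE_NAMES, w), key=lambda p: abs(p[1]), reverse=True)

print("naive accuracy:", baseline_acc)
print("svm accuracy:  ", svm_acc)
print("features by |weight|:", ranked)
```

In this toy data the label is fully determined by the first feature, so the SVM separates the classes and that feature receives the largest weight. Real pitch data is far noisier, and a separate model would be trained for each pitcher.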