Call to the Pen: Maximizing pitcher effectiveness via topic modeled cluster centroid distances

Research Paper will be posted in the coming weeks. Check back soon!

Austin Hymes


This paper builds on the foundational understanding of pitch sequencing to address the question: can bullpens be optimized for above-average performance based on the dissimilarity of the pitchers used? It introduces a novel approach to pitcher classification that uses topic modeling to uncover latent variables and capture both the physical and strategic components of a pitcher. Using Latent Dirichlet Allocation, pitchers are analyzed textually: the pitches thrown are the words, the at-bats are the sentences, the games are the paragraphs, and the season is the document. By characterizing each pitcher's topic composition and identifying the pitchers who are most topically dissimilar, the paper offers a new way to understand bullpen usage, sequencing, and efficacy.
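
The pitches-as-words analogy can be made concrete with a small sketch. What follows is an illustration under stated assumptions, not the paper's implementation: the pitcher names and pitch sequences are made up, each "word" is reduced to a bare pitch-type code (FF, SL, CH, CU, SI), the topic model is a minimal collapsed Gibbs sampler for Latent Dirichlet Allocation written with the standard library only, and Hellinger distance between topic mixtures stands in for whatever dissimilarity measure the paper actually uses.

```python
import math
import random

def fit_lda(docs, n_topics, n_iter=100, alpha=0.1, beta=0.01, seed=0):
    """Minimal collapsed Gibbs sampler for LDA; returns one topic mixture per document."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens in doc d assigned to topic k
    topic_word = [[0] * n_words for _ in range(n_topics)]  # times word v is assigned to topic k
    topic_total = [0] * n_topics                           # tokens assigned to topic k overall
    assign = []
    # Random initial topic assignment for every token.
    for d, doc in enumerate(docs):
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][index[w]] += 1
            topic_total[k] += 1
        assign.append(row)
    # Resample each token's topic conditioned on all other assignments.
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k, v = assign[d][i], index[w]
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta) / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions.
    return [
        [(doc_topic[d][k] + alpha) / (len(doc) + n_topics * alpha) for k in range(n_topics)]
        for d, doc in enumerate(docs)
    ]

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (0 = identical, 1 = disjoint)."""
    return math.sqrt(0.5 * sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q)))

# Hypothetical season "documents": one flat list of pitch-type tokens per pitcher.
seasons = {
    "starter_A": ["FF", "FF", "SL", "FF", "SL", "FF"] * 20,
    "reliever_B": ["FF", "SL", "FF", "FF", "SL", "SL"] * 20,
    "reliever_C": ["CH", "CU", "SI", "CH", "CU", "SI"] * 20,
}
names = list(seasons)
mixtures = dict(zip(names, fit_lda([seasons[n] for n in names], n_topics=2)))

# Pick the reliever whose topic mixture is farthest from the starter's.
starter = "starter_A"
relievers = [n for n in names if n != starter]
most_dissimilar = max(relievers, key=lambda n: hellinger(mixtures[starter], mixtures[n]))
print(most_dissimilar, {n: [round(x, 3) for x in mixtures[n]] for n in names})
```

With real data, a library implementation such as scikit-learn's `LatentDirichletAllocation` would likely replace the hand-written sampler, and the tokens could carry more than pitch type (velocity band, location, count); those choices are left open here.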