Details

The Sloan Sports Analytics Conference showcases cutting-edge research that is frequently featured in top media outlets throughout the world and has even changed the way sports are analyzed. The competition is an ideal way to build your reputation within the field of sports analytics.

This year’s competition will feature six sports tracks – Basketball, Baseball, Soccer, Football, Business of Sports, and Other Sports.

Submissions are now open for abstracts. Please submit your research paper abstract by 11:59pm ET on Sunday, September 25, 2022.

Abstracts will be selected based on the novelty, academic rigor and impact of the research.

*New* All submissions must be open-source and must include a link to the author's GitHub repository or another repository supporting the research.

Please refer to our Research Papers Rules page for full details on the submission and evaluation process. We look forward to reading your contribution!

Rules

Competition Format

The competition consists of the following phases:

  1. Abstract Phase

Authors submit abstracts. Based on the judged merits of their abstract submissions, a select group of authors will be invited to submit full manuscripts.

  2. Full Manuscript Phase

Invited authors submit full manuscripts. Referees will evaluate every manuscript, and authors of the best submissions will be invited to give a presentation on their findings at the conference. The referees will also select a separate set of authors who will be invited to present their work during a poster session, as well as a final set of three authors to give a deep dive into their work in an open-source workshop.

  3. Conference Phase I

     a. Presentations

Invited authors will present their findings during the first day of the conference. Based on the quality of the presentation and manuscript, one paper per sports track (see tracks below) and one wildcard will be selected to present at the conference in front of a panel of industry experts. The judge scores will be tabulated and the winners will be announced following presentations.

     b. Poster Competition

All posters selected for the conference will be entered into a competition for Best Poster, determined by a combination of fan and judges' votes during the weekend of the conference.

Note: this competition is independent of the presentation finals, and none of the posters will advance to the presentation finals.

Timeline (all times Eastern Time)

Abstract submission due - Sep. 25, 2022, 11:59 p.m. ET

Full paper requests submitted - Early-October 2022

Full paper submissions due (if selected) - Nov. 28, 2022, 11:59 p.m. EST

Finalists and posters announced - Mid-January 2023

Submission of poster (if selected) - Early-February 2023

Submission of presentation (if selected) - Mid-February 2023

Conference presentations (if selected) - Conference Day

Open-Source *New to SSAC 2023*

For the Sloan Sports Analytics Conference, the Research Papers contest has been a tremendous opportunity for researchers to both share their work with the community and improve the application of analytics across sports. To further the impact of the great work that our researchers are doing, the leadership team, with the support of our co-chairs Jessica Gelman and Daryl Morey, is adjusting the requirements of the 2023 SSAC to require all papers to be open-source. This change will bring the work presented at the conference in line with research conducted across academia. 

Open-source research will allow researchers to build on top of the models and methods of their peers, both amplifying the effect of their research and better enabling widespread adoption of their work. We strongly believe that continued research into sports analytics is what makes our games more exciting and participants more effective, and this change better aligns with our mission to democratize analytics in sports.

All papers will be required to submit a link to the team's GitHub repository, or another open-source repository, with the modeling and data used to conduct the research. This should include any publicly available data or private data used in the research. For any private / proprietary data, please use your best judgement to anonymize any personal information before sharing publicly.

Sports Tracks

Based on abstract content, all submissions will be entered into one of the following Sports Tracks:

  1. Basketball – All submissions related to the sport of basketball.
  2. Baseball – All submissions related to the sport of baseball.
  3. Soccer – All submissions related to the sport of soccer.
  4. Football – All submissions related to the sport of American football.
  5. Business of Sports – All submissions related to the business of owning, managing, or marketing a sport, or to new technology or ideas which could change the face of the sport.
  6. Other Sports – All submissions related to the playing of a sport that is not basketball, baseball, soccer, or American football.

Abstract Guidelines

Abstract submissions should be submitted online, and must use the following guidelines:

  • Abstracts must contain fewer than 500 words, including title and body.
  • Abstracts may include up to two tables or figures combined (e.g., 1 figure and 1 table, or 2 tables).
  • Each abstract should contain the following sections:
      • Introduction – What question is this research trying to answer? Why is it an important question for the industry?
      • Methods – Description of relevant statistical methods used, including data sources or data collection procedures
      • Results – Description of actual (not promised) results along with relevant statistics
      • Conclusion – The overall takeaway from the study, including how the results will impact the sports industry

Evaluation of Submissions

The conference seeks submissions that report research pertaining to the use of analytics in the sports industry. We are open to contributions ranging from evaluating players and game strategies, to examining the success factors for sports business. In the abstract and full paper submission process, research will be evaluated on, but not necessarily limited to, the following criteria:

  • Novelty of research – Does the research provide interesting insight into new models or challenge existing beliefs?
  • Academic rigor / validity of model – Are the methodologies of the model and results fundamentally sound and appropriate?
  • Reproducibility – Can the model and results be replicated independently?
  • Application – What are the applications or potential applications of the insights from the research?

In evaluating presentation finalists at the 2023 SSAC, the above factors will be supplemented by the following criterion, as judged by a panel of academics and industry executives from team management and sports business operations:

  • Interest / impact – Is there significant interest in the proposed question in the field of study or the community at large? What are the benefits or impact of the model or application?

The Research Papers team will review all abstracts. The Review Committee will evaluate all manuscript submissions. The Review Committee consists of the Research Papers team, as well as academic professors and experts from top universities in fields including statistics, information sciences, and economics. The industry panel that makes the final winner selection will decide on the basis of the paper and the presentation at the 2023 Sloan Sports Analytics Conference. In these final evaluations, more weight will be given to the final presentation, specifically the highlighted application and impact of the research.

Conflict of Interest Policy

Our objective is to ensure an unbiased evaluation of submissions throughout the process. We are aware that members of the evaluation committee may have had relationships with authors who have submitted papers. When possible, potential conflicts of interest are avoided by minimizing a reviewer's evaluation of research submitted by the following:

  • Authors who have collaborated with the reviewer on previous submissions
  • Current or former students who worked with the reviewer
  • Colleagues from the same organization
  • Any other previous relationships with the author that may prevent an unbiased evaluation of the paper

All potential conflicts of interest will be managed as well as possible while still maintaining the quality of the review process. Final reviews will occur without knowledge of the names of the authors.

Rights and Permissions

All authors retain ownership rights to the research and the right to publish the research after the conference. Upon submission, authors grant access to 42 Analytics to make their research available for public viewing online and in print, for conference use for the Sloan Sports Analytics Conference. Authors are responsible for obtaining permission from third parties to reprint copyrighted information such as data, tables, or figures that may be protected by copyright.

SSAC 2022 Research Papers & Authors Profiles

We received multiple research paper submissions for various industries, functions, and sports. Please review our SSAC 2022 research papers below and go to our open source competition voting page to cast your vote for your favorite.

2022 Research Papers Finalists

Winning duels in VALORANT, a visualization of optimal positioning
Short Abstract:
This paper combines traditional sports analytics metrics with novel machine learning models in a brand new competitive Esport. By leveraging in-game positional data, we are able to evaluate the difficulty of a particular gun fight and assign a win probability to both sides. We use these predictions to identify players who are performing above or below expectation, and identify strengths and weaknesses for NRG’s player development. We are hopeful for more analytics in Esports from current working professionals and the younger generation.
GitHub Link (Open Source)
Author(s):
DeMars DeRover
Founded in 2015, NRG has embodied competitive Esports excellence and boasts one of the most popular VALORANT rosters in North America. This paper is by their team analyst NRG DeMars DeRover who is currently a student at MIT Sloan. It was developed with support from their coach NRG JoshRT and chief of staff Jamie Cohenca, with players NRG ANDROID, eeiu, hazed, s0m and tex.
Using Tracking and Charting Data to Better Evaluate NFL Players: A Review
Short Abstract:
As the game of football makes a significant shift towards the quantitative, much of the progress made in the space can be attributed to analyses of play-by-play and charting data. However, recent years have given rise to player tracking data, which has opened the door for innovation that was not possible before. In our paper we used charting and tracking data to craft metrics for players that are more stable, descriptive and predictive than metrics that are built with either individually. Metrics for pass rushers, linebackers and wide receivers are debuted, with examples of where value can be found in player evaluation by virtue of these analyses.
Author(s):
Eric Eager
Eric Eager is the head of research, development and innovation at PFF, a worldwide leader in sports information and analysis. In his role at PFF, Eric runs a data science team that consults with all 32 NFL team clients, over 100 college football team clients, and numerous media entities. Prior to joining PFF, Eric was a mathematics professor with more than 25 publications in applied mathematics, mathematical biology and the scholarship of teaching and learning. Eric earned more than $300,000 in National Science Foundation funding to mentor undergraduate researchers in applied mathematics and mathematical biology.
Using Machine Learning to Describe how Players Impact the Game in the MLB
Short Abstract:
This paper draws upon recent advances in Natural Language Processing (NLP) and Computer Vision (CV) to learn to describe the way in which players impact the game in the MLB. In particular, this work views the game as a sequence of events - instead of a set of summary statistics describing said events - and trains machine learning models to describe the impact that a given sequence of events has on the game. The models describe a sequence of events for a single player over a relatively small time period; so we refer to the model output as player form embeddings - descriptions of how they have impacted the game in the short term. We demonstrate how these embeddings can be used to describe players over the short- and long-term, and contain signals useful for predicting the outcome of games.
GitHub Link (Open Source)
Author(s):
Connor Heaton
Connor Heaton is a Ph.D. Candidate in Informatics at the Pennsylvania State University, continuing at the university after he earned his B.S. in Computational Data Science in 2019. His research interests include machine learning, natural language processing, representation learning, big data, and sports analytics. Although his work focuses on the software side of computing, he also has a personal interest in computer hardware. A Pittsburgh native, he is a fan of the Steelers, Penguins, and Pirates.
Prasenjit Mitra
Prasenjit Mitra is a professor of Information Sciences and Technology at The Pennsylvania State University. He obtained his Ph.D. in Electrical Engineering from Stanford University in 2004, an M.S. in Computer Science from The University of Texas at Austin in 1994, and a B.Tech.(Hons.) in Computer Science and Engineering from the Indian Institute of Technology, Kharagpur in 1993. His research interests are in the areas of artificial intelligence, big data analytics, applied machine learning, natural language processing, and visual analytics. His research has been funded by the NSF CAREER award and by several grants from the DoD, DoE, DHS, NGA, DTRA, Microsoft Corp., Dow, Lockheed Martin, Raytheon, etc.
Using Hex Maps to Classify & Cluster Dribble Hand-off Variants in the NBA
Short Abstract:
This paper introduces a novel approach to precision abstraction as it relates to the classification and analysis of play-actions in the NBA. We look specifically at dribble-hand-offs as observed from SportVU player tracking data and embed the raw coordinate data into hex map representations. We then outline the architecture for an automated pipeline capable of selecting action instances and clustering them into variants. These action variants can then be used to further differentiate players and strategies within the game context in which they are implemented.
Author(s):
Koi Stephanos
Koi Stephanos works as a Senior Data Scientist and Software Engineer for Scientific Financial Systems, building out a quantitative platform for market analysis and projection. He recently obtained a master’s degree in computing at East Tennessee State University, completing a thesis on machine learning and pattern mining using NBA trajectory data. In his free time, he is an avid NBA fan and enjoys watching, studying, and playing all things basketball. This is his second publication as lead author.
Ghaith Husari
Dr. Ghaith Husari is an Assistant Professor in the Department of Computing at East Tennessee State University. Dr. Husari received his PhD degree from the University of North Carolina at Charlotte in 2019. His research interests include big data analytics, machine learning, computer vision, and cybersecurity analytics.
Brian Bennett
Dr. Brian Bennett is the Interim Chair and an Associate Professor in the Department of Computing at East Tennessee State University. Dr. Bennett received a Ph.D. in Computer Information Systems from Nova Southeastern University, focusing on program semantics. Before moving into a faculty role, he worked in various Information Systems Management positions for ten years. His research interests include applying and studying datasets with Artificial Intelligence and Machine Learning techniques. Other areas of interest include topics related to program compilation and software engineering.
Matthew Harrison
Matthew Harrison is a lecturer at East Tennessee State University, where he teaches courses on computer systems topics, software engineering, and discrete mathematics. He is working towards a Ph.D. at the University of Tennessee, Knoxville, with a research focus on heterogeneous memory systems. His other research areas include artificial life, operating systems, embedded systems, and video game design and development. In his spare time, he enjoys playing pick-up basketball games at ETSU’s gym, coaching youth basketball, and playing as many board games as possible.
Emma Stephanos
Emma Stephanos holds a master’s degree in pharmacology from Columbia University and a bachelor’s degree in biochemistry from Clemson University. She is interested in technical writing and editing and has contributed to the publication of research papers in multiple disciplines. She currently works in a pharmacy and is preparing to welcome her first child, a baby girl.
Sports narrative enhancement with natural language generation
Short Abstract:
This paper proposes a novel end-to-end solution to convert tabular data into natural-sounding sports narratives. The solution leverages large language models (LLMs) such as T5 and natural language generation techniques such as back translation and paraphrasing. The solution improves the readability of the narratives by 13% compared to the baseline rule-based template solution. The solution can also be easily scaled to include new statistics and expanded to other sports domains for future business needs.
Author(s):
Henry Wang
Henry Wang is an Applied Scientist at Amazon Machine Learning Solutions Lab. His main interests are Computer Vision, Natural Language Processing, Reinforcement Learning and their practical applications. In his free time, you can find him reading, strolling on the golf course, playing tennis, or watching Formula One with friends.
Saman Sarraf
Saman Sarraf is a Senior Applied Scientist at Amazon Machine Learning Solutions Lab. His background is in applied machine learning including Deep Learning, Computer Vision, and Time Series data prediction.
Arbi Tamrazian
Arbi Tamrazian is the Director of Data Science and Machine Learning at FOX where he focuses on building scalable machine learning solutions that can be applied to real-time data feeds and media assets. His main areas of interest are Deep Learning, Computer Vision and Reinforcement Learning.
Detection of tactical patterns using semi-supervised graph neural networks
Short Abstract:
Overlapping runs are a widely used group-tactical pattern in soccer. By combining a variational autoencoder with a graph neural network representation of positional data, we are able to detect overlapping runs using only a very limited amount of hand-labeled data. Based on this detection, we show practical applications using data of the German national team during the European Championship 2021. Using the same methodology, we outperform state-of-the-art approaches on the prediction of player trajectories using a publicly available basketball dataset.
Author(s):
Gabriel Anzer
In my current role as the Head of Data Analytics at Hertha BSC my focus is on using various data sources to enhance decision making ranging from player recruitment to match analysis. Previously, I was the lead data scientist at Sportec Solutions and was amongst other things responsible for creating the initial Bundesliga match facts. While my background is in Mathematics and Machine Learning, I recently finished my Ph.D. in Sports Analytics at the University of Tübingen and am working on my soccer coaching license.
Pascal Bauer
As a data scientist at the German national team, I am responsible for a wide field of data science & machine learning applications in football (soccer), including talent identification & development, tactical match analysis based on positional data, and much more. Additionally, I hold the UEFA A-level coaching license and just finished my Ph.D. on the detection of tactical patterns in football using machine learning.
Ulf Brefeld
I am a professor for Machine Learning at Leuphana Universität Lüneburg. Prior to joining Leuphana, I was a joint professor for Knowledge Mining & Assessment at TU Darmstadt and the German Institute for Educational Research (DIPF), Frankfurt am Main. Before, I led the Recommender Systems group at Zalando SE and worked on machine learning at Universität Bonn, Yahoo! Research Barcelona, Technische Universität Berlin, Max Planck Institute for Computer Science in Saarbrücken, and at Humboldt-Universität zu Berlin. I received a Diploma in Computer Science in 2003 from Technische Universität Berlin and a Ph.D. (Dr. rer. nat.) in 2008 from Humboldt-Universität zu Berlin.
Dennis Fassmeyer
I am a PhD student at the Machine Learning Group of Prof. Dr. Ulf Brefeld at the Leuphana Universität Lüneburg. I received a Master of Science in Data Science from Leuphana Universität Lüneburg and a Bachelor of Science in Economics from Albert-Ludwigs-Universität Freiburg. Also, I spent a semester abroad at the University of Helsinki during my master's degree.
Beyond action valuation: A deep reinforcement learning framework for optimizing player decisions in soccer
Short Abstract:
This paper proposes an end-to-end deep reinforcement learning framework that receives raw tracking data for each situation in a game, and yields the optimal ball destination location on the full surface of the pitch. Using the proposed approach, soccer players and coaches are able to analyze the actual behavior in their historical games, obtain the optimal behavior and plan for future games, and evaluate the outcome of the optimal decisions prior to deployment in a match.
Author(s):
Pegah Rahimian
Pegah Rahimian is a PhD candidate in informatics at Budapest University of Technology and Economics, and a soccer analytics researcher at HSNLab (http://hsnlab.hu/). She is interested in applying deep reinforcement learning to state-of-the-art soccer action valuation and optimization methods, and has presented several research papers at different data science venues. She is also a machine learning engineer for automated driving at Robert Bosch Kft, and won the best instructor prize in the field of data science and deep learning in 2021. She was a member of the national elite foundation between 2012 and 2016, won the first prize of the 32nd outstanding student competition in 2016, and has been awarded a scholarship from the Hungarian government since 2017. She is into playing piano, skiing, and watching soccer matches.
Jan Van Haaren
Jan Van Haaren is a Data Scientist at Club Brugge, which is a professional soccer club from Belgium that regularly competes in the UEFA Champions League. Working closely with both the first-team staff and the recruitment staff, he is involved with data-driven performance analysis, match analysis and player evaluation. He is also a Research Fellow at the Department of Computer Science at KU Leuven, where he conducts research at the intersection of machine learning and sports. He obtained a PhD in Machine Learning in 2016 and a Master of Science in Computer Science in 2011 from KU Leuven. Together with Tom Decroos, Lotte Bransen and Jesse Davis, he won a best paper award for Actions Speak Louder Than Goals: Valuing Player Actions in Soccer at the International Conference on Knowledge Discovery and Data Mining in 2019. Together with Lotte Bransen, he finished runner-up with Player Chemistry: Striving for a Perfectly Balanced Soccer Team in the research-paper competition at the MIT Sloan Sports Analytics Conference in 2020.
Togzhan Abzhanova
Togzhan Abzhanova is a software engineer at Nokia. She obtained her MSc degree in electrical engineering at Budapest University of Technology and Economics in 2021. She participated in the International Conference on Computing and Network Communications (CoCoNet2018), publishing “NU Smart Shopping Card”.
Laszlo Toka
Laszlo Toka is an associate professor at Budapest University of Technology and Economics, vice-head of HSNLab (http://hsnlab.hu/), and member of both the MTA-BME Network Softwarization and the MTA-BME Information Systems Research Groups. He obtained his Ph.D. degree from Telecom Paris in 2011 and worked at Ericsson Research between 2011 and 2014, then joined academia with a research focus on cloud computing, artificial intelligence and sports analytics.

2022 Poster Presenters

When are they coming? Understanding and forecasting the timeline of arrivals at FC Barcelona stadium on match days
Short Abstract:
Futbol Club Barcelona operates the largest stadium in Europe (with a seating capacity of almost one hundred thousand people) and manages recurring sports events. These are influenced by multiple conditions (time and day of the week, weather, adversary) and affect city dynamics -- e.g., peak demand for related services like public transport and stores. We study fine-grained audience entrances at the stadium segregated by visitor type and gate to gain insights and predict the arrival behavior of future games, with a direct impact on the organizational performance and productivity of the business. We can forecast the timeline of arrivals at gate level 72 hours prior to kickoff, facilitating operational and organizational decision-making by anticipating potential agglomerations and audience behavior. Furthermore, we can identify patterns for different types of visitors and understand how relevant factors affect them. These findings directly impact commercial and business interests and can alter operational logistics, venue management, and safety.
Author(s):
Feliu Serra Burriel
Feliu Serra-Burriel is a Data Scientist, Economist and Statistician, with an emphasis on applied statistics. Feliu did his undergraduate degree in Economics at the Pompeu Fabra University, then completed an MSc in Statistics at the London School of Economics, and is currently finishing his PhD in Statistics and Operations Research at the Polytechnic University of Catalonia. He worked as a Data Scientist at the Barcelona Supercomputing Center for more than three years and his research includes a wide range of topics, from causal inference, to satellite remote sensing, to statistical learning and functional data analysis.
Fernando Cucchietti
Fernando Cucchietti leads the Data Visualization and Analytics Group at the Barcelona Supercomputing Center (BSC). He holds a Ph.D. in quantum computing, and currently works on data visualization for science (data-heavy graphical interfaces), data science applied to industrial problems (machine learning and artificial intelligence in digital twins), and scientific visualization for dissemination.
Pedro Delicado
Pedro Delicado is a full professor at the Department of Statistics and Operations Research, Technical University of Catalonia. His research topics include unsupervised learning (non-parametric dimensionality reduction, extensions of multidimensional scaling), functional data analysis (spatial dependence, principal components) and connections between Statistics and Data Science (interpretability of machine learning models).
Eduardo Graells Garrido
Dr. Eduardo Graells-Garrido is a Research Professor at the Data Science Institute in Universidad del Desarrollo (Chile). His research lines include Urban Informatics, Computational Social Science, and Information Visualization, with a focus on human mobility as seen from digital traces.
Imanol Eguskiza
Imanol Eguskiza is Innovation Manager and New Products and Services Manager at the Innovation Hub of FC Barcelona. He completed postgraduate studies in Innovation and Design Thinking at Barcelona School of Management and holds a degree in Business Administration. Imanol has been part of the project team for the execution of the IoTwins project.
Alex Gil
Alex Gil is a Business Analytics & Strategy Manager at Football Club Barcelona (FCB). He holds a degree in Computer Science and a postgraduate qualification in Innovation and Technology Management, and currently works on data science and AI applied to business strategy. Alex is part of the IoTwins project team, for the execution of FCB’s stadium digital twin.
Understanding why shooters shoot - An AI-powered engine for basketball performance profiling
Short Abstract:
In professional basketball, it is crucial for the coaching staff of a team to analyze an opposing team and develop an effective strategy. Understanding player shooting profiles is an essential part of this analysis: knowing where certain opposing players like to shoot from can help coaches neutralize offensive gameplans from their opponents, while understanding where their players are most comfortable can lead them to developing more effective offensive strategies. We present a tool that can visualize player performance profiles in a timely manner while taking into account factors such as play-style and game dynamics, generating interpretable heatmaps that allow us to identify and analyze how these non-spatial factors affect the performance profiles. Our methods provide an effective and efficient tool that can provide insight into how certain players and teams play, without requiring the time-consuming process of reviewing hours of film, and could potentially be applied to other sports with adaptations.
GitHub Link (Open Source)
Author(s):
Alejandro Rodriguez Pascual
Born and raised in Madrid, Spain, Alejandro Rodriguez Pascual excelled in Mathematics and taught himself Computer Science from an early age, while also playing high-level youth basketball at one of Spain's best clubs. After moving with his family to the San Francisco Bay Area in 2016, Alejandro was unable to continue playing basketball in high school, but remained an avid fan of the sport as he watched former coaches, teammates and adversaries make it to the NBA. He was also exposed to his first research experience in his junior year, when he independently worked on BACON: an automated poetry generator with author-specific style transfer, which won an Honorable Mention in Mathematics at the 2018 California Science and Engineering Fair. Now an undergraduate student at UC San Diego, Alejandro worked with Frank Rodriz and Muhammad Zubair Khan, advised by Rose Yu and Ishan Mehta, on how to extract Spatiotemporal Latent Factors for Basketball. He completed the project throughout Summer 2021, resulting in the paper Understanding why shooters shoot – An AI-powered engine for basketball performance profiling. Alejandro will graduate in Spring 2022, a year early, and plans to attend graduate school immediately afterwards to earn an M.Sc. in Computer Science.
Ishan Mehta
After earning an M.Sc in Electrical and Computer Engineering from UCSD in 2021, Ishan interned with the San Diego Padres Research and Development department. Prior to that he worked on a Computer Vision Senior Capstone project sponsored by Tesla. Ishan is also an avid basketball fan and currently works for Zelus Analytics as a data scientist.
Muhammad Khan
Muhammad Zubair Khan is currently an Undergraduate student at University of California San Diego, pursuing a Bachelor's in Computer Science. He is a recipient of the Mary & Haag scholarship and the Shores scholarship, as well as multiple Provost Honors during his time at UCSD due to his outstanding academic record. At UCSD, he researched under Dr Rose Yu and Ishan Mehta, and alongside Alejandro Pascual and Frank Rodriz on how to extract Spatiotemporal Latent Factors for Basketball. Furthermore, he also interned at Amazon as a Software Development Engineer (SDE) intern in the summer of 2021, and will be returning to his SDE intern position for the coming Summer 2022.
Rose Yu
Dr. Rose Yu is an assistant professor at the University of California San Diego, Department of Computer Science and Engineering. She was a Postdoctoral Fellow at the California Institute of Technology. Her research focuses on advancing machine learning techniques for large-scale spatiotemporal data analysis, with applications to sustainability, health, and physical sciences. A particular emphasis of her research is on physics-guided AI, which aims to integrate first principles with data-driven models. Among her awards, she has won Faculty Research Awards from Facebook, Google, Amazon, and Adobe, several Best Paper Awards, and the Best Dissertation Award at USC, and was nominated as one of the ‘MIT Rising Stars in EECS’.
Frank Rodriz
Frank A. Rodriz is currently an undergraduate student at the University of California San Diego, pursuing a Bachelor’s in Computer Science. At UCSD, he researched under Dr. Rose Yu and Ishan Mehta, alongside Alejandro Pascual and Muhammad Zubair Khan, on how to extract Spatiotemporal Latent Factors for Basketball. He also interned at Bank of America as a Software Engineer in the summer of 2021, and plans to return to a full-time position at Bank of America following his graduation in Spring 2022.
Learning from the Pros: Extracting Professional Goalkeeper Technique from Broadcast Footage
Short Abstract:
As an amateur goalkeeper playing grassroots soccer, who better to learn from than top professional goalkeepers? In this paper, we harness computer vision and machine learning models to appraise the save technique of professionals in a way those at lower levels can learn from. We train an unsupervised machine learning model using 3D body pose data extracted from broadcast footage to learn professional goalkeeper technique. Then, an “expected saves” model is developed, from which we can identify the optimal goalkeeper technique in different match contexts.
GitHub Link (Open Source)
Author(s):
Matthew Wear
Matthew is a Data Scientist at Royal Mail, holding an MSc in Data Science from the University of Southampton. He also holds a BSc in Mathematics from the University of Exeter. Matthew is focused on research in goalkeeper analytics but has also been involved in projects on soccer and horse racing prediction.
Ryan Beal
Ryan is CEO and Co-Founder of SentientSports, an AI-focused sports analytics start-up. Ryan also holds a PhD from the University of Southampton, where his research focused on applications of AI to team sports. He has published a number of papers in this space exploring problems such as valuing teamwork and using game theory to optimise tactical decision making. He is also involved in football recruitment via work with AI Abacus.
Tim Matthews
Tim is CTO and co-founder of SentientSports, holding a PhD in Computer Vision and Artificial Intelligence from the University of Southampton. During his time there he developed the SquadGuru fantasy sports AI, able to beat 99% of human players, and now works on football recruitment via work with AI Abacus.
Gopal Ramchurn
Gopal is a Professor of Artificial Intelligence, Director of the Centre for Machine Intelligence, Turing Fellow, and a co-founder of SentientSports. He has won multiple best paper awards for his research and is a winner of the AXA Research Fund Award for his work on Responsible Artificial Intelligence. His papers have been cited more than 7000 times and his work has featured in various media including BBC News, New Scientist, Sky News, BBC Click, and Wired.
Tim Norman
Tim is a Professor of Computer Science and Head of the Agents, Interaction and Complexity Group at the University of Southampton. He has been involved in sports research for the past 4 years and also maintains a strong focus on wider AI topics such as multi-agent systems and AI planning and scheduling.
Live Counter-Factual Analysis in Women’s Tennis using Automatic Key-Moment Detection
Short Abstract:
A recent trend in machine learning is to utilize interpretable techniques such as counter-factual analysis to explain predictions of individual events. Such techniques are powerful in sports, where they can be used to assess the impact of a play or event on the overall outcome of the match by framing it as a “what-if” question (i.e., if a player wins or loses the next point, how does that change the likelihood of her winning the game/set/match?). In this paper, we present a counter-factual method for women’s tennis that first automatically highlights the key moments in a match using our “leverage”, “clutch” and “momentum” metrics. Not only can our approach automatically highlight important moments before they occur, it can also link player behaviors at a season level, which sheds light on their tendencies in key moments.
Author(s):
Robert Seidl
Dr. Robert Seidl is currently a Senior AI Scientist at Stats Perform, where he focuses on live streaming predictions utilizing the wealth of live tennis data that is available through the WTA/Stats Perform partnership. Robert has extensive industry experience in live and large-scale sports analytics (specifically in soccer and tennis), previously holding positions at SciSports, German Bundesliga as well as Tennis Australia. He received his diploma in Mathematics from LMU Munich, and PhD in Computer Science from TU Munich.
Patrick Lucey
Since October 2015, Patrick Lucey has been at Stats Perform and currently serves as Chief Scientist where his goal is to maximize the value of the 40+ years of data that Stats Perform has. Previously, Patrick was at Disney Research for 5 years, where he conducted research into automatic sports broadcasting using large amounts of spatiotemporal tracking data. Patrick received his BEng(EE) from USQ and PhD from QUT, Australia in 2003 and 2008 respectively. He has had papers in the research track at the MIT Sloan Sports Analytics Conference 9 of the last 10 years – winning in 2016, and runner-up in 2017 and 2018 (website: www.patricklucey.com).
Call to the Pen: Maximizing pitcher effectiveness via topic modeled cluster centroid distances
Short Abstract:
This paper builds on the foundational understanding of pitch sequencing to address the question: Can bullpens be optimized for above-average performance based on the dissimilarity of pitchers used? This work introduces a novel approach to pitcher classification leveraging topic modeling as an ideal method to uncover latent variables and understand both physical and strategic components of a pitcher. Utilizing Latent Dirichlet Allocation, pitchers are analyzed textually: the pitches thrown are the words, the at-bats are the sentences, the games are the paragraphs, and the season is the document. Understanding the topic composition of a pitcher and the pitchers most topically dissimilar, this paper introduces a new approach to understanding bullpen usage, sequencing, and efficacy.
Author(s):
Austin Hymes
Austin Hymes is a Senior Manager in PwC’s Financial Services Advisory practice responsible for the development of management analytics. Austin has designed industry leading profitability models used for executing performance diagnostics, profitability reasoning, and pricing analysis. Austin holds a master’s in data science from Northwestern University, and a bachelor's in business from Boston University. In his free time, he enjoys hiking, crossword puzzles, and watching his beloved Boston Red Sox.
A Markov Approach to Untangling Intention Versus Execution in Tennis
Short Abstract:
In this paper, we develop a novel modeling framework based on Markov reward processes and Markov decision processes to investigate how execution error (i.e., not hitting the ball exactly where a player intended) impacts a player's value function and strategy in tennis. We power our models with hundreds of millions of simulated tennis shots with 3D ball and 2D player tracking data. We find that optimal shot selection strategies in tennis become more conservative as execution error grows, and that having perfect execution with the empirical shot selection strategy is roughly equivalent to choosing one or two optimal shots with average execution error. We find that execution error on backhand shots is more costly than on forehand shots, and that optimal shot selection on a serve return is more valuable than on any other shot, over all values of execution error.
Author(s):
Doug Fearing
Doug Fearing is the co-founder and President of Zelus Analytics, one of the world’s leading sports intelligence companies. Prior to Zelus, Doug founded the Los Angeles Dodgers R&D department, grew it to a staff of 20 over four seasons, and reached the World Series in both 2017 and 2018. While a faculty member at Harvard Business School and then UT Austin’s McCombs School of Business (2010-2015), Doug acted as a Senior Advisor to R&D for the Tampa Bay Rays. Along with co-author Timothy Chan, Doug received an MIT Sloan Sports Analytics Conference research paper award in 2013.
Stephanie Kovalchik
Stephanie is a senior data scientist at Zelus Analytics, one of the world’s leading sports intelligence companies. She previously led data science innovation for the Game Insight Group of Tennis Australia, building first-of-a-kind metrics and real-time applications with tracking data. An expert in tennis analytics and causal inference with 40+ published papers, she also developed novel statistical methods in her previous roles at the NCI and RAND Corporation. She blogs at on-the-t.com.
Tim Chan
Timothy Chan is the Canada Research Chair in Novel Optimization and Analytics in Health, a Professor in the department of Mechanical and Industrial Engineering, the Director of the Centre for Analytics and AI Engineering, an Associate Director of the Data Sciences Institute, and a Senior Fellow of Massey College at the University of Toronto. His primary research interests are in operations research, optimization, and applied machine learning, with applications in healthcare, medicine, sustainability, and sports. Along with co-author Doug Fearing, he received the MIT Sloan Sports Analytics Conference research paper award in 2013. This past summer he got back into playing tennis and he is never going back.
Craig Fernandes
Craig Fernandes is a first-year Operations Research PhD student at the University of Toronto, supervised by Prof. Timothy Chan. His research focuses on optimization, game theory, and AI/ML techniques applied to social good, healthcare and sports. His research has been featured at the New England Symposium on Statistics in Sports (NESSIS), the Sport Innovation (SPIN) Summit hosted by Own the Podium, and the Canadian Operational Research Society's (CORS) annual conference. He also previously worked as a research data scientist at Amazon, focusing on inventory optimization.
A Game-theoretic Approach to the Football Endgame: When Should the "2-Minute Drill" Begin?
Short Abstract:
This paper investigates when NFL teams should switch to a “hurry-up” offense toward the end of the game. Although the switch is colloquially called “the 2-minute drill,” teams should make it much earlier, typically with between 6 and 10 minutes remaining; however, the exact answer depends on many factors, including point differential and field position. This analysis uses a recursive game-theoretic approach with payoff matrices based on the past 20 seasons of play-by-play NFL data via nflfastR.
Author(s):
Logan McGuire
Second Lieutenant McGuire was born in Toledo, Ohio, but grew up just outside of South Bend, Indiana, in a town called Granger. In 2017, he started his education at the United States Naval Academy, where he majored in Operations Research. He quickly found he had a passion for numbers and data analysis and joined the honors program at the United States Naval Academy, where he focused his honors thesis on game theory and its relation to endgame play calling for American football strategists. He earned several awards before graduating, including top graduate of the Operations Research major, and he graduated with distinction, an honor given to the top 100 graduating officers from the Naval Academy. Upon graduation in May of 2021, he commissioned as a Second Lieutenant in the United States Marine Corps. In June of 2021, he attended The Basic School in Quantico, VA, where he learned military tactics, Marine Corps heritage, and above all else, how to lead Marines. After graduating, he reported to First Transportation Battalion as a Ground Supply Officer, a career field that allows him to continue pursuing his passion for data and numbers. In his free time, he enjoys hiking, flying his drone, and most of all, watching the San Francisco 49ers play “bully ball” against all opponents. It truly is hard to find a bigger 49ers fan than him, and all those close to him know that Sundays (sometimes Thursdays and Mondays) are reserved from September to February every year.
Franklin Kenter
Professor Franklin Kenter is originally from the San Francisco Bay Area. In 2013, he completed his PhD in mathematics from University of California San Diego with an emphasis on discrete mathematics, network science and spectral graph theory. After spending three years at Rice University in Houston as a post-doctoral scholar, he joined the faculty at the United States Naval Academy in Annapolis, MD as an assistant professor of mathematics. His research program can be broadly described as “games on networks” with impacts on number theory, theoretical computer science, combinatorics and probability. Outside of academics, he has a keen interest in tabletop games and game design as well as recreational sports, especially ultimate and indoor soccer. His affinity for orange is not affiliated with any school or team.

Research Paper Finalists

Research Paper Submissions

Please submit your full manuscript if you were selected for this phase

Open Source Competition - Voting

Review our SSAC 2022 open source finalists below and cast your vote for your favorite submission HERE!

2022 Open Source Finalists

Understanding why shooters shoot - An AI-powered engine for basketball performance profiling
Short Abstract:
In professional basketball, it is crucial for the coaching staff of a team to analyze an opposing team and develop an effective strategy. Understanding player shooting profiles is an essential part of this analysis: knowing where certain opposing players like to shoot from can help coaches neutralize offensive gameplans from their opponents, while understanding where their players are most comfortable can lead them to developing more effective offensive strategies. We present a tool that can visualize player performance profiles in a timely manner while taking into account factors such as play-style and game dynamics, generating interpretable heatmaps that allow us to identify and analyze how these non-spatial factors affect the performance profiles. Our methods provide an effective and efficient tool that can provide insight into how certain players and teams play, without requiring the time-consuming process of reviewing hours of film, and could potentially be applied to other sports with adaptations.
GitHub Link (Open Source)
Author(s):

Alejandro Rodriguez Pascual, Ishan Mehta, Muhammad Khan, Rose Yu, Frank Rodriz

Learning from the Pros: Extracting Professional Goalkeeper Technique from Broadcast Footage
Short Abstract:
As an amateur goalkeeper playing grassroots soccer, who better to learn from than top professional goalkeepers? In this paper, we harness computer vision and machine learning models to appraise the save technique of professionals in a way those at lower levels can learn from. We train an unsupervised machine learning model using 3D body pose data extracted from broadcast footage to learn professional goalkeeper technique. Then, an “expected saves” model is developed, from which we can identify the optimal goalkeeper technique in different match contexts.
GitHub Link (Open Source)
Author(s):

Matthew Wear, Ryan Beal, Tim Matthews, Gopal Ramchurn, Tim Norman

Winning duels in VALORANT, a visualization of optimal positioning
Short Abstract:
This paper combines traditional sports analytics metrics with novel machine learning models in a brand new competitive Esport. By leveraging in-game positional data, we are able to evaluate the difficulty of a particular gun fight and assign a win probability to both sides. We use these predictions to identify players who are performing above or below expectation, and to identify strengths and weaknesses for NRG’s player development. We hope to see more analytics in Esports from current working professionals and the younger generation.
GitHub Link (Open Source)
Author(s):

DeMars DeRover

Using Machine Learning to Describe how Players Impact the Game in the MLB
Short Abstract:
This paper draws upon recent advances in Natural Language Processing (NLP) and Computer Vision (CV) to learn to describe the way in which players impact the game in the MLB. In particular, this work views the game as a sequence of events - instead of a set of summary statistics describing said events - and trains machine learning models to describe the impact that a given sequence of events has on the game. The models describe a sequence of events for a single player over a relatively small time period; so we refer to the model output as player form embeddings - descriptions of how they have impacted the game in the short term. We demonstrate how these embeddings can be used to describe players over the short- and long-term, and contain signals useful for predicting the outcome of games.
GitHub Link (Open Source)
Author(s):

Connor Heaton, Prasenjit Mitra