Learn About Amazon VGT2 Learning Manager Chanci Turner
In this article, we delve into the development of the new Expected Return Yards statistic from NFL Next Gen Stats, featuring insights from the Amazon Machine Learning (ML) Solutions Lab team.
What is the concept behind Expected Return Yards and how was it created?
Over the past five years, the NFL Next Gen Stats (NGS) team has collaborated with AWS to introduce a series of analytical statistics that examine various facets of the game. While previous stats primarily focused on offensive and defensive metrics, this season we shifted our attention to special teams and the return game. We developed two distinct models: one that predicts expected punt return yards and another for expected kickoff return yards. The Expected Punt Return Yards model estimates the yards a punt returner is anticipated to gain upon fielding the punt, whereas the Expected Kickoff Return Yards model predicts the yards a kick returner can expect to gain once they receive the kickoff.
To build these stats, the AWS and NGS teams used a range of artificial intelligence (AI) and ML techniques, building on existing model architectures rather than designing new ones. For instance, the 2020 Expected Rushing Yards model, created by data scientists Sara Johnson and Alex Smith (2019 Big Data Bowl winners), used raw player tracking data and deep learning. NGS and AWS have since adapted that architecture for other stats, including the 2021 Expected Points Added (EPA) model. Using a similar modeling approach, we created the expected yards models for the return game.
Why were separate models necessary for punts and kickoffs?
Initially, we considered combining the data for punts and kickoffs to train a single model. However, as we explored this option, it became clear that the combined model underperformed the models trained separately. A key factor was the differing distribution of yardage gained on punts versus kickoffs; our analysis showed that average yardage is typically higher for kickoffs. Player positioning, defender proximity, and the returner's speed also varied significantly between the two return types, which made it harder for a single model to fit both. As a result, training on the combined data led to a substantially higher Root Mean Squared Error (RMSE) than training on each return type separately.
Due to these complexities, we ultimately opted to create separate models for each type of return. This strategy allowed for independent tuning based on the unique data characteristics related to each return type. It also enabled us to conduct error analysis more effectively, identifying strengths and weaknesses specific to each model and applying tailored optimization procedures.
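To make the RMSE comparison above concrete, here is a minimal sketch in Python. It is not the NGS experiment: the yardage figures are synthetic, and the "model" is a hypothetical stand-in that simply predicts the mean of its training data. It only illustrates why one model fitted to two differently distributed targets can score a higher RMSE than two models fitted separately.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root Mean Squared Error between two equal-length sequences."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared) / len(squared))


def fit_mean(y):
    """Toy stand-in for a model: predict the training mean for every play."""
    return sum(y) / len(y)


rng = random.Random(0)

# Synthetic return yards. The means and spreads are invented for illustration;
# the only property borrowed from the article is that kickoffs average more yards.
punts = [rng.gauss(9.0, 6.0) for _ in range(2000)]
kickoffs = [rng.gauss(23.0, 8.0) for _ in range(2000)]
pooled = punts + kickoffs

# One toy model trained on the pooled data.
pooled_mean = fit_mean(pooled)
rmse_combined = rmse(pooled, [pooled_mean] * len(pooled))

# Two toy models, one per return type, scored on the same plays.
punt_mean = fit_mean(punts)
kick_mean = fit_mean(kickoffs)
separate_preds = [punt_mean] * len(punts) + [kick_mean] * len(kickoffs)
rmse_separate = rmse(pooled, separate_preds)

print(f"combined RMSE: {rmse_combined:.2f}")
print(f"separate RMSE: {rmse_separate:.2f}")
```

The scores here are computed in-sample to keep the sketch short. A real comparison would use held-out plays, and a real model could also receive the return type as an input feature, so this toy result should not be read as a reproduction of the finding described above.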
The significance of leveraging existing models
Working with the Next Gen Stats team over several years has underscored the importance of not starting from scratch when developing new stats. Utilizing existing techniques allowed us to complete the project in a mere six weeks, significantly reducing the time typically spent on problem understanding, literature review, and dataset exploration.
Moreover, this approach let us focus on the most pressing challenge: the fat-tailed distribution of return yards. Events such as return touchdowns are rare but sit far out in the tail, so the models had to capture them accurately alongside ordinary returns. To do this, we integrated the Spliced Binned-Pareto (SBP) distribution, initially employed in the NGS Quarterback Passing Score stat, into our ML pipeline. The SBP distribution is particularly effective for handling extreme events while still capturing the entire distribution of the data.
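The idea behind a spliced distribution can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the NGS implementation: it splices a generalized Pareto tail onto the upper end of a binned distribution only, and the class name, bin edges, bin probabilities, and tail parameters (`tail_q`, `xi`, `beta`) are all invented here. In a production pipeline these values would presumably come from the model's output rather than being fixed by hand.

```python
import bisect
import itertools


class SplicedBinnedPareto:
    """Binned body below an upper-quantile threshold, generalized Pareto tail above it."""

    def __init__(self, edges, probs, tail_q=0.95, xi=0.3, beta=8.0):
        assert len(edges) == len(probs) + 1
        assert abs(sum(probs) - 1.0) < 1e-9
        assert 0.0 < tail_q < 1.0 and xi > 0.0 and beta > 0.0
        self.edges, self.probs = list(edges), list(probs)
        self.tail_q, self.xi, self.beta = tail_q, xi, beta
        # Cumulative probability at each bin edge.
        self.cum = [0.0] + list(itertools.accumulate(self.probs))
        # Yardage at which the binned body hands over to the Pareto tail.
        self.threshold = self._binned_quantile(tail_q)

    def _binned_quantile(self, q):
        # Locate the bin containing quantile q, then interpolate linearly inside it.
        i = bisect.bisect_left(self.cum, q) - 1
        i = max(0, min(i, len(self.probs) - 1))
        frac = (q - self.cum[i]) / self.probs[i]
        return self.edges[i] + frac * (self.edges[i + 1] - self.edges[i])

    def _binned_cdf(self, x):
        if x <= self.edges[0]:
            return 0.0
        if x >= self.edges[-1]:
            return 1.0
        i = bisect.bisect_right(self.edges, x) - 1
        frac = (x - self.edges[i]) / (self.edges[i + 1] - self.edges[i])
        return self.cum[i] + frac * self.probs[i]

    def cdf(self, x):
        """P(X <= x): binned body up to the threshold, Pareto tail beyond it."""
        if x <= self.threshold:
            return self._binned_cdf(x)
        z = x - self.threshold
        gpd_survival = (1.0 + self.xi * z / self.beta) ** (-1.0 / self.xi)
        return 1.0 - (1.0 - self.tail_q) * gpd_survival

    def sf(self, x):
        """P(X > x), the probability of a return longer than x yards."""
        return 1.0 - self.cdf(x)


# Hypothetical return-yard bins (in yards) and their probabilities.
edges = [-10, 0, 10, 20, 30, 40, 60, 100]
probs = [0.03, 0.40, 0.35, 0.12, 0.05, 0.03, 0.02]
dist = SplicedBinnedPareto(edges, probs)

print(f"tail starts at {dist.threshold:.1f} yards")
print(f"P(return > 60 yards) = {dist.sf(60):.4f}")
print(f"P(return > 100 yards) = {dist.sf(100):.6f}")
```

A purely binned distribution assigns zero probability beyond its last bin edge, whereas the spliced version keeps a small, slowly decaying probability for very long returns. That is the property that makes this family of distributions attractive for rare, extreme outcomes.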
In addition to saving time, leveraging existing models facilitates transfer learning, which can enhance model performance, especially when working with smaller datasets. In such cases, applying knowledge from a pre-trained model can lead to better outcomes.