Predicting Soccer Goals in Near Real-Time Using Computer Vision

By: Emily Carter, Liam Johnson, Ava Martinez, Noah Brown, Mia Wang, and Ethan Davis

Date: 08 DEC 2020

Category: Amazon SageMaker, Artificial Intelligence, Sports

In the world of soccer, fans are often thrilled to witness a player darting down the sideline during a counterattack or when the ball is within the 18-yard box, as these moments can lead to exciting goals. However, capturing these rapid movements and predicting the outcome is a challenge for even the sharpest human eyes. Leveraging machine learning (ML), we can analyze intricate details at the pixel level, leading to a solution that accurately predicts goals before they occur.

Sportradar, a premier provider of real-time sports data, partnered with the Amazon ML Solutions Lab to create a computer vision-based Soccer Goal Predictor. This tool identifies thrilling moments that could culminate in goals, enhancing fan engagement and offering broadcasters a way to elevate viewer experience. While most action recognition models only identify events after they happen, the Amazon ML Solutions Lab devised a groundbreaking Soccer Goal Predictor that forecasts goals two seconds prior to their occurrence.

“We intentionally presented one of the toughest computer vision challenges to the Amazon ML Solutions Lab team to explore the boundaries of what is possible, and I am truly impressed with the results,” remarks Mark Thompson, Group CTO at Sportradar. “The team developed a video action recognition model to predict future soccer goals two seconds in advance using Amazon SageMaker, demonstrating its application for tracking match intensity. This advancement has opened up numerous new business opportunities. The implementation costs and latency of this model on our production pipeline utilizing AWS’s infrastructure look very promising. After today, I have no more doubts about the potential of computer vision in transforming our business.”

The team utilized Amazon SageMaker notebook instances to create a data processing pipeline that extracted training examples from raw videos, employing transfer learning to fine-tune an Inflated 3D Networks (I3D) model. The results have inspired Sportradar’s data science and innovation teams to devise new statistics for their broadcast videos, further enhancing fan engagement.

In this post, we will delve into how we applied transfer learning with the I3D model for goal prediction and utilized the inferences to create an intensity index that quantifies the chances of a team scoring. We will also explore the construction of a momentum index, which measures the rate of change during attacks (a term in soccer that refers to the movements of the team with the ball). With both the intensity and momentum indices, we can identify intense moments (those likely to lead to goals) in near real-time using live feeds, enabling the development of products that help broadcasters engage fans during games.

Data Processing and Model Building

To identify these critical moments, we framed the challenge as a binary classification problem: distinguishing activities that result in goals from those that do not. The positive class consists of video clips that are two seconds away from goals, while the negative class features clips depicting activities that do not lead to goals (the ballsafe class). We generated 1,550 clips from 398 professional soccer matches sourced from Sportradar.

Given the rapid pace of soccer, we used short video clips for training and extracted 5-second clips for this purpose. A significant challenge in video processing is that opening many video streams and extracting clips one after another is slow and can take hours. To speed up clip extraction, we built a data pipeline that uses multiprocessing in an Amazon SageMaker notebook. Running on an ml.c5.18xlarge instance with 72 vCPUs, it parallelized the I/O-intensive clip extraction and reduced the overall time from 12 hours to under 15 minutes.
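The parallel extraction described above can be sketched as follows. This is a minimal illustration, not the production pipeline: it assumes the ffmpeg command-line tool is installed, that each positive clip ends two seconds before the goal timestamp, and that jobs arrive as (source video, output path, goal time) tuples. The function names clip_window, build_ffmpeg_cmd, extract_clip, and extract_all are hypothetical.

```python
import subprocess
from multiprocessing import Pool

CLIP_SECONDS = 5.0   # clip length stated in the post
LEAD_SECONDS = 2.0   # assumption: a positive clip ends two seconds before the goal


def clip_window(goal_time, clip_seconds=CLIP_SECONDS, lead_seconds=LEAD_SECONDS):
    """Return (start, end) in seconds for a clip ending lead_seconds before the goal."""
    end = max(goal_time - lead_seconds, 0.0)
    start = max(end - clip_seconds, 0.0)
    return start, end


def build_ffmpeg_cmd(src, dst, start, end):
    """Build an ffmpeg command that cuts the span from start to end out of src."""
    return [
        "ffmpeg", "-y", "-loglevel", "error",
        "-ss", f"{start:.3f}",        # seek to the clip start
        "-i", src,
        "-t", f"{end - start:.3f}",   # clip duration
        dst,
    ]


def extract_clip(job):
    """Worker: job is (src, dst, goal_time). Runs ffmpeg and returns dst."""
    src, dst, goal_time = job
    start, end = clip_window(goal_time)
    subprocess.run(build_ffmpeg_cmd(src, dst, start, end), check=True)
    return dst


def extract_all(jobs, processes=72):
    """Run one ffmpeg cut per job across a pool of worker processes."""
    with Pool(processes=processes) as pool:
        return pool.map(extract_clip, jobs)
```

Because each job only waits on an external ffmpeg process, the work scales with the number of workers until disk or network throughput becomes the limit. On platforms that start worker processes with spawn rather than fork, extract_all should be called from under an `if __name__ == "__main__":` guard.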

After processing the data, we built a binary classification model using the I3D model from GluonCV’s model repository. The I3D model uses 3D convolutions to learn spatiotemporal features directly from videos. Because our dataset was small, we used transfer learning to fine-tune the I3D model on our data rather than training it from scratch. For more on fine-tuning and using the I3D model, see the GluonCV documentation.

Using Amazon SageMaker notebook instances, we first loaded an I3D network pre-trained on the Kinetics400 dataset into a Jupyter notebook. We then fine-tuned this network on Sportradar’s data and tuned its hyperparameters, particularly those specific to action recognition models (for example, the number of frames, the number of segments, and the frame sampling stride).
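To make the three sampling parameters concrete, the following sketch shows one way a clip's frames can be selected from the number of segments, the number of frames per segment, and the stride between sampled frames. It is an illustration under stated assumptions and is not GluonCV's sampler: the function sample_frame_indices is hypothetical, and it centers each sampled run inside its segment.

```python
def sample_frame_indices(total_frames, num_segments=1, clip_len=32, stride=2):
    """Pick clip_len frame indices, stride apart, from each of num_segments segments.

    Illustrative only: each run of frames is centered inside its segment.
    """
    span = (clip_len - 1) * stride + 1  # frames covered by one sampled run
    if total_frames < span:
        raise ValueError("video is too short for this clip_len and stride")

    seg_len = total_frames / num_segments
    indices = []
    for seg in range(num_segments):
        seg_start = int(seg * seg_len)
        seg_end = int((seg + 1) * seg_len)
        # Center the run in the segment, then clamp so it stays inside the video.
        start = seg_start + max(0, (seg_end - seg_start - span) // 2)
        start = min(start, total_frames - span)
        indices.extend(start + i * stride for i in range(clip_len))
    return indices
```

For a 5-second clip at an assumed 25 frames per second (125 frames), one segment of 32 frames with a stride of 2 covers 63 consecutive frames, which is about half of the clip. A larger stride widens the temporal coverage at the cost of skipping more frames in between.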

Results

For model evaluation, we prioritized recall as our main metric, aiming for near-100% goal detection (the positive class). The graphs below illustrate the confusion matrix and the precision-recall curve. It is evident that differentiating between the two classes becomes challenging when targeting near-100% recall. We recalibrated predicted probabilities to assess model performance in achieving 80% and 90% recall for the positive class.

The table below reports the recall for the negative (ballsafe) class and the precision for the positive (goal) class when the recall of the positive class is fixed. Our model still separates the two classes under these settings. When we hold the recall of the positive class at 90%, we correctly identify 68% of the negative class samples, with a goal precision of 75%.

Metric	At 80% Goal Recall	At 90% Goal Recall
Ballsafe Recall	0.81	0.68
Goal Precision	0.82	0.75
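
The fixed-recall evaluation above can be sketched as follows. This is a minimal illustration that assumes recall is fixed by choosing a decision threshold on the predicted goal probability, which is one common approach and may differ from the procedure used for the reported numbers. The function names and the toy data in the usage note are hypothetical.

```python
import math


def threshold_for_recall(scores, labels, target_recall):
    """Return the highest threshold whose goal recall is at least target_recall."""
    positives = sorted((s for s, y in zip(scores, labels) if y == 1), reverse=True)
    if not positives:
        raise ValueError("no positive samples")
    # Number of positives that must score at or above the threshold.
    needed = max(1, math.ceil(target_recall * len(positives) - 1e-9))
    return positives[needed - 1]


def metrics_at(scores, labels, threshold):
    """Compute goal recall, goal precision, and ballsafe recall at a threshold."""
    tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y == 1 and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s >= threshold)
    tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < threshold)
    return {
        "goal_recall": tp / (tp + fn) if tp + fn else 0.0,
        "goal_precision": tp / (tp + fp) if tp + fp else 0.0,
        "ballsafe_recall": tn / (tn + fp) if tn + fp else 0.0,
    }
```

Given validation scores and labels (1 for goal, 0 for ballsafe), calling threshold_for_recall(scores, labels, 0.9) and passing the result to metrics_at yields the three quantities for one column of the table. Lowering the threshold to capture more goals admits more ballsafe clips as false positives, which is why both ballsafe recall and goal precision drop as goal recall moves from 80% to 90%.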

Intensity and Momentum Index

Following training and validation, we selected the model that exhibited the highest recall on the validation dataset. We conducted inferences across three complete games, employing a moving window with the predicted probabilities serving as the intensity index. To gauge the velocity changes during attacks, we also developed a momentum index based on the slope of the linear regression line of predicted probabilities from the last four timestamps. Finally, we utilized min-max normalization to adjust the index between -1 and 1, allowing the momentum index to effectively track changes in predicted goal probabilities within recent seconds.
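
The momentum computation described above can be sketched as follows. This is a minimal illustration: it assumes the predicted probabilities arrive at evenly spaced timestamps and that the min-max bounds are taken over the whole sequence being scored. The function names regression_slope and momentum_index are hypothetical.

```python
def regression_slope(values):
    """Least-squares slope of values against evenly spaced timestamps 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


def momentum_index(probabilities, window=4):
    """Slope over the last `window` predictions, min-max scaled to [-1, 1]."""
    slopes = [
        regression_slope(probabilities[i - window + 1 : i + 1])
        for i in range(window - 1, len(probabilities))
    ]
    low, high = min(slopes), max(slopes)
    if high == low:
        # A constant slope carries no momentum information.
        return [0.0] * len(slopes)
    return [2 * (s - low) / (high - low) - 1 for s in slopes]
```

The intensity index needs no extra computation in this sketch, because it is the predicted goal probability of each moving window. For a live feed, the minimum and maximum slopes are not known in advance, so fixed bounds estimated from past games would be needed in place of the per-sequence bounds used here.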

The image below shows our model’s inference using a 5-second moving window over a 40-second clip. The red-marked areas indicate moments of heightened goal prediction intensity. The first two red bars correspond to near-goal situations, and the third coincides with the goal scored at the end of the clip.

The meter on the left shows the momentum index, which ranges from -1 to 1, while the match intensity line chart at the bottom displays the goal predictions from our model. Because play can change quickly within 2 seconds, a high predicted goal probability remains reasonable even when the shot that follows is missed.


Model Performance in Production

Sportradar continues to invest in computer vision through both internal R&D and external collaborations, and we have made significant strides in moving computer vision models swiftly from development to production.
