On 14 July 2024, less than five minutes remained in the championship match of the Euro 2024 tournament to determine the best national soccer team in Europe. Spain and England were tied 1-1 when Spanish player Mikel Oyarzabal lunged for a ball at the top of the penalty area and toed in what appeared to be the game-winning goal [1]. However, Oyarzabal was close to being offside, or too far down the field, on the play. Offside calls are notoriously difficult to make: when the ball is played, the attacking player’s head, body, or feet cannot be ahead of both the ball and the second-to-last defender [2]. To be sure about the call, the head referee on the field could have asked for help from the video assistant referee (Fig. 1), or VAR, which is actually a team of officials that monitors multiple video feeds of the game [3].
But no VAR review was necessary. Artificial intelligence (AI) had already made the call. Euro 2024 used an AI game-monitoring system, known as semi-automated offside technology, which featured ten cameras mounted under the roof of each stadium [4]. Capturing 50 frames per second, the cameras enabled the AI system to determine the positions of 29 parts of each player’s body, including the head, upper arms, and knees [5]. The ball also contained a sensor that allowed precise tracking of its movement and detected when it had been kicked [4]. AI ruled that Oyarzabal was barely onside: the right kneecap of one of England’s defenders was ahead of Oyarzabal’s position when the ball was passed to him [6]. The goal counted, and Spain went on to win the championship.
AI is already established in soccer. The 2022 World Cup and the 2023 Women’s World Cup adopted an AI system similar to the one used at Euro 2024 for calling offside, as have leading national leagues such as England’s Premier League, Italy’s Serie A, and Spain’s LaLiga [5]. In addition, AI determines whether a ball crosses the goal line [7]. Other sports and competitions have also been testing or have introduced AI-based officiating or judging, including Major League Baseball (MLB) and the National Basketball Association (NBA) in the United States, and men’s and women’s gymnastics [8], [9]. At least initially, AI is mostly used to aid human officials, often with the most difficult decisions. But tennis, which debuted automatic line calling for player challenges nearly 20 years ago, is going further. By 2025, almost no top-level men’s tournaments will have human line judges [10].
Sports are embracing AI in the hope that it will improve the accuracy of calls, speed up play by eliminating protracted video reviews, and increase objectivity in competitions where scores are often subjective, such as gymnastics. Fans should expect AI officiating to expand into more sports, said John Eric Goff, professor of physics at the University of Lynchburg in Lynchburg, VA, USA. “Anywhere it can be used, it will be used.”
Sports’ track record with AI also has broader implications. AI is moving into more and more areas, and sports, a pioneer in adopting the technology, may give an early indication of how well it will work in practice and how humans respond to it. “Sport has been one of the first successful use cases of AI,” said Patrick Lucey, chief scientist at Stats Perform, a sports data and AI company based in London, UK.
One reason for AI’s increasing use in sports is that the technology is now powerful enough to track fast-paced action. “AI is about computing speed and the ability to handle large data sets,” said Goff. But another reason is dissatisfaction with refereeing. Officials inevitably make mistakes, and at one time video replay seemed like the solution to these errors. However, fans, coaches, and players detest VAR and other replay systems and distrust the results. Reviews interrupt play and can halt games for long periods. In 2022, for instance, viewers of US National Football League games had to wait an average of 2 minutes and 25 seconds for a replay decision, although reviews took almost a minute longer in 2005 [11]. And even after spending minutes poring over videos that show the action in slow motion from multiple angles, referees still get calls wrong [5].
In 2006, professional tennis became the first sport to introduce an AI line-calling system, known as Hawk-Eye (Fig. 2) [12]. Now owned by Sony Group of Tokyo, Japan, the system is named for the British AI engineer Paul Hawkins, who developed it 25 years ago, originally for use in cricket [13]. Although details of the set-up vary from tournament to tournament, it typically includes 8 to 12 cameras positioned around the court that record at up to 340 frames per second [14]. Hawk-Eye is not an instant-replay system. Instead, it analyzes the images the cameras capture to calculate the position of the ball and project whether it will land in or out on a computer model of the court created from measurements of the actual court [15]. The results now have an error of 2.2 mm, or about 3% of the diameter of a tennis ball [16].
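The projection step described above can be illustrated with a minimal Python sketch. It is not Hawk-Eye’s actual algorithm, which is not described here. It assumes a simple ballistic model with no drag or spin, a velocity estimated crudely from two consecutive tracked positions, and a ball diameter of 67 mm (roughly mid-range for a regulation ball). The function names and the court coordinate convention are hypothetical.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def project_landing(p0, p1, dt):
    """Estimate the landing point from two tracked (x, y, z) positions in metres.

    p0 and p1 are taken from consecutive frames dt seconds apart; z is the
    height above the court. Assumption: no drag and no spin.
    """
    # Finite-difference velocity estimate between the two frames.
    vx = (p1[0] - p0[0]) / dt
    vy = (p1[1] - p0[1]) / dt
    vz = (p1[2] - p0[2]) / dt
    x, y, z = p1
    # Positive root of z + vz*t - 0.5*G*t^2 = 0: time until the ball reaches the court.
    t = (vz + math.sqrt(vz * vz + 2 * G * z)) / G
    return x + vx * t, y + vy * t


def call_shot(landing_y, line_outer_edge_y):
    """Return 'in' if the projected landing point is on or inside the line's outer edge."""
    return "in" if landing_y <= line_outer_edge_y else "out"


def error_share_of_ball(error_mm=2.2, ball_diameter_mm=67.0):
    """Reported error as a fraction of an assumed ball diameter."""
    return error_mm / ball_diameter_mm
```

With the assumed 67 mm diameter, error_share_of_ball() returns roughly 0.033, consistent with the figure of about 3% quoted above. A real system would fit many frames from several cameras rather than two points.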
Like Hawk-Eye, the semi-automated offside technology used in soccer leagues and tournaments such as Euro 2024 relies on modeling. From the camera input, it creates three-dimensional avatars of the players on the pitch and can compare their positions at specific times to determine if an attacking player is offside [5]. The system is semi-automated because it still needs help from VAR, which identifies the players who took part in the play in question and determines which ones touched the ball [5].
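The position comparison can be sketched in a few lines of Python. This is an illustration of the offside rule as summarized in this article, not the deployed system. It assumes that each player is reduced to a dictionary of body-part positions along the length of the pitch (larger values are closer to the defending team’s goal line), that hands and arms are ignored, and that all positions are sampled at the moment the pass is played. The part names and function names are hypothetical, and other conditions of the rule, such as being in the opponents’ half, are omitted.

```python
def foremost(parts):
    """Position of the countable body part closest to the goal line.

    parts maps hypothetical part names to positions in metres. Parts whose
    names contain 'hand' or 'arm' are skipped, since only the head, body,
    and feet count.
    """
    return max(
        x for name, x in parts.items()
        if not any(tag in name for tag in ("hand", "arm"))
    )


def is_offside(attacker, defenders, ball_x):
    """True if the attacker is ahead of both the ball and the second-to-last defender."""
    if len(defenders) < 2:
        raise ValueError("need at least two defenders (the goalkeeper counts)")
    # Sort defenders from closest to the goal line to farthest.
    lines = sorted((foremost(d) for d in defenders), reverse=True)
    second_to_last = lines[1]
    a = foremost(attacker)
    return a > ball_x and a > second_to_last
```

For example, an attacker whose foremost countable part is at 30.20 m is onside if the second-to-last defender’s knee is at 30.25 m, which mirrors the kind of margin described for the Oyarzabal goal. The numbers here are invented for illustration.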
Tennis has the most experience with AI officiating. Most fans, players, and officials (except perhaps for the line judges who are about to lose their jobs) agree that Hawk-Eye has improved the sport [12]. It has not stopped players from disputing calls or misbehaving on court, but they almost always accept its verdicts [12]. Tennis initially opted for a system that allowed players to challenge a limited number of human calls per set. Instead of creating an annoying interruption, the challenges added drama, as players and fans waited expectantly for Hawk-Eye’s animation of the ball’s trajectory to play on the video screen in the stadium. Hawk-Eye does not get 100% of calls right, but that does not matter, said Daniel Martin, associate professor of economics at the University of California, Santa Barbara, USA, who has studied the technology’s effects on line judges. “We do not need AI to be perfect, we only need it to be really good at times when humans are making mistakes.”
Thanks in part to Hawk-Eye’s record in tennis, the system has also been adopted by professional cricket, badminton, rugby, the NBA, and various soccer leagues; the soccer leagues rely on it to determine whether a ball has crossed the line into the goal. Hawk-Eye also opened the way for other applications such as the semi-automated offside technology and AI-based judging of routines in gymnastics [9].
Despite these successes, some sports are still working out how to integrate AI. MLB’s experience illustrates some of the complexities. MLB started testing an automated system for calling balls and strikes in its lower-tier leagues in 2019, but has decided the system is still not ready for the major leagues [17]. One reason the deployment has been so slow is that determining whether a pitch should be called a strike is more difficult than judging whether a tennis shot is in, said Goff. For one thing, “the lines do not move on the tennis court,” he said. However, the strike zone, which extends from just below the batter’s kneecaps to the midpoint between the shoulders and the top of the pants [18], does change size each time a different batter steps up to the plate, he noted, because players differ in height. Moreover, MLB has had to repeatedly adjust how the AI system gauges whether a pitch is a strike, for example by lowering the upper limit of the strike zone, to match how umpires call the game [19]. The upshot is that MLB does not plan to use AI full-time in the major leagues until 2026 at the earliest and will probably institute a challenge system rather than letting AI completely handle all ball and strike calls [17].
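Goff’s point about a moving target can be made concrete with a short Python sketch. It is an illustration under stated assumptions, not MLB’s system. It assumes the three landmark heights for each batter are already measured in metres and checks only the height of the pitch, ignoring the width of the plate and the size of the ball. The function names and the top_offset_m tuning parameter are hypothetical.

```python
def strike_zone(knee_bottom_m, pants_top_m, shoulders_m, top_offset_m=0.0):
    """Return (low, high) vertical limits of the strike zone for one batter.

    low is the point just below the kneecaps; high is the midpoint between
    the shoulders and the top of the pants. top_offset_m is a hypothetical
    adjustment (negative values lower the upper limit).
    """
    low = knee_bottom_m
    high = (shoulders_m + pants_top_m) / 2 + top_offset_m
    return low, high


def is_strike_by_height(pitch_height_m, zone):
    """True if the pitch crosses the plate within the zone's vertical limits."""
    low, high = zone
    return low <= pitch_height_m <= high
```

With invented measurements, a pitch crossing at 1.30 m is above the zone of a batter whose limits are (0.50, 1.25) but inside the zone of a taller batter whose upper limit is about 1.35 m, so the same pitch is a ball for one and a strike for the other.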
AI’s ability to learn could allow it to work in other situations. For example, the team of Enqi Ma and Zbigniew Kabala, an associate professor of civil and environmental engineering at Duke University in Durham, NC, USA, has taught AI to make one of the toughest calls in squash, interference, in which one player obstructs the other from hitting the ball [20]. Kabala said the idea for the project came from Ma, who at the time was a high school student and is now an undergraduate at the University of Pennsylvania in Philadelphia, PA, USA. From videos of professional squash matches, the researchers picked 400 instances of potential interference and digitized key frames, collecting data such as the location of the ball’s first and second bounces and the relative positions of the two players. The pair then fed the data into two machine learning models. With only nine variables, one model correctly recognized interference 86% of the time, which is comparable to the performance of a human referee [20]. “Squash is a very tough game to judge,” Kabala said. “We were very surprised by our results.” The models are not ready to referee a live squash match, but the researchers are working to improve them. Kabala noted that they only used 400 examples for training. “If we have 4000 or 40 000, the accuracy should be much higher.”
The NBA, which relies on AI to detect only one violation, goaltending [8], might also benefit from a broader use of the technology. Ayush Pai, an undergraduate student at the Georgia Institute of Technology in Atlanta, GA, USA, is developing a machine learning system that can identify more violations. Using annotated stills from game footage, the system learns to recognize the basketball and identify the positions of different parts of the players’ bodies, allowing it to call traveling and double dribbles [21]. When Pai posted a video describing an early version of his system, he heard from the NBA and from some of the league’s referees. He said the referees told him that they would welcome a system that helped them make close calls, provided it did not cost them their jobs. Pai said he is now improving the system to recognize more violations, including reach-in and blocking fouls. He predicts that AI refereeing “will be a big part of the NBA in the future.”
AI officiating could lead to better calls in a surprising way: by making human officials more accurate, according to a team led by Daniel Martin and David Almog, a PhD student in managerial economics and strategy at Northwestern University in Evanston, IL, USA [22]. The researchers analyzed Hawk-Eye data from 698 matches played before and after tennis instituted challenges in 2006. Overall, the accuracy of line calls rose 8% once players could question a call.
But there was one exception: serves within 20 mm of the line. In those cases, officials were wrong almost 23% more often after challenges began, usually because they called serves in that were actually out [23]. Almog said that two changes account for this apparent contradiction. Line judges paid closer attention after the introduction of Hawk-Eye, indicating that “they could have done better,” he said. Once players could challenge, line judges also became more conservative, calling more close balls in. On groundstrokes, the line judges’ increased attention had a bigger effect than their greater conservatism, and their overall accuracy increased, Almog explained. But serves travel so much faster than groundstrokes that even paying more attention did not improve accuracy; the officials were already at their limit of perception, Almog said. On those shots, the officials’ tendency to play it safe led to a reduction in accuracy.
Martin added that the study’s results are relevant beyond sports. Many of the new or proposed applications of AI involve supervising humans and fixing their mistakes. Examples include the lane-correction features in cars and systems that check the decisions of doctors and judges [22]. Businesses and organizations are rushing to deploy these systems because they seem like a low-cost way to improve performance. But “we need to know how humans will react,” he said.
Even with technological improvements, AI cannot yet replace human officials in most situations, experts say. Referees not only make calls AI is not yet capable of making, but they also manage the game, maintain player discipline, and take on other key functions. Even tennis is not giving up on humans and will retain chair umpires, who keep score and serve as match administrators. And as much as fans like to complain about bad calls, they are not ready to completely delegate officiating to machines, said Goff. “The mistakes that human beings make are part of what makes sports fun.”