Published Date: 2023/11/28

How will AI commentary change sports viewing? "Voice Watch"

Racing cars roar past the front of the grandstands one after another. An AI provides real-time commentary describing this action to visually impaired spectators in the stands. From current race positions to predictions of how the race will unfold, the commentary rivals that of professional announcers, allowing visually impaired individuals to experience the excitement of motorsports.

Voice Watch, a real-time sports commentary generation AI that won the top prize in the AI category at the NY ADC Awards and was selected for this year's Good Design Best 100, is opening new avenues for visually impaired individuals to enjoy sports. What kind of fusion of ideas and technology lies behind this? We asked Kazuhiro Shimura, Creative Director and project leader.

Eliminating the Information Gap in Sports Viewing

──What led to the creation of the real-time sports commentary AI "Voice Watch"?

Shimura: It began with a Toyota Mobility Foundation open call I happened to see online. The challenge was: "How can people with disabilities truly enjoy racing?" At the time, I was on childcare leave and wanted to reset my thinking, considering work that could contribute more to society. I thought this might be interesting, so I assembled a team and entered.

Our first step was to listen directly to the voices of people with disabilities. We were introduced to several visually impaired members working at Dentsu Group companies and spoke with them directly. Before these conversations, we had assumed the focus would be on solving issues like "How to enable travel to motorsports venues?" or "How to make moving around the venue smoother?" But hearing directly from visually impaired individuals made us realize there were many other critical challenges to address. They repeatedly voiced concerns like: "Even if we get to the venue, we can't really tell what's happening right in front of us," "It feels awkward to constantly ask friends or family who came with us to explain the race situation," and "Even when everyone else is excited, we can't share in that feeling, so we lose the desire to go."

In other words, there's an information gap between visually impaired spectators who can't see the race and sighted spectators, even though they're in the same venue. This gap creates a psychological barrier that makes them hesitate to go to the venue to watch. Realizing that this information gap at the race venue was the core issue to solve, I wanted to tackle this problem. Our goal was to create a society where visually impaired individuals aren't left out of the excitement of sports spectating, and where both sighted and visually impaired people can enjoy watching together. With this vision, we developed an idea, entered it into a public contest, and the result was the real-time sports commentary generation AI, "Voice Watch."

Three AI systems work together: object recognition, sign detection, and a speech framework

──How does "Voice Watch" generate real-time sports commentary?

Shimura: While "Voice Watch" functions as one large AI system overall, a closer look shows it is composed of three distinct AI components.

"Voice Watch" System Screen

The first is an AI that performs object recognition from footage captured by fixed cameras tracking the race cars. This essentially acts as the eyes for visually impaired individuals. By tracking the race cars, it describes the race unfolding before them, answering questions like "Which team just passed?" or "Who is battling whom?"

The second AI analyzes vast amounts of real-time driving data, such as lap times and positions, to detect "signs" of impending changes in the race dynamics. It predicts future race developments, such as "The second-place car looks poised to catch and overtake the leader in the next lap" or "The gap is gradually closing; the third-place car seems likely to move up to second."
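As a rough illustration of what detecting such a "sign" could involve, the sketch below projects how many laps a chasing car needs to close the gap, based on the gap recorded at the end of recent laps. This is not the actual Voice Watch model: the function names, the linear projection, and the three-lap horizon are all assumptions made for the example.

```python
import math
from typing import Optional, Sequence


def laps_until_catch(gaps_s: Sequence[float]) -> Optional[float]:
    """Project how many laps remain until the chasing car closes the gap.

    gaps_s holds the gap in seconds at the end of each recent lap, oldest first.
    Returns None when there is too little data or the gap is not shrinking.
    (Hypothetical helper; a simple linear projection, not the real system.)
    """
    if len(gaps_s) < 2:
        return None
    # Average change in the gap per lap over the window (negative means closing).
    change_per_lap = (gaps_s[-1] - gaps_s[0]) / (len(gaps_s) - 1)
    if change_per_lap >= 0:
        return None
    return gaps_s[-1] / -change_per_lap


def describe_sign(chaser: str, target: str, gaps_s: Sequence[float],
                  horizon_laps: float = 3.0) -> Optional[str]:
    """Return a short prediction only if the catch looks imminent."""
    eta = laps_until_catch(gaps_s)
    if eta is None or eta > horizon_laps:
        return None
    return (f"{chaser} is closing on {target} and could catch up "
            f"within {math.ceil(eta)} lap(s).")


if __name__ == "__main__":
    print(describe_sign("Car 2", "Car 1", [2.0, 1.5, 1.0]))
```

A production system would presumably weigh far more signals (tire wear, pit stops, traffic), but the idea is the same: turn a stream of timing data into a prediction only when the trend is strong enough to be worth announcing.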

The third is a proprietary speech framework AI that extracts professional commentary expertise by learning from past race commentaries delivered by live announcers. This AI collaborates with the first two AI systems to generate commentary that incorporates both the current situation and predictions about future developments. Having learned from professional announcers, the AI commentary sounds natural and immersive, just like a real announcer speaking.
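To make the division of labor concrete, here is a minimal sketch of how output from the first two components might be merged into a single line of commentary. The real speech framework is learned from professional announcers; the fixed sentence templates here are only a stand-in, and the `Observation` and `Prediction` structures are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Observation:
    """What the object-recognition component reports (hypothetical structure)."""
    car: str
    event: str  # e.g. "passed the grandstand"


@dataclass
class Prediction:
    """What the sign-detection component reports (hypothetical structure)."""
    chaser: str
    target: str
    laps: int


def compose_commentary(obs: Observation, pred: Optional[Prediction] = None) -> str:
    """Combine the current situation with an optional prediction.

    Uses hand-written templates as a placeholder for a learned speech model.
    """
    line = f"{obs.car} has just {obs.event}!"
    if pred is not None:
        unit = "lap" if pred.laps == 1 else "laps"
        line += f" {pred.chaser} could catch {pred.target} within {pred.laps} {unit}."
    return line


if __name__ == "__main__":
    obs = Observation(car="Car 28", event="passed the grandstand")
    pred = Prediction(chaser="Car 2", target="Car 1", laps=2)
    print(compose_commentary(obs, pred))
```

The point of the sketch is the interface rather than the wording: the commentary step receives "what is happening now" and "what may happen next" as separate inputs and decides how to voice them together.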

Of course, since you're at the venue, we want you to enjoy not just the commentary but also the powerful engine sounds. That's why we carefully balance the volume of the commentary audio with the ambient venue sounds.

──When we think of sports commentary, it often feels aimed at those who can't attend the venue, like TV or radio broadcasts. In that sense, "Voice Watch" is a "reversal of perspective," being designed for those actually at the venue.

Shimura: Whether it's baseball or soccer, you don't need commentary when watching from the stands because you can see what's happening. In fact, when developing "Voice Watch," other staff members and I wore blindfolds and listened to the circuit sounds to understand how visually impaired spectators might experience the stands. It was incredibly scary, or rather, eerie. The roar of engines would hit our ears at unpredictable moments, which felt extremely unpleasant. However, when we listened to the same circuit sounds alongside the "Voice Watch" commentary, we gradually started to feel excitement from those previously unpleasant roars. The human brain is truly fascinating. Hearing the roars together with the commentary seems to let you picture the scene in your mind, transforming what was once unpleasant into excitement.

──I hear you had visually impaired individuals use "Voice Watch" at an actual race. How did they react?

Shimura: We had visually impaired individuals use it at the "Super Taikyu," Japan's largest endurance race. Many commented, "I was able to enjoy the race," and "With 'Voice Watch,' I'd like to go watch sports events." Creating that first experience of enjoying watching a race for people who had never attended a motorsports event before was deeply moving for us as well.

We also received feedback from sighted attendees at the venue saying, "Even for us, it made understanding the race developments easier and more interesting. It made watching the race more enjoyable." While Voice Watch was created for visually impaired users, discovering that it also provided a new and enjoyable viewing experience for sighted users was a pleasant surprise.

Generating AI Commentary for Japan's Largest Endurance Race, "Super Taikyu"

For sports with passion but lacking commentary

──Are there plans to expand "Voice Watch" to other sports?

Shimura: We're currently exploring various possibilities. We asked visually impaired users who experienced Voice Watch at Super Taikyu what sport they'd like to see it implemented in next. We expected major sports like baseball or soccer to be mentioned, but surprisingly, many people answered "children's sports days." Many shared experiences like: "Even when I go to my child's sports day, I can't understand what's happening," or "I want to ask other parents how my child is doing, but everyone is so absorbed cheering for their own kids that it's impossible to ask." For visually impaired parents, watching sports days presented a significant barrier.

So, we tried providing real-time commentary during the races at an actual sports day. The result? By listening to live commentary of their child running through "Voice Watch," visually impaired fathers were able to enjoy their child's sports day more than ever before.

Generating AI Commentary for Elementary School 50-Meter Races at Children's Sports Days

Children's sports days are one example, but there are many sports where the passion is real yet no human commentator is present because of cost constraints. That experience made me feel that "Voice Watch" could be useful for such events.

──"Voice Watch" was developed for the visually impaired, but is there potential to expand its use to others?

Shimura: Just as it can provide commentary on your child running in a sports day event, the AI can generate commentary focused on specific athletes. We call this "favorite athlete commentary," and we believe it greatly expands Voice Watch's potential.

For example, for someone cheering on the Toyota team at the Super Taikyu, we could generate commentary focused solely on the Toyota racing cars. In other words, it can create personalized commentary for specific fan groups, not just generic broadcasts for everyone.
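One simple way to picture this kind of personalization is as a filter applied before commentary is voiced. The sketch below keeps only the events that involve a listener's chosen team; the `RaceEvent` type, the team names, and the filtering approach are illustrative assumptions, not a description of how Voice Watch is built.

```python
from typing import Iterable, List, NamedTuple


class RaceEvent(NamedTuple):
    """A single commentary-worthy event (hypothetical structure)."""
    team: str
    text: str


def favorite_commentary(events: Iterable[RaceEvent], favorite_team: str) -> List[str]:
    """Keep only the commentary lines about the listener's favorite team."""
    wanted = favorite_team.casefold()
    return [e.text for e in events if e.team.casefold() == wanted]


if __name__ == "__main__":
    events = [
        RaceEvent("Toyota", "The Toyota car sets its fastest lap so far."),
        RaceEvent("Team B", "Team B heads into the pits."),
        RaceEvent("toyota", "The Toyota car moves up to second place."),
    ]
    for line in favorite_commentary(events, "Toyota"):
        print(line)
```

Because the selection happens per listener, the same race data could feed many different commentary streams at once, one for each fan group.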

Furthermore, since the commentary language can be switched, at international events attracting spectators from overseas, we could generate "favorite athlete commentary" about each country's drivers in that country's language.

Our development team initially aimed to "get as close as possible to human commentary." However, we've now realized that AI commentary possesses unique appeal not found in human commentary. Moving forward, we aim to create new spectator experiences by pursuing the potential of "Voice Watch."


AI is merely a means to realize ideas

──Mr. Shimura, you were also involved in developing "TUNA SCOPE," which gained attention for using AI to identify high-quality tuna. Are there common development philosophies or technologies between "TUNA SCOPE" and "Voice Watch"?

Shimura: Whether it's "TUNA SCOPE" or "Voice Watch," the development philosophy is consistent: creating something new, beneficial to society, and exciting.

──Both start with AI technology. Is the starting point using AI to launch something new?

Shimura: People often think that, but it's actually a bit different. The starting point isn't "Can we use AI technology to create something new?" It's "There's this challenge in the world; can we solve it with an idea?" AI comes second: we turned to it because it was necessary to solve a societal challenge or to realize an idea that excites people.

The core of our work as an advertising agency isn't so much about technological innovation, but rather about combining creativity with technology to create new experiences that don't yet exist in the world, and solving problems for people and society.

If you start with AI technology and try to create something new, there are countless people worldwide thinking the same way, and the result is often very similar solutions. Our approach is different: the idea comes first. If AI is necessary to realize that idea, then we use AI. Because our starting point is different, I believe we can continue creating new types of AI that no one else has.

Discovering new societal challenges, solving them in surprising ways, and building a better future. That is Dentsu Inc.'s strength and the true essence of our work.

 

Voice Watch Website: https://voicewatch-project.com/


【Staff List】
Creative Director: Kazuhiro Shimura (Dentsu Inc. / Future Creative Center)
Art Director: Seri Tanaka (Dentsu Inc. / Future Creative Center)
Planner: Sho Tomita (Dentsu Inc. / Future Creative Center)
Planner: Ryo Seki (Dentsu Inc. / Future Creative Center)
Business Producer: Masashi Kodama (Dentsu Inc.)
Data Scientist: Hatsumi Suzuki (Dentsu Digital Inc.)
Data Scientist: Tomoaki Uemura (Dentsu Digital Inc.)
Data Scientist: Samaneh Arzpeima (Dentsu Digital Inc.)
Producer: Yusuke Michise (Dentsu Live Inc.)
Producer: Daiki Shimomichi (Dentsu Live Inc.)
Producer: Masaya Ishii (Dentsu Live Inc.)
Director: Tomoyuki Kato (Dentsu Live Inc.)

 

Author

Kazuhiro Shimura

Dentsu Inc.

Studied biotechnology at university and graduate school. Currently aims to expand the creative domain, working not only in advertising but also in service & product development and innovation. Awards include Cannes Lions, One Show, Clio Awards, D&AD, LIA, Adfest, Spikes Asia, ACC Awards, and many others. Served as a juror for the Cannes Lions Digital Craft category in 2017 and the Spikes Asia Innovation category in 2018. Has given numerous international lectures, including the Cannes Lions Dentsu Inc. Seminar "Creativity for Business Innovation."
