    Categories: Tech News

AI-powered closed captions can open new possibilities, and new pitfalls

AI is making captioning faster and cheaper, but it has a lot to learn. Cole Kan/CNET/Getty Images

Closed captions have become a prominent part of the TV- and film-watching experience. For some, they're a way to better understand dialogue. For others, particularly people who are deaf or hard of hearing, they're an essential accessibility tool. But captions aren't always accurate, and tech companies and studios are increasingly looking to AI to change that.

Captioning for TV shows and films is still largely done by actual people, who help ensure accuracy and preserve nuance. But there are challenges. Anyone who has watched a live event with closed captions knows the on-screen text often lags behind, and errors can creep in amid the rush of the process. Scripted programming allows more time for accuracy and detail, but it can still be a labor-intensive process, or, in the eyes of studios, an expensive one.

In September, Warner Bros. Discovery announced it was teaming up with Google Cloud to develop AI-powered closed captions, “coupled with human oversight for quality assurance.” In a press release, the company said using AI cut captioning costs by up to 50% and reduced the time it takes to caption a file by up to 80%. Experts say this is a glimpse of what's to come.

Joe Devon, a web accessibility advocate and co-founder of Global Accessibility Awareness Day, said of using AI in captioning: “Anyone who isn't doing it is just waiting to be displaced.” The quality of today's manually created captions, he added, is “all over the place, and it definitely needs to be improved.”

As AI continues to change our world, it's also changing how companies approach accessibility. Google's Expressive Captions feature, for example, uses AI to better convey emotion and tone in videos. Apple added transcriptions for voice messages and voice memos in iOS 18, which double as a way to make audio content more accessible. Both Google and Apple have real-time captioning tools that help people who are deaf or hard of hearing access audio content on their devices, and Amazon has added text-to-speech and captioning features to Alexa.

Warner Bros. Discovery is working closely with Google Cloud to roll out AI-powered captions, with a human overseeing the process.

Google/Warner Bros. Discovery

In the entertainment space, Amazon in 2023 launched a Prime Video feature called Dialogue Boost, which uses AI to identify and boost speech that might be hard to hear over background music and effects. The company also announced a pilot program in March that uses AI to dub movies and TV shows “that would not have been dubbed otherwise,” according to a blog post. And in a mark of just how reliant audiences have collectively become on captioning, Netflix in April rolled out a dialogue-only subtitle option for anyone who simply wants to understand what's being said in conversations, while leaving out the sound descriptions.

As AI continues to develop, and as we consume more content on screens both big and small, it's likely only a matter of time before more studios, networks and tech companies tap into AI's capabilities, hopefully without losing sight of why captions exist in the first place.

Coming to the forefront

Closed captioning in the US began as an accessibility measure in the 1970s, eventually expanding to wider audiences across everything from live television broadcasts to blockbuster movies. But many viewers who aren't deaf or hard of hearing also prefer to watch movies and TV shows with captions (usually referred to as subtitles, even though that term technically relates to language translation), especially when production dialogue is hard to make out.

Half of Americans say they usually watch content with subtitles, according to a 2024 survey by a language learning site, and 55% of total respondents said it's become harder to hear dialogue in movies and shows. Those habits aren't limited to older viewers, either: a 2023 YouGov survey found that 63% of adults under 30 prefer to watch TV with subtitles on, compared with 30% of people 65 and older.

“People, and content creators too, assume that captions are only for the deaf and hard of hearing community,” said Disability Belongs president and CEO Ariel Simms. But captions can also make it easier for anyone to follow along and retain information.

By speeding up the captioning process, AI can help make more content accessible, whether it's a TV show, movie or social media clip, Simms notes. But quality can suffer, especially in these early days.

“We have a name for AI-generated captions in the disability community. We call them ‘craptions,’” Simms said with a laugh.

That's because automated captions still struggle with things like punctuation, grammar and proper names. The technology may also fail to pick up on different accents, dialects or speech patterns the way a human would.

Ideally, Simms said, companies using AI to generate captions will still keep a human on board to maintain accuracy and quality. Studios and networks should also work directly with the disability community to ensure the process doesn't compromise accessibility.

“I'm not sure we can ever take humans fully out of the process,” Simms said. “I think the technology will continue to get better and better. But at the end of the day, if we're not partnering with the disability community, we're leaving out an incredibly important perspective on all of these accessibility tools.”

Studios such as Warner Bros. Discovery and Amazon, for their part, emphasize the role humans play in ensuring the quality of AI-powered captioning and dubbing.

“You're going to lose your reputation if you allow AI to dominate your content,” Devon said. “That's where the human in the loop comes in.”

But given how quickly the technology is developing, human involvement may not last forever, he predicts.

Studios and broadcasters will ultimately go with whatever costs the least, Devon said. But, he added, “if the technology works better to improve an assistive technology, then who is anyone to stand in the way of that?”

Splashier captions

It's not just TV and movies where AI is supercharging captioning. Social media platforms such as TikTok and Instagram have implemented auto-caption features to help make more content accessible.

These native captions often appear as plain text, but sometimes creators opt for flashier displays in the editing process. A common “karaoke” style highlights each individual word as it's spoken, sometimes using different colors for the text. But this more dynamic approach, while eye-catching, can compromise readability: people can't read at their own pace, and all the colors and motion can be distracting.
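
For a sense of the mechanics involved, here's a minimal sketch of how karaoke-style timing can be expressed in WebVTT, a standard web caption format that supports inline per-word timestamps. The words, timings and helper functions below are invented for illustration; no particular platform's pipeline is implied.

```python
# Build a WebVTT cue with inline per-word timestamps, which players that
# support them can use to highlight each word as it's spoken
# (karaoke-style). Words and timings are made up for illustration.

def format_timestamp(seconds: float) -> str:
    """Format seconds as a WebVTT hh:mm:ss.mmm timestamp."""
    minutes, secs = divmod(seconds, 60)
    return f"00:{int(minutes):02d}:{secs:06.3f}"

def karaoke_cue(words, cue_start: float, cue_end: float) -> str:
    """Build one cue whose body tags each word with its start time."""
    body = " ".join(f"<{format_timestamp(t)}>{w}" for w, t in words)
    return f"{format_timestamp(cue_start)} --> {format_timestamp(cue_end)}\n{body}"

words = [("The", 1.0), ("most", 1.3), ("accessible", 1.6), ("captions", 2.2)]
print("WEBVTT\n")
print(karaoke_cue(words, cue_start=1.0, cue_end=3.0))
```

A player that doesn't support per-word timestamps simply shows the whole line at once, which is part of why the plainer style tends to be the safer default.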

“There's no way to please 100% of users with captions, but only a small percentage benefits from the karaoke style,” said Meryl Evans, an accessibility marketing consultant who is deaf. She says she has to watch videos with dynamic captions multiple times to get the message. “The most accessible captions are boring. They let the video be the star.”

But there are ways to keep things simple while still adding useful context. Google's Expressive Captions feature uses AI to emphasize certain sounds and give viewers a better sense of what's happening on their phone. An excited “HAPPY BIRTHDAY!” might appear in all caps, for example, or a sports announcer's enthusiasm might come through in added letters that spell out an “amaaazing shot!” Expressive Captions also labels sounds like applause, gasps and whistles. All the on-screen text appears in black and white, so it isn't distracting.
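
Google hasn't detailed how Expressive Captions maps vocal cues to styling, but as a purely hypothetical sketch, a post-processing step along these lines could translate detected loudness and excitement scores into the kinds of transforms described above. The function, thresholds and scores are all invented:

```python
# Hypothetical sketch only: these rules, thresholds and names are
# invented to illustrate the general idea of mapping vocal cues
# (loudness, excitement) to expressive text styling.

def stylize(text: str, loudness: float, excitement: float) -> str:
    """Apply simple expressive transforms to one caption line."""
    if loudness > 0.9:
        # Stretch the first vowel to suggest drawn-out, emphatic delivery,
        # e.g. "amazing shot!" -> "aaamazing shot!"
        for vowel in "aeiou":
            if vowel in text:
                text = text.replace(vowel, vowel * 3, 1)
                break
    if excitement > 0.8:
        text = text.upper()  # render excited or shouted speech in all caps
    return text

print(stylize("amazing shot!", loudness=0.95, excitement=0.9))
# prints: AAAMAZING SHOT!
```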

Expressive Captions renders some words in all caps to convey excitement.

Google

Accessibility was a primary focus when developing the feature, but Angana Ghosh, a director of product management at Google, said the team knew that users who aren't deaf or hard of hearing would benefit from it, too. (Think of all the times you've been out in public without headphones but still wanted to follow what was happening in a video, for example.)

“When we develop for accessibility, we're actually building a better product for everyone,” Ghosh said.

Still, some people may prefer more vibrant captions. In April, the advertising agency FCB Chicago introduced an AI-powered platform called Caption with Intention, which uses animation, color and variable typography to convey emotion, tone and pacing. Specific text colors represent different characters' lines, and words are highlighted in sync with the actor's speech. Variations in type size and weight help convey how loudly someone is speaking, as well as their intonation. The open-source platform is available for studios, production companies and streaming platforms to adopt.

FCB partnered with the Chicago Hearing Society and members of the deaf and hard of hearing community to develop and test the captioning variations. FCB Chicago executive creative director Bruno Mazzotti said his own experience of being raised by two deaf parents also helped shape the platform.

“Closed captioning was always part of my life; it was a deciding factor in what we were going to watch as a family,” Mazzotti said. “Having the privilege of hearing, I could always notice when things didn't work well,” he added, such as when captions lagged behind the dialogue or turned into a jumble when several people were speaking at once. “The main goal was to deliver more emotion, pacing, tone and speaker identity to people.”

Caption with Intention is a platform that uses animation, color and varied typography to convey tone, emotion and pacing.

Caption with Intention

Eventually, Mazzotti said, the goal is to offer more customization options so viewers can adjust the intensity of the captions. Still, this more animated approach could prove too distracting for some viewers, and could make it harder for them to follow what's happening onscreen. It ultimately comes down to personal preference.

“That's not to say we should reject such approaches out of hand,” said Christian Vogler, director of the Technology Access Program at Gallaudet University. “But they need to be carefully studied with deaf and hard of hearing viewers to make sure they're a net benefit.”

No easy fix

Despite its current shortcomings, AI may eventually help expand the availability of captions and offer more customization, Vogler said.

YouTube's auto-captions are an example of how, despite a rough start, AI can make more video content accessible, especially as the technology improves over time. There may come a future in which captions adapt to different reading levels and speeds. Non-speech information could also become more descriptive, so that instead of a generic label like “scary music,” you'd get more detail conveying the mood.

For now, though, the technology's growing pains persist.

“AI captions still perform worse than the best human captioners, especially when audio quality is compromised, which is very common in both TV and movies,” Vogler said. Hallucinations can also serve up incorrect captions that alienate deaf and hard of hearing viewers. That's why he says humans should remain part of the captioning process.

Deborah Fels, director of the Inclusive Media and Design Centre at Toronto Metropolitan University, expects that's exactly what will happen: Human captioners will one day oversee the output AI churns out, rather than doing the manual labor themselves, she predicts.

“So now we have a different kind of job that's needed in captioning,” Fels said. “Humans are much better at finding errors and making decisions about how to correct them.”

And while AI captioning is still a nascent technology embraced by only a handful of companies, that likely won't be the case for long.

“They're all going in that direction,” Fels said. “It's a matter of time, and not that much time.”

