The Trials and Tribulations of AI Voice Technology



This is an audio transcript of the FT News Briefing podcast episode: ‘The Trials and Tribulations of AI Voice Technology’

Marc Filippino
Good morning from the Financial Times. Today is Wednesday, June 21st. This is the FT News Briefing.

[MUSIC PLAYING]

Marc Filippino
Singapore’s sovereign wealth fund wants to invest heavily in the United States, and President Joe Biden’s son faces federal charges. Plus, the FT’s Madhumita Murgia tests out AI voice technology.

Madhumita Murgia
I can hear flashes of the real Madhu in there, but I still feel like I can tell the difference.

Marc Filippino
I’m Marc Filippino, and here’s the news you need to start your day.

[MUSIC PLAYING]

Marc Filippino
Singapore’s sovereign wealth fund is ramping up its US dealmaking. The fund is called GIC. It recently told private equity and venture capital executives that it wanted to increase its exposure to US-focused funds, according to FT sources. Like many global investors, GIC is looking to diversify beyond China. Investors worry about escalating geopolitical tensions, a slowing economy and crackdowns on business in China. In the US, GIC focuses on investing in venture capital funds and technology companies. So despite last year’s big downturn, there still seems to be some optimism about the US tech industry.

[MUSIC PLAYING]

Marc Filippino
US President Joe Biden’s son is facing federal charges. Hunter Biden has agreed to plead guilty to failing to pay income tax. He also reached an agreement with prosecutors on a separate charge of possessing a firearm while addicted to drugs. Lauren Fedor, the FT’s acting Washington bureau chief, has more details.

Lauren Fedor
The Justice Department has been investigating Hunter Biden for several years. On top of that, Republicans have made a lot of noise about Hunter Biden, whether it’s his tax problems, this gun charge, or his overseas business dealings. This development is in some ways a big moment, and Hunter Biden and the White House want to move on. But Republicans still control the House of Representatives, and they are adamant that they will continue to investigate Hunter Biden through congressional committees. So this is not necessarily the end of Hunter Biden making headlines in Washington and elsewhere.

Marc Filippino
So you might think this could affect Joe Biden’s 2024 re-election campaign. But Lauren says that’s not necessarily the case.

Lauren Fedor
Hunter Biden’s personal issues, his drug and addiction issues, and some of these investigations have been going on for a very long time. Looking back at the 2020 election, the Hunter Biden question was part of the conversation at the time, and we all know how that election ended. You know, now we’re in the aftermath of 2020. We’ll have to see how it unfolds. For now, however, it may already be baked in to some extent in terms of voters’ perceptions on both the right and the left.

Marc Filippino
Lauren Fedor is the FT’s acting Washington bureau chief.

[MUSIC PLAYING]

Marc Filippino
Artificial intelligence voice technology is already being used in a variety of applications, including audiobooks, video narration, and customer service interactions. ElevenLabs is one of the start-ups selling synthetic speech generation. It rolled out an early version of its product in January and has just raised $19 million, valuing the company at around $100 million. The technology is popular because it can mimic the voices of politicians and celebrities. But it also faces ethical and legal issues. The FT’s AI editor, Madhumita Murgia, reports.

Madhumita Murgia
This is a clip from A Nightmare on Elm Street.

Clip from A Nightmare on Elm Street
[Speaking in Polish]

Madhumita Murgia
This Polish dub is the kind of cinematic experience Mati Staniszewski had to endure growing up in Poland.

Clip from A Nightmare on Elm Street
[Speaking in Polish]

Mati Staniszewski
Every movie you watch, every American or British movie, is voiced over by a single narrator rather than properly dubbed.

Madhumita Murgia
Freddy Krueger, a screaming teenager, a cop: they all have the same voice because they’re all voiced by the same person.

Mati Staniszewski
And as you can imagine, it’s a pretty awful experience.

Madhumita Murgia
The 28-year-old engineer eventually moved from Poland to London. But his experience watching movies is one of the reasons he started his AI voice company. Generative AI technologies like ChatGPT can create new text, music, and images. AI voice technology can likewise imitate and create any voice, making it much easier to distinguish between Freddy Krueger and his victims. Mati co-founded ElevenLabs with another frustrated Polish movie buff, and it is now one of the leading text-to-speech AI start-ups. I met Mati at a co-working space in London. In a glass-walled conference room, he opened his laptop to show me how the technology works.

Mati Staniszewski
This is instant voice cloning. With her permission, I’ll clone Madhu’s voice. I’ll call it AI Madhu. Then I’ll upload the file Madhu provided.

Madhumita Murgia
I had already sent Mati an audio file of me speaking. He just needed my voice. His software extracted certain qualities of my speech and recreated my voice so that it could say basically anything. To show me what it sounds like, Mati pulled up the FT’s Wikipedia page, copied some of the text, and pasted it into the AI text-to-speech program.

Mati Staniszewski
You only need to click to generate this. Let’s see how it sounds.

Madhumita Murgia
The Financial Times is a British daily business newspaper focused on current affairs in business and the economy. It is printed in broadsheet format and published digitally.

Mati Staniszewski
What do you think?

Madhumita Murgia
It’s funny because the accent is definitely different, and I didn’t grow up here, so I’m not very good at telling regional British accents apart. But I get it, and I can hear flashes of the real Madhu in there, though I still feel like I can tell the difference.

Madhumita Murgia
Admittedly, it wasn’t perfect. They’re still working on getting accents right. But the way the software captures intonation and pace is a big breakthrough. Another big advance is its ability to use the meaning of the written text to adjust the emotion of the delivery. I typed in something upbeat, and this is what came out of the laptop.

Madhumita Murgia
It’s so beautiful and sunny in London today. I’m so excited. I got plenty of sleep and am really looking forward to the long weekend. I can’t wait.

Madhumita Murgia
So I sounded really excited there. I think the ending was pretty good. The words “I can’t wait” were delivered just right.

Madhumita Murgia
This ability to capture emotion is what makes ElevenLabs’ software so attractive to companies that create audiobooks or provide real-time customer service. But the more advanced the technology, the more likely it is to be exploited. ElevenLabs has acknowledged that its software has been misused, but has not given specifics. We know AI voice technology is being used for phone scams and bank fraud. ElevenLabs users also talk about the program’s ability to create deepfake voices of celebrities and politicians. Mati’s reaction to all of this is typical of a techie. He recognises the risks, but he downplays them. He said the company is already working on technical solutions to detect AI-generated audio.

Mati Staniszewski
Already today, every piece of audio we generate carries a hidden signal that we can decode to tell whether it came from ElevenLabs. So everything we produce today can be traced back to an account, and we can take action if a user violates our terms of service or does something illegal.

Madhumita Murgia
Mati said the company is stepping up its efforts to disable the accounts of users who violate its policies. But AI companies in general are nervous enough to be seeking legal advice.

Sophie Goossens
I think it’s been about once a week for the last six months, and once a day for the past two months.

Madhumita Murgia
Sophie Goossens is a technology and copyright lawyer. Although she doesn’t represent ElevenLabs, she says one of the biggest risks AI voice companies face is copyright infringement.

Sophie Goossens
This is because AI engines require huge amounts of data to learn. And if the data they learn from is protected by copyright, you always have to ask whether permission is required to use it. Another issue is privacy and data protection. As the human being behind a voice, you are in a position to control what happens with that voice, even if it is a machine that is generating it.

Mati Staniszewski
Well, I think that will be the easiest.

Madhumita Murgia
But the risks of AI voice technology extend beyond the legal gray area. The prospect of being able to speak in other people’s voices raises some serious ethical issues. Hearing a synthetic version of my voice, even an imperfect one, was unsettling. Despite the risks, Mati remains focused on getting his technology out there. He said one of the films to be screened at this year’s Venice Film Festival will have audio generated entirely by AI technology.

Mati Staniszewski
It’s the biggest surprise, and it kind of extends the platform beyond what it was designed for. The whole movie is voiced with ElevenLabs. All of the dialogue is synthetic, and it will be our first foray into this.

Madhumita Murgia
So maybe one day AI will dub the Polish versions of ’80s horror movies like A Nightmare on Elm Street. But to get there, young AI companies like ElevenLabs need to stay one step ahead of the legal and moral challenges this technology poses.

Madhumita Murgia
This is Madhu for the FT News Briefing. That was actually my synthetic voice. This is the real Madhumita Murgia, reporting for the FT News Briefing.

[MUSIC PLAYING]

Marc Filippino
You can read more about all these stories at FT.com. This has been your daily FT News Briefing. Be sure to check back tomorrow for the latest business news.


