OpenAI's New Audio Tool Creates Human-Like Voice Mimicry

OpenAI, a pioneer in artificial intelligence (AI), has shared early results of an audio tool capable of reading text aloud in a remarkably human-like voice. The development raises concerns about the risk of deepfakes.

The company has offered early demos of its text-to-speech model, Voice Engine, through a limited-scale preview shared with ten developers. However, it decided not to widen the roll-out after receiving feedback from stakeholders such as policymakers, educators, and industry experts.

“We recognise that generating speech that resembles people’s voices has serious risks, which are especially top of mind in an election year,” the company wrote in a blog post on March 29. “We are engaging with US and international partners from across government, media, entertainment, education, civil society and beyond to ensure we are incorporating their feedback as we build.”

Voice Engine can produce speech that closely resembles an individual's voice, complete with its specific cadence and intonation. By analyzing just 15 seconds of recorded audio, the tool can closely replicate a person’s voice.

During a demonstration, OpenAI showcased Voice Engine’s capabilities by generating speech indistinguishable from the voice of its CEO, Sam Altman. While the technical quality of the generated voice is impressive, concerns regarding safety and misuse abound.

One of OpenAI’s partners, the Norman Prince Neurosciences Institute at Lifespan, is using the technology to assist patients who have lost the ability to speak clearly. In one case, the institute used the tool to help a patient who had lost her clear speech to a brain tumor with her school project, replicating her voice from an earlier recording.

Voice Engine’s custom speech can also be translated into multiple languages, which is useful for businesses in the audio industry. For instance, companies such as Spotify have used the technology in pilot programs to translate podcasts.

OpenAI has implemented strict usage policies for its partners. They must obtain consent from the original speakers before using their voices and must disclose to listeners that the audio is AI-generated. OpenAI also adds an inaudible watermark so that content created by its tool can be detected.
