Just a week after introducing new AI-generated voices for its viral ChatGPT chatbot, including one called “Sky,” OpenAI has been forced to pull Sky from service. The reason? Accusations that Sky’s voice sounded too much like actress Scarlett Johansson’s performance as an AI assistant in the 2013 film “Her.”
The controversy highlights the tricky ethical terrain companies must navigate as artificial intelligence (AI) becomes increasingly advanced and realistic, raising questions about the use of someone’s voice or likeness without consent.
What Happened?
When OpenAI unveiled the latest version of ChatGPT last week, one of the new features allowed users to engage with the AI through voice chat. OpenAI offered five different AI-generated voice options, including the female-sounding “Sky.”
However, it didn’t take long for social media users to point out the striking similarity between Sky’s voice and Johansson’s performance voicing the AI system “Samantha” in the sci-fi romance “Her.” In that film, Johansson’s sultry tones helped bring the virtual assistant character to life as Joaquin Phoenix’s character fell in love with the AI.
Initially, OpenAI attempted to defuse the situation by explaining in a blog post how it selected and created the voices, stating that Sky’s voice came from “a different professional actress using her own natural speaking voice” and was not directly imitating Johansson.
But as the online backlash grew, with some claiming it was impossible to distinguish Sky from Johansson in “Her,” OpenAI made the decision to “pause the use of Sky” while working to address the concerns.
Johansson’s Objections
On Monday, Johansson released her own statement shedding more light on the situation. The acclaimed actress said she had been approached by OpenAI CEO Sam Altman about providing her voice for ChatGPT, but she declined the offer “for personal reasons.”
Johansson claimed that nine months after initially turning down Altman’s request, she was “shocked, angered and in disbelief” to hear how closely Sky resembled her voice from “Her” – to the point that even her friends and family couldn’t tell the difference.
The “Black Widow” star alleged that Altman tried contacting her again right before OpenAI’s product launch last week in the hopes she would reconsider, but the company publicized Sky’s voice before they had a chance to speak.
Adding to the controversy, Johansson pointed to a tweet from Altman on the day of the product event that appeared to reference “Her” and the idea of humans falling for AI assistants.
Deeper AI Concerns
The Sky voice situation taps into broader unease around the capabilities of modern AI systems to potentially misappropriate people’s identities, work, and likenesses without consent.
In fact, Johansson is already engaged in a legal battle with another AI company, Liska AI, over allegations that it used a computer-generated likeness of her in marketing materials without permission.
In her statement, Johansson called for “appropriate legislation” to protect individuals’ rights from being violated by generative AI technologies that can replicate voices, images, and more with increasing realism.
“In a time when we are all grappling with deepfakes and the protection of our own likeness, our own work, our own identities, I believe these are questions that deserve absolute clarity,” she said.
How OpenAI Created the Voices
To create voices like Sky, OpenAI said it worked with casting directors and voice actors, ultimately selecting five professional performers from more than 400 submissions.
The actors were given scripts containing ChatGPT responses on different topics and recorded their natural speaking voices to generate the AI voices. OpenAI said it had “extensive coordination” with the actors over five months and discussed potential risks.
However, in line with its policy, OpenAI did not publicly identify the specific voice actors used for any of the AI voices, including Sky, in order to “protect their privacy.”
The company maintained that none of the voices, including Sky, directly imitated celebrities. But it acknowledged erring in the case of the Johansson-like Sky voice.
“We believe that AI voices should not deliberately mimic a celebrity’s distinctive voice,” OpenAI stated, admitting “we made a mistake in dialing Sky’s voice too close” to Johansson’s.
The Future of AI Voices
Even as it pivoted on the Sky voice issue, OpenAI indicated its continued ambitions for voice and audio AI capabilities.
The company said it plans to introduce new AI-generated voices for ChatGPT in the future to “better match the diverse interests and preferences of users.” It also recently unveiled a text-to-speech AI system called Voice Engine, though it has chosen not to release it publicly yet due to the potential for misuse.
OpenAI emphasized its voice actors are compensated “above top-of-market rates” and will continue being paid as long as their voices are used. The company said it remains committed to collaborating with voice artists on advancing the technology “in a way that respects actor rights.”
Whether the Johansson situation prompts wider guardrails or regulations around voice cloning and synthetic media remains to be seen. But it serves as a high-profile reminder that as AI capabilities accelerate, missteps involving people’s identities could become bigger flashpoints.