ChatGPT’s controversial advanced voice mode is finally here


ChatGPT is about to get even chattier. OpenAI announced this week that it will finally start rolling out a new artificial intelligence (AI) feature that allows the chatbot to engage in real-time voice conversations with its users.

OpenAI announced the new feature in May when it launched its latest model, GPT-4o, although the feature has been in development since last year. While it offered a choice of four voices, users were quick to notice that one of them sounded an awful lot like Hollywood superstar Scarlett Johansson. The actress blasted the company, which took down the voice while still denying any infringement.

After riding out the wave of bad press from the Johansson debacle, OpenAI has now announced the launch of its Advanced Voice Mode.

We’re starting to roll out advanced Voice Mode to a small group of ChatGPT Plus users. Advanced Voice Mode offers more natural, real-time conversations, allows you to interrupt anytime, and senses and responds to your emotions. pic.twitter.com/64O94EhhXK
— OpenAI (@OpenAI) July 30, 2024

The new feature will initially be available only to a small group of ChatGPT Plus users. The company pledged to add more users on a rolling basis, with the goal of onboarding all its paying users by the fall.

According to OpenAI, the new feature allows ChatGPT to sense and respond to user emotions, and users can interrupt it midway through a response with new instructions. Users with early access say it even simulates taking a breath when giving a long response.

In a live demonstration in May, CTO Mira Murati tasked a ChatGPT assistant equipped with the new feature with varying requests, from telling a bedtime story in a calming voice to using a robotic tone. It was interrupted several times, and in every test it stopped naturally and awaited new instructions. The assistant could even use the phone’s camera and describe what it saw around it.

However, OpenAI announced in June that it would postpone the launch by one month for safety reasons. It defended the delay this week, saying it took the time to “reinforce the safety and quality of voice conversations.”

The company says it has now tested the voice assistant with over 100 external red teamers across 45 languages. Red teamers are ethical hackers who emulate real-life attacks to test a system’s security.

To avoid misuse and protect people’s privacy, OpenAI will also limit the assistant to four preset voices.

“We’ve made it so that ChatGPT cannot impersonate other people’s voices, both individuals and public figures, and will block outputs that differ from one of these preset voices,” commented spokesperson Taya Christianson.

In order for artificial intelligence (AI) to work within the law and thrive in the face of growing challenges, it needs to integrate an enterprise blockchain system that ensures data input quality and ownership, allowing it to keep data safe while also guaranteeing the immutability of data. Check out CoinGeek’s coverage on this emerging tech to learn more about why enterprise blockchain will be the backbone of AI.
