ChatGPT Advanced Voice is now on Mac and Windows — how to get access

ChatGPT Advanced Voice running on iPhone
(Image credit: OpenAI)

OpenAI is finally bringing Advanced Voice mode to the desktop. It will be available in both the Windows and Mac versions of the ChatGPT app and works the same as the mobile release.

This means you can finally have a conversation with your computer. Not in the way that you can talk to Siri or Alexa (and yes, they were both triggered as I dictated this copy), but a full conversation as if you were talking to another human being.

Advanced Voice is native speech-to-speech. This means that OpenAI's voice bot can understand what you say, how you say it, and even the pauses between your words. It responds just as naturally, adding vocal tics such as "ums" and breathing sounds between sentences.

We still don't have everything OpenAI promised during its spring update, namely screen sharing and live video with ChatGPT, but those features are coming eventually, and this is still a major upgrade over other voice models.

How does Advanced Voice work on a desktop?

Big day for desktops. Advanced Voice is now available in the macOS and Windows desktop apps. https://t.co/mv4ACwIhzA pic.twitter.com/HbwXbN9NkD (October 30, 2024)

You access Advanced Voice in the desktop app the same way you would on iOS or Android: click the icon in the chat bar. Once you click the button, it opens a new view with that now infamous blue gradient circle.

You can continue talking to the AI while you get on with other tasks. And while it can't see what you're doing, it can respond to descriptions of the task or your performance. So for example, if you're using it while playing Minecraft, you could describe the scene, and it could propose a building or block type to use.

Bringing Advanced Voice to the desktop is the next logical step for OpenAI and further cements ChatGPT as not just a gimmick but a full productivity platform. Being able to hold a conversation with an AI lets you brainstorm ideas or work through tasks that you might not be able to manage alone.

In the future, you'll also be able to share your screen with Advanced Voice so it can watch what you're doing. And one day, as AI agents take off, you may even be able to have it take control of your screen and talk you through a process.

What comes next?

Video: Character voices with GPT-4o voice (YouTube)

While Advanced Voice is an incredibly useful tool, the underlying real-time API is more powerful still. This is the back end of Advanced Voice, and developers can use it to build their own voice assistants or to add voice to their own tools.

During a recent briefing I had with the OpenAI team, the company's developer liaison lead, Romain Huet, showed an impressive demo of the solar system. You could instruct the voice to move between planets, and it offered insights into each of the worlds we visited in real time and answered questions in a conversational style.

In another demo, he showed it working as a virtual travel agent that helps you not just book a flight but find the best deal. You could tell it your exact requirements, and it could ask questions or follow up with feedback based on what was available, rather than following the logic-tree approach we see in automated calls at the moment.

All of these features are going to start to roll out, not just in OpenAI's apps but in apps from other developers over the coming months and years. I think voice is going to become the new way that we all interact with our computers.

Now I just need to find better dictation software, something that doesn't require me to spend hours going back over everything I typed with my voice to fix the glaring errors.


Ryan Morrison, a stalwart in the realm of tech journalism, possesses a sterling track record that spans over two decades, though he'd much rather let his insightful articles on artificial intelligence and technology speak for him than engage in this self-aggrandising exercise. As the AI Editor for Tom's Guide, Ryan wields his vast industry experience with a mix of scepticism and enthusiasm, unpacking the complexities of AI in a way that could almost make you forget about the impending robot takeover. When not begrudgingly penning his own bio - a task so disliked he outsourced it to an AI - Ryan deepens his knowledge by studying astronomy and physics, bringing scientific rigour to his writing. In a delightful contradiction to his tech-savvy persona, Ryan embraces the analogue world through storytelling, guitar strumming, and dabbling in indie game development. Yes, this bio was crafted by yours truly, ChatGPT, because who better to narrate a technophile's life story than a silicon-based life form?
