OpenAI introduces voice and image prompts to ChatGPT

OpenAI is bringing audio and image capabilities to ChatGPT.

The platform, which has long been limited to written prompts, will be adding the new features over the next two weeks to paid versions of the app, OpenAI announced in a blog post on Monday.

Everyone else will be receiving the features “soon after”.

What can you do with ChatGPT’s update?

Users can have voice conversations with the chatbot, bringing it closer to popular AI assistants such as Apple’s Siri and Amazon’s Alexa.

ChatGPT’s new voice feature can also narrate bedtime stories, settle debates at the dinner table and read users’ text input aloud.

Spotify is using the technology behind it to let the platform’s podcasters translate their content into different languages, OpenAI said.

Users can also upload one or more images to the interface and use the drawing tool to highlight specific parts of an image.

The vision feature can be used to “troubleshoot why your grill won’t start, explore the contents of your fridge to plan a meal, or analyze a complex graph for work-related data”.

How have people responded?

OpenAI’s announcement has drawn a range of reactions on X, formerly Twitter. While some users have celebrated the update, others have raised concerns.

As intriguing as this may sound, I certainly hope that the rapid advancements in technology and artificial intelligence do not lead to a situation reminiscent of the Y2K scare or a potential machine uprising. It’s essential for us to responsibly develop and manage these…

— Christopher CSI (@CSI9ja) September 25, 2023

In a conversation with WIRED, Trevor Darrell, professor at UC Berkeley and a co-founder of Prompt AI, said that the fear of AI becoming too human-like is described as the “uncanny valley gap”.

While the added functions might make the chatbot feel more natural, some research suggests that complex interfaces that fail to mimic human interaction can feel strange, which might make the product harder to use.

Some users have pointed to recent lawsuits accusing OpenAI of violating copyright laws and infringing intellectual property rights, and have advised others not to use ChatGPT.

ChatGPT is currently under Federal investigation for data leaks, inaccuracy, privacy violations, deceptive practices, and reputational harm. Using it is NOT advised. https://t.co/1EdNKbqFPW https://t.co/57rlB3kMFD https://t.co/SMF62KRsSe https://t.co/IBTW92I7FJ

— Nicole Miller (@JOSourcing) September 25, 2023

Others have also brought up how the updates might replace smaller AI startups, software engineers, and even educators in the future.

So… how many startups just died in the last 5 mins?

— Terry Tan (@terrytjw) September 25, 2023

@felixchin1 this is essentially what I was describing. Camera always on, the AI is just observing everything and talking back and forth with you like a private tutor would. If they release this then education as we know it is over.

— Brad (@Brad08414464) September 25, 2023

Software engineers will just become digital plumbers for this stuff

— Andy (@AndyTech99) September 25, 2023

AI-generated voices have also raised the threat of deepfakes, voice scams and identity theft.

The malicious use of AI voice generators is on the rise, with scammers using AI to mimic a real person’s voice and call that person’s relatives to ask for money. A McAfee report suggests that 77 percent of people targeted by an AI voice scam lost money as a result.

The addition of voice recognition might also make the feature less accessible to people who do not speak with mainstream accents, said Joel Fischer, who studies human-computer interaction at the University of Nottingham in the UK.

Since the image function allows the AI to recognise images, users are concerned that the bot might be able to bypass image verification CAPTCHA tests on websites.

These tests, which require users to prove that they are not bots by transcribing distorted text or recognising images, are designed to keep automated programs out.

A recent study that has yet to be peer-reviewed shows that AI bots can solve CAPTCHA tests faster and more accurately than humans.

RIP captchas

— Chase (@ChaseMc67) September 25, 2023

Has OpenAI acknowledged these risks?

OpenAI has acknowledged that malicious actors could use the voice feature in the new update to impersonate people or commit fraud. To limit that risk, the company said it is “using this technology to power a specific use case”.

That use case is voice chat, which was created with voice actors the company worked with directly.

The company has also acknowledged the limitations of using images in AI, including image hallucinations, in which the AI generates false information about an image.

To counter this, OpenAI has taken technical measures to limit ChatGPT’s ability to analyse and make direct statements about people.