Mindblowing ChatGPT-4o real-time translation should terrify Google | Trusted Reviews

With Google I/O set to focus on the increasing talents of the Gemini AI app tomorrow, OpenAI is getting in there first by launching the latest model behind ChatGPT, GPT-4o.

The new GPT-4o, where the ‘o’ stands for ‘omni’ because of its ability to handle audio, images, video and text, is partly headlined by the speed of its real-time translation.

For this iteration of GPT-4, the company says it “trained a single new model end-to-end across text, vision, and audio, meaning that all inputs and outputs are processed by the same neural network. Because GPT-4o is our first model combining all of these modalities, we are still just scratching the surface of exploring what the model can do and its limitations.”

For people speaking different languages, this system could reap incredible rewards. It acts as a real-time go-between, with very little latency between hearing an utterance and repeating it back in the intended language.

If the demonstration showcased during OpenAI’s presentation today is the experience users get, it throws down the gauntlet to Google, the long-time king of mobile language translation thanks to its powerful and brilliant Translate app.

One of the videos below (there are other examples too) shows a man asking ChatGPT to act as a translator.

The man asks the AI to translate everything it hears in English into Italian, and vice versa. Then OpenAI CTO Mira Murati speaks in Italian, and the English response comes very rapidly, with an impressively conversational tone.

Interestingly, the AI refers to the speaker of the original language in the third person (“she said that…”) rather than simply translating the utterance. The model is informed by the nuances in the user’s voice and can generate voices in “a range of different emotive styles”. OpenAI says it also outperforms rivals like Google and Meta in terms of speed.

Elsewhere, videos published by the company show users being able to interject and correct the AI, and have it quickly shift course and respond in kind. Check out the faster counting video below, for example. The company also showcased the incredibly lifelike conversational tone of voice and the ability to recognise its surroundings.

OpenAI says text and image input for GPT-4o is coming today, while the voice and video input will be added to the API in the coming weeks.
