Yesterday, at its press event, OpenAI announced the new language model GPT-4o. It can take input from users as text, audio, and images, and even pick up on laughter and emotion, providing a chat experience that feels more like interacting with a real person.
According to the team, GPT-4o is a step toward much more natural human-computer interaction. It accepts any combination of text, audio, and visual input and can generate any combination of text, audio, and visual output. Compared with existing models, it is faster and more accurate at understanding what it sees and hears.
GPT-4o matches GPT-4 Turbo on English text and code, and responds to audio with an average latency of 320 milliseconds, roughly the pace of human turn-taking in conversation. Previously, voice interactions averaged 2.8 seconds of latency with GPT-3.5 and 5.4 seconds with GPT-4.
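For developers, this multimodal capability should surface through OpenAI's standard chat API. Below is a minimal sketch of what a text-plus-image request might look like, assuming the model is exposed under the identifier "gpt-4o" and that image input uses the same image_url content format as earlier vision models; the image URL here is hypothetical, and exact fields and availability may differ at rollout:

```python
# Minimal sketch: a text-plus-image request to GPT-4o through the OpenAI
# chat completions API. Assumes the v1.x `openai` Python SDK, that the model
# is exposed as "gpt-4o", and that image input follows the existing
# vision-style `image_url` content format; the image URL is hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "What's happening in this picture? Keep it brief."},
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/photo.jpg"}},
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

Since the article reports that only text and voice are enabled in the product so far, the image portion in particular should be read as illustrative rather than confirmed.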
But what does all of this mean in practice?
GPT-4o enables much more lifelike interaction by analyzing speech and live images: users can start a session simply by opening their phone camera or speaking to it directly.
For example, it can translate in real time, sing “Happy Birthday”, act as a personalized language tutor, analyze the user’s surroundings, and even get human jokes, responding with cheerful emotion and laughter, or catch the sarcasm behind a remark.
Like a real friend, GPT-4o can gush enviously over how cute the user’s dog is and ask for its name. Talking to it feels more like a conversation than a simple question-and-answer session.
GPT-4o is a single new model trained end-to-end across text, vision, and audio. Beyond the user’s primary voice or text input, it automatically factors in facial expressions, laughter, and the surrounding environment to produce more natural and accurate responses. And if the user interrupts it mid-sentence, GPT-4o knows how to handle that too.
Learning math with GPT-4o (Source)
The “o” in GPT-4o stands for “omni”, as in all-encompassing. The team’s aim is a model that can respond to anything, not just text input or questions in a single modality.
For now, GPT-4o is available only to paying users, and apparently only text and voice input have been enabled; the promised real-time image input will take a little longer. OpenAI’s stated goal is to eventually make the model free for everyone.
Paying users get early access to try GPT-4o.
In our own testing, many of the features the team showed off are still rough around the edges: it has trouble following jokes told in Chinese, its replies in real conversation can feel hollow, and responses are slower than demonstrated. We look forward to further updates from the team.
OpenAI chose to unveil the new model just ahead of the Google I/O developer conference, underscoring the fierce rivalry between the two companies. Earlier, rumors emerged that Apple had been in talks with both OpenAI (ChatGPT) and Google (Gemini) about integrating their models into iOS 18.
(Apple rumored to collaborate with OpenAI to integrate ChatGPT into iOS 18)
Further reading
Vitalik: GPT-4 has already passed the Turing test, and that’s worth keeping in mind.
Is GPT-4o close to “Her”? Exploring the potential applications of GPT-4o’s integrated voice and multimodal interaction.