
OpenAI Unveils Real-Time Audio Models for Conversational AI

  • Writer: tech360.tv
  • 2 hours ago

OpenAI is set to introduce three audio models for its developer platform on Thursday, May 7, 2026. The models are designed to make voice-based software agents more conversational and able to complete tasks in real time. The application programming interface (API) launch moves the ChatGPT maker beyond transcription and chat, toward agents that can listen, translate, and act during live conversations.


Image: a live translation interface showing a waveform and a transcript, with two people standing beside a laptop. Credit: OpenAI

The three models are GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper.


OpenAI stated the models are available to test in its developer playground. GPT-Realtime-2 manages harder requests, calls tools, handles interruptions, and maintains context across longer voice sessions.


GPT-Realtime-Translate supports translation from more than 70 languages into 13 output languages. This model targets customer support, education, and other settings.


GPT-Realtime-Whisper provides live speech-to-text functionality. This allows captions, meeting notes, and workflow updates to be generated as a speaker talks.


Customers testing the models include online real estate marketplace Zillow, online travel agency Priceline, and European telecommunications firm Deutsche Telekom.


Pricing for GPT-Realtime-2 starts at USD 32 per million audio input tokens. GPT-Realtime-Translate costs USD 0.034 per minute, and GPT-Realtime-Whisper is USD 0.017 per minute.
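To put those rates in context, here is a minimal Python sketch of the arithmetic. It uses only the three prices quoted above. The function names and the session lengths are illustrative assumptions, not part of OpenAI's API, and the article gives no output-token price for GPT-Realtime-2, so only audio input is estimated.

```python
from decimal import Decimal

# Prices in USD, as quoted in the article.
REALTIME_2_PER_MILLION_INPUT_TOKENS = Decimal("32")
TRANSLATE_PER_MINUTE = Decimal("0.034")
WHISPER_PER_MINUTE = Decimal("0.017")


def per_minute_cost(rate: Decimal, minutes: int) -> Decimal:
    """Cost of a model billed per minute (hypothetical helper)."""
    return rate * minutes


def audio_input_cost(tokens: int) -> Decimal:
    """Audio input cost for GPT-Realtime-2 at the quoted starting rate.

    The token count is an assumption supplied by the caller; the article
    does not say how many tokens a minute of audio consumes.
    """
    return REALTIME_2_PER_MILLION_INPUT_TOKENS * tokens / Decimal(1_000_000)
```

At these rates, a 60-minute session would come to USD 2.04 with GPT-Realtime-Translate and USD 1.02 with GPT-Realtime-Whisper, while 250,000 audio input tokens on GPT-Realtime-2 would cost USD 8.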

  • OpenAI introduced three new audio models: GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper.

  • These models are designed to enable real-time, conversational voice-based software agents.

  • Key capabilities include handling complex requests, translating from more than 70 languages into 13 output languages, and live speech-to-text.


Source: REUTERS

