Bitcoin World
2026-05-07 22:45:13

OpenAI adds GPT-5-level voice reasoning and real-time translation to its API

OpenAI announced Thursday that its API now includes a suite of new voice intelligence features, giving developers tools to build applications capable of natural conversation, live transcription, and real-time translation. The updates center on three new models, GPT-Realtime-2, GPT-Realtime-Translate, and GPT-Realtime-Whisper, each designed to handle different aspects of voice interaction.

GPT-Realtime-2 brings GPT-5 reasoning to voice

The flagship model, GPT-Realtime-2, succeeds GPT-Realtime-1.5 and is built on GPT-5-class reasoning. OpenAI says this enables the model to handle more complex user requests in real-time voice conversations, moving beyond simple call-and-response patterns. The company describes it as a realistic vocal simulation that can listen, reason, and respond contextually as a conversation unfolds.

Real-time translation across 70+ languages

GPT-Realtime-Translate offers conversational translation that keeps pace with natural speech. It supports more than 70 input languages (the languages it can understand) and 13 output languages for spoken responses. This positions the tool for use in international customer support, live events, education, and media localization, where speed and accuracy in spoken translation are critical.

Live transcription with Whisper

The third model, GPT-Realtime-Whisper, provides live speech-to-text capabilities that capture interactions as they happen. Unlike batch transcription services, this runs in real time, making it suitable for applications such as live captioning, meeting notes, and voice-controlled interfaces.

Enterprise applications and guardrails

OpenAI sees clear enterprise demand for these features, particularly in customer service automation. But the company also acknowledges misuse risks, including spam, fraud, and other forms of online abuse.
To address this, OpenAI has embedded guardrails that can halt conversations if they violate harmful content guidelines. Specific triggers are built into the system to detect and stop abusive behavior.

Pricing and availability

All three models are available through OpenAI’s Realtime API. GPT-Realtime-Translate and GPT-Realtime-Whisper are billed by the minute of audio processed, while GPT-Realtime-2 is billed by token consumption, consistent with OpenAI’s existing pricing model for text-based models.

Why this matters

Voice interfaces have long been limited by latency and a lack of contextual understanding. OpenAI’s latest models aim to close that gap, making voice interactions feel more natural and capable of handling complex tasks. For developers, this means building apps that can transcribe, translate, reason, and act in real time, a step toward more human-like voice assistants. The updates also signal OpenAI’s continued push into multimodal AI, where voice, text, and reasoning converge in a single platform.

Conclusion

OpenAI’s new voice intelligence features represent a meaningful upgrade to its API, offering developers GPT-5-level reasoning, real-time translation, and live transcription in a single suite. With built-in guardrails and flexible pricing, the company is positioning these tools for broad enterprise adoption while addressing potential misuse. The updates are available now through the Realtime API.

FAQs

Q1: What is GPT-Realtime-2?
GPT-Realtime-2 is OpenAI’s latest voice model, built on GPT-5-class reasoning, designed for real-time, natural voice conversations that can handle complex user requests.

Q2: How many languages does GPT-Realtime-Translate support?
It supports over 70 input languages for understanding and 13 output languages for spoken responses.

Q3: How are the new voice models billed?
GPT-Realtime-Translate and GPT-Realtime-Whisper are billed by the minute, while GPT-Realtime-2 is billed by token consumption.
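
For developers budgeting a project, the billing split described above (two models metered per minute of audio, one metered per token) can be illustrated with a small cost estimator. This is a minimal sketch, not OpenAI's pricing logic: the article gives no rates, so every number below is a hypothetical placeholder, and the function name estimate_cost is invented for illustration.

```python
# Hypothetical placeholder rates in USD. The article publishes no prices,
# so substitute real figures from OpenAI's pricing page before relying on this.
PER_MINUTE_RATE_USD = {
    "GPT-Realtime-Translate": 0.05,  # assumed value, per minute of audio
    "GPT-Realtime-Whisper": 0.01,    # assumed value, per minute of audio
}
PER_TOKEN_RATE_USD = {
    "GPT-Realtime-2": 0.00002,       # assumed value, per token
}


def estimate_cost(model: str, audio_minutes: float = 0.0, tokens: int = 0) -> float:
    """Return an estimated cost in USD using the billing unit of the given model."""
    if audio_minutes < 0 or tokens < 0:
        raise ValueError("usage values must be non-negative")
    if model in PER_MINUTE_RATE_USD:
        # Translate and Whisper: billed by the minute of audio processed.
        return PER_MINUTE_RATE_USD[model] * audio_minutes
    if model in PER_TOKEN_RATE_USD:
        # GPT-Realtime-2: billed by token consumption.
        return PER_TOKEN_RATE_USD[model] * tokens
    raise ValueError(f"unknown model: {model}")


if __name__ == "__main__":
    usage = [
        ("GPT-Realtime-Whisper", {"audio_minutes": 30}),
        ("GPT-Realtime-Translate", {"audio_minutes": 12.5}),
        ("GPT-Realtime-2", {"tokens": 10_000}),
    ]
    for model, kwargs in usage:
        print(f"{model}: ${estimate_cost(model, **kwargs):.2f}")
```

The practical point is that the two billing units scale differently: per-minute costs grow with call length regardless of how much is said, while per-token costs grow with how much the model hears and generates.
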
This post, "OpenAI adds GPT-5-level voice reasoning and real-time translation to its API," first appeared on BitcoinWorld.
