OPENAI
The next AI app might answer out loud
OpenAI has added new voice features to its API to help developers build apps that can talk, transcribe, and translate conversations in real time.
The main update is GPT-Realtime-2, a voice model built for more natural conversations.
OpenAI says it uses GPT-5-class reasoning, which should help it handle more complex requests during live chats.
The company also launched GPT-Realtime-Translate, which understands more than 70 input languages, translates into 13, and is designed to keep pace with conversations as they happen.
Another new tool, GPT-Realtime-Whisper, offers live speech-to-text transcription for calls, meetings, events, media work, and accessibility tools.
The main points:
GPT-Realtime-2 is built for more natural and complex voice conversations.
GPT-Realtime-Translate supports 70+ input languages and 13 output languages.
GPT-Realtime-Whisper provides live transcription as people speak.
Meetings may suffer
OpenAI says the goal is to move voice AI beyond simple back-and-forth replies, making it able to listen, reason, translate, transcribe, and act during a conversation.
The tools could be useful for customer service, education, media, events, and creator platforms.
OpenAI also says it has added safeguards to help prevent misuse, including spam, fraud, and harmful content.
All three models are available through OpenAI’s Realtime API. GPT-Realtime-Translate and GPT-Realtime-Whisper are billed by the minute, while GPT-Realtime-2 is billed by token usage.
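For developers curious what "available through the Realtime API" looks like in practice, here is a minimal sketch of building the connection parameters for a Realtime WebSocket session. The endpoint and header shape follow OpenAI's documented Realtime WebSocket pattern, but the model identifier below is only an assumption inferred from the article's naming, and both should be verified against the current API reference before use.

```python
import os

# Documented base endpoint for OpenAI's Realtime WebSocket API.
REALTIME_URL = "wss://api.openai.com/v1/realtime"


def build_realtime_request(model: str, api_key: str):
    """Return the URL and headers needed to open a Realtime session.

    The actual audio exchange happens over the WebSocket once connected
    (e.g. via the `websockets` package); this sketch only prepares the
    handshake, since the model name is an assumption from the article.
    """
    url = f"{REALTIME_URL}?model={model}"
    headers = {
        "Authorization": f"Bearer {api_key}",
        # Beta header used by the Realtime API during its preview phase.
        "OpenAI-Beta": "realtime=v1",
    }
    return url, headers


# "gpt-realtime-2" is a hypothetical identifier based on the model name
# in this article; check the models list in the API docs for the real one.
url, headers = build_realtime_request(
    "gpt-realtime-2", os.environ.get("OPENAI_API_KEY", "sk-placeholder")
)
print(url)
```

Per-minute billing for the Translate and Whisper models suggests metering on audio duration, whereas GPT-Realtime-2's token billing would count both audio and text tokens in the session.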
ChatGPT is about to make my vacations more interesting. - MV