Artificial intelligence research company OpenAI announced Monday it is rolling out new voice and image capabilities in its popular conversational agent ChatGPT, which will allow users to have more natural conversations with the bot.
The new voice feature on iOS and Android devices enables users to have fluid back-and-forth dialogues with ChatGPT using natural speech. Users can opt into the voice capability in their mobile settings and choose from five different voices crafted by professional voice actors hired by OpenAI.
The voice assistant is powered by a sophisticated new AI text-to-speech model developed by OpenAI that can generate human-like speech from text using just a few seconds of sample audio data from the voice actors.
Users can ask ChatGPT questions, make requests and have open-ended discussions on any topic simply by speaking to the app instead of typing. The bot will respond conversationally in the selected voice.
Along with voice, OpenAI has introduced the ability for users to show ChatGPT images and get detailed information and commentary about what they contain. On mobile devices, users can snap or upload photos, and on all platforms they can highlight sections of an image to focus the chatbot’s response.
The image capabilities are powered by multimodal versions of OpenAI’s GPT-3.5 and GPT-4 models, which apply their language reasoning skills to interpret and describe photographs, documents, screenshots and more.
OpenAI said it is taking a gradual approach to rolling out the voice and image features to allow for ongoing safety improvements, research and risk mitigation.
The company acknowledged the voice synthesis technology could be misused for fraudulent audio impersonations but said restricting it to conversational chat reduces that risk.
The image feature also has limitations intended to curtail harmful uses. To respect privacy and avoid inappropriate inferences, ChatGPT is restricted from making direct statements about people who appear in images.
“Real world usage and feedback will help us make these safeguards even better while keeping the tool useful,” OpenAI said in its announcement.
The new capabilities will roll out to OpenAI’s paying Plus and Enterprise customers over the next two weeks before expanding to other user groups.