OpenAI announced on Monday new capabilities for ChatGPT that let the chatbot interpret spoken language, respond in a synthetic voice, and analyze images.
The update is OpenAI’s most significant since the launch of GPT-4. Users can hold voice conversations through ChatGPT’s mobile app and choose among five synthetic voices for its responses. They can also share images with ChatGPT and highlight areas for it to focus on or analyze, for example asking it to identify the types of clouds in a photo.
OpenAI plans to roll out these upgrades to paying subscribers over the next two weeks. Voice interaction will be limited to the iOS and Android apps, while image analysis will be available on all platforms.
The update comes amid escalating competition in the AI sector among leading companies including OpenAI, Microsoft, Google, and Anthropic. These companies are racing to bring AI into everyday consumer use, launching new chatbot apps and features, particularly this summer. Google and Microsoft have also announced major updates to their own chatbots.
Earlier this year, Microsoft invested an additional $10 billion in OpenAI, the largest AI investment of the year according to PitchBook. In April, OpenAI reportedly closed a $300 million share sale at a valuation of between $27 billion and $29 billion, with investments from firms including Sequoia Capital and Andreessen Horowitz.
Concerns have been raised that AI-generated synthetic voices could make deepfakes more convincing, creating new cybersecurity risks. OpenAI said its synthetic voices were created with voice actors it worked with directly, rather than collected from unidentified individuals.
The announcement gave few details on how OpenAI will handle and secure consumers’ voice inputs.