IBM and Deepgram are collaborating to integrate advanced speech-to-text and text-to-speech capabilities into watsonx Orchestrate. The collaboration aims to help organizations further automate their operations and meet the growing demand for conversational AI technology.
IBM has made Deepgram its first speech partner. The integration brings voice AI to the watsonx Orchestrate platform, enabling enterprises to build digital agents that respond to natural speech. Users gain access to advanced speech recognition, real-time captioning, and customization options.
Real-world audio challenges
IBM recognizes that organizations deploying AI-powered speech-to-text systems regularly face challenges with real-world audio: background noise, diverse accents, and natural conversation dynamics all complicate automated transcription. Deepgram’s technology aims to overcome these obstacles by supporting dozens of Arabic and Indian language variants, as well as regional accents.
Watsonx Orchestrate provides enterprises with ready-to-use AI agents that can be implemented in minutes instead of months. The platform supports multi-agent orchestration and includes a catalog of pre-built agents and tools. Integration with Deepgram adds voice functionality.
Applications in healthcare and financial services
Combining the two technologies creates opportunities to improve automated customer service, call analytics, and voice data entry. The benefits are expected to be greatest in sectors such as healthcare and financial services, where accuracy and compliance are critical.
By incorporating Deepgram into watsonx Orchestrate Agent Builder, IBM customers can build voice agents and voice-driven workflows on a real-time foundation. Nick Holder, vice president of AI technology partnerships at IBM, emphasized that the integration opens up new possibilities for speech recognition and transcription. “This collaboration is designed to help organizations accelerate their AI journey and strengthen IBM’s open ecosystem,” Holder said.
