DHAHRAN: In a world racing toward automation, Klemen Simonic believes the most natural interface is also the most enduring: the human voice.
As the Founder and CEO of Soniox (a cutting-edge speech-to-text platform), Simonic is betting that voice-driven technology will drive the next wave of digital innovation.
And in a country like Saudi Arabia, where smartphones dominate everyday life and young people are hungry for digital solutions, the potential is hard to ignore.
Launched by Simonic five years ago, Soniox offers speech recognition, transcription and real-time multilingual translation in more than 60 languages.

Unlike many competitors, it offers ultra-fast token-level output in milliseconds. This is an important advantage for live assistants, wearables, bots and smart speakers.
However, Simonic's journey toward building the company began long before the rise of generative AI.
“I started programming right after high school. I was invited to join the Jožef Stefan Institute in Slovenia, one of the best institutes in this region of Europe,” he told Arab News.
“I worked with PhDs and postdocs on machine learning, natural language processing, dependency parsing, tokenization, tagging and entity extraction.”

That early exposure led him to two internships at Stanford University, in 2009 and 2011, where he worked with top AI researchers. “I wanted to join Google to tackle these cool things,” he said.
After his 2014 internship, Simonic was courted by both Google and Facebook. He ultimately joined the latter in 2015 to help build a speech recognition system used on Facebook, Instagram and WhatsApp.
Today, his company is focused entirely on voice AI, and its promise goes beyond convenience.
With built-in privacy and compliance, including SOC 2 Type II certification and HIPAA readiness, Soniox is already in use in hospitals, call centres and emergency rooms.
“We have many medical clients who use our API, particularly for complex medical terms and in emergency rooms, where real-time AI interpretation can bridge communication gaps,” Simonic said.
Saudi Arabia is a particularly attractive market for the company's ambitions. With smartphone penetration above 90% and 70% of the population under the age of 35, the Kingdom is fertile ground for voice-enabled technology.
The widespread adoption of government-developed platforms like Tawakkalna during the Covid-19 pandemic has only accelerated the Kingdom's reliance on mobile-first services.
“Data and artificial intelligence are contributing to the achievement of Saudi Arabia's Vision 2030, as 66 out of 96 direct and indirect goals of the vision are related to data and AI,” according to the Saudi Data and AI Authority.

The Kingdom's communications and IT sector is currently worth more than $44 billion (4.1% of gross domestic product), and is rapidly expanding with strategic investments in cloud computing, automation and smart infrastructure.
Soniox does not yet have a team in the region, but the company is seeing strong interest from Saudi organizations exploring AI-powered transcription and customer service tools.
Simonic said the company has pilot programs in countries such as Portugal and has drawn interest from Saudi companies aiming to improve their call centres and transcription services.
And while Arabic remains one of the more complex languages for voice AI, Simonic sees both challenges and opportunities. Many rural Saudi communities speak dialects rich in cultural nuance, and these varieties are often excluded from mainstream datasets.

That environment is fertile ground for Soniox's technology. “We strive to enable all languages, so that AI can speak and understand for everyone in the world,” Simonic said.
Simonic's team, based primarily in Slovenia, is working to expand language support to make the technology more inclusive, even in markets where developers speak local tongues.
Soniox is designed with flexibility in mind. Companies can integrate its API without storing audio or transcripts, ensuring strict data control. For individual users, features such as encrypted transcripts and summary tools boost productivity, even for those who tend to shy away from technology.
“My mother isn't very tech-savvy, but she uses our app to create a grocery shopping list,” Simonic said. “It wasn't the original purpose, but it shows how technology evolves in ways we didn't expect.”

In July, Soniox launched a new comparison tool that allows developers and businesses to use their own audio samples and real data to benchmark various speech AI providers.
This is another step towards transparency and wider adoption, especially in regions like the Gulf, where choosing the right solution depends on performance across a variety of linguistic contexts.
“Technology changes, but the human voice remains the most intimate and effective way we communicate,” Simonic said.
As Saudi Arabia advances its digital transformation under Vision 2030, technologies like Soniox may come to serve not only as productivity tools, but also as bridges between language, innovation and access in a rapidly changing world.

