Sarvam’s 105-billion-parameter model puts India on the AI frontier map




As part of a major push to advance India’s AI capabilities, homegrown startup Sarvam has launched a 105-billion-parameter foundational large language model (LLM), along with a suite of tools designed for commercial use. Co-founder Vivek Raghavan spoke about the company’s progress in Indian languages and its expansion into AI-powered devices.

Sarvam’s 105-billion-parameter AI model falls into the frontier category. How is it different from global models, not only in size but also in structure?

It is the largest model trained from scratch in India, with no dependence on external data, and it is strongly grounded in Indian knowledge. Although it is a general-purpose model, it is built with the Indian context in mind. Models like Gemini and ChatGPT are still an order of magnitude larger, so we don’t claim to be comparable in scale. However, smaller models are more efficient and cost-effective. For most real-world and agentic use cases, models of this size deliver excellent results without the need for extreme scale.

How deeply is the model trained on Indian languages? Can it outperform global models on low-resource languages?

We focus much more on Indian languages than most global labs. Among models of comparable size, we excel in Indian languages. It isn’t fair to compare against systems that are tens of times larger, but within our size category we are stronger.

What specific advances have been made in Indian languages?

We believe Indians will primarily experience AI through voice. We believe we are the best in the world at speech recognition across Indian languages and dialects, and our new model is world class in natural speech synthesis. We also released a smaller vision model that outperforms much larger systems at extracting Indian-script text from documents and images. We are as good as or better than most LLMs of similar size. For example, we outperformed the DeepSeek model released last year, and held up even against a version six times larger.
Our goal is to be world-leading within our size class, especially in Indian-language and domain-specific contexts.

Were the models trained entirely on domestic infrastructure? And how can inference be delivered at scale and affordably?

Yes, under India’s AI Mission, using concessional GPUs, with no dependence on external data, and trained entirely in India. Inference is another challenge: training a model does not guarantee deployment. We provide access to our models, but it is hard to compete in pure B2C when global players spend billions of dollars to provide free services. That is a structural reality, which is why we are going beyond traditional mobile platforms.

What is the strategy behind the expansion into devices such as smart glasses and feature phones?

AI changes the interface. We think of smart glasses as business devices for recording, analyzing, and coaching voice-based conversations. Feature-phone integration is about inclusion, giving users access to AI in their own language. We also aim to run small models directly on devices to reduce costs and dependence on the cloud. The focus is on inclusion and more natural interaction.

How do you deal with hallucinations and cultural bias?

Since LLMs are probabilistic, hallucinations are common. The solution is to build a grounding mechanism around them; a real system would never use raw LLM output. Although the objective is to minimize risk, residual risk always exists.

What are the next milestones for LLMs in India?

We were the first to highlight the “tokenization tax” in Indian languages. As tokenization improves, costs will come down. But research labs around the world have invested billions of dollars and are giving away their models for free, often using free user data to train their systems. This dynamic is unprecedented and will shape the competitive landscape.
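The “grounding mechanism” Raghavan describes is commonly implemented as retrieval-augmented generation: the model is only allowed to answer from retrieved reference passages, and abstains when none are found. A minimal sketch of that pattern follows; the `retrieve` and `generate` callables are hypothetical placeholders for illustration, not Sarvam’s actual API.

```python
def grounded_answer(question, retrieve, generate):
    """Wrap a raw LLM call so answers are tied to retrieved evidence.

    `retrieve` maps a question to a list of reference passages;
    `generate` is the underlying LLM call. Both are placeholders here.
    """
    passages = retrieve(question)
    if not passages:
        # Abstain rather than let the model free-associate.
        return "No supporting sources found."
    context = "\n".join(f"- {p}" for p in passages)
    prompt = (
        "Answer ONLY using the sources below. "
        "If they are insufficient, say so.\n"
        f"Sources:\n{context}\n\nQuestion: {question}"
    )
    return generate(prompt)

# Toy stand-ins to show the control flow (not a real retriever or LLM):
kb = {"capital of India": ["New Delhi is the capital of India."]}
retrieve = lambda q: kb.get(q, [])
generate = lambda prompt: prompt.splitlines()[-1]  # echoes the last line

print(grounded_answer("capital of India", retrieve, generate))
print(grounded_answer("unknown topic", retrieve, generate))
```

The point of the wrapper is the abstention branch: the raw model is never queried without evidence, which is what distinguishes a production system from “raw LLM” use.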
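One concrete source of the “tokenization tax” is that popular byte-level tokenizers start from UTF-8 bytes, and each Devanagari code point occupies three bytes versus one for ASCII, so the same sentence begins as roughly two to three times more raw units before any merging. A quick standard-library illustration (byte counts are a proxy here; actual token counts depend on the specific tokenizer and its training data):

```python
# Each Devanagari code point is 3 bytes in UTF-8, vs 1 byte for ASCII,
# so byte-level tokenizers see Hindi text as far more raw input units.
english = "How are you?"
hindi = "आप कैसे हैं?"  # the same question in Hindi

for text in (english, hindi):
    chars, nbytes = len(text), len(text.encode("utf-8"))
    print(f"{chars} chars -> {nbytes} UTF-8 bytes: {text}")
```

Both strings are 12 characters long, but the Hindi one encodes to 30 bytes against English’s 12, which is why Indic-aware tokenizers can cut per-query cost substantially.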


