Sarvam AI Launches Powerful Open-Source LLM: A 24 Billion Parameter Model for Advanced AI Tasks.
  • 375 views
  • 2 min read

Sarvam AI, a Bengaluru-based artificial intelligence startup, has recently launched its flagship large language model (LLM), Sarvam-M, designed with a focus on Indian languages and reasoning capabilities. This open-source, 24-billion-parameter model is built upon Mistral Small and represents a significant stride towards building a sovereign AI ecosystem in India. Sarvam-M is engineered to power a diverse range of applications, including conversational agents, machine translation, and educational tools, specifically tailored to the Indian context.

Sarvam-M distinguishes itself through its unique training process, which involves three key steps: Supervised Fine-Tuning (SFT), Reinforcement Learning with Verifiable Rewards (RLVR), and inference optimization. During SFT, the model is trained using carefully crafted prompts to enhance its capabilities in general dialogue and complex reasoning. RLVR further refines its instruction-following and mathematical skills through custom reward engineering and curated datasets. Finally, inference is optimized using FP8 post-training quantization and techniques like lookahead decoding, ensuring efficient and accurate responses, especially in real-time applications.

The model has demonstrated strong performance in multilingual and reasoning benchmarks. It achieved an impressive 86% gain on a romanized Indian language version of the GSM-8K math dataset. Moreover, it showcased average performance boosts of 20% on Indian language benchmarks, 21.6% on math tasks, and 17.6% on programming tasks. When compared to other models, Sarvam-M outperforms Llama-4 Scout and is comparable to larger models like Llama-3.3 70B and Gemma 3 27B. However, it slightly lags in English benchmarks such as MMLU, indicating a trade-off for its enhanced multilingual and reasoning strengths.

Sarvam-M's architecture is designed for versatility, supporting a wide array of applications. Its accessibility is facilitated through Sarvam's API, a dedicated playground, and its availability for download on Hugging Face, enabling developers and researchers to experiment and integrate the model into various projects. The model supports 10 Indian languages, including Hindi, Bengali, Gujarati, Kannada, and Malayalam.

The launch of Sarvam-M is part of Sarvam AI's broader vision to create a sovereign AI ecosystem in India. This initiative is aligned with the Indian government's IndiaAI Mission, which aims to strengthen the country's domestic AI capabilities. Sarvam AI was selected by the Indian government to build a sovereign LLM under this mission.

Despite the model's capabilities, its initial reception has been mixed. While it has been praised for its focus on Indian languages, mathematics, and programming tasks, some critics have pointed out that it is not "good enough" compared to more established models. Some experts argue that there are cheaper and better models available from Google and other companies. The debate extends to whether India should focus on building AI for local needs or benchmark against Silicon Valley.

In response to the criticism, Sarvam AI has emphasized that Sarvam-M is a research model and a stepping stone towards building a comprehensive sovereign AI. The company plans to release models regularly and share detailed technical findings to foster collaboration and innovation.


Written By
Neha Gupta is a seasoned tech news writer with a deep understanding of the global tech landscape. She's renowned for her ability to distill complex technological advancements into accessible narratives, offering readers a comprehensive understanding of the latest trends, innovations, and their real-world impact. Her insights consistently provide a clear lens through which to view the ever-evolving world of tech.
Advertisement

Latest Post


Electronic Arts (EA), the video game giant behind franchises like "Madden NFL," "Battlefield," and "The Sims," is set to be acquired in a landmark $55 billion deal. This acquisition, orchestrated by a consortium including private equity firm Silver L...
  • 517 views
  • 3 min

ChatGPT is expanding its capabilities in the e-commerce sector through new integrations with Etsy and Shopify, enabling users in the United States to make direct purchases within the chat interface. This new "Instant Checkout" feature is available to...
  • 276 views
  • 2 min

The unveiling of Tilly Norwood, an AI-generated actor, has ignited a fierce debate in Hollywood, sparking anger and raising fundamental questions about the future of the acting profession. Created by Dutch producer and comedian Eline Van der Velden a...
  • 280 views
  • 2 min

Meta Platforms is preparing to launch ad-free subscription options for Facebook and Instagram users in the United Kingdom in the coming weeks. This move will provide users with a choice: either pay a monthly fee to use the platforms without advertise...
  • 369 views
  • 2 min

Advertisement
About   •   Terms   •   Privacy
© 2025 TechScoop360