Sarvam unveils two new large language models focused on real-time use, advanced reasoning

Artificial intelligence startup Sarvam on Wednesday launched two new large language models — Sarvam-30B and Sarvam-105B — as the Bengaluru-based company expands its push into advanced reasoning and enterprise deployments.

The lighter Sarvam-30B is designed for efficient, real-time applications. It supports a context window of up to 32,000 tokens and has been trained on 16 trillion tokens. The company said the model is optimised for “efficient thinking”, delivering stronger responses while using fewer tokens — a key factor in reducing inference costs in production environments.

In benchmarks shared at the launch, Sarvam-30B was evaluated against models including Gemma 27B, Mistral-32-24B, OLMo 31.32B, Nemotron-30B, Qwen-30B and GPT-OSS-20B across tasks such as Math500, HumanEval, MBPP, LiveCodeBench v6 and MMLU, which span mathematical reasoning, code generation and general knowledge. The company indicated competitive performance across general reasoning and coding benchmarks.

On the AIME benchmark — which measures mathematical reasoning under varying compute “thinking budgets” — Sarvam-30B showed improved performance as compute allocation increased, positioning it alongside other 30B-class reasoning models.

Sarvam also introduced Sarvam-105B, a higher-parameter model aimed at more complex reasoning tasks. The model supports a context length of 128,000 tokens and, according to the company, performs on par with several frontier open- and closed-source models in its category.

The launch marks Sarvam’s move into larger-parameter models at a time when Indian AI startups are seeking to build foundational capabilities domestically rather than rely solely on global APIs. As enterprises prioritise cost efficiency, controllability and data residency, mid- to large-parameter open models are emerging as a viable deployment alternative.