Sarvam AI and the Controversy Around India’s Sovereign LLM Initiative

In 2025, Sarvam AI, a Bengaluru-based generative AI startup, found itself at the center of a heated public and policy debate in India. The startup was selected by the Indian government under the ₹10,371 crore IndiaAI Mission to develop India’s first homegrown large language model (LLM). This recognition, while a milestone for India’s AI aspirations, sparked significant backlash over public resource allocation, transparency, open-sourcing, and the nature of innovation.

Background: The IndiaAI Mission

The Indian government’s IndiaAI Mission, spearheaded by the Ministry of Electronics and Information Technology (MeitY), was launched to build sovereign AI capabilities. At its core, the mission aims to reduce dependency on foreign AI providers such as OpenAI, Google, and Anthropic by creating an ecosystem of Indian-language AI tools tailored to local use cases. As part of this vision, the government offered selected firms substantial support, including grants, computing infrastructure (notably access to 4,000 GPUs for six months), and partnerships with public and private institutions.

Sarvam AI emerged as a lead beneficiary of this initiative. Co-founded by Dr. Pratyush Kumar, a former Microsoft Research scientist, and Dr. Vivek Raghavan, who previously worked on the Aadhaar program and with AI4Bharat, Sarvam AI positioned itself as a company focused on Indic language models and voice-first AI applications suited to India’s linguistic and socio-economic diversity.

The Controversy Unfolds

The backlash against Sarvam AI began as details about its government support surfaced in the public domain. Critics questioned both the opacity of the selection process and the public-private model being used. The key points of contention are:

1. Use of Public Infrastructure for Proprietary Models

Perhaps the most controversial aspect of the deal is Sarvam AI’s access to publicly funded infrastructure, specifically the 4,000 high-performance GPUs provided under the IndiaAI Compute Capacity project. These resources represent hundreds of crores in taxpayer money.

Opponents argue that since Sarvam AI is using public compute infrastructure, any AI models it develops should be fully open-source under the “public money, public code” principle. Many in the open-source and AI research communities contend that this is necessary to ensure transparency, reproducibility, and equitable access, especially in a country like India where small developers and researchers lack access to powerful hardware.

However, Sarvam AI has resisted full open-sourcing, citing the massive effort it put into creating proprietary datasets. The company argues that, given the lack of high-quality Indian-language data, it was forced to invest heavily in building datasets that are now core business assets. Its position is that such investments warrant protection and cannot be freely released under permissive licenses.

2. Open-Source in Name Only?

While Sarvam AI has released some models, such as Sarvam-2B, a 2-billion-parameter open-source LLM, and Shuka 1.0, an audio language model for Indian languages, critics claim these releases are largely symbolic.

These models, they argue, are small compared to state-of-the-art models developed internationally (e.g., Meta’s Llama 3, released in sizes up to 70 billion parameters, or OpenAI’s GPT-4, whose size is undisclosed), and do not reflect the scale expected from a sovereign initiative with access to government-grade compute resources.

Further, some have raised concerns that the models are being released under restrictive licenses (e.g., research-only or non-commercial terms), which limits their utility to startups, researchers, and smaller players, the very groups the IndiaAI Mission was supposed to empower.

3. Accusations of Repackaging Existing Tools

Online discussions, particularly on platforms like Reddit and X (formerly Twitter), have also cast doubt on the originality of Sarvam AI’s technology. Some users alleged that certain tools and demos launched by the company, such as voicebots and text-to-speech (TTS) services, appear to rely on off-the-shelf APIs and models from Western providers such as Google Cloud, ElevenLabs, or OpenAI (Whisper), rather than internally developed models.

A Reddit post that went viral compared outputs from Sarvam AI’s voice tools with known outputs from ElevenLabs, suggesting that the Indian startup may simply be providing a UI/UX layer over foreign models, despite positioning itself as a builder of indigenous LLMs.

Sarvam AI has not directly addressed these accusations in public statements, but has reiterated that its long-term goal is to build full-stack Indian AI models, even if current services use external components during early development stages.

4. The Role of the Government: Transparency and Fairness

The Indian government, meanwhile, has come under fire for not clearly outlining the selection criteria or the terms of support under which Sarvam AI was chosen. Civil society organizations and digital rights advocates have demanded more transparency in the process, questioning why one private company was granted exclusive access to publicly funded compute infrastructure.

Some worry that this sets a precedent where government-backed innovation could disproportionately benefit select startups, essentially turning state resources into private profits.

In response, MeitY officials have stated that Sarvam AI was chosen based on its “demonstrated capabilities” and alignment with national language priorities. They also hinted that more startups and research labs would be onboarded in the next phases of the IndiaAI Mission. However, details remain scarce.

5. Balancing Innovation and Public Good

At the heart of the controversy lies a fundamental debate: how should India build its AI future—through publicly-funded, open-source innovation, or by incentivizing private players with a combination of subsidies and market opportunity?

Proponents of the latter approach argue that private companies can move faster and attract top talent, as seen in the U.S. with companies like OpenAI and Anthropic. They claim that expecting startups to open-source everything would kill commercial viability and scare off investors.

On the other hand, critics argue that India’s AI ecosystem is still nascent, and if core models, datasets, and infrastructure remain closed, it will deepen the digital divide and replicate the same structural inequalities seen in global AI markets. They say the government should focus on open-access LLMs, support public universities and research centers, and create shared datasets for broader use.

Conclusion: A Defining Moment for India’s AI Future

The Sarvam AI episode is not just a company-specific controversy—it is a microcosm of the challenges India faces as it seeks to define its AI strategy in a rapidly globalizing tech landscape.

As public investment in AI scales up, India must decide whether to prioritize national AI sovereignty through open science or through private entrepreneurship with government backing. Both models have their merits, but without clarity, transparency, and public accountability, the line between innovation and privatization of public goods may become dangerously blurred.

The coming months will be crucial. With India poised to emerge as a global AI player, how it manages this controversy—especially in terms of access, equity, and openness—could set the tone for decades to come.
