
India’s Sarvam AI Beats Google Gemini, ChatGPT on OCR Benchmarks

BENGALURU: Bengaluru-based startup Sarvam AI claims its models outperformed Google Gemini and ChatGPT on optical character recognition and text-to-speech benchmarks focused on Indian languages, marking a milestone for domestic AI development.

Sarvam Vision achieved 84.3% accuracy on olmOCR-Bench, surpassing Gemini 3 Pro and DeepSeek OCR v2, while ChatGPT ranked significantly lower. On OmniDocBench v1.5, Sarvam Vision scored 93.28% overall, excelling in complex formulas and layout parsing.

Co-founder Pratyush Kumar shared the benchmark results on X, stating, “On Indian languages, Sarvam Vision is the best model by far, while supporting all 22 scheduled Indian languages.”

The Vision series includes a 3-billion-parameter state-space model capable of image captioning, scene text recognition, chart interpretation, and complex table parsing. The model handles messy layouts, tables, mathematical formulas, and technical documents where traditional OCR tools struggle.

Alongside Vision, Sarvam launched Bulbul V3, a text-to-speech model supporting 35 voices across all 22 official Indian languages. Bulbul V3 can switch between Tamil and English, or Hindi and English, without disruption.

Tech commentator Deedy Das said he had reversed his earlier skepticism: “I was wrong about Sarvam. When I wrote about them a year ago, I felt the direction to train small Indic language models was wrong. But they have the best text-to-speech, speech-to-text, and OCR models for Indic languages.”

Union IT Minister Ashwini Vaishnaw said the work reflects the success of India’s AI mission.

Sarvam made its Document Intelligence API free through February 2026. The startup positions itself as building “sovereign AI” developed within India for government projects, public infrastructure, and banking, financial services and insurance (BFSI) applications, alongside innovations such as indigenous AI smart glasses.

Anurag Shukla

Anurag Shukla is a Senior Journalist with over two decades of experience across television, digital, and print media. He has worked with leading national news organisations and has also served as a Research Officer in the Prime Minister’s Office (PMO), contributing to media research and policy-level content. A former journalism academic, Anurag brings strong editorial depth and a keen understanding of how technology, governance, and society intersect at Tea4Tech.
