Andreessen-Backed Inferact Emerges from Stealth with $150M to Commercialize vLLM AI Inference Engine

Artificial intelligence infrastructure startup Inferact launched with $150 million in seed funding at an $800 million valuation to commercialize vLLM, the open-source AI inference acceleration framework developed at UC Berkeley.

The round was co-led by Andreessen Horowitz and Lightspeed Venture Partners, with participation from Databricks Ventures, UC Berkeley Chancellor’s Fund, Sequoia Capital, Altimeter Capital, Redpoint Ventures, and ZhenFund.

Inferact’s founding team includes Databricks co-founder and UC Berkeley computer science professor Ion Stoica, who directs the university’s Sky Computing Lab, where vLLM originated in 2023. The project has since attracted contributions from more than 2,000 developers globally.

CEO Simon Mo stated, “We see a future where serving AI becomes effortless. Today, deploying a frontier model at scale requires a dedicated infrastructure team. Tomorrow, it should be as simple as spinning up a serverless database.”

vLLM optimizes AI model inference, the production phase in which models generate responses, through innovations such as PagedAttention, a memory management technique that nearly eliminates GPU memory fragmentation. The technology also enables models to generate multiple tokens simultaneously rather than one at a time, reducing wait times for users.
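For readers curious how paged memory management curbs fragmentation, the toy sketch below may help. It is not vLLM's code or API: the class name, method names, and block size are hypothetical, and it tracks only bookkeeping rather than real GPU memory. The idea it illustrates is that each sequence's cache grows in small fixed-size blocks taken from a shared pool, so the only unused space is the tail of each sequence's last block.

```python
class PagedKVCache:
    """Toy block-table allocator (hypothetical, for illustration only)."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))  # ids of unused physical blocks
        self.block_tables = {}  # sequence id -> list of physical block ids
        self.lengths = {}       # sequence id -> number of tokens stored

    def append_token(self, seq_id: str) -> None:
        """Reserve room for one more token, taking a new block only when needed."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        if length == len(table) * self.block_size:  # no block yet, or last one is full
            if not self.free_blocks:
                raise MemoryError("no free KV-cache blocks")
            table.append(self.free_blocks.pop())  # blocks need not be contiguous
        self.lengths[seq_id] = length + 1

    def free(self, seq_id: str) -> None:
        """Return all of a finished sequence's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)

    def wasted_slots(self) -> int:
        """Count allocated but unused token slots across all sequences."""
        return sum(
            len(table) * self.block_size - self.lengths.get(seq_id, 0)
            for seq_id, table in self.block_tables.items()
        )
```

In this sketch, waste per sequence is always smaller than one block, whereas reserving a single contiguous region sized for the longest possible response would leave most of that region idle for short responses.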

Co-founder Woosuk Kwon wrote in the announcement, “The complexity doesn’t disappear; it gets absorbed into the infrastructure we’re building,” describing Inferact’s strategy to provide enterprise-grade managed services atop the free open-source core.

Major technology companies including Amazon Web Services, Meta, Google, and Character.AI already deploy vLLM in production environments. The framework currently supports more than 500 model architectures and runs on more than 200 accelerators.

Inferact plans to launch a paid serverless version of vLLM that automates administrative tasks like infrastructure provisioning and software updates. The company will continue supporting vLLM as an independent open-source project while building proprietary enterprise features.

The funding follows a broader investment trend toward AI inference infrastructure as industry focus shifts from model training to cost-efficient production deployment at scale.

Shobhit Kalra

Shobhit Kalra is the Chief Sub Editor at Tea4Tech, with over 12 years of experience across digital media, digital marketing, and health technology. He is responsible for editorial review, content structuring, and quality control of articles covering software, SaaS products, and developments across the technology ecosystem. At Tea4Tech, Shobhit oversees content accuracy, clarity, and adherence to editorial standards, ensuring published stories meet the newsroom’s guidelines for originality, sourcing, and consistency.