Startup Stories

Andreessen-Backed Inferact Emerges from Stealth with $150M to Commercialize vLLM AI Inference Engine

Artificial intelligence infrastructure startup Inferact launched with $150 million in seed funding at an $800 million valuation to commercialize vLLM, the open-source AI inference acceleration framework developed at UC Berkeley.

The round was co-led by Andreessen Horowitz and Lightspeed Venture Partners, with participation from Databricks Ventures, UC Berkeley Chancellor’s Fund, Sequoia Capital, Altimeter Capital, Redpoint Ventures, and ZhenFund.

Inferact’s founding team includes Databricks co-founder and UC Berkeley computer science professor Ion Stoica, who directs the university’s Sky Computing Lab where vLLM originated in 2023. The project has since attracted contributions from more than 2,000 developers globally.

CEO Simon Mo stated, “We see a future where serving AI becomes effortless. Today, deploying a frontier model at scale requires a dedicated infrastructure team. Tomorrow, it should be as simple as spinning up a serverless database.”

vLLM optimizes AI model inference (the production deployment phase, in which trained models generate responses) through innovations such as PagedAttention, a memory-management technique that eliminates GPU memory fragmentation. By managing the cache in small blocks, the engine can batch many requests and serve them concurrently rather than one at a time, reducing response times for users.
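The idea behind PagedAttention resembles virtual-memory paging: each request's cache grows in small fixed-size blocks allocated on demand, instead of one large contiguous reservation. The toy Python sketch below is purely illustrative (it is not vLLM's actual implementation; the class, block size, and sequence IDs are hypothetical) and shows how block-level allocation avoids fragmentation:

```python
BLOCK_SIZE = 16  # tokens per cache block (illustrative value)

class PagedKVCache:
    """Toy block allocator: a sequence's cache grows block-by-block,
    like virtual-memory paging, so no large contiguous region is
    reserved up front and freed blocks are instantly reusable."""

    def __init__(self, num_blocks):
        self.free_blocks = list(range(num_blocks))  # pool of physical blocks
        self.block_tables = {}  # seq_id -> list of physical block ids
        self.seq_lens = {}      # seq_id -> tokens cached so far

    def append_token(self, seq_id):
        """Reserve cache space for one newly generated token."""
        n = self.seq_lens.get(seq_id, 0)
        if n % BLOCK_SIZE == 0:  # current block is full (or first token)
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            self.block_tables.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1

    def free(self, seq_id):
        """Return a finished request's blocks to the shared pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)

cache = PagedKVCache(num_blocks=8)
for _ in range(20):  # 20 tokens need ceil(20/16) = 2 blocks
    cache.append_token("request-a")
print(len(cache.block_tables["request-a"]))  # 2
```

Because unused capacity stays in a shared pool rather than being pre-reserved per request, many concurrent requests can share the same GPU memory, which is what makes large-batch serving efficient.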

Co-founder Woosuk Kwon wrote in the announcement, “The complexity doesn’t disappear; it gets absorbed into the infrastructure we’re building,” describing Inferact’s strategy to provide enterprise-grade managed services atop the free open-source core.

Major technology companies, including Amazon Web Services, Meta, Google, and Character.AI, already deploy vLLM in production environments. The framework currently supports more than 500 model architectures and runs on more than 200 accelerators.

Inferact plans to launch a paid serverless version of vLLM that automates administrative tasks like infrastructure provisioning and software updates. The company will continue supporting vLLM as an independent open-source project while building proprietary enterprise features.

The funding follows a broader investment trend toward AI inference infrastructure as industry focus shifts from model training to cost-efficient production deployment at scale.

Shobhit Kalra

Shobhit Kalra is the Chief Sub Editor at Tea4Tech, with over 12 years of experience across digital media, digital marketing, and health technology. He is responsible for editorial review, content structuring, and quality control of articles covering software, SaaS products, and developments across the technology ecosystem. At Tea4Tech, Shobhit oversees content accuracy, clarity, and adherence to editorial standards, ensuring published stories meet the newsroom’s guidelines for originality, sourcing, and consistency.
