
Andreessen-Backed Inferact Emerges from Stealth with $150M to Commercialize vLLM AI Inference Engine

Artificial intelligence infrastructure startup Inferact launched with $150 million in seed funding at an $800 million valuation to commercialize vLLM, the open-source AI inference acceleration framework developed at UC Berkeley.

The round was co-led by Andreessen Horowitz and Lightspeed Venture Partners, with participation from Databricks Ventures, UC Berkeley Chancellor’s Fund, Sequoia Capital, Altimeter Capital, Redpoint Ventures, and ZhenFund.

Inferact’s founding team includes Databricks co-founder and UC Berkeley computer science professor Ion Stoica, who directs the university’s Sky Computing Lab where vLLM originated in 2023. The project has since attracted contributions from more than 2,000 developers globally.

CEO Simon Mo stated, “We see a future where serving AI becomes effortless. Today, deploying a frontier model at scale requires a dedicated infrastructure team. Tomorrow, it should be as simple as spinning up a serverless database.”

vLLM optimizes AI model inference (the production phase in which trained models generate responses) through innovations such as PagedAttention, a memory-management technique that largely eliminates GPU memory fragmentation in the key-value cache. By packing memory efficiently, the framework can batch many user requests together on the same hardware, increasing throughput and reducing response times.
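PagedAttention's core idea, splitting the KV cache into fixed-size blocks that are handed out on demand like virtual-memory pages, can be illustrated with a toy allocator. This is a simplified sketch for intuition only, not vLLM's actual implementation; all names here are hypothetical.

```python
# Toy paged KV-cache allocator (illustrative only, not vLLM's code).
# Memory is split into fixed-size blocks; a sequence claims a new block
# only when its current one fills up, so no space is reserved up front
# for a sequence's maximum possible length.

class PagedKVCache:
    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size                 # tokens per block
        self.free_blocks = list(range(num_blocks))   # pool of physical blocks
        self.block_tables = {}                       # seq_id -> list of block ids
        self.seq_lens = {}                           # seq_id -> tokens cached

    def append_token(self, seq_id: int) -> None:
        """Account for one generated token; allocate a block only when needed."""
        n = self.seq_lens.get(seq_id, 0)
        if n % self.block_size == 0:                 # current block full (or none yet)
            if not self.free_blocks:
                raise MemoryError("KV cache exhausted")
            self.block_tables.setdefault(seq_id, []).append(self.free_blocks.pop())
        self.seq_lens[seq_id] = n + 1

    def free_sequence(self, seq_id: int) -> None:
        """Return a finished sequence's blocks to the pool for reuse."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.seq_lens.pop(seq_id, None)


cache = PagedKVCache(num_blocks=4, block_size=16)
for _ in range(20):                                  # 20 tokens -> 2 blocks of 16
    cache.append_token(seq_id=0)
print(len(cache.block_tables[0]))                    # 2
cache.free_sequence(0)
print(len(cache.free_blocks))                        # 4: all blocks reusable
```

Because blocks are returned to a shared pool the moment a request finishes, memory freed by one user's completed response is immediately available to another's, which is what lets a server batch far more concurrent requests than a contiguous-allocation scheme would.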

Co-founder Woosuk Kwon wrote in the announcement, “The complexity doesn’t disappear; it gets absorbed into the infrastructure we’re building,” describing Inferact’s strategy to provide enterprise-grade managed services atop the free open-source core.

Major technology companies including Amazon Web Services, Meta, Google, and Character.AI already deploy vLLM in production environments. The framework currently supports more than 500 model architectures and runs on more than 200 accelerators.

Inferact plans to launch a paid serverless version of vLLM that automates administrative tasks like infrastructure provisioning and software updates. The company will continue supporting vLLM as an independent open-source project while building proprietary enterprise features.

The funding follows a broader investment trend toward AI inference infrastructure as industry focus shifts from model training to cost-efficient production deployment at scale.

Shobhit Kalra

Shobhit Kalra is the Chief Sub Editor at Tea4Tech, with over 12 years of experience across digital media, digital marketing, and health technology. He is responsible for editorial review, content structuring, and quality control of articles covering software, SaaS products, and developments across the technology ecosystem. At Tea4Tech, Shobhit oversees content accuracy, clarity, and adherence to editorial standards, ensuring published stories meet the newsroom's guidelines for originality, sourcing, and consistency.
