Google Releases Gemini Embedding 2 in Public Preview via Vertex AI

San Francisco: Google DeepMind has released Gemini Embedding 2 in public preview, marking the company’s first natively multimodal embedding model. Available now via the Gemini API and Vertex AI, the model maps text, images, video, audio, and documents into a single unified embedding space, a capability no previous Google embedding model has offered.

Until now, developers building search or retrieval systems across multiple media types had to stitch together separate models for each modality, maintain separate indexes, and write glue code to merge results. Gemini Embedding 2 eliminates that complexity. The simplification mirrors Google’s wider strategy of reducing friction across its AI ecosystem, including making it easier for users and teams to move from other AI chatbots to Gemini.

A single API call can now accept interleaved inputs, such as a paragraph of text alongside images and an audio clip, and return one embedding that captures relationships across all of them. Google says the model captures semantic intent across more than 100 languages, making it viable for global enterprise deployments out of the box.

The model handles five input types in one request: text up to 8,192 tokens, up to six images in PNG or JPEG format, video clips up to 120 seconds in MP4 or MOV, native audio without requiring speech-to-text transcription, and PDF documents up to six pages. The native audio embedding is a notable first: prior embedding models required an intermediate transcription step before audio could be processed semantically.
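For teams that want to catch oversized requests before sending them, the limits above could be checked client-side. The following is a minimal sketch, not part of any Google SDK: the EmbedRequest class, its field names, and the validate function are hypothetical, and only the numeric limits and formats come from the figures reported in this article.

```python
from dataclasses import dataclass, field
from typing import List

# Limits as reported in the article. Everything else here is hypothetical.
MAX_TEXT_TOKENS = 8192
MAX_IMAGES = 6
MAX_VIDEO_SECONDS = 120
MAX_PDF_PAGES = 6
IMAGE_FORMATS = {"png", "jpeg"}
VIDEO_FORMATS = {"mp4", "mov"}


@dataclass
class EmbedRequest:
    """Hypothetical description of one multimodal embedding request."""
    text_tokens: int = 0
    image_formats: List[str] = field(default_factory=list)  # one entry per image
    video_seconds: float = 0.0
    video_format: str = ""
    pdf_pages: int = 0


def validate(req: EmbedRequest) -> List[str]:
    """Return human-readable limit violations; an empty list means none found."""
    problems = []
    if req.text_tokens > MAX_TEXT_TOKENS:
        problems.append(f"text has {req.text_tokens} tokens, limit is {MAX_TEXT_TOKENS}")
    if len(req.image_formats) > MAX_IMAGES:
        problems.append(f"{len(req.image_formats)} images, limit is {MAX_IMAGES}")
    for fmt in req.image_formats:
        if fmt.lower() not in IMAGE_FORMATS:
            problems.append(f"unsupported image format: {fmt}")
    if req.video_seconds > MAX_VIDEO_SECONDS:
        problems.append(f"video is {req.video_seconds}s, limit is {MAX_VIDEO_SECONDS}s")
    if req.video_seconds > 0 and req.video_format.lower() not in VIDEO_FORMATS:
        problems.append(f"unsupported video format: {req.video_format}")
    if req.pdf_pages > MAX_PDF_PAGES:
        problems.append(f"PDF has {req.pdf_pages} pages, limit is {MAX_PDF_PAGES}")
    return problems
```

A request with 100 text tokens and two images passes with an empty list, while a 130-second AVI clip would be flagged twice, once for length and once for format.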

Gemini Embedding 2 uses Matryoshka Representation Learning, a technique that nests information inside vectors so they can be truncated to smaller dimensions without significant accuracy loss. Developers can choose output sizes of 3,072, 1,536, or 768 dimensions, allowing teams to balance retrieval quality against storage and infrastructure costs at scale.
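To make the truncation idea concrete, here is a minimal sketch of how a Matryoshka-style vector could be shortened on the client. It assumes the most important information sits in the leading dimensions, and it re-normalizes the shortened vector to unit length, a common step before cosine or dot-product comparison. The function name is hypothetical, and whether the service already returns normalized vectors at each size is not stated in the article.

```python
import math
from typing import List, Sequence

SUPPORTED_DIMS = (3072, 1536, 768)  # output sizes named in the article


def truncate_embedding(vec: Sequence[float], dim: int) -> List[float]:
    """Keep the first `dim` values of `vec` and rescale them to unit length."""
    if dim not in SUPPORTED_DIMS:
        raise ValueError(f"dim must be one of {SUPPORTED_DIMS}, got {dim}")
    if len(vec) < dim:
        raise ValueError(f"vector has only {len(vec)} values, need {dim}")
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head  # an all-zero vector cannot be normalized
    return [x / norm for x in head]
```

Assuming 32-bit floats, a 3,072-dimension vector occupies 12,288 bytes and a 768-dimension vector 3,072 bytes, so the smallest size cuts raw vector storage to a quarter.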

Embedding models differ from generative models like Gemini 3 in a key way: rather than producing text, they convert content into mathematical vectors that machines use to measure semantic similarity. These vectors power experiences across Google’s own products, from Search to enterprise Workspace tools, and form the foundation of RAG pipelines, semantic search, and data classification systems increasingly central to enterprise AI deployments.
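As an illustration of how vectors are used to measure semantic similarity, the sketch below ranks a few items against a query using cosine similarity, one widely used measure. The three-dimensional vectors and item names are toy values invented for the example, not model output, and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def rank(query: Sequence[float], corpus: Dict[str, Sequence[float]]) -> List[str]:
    """Return corpus keys ordered from most to least similar to the query."""
    return sorted(corpus, key=lambda k: cosine_similarity(query, corpus[k]), reverse=True)


# Toy vectors standing in for embeddings of different media types.
query = [1.0, 0.0, 0.0]
corpus = {
    "invoice_pdf": [0.0, 1.0, 0.0],
    "cat_photo": [0.9, 0.1, 0.0],
    "dog_audio": [0.5, 0.5, 0.0],
}
ranking = rank(query, corpus)
```

Because every item lives in the same vector space, a single ranking like this can mix photos, audio clips, and documents, which is the practical payoff of a unified embedding space.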

The model is already integrated with major developer tools including LangChain, LlamaIndex, Weaviate, ChromaDB, Pinecone, and Qdrant. Developers migrating from the older gemini-embedding-001 model will need to re-embed existing data, as the two models use incompatible vector spaces. Google says general availability will follow the public preview period.
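One way to guard against mixing incompatible vector spaces during a migration is to tag each index with the model that produced it and rebuild the index from source content. The sketch below is a simplified illustration under assumptions: the VectorIndex class, the reembed function, and the embed_fn callable are hypothetical stand-ins, and "new-embedding-model" is a placeholder rather than a real model ID.

```python
from typing import Callable, Dict, List

Embedder = Callable[[str], List[float]]  # hypothetical: text in, vector out


class VectorIndex:
    """Toy in-memory index that only accepts vectors from one model."""

    def __init__(self, model: str) -> None:
        self.model = model
        self.vectors: Dict[str, List[float]] = {}

    def add(self, doc_id: str, vector: List[float], model: str) -> None:
        # Vectors from different models are not comparable, so reject them.
        if model != self.model:
            raise ValueError(f"index holds {self.model} vectors, got {model}")
        self.vectors[doc_id] = vector


def reembed(documents: Dict[str, str], new_model: str, embed_fn: Embedder) -> VectorIndex:
    """Build a fresh index by re-embedding every source document."""
    index = VectorIndex(new_model)
    for doc_id, text in documents.items():
        index.add(doc_id, embed_fn(text), new_model)
    return index


# Stand-in embedder for demonstration only; a real one would call the model.
def fake_embed(text: str) -> List[float]:
    return [float(len(text)), 1.0]


docs = {"a": "hello", "b": "multimodal search"}
new_index = reembed(docs, "new-embedding-model", fake_embed)
```

Keeping the old index untouched until the new one is fully built lets a team switch traffic over in one step instead of serving results from a half-migrated store.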

The release also fits Google’s broader push to make its AI stack more developer-friendly, including efforts to turn AI Studio into a more full-stack environment for building and shipping Gemini-powered applications.

Anurag Shukla

Anurag Shukla is a Senior Journalist with over two decades of experience across television, digital, and print media. He has worked with leading national news organisations and has also served as a Research Officer in the Prime Minister’s Office (PMO), contributing to media research and policy-level content. A former journalism academic, Anurag brings strong editorial depth and a keen understanding of how technology, governance, and society intersect at Tea4Tech.