San Francisco: Google DeepMind has released Gemini Embedding 2 in public preview, the company’s first natively multimodal embedding model. Available now via the Gemini API and Vertex AI, the model maps text, images, video, audio, and documents into a single unified embedding space, a capability no previous Google embedding model has offered.
Until now, developers building search or retrieval systems across multiple media types had to stitch together separate models for each modality, maintain separate indexes, and write glue code to merge results. Gemini Embedding 2 eliminates that complexity.
A single API call can now accept interleaved inputs (a paragraph of text alongside images and an audio clip, for example) and return one embedding that captures relationships across all of them. Google says the model captures semantic intent across more than 100 languages, making it viable for global enterprise deployments out of the box.
The model handles five input types in one request: text up to 8,192 tokens, up to six images in PNG or JPEG format, video clips up to 120 seconds in MP4 or MOV, native audio without requiring speech-to-text transcription, and PDF documents up to six pages. That native audio embedding is a notable first: prior embedding models required an intermediate transcription step before audio could be processed semantically.
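To picture what one interleaved request looks like, the sketch below builds a single payload mixing the modalities described above. The field names (`parts`, `inline_data`) and the model id are illustrative assumptions, not the documented preview schema:

```python
# Hypothetical shape of one interleaved embedding request.
# Field names and the model id are illustrative assumptions,
# not the actual API schema for the preview.
request = {
    "model": "gemini-embedding-2",  # assumed preview model id
    "parts": [
        {"text": "Quarterly review of the Q3 marketing campaign."},
        {"inline_data": {"mime_type": "image/png",
                         "data": "<base64 slide image>"}},
        {"inline_data": {"mime_type": "audio/mp3",
                         "data": "<base64 meeting clip>"}},
    ],
    "output_dimensionality": 768,  # one of 3072 / 1536 / 768
}

# One call, one vector: the response carries a single embedding that
# covers every part, rather than one vector per modality per model.
assert len(request["parts"]) == 3
```

The point is the contrast with the old workflow: instead of three model calls and merge logic, the mixed-media content travels in one request and comes back as one vector.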
Gemini Embedding 2 uses Matryoshka Representation Learning, a technique that nests information inside vectors so they can be truncated to smaller dimensions without significant accuracy loss. Developers can choose output sizes of 3,072, 1,536, or 768 dimensions, allowing teams to balance retrieval quality against storage and infrastructure costs at scale.
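The practical upside of Matryoshka training is that a stored 3,072-dimension vector can simply be cut down and renormalized, no re-embedding needed. A minimal sketch with a toy random vector standing in for real model output:

```python
import numpy as np

def truncate_embedding(vec: np.ndarray, dims: int) -> np.ndarray:
    """Keep the first `dims` components and renormalize to unit length,
    which Matryoshka-trained vectors are designed to tolerate."""
    small = vec[:dims]
    return small / np.linalg.norm(small)

# Toy stand-in for a full-size embedding (real output would come
# from the model, not a random generator).
rng = np.random.default_rng(0)
full = rng.normal(size=3072)
full /= np.linalg.norm(full)

for dims in (3072, 1536, 768):
    v = truncate_embedding(full, dims)
    print(dims, v.shape, round(float(np.linalg.norm(v)), 3))
```

Halving the dimension halves index storage and roughly halves similarity-search compute, which is why the 768-dimension option matters at enterprise scale.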
Embedding models differ from generative models like Gemini 3 in a key way: rather than producing text, they convert content into mathematical vectors that machines use to measure semantic similarity. These vectors power experiences across Google’s own products, from Search to enterprise Workspace tools, and form the foundation of RAG pipelines, semantic search, and data classification systems increasingly central to enterprise AI deployments.
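The "measure semantic similarity" step described above is typically plain cosine similarity between vectors: nearby vectors mean related content. A toy sketch with made-up 4-dimension vectors (a real model produces 768 to 3,072 dimensions):

```python
import numpy as np

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: 1.0 for identical directions, ~0 for unrelated."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Made-up "embeddings" for illustration only.
query    = np.array([0.9, 0.1, 0.0, 0.1])
doc_hit  = np.array([0.8, 0.2, 0.1, 0.0])  # semantically close to the query
doc_miss = np.array([0.0, 0.1, 0.9, 0.4])  # unrelated content

assert cosine(query, doc_hit) > cosine(query, doc_miss)
```

A RAG pipeline or semantic search system is essentially this comparison run at scale: embed the query, rank stored document vectors by similarity, and return the top matches.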
The model is already integrated with major developer tools including LangChain, LlamaIndex, Weaviate, ChromaDB, Pinecone, and Qdrant. Developers migrating from the older gemini-embedding-001 model will need to re-embed existing data, as the two models use incompatible vector spaces. Google says general availability will follow the public preview period.