Google Gemma 4 12B
MOUNTAIN VIEW, Calif.: Google releases Gemma 4 12B, a multimodal open-weight AI model designed to run locally on laptops with just 16GB of memory.
Google DeepMind announces the model under an Apache 2.0 license. Gemma 4 12B handles text, images, and native audio inputs without separate encoders. The encoder-free architecture cuts latency and memory requirements significantly versus traditional multimodal designs.
The model packs 12 billion parameters and delivers performance close to much larger systems. Google benchmarks show it approaching the Gemma 4 26B mixture-of-experts model on key tasks. This makes it the company’s first mid-sized Gemma model with native audio support.
Weights deploy freely on Kaggle and Hugging Face at just under 18GB total. The model runs through Hugging Face Transformers, vLLM, SGLang, MLX, llama.cpp, and LiteRT-LM. Developers can serve it as an OpenAI-compatible local API through the new litert-lm CLI.
Google also launches AI Edge Gallery and AI Edge Eloquent for macOS users. Both apps run Gemma 4 12B fully on-device, processing voice and visual inputs locally. A sandboxed Python execution loop lets users plot scientific charts inside the chat interface.
DRAM prices jumped roughly 90% in Q1 2026 as memory production redirected toward AI data centers. Micron told CNBC at CES it had effectively sold out memory capacity for 2026. A capable 16GB-memory model sidesteps both the hardware crunch and ongoing cloud inference costs.
Gemma models have now crossed 150 million total downloads since launch.
The launch reflects a broader industry pivot toward on-device AI deployment recently. Microsoft pushed Surface Laptop Ultra with RTX Spark earlier this week for local AI workloads. Apple Intelligence, Anthropic, and OpenAI all explore similar device-resident model strategies.
LAS VEGAS: Cisco has launched Cloud Control, a unified platform designed for both humans and…
MOUNTAIN VIEW, Calif.: Google has significantly reduced the price of its AI Plus subscription from…
MENLO PARK, Calif.: Instagram has introduced Grid Reordering, a feature that allows users to manually…
CUPERTINO, Calif.: Apple has unveiled Siri AI at WWDC 2026, a completely rebuilt version of…
MOUNTAIN VIEW, Calif.: Google has given NotebookLM a major upgrade, powering it with Gemini 3.5…
REDMOND, Wash.: Microsoft has unveiled the Xbox Series X25 Limited Edition, a special console marking…