Google released Gemma 4 yesterday, a family of four open-source models built on the same technology as its proprietary Gemini 3. The models range from mobile-optimized variants (Effective 2B and 4B) to larger models for workstations and servers (26B Mixture of Experts and 31B Dense). The 31B model currently ranks #3 on the Arena AI leaderboard for open models, outperforming systems 20 times its size.
The technical capabilities are substantial: native function calling for agentic workflows, structured JSON output, a 256k context window, and multimodal support for images and video. The mobile variants include native speech recognition. Google worked with Qualcomm and MediaTek to optimize these for devices like smartphones, Raspberry Pi, and Jetson Nano. For code generation, Google claims Gemma 4 can deliver quality comparable to cloud-based tools such as Gemini Pro and Claude Code, while running entirely offline.
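Structured JSON output is worth dwelling on, because it is what makes model responses machine-readable rather than free text you have to scrape. As a minimal sketch, here is defensive parsing and validation of a JSON response on the consuming side; the invoice schema and the sample response are illustrative assumptions, not real Gemma 4 output:

```python
import json

REQUIRED_KEYS = {"invoice_id", "total", "currency"}  # hypothetical schema

def parse_structured_output(raw: str) -> dict:
    """Extract and validate a JSON object from model output.

    Models with native structured output usually return bare JSON, but
    defensive parsing still pays off when the text arrives wrapped in a
    markdown code fence.
    """
    text = raw.strip()
    if text.startswith("```"):
        # Drop an opening fence like ```json and the closing fence.
        text = text.split("\n", 1)[1].rsplit("```", 1)[0]
    data = json.loads(text)
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"model omitted required keys: {missing}")
    return data

# Illustrative model response, not captured from Gemma 4:
sample = '```json\n{"invoice_id": "INV-001", "total": 129.5, "currency": "EUR"}\n```'
parsed = parse_structured_output(sample)
```

The point of validating against required keys is that a pipeline fails loudly at the parsing step instead of silently propagating an incomplete record into your ERP.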
But the real story is the license change. Previous Gemma versions used a custom license that developers criticized as too restrictive for serious commercial use. With Gemma 4, Google switched to Apache 2.0, the same permissive license used for Android and countless other open-source projects. This removes the legal uncertainty that made enterprises hesitant to build production systems on Gemma.
Why the Apache 2.0 license matters
License terms determine what you can actually do with a model. Previous Gemma licenses included restrictions on commercial redistribution, derivative works, and usage at scale. Legal teams at large organizations often blocked Gemma adoption because the license terms were unclear or incompatible with enterprise policies. Apache 2.0 is different: it explicitly permits commercial use, modification, and distribution with minimal restrictions.
This matters for the sovereign AI discussion. European organizations increasingly want to run AI workloads on infrastructure they control, without routing sensitive data through US cloud providers. But "open-source" models with restrictive licenses created a contradiction: you could self-host the model, but you could not freely use it for commercial purposes or modify it for your needs. Apache 2.0 resolves this. When Google says these models have been downloaded 400 million times and spawned over 100,000 community variants, the Apache license means those variants can now be used commercially without legal ambiguity.
The native agentic capabilities are equally significant for enterprise deployment. Gemma 4 supports function calling and structured JSON output as first-class features, not bolted-on afterthoughts. This means you can build AI agents that interact with your ERP, CRM, or document systems directly, using a model running on your own hardware. The 26B Mixture of Experts model activates only 3.8 billion of its 26 billion parameters during inference, which delivers much higher throughput than comparably sized models.
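To make the agent loop concrete, here is a minimal sketch of the dispatch side of function calling, assuming the OpenAI-style tool schema that servers like vLLM expose; the `lookup_order` tool and the sample tool call are hypothetical, standing in for a real ERP integration:

```python
import json

# Hypothetical tool the agent exposes to the model, declared in the
# OpenAI-style function-calling format used by common inference servers.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "lookup_order",
        "description": "Fetch an order record from the ERP system.",
        "parameters": {
            "type": "object",
            "properties": {"order_id": {"type": "string"}},
            "required": ["order_id"],
        },
    },
}]

def lookup_order(order_id: str) -> dict:
    # Stand-in for a real ERP query.
    return {"order_id": order_id, "status": "shipped"}

REGISTRY = {"lookup_order": lookup_order}

def dispatch(tool_call: dict) -> dict:
    """Execute the function the model asked for and return its result."""
    fn = REGISTRY[tool_call["name"]]
    args = json.loads(tool_call["arguments"])
    return fn(**args)

# Illustrative tool call, shaped like what a function-calling model emits:
call = {"name": "lookup_order", "arguments": '{"order_id": "A-42"}'}
result = dispatch(call)
```

In a full agent, `result` would be appended to the conversation and sent back to the model for the next step; the registry keeps the model from invoking anything you did not explicitly expose.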
Laava's perspective
Laava has consistently argued that European businesses should own their AI infrastructure rather than rent it from cloud providers. The challenge has been that truly open models often lagged behind closed alternatives in capability. Gemma 4 narrows that gap significantly. A model that ranks #3 among open models while running on a single H100 GPU, released under Apache 2.0, changes the cost-benefit calculation for self-hosting.
The mobile-optimized variants are particularly interesting for edge deployment. Many enterprise use cases involve processing documents or handling requests at locations without reliable cloud connectivity: warehouses, retail stores, field service operations. A model that runs efficiently on a Raspberry Pi or Jetson Nano opens up deployment patterns that were previously impractical. Imagine invoice processing at a warehouse dock, or customer service at a remote retail location, running entirely on local hardware.
The combination with recent releases from Mistral and Qwen means organizations now have multiple high-quality options for sovereign AI deployment. You are no longer choosing between capability and control. You can have both. Laava's model-agnostic architecture means clients can adopt Gemma 4 without rebuilding their systems. The Model Gateway pattern treats LLMs as replaceable components: change one configuration line, and your workflows route to Gemma 4 instead of your previous model.
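The Model Gateway idea can be sketched in a few lines: the model is a configuration entry, not a hard-coded dependency. The endpoint URLs, model ids, and class names below are assumptions for illustration, not Laava's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class ModelConfig:
    name: str       # model identifier (hypothetical ids below)
    endpoint: str   # OpenAI-compatible inference endpoint (assumed)

MODELS = {
    "previous": ModelConfig("mistral-large", "http://gpu-1:8000/v1"),
    "gemma4": ModelConfig("gemma-4-26b-moe", "http://gpu-2:8000/v1"),
}

class ModelGateway:
    """Routes all workflow traffic to whichever backend is active."""

    def __init__(self, active: str):
        self.config = MODELS[active]  # the one line that changes per rollout

    def complete(self, prompt: str) -> str:
        # In production this would POST to self.config.endpoint; here we
        # only show which backend the workflow would hit.
        return f"[{self.config.name}@{self.config.endpoint}] {prompt}"

# Switching models is a config change, not a code change:
gateway = ModelGateway(active="gemma4")
reply = gateway.complete("Summarise invoice INV-001")
```

Because every workflow talks to the gateway rather than to a specific model, swapping "previous" for "gemma4" touches one key and nothing downstream.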
What you can do now
If you have been waiting for open-source models to reach parity with cloud APIs before building sovereign AI infrastructure, that moment is arriving. Gemma 4 is available now on Hugging Face, vLLM, llama.cpp, and other standard inference frameworks. The Apache 2.0 license means your legal team can approve it without extended review.
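Getting a first endpoint running is a one-liner on either framework. The model ids below are placeholders, since the actual Hugging Face repository names should be checked before use:

```shell
# Serve with vLLM (exposes an OpenAI-compatible API on port 8000).
# "google/gemma-4-26b-moe" is a placeholder id, not a confirmed repo name.
vllm serve google/gemma-4-26b-moe --max-model-len 262144

# Or run a quantized GGUF build locally with llama.cpp's server:
llama-server -m gemma-4-26b-moe-Q4_K_M.gguf --ctx-size 32768
```

Either route gives you a local HTTP endpoint, so application code written against an OpenAI-compatible client works unchanged against your own hardware.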
Laava helps organizations evaluate and deploy self-hosted models for production workloads. Our four-week Proof of Pilot takes one specific process, deploys a model on your infrastructure, and measures real-world performance before you commit to broader rollout. If you want to understand what Gemma 4 can do for your document processing, customer service, or workflow automation, start with a focused pilot on a single use case.