Gemma 3 12B

Google · Gemma 3

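A 12B-parameter open-weight model with a 128K-token context window. Q4-quantized weights fit in 8–10 GB of VRAM at short context lengths; the KV cache grows quickly as context approaches 128K, so long-context runs need considerably more memory. Benchmark it against Qwen3 14B on your own tasks.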
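The memory trade-off described above can be sketched with a back-of-the-envelope calculation. This is a rough estimate under stated assumptions, not a measurement: the layer count, KV-head count, head dimension, and effective bits per weight below are illustrative values that should be checked against the model's `config.json` and your quantization format. The formula also treats every layer as full attention, so it is an upper bound for any architecture that uses sliding-window layers, and it ignores runtime overhead such as activations and buffers.

```python
# Back-of-the-envelope VRAM estimate: quantized weights + KV cache.
# All architecture numbers used below are ASSUMPTIONS for illustration;
# verify them against the model's config.json before relying on them.

GIB = 1024 ** 3


def weight_bytes(n_params: int, bits_per_weight: float) -> float:
    """Approximate size of the weights at a given quantization width."""
    return n_params * bits_per_weight / 8


def kv_cache_bytes(tokens: int, n_layers: int, n_kv_heads: int,
                   head_dim: int, bytes_per_elem: int = 2) -> int:
    """KV cache size if every layer caches K and V for every token.

    This is an upper bound for models that mix in sliding-window layers.
    """
    # Leading 2 = one key tensor plus one value tensor per layer.
    return 2 * n_layers * n_kv_heads * head_dim * bytes_per_elem * tokens


if __name__ == "__main__":
    N_PARAMS = 12_000_000_000   # "12B" from the listing
    BITS_PER_WEIGHT = 4.5       # assumed effective width of a Q4 format
    N_LAYERS = 48               # assumed
    N_KV_HEADS = 8              # assumed
    HEAD_DIM = 256              # assumed
    KV_BYTES = 2                # fp16 cache; assumed

    weights = weight_bytes(N_PARAMS, BITS_PER_WEIGHT)
    print(f"weights: {weights / GIB:.1f} GiB")

    for ctx in (8_192, 32_768, 131_072):
        kv = kv_cache_bytes(ctx, N_LAYERS, N_KV_HEADS, HEAD_DIM, KV_BYTES)
        total = weights + kv
        print(f"context {ctx:>7}: KV {kv / GIB:5.1f} GiB, total {total / GIB:5.1f} GiB")
```

Under these assumed values the cache costs 384 KiB per token, so the upper bound reaches 48 GiB at 131,072 tokens, which is far more than the quantized weights themselves. Real usage depends on the runtime, any KV-cache quantization, and how the model's attention layers are configured, so measure on your own hardware.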

Editorial review

Reviewed by: OpenSourcesAI Editorial
Last updated: June 2026
Sources: HuggingFace model card (google/gemma-3-12b-it), official docs, OpenSourcesAI editorial review.

VRAM figures are empirical estimates. Actual usage varies by runtime, context length, and system configuration. Verify on your specific hardware before production use.

