Model family

Gemma Models

Gemma is Google's family of open-weight models. The small-to-mid-size variants listed here, from 1B to 27B parameters, are practical for developers building local, in-app, and multimodal workflows.

Google · Updated 2026 · Chat · Local · Efficient · Multimodal

Best for

Chat

Use this family hub to compare Gemma variants for chat workflows, then open the detail page for deeper deployment notes.

Local

Use this family hub to compare Gemma variants for local workflows, then open the detail page for deeper deployment notes.

Efficient

Use this family hub to compare Gemma variants for efficient workflows, then open the detail page for deeper deployment notes.

Multimodal

Use this family hub to compare Gemma variants for multimodal workflows, then open the detail page for deeper deployment notes.

Variants

Gemma models grouped by workflow

Latest / flagship

Coding

Vision / multimodal

Local-friendly

Chat · Practical local · chat · local

Gemma 3 27B

Google · Gemma

Best for: Developers testing capable medium-sized chat models with broad tooling support.

Local: Can be tested locally with quantized builds on higher-end consumer GPUs or unified-memory systems.

Details →

Edge · Open weights where released · local · efficient

Gemma 3 12B IT

Google · Gemma

Best for: Efficient local prototypes, app workflows, and Gemma-family comparisons.

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Edge · Open weights where released · local · efficient

Gemma 3 1B IT

Google · Gemma

Best for: Efficient local prototypes, app workflows, and Gemma-family comparisons.

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Chat · Open weights where released · local · efficient

Gemma 2 27B Instruct

Google · Gemma

Best for: Efficient local prototypes, app workflows, and Gemma-family comparisons.

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Chat · Open weights where released · local · efficient

Gemma 2 9B Instruct

Google · Gemma

Best for: Efficient local prototypes, app workflows, and Gemma-family comparisons.

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Edge · Open weights where released · local · efficient

Gemma 2 2B Instruct

Google · Gemma

Best for: Efficient local prototypes, app workflows, and Gemma-family comparisons.

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Compare

All Gemma models in the directory

| Model | Type | Best for | Local runner notes | License | Detail |
| --- | --- | --- | --- | --- | --- |
| Gemma 4 | Chat | Developers evaluating Google-backed open-weight models for efficient local apps, hosted prototypes, and multimodal workflows where supported. | Smaller Gemma variants are practical for local testing; larger variants need more VRAM or unified memory. | Apache 2.0 / check exact model card | Open |
| Gemma 3 27B | Chat | Developers testing capable medium-sized chat models with broad tooling support. | Can be tested locally with quantized builds on higher-end consumer GPUs or unified-memory systems. | Gemma Terms | Open |
| Gemma 3 12B IT | Edge | Efficient local prototypes, app workflows, and Gemma-family comparisons. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| Gemma 3 4B IT | Edge | Efficient local prototypes, app workflows, and Gemma-family comparisons. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| Gemma 3 1B IT | Edge | Efficient local prototypes, app workflows, and Gemma-family comparisons. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| Gemma 2 27B Instruct | Chat | Efficient local prototypes, app workflows, and Gemma-family comparisons. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| Gemma 2 9B Instruct | Chat | Efficient local prototypes, app workflows, and Gemma-family comparisons. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| Gemma 2 2B Instruct | Edge | Efficient local prototypes, app workflows, and Gemma-family comparisons. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| CodeGemma 7B | Code | Coding assistant experiments and developer workflow prototypes. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| PaliGemma 2 | Vision | Vision-language app prototypes and multimodal evaluation. | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
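
The local runner notes above come down to whether a given checkpoint and quantization fit in your VRAM or unified memory. As a rough first check, the sketch below estimates the memory taken by the weights alone as parameters × bits per weight ÷ 8. The function name `weight_memory_gb` and the `NOMINAL_PARAMS_B` table are illustrative, not part of any Gemma tooling. The parameter counts are assumed from the nominal sizes in the model names, and real checkpoints can differ. The estimate also ignores quantization metadata, the KV cache, activations, and runtime overhead, so treat it as a lower bound.

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights alone, in decimal gigabytes.

    Assumption: every parameter is stored at the same bit width and there is
    no per-block scale or zero-point overhead. Real quantized files are larger.
    """
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9


# Hypothetical lookup: nominal sizes read off the model names in the table,
# not exact parameter counts from the model cards.
NOMINAL_PARAMS_B = {
    "Gemma 3 27B": 27,
    "Gemma 3 12B IT": 12,
    "Gemma 3 4B IT": 4,
    "Gemma 3 1B IT": 1,
}

if __name__ == "__main__":
    for name, size_b in NOMINAL_PARAMS_B.items():
        estimates = ", ".join(
            f"{bits}-bit: {weight_memory_gb(size_b, bits):.1f} GB"
            for bits in (16, 8, 4)
        )
        print(f"{name}: {estimates}")
```

Under these assumptions, a nominal 27B model at 4 bits per weight needs about 13.5 GB for weights before any cache or runtime overhead, which is consistent with the note that Gemma 3 27B calls for a higher-end consumer GPU or a unified-memory system.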

Source box

Gemma family pages should separate Gemma 4 Apache-licensed checkpoints from earlier Gemma releases that used Gemma-specific terms.

Verified through: June 2026

Model licenses, context windows, release names, and provider terms can vary by checkpoint. Verify the exact model card before production or commercial use.