Model family

Llama Models

Meta Llama models are widely supported open-weight options for local AI stacks, multimodal workflows, assistant prototypes, and guardrail experiments.

Meta · Updated 2026 · Chat · Multimodal · Local · Safety

Best for

Chat

Use this family hub to compare Llama variants for chat workflows, then open the detail page for deeper deployment notes.

Multimodal

Use this family hub to compare Llama variants for multimodal workflows, then open the detail page for deeper deployment notes.

Local

Use this family hub to compare Llama variants for local workflows, then open the detail page for deeper deployment notes.

Safety

Use this family hub to compare Llama variants for safety workflows, then open the detail page for deeper deployment notes.

Variants

Llama models grouped by workflow

Latest / flagship

Vision / multimodal

Safety / guardrails

Local-friendly

Compare

All Llama models in the directory

| Model | Type | Best for | Local runner notes | License | Detail |
| --- | --- | --- | --- | --- | --- |
| Llama 3 70B | Chat | Builders who want a widely supported open-weight chat model with broad runtime compatibility. | Commonly used in local workflows through quantized builds, but 70B-class models are best with high-memory GPUs or workstation/server hardware. | Llama 3 Community | Open |
| Llama 4 Scout | Multimodal | Teams evaluating Llama-family models for multimodal assistant, long-context, and application workflows. | Evaluate local fit with the exact checkpoint and quantization available for your runtime. | Llama license / check exact model card | Open |
| Llama 4 Maverick | Multimodal | Builders comparing current Llama-family models for assistant, multimodal, and reasoning-oriented workflows. | Evaluate local fit with the exact checkpoint and quantization available for your runtime. | Llama license / check exact model card | Open |
| Llama 3.3 70B Instruct | Chat | General assistant workflows, app prototypes, and Llama-family baseline comparisons. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
| Llama 3.1 405B Instruct | Chat | Server-class assistant evaluation and comparisons against smaller Llama variants. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
| Llama 3.1 70B Instruct | Chat | Teams comparing widely supported Llama-family 70B-class models. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
| Llama 3.1 8B Instruct | Edge | Local prototypes, small assistants, and lower-resource evaluation. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
| Llama 3 8B Instruct | Edge | Local baseline comparisons and lightweight app prototypes. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
| Llama Guard 3 | Safety | Safety checks, guardrail experiments, and policy classification workflows. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
| Llama 3.2 Vision | Vision | Vision-language experiments, screenshot reasoning, and multimodal app prototypes. | Use the exact checkpoint and quantization that matches your hardware and latency target. | Check exact model card | Open |
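
Estimating local fit

The local runner notes ask you to match checkpoint and quantization to your hardware. As a rough first check, the sketch below estimates the memory needed for model weights alone, using parameter count times bits per weight. The function and dictionary names are hypothetical, the parameter counts are the nominal sizes from the model names (8B, 70B, 405B), and the 8-bit and 4-bit figures are idealized: real quantized formats store extra scaling metadata, and a running model also needs memory for the KV cache, activations, and runtime overhead. Treat the results as a lower bound, not a sizing guarantee.

```python
# Back-of-the-envelope estimate of weight memory for a dense model.
# Assumption: memory ~= parameter count * bits per weight / 8 bytes.
# This ignores KV cache, activations, and quantization metadata.

def weight_memory_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate memory for model weights only, in GiB."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30

# Nominal sizes taken from the model names in the directory (in billions).
NOMINAL_SIZES = {"8B": 8, "70B": 70, "405B": 405}

# Idealized precisions; actual quantized builds vary by format.
PRECISIONS = {"16-bit": 16, "8-bit": 8, "4-bit": 4}

def memory_table() -> dict:
    """Return {size: {precision: GiB}} for every size/precision pair."""
    return {
        size: {
            label: round(weight_memory_gib(params, bits), 1)
            for label, bits in PRECISIONS.items()
        }
        for size, params in NOMINAL_SIZES.items()
    }

if __name__ == "__main__":
    for size, row in memory_table().items():
        cells = ", ".join(f"{label}: {gib} GiB" for label, gib in row.items())
        print(f"{size:>5}  {cells}")
```

Compare the estimate with the memory actually available on your GPU or workstation, leaving headroom for context length and runtime overhead, and confirm against the exact checkpoint you plan to run.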