Local AI Hardware Fit Directory
Hardware fit guides for local AI: what each GPU and memory tier can run, which model sizes and quantization levels to target, and how to get started with Ollama, LM Studio, and llama.cpp.
NVIDIA GPU guides
Budget · 12GB
RTX 3060 12GB
7B and 8B at Q8, 14B at Q4. Budget entry into the 12GB tier, with the same model ceiling as higher-end 12GB cards.
Prosumer · 12GB
RTX 4070 Ti 12GB
7B and 8B at Q8, 14B at Q4. Fastest 12GB Ada Lovelace option.
Prosumer · 16GB
RTX 4080 / 4080 Super 16GB
7B at FP16, 13B and 14B at Q8. The step up from 12GB cards to the 16GB tier.
Enthusiast · 24GB (used market)
RTX 3090 24GB
7B and 8B at FP16, 14B at Q8, 30B at Q4. Same ceiling as RTX 4090 at lower cost. NVLink supported.
Enthusiast · 24GB
RTX 4090 24GB
7B and 8B at FP16, 14B at Q8, 30B at Q4. Fastest 24GB consumer GPU for local AI.
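The fit figures in the cards above can be roughly sanity-checked with a common rule of thumb: weight memory is about parameter count times bits per weight, divided by 8. The sketch below is an illustration under stated assumptions, not the method these guides use. The function names, the nominal bit widths, and the 1.5 GB headroom for KV cache and runtime buffers are all hypothetical choices; real quantized files and long contexts need more.

```python
# Rough weight-memory estimate: parameters x bits per weight / 8.
# Assumption: nominal bit widths. Real quantized files carry extra
# metadata, and the KV cache grows with context length.
BITS_PER_WEIGHT = {"FP16": 16, "Q8": 8, "Q4": 4}


def weight_gb(params_billion: float, quant: str) -> float:
    """Approximate size of the model weights in GB (1 GB = 1e9 bytes)."""
    return params_billion * BITS_PER_WEIGHT[quant] / 8


def fits(params_billion: float, quant: str, vram_gb: float,
         headroom_gb: float = 1.5) -> bool:
    """True if weights plus a hypothetical fixed headroom fit in VRAM."""
    return weight_gb(params_billion, quant) + headroom_gb <= vram_gb


# Check a few of the pairings listed in this directory.
checks = [(7, "Q8", 12), (14, "Q4", 12), (7, "FP16", 16),
          (30, "Q4", 24), (70, "Q4", 48)]
for params, quant, vram in checks:
    size = weight_gb(params, quant)
    print(f"{params}B {quant}: ~{size:.1f} GB weights, "
          f"fits {vram}GB: {fits(params, quant, vram)}")
```

Under these assumptions, 14B at Q8 (about 14 GB of weights) does not fit a 12GB card but does fit a 16GB one, which matches the tier boundaries described here.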
Apple Silicon guides
VRAM tier guides
Not sure which GPU you have? Browse by VRAM amount to see what fits.
VRAM tier
12GB VRAM tier
What can 12GB of GPU VRAM run? Model fit table, GPU options at this tier, and when 24GB is worth the upgrade.
VRAM tier
16GB VRAM tier
7B at FP16 and 13B–14B at Q8: the two meaningful upgrades the 16GB tier offers over 12GB cards.
VRAM tier
24GB VRAM tier
What can 24GB of GPU VRAM run? 7B FP16, 30B Q4, and how it compares to Apple Silicon unified memory.
VRAM tier
48GB VRAM tier
Workstation-class local AI: 70B at Q4 and 30B at Q8. Hardware paths: RTX A6000, dual RTX 3090 with NVLink, Apple Silicon with 64GB.
VRAM tier
CPU-only local AI
Running local LLMs without a GPU: which models are practical, what speeds to expect, RAM requirements, and when a GPU is worth adding.
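For a rough feel of CPU-only speed, a widely used rule of thumb treats token generation as memory-bandwidth bound: each generated token reads roughly all of the weights once, so bandwidth divided by model size gives an upper bound on tokens per second. The sketch below is only an estimate under that assumption. The function name is hypothetical, and the example figures (51.2 GB/s theoretical peak for dual-channel DDR4-3200, 3.5 GB for a 7B model at a nominal 4 bits per weight) are illustrative inputs, not measurements.

```python
def max_tokens_per_second(bandwidth_gb_s: float, model_gb: float) -> float:
    """Upper bound on generation speed if every token reads all weights once.

    Assumption: decoding is limited by memory bandwidth, not compute.
    Measured speeds are usually lower than this bound.
    """
    if model_gb <= 0:
        raise ValueError("model_gb must be positive")
    return bandwidth_gb_s / model_gb


# Illustrative inputs: dual-channel DDR4-3200 peak bandwidth and a
# 7B model at a nominal 4 bits per weight (7 * 4 / 8 = 3.5 GB).
bound = max_tokens_per_second(bandwidth_gb_s=51.2, model_gb=3.5)
print(f"Upper bound: about {bound:.1f} tokens/s")
```

The same arithmetic suggests why larger models slow down quickly on CPU: doubling the model size halves the bound at a fixed memory bandwidth.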
Not sure where to start?
Use the compatibility checker to enter your specific GPU model and VRAM amount and get model recommendations matched to your hardware.