Getting started · OpenSourcesAI
New to local AI?
Start here.
This page walks you through what local AI is, what hardware you need, which tools to start with, and how to run your first model — step by step. Pick your path below or scroll straight through.
Step 0
Pick your starting path
Select what you're here to do and we'll show you your first move.
Foundation
Three concepts that unlock everything else
You don't need to memorize these. You just need enough of each to avoid the most common mistakes.
What is a local LLM?
A large language model that runs entirely on your own hardware. Your prompts never leave your machine.
Read the full guide →
What is VRAM?
GPU memory, the single most important hardware constraint for running local models. The model has to fit in VRAM for fast inference; whatever spills over into system RAM runs much slower.
VRAM sizing guide →
What is quantization?
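Reducing the precision of model weights (e.g. 16-bit → 4-bit) to cut memory use. A 4-bit (Q4) model needs roughly a quarter to a third of the memory of its 16-bit version, or about half that of an 8-bit (Q8) one, with a small quality tradeoff.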
Read the full guide →
Before you download
Check your hardware first
The most common beginner mistake is trying to run a 14B model on 8 GB of VRAM and watching it crawl or crash. The compatibility checker takes 60 seconds and tells you which model sizes realistically fit your GPU memory.
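If you want to sanity-check the 14B-on-8-GB example yourself, the rough arithmetic is parameters × bits per weight ÷ 8. The sketch below is a back-of-the-envelope estimate, not what the compatibility checker actually computes. The function names are made up for illustration, and both the 4.5 bits per weight (a typical figure for 4-bit formats once scaling data is included) and the 1.5 GB overhead allowance are assumptions.

```python
# Rough estimate of the memory needed to hold a model's weights.
# Assumption: size ≈ parameters × bits per weight / 8, plus a flat overhead
# allowance for context cache and runtime buffers (a guess, not a measurement).

def weight_gb(params_billion: float, bits_per_weight: float) -> float:
    """Approximate weight size in GB (1 GB = 1e9 bytes)."""
    return params_billion * bits_per_weight / 8


def fits(params_billion: float, bits_per_weight: float,
         vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """True if the weights plus the assumed overhead fit in the given VRAM."""
    return weight_gb(params_billion, bits_per_weight) + overhead_gb <= vram_gb


if __name__ == "__main__":
    for params in (8, 14):
        for bits in (16, 4.5):
            size = weight_gb(params, bits)
            verdict = "fits" if fits(params, bits, vram_gb=8) else "does not fit"
            print(f"{params}B at {bits} bits: ~{size:.1f} GB, {verdict} in 8 GB")
```

Under these assumptions a 14B model at about 4.5 bits per weight needs close to 8 GB for the weights alone, which is why it struggles on an 8 GB card, while an 8B model at the same precision leaves some headroom.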
First build path
Five steps to your first working local AI setup
Follow these in order. Each step builds on the last. Do not skip ahead to RAG or coding assistants until plain chat is working reliably.
Check your hardware
Before downloading any model, check your GPU memory. The compatibility checker gives you a shortlist of model sizes that fit your hardware.
Open compatibility checker →
Install Ollama
Ollama is the fastest path from zero to a running local model. It handles download, runtime, and a local API in a single install.
Ollama setup guide →
Pull one small model and test it
Run: ollama pull qwen3:8b to download the model, then ollama run qwen3:8b to start a chat and test a few prompts. Log response speed and answer quality before adding any other tools. This baseline matters.
Add Open WebUI for a better chat workspace
Once Ollama is running, Open WebUI adds a browser-based chat interface. Add it only after the model itself is working.
Ollama + Open WebUI stack →
Add RAG, coding tools, or agents when ready
Only after plain chat quality is acceptable on your hardware. Each layer adds complexity — add one at a time and test between steps.
Build a local RAG stack →
Essential tools
The tools most beginners need first
Start with Ollama. Add LM Studio or Open WebUI only after you have one model running. Everything else comes later.
Ollama
The simplest CLI runtime for local models. Exposes a local API that other tools connect to.
CLI · Free · Open source
View profile →
LM Studio
Desktop GUI with a model browser, chat interface, and local server. Best for visual workflows.
Desktop · Free
View profile →
Open WebUI
Browser-based chat workspace for Ollama and OpenAI-compatible backends. Self-hostable.
Web · Open source
View profile →
Continue
Open-source AI coding assistant for VS Code and JetBrains. Connects to local or cloud models.
IDE · Open source
View profile →
Qdrant
Local vector database for RAG workflows. Add it once your base chat setup is working.
Vector DB · Open source
View profile →
Model Builder Wizard
Answer a few questions and get a shortlist of model families that fit your use case and hardware.
Tool · Free · On this site
View profile →
Explore deeper
Ready to go further?
Once your base setup is working, these sections have everything you need to build on it.
Your actual next step
Use the compatibility checker to see which models fit your hardware right now. Then install Ollama and run one prompt. That is the entire first session.
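If you would like that first prompt to leave a speed baseline behind, one option is to send it through Ollama's local API instead of the chat prompt. This is a minimal sketch, assuming Ollama is running on its default address (http://localhost:11434) and that qwen3:8b has already been pulled. The helper names run_prompt and tokens_per_second are made up for illustration. The speed figure is derived from the eval_count and eval_duration fields (the latter in nanoseconds) that Ollama returns with a non-streaming response.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"


def tokens_per_second(result: dict) -> float:
    """Generation speed from eval_count and eval_duration (nanoseconds)."""
    duration_ns = result.get("eval_duration", 0)
    if duration_ns <= 0:
        return 0.0
    return result.get("eval_count", 0) / (duration_ns / 1e9)


def run_prompt(prompt: str, model: str = "qwen3:8b") -> dict:
    """Send one non-streaming prompt to the local Ollama server."""
    payload = json.dumps(
        {"model": model, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=300) as response:
        return json.load(response)


if __name__ == "__main__":
    result = run_prompt("Explain what VRAM is in two sentences.")
    print(result["response"])
    print(f"{tokens_per_second(result):.1f} tokens/s")
```

Note the tokens-per-second figure alongside your own impression of answer quality, so you have something to compare against when you later try a different model size or quantization.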