Qwythos-9B (Claude-Mythos-5 1M)

Empero AI · Qwythos

A 9B-parameter open-weight model. The full 1M-token context requires tensor-parallel multi-GPU inference or KV-cache offloading. For single-GPU use on an RTX 4070 Ti (12 GB), target the Q4_K_M GGUF at 256K context via Ollama or llama.cpp; the hybrid Gated-DeltaNet attention stack keeps memory growth sub-quadratic below ~256K tokens.
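As a rough way to sanity-check the single-GPU guidance above, the sketch below estimates the two dominant memory terms: the quantized weights and the KV cache of the full-attention layers. Everything numeric here is an assumption for illustration: the ~4.85 bits per weight is a commonly quoted average for Q4_K_M rather than a measured figure for this model, and the layer count, KV-head count, and head dimension are hypothetical placeholders, not values taken from the model card. Linear-attention (Gated-DeltaNet) layers keep a fixed-size state, so they are left out of the context-dependent term.

```python
def weights_gib(n_params: float, bits_per_weight: float) -> float:
    """Approximate size of the quantized weights in GiB."""
    return n_params * bits_per_weight / 8 / 2**30


def kv_cache_gib(n_ctx: int, n_attn_layers: int, n_kv_heads: int,
                 head_dim: int, bytes_per_elem: float = 2.0) -> float:
    """KV cache in GiB for the full-attention layers only.

    One K and one V vector are stored per token, per layer, per KV head.
    """
    n_bytes = 2 * n_ctx * n_attn_layers * n_kv_heads * head_dim * bytes_per_elem
    return n_bytes / 2**30


# Placeholder inputs: replace with values from the model's actual config.
N_PARAMS = 9e9           # "9B" from the listing
BITS_PER_WEIGHT = 4.85   # assumed average for Q4_K_M
N_CTX = 256 * 1024       # 256K-token context
N_ATTN_LAYERS = 8        # hypothetical count of full-attention layers
N_KV_HEADS = 4           # hypothetical
HEAD_DIM = 256           # hypothetical
KV_BYTES = 1.0           # roughly an 8-bit KV cache; use 2.0 for fp16

weights = weights_gib(N_PARAMS, BITS_PER_WEIGHT)
kv = kv_cache_gib(N_CTX, N_ATTN_LAYERS, N_KV_HEADS, HEAD_DIM, KV_BYTES)
print(f"weights  ~ {weights:.2f} GiB")
print(f"KV cache ~ {kv:.2f} GiB")
print(f"subtotal ~ {weights + kv:.2f} GiB (excludes activations and runtime overhead)")
```

The subtotal is a lower bound: compute buffers, the recurrent state of the linear-attention layers, and the desktop's own VRAM use all come on top of it.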

Editorial review

Reviewed by: OpenSourcesAI Editorial
Last updated: June 2026
Sources: Hugging Face model card (empero-ai/Qwythos-9B-Claude-Mythos-5-1M), official docs, OpenSourcesAI editorial review.

VRAM figures are empirical estimates. Actual usage varies by runtime, context length, and system configuration. Verify on your specific hardware before production use.
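One way to do that verification on an NVIDIA card is to read memory usage from `nvidia-smi` while the model is loaded at the target context length. The sketch below is a minimal example, assuming `nvidia-smi` is on the PATH; the helper names `parse_memory_csv` and `read_gpu_memory` are illustrative, not part of any library.

```python
import subprocess

# nvidia-smi reports these fields in MiB when "nounits" is requested.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory_csv(text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' lines into (used_mib, total_mib), one tuple per GPU."""
    rows = []
    for line in text.strip().splitlines():
        used, total = (int(field.strip()) for field in line.split(","))
        rows.append((used, total))
    return rows


def read_gpu_memory() -> list[tuple[int, int]]:
    """Run nvidia-smi and return per-GPU memory usage in MiB."""
    result = subprocess.run(QUERY, capture_output=True, text=True, check=True)
    return parse_memory_csv(result.stdout)
```

Calling `read_gpu_memory()` once before loading the model and again after a long prompt has been processed gives the difference attributable to the runtime. The parser assumes numeric fields, so it will raise if the driver reports a value such as `[N/A]`.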
