Model family
E5 Models
E5 models are widely used for multilingual embeddings, semantic search, retrieval, low-overhead indexing, and RAG pipelines.
Best for
Embedding
Use this family hub to compare E5 variants for embedding workflows, then open the detail page for deeper deployment notes.
RAG
Use this family hub to compare E5 variants for RAG workflows, then open the detail page for deeper deployment notes.
Multilingual
Use this family hub to compare E5 variants for multilingual workflows, then open the detail page for deeper deployment notes.
Search
Use this family hub to compare E5 variants for search workflows, then open the detail page for deeper deployment notes.
Variants
E5 models grouped by workflow
Embedding
Multilingual E5 Large
Microsoft / intfloat · E5
Best for: Teams building multilingual retrieval, semantic search, and RAG pipelines.
Local: Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware.
e5-mistral-7b-instruct
Microsoft / intfloat · E5
Best for: Instruction-tuned embedding workflows
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
multilingual-e5-large-v2
Microsoft / intfloat · E5
Best for: Multilingual semantic search and RAG
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
e5-large-v2
Microsoft / intfloat · E5
Best for: English semantic search and RAG
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
e5-base-v2
Microsoft / intfloat · E5
Best for: Balanced embedding workflows
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
e5-small-v2
Microsoft / intfloat · E5
Best for: Lightweight embedding workflows
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
e5-large
Microsoft / intfloat · E5
Best for: Legacy large embedding baseline
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
e5-base
Microsoft / intfloat · E5
Best for: Legacy base embedding baseline
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
e5-small
Microsoft / intfloat · E5
Best for: Legacy lightweight embedding baseline
Local: Use the exact checkpoint and quantization that match your hardware and latency target.
Compare
All E5 models in the directory
| Model | Type | Best for | Local runner notes | License | Detail |
|---|---|---|---|---|---|
| Multilingual E5 Large | Embedding | Teams building multilingual retrieval, semantic search, and RAG pipelines. | Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware. | MIT | Open |
| e5-mistral-7b-instruct | Embedding | Instruction-tuned embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| multilingual-e5-large-v2 | Embedding | Multilingual semantic search and RAG | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| e5-large-v2 | Embedding | English semantic search and RAG | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| e5-base-v2 | Embedding | Balanced embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| e5-small-v2 | Embedding | Lightweight embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| e5-large | Embedding | Legacy large embedding baseline | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| e5-base | Embedding | Legacy base embedding baseline | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| e5-small | Embedding | Legacy lightweight embedding baseline | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
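
For the semantic search and retrieval workflows listed above, ranking usually comes down to cosine similarity between one query embedding and a set of passage embeddings. The sketch below illustrates only that shape. `toy_embed` is a hypothetical bag-of-words stand-in, not an E5 model, and `with_prefix`, `cosine`, and `rank` are illustrative names that do not come from this directory. The `query: ` and `passage: ` prefixes follow the input convention described on the E5 model cards for the non-instruct checkpoints; treat that as an assumption to verify against the exact model card you deploy.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def with_prefix(text: str, role: str) -> str:
    """Prepend the role prefix ("query: " or "passage: ") to the input text.

    Assumption: this matches the convention on the E5 model cards for
    non-instruct checkpoints; confirm it for your exact model.
    """
    if role not in ("query", "passage"):
        raise ValueError(f"unknown role: {role!r}")
    return f"{role}: {text}"


def toy_embed(text: str) -> Dict[str, float]:
    """Hypothetical stand-in for a real encoder: L2-normalized token counts."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {token: c / norm for token, c in counts.items()}


def cosine(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Dot product of two unit-length sparse vectors, i.e. cosine similarity."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def rank(query: str, passages: List[str]) -> List[Tuple[str, float]]:
    """Return (passage, score) pairs sorted from most to least similar."""
    q_vec = toy_embed(with_prefix(query, "query"))
    scored = [
        (p, cosine(q_vec, toy_embed(with_prefix(p, "passage"))))
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = [
        "multilingual search across languages",
        "lightweight indexing on cpu",
        "search pipelines for rag",
    ]
    for passage, score in rank("multilingual search", docs):
        print(f"{score:.3f}  {passage}")
```

To try a real variant from the table, replace `toy_embed` with a call into whichever runtime you use to load the checkpoint, and keep the prefixing and ranking steps as they are.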