Model family

E5 Models

E5 models are widely used for multilingual embeddings, semantic search, retrieval, low-overhead indexing, and RAG pipelines.

Microsoft / intfloat · Updated 2026 · Embedding · RAG · Multilingual · Search

Best for

Embedding

Use this family hub to compare E5 variants for embedding workflows, then open the detail page for deeper deployment notes.

RAG

Use this family hub to compare E5 variants for RAG workflows, then open the detail page for deeper deployment notes.

Multilingual

Use this family hub to compare E5 variants for multilingual workflows, then open the detail page for deeper deployment notes.

Search

Use this family hub to compare E5 variants for search workflows, then open the detail page for deeper deployment notes.

Variants

E5 models grouped by workflow

Embedding and reranking

Embedding · RAG · Semantic search · Multilingual

Multilingual E5 Large

Microsoft / intfloat · E5

Best for: Teams building multilingual retrieval, semantic search, and RAG pipelines.

Local: Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware.

Details →
Embedding · Open weights where released · RAG

e5-mistral-7b-instruct

Microsoft / intfloat · E5

Best for: Instruction-tuned embedding workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

multilingual-e5-large-v2

Microsoft / intfloat · E5

Best for: Multilingual semantic search and RAG

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

e5-large-v2

Microsoft / intfloat · E5

Best for: English semantic search and RAG

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

e5-base-v2

Microsoft / intfloat · E5

Best for: Balanced embedding workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

e5-small-v2

Microsoft / intfloat · E5

Best for: Lightweight embedding workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

e5-large

Microsoft / intfloat · E5

Best for: Legacy large embedding baseline

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

e5-base

Microsoft / intfloat · E5

Best for: Legacy base embedding baseline

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →
Embedding · Open weights where released · RAG

e5-small

Microsoft / intfloat · E5

Best for: Legacy lightweight embedding baseline

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Compare

All E5 models in the directory

Model | Type | Best for | Local runner notes | License | Detail
Multilingual E5 Large | Embedding | Teams building multilingual retrieval, semantic search, and RAG pipelines. | Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware. | MIT | Open
e5-mistral-7b-instruct | Embedding | Instruction-tuned embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
multilingual-e5-large-v2 | Embedding | Multilingual semantic search and RAG | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
e5-large-v2 | Embedding | English semantic search and RAG | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
e5-base-v2 | Embedding | Balanced embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
e5-small-v2 | Embedding | Lightweight embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
e5-large | Embedding | Legacy large embedding baseline | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
e5-base | Embedding | Legacy base embedding baseline | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
e5-small | Embedding | Legacy lightweight embedding baseline | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open
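
For the semantic search and RAG workflows listed above, the following is a minimal sketch of how an E5-style retrieval step is often wired together, not a reference implementation. It assumes the "query: " / "passage: " input prefixes described on the model cards for the BERT-style E5 checkpoints (confirm this on the exact model card you deploy). The helper names `with_prefix`, `cosine`, and `rank` are hypothetical, and the vectors in the usage lines are made-up toy values standing in for real encoder output.

```python
import math


def with_prefix(texts, kind):
    """Prepend the assumed E5 input prefix ('query: ' or 'passage: ')."""
    if kind not in ("query", "passage"):
        raise ValueError("kind must be 'query' or 'passage'")
    return [f"{kind}: {text}" for text in texts]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # Guard against zero vectors so we never divide by zero.
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank(query_vec, passage_vecs):
    """Return (passage_index, score) pairs, highest similarity first."""
    scores = [(i, cosine(query_vec, vec)) for i, vec in enumerate(passage_vecs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Text that would be sent to the encoder (the encoder itself is not shown).
queries = with_prefix(["how do I reset my password"], "query")
passages = with_prefix(["Open Settings to reset it.", "Our office hours."], "passage")

# Toy vectors standing in for encoder output, for illustration only.
ranking = rank([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0]])
```

Instruction-tuned checkpoints such as e5-mistral-7b-instruct use a different input format, so treat the prefixes here as an assumption to verify against the model card rather than a rule for the whole family.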