Model family

E5 Models

E5 models are widely used for multilingual embeddings, semantic search, retrieval, low-overhead indexing, and RAG pipelines.

Microsoft / intfloat · Updated 2026 · Embedding · RAG · Multilingual · Search

Best for

Embedding

Use this family hub to compare E5 variants for embedding workflows, then open the detail page for deeper deployment notes.

RAG

Use this family hub to compare E5 variants for RAG workflows, then open the detail page for deeper deployment notes.

Multilingual

Use this family hub to compare E5 variants for multilingual workflows, then open the detail page for deeper deployment notes.

Search

Use this family hub to compare E5 variants for search workflows, then open the detail page for deeper deployment notes.
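
For the search and RAG workflows above, here is a minimal sketch of how E5-style retrieval is commonly wired. The E5 model cards ask for a "query: " prefix on queries and a "passage: " prefix on documents (the instruct variant uses a different prompt format, so check its model card), and candidates are ranked by cosine similarity. The names with_prefix, toy_embed, and rank are hypothetical, and toy_embed is a stand-in bag-of-words embedder used only so the example runs without downloading a model; in practice it would be replaced by a call to the actual checkpoint.

```python
import math
from collections import Counter


def with_prefix(text, role):
    """Prepend the E5-style role prefix ("query: " or "passage: ")."""
    if role not in ("query", "passage"):
        raise ValueError("role must be 'query' or 'passage'")
    return f"{role}: {text}"


def toy_embed(text):
    """Hypothetical stand-in embedder: L2-normalized token counts.

    This is NOT an E5 model. It only mimics the interface
    (text in, unit-length vector out) so the ranking logic can run.
    """
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values()))
    return {token: c / norm for token, c in counts.items()}


def cosine(a, b):
    """Cosine similarity of two unit-length sparse vectors (dot product)."""
    return sum(weight * b.get(token, 0.0) for token, weight in a.items())


def rank(query, passages, embed=toy_embed):
    """Return (score, passage) pairs sorted from most to least similar."""
    q_vec = embed(with_prefix(query, "query"))
    scored = [
        (cosine(q_vec, embed(with_prefix(p, "passage"))), p)
        for p in passages
    ]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


if __name__ == "__main__":
    docs = [
        "cooking pasta at home",
        "multilingual semantic search with E5",
    ]
    for score, passage in rank("multilingual semantic search", docs):
        print(f"{score:.3f}  {passage}")
```

Passing the embedder as a parameter keeps the prefixing and ranking logic unchanged when the toy function is swapped for a real model.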

Variants

E5 models grouped by workflow

Embedding and reranking

Embedding · RAG · Semantic search · Multilingual

Multilingual E5 Large

Microsoft / intfloat · E5

Best for: Teams building multilingual retrieval, semantic search, and RAG pipelines.

Local: Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware.

Embedding · Open weights where released · embedding · rag

e5-mistral-7b-instruct

Microsoft / intfloat · E5

Best for: Instruction-tuned embedding workflows

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

multilingual-e5-large-v2

Microsoft / intfloat · E5

Best for: Multilingual semantic search and RAG

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

e5-large-v2

Microsoft / intfloat · E5

Best for: English semantic search and RAG

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

e5-base-v2

Microsoft / intfloat · E5

Best for: Balanced embedding workflows

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

e5-small-v2

Microsoft / intfloat · E5

Best for: Lightweight embedding workflows

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

e5-large

Microsoft / intfloat · E5

Best for: Legacy large embedding baseline

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

e5-base

Microsoft / intfloat · E5

Best for: Legacy base embedding baseline

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.

Embedding · Open weights where released · embedding · rag

e5-small

Microsoft / intfloat · E5

Best for: Legacy lightweight embedding baseline

Local: Use the exact checkpoint and quantization that matches your hardware and latency target.
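
The "Local" notes above ask you to match checkpoint and quantization to your hardware. As a rough aid, the sketch below estimates weight memory as parameter count times bytes per parameter. The function names, the 1.2 overhead factor, and the 7e9 parameter count (taken only from the "7b" in the model name) are assumptions for illustration; real usage also depends on batch size, sequence length, and the runtime, so check the exact model card.

```python
# Approximate bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gib(num_params, precision):
    """Estimate memory for the weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[precision] / 2**30


def fits(num_params, precision, budget_gib, overhead=1.2):
    """Check the estimate against a memory budget.

    `overhead` is an assumed multiplier for activations and runtime
    buffers, not a measured value.
    """
    return weight_memory_gib(num_params, precision) * overhead <= budget_gib


if __name__ == "__main__":
    nominal_7b = 7e9  # assumption: nominal size implied by the "7b" name
    for precision in BYTES_PER_PARAM:
        gib = weight_memory_gib(nominal_7b, precision)
        print(f"{precision}: {gib:.1f} GiB, fits in 8 GiB: "
              f"{fits(nominal_7b, precision, 8)}")
```

The same arithmetic applies to the smaller encoder variants once you substitute the parameter count from their model cards.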

Compare

All E5 models in the directory

Model	Type	Best for	Local runner notes	License	Detail
Multilingual E5 Large	Embedding	Teams building multilingual retrieval, semantic search, and RAG pipelines.	Runs locally for many embedding and semantic search prototypes on CPU or modest GPU hardware.	MIT	Open
e5-mistral-7b-instruct	Embedding	Instruction-tuned embedding workflows	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
multilingual-e5-large-v2	Embedding	Multilingual semantic search and RAG	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
e5-large-v2	Embedding	English semantic search and RAG	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
e5-base-v2	Embedding	Balanced embedding workflows	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
e5-small-v2	Embedding	Lightweight embedding workflows	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
e5-large	Embedding	Legacy large embedding baseline	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
e5-base	Embedding	Legacy base embedding baseline	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open
e5-small	Embedding	Legacy lightweight embedding baseline	Use the exact checkpoint and quantization that matches your hardware and latency target.	Check exact model card	Open