Model family

BGE Models

BGE models cover embeddings, reranking, retrieval, semantic search, and vector database workflows for RAG builders.

BAAI · Updated 2026 · Embedding · Reranking · RAG · Search

Best for

Embedding

Use this family hub to compare BGE variants for embedding workflows, then open the detail page for deeper deployment notes.

Reranking

Use this family hub to compare BGE variants for reranking workflows, then open the detail page for deeper deployment notes.

RAG

Use this family hub to compare BGE variants for RAG workflows, then open the detail page for deeper deployment notes.

Search

Use this family hub to compare BGE variants for search workflows, then open the detail page for deeper deployment notes.

Variants

BGE models grouped by workflow

Embedding and reranking

Reranking · RAG · Retrieval · Search

BGE Reranker v2 M3

BAAI · BGE

Best for: RAG builders who need a practical reranker after Qdrant, Chroma, pgvector, or other vector search.

Local: Commonly used in local RAG stacks as a reranking step after vector search.
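
The reranking step described above can be sketched as follows. This is a minimal illustration, not the directory's or BAAI's reference pipeline: `rerank`, `toy_overlap_scorer`, and `load_cross_encoder_scorer` are hypothetical names, the toy scorer is a stand-in that is not a BGE model, and the optional loader assumes the `sentence-transformers` package is installed and that the checkpoint loads through its `CrossEncoder` class.

```python
from typing import Callable, List, Sequence, Tuple

# A scorer maps (query, passage) pairs to relevance scores; higher is better.
PairScorer = Callable[[Sequence[Tuple[str, str]]], List[float]]


def rerank(query: str, candidates: Sequence[str],
           score_pairs: PairScorer, top_k: int = 3) -> List[Tuple[str, float]]:
    """Re-order candidates returned by vector search using pairwise scores."""
    if not candidates:
        return []
    pairs = [(query, passage) for passage in candidates]
    scores = score_pairs(pairs)
    ranked = sorted(zip(candidates, scores), key=lambda item: item[1], reverse=True)
    return ranked[:top_k]


def toy_overlap_scorer(pairs: Sequence[Tuple[str, str]]) -> List[float]:
    """Stand-in scorer (NOT a BGE model): share of query words found in the passage."""
    scores = []
    for query, passage in pairs:
        q_words = set(query.lower().split())
        p_words = set(passage.lower().split())
        scores.append(len(q_words & p_words) / max(len(q_words), 1))
    return scores


def load_cross_encoder_scorer(model_name: str = "BAAI/bge-reranker-v2-m3") -> PairScorer:
    """Optional: wrap a real reranker. Assumes sentence-transformers is installed."""
    from sentence_transformers import CrossEncoder  # lazy import, only when needed

    model = CrossEncoder(model_name)
    return lambda pairs: [float(s) for s in model.predict([list(p) for p in pairs])]


# Candidates as they might come back from Qdrant, Chroma, or pgvector.
vector_hits = [
    "qdrant stores vectors",
    "bge reranker scores query passage pairs",
    "cats sleep all day",
]
top = rerank("how does a reranker score passage pairs", vector_hits,
             toy_overlap_scorer, top_k=2)
```

Keeping the scorer as a plain callable lets the same `rerank` function work with the toy scorer in tests and a real model in a local stack.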

Details →

Embedding · Open weights where released · embedding · rag

bge-m3

BAAI · BGE

Best for: Multilingual embedding and retrieval workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.
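
To make the hardware side of that choice concrete, here is a rough back-of-the-envelope sketch of weight memory at different precisions. The function name `weight_memory_mib` and the example parameter count are assumptions for illustration; read the real parameter count from the model card. The estimate covers weights only and ignores activations, tokenizer, and runtime overhead.

```python
# Approximate storage cost per parameter at common precisions.
BYTES_PER_PARAM = {"fp32": 4.0, "fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_mib(num_params: int, precision: str) -> float:
    """Estimate memory needed for model weights alone, in MiB."""
    if precision not in BYTES_PER_PARAM:
        raise ValueError(f"unknown precision: {precision}")
    return num_params * BYTES_PER_PARAM[precision] / (1024 ** 2)


# Hypothetical parameter count, used only to show the calculation.
ASSUMED_PARAMS = 568_000_000
for precision in BYTES_PER_PARAM:
    print(f"{precision}: {weight_memory_mib(ASSUMED_PARAMS, precision):.0f} MiB")
```

If the estimate for a given precision does not leave headroom on the target device, a smaller checkpoint or a lower-precision build is the usual next thing to try.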

Details →

Embedding · Open weights where released · embedding · rag

bge-large-en-v1.5

BAAI · BGE

Best for: English embedding and retrieval workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.
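
As a rough sketch of the retrieval step such an embedding model feeds, the code below ranks passages by cosine similarity. The helper names are hypothetical and the vectors are toy values. The query instruction string is the one the English v1.5 model cards describe for short-query-to-passage retrieval; treat it as an assumption and confirm the exact text on the model card. The optional `embed_with_bge` function assumes the `sentence-transformers` package is installed.

```python
import math
from typing import List, Sequence, Tuple

# Assumed query instruction for English v1.5 checkpoints; verify on the model card.
QUERY_INSTRUCTION = "Represent this sentence for searching relevant passages: "


def prepare_query(text: str, use_instruction: bool = True) -> str:
    """Prefix the query (never the passages) with the retrieval instruction."""
    return QUERY_INSTRUCTION + text if use_instruction else text


def normalize(vec: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; zero vectors are returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return list(vec) if norm == 0 else [x / norm for x in vec]


def top_k_by_cosine(query_vec: Sequence[float],
                    passage_vecs: Sequence[Sequence[float]],
                    k: int = 2) -> List[Tuple[int, float]]:
    """Return (passage index, cosine similarity) for the k best passages."""
    q = normalize(query_vec)
    scored = []
    for idx, vec in enumerate(passage_vecs):
        p = normalize(vec)
        scored.append((idx, sum(a * b for a, b in zip(q, p))))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:k]


def embed_with_bge(texts: List[str], model_name: str = "BAAI/bge-large-en-v1.5"):
    """Optional: real embeddings. Assumes sentence-transformers is installed."""
    from sentence_transformers import SentenceTransformer  # lazy import

    model = SentenceTransformer(model_name)
    return model.encode(texts, normalize_embeddings=True).tolist()


# Toy 2-D vectors standing in for real embeddings.
best = top_k_by_cosine([1.0, 0.0], [[0.0, 1.0], [2.0, 0.0], [1.0, 1.0]], k=2)
```

In a real stack the vector database performs this similarity search; the pure-Python version is only meant to show what is being computed.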

Details →

Embedding · Open weights where released · embedding · rag

bge-base-en-v1.5

BAAI · BGE

Best for: Balanced English embedding workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Embedding · Open weights where released · embedding · rag

bge-small-en-v1.5

BAAI · BGE

Best for: Lightweight local embedding workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.
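
One reason a smaller embedding model suits lightweight local setups is the size of the stored vectors, not only the model itself. The sketch below estimates raw index size from the embedding dimension. The function name `index_size_mib` is hypothetical, the dimensions are the values commonly listed for these checkpoints and should be verified on each model card, and the figure ignores any index structures or metadata the vector store adds.

```python
# Embedding dimensions as commonly listed; confirm on each model card.
ASSUMED_DIMS = {
    "bge-small-en-v1.5": 384,
    "bge-base-en-v1.5": 768,
    "bge-large-en-v1.5": 1024,
}


def index_size_mib(num_vectors: int, dim: int, bytes_per_value: int = 4) -> float:
    """Raw storage for num_vectors float vectors of length dim, in MiB."""
    return num_vectors * dim * bytes_per_value / (1024 ** 2)


# Compare raw storage for one million float32 vectors per model.
for name, dim in ASSUMED_DIMS.items():
    print(f"{name}: {index_size_mib(1_000_000, dim):.0f} MiB")
```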

Details →

Embedding · Open weights where released · embedding · rag

bge-large-zh-v1.5

BAAI · BGE

Best for: Chinese embedding and retrieval workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Reranking · Open weights where released · reranking · rag

bge-reranker-large

BAAI · BGE

Best for: RAG reranking workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Reranking · Open weights where released · reranking · rag

bge-reranker-base

BAAI · BGE

Best for: Lightweight reranking workflows

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Embedding · Open weights where released · embedding · rag

bge-embedding-gemma2

BAAI · BGE

Best for: Gemma-based embedding experiments

Local: Use the exact checkpoint and quantization that match your hardware and latency target.

Details →

Compare

All BGE models in the directory

| Model | Type | Best for | Local runner notes | License | Detail |
| --- | --- | --- | --- | --- | --- |
| BGE Reranker v2 M3 | Reranking | RAG builders who need a practical reranker after Qdrant, Chroma, pgvector, or other vector search. | Commonly used in local RAG stacks as a reranking step after vector search. | Check model card | Open |
| bge-m3 | Embedding | Multilingual embedding and retrieval workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-large-en-v1.5 | Embedding | English embedding and retrieval workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-base-en-v1.5 | Embedding | Balanced English embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-small-en-v1.5 | Embedding | Lightweight local embedding workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-large-zh-v1.5 | Embedding | Chinese embedding and retrieval workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-reranker-large | Reranking | RAG reranking workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-reranker-base | Reranking | Lightweight reranking workflows | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |
| bge-embedding-gemma2 | Embedding | Gemma-based embedding experiments | Use the exact checkpoint and quantization that match your hardware and latency target. | Check exact model card | Open |