Master BGE Multilingual Embeddings: Models, Installation, and Quick Usage

An overview of the BGE (BAAI General Embedding) family, including v1, v1.5, M3, Multilingual Gemma2, and EN‑ICL: their multilingual capabilities, model variants, token limits, and optimal use cases, plus step‑by‑step installation and Python usage instructions with code examples for embedding generation and similarity scoring.

Ma Wei Says

BGE (BAAI General Embedding) is a series of open‑source embedding models released by the Beijing Academy of Artificial Intelligence (BAAI) that convert text into high‑dimensional vectors for retrieval, classification, clustering, and other tasks.

01 BGE v1: Bilingual Base Model

The first version focuses on Chinese‑English bilingual semantic vectors and provides six models: large, base, and small for each language, all with a maximum length of 512 tokens.

English: BAAI/bge-large-en, BAAI/bge-base-en, BAAI/bge-small-en
Chinese: BAAI/bge-large-zh, BAAI/bge-base-zh, BAAI/bge-small-zh

These models are suitable for information retrieval, semantic matching, and lightweight applications; the small variants are ideal for resource‑constrained environments such as edge devices.

02 BGE v1.5: Optimized Similarity Distribution

v1.5 improves the similarity distribution and boosts zero‑instruction retrieval performance, though using query_instruction_for_retrieval is still recommended for the best results. It retains the same six model sizes with a -v1.5 suffix.

English: BAAI/bge-large-en-v1.5, BAAI/bge-base-en-v1.5, BAAI/bge-small-en-v1.5
Chinese: BAAI/bge-large-zh-v1.5, BAAI/bge-base-zh-v1.5, BAAI/bge-small-zh-v1.5

Compared with v1, v1.5 produces more stable similarity scores in semantic matching and retrieval tasks.
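The score these models produce is cosine similarity: the dot product of two L2-normalized vectors. A minimal NumPy sketch of the computation (the vectors here are toy values standing in for real BGE embeddings):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: dot product divided by the vectors' L2 norms."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional vectors standing in for real embeddings.
query_vec = [0.1, 0.3, 0.5, 0.1]
doc_vec = [0.2, 0.2, 0.6, 0.0]
score = cosine_similarity(query_vec, doc_vec)
print(round(score, 4))
```

Scores near 1.0 indicate near-identical meaning; the "optimized similarity distribution" in v1.5 refers to how these scores spread out across unrelated versus related text pairs.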

03 BGE M3: Multi‑function, Multi‑lingual, Multi‑granularity

The flagship model, BAAI/bge-m3, supports dense retrieval, sparse retrieval, and multi‑vector retrieval; covers over 100 languages; and can process inputs up to 8192 tokens. It excels in cross‑language search, long‑document embedding (e.g., academic papers), and hybrid retrieval that combines dense, sparse, and multi‑vector signals.
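Hybrid retrieval typically fuses the dense cosine score with a sparse lexical-overlap score computed from per-token weights. A rough sketch of one common fusion scheme (the weight `alpha` and the exact score functions are illustrative assumptions, not M3's precise formulation):

```python
import numpy as np

def dense_score(q, d):
    # Cosine similarity between dense embedding vectors.
    q, d = np.asarray(q, dtype=float), np.asarray(d, dtype=float)
    return float(q @ d / (np.linalg.norm(q) * np.linalg.norm(d)))

def sparse_score(q_weights, d_weights):
    # Lexical match: sum of weight products over tokens shared by query and doc.
    return sum(w * d_weights[t] for t, w in q_weights.items() if t in d_weights)

def hybrid_score(q_dense, d_dense, q_sparse, d_sparse, alpha=0.7):
    # Weighted fusion; alpha balances dense (semantic) vs. sparse (lexical) evidence.
    return alpha * dense_score(q_dense, d_dense) + (1 - alpha) * sparse_score(q_sparse, d_sparse)

# Toy example: tiny vectors and per-token lexical weights.
q_dense, d_dense = [0.2, 0.8, 0.1], [0.3, 0.7, 0.2]
q_sparse = {"bge": 0.9, "embedding": 0.5}
d_sparse = {"bge": 0.8, "retrieval": 0.4}
print(round(hybrid_score(q_dense, d_dense, q_sparse, d_sparse), 4))
```

The appeal of this setup is that the sparse channel catches exact keyword matches that dense vectors can blur, while the dense channel catches paraphrases that share no tokens.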

04 BGE Multilingual Gemma2: LLM‑based Multilingual Model

Built on Google’s gemma‑2‑9b architecture, BAAI/bge-multilingual-gemma2 is trained on many languages (English, Chinese, Japanese, Korean, French, etc.) and tasks (retrieval, classification, clustering), achieving top scores on benchmarks such as MIRACL, MTEB‑pl, MTEB‑fr, C‑MTEB, and AIR‑Bench. Leveraging the reasoning power of a large language model, it is especially suited to multilingual tasks and complex semantic understanding.

05 BGE‑EN‑ICL: In‑Context Learning for Enhanced Performance

This LLM‑based vector model, BAAI/bge-en-icl, introduces in‑context learning (ICL): providing a few example prompts dramatically improves performance on new tasks. It attains excellent results on BEIR and AIR‑Bench, particularly in high‑precision semantic matching scenarios.
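ICL embedders work by serializing a handful of task demonstrations into a prompt prefix before encoding the actual query. A rough sketch of such a formatter; the `<instruct>`/`<query>`/`<response>` template here is an assumption for illustration, not the model's documented format:

```python
# Each demonstration pairs a task instruction, an example query,
# and a relevant response.
examples = [
    {
        "instruct": "Given a web search query, retrieve passages that answer it.",
        "query": "what is a virtual interface",
        "response": "A virtual interface is a software-defined network endpoint...",
    },
]

def build_icl_prompt(examples, task, query):
    """Concatenate the demonstrations, then append the actual query."""
    parts = [
        f"<instruct>{ex['instruct']}\n<query>{ex['query']}\n<response>{ex['response']}"
        for ex in examples
    ]
    parts.append(f"<instruct>{task}\n<query>{query}")
    return "\n\n".join(parts)

prompt = build_icl_prompt(
    examples,
    task="Given a web search query, retrieve passages that answer it.",
    query="how do dense embeddings work",
)
print(prompt)
```

The model then embeds this whole prompt, so the demonstrations steer what "relevant" means for the task without any fine‑tuning.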

06 Installation and Usage

1. pip installation

For inference only (no fine‑tuning):

pip install -U FlagEmbedding

For fine‑tuning support:

pip install -U FlagEmbedding[finetune]

2. Source installation

git clone https://github.com/FlagOpen/FlagEmbedding.git
cd FlagEmbedding
# Inference only
pip install .
# With fine‑tuning support
# pip install .[finetune]

Editable mode:

# Inference only
pip install -e .
# With fine‑tuning support
# pip install -e .[finetune]

3. Basic usage

from FlagEmbedding import FlagAutoModel

# Load the model; use_fp16 speeds up inference at a minor precision cost.
model = FlagAutoModel.from_finetuned(
    'BAAI/bge-base-en-v1.5',
    query_instruction_for_retrieval="Represent this sentence for searching relevant passages:",
    use_fp16=True,
)

sentences_1 = ["I love NLP", "I love machine learning"]
sentences_2 = ["I love BGE", "I love text retrieval"]
embeddings_1 = model.encode(sentences_1)
embeddings_2 = model.encode(sentences_2)

# The embeddings are L2-normalized by default, so the dot product
# gives cosine similarity.
similarity = embeddings_1 @ embeddings_2.T
print(similarity)
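The similarity variable above is a matrix with one row per sentence in the first list and one column per sentence in the second; picking each row's best match is an argmax over columns. A small sketch using a toy matrix in place of real model output:

```python
import numpy as np

# Toy similarity matrix: rows are queries, columns are candidate sentences.
similarity = np.array([
    [0.62, 0.35],
    [0.41, 0.73],
])
candidates = ["I love BGE", "I love text retrieval"]

best = similarity.argmax(axis=1)  # index of the best candidate for each query
for row, idx in enumerate(best):
    print(f"query {row} -> {candidates[idx]} (score {similarity[row, idx]:.2f})")
```

For larger candidate sets the same pattern applies; you would typically use np.argsort to take the top-k columns per row instead of a single argmax.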

For more details, refer to the GitHub repository:

https://github.com/FlagOpen/FlagEmbedding
Tags: Python, LLM, Embedding, multilingual, retrieval
Written by Ma Wei Says
Follow me! I discuss software architecture and development, AIGC, and AI Agents, and sometimes share insights on life as an IT professional.
