New BGE Vector Models Set SOTA in Code and Multimodal Retrieval – What Makes Them So Powerful?
Three newly released BGE vector models (BGE‑Code‑v1, BGE‑VL‑v1.5, and BGE‑VL‑Screenshot) deliver state‑of‑the‑art performance on code, multimodal, and visual‑document retrieval benchmarks. All three are open‑source on Hugging Face and GitHub and aim to boost retrieval‑augmented applications across languages and modalities.
Retrieval‑augmented generation (RAG) and multimodal search rely on high‑quality vector encoders. Three new open‑source models have been released to cover code, general multimodal, and visual‑document retrieval scenarios.
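The retrieval step these encoders power is simple to state: embed the query and the candidate documents, then rank candidates by cosine similarity. A minimal, model‑free illustration follows, using toy vectors in place of real BGE embeddings (the arrays here are made‑up stand‑ins, not any model's output):

```python
import numpy as np

def cosine_ranking(query_vec, doc_vecs):
    """Rank documents by cosine similarity to the query vector."""
    q = query_vec / np.linalg.norm(query_vec)
    d = doc_vecs / np.linalg.norm(doc_vecs, axis=1, keepdims=True)
    scores = d @ q               # cosine similarity per document
    order = np.argsort(-scores)  # best match first
    return order, scores[order]

# Toy stand-ins for encoder output: 3 documents, 4-dim embeddings.
docs = np.array([[1.0, 0.0, 0.0, 0.0],
                 [0.0, 1.0, 0.0, 0.0],
                 [0.9, 0.1, 0.0, 0.0]])
query = np.array([1.0, 0.0, 0.0, 0.0])

order, scores = cosine_ranking(query, docs)
print(order)  # document 0 is an exact match, document 2 is close
```

In a real RAG pipeline the toy arrays would be replaced by the vectors a BGE encoder produces, and the ranking step stays exactly the same.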
BGE‑Code‑v1
This encoder is built on the Qwen2.5‑Coder‑1.5B backbone. Training combines the CoIR benchmark dataset with a large corpus of synthetic code‑text pairs under a curriculum‑learning schedule, and incorporates retrieval and semantic‑textual‑similarity (STS) data from BGE‑gemma2‑multilingual as auxiliary tasks. The model excels at code‑document search and cross‑language code retrieval, outperforming both commercial and open‑source baselines on CoIR and CodeRAG‑Bench.
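Because the backbone is a decoder‑style LLM, encoders of this kind commonly derive one embedding per input by pooling the hidden state of each sequence's last non‑padding token and L2‑normalizing it. The sketch below shows that general pattern on dummy tensors; it is an illustration of the technique, not the official BGE‑Code‑v1 implementation:

```python
import numpy as np

def last_token_pool(hidden_states, attention_mask):
    """Pool each sequence to the hidden state of its last real token.

    hidden_states: (batch, seq_len, dim) final-layer states.
    attention_mask: (batch, seq_len), 1 for real tokens, 0 for padding.
    """
    last_idx = attention_mask.sum(axis=1) - 1   # index of last real token
    batch_idx = np.arange(hidden_states.shape[0])
    emb = hidden_states[batch_idx, last_idx]    # (batch, dim)
    # L2-normalize so dot products become cosine similarities.
    return emb / np.linalg.norm(emb, axis=1, keepdims=True)

# Dummy batch: 2 sequences of length 4, hidden dim 3; second one is padded.
h = np.arange(2 * 4 * 3, dtype=float).reshape(2, 4, 3)
mask = np.array([[1, 1, 1, 1],
                 [1, 1, 0, 0]])
emb = last_token_pool(h, mask)
print(emb.shape)  # (2, 3)
```

With right‑padded batches this selects position 3 for the first sequence and position 1 for the second, so padding never leaks into the embedding.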
Model hub: https://huggingface.co/BAAI/bge-code-v1
GitHub repository: https://github.com/FlagOpen/FlagEmbedding/tree/master/research/BGE_Coder
Paper: https://arxiv.org/abs/2505.12697

BGE‑VL‑v1.5
The multimodal encoder is based on LLaVA‑1.6 (7.57 B parameters). It is trained on 3 M image‑caption pairs from the MegaPairs collection and an additional 1 M natural‑plus‑synthetic samples covering image captioning, visual‑question‑answering, and image classification. This curriculum yields strong zero‑shot performance on the MMEB benchmark and a fine‑tuned state‑of‑the‑art score of 72.16.
Model hub: https://huggingface.co/BAAI/BGE-VL-v1.5-zs
GitHub repository: https://github.com/FlagOpen/FlagEmbedding/tree/master/research/BGE_VL
Paper: https://arxiv.org/abs/2412.14475

BGE‑VL‑Screenshot
This visual‑document encoder derives from Qwen2.5‑VL‑3B‑Instruct. Training data consist of more than 13 M screenshots and 7 M caption‑question pairs collected from news sites, e‑commerce platforms, academic papers, and project homepages. Evaluation uses the newly introduced MVRB benchmark (20 datasets, 4 tasks), on which the model achieves a combined score of 60.61, establishing a new SOTA and demonstrating multilingual capability beyond English.
Model hub: https://huggingface.co/BAAI/BGE-VL-Screenshot
GitHub repository: https://github.com/FlagOpen/FlagEmbedding/tree/master/research/BGE_VL_Screenshot
Paper: https://arxiv.org/abs/2502.11431
MVRB leaderboard: https://huggingface.co/spaces/BAAI/MVRB_leaderboard

All three models are fully open‑source and provide a one‑stop solution for efficient vector representation and semantic search across code, text, and visual documents.