How Alibaba’s Proxima Engine Revolutionizes Vector Search for AI Applications

Alibaba’s Damo Academy unveils Proxima, a high‑performance vector search engine that powers e‑commerce, video, and payment services, detailing its core capabilities, large‑scale indexing, distributed construction, real‑time updates, and challenges such as algorithm diversity, scalability, and multi‑modal retrieval.

Alibaba Cloud Developer

Introduction

Artificial Intelligence (AI) is a core technology that enables computers to assist human work by learning from data using mathematics such as probability, statistics, and linear algebra. AI algorithms embed unstructured data (voice, images, video, text, behavior) into high‑dimensional vectors, and vector retrieval searches these vectors to find relevant entities.

Business Scenarios

1. Voice/Image/Video Retrieval: Traditional search indexes only metadata, whereas vector search works on the content itself. In image‑based search, for example, the query image is converted to a vector and matched against a pre‑built vector index.

2. Text Retrieval: Embedding text into vectors captures semantic similarity, so a query such as “浙一医院” (a common short form of the First Affiliated Hospital, Zhejiang University School of Medicine) can match the hospital's full official name even though the two share few keywords.

3. Search/Recommendation/Advertising: E‑commerce platforms use vector embeddings for item‑item, user‑item, and hybrid retrieval to find similar products quickly and personalize recommendations.

4. Broad AI Coverage: Vector search underlies many AI scenarios, from face recognition to multimodal search, and has become an indispensable component of modern AI pipelines.
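
The item‑item case in the scenarios above can be sketched in a few lines. This is a minimal illustration, not Proxima code: the three‑dimensional embeddings, the catalog entries, and the function names are all made up for the example, and cosine similarity is only one of several metrics a real system might use.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query, catalog, top_k=2):
    """Rank catalog entries by cosine similarity to the query, best first."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in catalog.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]

# Hypothetical 3-dimensional embeddings; real models emit hundreds of dimensions.
catalog = {
    "red sneaker": [0.9, 0.1, 0.0],
    "blue sneaker": [0.8, 0.2, 0.1],
    "coffee mug": [0.0, 0.1, 0.9],
}
query = [1.0, 0.0, 0.0]  # stands in for the embedding of a query image or item
print(most_similar(query, catalog))
```

The two sneakers rank ahead of the mug because their vectors point in nearly the same direction as the query, which is the property an embedding model is trained to produce for related items.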

Current Status and Challenges

Vector retrieval solves K‑Nearest Neighbor (KNN) and Radius Nearest Neighbor (RNN) problems, but exact solutions are costly at large scale, leading to Approximate Nearest Neighbor (ANN) techniques. Numerous algorithms exist, grouped into space‑partitioning (e.g., KD‑Tree), space‑encoding (e.g., LSH, PQ), and graph‑based (e.g., HNSW) methods.
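
To make the two problem statements concrete, here is a brute‑force sketch of exact KNN and radius search using Euclidean distance. The function names and sample points are illustrative assumptions. Each query compares against every stored vector, so the cost grows linearly with the collection size, which is the reason ANN methods trade a little accuracy for speed at large scale.

```python
import heapq
import math

def knn(query, vectors, k):
    """Exact k-nearest-neighbor search: indices of the k closest vectors."""
    distances = ((math.dist(query, v), i) for i, v in enumerate(vectors))
    return [i for _, i in heapq.nsmallest(k, distances)]

def radius_search(query, vectors, radius):
    """Exact radius search: indices of all vectors within `radius` of the query."""
    return [i for i, v in enumerate(vectors) if math.dist(query, v) <= radius]

points = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [5.0, 5.0]]
print(knn([0.1, 0.0], points, k=2))
print(radius_search([0.1, 0.0], points, radius=1.0))
```

KNN always returns exactly k results (when at least k vectors exist), while radius search returns however many vectors fall inside the given distance, possibly none.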

Key challenges include:

Precision and performance of ultra‑large indexes (billions of vectors).

Distributed index construction and retrieval, where excessive sharding hurts efficiency and fast index merging remains hard.

Streaming online updates: maintaining index consistency while supporting real‑time inserts, deletes, and queries.

Joint tag‑and‑vector retrieval: combining attribute filters with similarity search without degrading recall.

Adapting algorithms to diverse data distributions, dimensions, and latency requirements.
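
The sharding challenge in the list above is easier to see with a small scatter‑gather sketch. It assumes a layout in which every query is sent to every shard and the per‑shard results are merged into a global top‑k; the source does not describe Proxima's internals, so the layout, the function names, and the toy data are all assumptions.

```python
import heapq
import math
from itertools import islice

def search_shard(query, shard, k):
    """Exact top-k on one shard; returns (distance, id) pairs, nearest first."""
    scored = ((math.dist(query, vec), vec_id) for vec_id, vec in shard.items())
    return heapq.nsmallest(k, scored)

def search_all_shards(query, shards, k):
    """Query every shard, then merge the sorted partial results into a global top-k."""
    partials = [search_shard(query, shard, k) for shard in shards]
    merged = heapq.merge(*partials)  # each partial list is already sorted by distance
    return [vec_id for _, vec_id in islice(merged, k)]

shards = [
    {"a": [0.0, 0.0], "b": [3.0, 0.0]},
    {"c": [1.0, 0.0], "d": [9.0, 9.0]},
]
print(search_all_shards([0.0, 0.0], shards, k=2))
```

Because each shard must return its own k candidates before the merge, the total work per query grows with the number of shards, which is one way excessive sharding erodes efficiency.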

Proxima: Alibaba Damo Academy’s Vector Search Engine

Proxima is a general‑purpose vector search kernel developed in‑house at Damo Academy and deployed across Alibaba and Ant Group services (Taobao search, AntFace payment, Youku video search, Alibaba advertising, Hologres, Elasticsearch, MaxCompute, etc.). It supports multiple hardware platforms (ARM64, x86, GPU) and scales from edge devices to cloud servers, handling single‑shard indexes of billions of vectors with high accuracy.

Core Capabilities

Ultra‑large scale index construction and retrieval (billions of vectors per shard).

Horizontal index scaling via non‑equivalent sharding and graph‑based index merging.

High‑dimensional, high‑precision search with algorithm selection based on data characteristics.

Streaming real‑time online index building and dynamic updates.

Integrated tag‑plus‑vector retrieval for conditional similarity queries.

Heterogeneous computing: optimized batch offline search and low‑latency online inference on GPU.

Performance‑cost balance across diverse platforms.

Automatic scenario adaptation through hyper‑parameter tuning and composite indexing.
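
The tag‑plus‑vector capability listed above addresses a problem that a naive implementation runs into quickly. The sketch below contrasts two simple strategies on toy data; it is not how Proxima integrates the two steps, and the item records, tags, and function names are hypothetical.

```python
import heapq
import math

def post_filter(query, items, tag, k):
    """Take the global top-k by distance first, then drop items lacking the tag."""
    top = heapq.nsmallest(k, items, key=lambda it: math.dist(query, it["vec"]))
    return [it["id"] for it in top if tag in it["tags"]]

def pre_filter(query, items, tag, k):
    """Keep only items carrying the tag, then take the top-k among them."""
    allowed = [it for it in items if tag in it["tags"]]
    top = heapq.nsmallest(k, allowed, key=lambda it: math.dist(query, it["vec"]))
    return [it["id"] for it in top]

items = [
    {"id": 1, "vec": [0.0, 0.0], "tags": {"shoes"}},
    {"id": 2, "vec": [0.1, 0.0], "tags": {"shoes"}},
    {"id": 3, "vec": [0.5, 0.0], "tags": {"bags"}},
    {"id": 4, "vec": [2.0, 0.0], "tags": {"bags"}},
]
query = [0.0, 0.0]
print(post_filter(query, items, "bags", k=2))  # nearest two are both shoes
print(pre_filter(query, items, "bags", k=2))
```

Filtering after the search can return fewer than k matches when the nearest vectors carry the wrong tag, while filtering before it requires scanning or indexing by tag. Avoiding both costs is the motivation for handling the attribute condition and the similarity search together.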

Comparison with Faiss

Faiss, the open‑source engine from Facebook AI, is widely used but shows limitations in large‑scale, real‑time, and heterogeneous scenarios. In the benchmarks reported here, Proxima retrieves several times faster than Faiss while also achieving higher accuracy, builds a 200 M‑vector index in about one hour versus 15–45 hours for Faiss, and delivers lower GPU‑accelerated latency for small‑batch queries.
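
Accuracy comparisons between ANN engines are commonly expressed as recall@k: the share of the true nearest neighbors that the approximate search returns. The source does not say which metric its benchmarks used, so the following is only a sketch of that common convention, with made‑up result lists.

```python
def recall_at_k(approx_ids, exact_ids, k):
    """Fraction of the exact top-k that also appears in the approximate top-k."""
    exact = set(exact_ids[:k])
    return len(exact & set(approx_ids[:k])) / len(exact)

exact = [7, 3, 9, 1, 4]   # ground truth from a brute-force search
approx = [7, 9, 3, 8, 4]  # hypothetical output of an ANN index
print(recall_at_k(approx, exact, k=5))
```

Order within the top‑k does not affect this measure; only membership does. A fair speed comparison between two engines fixes recall at the same level before comparing latency or throughput.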

Future Outlook

As AI adoption grows and data volumes explode, vector search will continue to expand its multimodal capabilities. Future research must address mixed‑space, sparse‑space, ultra‑high‑dimensional, and universal similarity retrieval, while engineering efforts will focus on building systematic, adaptable systems that serve increasingly complex scenarios.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
