How Baidu’s PUCK Dominated the First BigANN Vector Search Competition
The inaugural BigANN competition, organized by NeurIPS, showcased large‑scale ANN research, and Baidu's self‑developed PUCK algorithm secured top scores across all four tracks by leveraging multi‑layer quantization, two‑level inverted indexing, and extensive system‑level optimizations.
BigANN Competition Overview
The first International Vector Retrieval Competition, BigANN, was launched by the premier AI conference NeurIPS to accelerate research and production adoption of large‑scale Approximate Nearest Neighbor (ANN) methods. Although it was the inaugural event, NeurIPS’s reputation attracted leading companies and top universities, and the results were announced at NeurIPS 2021.
Competition Tracks and Evaluation
BigANN comprises three tracks: (1) in‑memory retrieval, (2) disk‑based retrieval, and (3) non‑standard hardware (GPU) retrieval. Baidu’s PUCK participated in Track T1, which measures recall at a fixed query rate of 10,000 queries per second on a 32‑vCPU machine. The evaluation used six billion‑scale datasets, including public sets from Facebook, Microsoft Turing, Microsoft Bing, and four new datasets released by Yandex for the competition.
State of ANN Research
Recent years have seen a surge of ANN innovations, ranging from partition‑based and graph‑based indexing to hybrid RAM‑SSD storage, accelerator hardware, and learned dimensionality reduction. Prominent open‑source solutions include Spotify’s ANNOY, Google’s ScaNN, Facebook’s Faiss, and HNSW. However, most empirical studies focus on datasets around one million points, leaving a gap in understanding performance at true billion‑scale workloads.
PUCK Architecture and Optimizations
PUCK, Baidu’s internally developed ANN algorithm, has been iterated for years and is deployed in dozens of Baidu products, handling trillions of items and billions of daily queries. Its key technical innovations are:
Two‑level inverted index that sensitively captures data distribution, enabling fine‑grained sub‑space partitioning and reducing search scope; shared secondary clustering centers also cut training time.
Heuristic iterative training that continuously refines first‑ and second‑level cluster centers via equivalent space transformations, yielding a more accurate data distribution model.
Multi‑level quantization: a coarse large‑scale quantizer quickly selects candidate vectors, followed by a finer quantizer for re‑ranking.
Aggressive pruning at every retrieval stage, using multiple loss‑function‑derived formulas to minimize online computation and latency.
Strict cache‑line alignment and compact memory layout to reduce cache misses.
Support for large‑scale quantization, including non‑uniform quantization, to handle high‑dimensional features with minimal loss.
Additional Functional Extensions
Real‑time insertion with lock‑free structures for immediate index updates.
Distributed index building via map‑reduce, allowing the whole index to be constructed across multiple nodes without manual sharding.
Conditional queries that filter results during the retrieval process, addressing truncation issues in multi‑stage recall merging.
Adaptive parameter selection that provides sensible defaults for most workloads, lowering the barrier for users unfamiliar with ANN tuning.
References
Competition website: https://big-ann-benchmarks.com/
Result repository: https://github.com/harsha-simhadri/big-ann-benchmarks/tree/main/t1_t2
Signed-in readers can open the original source through BestHub's protected redirect.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contactand we will review it promptly.
How this landed with the community
Was this worth your time?
0 Comments
Thoughtful readers leave field notes, pushback, and hard-won operational detail here.
