
Engineering Practices and Evolution of Douyin’s Cloud‑Native Vector Database

This article outlines Douyin’s step‑by‑step engineering evolution of its cloud‑native vector database, covering the background of vector search, core concepts, algorithmic optimizations, storage‑compute separation, streaming updates, multi‑tenant orchestration, and future applications such as large language model integration.

DataFunTalk

With the widespread adoption of deep learning, embedding‑based retrieval has become a common requirement, prompting Douyin to develop a dedicated vector database to handle large‑scale unstructured data.

Background of Vector Databases – Traditional inverted‑index methods (BM25, TF‑IDF) struggle with semantic search and multimodal data. Embedding models (doc2vec, BERT, LLMs) transform text, images, and video into vectors, turning retrieval into an approximate nearest‑neighbor problem.

Core Concepts of Vector Retrieval – Similarity is measured by Euclidean distance, inner product, or cosine similarity; results are limited by a configurable top‑K; and performance is balanced against accuracy. Common solutions include ANN algorithms (HNSW, IVF), quantization (PQ, scalar quantization), and low‑level optimizations (SIMD, cache‑friendly memory layout).
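As a baseline for these concepts, brute‑force top‑K search by cosine similarity can be sketched in a few lines of Python (toy 2‑D data; real engines replace this linear scan with ANN indexes such as HNSW or IVF precisely because it costs O(N·d) per query):

```python
import heapq
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def top_k(query, vectors, k):
    """Exact (brute-force) top-K search: score every vector, keep the
    k best.  ANN indexes trade a little recall for far better speed."""
    scored = ((cosine_similarity(query, v), i) for i, v in enumerate(vectors))
    return heapq.nlargest(k, scored)

# Toy corpus of 2-D "embeddings".
corpus = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7], [-1.0, 0.0]]
result = top_k([1.0, 0.1], corpus, 2)  # list of (similarity, index) pairs
```

Swapping the scoring function for negative Euclidean distance or raw inner product changes the metric without touching the top‑K machinery, which is why engines expose the metric as a configuration knob.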

Douyin’s Engineering Optimizations

Optimized open‑source HNSW and built a proprietary IVF algorithm, improving performance without sacrificing accuracy.
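The article does not detail the proprietary IVF design, but the general inverted‑file idea can be illustrated with a toy sketch; the hand‑picked centroids below stand in for the k‑means‑trained coarse quantizer used in practice:

```python
import math

def l2(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

class TinyIVF:
    """Minimal inverted-file (IVF) index sketch: vectors are bucketed
    by their nearest coarse centroid; a query scans only the nprobe
    closest buckets instead of the whole collection."""

    def __init__(self, centroids):
        self.centroids = centroids
        self.lists = {i: [] for i in range(len(centroids))}

    def add(self, idx, vec):
        cell = min(range(len(self.centroids)),
                   key=lambda c: l2(vec, self.centroids[c]))
        self.lists[cell].append((idx, vec))

    def search(self, query, k=1, nprobe=1):
        # Probe the nprobe nearest cells, then rank their members exactly.
        cells = sorted(range(len(self.centroids)),
                       key=lambda c: l2(query, self.centroids[c]))[:nprobe]
        candidates = [item for c in cells for item in self.lists[c]]
        candidates.sort(key=lambda iv: l2(query, iv[1]))
        return [idx for idx, _ in candidates[:k]]

ivf = TinyIVF(centroids=[[0.0, 0.0], [10.0, 10.0]])
for i, v in enumerate([[0.5, 0.2], [9.5, 10.1], [0.1, 0.9]]):
    ivf.add(i, v)
hits = ivf.search([0.4, 0.4], k=2, nprobe=1)
```

Raising `nprobe` recovers recall at the cost of scanning more cells, which is the core accuracy/performance dial of any IVF variant.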

Developed a custom scalar quantization supporting int16, int8, and int4, enabling retrieval of 200 M vectors on a single T4 GPU.
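The custom quantizer's internals are not published; the sketch below shows only the general idea of symmetric int8 scalar quantization, with the scale derived from the vector's largest magnitude (an illustrative assumption, not Douyin's actual scheme):

```python
def quantize_int8(vec):
    """Symmetric scalar quantization to int8: map each float into
    [-127, 127] using the vector's max absolute value as the scale.
    The same recipe extends to int16 (more precision) or int4 (more
    compression) by changing the range bound."""
    scale = (max(abs(x) for x in vec) / 127.0) or 1.0  # guard zero vector
    q = [round(x / scale) for x in vec]
    return q, scale

def dequantize(q, scale):
    return [x * scale for x in q]

v = [0.12, -0.5, 0.33, 0.9]
q, s = quantize_int8(v)
restored = dequantize(q, s)
err = max(abs(a - b) for a, b in zip(v, restored))
```

Storing 1 byte instead of 4 per dimension is what makes figures like 200 M vectors on a single GPU plausible: memory shrinks 4x while distance computations stay cheap integer arithmetic.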

Implemented SIMD‑based acceleration and memory‑layout tuning for higher throughput.

From Retrieval Algorithms to a Full‑Featured Vector Database – Integrated storage, search, and analytics APIs, ensuring high availability, performance, and ease of use.

Technical Evolution

Hybrid vector‑scalar filtering: post‑filtering (run the vector search with an expanded top‑K, then discard results that fail the scalar predicate) and pre‑filtering (apply a DSL filter to narrow the candidate set before the vector search) to handle structured‑data constraints.
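A minimal sketch of the two strategies, using toy records and a stand‑in `ann` function in place of a real vector index:

```python
def post_filter(search_fn, predicate, k, expand=4):
    """Post-filtering: over-fetch expand*k nearest neighbours, then
    drop results failing the scalar predicate.  May return fewer
    than k hits when the filter is very selective."""
    hits = search_fn(expand * k)
    return [h for h in hits if predicate(h)][:k]

def pre_filter(items, predicate, score_fn, k):
    """Pre-filtering: apply the scalar predicate first, then rank
    only the survivors by vector similarity."""
    allowed = [it for it in items if predicate(it)]
    return sorted(allowed, key=score_fn)[:k]

# Toy records: (id, category, distance-to-query).
records = [(1, "video", 0.1), (2, "image", 0.2),
           (3, "video", 0.3), (4, "image", 0.4)]

only_video = lambda r: r[1] == "video"
by_distance = lambda r: r[2]

ann = lambda n: sorted(records, key=by_distance)[:n]  # stand-in ANN search
a = post_filter(ann, only_video, k=2)
b = pre_filter(records, only_video, by_distance, k=2)
```

Post‑filtering keeps the fast ANN path but wastes work on rejected hits; pre‑filtering guarantees k valid results but forces an exact scan over the filtered subset, which is why a cost‑based choice between the two is worth building.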

Built a DSL‑directed engine that performs low‑cost DSL filtering alongside vector similarity, supports early termination, and provides execution‑plan optimization.

Storage‑Compute Separation – Shifted from an all‑in‑one architecture to a three‑component design: vector storage cluster, batch index‑building cluster, and online search service. Benefits include reduced index‑build resources, full CPU utilization during batch jobs, and improved online stability.

Streaming Updates – Introduced a dual‑buffer design for concurrent safe index updates (both HNSW and IVF), achieving sub‑second latency for insert/delete/modify operations.
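A much‑simplified illustration of the dual‑buffer idea in Python (a plain dict stands in for the HNSW/IVF index; the real engine swaps index structures, not dicts):

```python
import threading

class DoubleBufferedIndex:
    """Double-buffer sketch for lock-light streaming updates: readers
    always see a consistent buffer; writers build the next version on
    a standby copy and publish it with one atomic reference swap."""

    def __init__(self, vectors=None):
        self._active = dict(vectors or {})
        self._lock = threading.Lock()  # serialises writers only

    def search(self, key):
        # Readers need no lock: the swap below replaces the whole
        # reference atomically, so they see either old or new state.
        return self._active.get(key)

    def apply_updates(self, upserts, deletes=()):
        with self._lock:
            standby = dict(self._active)   # rebuild on the standby copy
            standby.update(upserts)
            for key in deletes:
                standby.pop(key, None)
            self._active = standby         # atomic pointer swap

idx = DoubleBufferedIndex({"a": [1.0, 0.0]})
idx.apply_updates({"b": [0.0, 1.0]}, deletes=["a"])
```

Because readers never touch a buffer that is being mutated, inserts, deletes, and modifications can be folded in continuously without pausing search traffic, which is what makes sub‑second visibility feasible.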

Cloud‑Native Transformation – Migrated to a multi‑tenant, cloud‑native framework with automated scheduling, slot‑based index allocation, and fine‑grained cost monitoring, enabling elastic scaling and resource‑efficient operation.

Application Outlook

Enhance large language models (LLMs) with long‑term memory, domain‑specific knowledge, and real‑time information via vector retrieval.

Mitigate LLM security risks by enforcing per‑user knowledge‑base permissions through vector database access controls.
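Combining the two points above, a hypothetical retrieval step might rank only a user's permitted documents and hand the top hits to an LLM as context; `store`, `retrieve`, and the record layout here are all illustrative assumptions, not a real VikingDB API:

```python
def retrieve(store, user, query_vec, k=2):
    """Permission-aware retrieval sketch for LLM augmentation.
    `store` maps doc_id -> (allowed_users, embedding, text); documents
    the user may not read are excluded before similarity ranking."""
    def sq_dist(v):
        return sum((a - b) ** 2 for a, b in zip(v, query_vec))
    visible = [(sq_dist(vec), text) for users, vec, text in store.values()
               if user in users]
    visible.sort()
    return [text for _, text in visible[:k]]

store = {
    "d1": ({"alice"}, [1.0, 0.0], "internal roadmap"),
    "d2": ({"alice", "bob"}, [0.9, 0.1], "public FAQ"),
}
context = retrieve(store, "bob", [1.0, 0.0], k=2)
prompt = "Answer using only this context:\n" + "\n".join(context)
```

Enforcing the permission check inside the retrieval layer, rather than in the prompt, means the model never sees text the user is not entitled to, which is the access‑control property described above.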

Douyin’s cloud‑native vector database (VikingDB) now powers numerous internal services and is being offered on Volcano Engine, delivering sub‑10 ms latency for billions of vectors, real‑time updates, and robust multi‑tenant support.

Tags: cloud native, vector database, large language model, search, ANN, Douyin
Written by

DataFunTalk

Dedicated to sharing and discussing big data and AI technology applications, aiming to empower a million data scientists. Regularly hosts live tech talks and curates articles on big data, recommendation/search algorithms, advertising algorithms, NLP, intelligent risk control, autonomous driving, and machine learning/deep learning.
