Design and Implementation of a Faiss‑Based Vector Search Platform
The article describes the design, architecture, and key components of a vector search platform built on Faiss that supports full‑index construction, incremental and distributed indexing, online retrieval, city‑level search, and vector update/delete operations to meet large‑scale AI application needs.
Vector retrieval is widely used in AI scenarios such as recommendation recall, question‑answering, and image/video search, where similarity between user or query vectors and stored item vectors is computed.
To satisfy business requirements for vector search, a platform was developed and launched, leveraging the Faiss library to provide full‑index building, real‑time incremental indexing, and online similarity search, thereby reducing learning cost and improving development efficiency.
The Overall Architecture consists of four parts: a web UI for visual management of permissions, tasks, resources, and indexes; SCF (Service Communication Framework) RPC services that handle distributed indexing and query requests; a logic layer that implements full and incremental index construction as well as online search; and a storage layer that uses MySQL for metadata, HDFS/WFS for raw data, WOS for index files, and Redis for incremental updates.
Full Index Construction supports three representative Faiss index types: IndexFlatL2 (brute‑force Euclidean), IndexIVFFlat (inverted file with clustering, supporting Euclidean and inner‑product), and IndexIVFPQ (inverted file with product quantization), with optional distributed building via consistent hashing and data sharding.
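To make the difference between the flat and inverted-file index types concrete, here is a minimal pure-Python sketch. It does not call Faiss; the function names, the toy vectors, and the hand-picked centroids are all assumptions for illustration (a real IVF index learns its centroids during training). `flat_search` mirrors the exhaustive scan behind IndexFlatL2, and `ivf_search` mirrors the idea behind IndexIVFFlat: only the lists nearest to the query are scanned. Product quantization, as used by IndexIVFPQ, is not shown.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def flat_search(vectors, query, k):
    """Exhaustive scan: score every stored vector, keep the k nearest."""
    scored = ((squared_l2(v, query), i) for i, v in enumerate(vectors))
    return heapq.nsmallest(k, scored)


def build_inverted_lists(vectors, centroids):
    """Assign each vector position to the list of its nearest centroid."""
    lists = {c: [] for c in range(len(centroids))}
    for i, v in enumerate(vectors):
        nearest = min(lists, key=lambda c: squared_l2(v, centroids[c]))
        lists[nearest].append(i)
    return lists


def ivf_search(vectors, centroids, lists, query, k, nprobe=1):
    """Scan only the nprobe inverted lists whose centroids are closest to the query."""
    probed = heapq.nsmallest(
        nprobe, lists, key=lambda c: squared_l2(query, centroids[c])
    )
    scored = ((squared_l2(vectors[i], query), i) for c in probed for i in lists[c])
    return heapq.nsmallest(k, scored)


# Toy data; the centroids are hand-picked here, whereas a real IVF index learns them.
vectors = [[0.0, 0.0], [1.0, 1.0], [3.0, 4.0], [0.5, 0.0]]
centroids = [[0.5, 0.5], [3.0, 4.0]]
lists = build_inverted_lists(vectors, centroids)

exact = flat_search(vectors, [3.0, 3.0], k=2)
approx = ivf_search(vectors, centroids, lists, [3.0, 3.0], k=2, nprobe=1)
```

With `nprobe=1` the inverted-file search visits a single list and can miss a true neighbor that lives in another list. Raising `nprobe` trades speed for recall, which is the core tuning decision for the IVF family.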
Incremental Index Construction allows new vectors to be added in real time: incoming vectors are recorded in Redis, incremental nodes load the existing full index, periodically read pending updates from Redis, merge them into the index, and persist the updated index to WFS/WOS for online use.
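The record-then-merge flow described above can be sketched as follows. This is a simplified illustration, not the platform's code: an in-memory `deque` stands in for the Redis queue, a plain dict stands in for the loaded Faiss index, and the class and method names are hypothetical. Persisting the merged index to WFS/WOS is left as a comment.

```python
from collections import deque


class IncrementalNode:
    """Toy incremental node: a loaded full index plus a queue of pending writes."""

    def __init__(self, full_index):
        self.index = dict(full_index)  # vector_id -> vector (stand-in for the full index)
        self.pending = deque()         # stand-in for the pending updates kept in Redis

    def record(self, vector_id, vector):
        """Write path: remember the new vector without touching the index yet."""
        self.pending.append((vector_id, vector))

    def merge_pending(self):
        """Periodic job: drain the queue into the index, return how many were merged."""
        merged = 0
        while self.pending:
            vector_id, vector = self.pending.popleft()
            self.index[vector_id] = vector
            merged += 1
        # A real node would now persist the updated index for online use.
        return merged


node = IncrementalNode({1: [0.0, 0.0]})
node.record(2, [1.0, 1.0])
node.record(3, [2.0, 2.0])
first_merge = node.merge_pending()
second_merge = node.merge_pending()
```

Separating the write path from the merge step keeps inserts cheap and lets the merge interval control how fresh the searchable index is.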
Distributed Index Construction splits the original vector dataset into non‑overlapping shards, builds an independent index for each shard, and stores the resulting index files in WFS/WOS, thus reducing per‑node memory usage and improving build efficiency.
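One way to produce non-overlapping shards is a consistent hash ring, which the article names as the mechanism for distributed builds. The sketch below is an assumption about how such a split could look: the shard names, the number of virtual nodes, and the choice of SHA-256 are illustrative, not taken from the platform.

```python
import bisect
import hashlib


def _hash(key):
    """Map a string to a large integer position on the ring."""
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent hash ring with virtual nodes to even out the shard sizes."""

    def __init__(self, shards, virtual_nodes=64):
        self._ring = sorted(
            (_hash(f"{shard}#{v}"), shard)
            for shard in shards
            for v in range(virtual_nodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, vector_id):
        """Return the shard owning the first ring point after the id's hash."""
        pos = bisect.bisect_right(self._points, _hash(str(vector_id)))
        return self._ring[pos % len(self._ring)][1]


def split(vector_ids, shards):
    """Partition ids into disjoint per-shard lists; each id lands in exactly one."""
    ring = HashRing(shards)
    parts = {shard: [] for shard in shards}
    for vector_id in vector_ids:
        parts[ring.shard_for(vector_id)].append(vector_id)
    return parts


shard_names = ["shard-0", "shard-1", "shard-2"]  # hypothetical shard names
parts = split(range(1000), shard_names)
```

Each shard's id list can then be used to build an independent index. Because the assignment depends only on the id and the ring, adding a shard moves only the ids that fall on the new shard's ring segments.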
Online Retrieval comprises SCF services for request forwarding and load balancing and a logic layer exposing Faiss search and incremental APIs via gRPC, managed by a Kubernetes cluster. Load balancing uses dynamic weighted round‑robin based on node health.
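The article does not spell out the load-balancing algorithm, so the following is only a sketch of one common variant, smooth weighted round-robin, with weights that a health checker could adjust at runtime. The class name, the node names, and the convention that a weight of 0 means "unhealthy" are assumptions.

```python
class WeightedRoundRobin:
    """Smooth weighted round-robin over nodes whose weights can change at runtime."""

    def __init__(self, weights):
        self.weights = dict(weights)                # node -> configured weight
        self.current = {node: 0 for node in weights}  # node -> running score

    def set_weight(self, node, weight):
        """Called by a health checker; weight 0 takes the node out of rotation."""
        self.weights[node] = weight
        self.current.setdefault(node, 0)

    def pick(self):
        """Raise every healthy node's score by its weight, pick the highest."""
        healthy = {n: w for n, w in self.weights.items() if w > 0}
        if not healthy:
            raise RuntimeError("no healthy nodes available")
        total = sum(healthy.values())
        for node, weight in healthy.items():
            self.current[node] += weight
        best = max(healthy, key=lambda n: self.current[n])
        self.current[best] -= total
        return best


balancer = WeightedRoundRobin({"node-a": 2, "node-b": 1})
first_six = [balancer.pick() for _ in range(6)]

# Simulate a failed health check on node-b.
balancer.set_weight("node-b", 0)
after_failure = [balancer.pick() for _ in range(3)]
```

The smooth variant interleaves nodes rather than sending bursts to the heaviest one, so a node with twice the weight receives twice the traffic without back-to-back spikes.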
Distributed Retrieval adopts a shard‑merge approach: each sub‑index is deployed with multiple replicas, SCF forwards queries to one replica per shard, aggregates and re‑ranks results, and returns a unified response, with per‑shard load balancing.
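The scatter-gather step can be sketched in a few lines. This is an illustration under stated assumptions: replicas are modeled as plain callables, replica choice is random rather than the platform's load balancer, and results are distances where smaller is better (for inner-product indexes the ordering would be reversed).

```python
import heapq
import itertools
import random


def search_shards(shards, query, k, choose=random.choice):
    """Query one replica per shard, then merge the partial results.

    shards maps a shard name to a list of replica callables; each replica
    takes (query, k) and returns (distance, vector_id) pairs sorted ascending.
    """
    partial = [choose(replicas)(query, k) for replicas in shards.values()]
    merged = heapq.merge(*partial)  # valid because every input is already sorted
    return list(itertools.islice(merged, k))


def make_replica(hits):
    """Hypothetical stand-in for a remote search node with fixed results."""
    return lambda query, k: hits[:k]


shards = {
    "shard-0": [make_replica([(0.1, 10), (0.7, 11)])] * 2,
    "shard-1": [make_replica([(0.3, 20), (0.4, 21)])] * 2,
}
top = search_shards(shards, query=[0.0], k=3)
```

Each shard must return its own top k, not k divided by the shard count, because all of the global top k could sit in a single shard.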
City‑Level Retrieval creates separate indexes for each city ID, packages them together, and at query time selects the appropriate city index based on the request’s city identifier.
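Routing by city identifier amounts to a lookup in front of the search call. In the sketch below, a dict of brute-force indexes stands in for the packaged per-city Faiss indexes; the class name, the integer city IDs, and the choice to return an empty result for an unknown city are assumptions.

```python
import heapq


class CityIndexes:
    """One independent index per city ID, selected at query time."""

    def __init__(self):
        self._by_city = {}  # city_id -> {vector_id: vector}

    def add(self, city_id, vector_id, vector):
        """Place the vector in its city's index, creating the index on first use."""
        self._by_city.setdefault(city_id, {})[vector_id] = vector

    def search(self, city_id, query, k):
        """Search only the index that belongs to the request's city."""
        index = self._by_city.get(city_id)
        if index is None:
            return []  # unknown city: nothing to search
        scored = (
            (sum((x - y) ** 2 for x, y in zip(vec, query)), vector_id)
            for vector_id, vec in index.items()
        )
        return heapq.nsmallest(k, scored)


bundle = CityIndexes()
bundle.add(1, 100, [0.0, 0.0])
bundle.add(1, 101, [1.0, 0.0])
bundle.add(2, 200, [0.0, 0.0])

city_one = bundle.search(1, [0.0, 0.0], k=5)
city_two = bundle.search(2, [0.0, 0.0], k=5)
```

Keeping the indexes separate means a query never scores vectors from other cities, so no post-filtering by city is needed.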
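Vector Update and Deletion reuses the incremental path for updates: changed vectors are written to Redis and merged into the index in the same way as new ones. Deletions are handled by marking the vector IDs in Redis and filtering them out of search results, rather than removing them from the Faiss index in place, since in-place removal in Faiss is limited (it is not available for every index type and can be slow on large indexes).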
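Filtering deleted IDs after the search raises a practical question: if some of the top k hits are deleted, fewer than k results remain. A hedged sketch of one way to handle this is to over-fetch and retry. The function name, the growth factor, and the `search_fn` signature are assumptions, and the set of deleted IDs stands in for the markers kept in Redis.

```python
def search_with_tombstones(search_fn, query, k, deleted_ids, growth=2):
    """Return up to k live hits, fetching more candidates when deleted ones crowd them out.

    search_fn(query, n) must return up to n (distance, vector_id) pairs, nearest first.
    """
    if growth < 2:
        raise ValueError("growth must be at least 2 so the fetch size increases")
    fetch = k
    while True:
        hits = search_fn(query, fetch)
        live = [(dist, vid) for dist, vid in hits if vid not in deleted_ids]
        # Stop once there are enough live hits or the index has nothing more to give.
        if len(live) >= k or len(hits) < fetch:
            return live[:k]
        fetch *= growth


# Hypothetical stand-in for an index search that returns pre-sorted hits.
data = [(0.1, 1), (0.2, 2), (0.3, 3), (0.4, 4), (0.5, 5)]


def fake_search(query, n):
    return data[:n]


some_deleted = search_with_tombstones(fake_search, [0.0], k=2, deleted_ids={1, 2})
most_deleted = search_with_tombstones(fake_search, [0.0], k=2, deleted_ids={1, 2, 3, 4})
```

The longer deleted vectors stay in the index, the more over-fetching is needed, so a periodic full rebuild that drops them keeps this filter cheap.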
In conclusion, the platform provides comprehensive vector search capabilities, so AI algorithm engineers can focus on model development instead of retrieval infrastructure. Future work includes integrating more retrieval models, enhancing debugging tools, and supporting additional algorithm libraries.
58 Tech
Official tech channel of 58, a platform for tech innovation, sharing, and communication.