Redis Introduces Multi‑Threaded Query Engine to Boost Vector Search Performance
Redis has launched an enhanced, multi‑threaded query engine that dramatically increases throughput and reduces latency for vector similarity searches, enabling vertical scaling and better support for real‑time RAG applications while maintaining sub‑10 ms response times.
Redis, the popular in‑memory data‑structure store, announced a major upgrade to its query engine by adding multi‑threading capabilities. The change allows concurrent access to indexes, so a single instance can scale vertically and raise throughput for both standard Redis operations and queries.
The new engine is especially important as vector databases gain prominence in generative AI and Retrieval‑Augmented Generation (RAG) workloads, where handling billions of documents and complex queries can become a bottleneck.
Redis explains that with traditional single‑threaded processing, long‑running queries can cause congestion, particularly those that use inverted indexes, because they hold up the thread that serves every other request. By parallelising query execution across multiple threads, Redis reports up to a 16‑fold increase in query throughput while keeping average query latency below 10 ms and response times for standard Redis operations under a millisecond.
The architecture follows a three‑step process: the main thread prepares the query context and queues it; worker threads pull tasks from the shared queue and execute the query pipeline concurrently; results are then returned to the main thread for final delivery.
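The three steps above can be sketched as a shared task queue drained by a pool of worker threads. This is a minimal Python illustration of the hand-off only, not Redis's implementation: the names `run_query`, `worker`, and `execute_queries`, the dictionary used as an index, and the substring match that stands in for the query pipeline are all assumptions made for the example.

```python
import queue
import threading


def run_query(index, context):
    """Hypothetical query pipeline: ids of documents whose text contains the term."""
    return sorted(doc_id for doc_id, text in index.items() if context["term"] in text)


def worker(index, tasks, results):
    """Step 2: pull tasks from the shared queue and run the pipeline."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        query_id, context = item
        results.put((query_id, run_query(index, context)))


def execute_queries(index, terms, num_workers=4):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(index, tasks, results))
        for _ in range(num_workers)
    ]
    for t in threads:
        t.start()

    # Step 1: the main thread prepares each query context and queues it.
    for query_id, term in enumerate(terms):
        tasks.put((query_id, {"term": term}))
    for _ in threads:
        tasks.put(None)  # one sentinel per worker

    # Step 3: results come back to the main thread for final delivery.
    collected = {}
    for _ in terms:
        query_id, hits = results.get()
        collected[query_id] = hits
    for t in threads:
        t.join()
    return [collected[i] for i in range(len(terms))]


if __name__ == "__main__":
    docs = {1: "redis vector search", 2: "vector database", 3: "redis cache"}
    print(execute_queries(docs, ["redis", "vector"]))
```

In CPython the global interpreter lock keeps these threads from running Python code in parallel, so the sketch shows the queueing pattern rather than any speedup.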
Redis benchmarked the upgraded engine against three categories of vector‑database providers: pure vector stores, general‑purpose databases with vector features, and fully managed in‑memory Redis cloud services. Redis claims its engine outperforms pure vector databases in speed and scalability and surpasses the other two categories in overall performance.
The benchmarks used datasets such as gist‑960‑euclidean, glove‑100‑angular, deep‑image‑96‑angular, and dbpedia‑openai‑1M‑angular. They measured ingestion time with HNSW indexing and query performance on k‑NN searches, reporting requests per second (RPS) and average client latency.
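To make the reported metrics concrete, the sketch below runs k‑NN queries with the two distance types named in the dataset suffixes (Euclidean and angular) and derives RPS and average latency from the timings. It is an assumption-laden illustration, not the benchmark harness Redis used: it performs an exact brute-force scan rather than an HNSW search, runs queries sequentially from one client, and the function names `knn` and `benchmark` are hypothetical.

```python
import math
import time


def euclidean(a, b):
    """Straight-line distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def angular(a, b):
    """Cosine distance (1 - cosine similarity); assumes non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norms = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norms


def knn(vectors, query, k, distance):
    """Exact k-NN by brute force: indexes of the k closest vectors."""
    ranked = sorted(range(len(vectors)), key=lambda i: distance(vectors[i], query))
    return ranked[:k]


def benchmark(vectors, queries, k, distance):
    """Run queries one after another and report RPS and average latency."""
    latencies = []
    start = time.perf_counter()
    for q in queries:
        t0 = time.perf_counter()
        knn(vectors, q, k, distance)
        latencies.append(time.perf_counter() - t0)
    elapsed = time.perf_counter() - start
    return {
        "rps": len(queries) / elapsed,
        "avg_latency_ms": 1000 * sum(latencies) / len(latencies),
    }


if __name__ == "__main__":
    data = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
    print(knn(data, [0.9, 0.1], 2, euclidean))
    print(benchmark(data, [[0.9, 0.1], [4.0, 4.0]], 2, euclidean))
```

A real benchmark would issue queries from many concurrent clients, which is where a multi-threaded engine would be expected to show higher RPS at similar latency.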
The upgraded query engine is already available in Redis Software and will be released for Redis Cloud in the fall, offering developers a more integrated and efficient solution for AI‑driven retrieval tasks.
Architecture Digest
Focusing on Java backend development, covering application architecture from top-tier internet companies (high availability, high performance, high stability), big data, machine learning, Java architecture, and other popular fields.