How Redis’s New Multithreaded Query Engine Boosts Vector Search for Real‑Time AI Apps
Redis has introduced a multithreaded query engine that dramatically lowers latency and multiplies throughput for vector‑based retrieval, enabling real‑time RAG applications to approach the 100 ms response target while scaling vertically to billions of documents.
Redis, the popular in‑memory database, has upgraded its query engine to meet the growing demands of Retrieval‑Augmented Generation (RAG) and vector‑database workloads.
The upgrade adds multithreaded query execution, keeping average latency under 10 ms while significantly increasing throughput. By allowing multiple queries to access the index concurrently, Redis achieves vertical scaling that handles billions of documents without becoming a performance bottleneck.
In traditional single‑threaded Redis, long‑running search queries become a bottleneck: unlike O(1) key lookups, a search over an inverted index performs multiple O(log n) index scans, so queries pile up behind the single thread and congest every other operation.
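To make the cost gap concrete, here is a minimal sketch (not Redis code; the posting list and `contains` helper are illustrative) contrasting a constant‑time hash lookup with the binary‑search probes an inverted index performs per query term:

```python
import bisect

# Plain key-value access: a single O(1) hash lookup per GET.
store = {"doc:1": "hello", "doc:2": "world"}
value = store["doc:2"]  # constant time

# Search over an inverted index: each queried term requires an
# O(log n) probe into a sorted posting structure, and one query
# typically touches several terms before intersecting results.
sorted_doc_ids = [3, 8, 15, 42, 99, 120]  # posting list for one term

def contains(doc_id):
    # Binary search: O(log n) per probe instead of O(1).
    i = bisect.bisect_left(sorted_doc_ids, doc_id)
    return i < len(sorted_doc_ids) and sorted_doc_ids[i] == doc_id

print(contains(42))  # True
print(contains(50))  # False
```

A single slow query of this shape holds the one thread hostage, which is exactly the congestion the new architecture removes.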
The new architecture solves this by letting several worker threads execute queries in parallel while the main thread continues to serve other Redis operations.
1. The main thread prepares the query context and places it into a shared queue.
2. Worker threads pull tasks from the queue and execute the query pipeline concurrently, greatly increasing throughput.
3. After execution, results are returned to the main thread, which aggregates them and sends the final response to the client.
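The steps above can be sketched with Python's standard threading primitives. This is a simplified model of the queue‑and‑workers pattern, not Redis's actual C implementation; names such as `execute_query` are placeholders:

```python
import queue
import threading

task_queue = queue.Queue()   # shared queue the main thread fills
results = {}
results_lock = threading.Lock()

def execute_query(query_ctx):
    # Stand-in for the real query pipeline (parse, scan index, score).
    return f"results for {query_ctx}"

def worker():
    while True:
        query_ctx = task_queue.get()
        if query_ctx is None:            # sentinel: shut this worker down
            task_queue.task_done()
            break
        result = execute_query(query_ctx)
        with results_lock:
            results[query_ctx] = result  # handed back for aggregation
        task_queue.task_done()

# Main thread: start workers, enqueue query contexts, collect results.
workers = [threading.Thread(target=worker) for _ in range(4)]
for w in workers:
    w.start()
for q in ["query-1", "query-2", "query-3"]:
    task_queue.put(q)
task_queue.join()                        # wait for all queries to finish
for _ in workers:
    task_queue.put(None)
for w in workers:
    w.join()

print(results["query-2"])  # results for query-2
```

Because the workers only read the index and hand results back through the queue, the main thread stays free to serve ordinary Redis commands while queries run.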
Extensive benchmarks compare Redis with pure vector databases, general‑purpose databases that support vectors, and managed Redis cloud services. Using datasets such as gist‑960‑euclidean, glove‑100‑angular, deep‑image‑96‑angular, and dbpedia‑openai‑1M‑angular, and tools like Qdrant’s vector‑db‑benchmark, Redis outperforms competitors in speed and scalability across ingestion (HNSW, ANN) and k‑NN search workloads.
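For intuition about the k‑NN workload these benchmarks exercise, here is a brute‑force exact nearest‑neighbour search, the baseline that ANN indexes such as HNSW approximate (a toy sketch with made‑up two‑dimensional vectors, not benchmark code):

```python
import math

def knn(query, vectors, k):
    """Exact k-nearest-neighbour search by Euclidean distance --
    O(n) per query, which is why ANN indexes like HNSW exist."""
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
    scored = sorted(vectors.items(), key=lambda kv: dist(query, kv[1]))
    return [doc_id for doc_id, _ in scored[:k]]

corpus = {
    "doc:a": [0.0, 0.0],
    "doc:b": [1.0, 1.0],
    "doc:c": [0.1, 0.0],
}
print(knn([0.0, 0.1], corpus, k=2))  # ['doc:a', 'doc:c']
```

Real benchmarks replace this linear scan with an HNSW graph over million‑scale, high‑dimensional embeddings, where index quality and threading determine both recall and throughput.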
The upgrade arrives as the vector‑database market expands, but experts warn that a plethora of options can overwhelm users. Redis’s approach aligns with the view that vector databases are just one layer of the AI stack, and enhancing existing infrastructure offers a more integrated solution.
Performance gains of up to 16× in query throughput make the new engine especially suitable for real‑time RAG scenarios, helping developers meet the “100 ms rule” for responsive AI applications.
Original English article: https://www.infoq.com/news/2024/07/redis-vector-database-genai-rag/
macrozheng
Dedicated to Java tech sharing and dissecting top open-source projects. Topics include Spring Boot, Spring Cloud, Docker, Kubernetes and more. Author’s GitHub project “mall” has 50K+ stars.