Accelerating AliSQL Vector Search with Nodes Cache and SIMD
AliSQL 8.0 introduces a shared Nodes Cache and per‑transaction cache to speed up vector queries, implements RC‑level transaction isolation for read‑only and read‑write operations, and leverages SIMD‑based pre‑computation to dramatically improve high‑dimensional vector distance calculations and concurrency performance.
Introduction
AliSQL 8.0 (release 20251031) adds a series of optimizations for vector indexes, aiming to meet production‑grade performance and reliability requirements. The enhancements focus on in‑memory node caching, transaction‑level isolation, concurrency control, and computational acceleration.
Nodes Cache Design
Two cache layers are introduced: a shared public cache (MHNSW Share) for read‑only transactions and a per‑transaction cache (MHNSW Trx) for read‑write transactions. The public cache reduces repeated loading of vector nodes, while the transaction cache isolates write‑side modifications until commit.
Public Cache (MHNSW Share)
Attached to the TABLE_SHARE of the auxiliary table.
Provides read‑only access to vector nodes.
Reduces duplicate node loads by sharing cached nodes across sessions.
Transaction Cache (MHNSW Trx)
Derived from MHNSW Share and attached to the session via thd_set_ha_data.
Each read‑write transaction creates an independent instance.
Caches nodes accessed or modified by the transaction, preventing pollution of the public cache.
Updates the public cache only at commit time.
Transaction Isolation (RC Level)
AliSQL supports the Read‑Committed (RC) isolation level for vector operations. The isolation is achieved by separating the access paths of read‑only and read‑write transactions.
Read‑Only Transaction: Executes the HNSW query algorithm, first consulting the public cache. If a node is missing, it is loaded from InnoDB with RC visibility. Subsequent reads of the same node reuse the cached copy, improving query throughput.
Read‑Write Transaction: Creates a session‑level transaction cache. Insertion proceeds in three stages:
Read phase – load required nodes from InnoDB, run the HNSW insertion algorithm in the transaction cache, and determine neighbor relationships across layers.
Write phase – persist the new node and updated neighbor nodes to InnoDB.
Commit or rollback – on commit, update the public cache version and evict modified nodes; on rollback, discard the transaction cache and rely on InnoDB’s rollback mechanism.
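The commit and rollback behavior described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions, not AliSQL's implementation: the names Node, SharedCache, TrxCache, and get_for_update are hypothetical, the write phase to InnoDB is not modeled, and locking is omitted for brevity.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <unordered_map>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical node: an id plus its neighbor list on one layer.
struct Node {
    std::uint64_t id;
    std::vector<std::uint64_t> neighbors;
};

// Stand-in for the public cache (single-threaded here; no locks shown).
struct SharedCache {
    std::uint64_t version = 0;
    std::unordered_map<std::uint64_t, std::shared_ptr<const Node>> nodes;
};

// Stand-in for the per-transaction cache of one read-write transaction.
class TrxCache {
public:
    explicit TrxCache(SharedCache& shared) : shared_(shared) {}

    // Read phase: copy the node into the private cache before modifying it,
    // so the public cache never sees uncommitted changes.
    Node& get_for_update(std::uint64_t id) {
        auto it = local_.find(id);
        if (it != local_.end()) return it->second;
        Node copy{id, {}};
        auto s = shared_.nodes.find(id);
        if (s != shared_.nodes.end()) copy = *s->second;
        dirty_.insert(id);
        return local_.emplace(id, std::move(copy)).first->second;
    }

    // Commit: bump the public cache version and evict every modified node,
    // so later readers reload the committed state from storage.
    void commit() {
        ++shared_.version;
        for (std::uint64_t id : dirty_) shared_.nodes.erase(id);
        clear();
    }

    // Rollback: drop private state; the public cache is left untouched.
    void rollback() { clear(); }

    std::size_t pending() const { return dirty_.size(); }

private:
    void clear() { local_.clear(); dirty_.clear(); }

    SharedCache& shared_;
    std::unordered_map<std::uint64_t, Node> local_;
    std::unordered_set<std::uint64_t> dirty_;
};
```

In this sketch the public cache is only ever invalidated, never written with transaction-private data, which is one simple way to keep uncommitted neighbor lists invisible to read-only sessions.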
Concurrency Control
The cache system employs a lock hierarchy that currently supports read‑read and read‑write concurrency, but not write‑write concurrency on the same vector table.
Read‑Read Concurrency
Two locks are used: a cache mutex (cache_lock) protecting the hash‑based public cache, and a node lock (lock_node) ensuring exclusive loading from InnoDB when a node is absent. The process is illustrated below.
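A minimal C++ sketch of this two-lock pattern follows. The class SharedNodeCache, the Entry struct, and the loader callback are hypothetical stand-ins (the loader plays the role of the InnoDB read); only the lock names cache_lock and lock_node come from the text.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <memory>
#include <mutex>
#include <unordered_map>
#include <utility>
#include <vector>

using Vec = std::vector<float>;

class SharedNodeCache {
public:
    // Hypothetical loader: stands in for reading a node from storage.
    using Loader = std::function<Vec(std::uint64_t)>;

    explicit SharedNodeCache(Loader loader) : loader_(std::move(loader)) {}

    std::shared_ptr<const Vec> get(std::uint64_t id) {
        std::shared_ptr<Entry> entry;
        {
            // cache_lock is held only long enough to find or create the slot.
            std::lock_guard<std::mutex> guard(cache_lock_);
            std::shared_ptr<Entry>& slot = map_[id];
            if (!slot) slot = std::make_shared<Entry>();
            entry = slot;
        }
        // lock_node serializes loading of this one node; readers of other
        // nodes are not blocked while the load is in progress.
        std::lock_guard<std::mutex> guard(entry->lock_node);
        if (!entry->data) {
            entry->data = std::make_shared<Vec>(loader_(id));
        }
        return entry->data;
    }

private:
    struct Entry {
        std::mutex lock_node;
        std::shared_ptr<const Vec> data;  // null until the first load
    };

    Loader loader_;
    std::mutex cache_lock_;
    std::unordered_map<std::uint64_t, std::shared_ptr<Entry>> map_;
};
```

Splitting the locks this way means a slow load of one missing node never stalls lookups of nodes that are already cached.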
Read‑Write Concurrency
A commit read‑write lock (commit rwlock) synchronizes read requests with write commits. Read requests hold a read lock for the whole duration, while write requests operate on their private transaction cache and acquire the commit write lock only at the final commit step to evict expired nodes from the public cache.
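The read/commit interaction can be approximated with std::shared_mutex from the C++17 standard library. The sketch below is an assumption-laden illustration: VersionedCache, read, commit, and seed are hypothetical names, and the separate cache mutex and node locks are left out so that only the commit lock is visible.

```cpp
#include <cassert>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>
#include <utility>
#include <vector>

class VersionedCache {
public:
    // Reader: holds the commit lock in shared mode for the whole query, so
    // no commit can evict nodes while the query is traversing the cache.
    template <class Fn>
    auto read(Fn&& query) const {
        std::shared_lock<std::shared_mutex> guard(commit_rwlock_);
        return query(nodes_, version_);
    }

    // Writer: takes the lock exclusively only at commit time, bumps the
    // version, and evicts the nodes the transaction modified.
    void commit(const std::vector<std::uint64_t>& modified) {
        std::unique_lock<std::shared_mutex> guard(commit_rwlock_);
        ++version_;
        for (std::uint64_t id : modified) nodes_.erase(id);
    }

    // Setup helper for this sketch only: pre-populates the cache.
    void seed(std::uint64_t id, std::vector<float> vec) {
        std::unique_lock<std::shared_mutex> guard(commit_rwlock_);
        nodes_[id] = std::move(vec);
    }

private:
    mutable std::shared_mutex commit_rwlock_;
    std::unordered_map<std::uint64_t, std::vector<float>> nodes_;
    std::uint64_t version_ = 0;
};
```

Because writers do all graph work in their private cache, the exclusive section here is limited to a version bump and a handful of erasures, which keeps reader stalls short.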
Vector Computation Optimizations
Pre‑Computation Strategy
During node cache loading, vector distances are pre‑computed and stored. Frequently queried nodes keep their distance results in the FVectorNode structure, guarded by a version field. If the node data is unchanged, the cached result is reused; otherwise, a recomputation is triggered. This reduces redundant calculations and cuts query latency for hot nodes by more than 40%.
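The version-guarded reuse can be illustrated as follows. The text does not say exactly which quantity is stored, so this sketch assumes it is the squared norm of the vector, which lets a squared Euclidean distance be computed as |a|² + |b|² − 2·a·b. CachedNode and its fields are hypothetical and do not reflect the real FVectorNode layout.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical node with a version-guarded pre-computed value.
struct CachedNode {
    std::vector<float> vec;
    std::uint64_t version = 1;         // bumped whenever vec changes
    std::uint64_t cached_version = 0;  // version the cached value belongs to
    float cached_norm2 = 0.0f;
    int recomputations = 0;            // instrumentation for the example

    void set_vector(std::vector<float> v) {
        vec = std::move(v);
        ++version;  // invalidates the cached value
    }

    // Returns the squared norm, recomputing only if the data has changed.
    float norm2() {
        if (cached_version != version) {
            float sum = 0.0f;
            for (float x : vec) sum += x * x;
            cached_norm2 = sum;
            cached_version = version;
            ++recomputations;
        }
        return cached_norm2;
    }
};

// Squared Euclidean distance using the cached norms; both vectors are
// assumed to have the same dimension.
float distance2(CachedNode& a, CachedNode& b) {
    float dot = 0.0f;
    for (std::size_t i = 0; i < a.vec.size(); ++i) dot += a.vec[i] * b.vec[i];
    return a.norm2() + b.norm2() - 2.0f * dot;
}
```

For a hot node the norm is computed once and reused across many distance calls; only a change to the node's data triggers a recomputation.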
SIMD Instruction Set Acceleration
AliSQL leverages modern CPU SIMD extensions (e.g., AVX‑512) to parallelize distance calculations. A Bloom filter batches multiple vectors, converting scalar operations into vectorized ones. Benchmarks show a >75% speed‑up for single‑node distance computation and a >3× increase in query throughput on a 10‑million‑vector dataset.
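To make the scalar-to-vector conversion concrete, here is a hedged sketch of a squared L2 distance kernel. It is not AliSQL's kernel: the function names are hypothetical, the Bloom-filter batching mentioned above is not modeled, and the AVX‑512 path is compiled only when the compiler defines __AVX512F__ (for example with -mavx512f); otherwise the function falls back to the scalar loop.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>
#if defined(__AVX512F__)
#include <immintrin.h>
#endif

// Scalar reference: one subtract and one multiply-add per dimension.
float l2sq_scalar(const float* a, const float* b, std::size_t n) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < n; ++i) {
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}

// Vectorized variant: processes 16 floats per iteration when AVX-512F is
// enabled at compile time, then finishes the tail with scalar code.
float l2sq_simd(const float* a, const float* b, std::size_t n) {
    std::size_t i = 0;
    float sum = 0.0f;
#if defined(__AVX512F__)
    __m512 acc = _mm512_setzero_ps();
    for (; i + 16 <= n; i += 16) {
        __m512 d = _mm512_sub_ps(_mm512_loadu_ps(a + i), _mm512_loadu_ps(b + i));
        acc = _mm512_fmadd_ps(d, d, acc);  // acc += d * d, 16 lanes at once
    }
    sum = _mm512_reduce_add_ps(acc);       // horizontal sum of the lanes
#endif
    for (; i < n; ++i) {                   // remaining (n mod 16) elements
        float d = a[i] - b[i];
        sum += d * d;
    }
    return sum;
}
```

The vector path sums in a different order than the scalar loop, so results on general floating-point inputs can differ in the last bits; callers that compare distances should tolerate that.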
The two techniques complement each other: pre‑computation lowers latency for repeated accesses, while SIMD accelerates the raw computation cost, together delivering a substantial overall efficiency gain.
Conclusion
The combined design of public and transaction caches enables efficient vector indexing with RC‑level isolation and safe concurrent reads and writes. Write‑write concurrency on the same vector table is still unsupported, but the current lock strategy keeps data consistent under high load. Pre‑computation and SIMD optimizations further boost vector calculation speed, making AliSQL’s vector capabilities suitable for large‑scale production environments.
Alibaba Cloud Developer
Alibaba's official tech channel, featuring all of its technology innovations.
