How a Harmless Query Can Hijack LLM Agents: The First Semantic Cache Key Collision Attack

A new study presented at ICML 2026 shows that the fuzzy matching used in LLM semantic caching creates an integrity vulnerability: attackers can craft adversarial suffixes that cause cache‑key collisions, achieving response‑hijacking success rates of up to 86.9% in an evaluation that covered major cloud services such as AWS and Azure.

Machine Heart

Large language models (LLMs) and AI agents are increasingly deployed at the edge and in the cloud, which makes inference cost and latency critical concerns. To raise cache hit rates beyond what exact‑match caching allows, cloud providers (e.g., AWS, Azure) and open‑source frameworks have adopted semantic caching, where a user query is embedded into a vector that serves as the cache key and a lookup succeeds when a stored key is sufficiently similar.

While this fuzzy matching boosts performance, the authors from Hong Kong University of Science and Technology and Fudan University argue that it introduces a fundamental integrity risk. The semantic cache behaves like a locality‑preserving fuzzy hash: unlike cryptographic hashes that aim for the avalanche effect, the cache deliberately reduces hash sensitivity so that semantically similar inputs map to the same key.
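To make the "fuzzy hash" behavior concrete, here is a minimal sketch of a semantic cache lookup. It assumes cosine similarity with a fixed threshold, which is a common design but is not confirmed by the source as the exact mechanism of any particular service. The class name `SemanticCache`, the threshold of 0.9, and the hand-written three-dimensional vectors are all illustrative stand-ins for a real embedding model.

```python
import math
from typing import List, Optional, Tuple


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


class SemanticCache:
    """Toy cache keyed by embedding vectors (hypothetical, for illustration)."""

    def __init__(self, threshold: float = 0.9) -> None:
        self.threshold = threshold
        self.entries: List[Tuple[List[float], str]] = []

    def put(self, embedding: List[float], response: str) -> None:
        self.entries.append((embedding, response))

    def get(self, embedding: List[float]) -> Optional[str]:
        # Return the response of the most similar stored key, but only if
        # the similarity clears the threshold; otherwise report a miss.
        best_score, best_response = -1.0, None
        for key, response in self.entries:
            score = cosine(key, embedding)
            if score > best_score:
                best_score, best_response = score, response
        return best_response if best_score >= self.threshold else None


cache = SemanticCache(threshold=0.9)
cache.put([1.0, 0.0, 0.0], "cached answer")

near_query = cache.get([0.98, 0.2, 0.0])   # close to the stored key: hit
far_query = cache.get([0.0, 1.0, 0.0])     # orthogonal to the stored key: miss
```

The point of the sketch is that `get` never compares the query text itself. Any input whose embedding lands within the threshold of a stored key receives that key's response, whatever the input actually says.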

Exploiting this design, the researchers built CacheAttack, an automated black‑box attack framework. It crafts adversarial suffixes that leave the semantics of the malicious instruction unchanged but force its embedding vector to align with that of a victim’s benign query. When the victim’s request later hits the cache, the system returns the malicious response the attacker seeded in advance, which is what the authors call response hijacking.

The attack consists of two components:

Generator: Using a surrogate model and the GCG (Greedy Coordinate Gradient) search algorithm, the generator produces adversarial prompts of the form "[original query] + adversarial suffix". A perplexity (PPL) penalty keeps the generated suffix fluent so that it can bypass input filters.

Validator: Because the cache state is hidden from the attacker, the validator treats cache verification as a hidden‑state inference problem. It measures execution latency as a side channel, fits a Gaussian Mixture Model (GMM) to the measurements, and applies a maximum a posteriori (MAP) decision rule to infer whether a cache hit occurred, which filters out network jitter.
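The validator's latency-based inference can be sketched as follows. This is a minimal illustration, assuming a one-dimensional, two-component GMM fitted with plain EM; the source does not give the authors' actual feature set, number of components, or fitting procedure. The latency values are synthetic (cache hits around 50 ms, misses around 800 ms), and the function names are hypothetical.

```python
import math
import random


def normal_pdf(x: float, mean: float, var: float) -> float:
    return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)


def fit_two_component_gmm(samples, iterations=50):
    """Fit a 1-D, two-component Gaussian mixture with EM."""
    n = len(samples)
    overall_mean = sum(samples) / n
    overall_var = sum((x - overall_mean) ** 2 for x in samples) / n
    # Deterministic initialization: one component at each extreme.
    means = [min(samples), max(samples)]
    variances = [overall_var, overall_var]
    weights = [0.5, 0.5]
    for _ in range(iterations):
        # E-step: posterior responsibility of each component for each sample.
        resp = []
        for x in samples:
            p = [weights[k] * normal_pdf(x, means[k], variances[k]) for k in range(2)]
            total = sum(p) or 1e-300  # guard against underflow to zero
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weights, means, and variances.
        for k in range(2):
            nk = sum(r[k] for r in resp)
            if nk < 1e-9:
                continue
            weights[k] = nk / n
            means[k] = sum(r[k] * x for r, x in zip(resp, samples)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, samples)) / nk
            variances[k] = max(var, 1e-6)  # floor to avoid a collapsed component
    return weights, means, variances


def is_cache_hit(latency, weights, means, variances) -> bool:
    """MAP rule: pick the component with the larger weighted likelihood.

    The component with the lower mean latency is taken to be the hit class.
    """
    hit = 0 if means[0] < means[1] else 1
    post = [weights[k] * normal_pdf(latency, means[k], variances[k]) for k in range(2)]
    return post[hit] > post[1 - hit]


# Synthetic latencies in milliseconds (assumed values, not measurements).
random.seed(0)
latencies = [random.gauss(50, 5) for _ in range(200)]
latencies += [random.gauss(800, 100) for _ in range(200)]
weights, means, variances = fit_two_component_gmm(latencies)
```

With the fitted parameters, `is_cache_hit(55.0, weights, means, variances)` classifies a fast response as a hit and `is_cache_hit(750.0, weights, means, variances)` classifies a slow one as a miss. Comparing posteriors rather than applying a fixed cutoff is what lets the decision tolerate jitter within each latency cluster.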

Two attack variants are offered:

CacheAttack‑1 (direct validation): Repeatedly probes the target black‑box model. It is limited by the time‑to‑live (TTL) of cache entries and is easier to detect.

CacheAttack‑2 (surrogate‑assisted filtering): Most adversarial iterations run on the high‑throughput surrogate. Only when a candidate suffix triggers a collision locally does the attacker send a single verification request to the target, which removes the TTL limitation and improves stealth.
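The difference between the two variants comes down to how many requests reach the target. The control flow of surrogate-assisted filtering can be sketched abstractly as below. This is an assumption-laden illustration rather than the authors' code: the function name, the two callables, the threshold, and the toy scores are all hypothetical, and no real embedding model or service is involved.

```python
from typing import Callable, Iterable, Optional, Tuple


def surrogate_filtered_search(
    candidates: Iterable[str],
    local_similarity: Callable[[str], float],
    verify_on_target: Callable[[str], bool],
    local_threshold: float,
) -> Tuple[Optional[str], int]:
    """Return the first verified candidate and the number of target queries.

    Candidates are scored on a local surrogate first; the target is only
    contacted for candidates that already collide locally.
    """
    target_queries = 0
    for candidate in candidates:
        if local_similarity(candidate) < local_threshold:
            continue  # rejected locally, no request leaves the machine
        target_queries += 1
        if verify_on_target(candidate):
            return candidate, target_queries
    return None, target_queries


# Toy stand-ins: precomputed surrogate scores and a fixed target outcome.
surrogate_scores = {"s1": 0.42, "s2": 0.95, "s3": 0.97}
target_hits = {"s3"}

result = surrogate_filtered_search(
    surrogate_scores,
    local_similarity=lambda c: surrogate_scores[c],
    verify_on_target=lambda c: c in target_hits,
    local_threshold=0.9,
)
```

In this toy run three candidates are examined but only two requests reach the target, because `s1` is discarded by the surrogate. Direct validation would have sent all three. For defenders, the same observation means that request volume alone is a weak signal for detecting this variant.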

The experimental evaluation covered major cloud services (AWS, Azure) and AI agents. In the basic response‑hijacking scenario (RQ1), CacheAttack‑1 achieved an 86.9% hit rate and CacheAttack‑2 achieved 83.1% on the GPTCache semantic cache. In a more complex agent‑workflow scenario (RQ2), the attack induced cascading planning errors, causing the agent to select malicious tools and sharply reducing task success rates.

A concrete case study demonstrates a financial AI agent being compromised. The attacker first injects a malicious cache entry that issues a forced sell order for Stock A. Later, a benign user query collides with this entry, causing the agent to execute the sell order without the user’s consent, leading to substantial financial loss.

The authors formalize the performance‑vs‑security trade‑off, proving a lower bound on the false‑positive rate of semantic caches. Tightening the matching threshold improves security but destroys the cache’s utility; loosening it preserves performance but expands the attack surface.
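The trade-off can be illustrated numerically. The sketch below does not reproduce the authors' lower bound, which the summary does not state. It only counts, for a few thresholds, how many same-intent query pairs would still hit the cache and how many different-intent pairs would collide. All similarity scores are made up for illustration.

```python
from typing import List, Tuple


def rates_at_threshold(
    same_intent_scores: List[float],
    different_intent_scores: List[float],
    threshold: float,
) -> Tuple[float, float]:
    """Return (hit rate on same-intent pairs, false-positive rate on others)."""
    hit_rate = sum(s >= threshold for s in same_intent_scores) / len(same_intent_scores)
    false_positive_rate = sum(
        s >= threshold for s in different_intent_scores
    ) / len(different_intent_scores)
    return hit_rate, false_positive_rate


# Synthetic cosine similarities (assumed values, not measurements).
same_intent = [0.97, 0.93, 0.91, 0.88, 0.84]       # paraphrases of one query
different_intent = [0.92, 0.86, 0.71, 0.55, 0.40]  # unrelated or adversarial

table = {
    t: rates_at_threshold(same_intent, different_intent, t)
    for t in (0.80, 0.90, 0.95)
}
```

On these synthetic scores, a threshold of 0.80 serves every paraphrase from cache but lets 40% of the different-intent pairs collide. Raising it to 0.95 eliminates the collisions but leaves only 20% of paraphrases hitting the cache. Since an adversarial suffix is built specifically to push a different-intent query above the threshold, no setting here separates the two groups cleanly.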

Overall, the work highlights an inherent paradox in current LLM serving architectures: achieving low latency through fuzzy semantic caching inevitably opens a backdoor for integrity attacks, and any mitigation must balance cache effectiveness against collision resistance.
