Tagged articles

Cache Hit Rate

7 articles

Apr 24, 2026 · Artificial Intelligence

How Claude Code Achieves a 92% Prompt Caching Hit Rate with Three Unbreakable Engineering Rules

Claude Code’s prompt caching reaches a 92% hit rate, cutting the cost of a 50‑round agent session from $6 to $1.15. It does so by separating stable prefixes from dynamic tails, using a three‑layer cache architecture with exact token‑sequence matching, and following three strict engineering rules that keep the cache hot and reliable.

Agent Engineering · Cache Hit Rate · Claude Code

0 likes · 13 min read


Architect

Apr 21, 2026 · Artificial Intelligence

Why a 92% Prompt Cache Hit Rate Slashes LLM Costs: A Deep Dive into Context Engineering

The article dissects Anthropic's Prompt Caching mechanism, explaining how a 92% cache‑hit rate dramatically reduces pre‑fill costs for long‑running AI agents. It covers separating stable from dynamic context, managing TTL and look‑back limits, and applying seven practical engineering checks.

AI agents · Cache Hit Rate · Claude

0 likes · 22 min read


AI Insight Log

Feb 27, 2026 · Artificial Intelligence

Claude Code Prompt‑Caching Bug Drained Quotas—Anthropic’s Hotfix and Architecture Reveal

A prompt‑caching bug in Claude Code caused users' quota to deplete rapidly, prompting Anthropic to issue an emergency hotfix in version 2.1.62, reset rate limits, and publicly disclose the core architecture and five counter‑intuitive caching rules for building reliable AI agents.

AI agents · Anthropic · Cache Hit Rate

0 likes · 12 min read


Kuaishou Tech

May 28, 2025 · Databases

Optimizing Kuaishou's Photo Object Storage: Reducing Size and Boosting Cache Hit Rate

This article details how Kuaishou dramatically cut storage costs and improved cache efficiency for its core Photo data object by cleaning up redundant JSON fields, applying selective serialization, and performing large‑scale data cleaning, achieving a 25% size reduction, a 2% cache‑hit increase, and multi‑hundred‑TB savings.

Cache Hit Rate · Kuaishou · Photo Object

0 likes · 20 min read


Programmer DD

Sep 16, 2021 · Backend Development

Mastering Cache Strategies: Types, Hit Rates, Eviction & Design Patterns

Explore comprehensive caching concepts—including client, server, and CDN caches—along with hit rate metrics, eviction methods, popular strategies like FIFO, LRU, LFU, and design patterns such as Cache‑Aside, Read/Write‑Through, and Write‑Behind, plus essential testing considerations for robust application performance.

Backend Performance · Cache Eviction · Cache Hit Rate

0 likes · 8 min read


Liangxu Linux

Oct 31, 2020 · Fundamentals

How CPU Cache Works and How to Write Faster Code

Understanding CPU cache hierarchy, its speed advantages over memory, and the mechanics of cache lines, tags, and offsets reveals why code that maximizes cache hit rates—through sequential data access, branch prediction, and core affinity—can run dramatically faster on modern processors.

CPU cache · Cache Hit Rate · Memory Hierarchy

0 likes · 18 min read


Java Backend Technology

Oct 21, 2018 · Backend Development

Boost System Performance: Master Cache Hit Rate Monitoring & Optimization

This article explains what cache hit rate is, how to monitor it in Memcached and Redis, the key factors that influence it, and practical strategies for architects to improve hit rates and overall application performance.

Cache Hit Rate · Caching · Memcached

0 likes · 7 min read
