Tagged articles
6 articles
Baidu Intelligent Cloud Tech Hub
Dec 4, 2025 · Artificial Intelligence

How Offloading Latent Cache to CPU Boosts DeepSeek‑V3.2‑Exp Decoding Throughput

This report analyzes memory bottlenecks in DeepSeek‑V3.2‑Exp, proposes the Expanded Sparse Server (ESS), which offloads the latent cache to CPU memory, and shows through high‑fidelity simulation that the approach, combined with cache warmup and overlap techniques, can double decoding throughput for long‑context inference.

Cache offload · GPU‑CPU optimization · LLM inference
21 min read
Bilibili Tech
May 19, 2023 · Backend Development

Local Cache Optimization for Outbox Redis in a High‑Traffic Feed Stream Service

To protect the outbox Redis cluster from extreme read amplification during hot events, the service adds a resident local cache for hot creators’ latest posts, using a threshold‑based list, change‑broadcast updates, and checksum verification. The cache achieved a hit rate above 55% and cut peak Redis load by roughly 44% and CPU usage by 37%.

Consistency · Performance Scaling · cache optimization
10 min read
Architects' Tech Alliance
Oct 16, 2021 · Fundamentals

The New Golden Age of Computer Architecture: Trends, Challenges, and Opportunities

This article reviews the historical evolution of computer architecture, analyzes the end of Dennard scaling and Moore's Law, discusses domain‑specific architectures, open ISAs like RISC‑V, security vulnerabilities, and emerging opportunities such as agile hardware development and specialized accelerators.

Performance Scaling · RISC-V · Security Vulnerabilities
41 min read
Efficient Ops
May 25, 2017 · Operations

How a Bank Transformed IT Ops with Automated DevOps and SRE Practices

This article outlines how China Merchants Bank’s data‑center application management team identified traditional financial IT operational pain points, introduced DevOps and SRE concepts, built non‑functional management frameworks, and implemented automated tooling, monitoring, and capacity‑scaling to achieve fully automated operations.

DevOps · IT Operations · Performance Scaling
24 min read
dbaplus Community
Jul 17, 2016 · Databases

How JD Scaled Its One‑Yuan Grab Treasure System with Database Sharding and ES Aggregation

This article details JD's One‑Yuan Grab Treasure platform redesign, covering business growth drivers, database sharding estimation, hash‑plus‑range routing implementation, Elasticsearch aggregation, Canal‑based sync, historical data migration, and downgrade mechanisms to ensure high‑throughput, reliable order processing during massive sales events.

Backend Architecture · Data Migration · Performance Scaling
11 min read