Tagged articles

Performance Scaling

6 articles

Dec 4, 2025 · Artificial Intelligence

How Offloading Latent Cache to CPU Boosts DeepSeek‑V3.2‑Exp Decoding Throughput

This report analyzes memory bottlenecks in DeepSeek‑V3.2‑Exp, proposes the Expanded Sparse Server (ESS), which offloads the latent cache to CPU memory, and shows through high‑fidelity simulation that the approach, combined with cache warm‑up and overlap techniques, can double decoding throughput for long‑context inference.

Cache offload · GPU‑CPU optimization · LLM Inference

0 likes · 21 min read

Architects' Tech Alliance

Sep 8, 2024 · Industry Insights

How Nvidia’s Rapid GPU Cycle Is Shaping the Future of AI Super‑Scale Networking

The article analyzes Nvidia’s accelerated GPU rollout, highlighting the Blackwell series’ large performance and energy‑efficiency gains, the company’s AI‑focused Spectrum‑X Ethernet roadmap, and the broader impact on NVLink, InfiniBand, and Ethernet interconnects for upcoming large‑scale AI clusters.

AI Ethernet · GPU · NVIDIA

0 likes · 6 min read

Bilibili Tech

May 19, 2023 · Backend Development

Local Cache Optimization for Outbox Redis in a High‑Traffic Feed Stream Service

To protect the outbox Redis cluster from extreme read amplification during hot events, the service adds a resident local cache for hot creators’ latest posts, using a threshold‑based list, change‑broadcast updates, and checksum verification. The cache achieved a hit rate above 55% and cut peak Redis load by roughly 44% and CPU usage by 37%.

Cache Optimization · Performance Scaling · Redis

0 likes · 10 min read

Architects' Tech Alliance

Oct 16, 2021 · Fundamentals

The New Golden Age of Computer Architecture: Trends, Challenges, and Opportunities

This article reviews the historical evolution of computer architecture, analyzes the end of Dennard scaling and Moore's Law, and discusses domain‑specific architectures, open ISAs like RISC‑V, security vulnerabilities, and emerging opportunities such as agile hardware development and specialized accelerators.

Hardware Design · Performance Scaling · RISC-V

0 likes · 41 min read

Efficient Ops

May 25, 2017 · Operations

How a Bank Transformed IT Ops with Automated DevOps and SRE Practices

This article outlines how China Merchants Bank’s data‑center application management team identified the operational pain points of traditional financial IT, introduced DevOps and SRE concepts, built non‑functional management frameworks, and implemented automated tooling, monitoring, and capacity scaling to achieve fully automated operations.

DevOps · IT Operations · Performance Scaling

0 likes · 24 min read

dbaplus Community

Jul 17, 2016 · Databases

How JD Scaled Its One‑Yuan Grab Treasure System with Database Sharding and ES Aggregation

This article details JD's One‑Yuan Grab Treasure platform redesign, covering business growth drivers, database sharding estimation, hash‑plus‑range routing implementation, Elasticsearch aggregation, Canal‑based sync, historical data migration, and downgrade mechanisms to ensure high‑throughput, reliable order processing during massive sales events.

Data Migration · Performance Scaling · backend-architecture

0 likes · 11 min read
