Tagged articles
11 articles
Alibaba Cloud Developer
Apr 7, 2025 · Artificial Intelligence

Why Does GPU Memory Keep Growing in DeepSeek‑R1 Inference? Uncovering PyTorch’s Cache

After the full‑precision DeepSeek‑R1 model was deployed on a 2×8‑GPU ACS cluster, repeated stress tests showed GPU memory usage rising continuously without being released. This article details the investigation: it reproduces the behavior, examines vLLM logs and Prometheus metrics, identifies PyTorch’s caching allocator as the root cause, and offers mitigation tips.

DeepSeek · GPU Memory · Memory Cache
0 likes · 21 min read
Rare Earth Juejin Tech Community
Jan 29, 2024 · Backend Development

Design and Implementation of a High‑Performance In‑Memory Cache in Go (MemoryCache)

This article analyzes the shortcomings of existing Go caching libraries, introduces the MemoryCache project, explains its hash‑based bucket design, 4‑ary heap LRU implementation, unified timer strategy, and provides practical usage examples with code snippets for SetWithCallback and GetOrCreateWithCallback.

Heap · Memory Cache · caching
0 likes · 13 min read
Java Architect Essentials
Sep 18, 2022 · Databases

Redis vs Dragonfly: Benchmark Comparison and Architectural Insights

This article examines the open‑source memory cache Dragonfly, compares its performance and architecture against Redis through detailed benchmark results, discusses Redis’s response and design principles, and provides reproducible test configurations and command lines for both systems.

Dragonfly · Memory Cache · architecture
0 likes · 16 min read
Baidu App Technology
Jun 29, 2021 · Mobile Development

Optimization of T7 Browser Kernel Cache Mechanism in Baidu App

The Baidu App’s T7 browser kernel team optimized its HTTPCache and MemoryCache by enabling main‑document and file‑protocol caching, introducing a fixed‑size LRU cleanup, providing custom preload/query APIs, and leveraging NoState Prefetch. Together, these changes cut page‑load latency by over 30% and markedly improved the user experience.

Browser Cache · Memory Cache · http cache
0 likes · 15 min read
vivo Internet Technology
Sep 25, 2019 · Mobile Development

Analysis of Glide Image Loading Cache Mechanisms on Android

The article dissects Glide’s five‑level caching system: active resources, memory cache, resource and data disk caches, and network cache. It explains how active weak references feed the LRU memory cache, how DiskCacheStrategy governs the storage of transformed and raw data, and how I/O and network tasks run on separate executors, with network responses cached before delivery.

Android · Cache · Disk Cache
0 likes · 21 min read