Weibo’s Cross‑IDC Image Storage: Scaling Architecture & Real‑Time Compression

This article explains how Weibo’s massive image‑hosting platform uses a cross‑IDC distributed object storage system, optimized upload/download pipelines, and a custom compression library to handle billions of images and extreme traffic spikes during events like the Chinese New Year.

21CTO

Images are a core content element on Weibo, with picture‑laden posts accounting for nearly 60% of daily activity. The Weibo image‑hosting platform (image bed) processes over 30 million uploads per day and stores more than 10 billion images.

Cross‑IDC Distributed Storage System

The platform is a large‑scale distributed object storage system that spans multiple data centers (IDCs). Every data center accepts writes (multi‑master), which provides disaster recovery. An internal replication protocol called BOR keeps the data strongly consistent from the user's point of view, even when a data center fails.

Image Upload

When a user uploads an image, the Upload API receives it, pre‑compresses it, and writes the resulting sizes to iCache, a high‑speed SSD‑based cache that uses a dual‑hash‑ring design for high availability. The image is then persisted asynchronously to permanent storage through notfs, Sina’s custom user‑space file system, which is reported to deliver three times the throughput of traditional solutions at consistently low latency.
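The article does not describe how iCache's dual hash ring works internally. One plausible reading, sketched below purely as an assumption, is two independent consistent‑hash rings: each key is written to one node on each ring, and reads fall back to the second ring when the first ring's node is unavailable. The class names, node names, and the `down` set are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hash ring with virtual nodes."""

    def __init__(self, nodes, vnodes=64, salt=""):
        points = []
        for node in nodes:
            for i in range(vnodes):
                points.append((self._hash(f"{salt}:{node}#{i}"), node))
        points.sort()
        self._hashes = [h for h, _ in points]
        self._nodes = [n for _, n in points]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def node_for(self, key):
        # First virtual node clockwise from the key's hash, wrapping around.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._nodes[idx]


class DualRingCache:
    """Hypothetical cache that keeps one copy of each key on two rings."""

    def __init__(self, ring_a_nodes, ring_b_nodes):
        self.rings = (HashRing(ring_a_nodes, salt="a"),
                      HashRing(ring_b_nodes, salt="b"))
        self.stores = {n: {} for n in list(ring_a_nodes) + list(ring_b_nodes)}
        self.down = set()  # nodes currently treated as failed

    def put(self, key, value):
        for ring in self.rings:
            node = ring.node_for(key)
            if node not in self.down:
                self.stores[node][key] = value

    def get(self, key):
        # Try ring A first, then ring B.
        for ring in self.rings:
            node = ring.node_for(key)
            if node not in self.down and key in self.stores[node]:
                return self.stores[node][key]
        return None
```

With this layout, losing any single cache node costs at most one of the two copies of a key, so reads can still be served from the other ring.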

Image Download

Images reach users through a global CDN. On a CDN cache miss, the request falls through to the Download API, which checks iCache first and then permanent storage. If the image is in neither, the API fetches it over a dedicated link from the data center where it was originally uploaded, which preserves strong consistency. Retrieved images may be re‑compressed on the fly to meet presentation requirements.
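The lookup order on a CDN miss can be sketched as a chain of tiers tried in sequence. This is a minimal illustration, not the actual Download API: `fetch_image`, the tier names, and the use of plain dictionaries as stand‑ins for iCache, permanent storage, and the origin data center are all assumptions.

```python
from typing import Callable, Iterable, Optional, Tuple

Fetcher = Callable[[str], Optional[bytes]]


def fetch_image(key: str, tiers: Iterable[Tuple[str, Fetcher]]) -> Tuple[str, bytes]:
    """Try each (name, fetcher) tier in order and return the first hit."""
    for name, fetch in tiers:
        data = fetch(key)
        if data is not None:
            return name, data
    raise KeyError(key)


# Stand-ins for the three tiers named in the text.
icache = {}
permanent_storage = {}
origin_idc = {"abc.jpg": b"jpeg-bytes"}  # image not yet replicated locally

tiers = [
    ("icache", icache.get),
    ("storage", permanent_storage.get),
    ("origin", origin_idc.get),
]

tier, data = fetch_image("abc.jpg", tiers)
print(tier, len(data))
```

Because the last tier is the data center that accepted the upload, a freshly uploaded image can be served even before asynchronous replication has caught up.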

Chinese New Year Challenge

The partnership with the Spring Festival Gala pushed upload traffic to as much as 30 times the normal peak. A surge of that size called for rapid architectural improvements rather than simply adding capacity.

Compression Bottleneck

Real‑time image compression is CPU‑ and memory‑intensive. Skipping compression at upload time only moves the load to the download path, and because devices differ in what they need, some on‑the‑fly processing remains unavoidable. The cost has to be paid on one side or the other.

Pipeline‑Based Processing (webpress)

The traditional approach to real‑time compression, PHP with ImageMagick, suffers from high latency under load. The webpress system separates the I/O stages from the CPU‑heavy compression stage, so resources can be allocated to each more predictably and latency stays stable under high concurrency.
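The idea of separating I/O stages from the CPU‑heavy stage can be illustrated with a small staged pipeline. This sketch is not webpress itself: the stage functions, worker counts, and queue depth are assumptions, and `zlib.compress` stands in for image compression. Each stage has its own worker pool, and bounded queues between stages apply back‑pressure so a slow compression stage cannot be flooded by fast readers.

```python
import queue
import threading
import zlib

_DONE = object()  # sentinel marking the end of a stage's input


def start_stage(workers, func, inbox, outbox):
    """Apply func to items from inbox on `workers` threads; push results to outbox."""
    def loop():
        while True:
            item = inbox.get()
            if item is _DONE:
                inbox.put(_DONE)  # pass the sentinel on to sibling workers
                return
            outbox.put(func(item))

    threads = [threading.Thread(target=loop, daemon=True) for _ in range(workers)]
    for t in threads:
        t.start()
    return threads


def run_pipeline(keys, source, sink, io_workers=8, cpu_workers=2, depth=16):
    def read(key):        # I/O stage: fetch the original bytes
        return key, source[key]

    def compress(item):   # CPU stage: zlib stands in for image compression
        key, raw = item
        return key, zlib.compress(raw)

    def write(item):      # I/O stage: store the compressed result
        key, blob = item
        sink[key] = blob
        return key

    to_read, to_compress, to_write = (queue.Queue(depth) for _ in range(3))
    finished = queue.Queue()  # unbounded so writers never block

    stages = [
        (start_stage(io_workers, read, to_read, to_compress), to_compress),
        (start_stage(cpu_workers, compress, to_compress, to_write), to_write),
        (start_stage(io_workers, write, to_write, finished), finished),
    ]
    for key in keys:
        to_read.put(key)
    to_read.put(_DONE)

    # Shut stages down in order: once a stage has drained, signal the next one.
    for threads, next_queue in stages:
        for t in threads:
            t.join()
        next_queue.put(_DONE)

    results = []
    while (item := finished.get()) is not _DONE:
        results.append(item)
    return results
```

The CPU stage is deliberately given few workers (roughly the number of cores in a real deployment) while the I/O stages get many, which is what keeps compression latency predictable under high concurrency. In CPython, a genuinely CPU‑heavy stage would more likely run in worker processes or native code than in threads.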

Lightweight Image Library (webimg)

To replace ImageMagick, the team built webimg, a compact library that combines delayed decoding, JPEG pre‑resampling, SIMD optimizations, and other techniques. It is reported to run nearly four times faster and to be free of memory leaks. Future plans include GPU acceleration with CUDA and possibly FPGA integration.
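The article does not say how webimg implements JPEG pre‑resampling. A common technique of that kind, assumed here, is decode‑time downscaling: JPEG decoders such as libjpeg can emit the image at 1/2, 1/4, or 1/8 of its stored size directly from the DCT coefficients, so a thumbnail request never has to decode the full‑resolution pixels. The helper below (a hypothetical name) picks the largest reduction that still leaves the decoded image at least as large as the target.

```python
def pick_decode_scale(src_w, src_h, dst_w, dst_h):
    """Return (denominator, decoded_w, decoded_h) for decode-time downscaling.

    The denominator is the largest of 8, 4, 2, 1 for which the decoded
    image is still at least dst_w x dst_h, so the final resize only
    ever shrinks the image.
    """
    for denom in (8, 4, 2):
        # Ceiling division gives the decoded dimensions at 1/denom scale.
        w = -(-src_w // denom)
        h = -(-src_h // denom)
        if w >= dst_w and h >= dst_h:
            return denom, w, h
    return 1, src_w, src_h


denom, w, h = pick_decode_scale(4000, 3000, 400, 300)
print(denom, w, h)  # 8 500 375
```

For a 4000×3000 source and a 400×300 target, the decoder produces 500×375 pixels instead of 12 million, roughly a 64‑fold reduction in pixels before the final resize.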

Conclusion

The Weibo image‑hosting system shows how architectural optimizations across storage, caching, and compression keep the service reliable during massive traffic spikes, improving user experience while reducing costs.

— Sina R&D Center, Weibo Image Bed Team, Zhu Xin (@stvchu)