Tagged articles
104 articles
Tencent Cloud Middleware
Apr 22, 2026 · Backend Development

How TDMQ Pulsar Scales Million-Message Delayed Queues with Multi-Level Time Wheels

The article analyzes why large‑scale delayed messaging is needed, identifies the bottlenecks of the Apache Pulsar community solution, and explains TDMQ Pulsar's three‑step redesign (hierarchical time wheels, expiration re‑push, and immutable message IDs), which together enable stable million‑message delayed queues with bounded memory and acknowledgment‑hole impact limited to the minute level.

Apache Pulsar · Delayed Messaging · Message Queue
0 likes · 8 min read
Alibaba Cloud Developer
Mar 27, 2026 · Artificial Intelligence

How Tair Powers Sub‑Second AI Agent Memory for Real‑Time Ordering

This article examines how Taobao Flash Sale’s AI Agent uses Alibaba Cloud’s Tair as a high‑performance short‑term memory layer, detailing data model design, latency impact, concurrency control, elastic scaling, bandwidth handling, and TTL‑based cleanup to achieve sub‑second response times during massive traffic spikes.

AI Agent · Low latency · Memory Management
0 likes · 15 min read
Yiche Technology
Dec 3, 2025 · Artificial Intelligence

How Milvus Powered a Scalable AI Assistant for Car Queries with Vector Search

This article details how an automotive AI assistant migrated from keyword matching to a Milvus‑based vector retrieval system, overcoming semantic gaps, scaling to millions of daily queries, optimizing indexing, introducing multi‑vector and sparse‑vector search, and building a real‑time RAG pipeline with Flink.

AI Assistant · Milvus · RAG
0 likes · 12 min read
Ray's Galactic Tech
Nov 9, 2025 · Backend Development

Hybrid Push‑Pull Timeline Architecture: Scaling Social Feeds for Billions

To serve billions of users with real‑time timelines, modern social platforms combine push‑based delivery for regular users and pull‑based retrieval for high‑profile accounts, employing hot‑cold separation, Kafka fan‑out, Redis caching, and scalable storage strategies to balance write and read loads.

Kafka · push-pull · redis
0 likes · 9 min read
Baidu Geek Talk
Oct 13, 2025 · Big Data

How Baidu Scaled Its Data Warehouse to Handle Billions of PVs and Petabytes

This article details Baidu APP's massive data‑warehouse overhaul, describing the two‑step strategy that stabilized log cleaning, modernized the ETL framework, introduced wide‑table architectures, and implemented tiered storage to dramatically improve processing speed, reliability, and cost efficiency for petabyte‑scale workloads.

Big Data · ETL · data-warehouse
0 likes · 25 min read
NiuNiu MaTe
Oct 11, 2025 · Backend Development

How to Build a Million‑User Real‑Time High‑Availability Comment System

This article explains how to design a highly available comment system that can handle millions of concurrent users by analyzing comment fundamentals, traffic patterns, storage choices, caching layers, sharding strategies, architectural evolution from monolith to distributed micro‑services, and fault‑tolerance mechanisms.

Comment System · Microservices · scalable architecture
0 likes · 18 min read
Go Development Architecture Practice
Jul 31, 2025 · Backend Development

Building a Scalable WebSocket Push Service in Go: From Basics to Million‑User Architecture

This article explains WebSocket fundamentals, compares pull and push models, details the WebSocket handshake flow, presents a complete Go server and client implementation, analyzes performance bottlenecks of a million‑user bullet‑screen system, and proposes concrete optimizations such as packet merging, lock granularity, JSON encoding reduction, and HTTP/2‑based clustering.

Go · real-time messaging · scalable architecture
0 likes · 13 min read
Alibaba Cloud Big Data AI Platform
Jul 24, 2025 · Artificial Intelligence

How Alibaba Cloud’s Asynchronous Inference Transforms AI Model Deployment

This article explains how Alibaba Cloud's PAI platform uses an asynchronous inference framework with dedicated queue and inference services to overcome high‑latency challenges, enable load‑balanced request distribution, provide health‑check failover, and support automatic scaling for large‑model AI workloads.

AI inference · Alibaba Cloud · Cloud AI
0 likes · 7 min read
Architects' Tech Alliance
Apr 8, 2025 · Artificial Intelligence

How NVSwitch Revolutionizes Multi‑GPU Interconnect for AI Workloads

This article examines NVIDIA's NVSwitch technology, explaining why it was needed, how it builds on NVLink to overcome PCIe bottlenecks, tracing its evolution from Pascal to the third‑generation design, and detailing its architectural features, scalability, full‑duplex bandwidth, non‑blocking communication, and optimized network topologies for high‑performance AI and HPC systems.

AI hardware · GPU interconnect · High‑performance computing
0 likes · 9 min read
21CTO
Aug 17, 2024 · Artificial Intelligence

Vector Store vs Vector Database: Which Powers Your AI Apps Better?

This guide explains the differences between vector stores and vector databases, covering vector embeddings, performance, scalability, integration, and ideal use‑cases, helping developers choose the right tool—or a hybrid approach—for AI applications.

AI embeddings · Vector Store · scalable architecture
0 likes · 12 min read
Volcano Engine Developer Services
Jun 14, 2024 · Operations

How ByteDance Built an EB‑Scale Log Service: Design & Optimization

This article details the evolution of ByteDance's TLS (Tinder Log Service) from a Loki‑based prototype to an EB‑scale, cloud‑native log system, covering its core properties, data organization, architecture, caching, hybrid storage, private codec, ecosystem compatibility, intelligent features, and real‑world case studies.

ByteDance · Cloud Native · TLS
0 likes · 24 min read
JD Tech
May 17, 2024 · Artificial Intelligence

Optimizing JD Advertising Retrieval Platform: Balancing Compute, Data Scale, and Iterative Efficiency

The article details how JD's advertising retrieval platform tackles the core challenge of balancing limited compute resources with massive data by optimizing compute allocation, improving model scoring efficiency, and enhancing iteration speed through distributed execution graphs, adaptive algorithms, and platform‑level infrastructure improvements.

ANN · Advertising · Deep Learning
0 likes · 24 min read
Architecture & Thinking
Feb 13, 2024 · Backend Development

How to Evolve High‑Traffic Data Architecture: From Simple Reads to Decentralized Caching

This article explains how internet‑scale services can progressively upgrade their data architecture, moving from a single read/write instance to read‑write separation, multi‑region “two‑site three‑center” designs, cache fallback, and local decentralized caching, to achieve high availability and scalability.

Read-Write Separation · caching · scalable architecture
0 likes · 6 min read
21CTO
Feb 11, 2024 · Backend Development

How Apollo and Microservices Combine to Build Scalable Applications

This article explains how integrating Apollo with a microservice architecture enables developers to create highly scalable, resilient applications by detailing microservice fundamentals, challenges, best‑practice solutions, and practical steps for building extensible data graphs and robust monitoring.

Apollo · GraphQL · Microservices
0 likes · 10 min read
dbaplus Community
Jan 23, 2024 · Operations

How We Built a Scalable Real‑Time Log Center with ClickHouse and ELK

Facing massive data volumes, the team at Kuaidi100 redesigned their logging platform, moving from a file‑based system to an ELK stack and finally to a ClickHouse‑based architecture, achieving real‑time, scalable, cost‑effective log collection, analysis, and alerting while addressing storage, performance, and maintenance challenges.

ELK · Log Management · clickhouse
0 likes · 12 min read
ITPUB
Oct 1, 2023 · Backend Development

Scaling Schema‑Free Classified Ads Platforms: Storage & Search for Billions

This article explains how to design a scalable architecture for classified‑information platforms that handle billions of rows, on the order of ten thousand attributes, and around a hundred thousand QPS by using vertical partitioning, unified post, category, and search services, along with compressed JSON extensions and external indexing.

Vertical Partitioning · large-scale data · scalable architecture
0 likes · 12 min read
JD Cloud Developers
Apr 4, 2023 · Databases

How to Scale B‑Token Systems with Horizontal Sharding and Consistent Hashing

This article examines the challenges of growing B‑token data volumes, including table size limits and data skew, and proposes a solution using horizontal sharding with a consistent‑hash ring, dynamic table allocation, watermark (fill‑level) thresholds, periodic archiving, and monitoring to support future growth without costly migrations.

Data Skew · consistent hashing · scalable architecture
0 likes · 13 min read
JD Retail Technology
Mar 2, 2023 · Databases

Evolution of JD VOP Message Warehouse: From V1.0 to V3.0+ with Database Sharding and Performance Optimization

This article details the architectural evolution of JD's VOP message warehouse, describing the challenges of massive data volumes, the transition from V1.0 to V3.0+ through database sharding, MongoDB adoption, traffic governance, stability improvements, and cost reduction strategies, while presenting performance metrics and future outlook.

Message Queue · MongoDB · backend systems
0 likes · 17 min read
vivo Internet Technology
Nov 16, 2022 · Big Data

Vivo Hawking A/B Experiment Platform: Architecture, Practices, and Solutions

The Vivo Hawking platform provides a company‑wide, one‑stop A/B testing solution with a layered architecture, covariate‑balanced split algorithms, real‑time monitoring, and unified SDKs for Android, Java and H5, enabling thousands of daily experiments, automated analysis, and rapid product iteration across multiple departments.

Covariate balancing · Experiment Platform · Java SDK
0 likes · 22 min read
Architecture Digest
Sep 7, 2022 · Backend Development

Building a High-Performance Scalable Instant Messaging System with Go and WebSocket

This article guides readers through the design and implementation of a high‑performance, scalable instant‑messaging (IM) system using Go, detailing WebSocket protocol fundamentals, server‑side architecture, authentication, message handling, code examples, and optimization strategies for production deployment.

Go · Instant Messaging · WebSocket
0 likes · 39 min read
DeWu Technology
Aug 19, 2022 · Big Data

DeWu Reach Strategy Platform and HBase Buffer Pool Architecture

The DeWu Reach Strategy platform uses a task‑strategy‑action model and an HBase‑backed buffer pool that temporarily stores billions of user records, enabling large‑scale algorithmic push, AB testing, and dynamic horizontal scaling while ensuring even data distribution and low‑latency processing.

Big Data · HBase · Reach Strategy
0 likes · 9 min read
Huolala Tech
Aug 18, 2022 · R&D Management

How Huolala Built a Scalable A/B Testing Platform with Five Allocation Algorithms

Huolala’s A/B testing platform, serving over 200 business scenarios and thousands of experiments, combines a three‑stage workflow with a modular architecture—including a configuration console, SDK for traffic routing and data collection, and a robust data‑recovery pipeline—while offering five distinct allocation algorithms to ensure scientific experiment results.

A/B testing · Experiment Platform · algorithm design
0 likes · 17 min read
NetEase Smart Enterprise Tech+
Jun 10, 2022 · Backend Development

How NetEase Cloud IM Solves Massive Group Relationships: Architecture & Challenges

The article explains NetEase Cloud IM's "Circle Group" product, detailing its massive relationship complexity and scale, the technical challenges it creates, and the layered backend architecture and mechanisms designed to handle multi‑entity coupling, huge member volumes, and efficient change notifications.

Cloud Services · IM · backend-development
0 likes · 13 min read
Meituan Technology Team
May 5, 2022 · Databases

Database Autonomy Service (DAS): Architecture, Design, and Implementation

The Database Autonomy Service (DAS) is a platform that uses big‑data, machine‑learning, and expert knowledge to automatically collect, compress, and analyze MySQL metrics, providing self‑service fault detection, root‑cause diagnosis, and security management, thereby reducing manual effort, shortening MTTR, and supporting Meituan’s rapid database growth.

AI-driven ops · Database Autonomy · Performance Monitoring
0 likes · 20 min read
DaTaobao Tech
Mar 17, 2022 · Cloud Native

NextRPC: Multi‑Stage RPC Model for Scalable Transaction and Recommendation Services

NextRPC is a multi‑stage RPC model that delivers partial responses over multiple network channels, improving latency and conversion in Taobao’s transaction and recommendation services; it employs a hybrid asynchronous streaming and parallel execution architecture with client‑side orchestration and server‑side sub‑request handling, achieving over 5% UV lift and up to 25% recommendation uplift.

Alibaba · Asynchronous Streaming · Microservices
0 likes · 8 min read
21CTO
Feb 27, 2022 · Databases

Designing Scalable Login Schemas for Billions of Users

This article explains how to design a flexible, extensible database schema and login flow for a system with a billion users, covering multi‑credential handling, sharding strategies, hash‑based routing, and practical considerations such as password updates and caching.

high concurrency · login system · scalable architecture
0 likes · 9 min read
IT Architects Alliance
Feb 4, 2022 · Backend Development

Designing a Scalable Architecture for Million‑Level DAU Systems

The article outlines a comprehensive backend architecture for handling million‑to‑tens‑of‑million daily active users, covering DNS routing, L4/L7 load balancing, monolithic versus microservice deployment, caching, database sharding, hybrid‑cloud strategies, elastic scaling, and multi‑level degradation mechanisms.

Microservices · database sharding · elastic scaling
0 likes · 11 min read
Tencent Cloud Developer
Jan 29, 2022 · Cloud Computing

How WeChat’s Serverless Cloud Functions Powered Billion-User Red Packet Covers

During the 2022 Chinese New Year, WeChat leveraged Serverless cloud functions and cloud development to handle the explosive demand of red‑packet cover creation, achieving sub‑ten‑thousand‑yuan costs, supporting over 100 million daily calls, and delivering rapid, scalable, and cost‑effective deployment for developers.

Cloud Functions · Cost Optimization · Serverless
0 likes · 7 min read
High Availability Architecture
Jan 12, 2022 · Cloud Native

Designing a Scalable Architecture for Million‑Level DAU Internet Applications

The article explains how to build a highly available, horizontally scalable architecture for million‑level daily active users by combining DNS routing, L4/L7 load balancing, micro‑service decomposition, caching, sharded databases, hybrid‑cloud deployment, elastic scaling and multi‑level degradation strategies.

Microservices · caching · hybrid cloud
0 likes · 11 min read
vivo Internet Technology
Dec 16, 2021 · Industry Insights

How Vivo Scaled Push, Storage, 3D, and Live Streaming for 270M Users

The 2021 Vivo Developer Conference showcased how the company’s engineering teams built high‑performance push services, a self‑developed storage platform, a full‑link 3D display system, a front‑end code‑coverage tool, a traffic replay solution, and a customizable live‑streaming SDK to support hundreds of millions of users.

3d-visualization · Push Service · Storage Platform
0 likes · 8 min read
Programmer DD
Oct 17, 2021 · Databases

How JD Baitiao Scaled to Billions with Apache ShardingSphere

This article chronicles JD Baitiao's data‑architecture evolution from Solr + HBase to MongoDB and finally to Apache ShardingSphere, highlighting the challenges of massive data growth, the need for decoupling, and the performance, scalability, and operational benefits achieved by adopting ShardingSphere.

Apache · JD Baitiao · ShardingSphere
0 likes · 10 min read
Architects' Tech Alliance
Sep 7, 2021 · Fundamentals

Understanding Fat-Tree (CLOS) Network Architecture for Data Centers

The article explains the Fat-Tree (CLOS) network topology proposed for data centers in 2008, describing its non‑oversubscribed bandwidth design, three‑layer structure, practical benefits, common configurations, and limitations, while also providing references and visual illustrations of the architecture.

CLOS · Data Center Network · Fat-Tree
0 likes · 7 min read
Open Source Linux
Jul 5, 2021 · Operations

Designing Scalable, High‑Availability Network Services with Linux LVS

This article explains the principles and architecture of scalable, high‑availability network services using Linux Virtual Server (LVS), covering definitions, requirements, load‑balancing mechanisms, cluster components, geographic distribution, BGP routing, and practical deployment considerations for web, media, cache, and mail services.

LVS · high availability · load balancing
0 likes · 25 min read
Volcano Engine Developer Services
Jun 16, 2021 · Backend Development

How ByteDance’s Video Processing Platform Achieves Billion‑Scale High Availability

This article explains how ByteDance’s Volcano Engine video platform handles the entire video lifecycle—from client‑side capture to cloud processing, delivery, and playback—by employing a multi‑plane architecture, scalable workflow system, function compute platform, and the dynamic BMF framework to meet massive scale, ensure high availability, improve user experience, and reduce costs.

Function Compute · Video processing · high availability
0 likes · 19 min read
IT Architects Alliance
Jun 7, 2021 · Industry Insights

How WeChat Scales: Agile Practices and Architecture Behind Billions of Users

The article analyzes WeChat's success by detailing its three‑pronged strategy of precise product timing, agile project management, and robust technical support, and explains how the team applies agile attitudes, modular design, extensible protocols, disaster‑recovery mechanisms, and fine‑grained monitoring to operate a massive, highly available system.

Agile Development · WeChat · industry insights
0 likes · 18 min read
Meituan Technology Team
May 13, 2021 · Artificial Intelligence

Design and Practice of Turing OS: An Online Service Framework for Machine Learning and Deep Learning at Meituan

Meituan’s Turing OS unifies the end‑to‑end machine‑learning lifecycle—data preprocessing, feature generation, model training, deployment, online prediction and A/B testing—through a lightweight SDK, plugin‑based algorithms, DAG orchestration, sandbox validation and replay tools, cutting algorithm iteration from days to hours while handling billions of daily predictions.

Algorithm Platform · Model Deployment · online serving
0 likes · 31 min read
dbaplus Community
Mar 14, 2021 · Databases

How Vivo Built a Scalable Comment Platform with MongoDB Sharding

This article explains how Vivo designed a company‑wide comment middle‑platform, chose MySQL and MongoDB, deep‑dived into MongoDB cluster architecture, shard key strategies, and practical solutions for scaling, migration, and high availability in a high‑traffic environment.

Comment System · Database design · MongoDB
0 likes · 11 min read
Beike Product & Technology
Nov 2, 2020 · Artificial Intelligence

Beike Commercialization Strategy Algorithm Platform: Architecture, AI Techniques, and System Evolution

This presentation details Beike's AI‑driven commercial strategy platform, covering business scenarios, the evolution of its architecture from 2018 to 2020, the challenges faced, the redesign into online, near‑real‑time, and offline layers, key technologies such as vector search, model serving, and microservice governance, as well as performance gains and future directions.

AI · Real Estate · scalable architecture
0 likes · 23 min read
DataFunTalk
Aug 21, 2020 · Big Data

Design and Implementation of 58.com Commercial DMP Platform

This talk presents the architecture, feature extraction, storage, real-time computation, monitoring, and optimization strategies of 58.com’s commercial DMP platform, detailing business requirements, system design across data, storage, compute, and service layers, and future plans for unified services and advanced analytics.

DMP · Data Platform · Real-time Processing
0 likes · 13 min read
Cloud Native Technology Community
Apr 8, 2020 · Operations

Decoding Thanos Architecture: From Query to Compact for Scalable Monitoring

This article provides a detailed analysis of Thanos' architecture, explaining each core component—Query, Sidecar, Store Gateway, Ruler, Compact, and the upcoming Receiver—how they enable global view, high availability, and long‑term storage for distributed Prometheus deployments, and discusses design trade‑offs and optimization strategies.

Cloud Native · Long‑term Storage · Prometheus
0 likes · 12 min read
Programmer DD
Mar 26, 2020 · Backend Development

How Zhihu Built a Scalable Long‑Connection Gateway for Real‑Time Messaging

Zhihu’s infrastructure team designed a high‑performance, scalable long‑connection gateway that decouples business logic via publish‑subscribe, leverages OpenResty, Kafka, and Redis, implements fine‑grained ACL, sliding‑window flow control, and ensures message reliability and horizontal scalability for millions of concurrent devices.

Kafka · Message Reliability · OpenResty
0 likes · 15 min read
Architecture Digest
Jun 21, 2019 · Backend Development

Design and Implementation of a High‑Availability Scalable IM Group‑Chat Messaging System

This article presents a comprehensive design and implementation of a high‑availability, horizontally scalable instant‑messaging group‑chat system, detailing its architecture, component interactions, scaling strategies, reliability mechanisms, and extensions for offline and single‑chat messaging.

IM · group chat · high availability
0 likes · 48 min read
58 Tech
Apr 9, 2019 · Backend Development

Optimizing Group Chat Performance in an Instant Messaging Backend

This article analyzes the challenges of scaling group chat in an instant messaging system and presents architectural optimizations—including shared message storage, periodic conversation list updates, offline count handling, and version‑based incremental sync—to reduce write and read amplification while improving overall performance.

Message Queue · Write Amplification · backend optimization
0 likes · 12 min read
MaGe Linux Operations
Apr 2, 2019 · Backend Development

Essential Design Principles for Building Scalable Microservices

This article outlines ten key design considerations for building robust microservice architectures, covering API gateways, stateless services, database scaling, caching, service decomposition, orchestration, configuration management, logging, resilience patterns, and comprehensive monitoring to ensure high availability and performance.

Microservices · Stateless Design · api-gateway
0 likes · 12 min read
Architecture Digest
Apr 2, 2019 · Databases

Designing Scalable Database Architecture for High‑Concurrency Systems

This article explains how to design a database architecture that can handle millions of daily active users and tens of thousands of concurrent requests by using multi‑server sharding, extensive table partitioning, read‑write separation, and distributed unique‑ID generation techniques such as Snowflake.

Read-Write Separation · Snowflake ID · database sharding
0 likes · 20 min read
iQIYI Technical Product Team
Mar 15, 2019 · Cloud Computing

Design and Architecture of QLive Large‑Scale Live Streaming Service

The QLive service powers iQIYI’s massive live‑streaming events, such as the Spring Festival Gala, by combining vertical and horizontal scaling, a three‑layer architecture with dual data‑center isolation, multi‑level caching, circuit‑breaker/degradation controls, and a Flume‑Kafka‑Hive monitoring pipeline to sustain over 400k QPS and 99.9999% availability.

Vertical Scaling · caching · fault tolerance
0 likes · 9 min read
Mike Chen's Internet Architecture
Nov 11, 2018 · Backend Development

Technical Architecture Behind Alibaba's Double 11 Flash Sale

The article analyzes the massive technical challenges of Alibaba's Double 11 flash‑sale event and explains how cloud elasticity, distributed messaging, containerization, real‑time data processing, AI, front‑end optimization, caching, monitoring, and database sharding together enable billions of transactions within minutes.

Alibaba · Double 11 · Message Queue
0 likes · 9 min read
Zhongtong Tech
Nov 2, 2018 · Backend Development

How to Build a High‑Availability, Scalable E‑Commerce Backend for Mega Sales

This article explains the architectural challenges of large‑scale e‑commerce platforms during massive promotional events and provides a detailed, layer‑by‑layer guide to designing a highly available, horizontally scalable, stateless micro‑service backend with robust data handling, caching, messaging, and traffic‑management strategies.

backend-development · e‑commerce · high availability
0 likes · 10 min read
Big Data and Microservices
Aug 22, 2018 · Industry Insights

Designing Scalable Internet Platforms: Key Subsystems and Best Practices

The article outlines the architecture of large‑scale internet application platforms, detailing essential subsystems such as web front‑ends, load balancing, database clusters, caching, distributed storage, server management, and code deployment, and explains how they work together to achieve high availability, performance, and scalability.

Deployment · caching · database clustering
0 likes · 8 min read
Architecture Digest
Jun 18, 2018 · Operations

Design and Optimization of Large‑Scale Log Systems

This article examines the challenges of handling massive log data in high‑traffic e‑commerce platforms and presents a comprehensive architecture, optimization strategies, and practical implementations—including Rsyslog, Kafka, Fluentd, and the ELK stack—to improve scalability, performance, and reliability of log management systems.

Big Data · ELK · Fluentd
0 likes · 17 min read
Tencent Cloud Developer
Jun 4, 2018 · Cloud Computing

In-depth Analysis and Practice of Tencent Cloud EB-level Object Storage Architecture

At the 2018 Tencent Cloud + Future summit, Liu Jinming detailed Tencent Cloud COS’s three‑tier EB‑level object storage architecture, covering the network, application, and data layers, and highlighted its 99.95% availability, 11‑nine durability, end‑to‑end encryption, scalable performance, tiered pricing, and real‑world media and security use cases.

Data Reliability · Tencent Cloud · object storage
0 likes · 10 min read
dbaplus Community
Apr 24, 2018 · Databases

Scaling Baidu’s TSDB to Trillions of Points: Elastic, High‑Performance Architecture

Baidu’s TSDB processes over 20 million data points per second per node and tens of thousands of queries per second cluster‑wide by employing a stateless read/write‑separated elastic architecture, multi‑layer storage across Redis, HBase and Hadoop, minute‑level geo‑redundant self‑healing, and a modified Gorilla compression that cuts storage by 80% with minimal CPU overhead.

Big Data · TSDB · Time Series Database
0 likes · 8 min read
Architecture Digest
Dec 22, 2017 · Big Data

Redesign and Optimization of the WeChat Pay Transaction Record System

This article presents a comprehensive case study of how WeChat Pay rebuilt its transaction record storage system to handle massive data volumes, improve performance, ensure data completeness, support flexible queries, and strengthen security through distributed key‑value storage, data partitioning, and operational safeguards.

Big Data · Data Partitioning · WeChat Pay
0 likes · 11 min read
360 Zhihui Cloud Developer
Dec 7, 2017 · Operations

How 360’s Private Cloud Powers Elasticsearch: Architecture, Security, and Scaling

This article explains how 360’s Hulk private cloud platform deploys Elasticsearch with a dedicated master architecture, load‑balancing, per‑business isolated clusters, SearchGuard security, dynamic tokenization, self‑service user features, and advanced monitoring to achieve high‑performance, scalable search services.

Elasticsearch · monitoring · private cloud
0 likes · 6 min read
21CTO
Nov 21, 2017 · Operations

How We Scaled WeChat Pay’s Transaction Records to Billions Daily

This article chronicles the evolution of WeChat Pay’s transaction record system—from early key/value storage bottlenecks and incomplete data to a distributed, tiered architecture that supports billions of daily records, improves query performance, ensures data security, and handles holiday traffic spikes through flexible throttling.

Distributed Systems · WeChat Pay · data security
0 likes · 11 min read
Didi Tech
Sep 22, 2017 · Operations

Didi’s “Jianmu” Build System: Architecture, Advantages, and Future Enhancements

Didi’s self‑developed “Jianmu” build system replaces Jenkins with a master‑slave architecture that colocates build scripts with source code. It offers stable, high‑performance, permission‑controlled builds across multiple OSes through lightweight scheduling, workspace reuse, two‑level caching, Docker image management, and an “Env As Code” approach; the roadmap includes auto‑upgrades, elastic scaling, and a further 30% efficiency gain.

Build System · Containerized Build · DevOps
0 likes · 9 min read
Efficient Ops
Aug 23, 2017 · Operations

Inside Tencent’s DevOps Pipeline: How Continuous Delivery Powers Scalable Operations

Tencent builds a complete DevOps pipeline using four platforms—TAPD, TGit, CIS, and ZhiYun—explaining the eight principles of continuous delivery, the four stages of operability, a three‑layer architecture, and showcasing ZhiYun’s configuration, automation, and self‑healing practices to deliver a systematic operations solution for enterprises.

Configuration Management · DevOps · scalable architecture
0 likes · 11 min read
ITFLY8 Architecture Home
Jul 15, 2017 · Databases

How to Design and Implement Horizontal Database Sharding: A Real‑World Case Study

This article presents a comprehensive analysis of horizontal database sharding, detailing the design decisions, partitioning dimensions, routing, pagination handling, lookup mapping, and deployment steps based on a real‑world implementation at a large e‑commerce platform, offering practical guidance for scaling order databases.

DAL · Oracle · e‑commerce
0 likes · 14 min read
21CTO
May 8, 2017 · Backend Development

How Facebook Live Scales to Millions: Inside Its Backend Architecture

This article explains how Facebook Live handles millions of concurrent streams and viewers by using a multi‑layer edge cache system, request merging, and load balancing to achieve high‑availability, low‑latency video delivery at massive scale.

Facebook Live · edge cache · load balancing
0 likes · 7 min read
MaGe Linux Operations
Feb 5, 2017 · Backend Development

How MogileFS Powers Scalable Distributed File Storage: Architecture & Deployment Guide

This article introduces the open‑source MogileFS distributed file system, explains its server, storage, and client components, outlines its key features and operating principles, and provides step‑by‑step installation, configuration, and Nginx reverse‑proxy load‑balancing instructions for large‑scale image storage.

Backend Storage · Distributed File System · MogileFS
0 likes · 6 min read
Qunar Tech Salon
Jan 16, 2017 · Backend Development

Scalable Web Architecture and Distributed Systems

This article explains the key design principles, components, and techniques—such as availability, performance, reliability, scalability, cost, redundancy, partitioning, caching, proxies, indexing, load balancing, and queuing—required to build large‑scale, high‑performance, and fault‑tolerant web and distributed systems, illustrated with an image‑hosting example.

Web Performance · caching · redundancy
0 likes · 37 min read
Architecture Digest
Sep 5, 2016 · Backend Development

Designing a Scalable E‑Commerce Architecture with SOA, Dubbo, and a Product Query DSL – Haier’s Experience

The article describes how Haier’s e‑commerce platform uses a service‑oriented architecture based on Dubbo, a fine‑grained product query DSL, and a highly scalable product service design to handle massive traffic spikes during major shopping events while maintaining performance, extensibility, and reliability.

DSL · Dubbo · Product Service
0 likes · 13 min read
Architect
Jun 26, 2016 · Backend Development

WhatsApp Architecture: High‑Scalability Design and Engineering Insights

The article analyzes WhatsApp's high‑scalability backend architecture, detailing its Erlang‑based server stack, massive user statistics, hardware deployment, custom protocols, performance‑tuning tools, and lessons learned from scaling to billions of messages with a tiny engineering team.

Backend Engineering · Erlang · WhatsApp
0 likes · 20 min read
Art of Distributed System Architecture Design
Jun 24, 2016 · Backend Development

Scalable Web Architecture for Startup Companies

This article explains how startup engineers can design and implement a scalable web architecture—covering server clustering, load balancing, distributed caching, database replication, and team organization—to handle rapid user growth without compromising performance or reliability.

Database Replication · distributed caching · load balancing
0 likes · 15 min read
21CTO
Jun 17, 2016 · Backend Development

How to Build a Scalable Web Architecture for Fast‑Growing Startups

This article explains how startup engineers can design a scalable web system by separating services onto multiple servers, using load balancers, distributed caches, master‑slave replication, and team‑splitting strategies, ensuring performance and reliability as user traffic and data volumes surge.

Database Replication · backend-development · distributed cache
0 likes · 15 min read
Hulu Beijing
May 18, 2016 · Backend Development

How Hulu Scaled Its Multi‑Device Video Platform with MPEG‑DASH

Presented by Hulu’s senior software development lead, this talk outlines the evolution of Hulu’s full‑platform video system—covering its origins, scaling challenges, multi‑device support, DRM complexities, and the innovative MPEG‑DASH architecture that powers seamless streaming across desktops, mobiles, and living‑room devices.

DRM · MPEG-DASH · Video Streaming
0 likes · 9 min read
Efficient Ops
Apr 13, 2016 · Operations

How Octopux Achieves 99.9% Bandwidth Monitoring Accuracy at Scale

Octopux is an open‑source bandwidth monitoring platform designed by Baishan Cloud to deliver 99.9% data integrity, cross‑operator and cross‑country coverage, minute‑level granularity, and horizontal scalability for tens of thousands of devices, addressing the limitations of traditional tools like Cacti.

InfluxDB · bandwidth monitoring · network operations
0 likes · 8 min read
21CTO
Mar 23, 2016 · Mobile Development

How to Build a Scalable Mobile E‑Commerce Architecture from Scratch

Learn a comprehensive framework for rapidly constructing a high‑scalability mobile e‑commerce system, covering hybrid app architecture, SOA backend design, container‑based virtualization, private‑cloud deployment, and practical strategies for handling massive traffic spikes during major sales events.

Hybrid App · SOA · containerization
0 likes · 18 min read
Qunar Tech Salon
Feb 21, 2016 · Backend Development

Evolution and Best Practices of Image Server Architecture for Large-Scale Websites

This article chronicles the progression of image server architectures—from single‑machine setups to clustered, shared‑storage, and CDN‑backed solutions—highlighting design pitfalls, scalability challenges, and practical recommendations for building reliable, high‑performance image services in modern web applications.

CDN · FastDFS · distributed storage
0 likes · 16 min read
Architecture Digest
Feb 16, 2016 · Backend Development

Scalable Web Architecture for Startup Companies

The article explains how internet startups can design and evolve a scalable web architecture—through service separation, clustering, load balancing, distributed caching, database replication, and effective team organization—to handle rapid user growth and avoid performance bottlenecks.

Database Replication · Web Development · caching
0 likes · 15 min read
21CTO
Feb 4, 2016 · Backend Development

Key Principles for Building Scalable Distributed Web Systems

This article outlines essential design principles for large‑scale web architectures—including availability, performance, reliability, scalability, manageability and cost—and demonstrates their application through a detailed image‑hosting service example, covering services, redundancy, partitioning, caching, proxies, indexing, load balancing, and queuing to achieve efficient, scalable data access.

Data Partitioning · Distributed Systems · caching
0 likes · 37 min read
21CTO
Jan 31, 2016 · Operations

Designing Scalable Image Servers: From Windows Monoliths to Cloud‑Native CDN Solutions

This article examines why Windows‑based image servers are often seen as conservative, outlines the limitations of single‑server and clustered designs, and presents modern, scalable architectures using virtual directories, shared storage, FastDFS, and CDN integration for high‑performance web applications.

CDN · FastDFS · Windows .NET
0 likes · 13 min read
21CTO
Jan 30, 2016 · Operations

Designing Scalable Dynamic Web Platforms: Key Subsystems and Best Practices

This article explains how large‑scale dynamic web applications are built from core subsystems—including web front‑end, load balancing, database clustering, caching, distributed storage, server management, and code deployment—to achieve reliability, scalability, and maintainability for high‑traffic sites.

caching · database clustering · distributed storage
0 likes · 7 min read
Architect
Jan 26, 2016 · Operations

Evolution of Image Server Architecture: From Single‑Node to Distributed File System and CDN

The article examines how large‑scale web sites handle massive image resources, tracing the progression from simple single‑machine storage to clustered virtual directories, shared UNC storage, and finally a FastDFS‑based distributed file system combined with CDN acceleration, highlighting the architectural trade‑offs and operational considerations.

CDN · FastDFS · Operations
0 likes · 13 min read
21CTO
Jan 20, 2016 · Operations

From Single‑Server to Distributed CDN: Evolving Image Server Architecture

This article traces the evolution of image server architectures—from a simple single‑directory setup on Windows/.NET, through cluster‑based real‑time synchronization and shared‑storage solutions, to modern FastDFS‑backed distributed file systems with CDN acceleration—highlighting scalability, reliability, and migration challenges.

CDN · FastDFS · backend operations
0 likes · 13 min read
High Availability Architecture
Jan 17, 2016 · Operations

High‑Availability Architecture and Scaling Practices of Meipai Short‑Video Platform

This article outlines Meipai’s evolution into a billion‑user short‑video platform, detailing its high‑availability, scalable architecture, service discovery via etcd, data storage challenges, CDN and cloud‑storage redundancy, fault‑tolerance mechanisms, and future directions such as H.265 adoption and P2P‑CDN hybrid delivery.

CDN · cloud storage · etcd
0 likes · 18 min read
21CTO
Jan 8, 2016 · Backend Development

How Didi Scaled Ride‑Hailing: LBS, Long‑Connection, and Real‑Time Data Solutions

Facing explosive traffic growth in 2014, Didi’s ride‑hailing platform tackled critical challenges by redesigning its LBS architecture, replacing unstable long‑connection services with an AIO‑based framework, partitioning databases, adopting Dubbo and RocketMQ for distributed processing, and building a real‑time monitoring and data center using Storm, HBase, and custom SQL‑to‑HBase translation.

Real-time Processing · Ride Hailing · database sharding
0 likes · 14 min read
ITPUB
Jan 4, 2016 · Backend Development

Designing a Scalable 100k-Server Monitoring System: Architecture and Lessons Learned

The article outlines the architecture, design principles, challenges, and performance optimizations of a large‑scale server monitoring system built for handling hundreds of gigabytes of data per day with high availability, low latency alerts, and multi‑platform support.

C Programming · monitoring · real-time alerts
0 likes · 11 min read
Architect
Dec 30, 2015 · Backend Development

Snapdeal Ads System Architecture: Scaling to 5 Billion Daily Requests

The article details how Snapdeal built a high‑performance, low‑latency ad‑serving platform that handles billions of daily requests by employing horizontal and vertical scaling, AP‑focused CAP choices, in‑memory data structures, and a suite of open‑source backend technologies.

Ad Tech · Backend · Distributed Systems
0 likes · 6 min read
21CTO
Nov 28, 2015 · Backend Development

Key Principles for Building Scalable Distributed Web Architectures

This article outlines essential design principles—availability, performance, reliability, scalability, manageability, and cost—and practical techniques such as service separation, redundancy, partitioning, caching, proxies, indexing, load balancing, and queuing to help engineers construct high‑performance, fault‑tolerant web systems.

backend design · caching · load balancing
0 likes · 36 min read
Java High-Performance Architecture
Oct 27, 2015 · Backend Development

How Consistent Hashing Powers Scalable Memcached Clusters

Caching dramatically improves website performance by storing data in memory for faster responses and reducing database load, with Memcached and Redis as popular solutions; proper routing algorithms like consistent hashing are essential to scale clusters without causing cache misses or service disruption.

Backend Performance · Memcached · caching
0 likes · 2 min read
21CTO
Oct 16, 2015 · Backend Development

Inside Uber's Real‑Time Dispatch: How the Company Scales Its Marketplace

This article details Uber's rapid growth and the engineering choices behind its real‑time dispatch platform, covering geospatial indexing, microservice architecture, scaling techniques like Ringpop and TChannel, and strategies for availability and fault tolerance.

Distributed Systems · Geospatial Indexing · Microservices
0 likes · 18 min read
21CTO
Sep 14, 2015 · Backend Development

Why Simple‑Looking Sites Like Taobao Need Hundreds of Top Engineers

Although sites like Taobao appear simple to users, they rely on massive distributed search, caching, storage, load‑balancing, CDN, logging, and data‑analysis systems that demand sophisticated backend engineering, massive infrastructure, and specialized algorithms, explaining why countless top engineers are required to keep them running.

Big Data · Distributed Systems · caching
0 likes · 12 min read