Tagged articles
1414 articles
21CTO
Oct 22, 2017 · Operations

How to Build Highly Available Systems: Fault Tolerance and Scalability Strategies

This article explains why high availability is critical for internet services, outlines key techniques such as stateless design, service discovery, heartbeat checks, idempotent operations, load balancing, throttling, caching, and micro‑service architecture, and discusses the operational challenges and monitoring tools needed to maintain resilient, scalable systems.

Idempotency, Microservices, Scalability
0 likes · 8 min read
Architecture Digest
Oct 22, 2017 · Operations

Ensuring High Availability in Internet Services: Stateless Design, Service Discovery, Idempotency, Rate Limiting, and Microservices

The article discusses how to achieve high availability for large‑scale internet services by adopting stateless architecture, service discovery and registration, heartbeat monitoring, idempotent design, retry mechanisms, rate limiting, caching, and micro‑service decomposition to handle machine failures, network glitches, and high concurrency.

Idempotency, Microservices, Scalability
0 likes · 9 min read
Architecture Digest
Oct 21, 2017 · Cloud Computing

High‑Performance Load Balancing Design and Implementation Using LVS and Tengine

This article reviews Alibaba Cloud's high‑performance load‑balancing solution, explaining the evolution from basic load‑balancing concepts to the architecture of LVS and Tengine, detailing their modes, optimizations, high‑availability designs across groups, AZs and regions, and outlining current use cases and future directions.

LVS, Tengine, cloud computing
0 likes · 12 min read
21CTO
Oct 15, 2017 · Operations

Mastering High Concurrency & High Availability: Core Principles for Scalable Systems

This article outlines essential principles for designing high‑concurrency and high‑availability systems, covering stateless architecture, service decomposition, caching strategies, message queues, data heterogeneity, degradation, rate limiting, traffic switching, rollback, and comprehensive business design rules such as idempotency, anti‑duplication, and documentation.

Backend Architecture, Scalability, System Design
0 likes · 12 min read
Architecture Digest
Oct 15, 2017 · Operations

High Concurrency and High Availability Design Principles

This article outlines essential high‑concurrency and high‑availability principles—including stateless design, service decomposition, caching strategies, message queues, data heterogeneity, degradation, rate limiting, traffic switching, and rollback mechanisms—to help architects build scalable, reliable, and resilient systems.

Scalability, System Design, architecture
0 likes · 12 min read
Architects' Tech Alliance
Oct 14, 2017 · Operations

How FCoE Unifies LAN and SAN: Design Benefits and Deployment Strategies

This article explains how integrating IP SAN and FC SAN with FCoE simplifies data‑center networks, reduces hardware and power consumption, improves flexibility and reliability, and details the deployment modes, access‑layer design, high‑availability considerations, and traffic models for a unified LAN/SAN architecture.

CNA, FCoE, LAN
0 likes · 12 min read
MaGe Linux Operations
Oct 11, 2017 · Operations

When Celebrities Crash Weibo: Inside the Ops Battle and Hybrid Cloud Solution

A sudden surge of traffic triggered by a celebrity relationship announcement caused a Weibo outage, prompting frantic reactions from developers, operations, and management, and leading to an in‑depth analysis of high‑availability architecture, elastic scaling, hybrid‑cloud DCP platforms, and Docker‑based service deployment.

Operations, high availability, hybrid cloud
0 likes · 19 min read
dbaplus Community
Oct 3, 2017 · Databases

What Oracle 18c Sharding and MySQL 8.0 Reveal About Modern Database Evolution

The article reviews Oracle's high‑availability and sharding advances in 18c, including RAC Sharding 2.0 and Active Data Guard features, and examines MySQL 8.0's major improvements such as the removal of MyISAM, data‑dictionary optimizations, self‑tuning, enhanced security, and new performance capabilities, while also comparing MySQL with PostgreSQL.

8.0, Oracle, high availability
0 likes · 12 min read
Tongcheng Travel Technology Center
Sep 26, 2017 · Operations

Design and Implementation of a High‑Performance, High‑Reliability Four‑Layer Load Balancer (TVS) with FullNat Model

The article describes the motivation, architecture, FullNat forwarding process, high‑availability mechanisms, and performance‑optimizing techniques of TVS, a custom Layer‑4 load‑balancing platform built to overcome LVS limitations and meet the company’s demanding traffic and reliability requirements.

Cluster Architecture, DPDK, FullNAT
0 likes · 11 min read
Architects' Tech Alliance
Sep 23, 2017 · Operations

Understanding EMC vPlex Dual‑Active Storage Architecture and Distributed Cache Mechanisms

The article explains EMC vPlex's evolution, dual‑active Metro and Geo configurations, distributed cache coherence, storage virtualization, and integration with RecoverPoint, highlighting how these technologies enable high‑availability, low‑latency data access across multiple data‑center sites.

Dual-Active, EMC vPlex, Storage Virtualization
0 likes · 17 min read
Architects' Tech Alliance
Sep 19, 2017 · Industry Insights

Unlocking Dual‑Active Storage: Inside HDS’s HAM and GAD High‑Availability Architecture

This article explains HDS’s High Availability Manager (HAM) and Global Active Device (GAD) technologies, detailing how they virtualize mirrored LUNs, use TrueCopy replication, arbitration mechanisms, and various network topologies to provide seamless active‑active storage, support for NAS, and flexible clustering across data centers.

GAD, HAM, HDS
0 likes · 11 min read
WeChat Backend Team
Sep 12, 2017 · Backend Development

How PhxQueue Achieves High‑Throughput, High‑Reliability Distributed Queuing with Paxos

PhxQueue, an open‑source, Paxos‑based distributed queue from WeChat, delivers at‑least‑once delivery, synchronous disk flushing, strict ordering, multi‑subscription, and high availability, outperforming Kafka in reliability and latency while maintaining comparable throughput, as demonstrated through detailed design, performance, and failover analyses.

Distributed Systems, Kafka, Paxos
0 likes · 26 min read
21CTO
Sep 6, 2017 · Cloud Computing

How Alibaba Cloud SLB Achieves High Availability Across Four Layers

This article explains Alibaba Cloud's Server Load Balancer (SLB) architecture and its four-tier high‑availability design—application processing, cluster forwarding, cross‑zone disaster recovery, and cross‑region disaster recovery—detailing both product features and user‑side best practices.

Alibaba Cloud, SLB, cloud computing
0 likes · 12 min read
dbaplus Community
Sep 5, 2017 · Big Data

Why Kafka Needs High Availability: Deep Dive into Replication and Leader Election

This article explains why Kafka introduced High Availability in version 0.8, covering the necessity of data replication and leader election, the internal replication and ACK mechanisms, ZooKeeper metadata structures, broker failover procedures, and the command‑line tools that help manage and rebalance a Kafka cluster.

Kafka, Replication, high availability
0 likes · 36 min read
Architecture Digest
Sep 2, 2017 · Big Data

Designing a High‑Availability, High‑Efficiency Distributed Scheduling Platform for Big Data

This article examines the principles, features, and implementation details of distributed scheduling for big‑data ETL pipelines, covering decentralised schedulers, host selection strategies, fault‑tolerance, operator abstraction, elasticity, trigger mechanisms, visual monitoring, alarm handling, data fan‑in/fan‑out, parameter consistency, real‑time quality checks, lineage tracking, and field‑level traceability.

Big Data, Data Lineage, Distributed Scheduling
0 likes · 23 min read
Architecture Digest
Sep 1, 2017 · Operations

Comprehensive Guide to Scalable Website Architecture from an Operations Perspective

This article presents a step‑by‑step operations‑focused roadmap for evolving a website from a single‑server prototype to a highly available, horizontally scalable architecture using load balancing, caching, database replication, service‑oriented design, DNS round‑robin, CDN, and disaster‑recovery techniques.

Database Replication, Operations, Scalability
0 likes · 10 min read
Efficient Ops
Aug 24, 2017 · Databases

Mastering Redis Disaster Recovery: Sentinel and Manual Failover Strategies

This article explains how to protect Redis deployments from single‑point failures by using master‑slave replication, manual failover procedures, and the automated high‑availability Sentinel solution, providing practical guidance for reliable disaster recovery in non‑clustered environments.

high availability, master-slave replication, redis
0 likes · 11 min read
UCloud Tech
Aug 24, 2017 · Databases

How UDB Achieves High Availability: Deep MySQL Replication Optimizations

This article explains UDB's high‑availability architecture, detailing its dual‑node design with virtual IP and HAProxy, and describes the kernel‑level optimizations applied to MySQL's native semi‑synchronous replication, relay log handling, master.info management, and lock contention to boost stability and performance.

Database Optimization, MySQL replication, UDB
0 likes · 7 min read
dbaplus Community
Aug 22, 2017 · Databases

How UDB Supercharges MySQL Replication with Deep Kernel Optimizations

This article details UDB's high‑availability architecture and four kernel‑level optimizations—binlog replication, relay‑log recording, master.info handling, and relay‑log locking—that together improve MySQL semi‑synchronous replication performance and reliability.

Database Optimization, Replication, Semi-synchronous
0 likes · 9 min read
dbaplus Community
Aug 14, 2017 · Databases

How Meituan‑Dianping Evolved MySQL HA: From MMM to MHA+Zebra and Beyond

This article traces Meituan‑Dianping's MySQL high‑availability journey, detailing the legacy MMM system, the transition to MHA, integrations with Zebra and Proxy middleware, current challenges, and future designs such as distributed agents, semi‑sync replication, and MySQL Group Replication.

Database Architecture, Distributed Systems, MHA
0 likes · 13 min read
Alibaba Cloud Developer
Aug 9, 2017 · Databases

How AliSQL X‑Cluster Achieves Strong Consistency and Global Scalability

AliSQL X‑Cluster is Alibaba's MySQL‑compatible distributed database that integrates the X‑Paxos consensus protocol to provide strong consistency, multi‑region deployment, low‑cost replica types, asynchronous transaction commit, hotspot‑update optimizations and superior performance compared with native MySQL and Group Replication, while offering flexible online configuration and robust failover mechanisms.

Cross-Region Deployment, consensus protocol, distributed databases
0 likes · 28 min read
dbaplus Community
Aug 7, 2017 · Databases

Mastering MySQL Architecture: Standards, HA, Sharding, and Redis Integration

This article outlines comprehensive MySQL best practices, covering development and operational standards, high‑availability architecture choices such as Keepalived, MHA and Percona XtraDB Cluster, sharding strategies (vertical and horizontal), and how Redis can be leveraged to offload read pressure.

Database design, high availability, mysql
0 likes · 19 min read
Architecture Digest
Aug 7, 2017 · Operations

Website Availability and High‑Availability Architecture Overview

This article explains website availability metrics, fault‑weight scoring, layered high‑availability architecture, session management strategies, reusable service design, data redundancy, quality assurance processes, and monitoring practices essential for maintaining reliable large‑scale web systems.

Availability, Operations, Session Management
0 likes · 9 min read
High Availability Architecture
Aug 7, 2017 · Backend Development

Highlights of the CCF TF Architecture SIG Microservices Practice Seminar

The CCF TF Architecture SIG hosted a densely attended microservices practice seminar in Beijing, featuring leading experts from 25 top tech companies who shared deep insights on service discovery, high‑availability architectures, Spring Cloud adoption, and large‑scale microservice frameworks such as Vintage, OCTO, and rest.li.

Backend, architecture, high availability
0 likes · 8 min read
WeChat Backend Team
Aug 3, 2017 · Databases

How PhxSQL Achieves Strong Consistency and High Availability for MySQL

This article explains the design and implementation of PhxSQL, a MySQL‑compatible high‑availability solution that uses a reliable log storage based on Paxos, Proxy request forwarding, automatic master election, and other mechanisms to overcome native MySQL replication flaws and provide strong data consistency and fault‑tolerant performance.

Database Replication, Distributed Systems, Paxos
0 likes · 17 min read
ITFLY8 Architecture Home
Jul 30, 2017 · Backend Development

Memcached Slab Allocator Explained: Memory Management & Scaling

This article explains Memcached's slab allocator memory management, key concepts like items, chunks, slab classes and pages, the slab calcification problem, and how master‑slave double‑layer and L1 cache architectures enable high concurrency, high availability, and linear scaling.

Slab Allocator, caching, high availability
0 likes · 12 min read
dbaplus Community
Jul 26, 2017 · Databases

How Ele.me Achieved Sub‑Second MySQL Multi‑Active Replication with DRC

This article details Ele.me's design and implementation of a MySQL bidirectional replication component (DRC) that enables sub‑second, high‑throughput data synchronization across Beijing and Shanghai data centers, addressing latency, consistency, and failover challenges in a multi‑active environment.

Distributed Systems, data replication, database-consistency
0 likes · 18 min read
Meituan Technology Team
Jun 30, 2017 · Operations

How Meituan‑Dianping Evolved MySQL HA from MMM to MHA‑Zebra and Beyond

This article traces Meituan‑Dianping's MySQL high‑availability journey from the early MMM replication manager to the modern MHA‑Zebra and MHA‑Proxy solutions, compares each architecture, highlights their shortcomings, and outlines future directions such as distributed agents, semi‑sync replication, and Paxos‑based MySQL Group Replication.

Database Architecture, Distributed Systems, MHA
0 likes · 12 min read
ITFLY8 Architecture Home
Jun 26, 2017 · Databases

Building Scalable MySQL HA: From MHA to 7‑Layer Proxy and RDS

After initially focusing on a distributed MySQL system, the author describes why open‑source HA solutions like MHA were unsuitable, then details the design and implementation of a Layer‑4 NAT‑based proxy (RDS) and a more advanced Layer‑7 application‑level proxy, highlighting features such as authentication, load balancing, read/write splitting, and multi‑datacenter awareness.

Proxy, RDS, database
0 likes · 11 min read
dbaplus Community
Jun 18, 2017 · Fundamentals

Demystifying Paxos: How Distributed Systems Achieve Consensus

This article explains why Paxos is needed for consistency in distributed systems, details its roles and three-phase protocol, illustrates the algorithm with a real‑world analogy, and shows how Paxos underpins high‑availability database replication such as MySQL binlog synchronization.

Database Replication, Paxos, distributed consensus
0 likes · 13 min read
21CTO
Jun 18, 2017 · Databases

Mastering Redis High Availability: Sentinel, VIP, and Cluster Strategies

This article explains why Redis high‑availability is essential, details the inner workings of Redis Sentinel, compares several HA architectures—including Sentinel with DNS or VIP, client‑side Sentinel, Keepalived/HAProxy, Redis Cluster, Twemproxy, and Codis—lists their pros and cons, and shares practical best‑practice recommendations for building reliable Redis deployments.

Cluster, database, high availability
0 likes · 19 min read
Architecture Digest
Jun 2, 2017 · Backend Development

Evolution, Architecture, Performance, Scalability, and Security of Large-Scale Websites

This article provides a comprehensive overview of large‑scale website architecture, covering key metrics, evolutionary stages, core design patterns, performance testing, high‑availability strategies, scalability techniques, and security measures essential for building and operating robust web systems.

Scalability, architecture, high availability
0 likes · 20 min read
ITFLY8 Architecture Home
May 30, 2017 · Fundamentals

CAP Theory, Shared‑Nothing, Load Balancing & High Availability Explained

This article explores core distributed system design principles, detailing the CAP theorem and its implications, the BASE extension, shared‑nothing architecture, various load‑balancing algorithms and deployment modes, as well as high‑availability strategies such as active‑standby, active‑active, and clustering to eliminate single points of failure.

CAP theorem, Distributed Systems, high availability
0 likes · 18 min read
dbaplus Community
May 24, 2017 · Operations

How to Replace a ZooKeeper Node in a 5‑Node Cluster Without Downtime

This guide details the step‑by‑step process for replacing a faulty ZooKeeper node (myid 5) in a five‑node cluster, covering configuration updates in zoo.cfg, Hadoop’s hdfs‑site.xml, yarn‑site.xml, hbase‑site.xml, and the required service restarts to ensure continuous high‑availability.

Configuration, HBase, Hadoop
0 likes · 10 min read
ITPUB
May 24, 2017 · Databases

How to Build a Redis High‑Availability Cluster with Sentinel and VIP

This guide walks through setting up a Redis high‑availability solution using master‑slave replication, Redis Sentinel for automatic failover, and a floating VIP to provide a stable endpoint, covering environment preparation, configuration files, firewall rules, testing, and client integration.

Linux, failover, high availability
0 likes · 10 min read
Qunar Tech Salon
May 12, 2017 · Databases

High Availability Solutions for MySQL and UDB: Techniques and Case Study

The article explains high‑availability concepts, compares typical MySQL HA architectures—including replication, clustering, and Paxos‑based solutions—and presents UDB’s dual‑master semi‑synchronous design with a Proxy layer that ensures automatic failover, data consistency, and operational resilience.

Proxy, Replication, UDB
0 likes · 12 min read
MaGe Linux Operations
May 5, 2017 · Backend Development

How Meituan Scaled Its Food‑Delivery Order System to Millions of Daily Orders

This article chronicles the evolution of Meituan's food‑delivery order system from a simple modular prototype to a distributed, high‑performance, highly available architecture, detailing the business characteristics, architectural milestones, performance optimizations, consistency safeguards, scalability techniques, and intelligent operations that enable handling millions of orders per day.

Distributed Systems, Scalability, high availability
0 likes · 20 min read
Alibaba Cloud Developer
May 4, 2017 · Operations

From Taobao to the Cloud: Proven High‑Availability Strategies for Massive Traffic

In this talk, Alibaba expert Mu Jian shares how the massive Taobao e‑commerce platform achieved high availability through layered networking, cache design, OS‑level tuning, rate limiting, disaster‑recovery planning, and cloud‑native architectures, offering practical guidance for building resilient systems at scale.

Alibaba, caching, cloud architecture
0 likes · 19 min read
Meituan Technology Team
Apr 21, 2017 · Backend Development

Design and Implementation of Meituan's Distributed ID Generation System Leaf

Meituan’s Leaf system merges segment‑based caching with Snowflake‑style bit fields to deliver globally unique, trend‑increasing 64‑bit IDs at ultra‑low latency, using double‑buffered DB segments, master‑slave MySQL replication, ZooKeeper‑assigned worker IDs, and clock‑rollback safeguards, achieving ~50 k QPS and 1 ms 99.9th‑percentile response across billions of daily IDs.

Backend, Distributed Systems, ID generation
0 likes · 18 min read
dbaplus Community
Apr 18, 2017 · Databases

How to Build a Reliable Redis Sentinel HA Setup and Fix Password Auto‑Rewrite

This article explains how to deploy a Redis master‑slave cluster with Sentinel for high availability, details application configuration, highlights a subtle issue where Redis passwords are automatically rewritten after failover, and provides three remediation approaches including a source‑code patch with GitHub links.

Configuration, high availability, redis
0 likes · 7 min read
High Availability Architecture
Apr 13, 2017 · Backend Development

Designing a High‑Availability Advertising System: Architecture, Scaling, and Real‑Time Monitoring at Weibo

This article examines the architecture of Weibo's high‑availability advertising platform, covering match service design with OpenResty, index sharding, business logic optimization, dynamic auto‑scaling, and a real‑time monitoring pipeline to ensure stable, high‑performance ad delivery at massive scale.

Advertising, Backend Architecture, OpenResty
0 likes · 11 min read
ITFLY8 Architecture Home
Apr 11, 2017 · Databases

Why High Availability Triggers a Consistency‑Performance Trade‑off in Distributed Databases

The article explains how achieving high availability through data redundancy introduces consistency challenges that in turn affect performance, and it reviews partitioning, mirroring, consistency models, replication architectures, and two/three‑phase commit protocols in distributed systems.

Data Consistency, Distributed Systems, Replication
0 likes · 18 min read
Alibaba Cloud Developer
Apr 6, 2017 · Backend Development

How Alibaba Scaled GitLab to Support Millions of Users with Sharding and High‑Availability

This article details Alibaba Group's journey of transforming its GitLab deployment from a single‑node setup to a distributed, sharded architecture that handles tens of millions of daily requests, achieves near‑perfect reliability, and incorporates performance, monitoring, and disaster‑recovery innovations.

GitLab, high availability, monitoring
0 likes · 15 min read
ITFLY8 Architecture Home
Mar 27, 2017 · Cloud Native

How Microservice Architecture Powers Scalable Smart Campus Platforms

This article explains how a decentralized microservice and SOA architecture, combined with cloud deployment, service registration, gateways, and unified APIs, enables high‑performance, high‑availability, and low‑coupling smart campus systems that support both mobile and PC applications while simplifying development, testing, and operations.

Microservices, Scalable Systems, cloud-native
0 likes · 22 min read
Baidu Intelligent Testing
Mar 22, 2017 · Operations

Load Balancing: Concepts, Mechanisms, and Enterprise Practices

This article explains the principles of load balancing, distinguishes stateless service and stateful data balancing, describes DNS, hardware and software solutions such as F5, HAProxy, LVS, and GSLB, and illustrates real‑world implementations at Alibaba and Tencent while offering practical guidance on sharding, caching, and fault tolerance.

DNS, GSLB, HAProxy
0 likes · 18 min read
ITPUB
Mar 15, 2017 · Operations

Mastering LVS: Complete Guide to Linux Load Balancing (NAT, DR, TUN) and HA

This article provides a comprehensive overview of Linux Virtual Server (LVS) load‑balancing clusters, detailing core concepts, packet‑flow mechanisms, key terminology, NAT/DR/TUN modes, scheduling algorithms, step‑by‑step configuration scripts, and high‑availability setup with keepalived, all illustrated with diagrams and command examples.

LVS, Linux Virtual Server, Networking
0 likes · 22 min read
Architecture Digest
Mar 7, 2017 · Backend Development

Load Balancing Layer Design Scenarios and Solution Architectures

This article examines various business load scenarios for a logistics management system and presents four progressive load‑balancing architectures—ranging from simple Nginx/HAProxy to DNS round‑robin with LVS and Keepalived—while defining key performance terms and outlining future discussion topics.

Backend Architecture, LVS, Scalability
0 likes · 12 min read
Tencent Cloud Developer
Feb 22, 2017 · Databases

Building a SQL Server Failover Cluster on QCloud – Final Guide

This guide walks through building a SQL Server Failover Cluster on QCloud, covering architecture choices, network layout, required roles like DTC, step‑by‑step installation on two nodes, configuration of virtual IPs, and recommendations to prefer AlwaysOn or PaaS solutions for production reliability.

AlwaysOn, Failover Cluster, QCloud
0 likes · 9 min read
Tencent Cloud Developer
Feb 21, 2017 · Databases

Building SQL Cluster on Tencent Cloud: Step-by-Step Configuration Guide

This step‑by‑step guide walks readers through securely building a four‑node SQL Server Failover Cluster on Tencent Cloud—covering jump‑server setup, domain controller and storage gateway configuration, AD account creation, internal CLB and DNS setup, network and iSCSI storage configuration, CSV and quorum settings, and final failover verification.

Failover Cluster, Network Configuration, SQL Cluster
0 likes · 5 min read
Efficient Ops
Feb 12, 2017 · Cloud Computing

How Hengfeng Bank Built a High‑Availability OpenStack Cloud for Financial Services

This article details Hengfeng Bank's practical experience with OpenStack, covering why the bank chose the open‑source cloud platform, its multi‑site deployment architecture, high‑availability design, management practices, and lessons learned from operating a large‑scale financial cloud environment.

Financial Services, Infrastructure Automation, OpenStack
0 likes · 19 min read
ITFLY8 Architecture Home
Feb 7, 2017 · Operations

Master System Architecture: CAP Theory, Shared‑Nothing, Load Balancing & HA

This article explores core system architecture concepts—including the CAP theorem and its BASE extension, the shared‑nothing design, various load‑balancing algorithms and deployment modes, and high‑availability patterns such as active‑standby, active‑active and clustering—providing practical guidance for building scalable, reliable distributed applications.

CAP theorem, Distributed Systems, high availability
0 likes · 22 min read
Architecture Digest
Feb 6, 2017 · Backend Development

Key Elements and Evolution of Large‑Scale Website Architecture

This article summarizes the evolution, patterns, and five core factors—performance, availability, scalability, extensibility, and security—of large‑scale website architecture, covering server tiers, caching, clustering, load balancing, data redundancy, and security measures.

Scalability, caching, high availability
0 likes · 13 min read
Architecture Digest
Jan 23, 2017 · Backend Development

Design and Implementation of the Diablo Distributed Configuration Management Platform

The article introduces the need for a distributed configuration platform, outlines its typical scenarios and essential characteristics, and details the lightweight open‑source implementation Diablo—including its architecture, data model, Redis storage, client behavior, real‑time update mechanisms, and deployment recommendations.

Backend, Configuration Center, Distributed Configuration
0 likes · 10 min read
Architecture Digest
Jan 21, 2017 · Backend Development

Evolution and Best Practices of the Qinglong Logistics System Architecture

The article chronicles the Qinglong logistics platform from its 2012 MVP launch through successive versions to a smart‑logistics system, detailing architectural evolution, high‑availability, performance, data‑consistency strategies, and user‑experience practices that underpin large‑scale backend development.

Backend, Data Consistency, Logistics
0 likes · 16 min read
MaGe Linux Operations
Jan 18, 2017 · Databases

What Does China’s 2016 Oracle Database Landscape Reveal?

The 2016 China Oracle Database Usage Report, based on health‑check data from 1,841 instances across 18 industries, provides a multi‑dimensional analysis of version distribution, operating system choices, host and storage configurations, data scale, high‑availability setups, and top failure types, offering a comprehensive view of Oracle adoption and challenges in the Chinese market.

China, Oracle, Usage Report
0 likes · 10 min read
Efficient Ops
Jan 17, 2017 · Databases

Inside WeChat Pay: Scaling MySQL for Millions of Payments per Second

Zhou Tang, head of WeChat Pay operations at Tencent, shares how his team built a massive MySQL‑based payment platform handling up to 150 k transactions per second, covering background, DB‑CMDB design, change management, monitoring, security, high availability, and why Golang became their core development language.

Database operations, Golang, WeChat Pay
0 likes · 21 min read
Alibaba Cloud Developer
Jan 16, 2017 · Databases

AliCloudDB’s Secrets for Scaling During Double‑11 Traffic

This article explains how AliCloudDB supports the massive traffic of Alibaba’s Double‑11 shopping festival through elastic scaling (both in‑place and cross‑machine upgrades), secure and standard access paths, robust architecture design, read‑write separation, engine and index optimization, high‑availability configurations, performance tuning, and disaster‑recovery strategies.

AliCloudDB, elastic scaling, high availability
0 likes · 12 min read
dbaplus Community
Jan 12, 2017 · Databases

Mastering MySQL Group Replication: Full‑Sync Architecture, Benefits, and Step‑by‑Step Setup

This article explains MySQL's asynchronous, semi‑synchronous, and Group Replication mechanisms, compares their trade‑offs, details Group Replication’s certification‑based full‑sync workflow, lists its features and limitations, and provides a complete configuration and maintenance guide for a three‑node cluster.

Cluster, Configuration, Group Replication
0 likes · 13 min read
Meituan Technology Team
Jan 5, 2017 · Databases

Inside DBProxy: Open‑Source High‑Availability Database Middleware and Its Key Enhancements

DBProxy, an open‑source MySQL middleware derived from Atlas, offers horizontal scaling, read/write separation, load balancing, advanced blacklist filtering, dynamic configuration, and numerous bug fixes, positioning it as a mature, high‑availability solution for enterprise database environments.

DBProxy, Database Middleware, high availability
0 likes · 12 min read
Architecture Digest
Dec 30, 2016 · Operations

Zero‑Point Battle: Evolution of Alibaba's Double 11 High‑Availability Architecture

The talk details how Alibaba tackled the massive technical challenges of Double 11 over eight years by evolving a highly available, scalable architecture through capacity planning, distributed middleware, hybrid‑cloud deployment, online stress testing, and fine‑grained traffic control to balance cost, performance, and user experience.

Alibaba, Distributed Systems, Double 11
0 likes · 22 min read
dbaplus Community
Dec 26, 2016 · Databases

How to Build a Scalable, Automated MySQL Operations Platform

This article explains how to standardize and automate MySQL management at scale, covering dedicated instance deployment, configuration consistency, multi‑instance creation, metadata collection, backup, monitoring, high‑availability with Zookeeper, and task orchestration using DBTask to achieve rapid, reliable database services.

DBTask, Database operations, ZooKeeper
0 likes · 12 min read
MaGe Linux Operations
Dec 26, 2016 · Databases

Mastering MySQL High Availability with MHA: Step‑By‑Step Setup Guide

This article introduces MHA (Master High Availability) for MySQL, explains its architecture, outlines required hardware and software configurations, provides detailed commands to set up master and slave nodes, create configuration files, and demonstrates how to start and verify the high‑availability cluster.

Database Replication, Linux, MHA
0 likes · 8 min read
Qunar Tech Salon
Dec 22, 2016 · Backend Development

Design and Implementation of a VOIP Solution for Overseas Travelers Using Asterisk and Kamailio

This article presents a comprehensive guide on building a VOIP service for overseas users, covering VOIP fundamentals, open‑source PBX selection, SIP client libraries, demo deployment, load‑balancing with Kamailio, high‑availability via Keepalived, NAT handling, TLS/SRTP support, and troubleshooting techniques.

Kamailio, Linux, SIP
0 likes · 14 min read
Weidian Tech Team
Dec 15, 2016 · Databases

How to Build a Scalable Automated MySQL Operations Platform

This article explains how to standardize and automate MySQL operations—including multi‑instance deployment, metadata collection, monitoring, backup, and high‑availability using Zookeeper—so that large‑scale database services can be provisioned, managed, and scaled with minimal human intervention.

Backup, Database operations, high availability
0 likes · 11 min read
Ctrip Technology
Dec 9, 2016 · Operations

Design and Implementation of Ctrip Call Center's Active‑Active Architecture and Unified Login

The article details Ctrip's call‑center architecture evolution, describing the multi‑layer active‑active design, public access, application and client layers, unified login mechanisms, operational challenges, disaster‑recovery drills, and future plans for software‑only and mobile agents, illustrating practical SRE principles in a large‑scale telephony system.

Active-Active, IP phone, SRE
0 likes · 22 min read
dbaplus Community
Nov 22, 2016 · Databases

How to Add a ‘STOP ALL SLAVES’ Command to MySQL 5.6 Source Code

This guide walks through extending MySQL 5.6.32 by adding a new SQLCOM_STOP_SLAVES command that stops all replication slaves, detailing modifications to lex.h, sql_cmd.h, sql_yacc.yy, sql_parse.cc, and mysqld.cc, along with compilation tips and troubleshooting steps.

Replication, SQL Command, c++
0 likes · 7 min read
WeChat Backend Team
Nov 19, 2016 · Databases

How PhxSQL Achieves MySQL-Compatible High Availability with Strong Consistency

PhxSQL is an open‑source, MySQL‑compatible relational database cluster that provides high availability and strong data consistency through a single‑master multi‑slave architecture, automatically switching masters when over half the nodes are alive, without relying on external services like Zookeeper, and requiring no code changes for migration.

Database Cluster, PhxSQL, high availability
0 likes · 6 min read
360 Zhihui Cloud Developer
Nov 15, 2016 · Databases

How to Set Up Redis Master‑Slave Replication in Minutes

This guide walks you through configuring a simple Redis master‑slave setup, covering the benefits, step‑by‑step file modifications, essential replication directives, testing procedures, and common pitfalls to ensure high availability and read/write separation.

Configuration, database, high availability
0 likes · 5 min read
Architecture Digest
Nov 11, 2016 · Backend Development

High‑Availability Architecture Sessions at the China Software Developers Conference (Nov 18‑20)

The conference featured a series of high‑availability architecture talks covering performance‑driven design, RPC framework resilience, big‑data platform evolution, MySQL cluster consistency, and cloud infrastructure best practices, presented by experts from 58.com, Alibaba, Tencent, Baidu, and others.

Backend Architecture, Big Data, RPC
0 likes · 10 min read
Architecture Digest
Nov 10, 2016 · Operations

Interview with Lu Pengcheng on Mogu Street’s Monitoring System Architecture and Evolution

In this interview, Lu Pengcheng, a platform architect at Mogu Street, discusses the company’s large‑scale e‑commerce architecture, the evolution of its monitoring platform, design choices for high‑availability distributed systems, and future open‑source plans, providing practical insights for engineers and technical managers.

C++, Distributed Systems, Operations
0 likes · 9 min read
ITFLY8 Architecture Home
Nov 5, 2016 · Operations

Distributed vs Cluster: What’s the Real Difference and When to Use Each?

This article explains the core differences between distributed systems and clusters, detailing their architectures, efficiency goals, typical use cases such as Hadoop MapReduce and load‑balancing clusters, and outlines key concepts like scalability, high availability, load balancing, and error recovery.

Cluster Computing, Distributed Systems, HPC
0 likes · 10 min read
ITFLY8 Architecture Home
Oct 24, 2016 · Databases

Mastering Database Sharding: Strategies for Scaling High‑Traffic Applications

This article explains the fundamentals of database sharding, including horizontal partitioning concepts, routing rules, and various sharding strategies such as range, hash, and mapping tables, and how clustering, load balancing, and read/write separation improve scalability, availability, and performance for large‑scale internet applications.

Read-Write Separation, high availability, horizontal scaling
0 likes · 16 min read
ITFLY8 Architecture Home
Oct 15, 2016 · Operations

How E‑Commerce Platforms Achieve High Availability and Scalability: Architecture Practices

This article outlines comprehensive e‑commerce platform architecture practices—including caching strategies, indexing, parallel and distributed computing, load balancing, sharding, high availability, monitoring, resource optimization, and messaging—to improve system performance, scalability, and reliability under high concurrency.

Distributed Systems, architecture, caching
0 likes · 28 min read
ITFLY8 Architecture Home
Oct 11, 2016 · Operations

How to Gracefully Degrade Services When Server Load Spikes

This article explains various service degradation strategies—including interface and page refusal, delayed persistence, and persistent‑layer restrictions—along with management approaches and implementation points such as middleware control, NGINX+LUA page blocking, and data‑operation rules, to keep core functions running under high server pressure.

Operations, asynchronous queue, caching
0 likes · 4 min read
ITPUB
Oct 4, 2016 · Operations

How to Build a Resilient High‑Traffic Website: A Complete Operations Guide

This guide outlines a step‑by‑step strategy for designing a highly available, secure, and scalable website architecture, covering domain acquisition, CDN deployment, image caching, data center selection, monitoring, DDoS mitigation, redundancy, server configuration, database replication, testing environments, and operational best practices.

Operations, high availability, security
0 likes · 14 min read
Meituan Technology Team
Sep 23, 2016 · Databases

Database Automation Platform at Meituan: Architecture, Practices, and Lessons

Meituan’s Database Automation Platform evolved from a simple Django‑based 1.0 system into a modular 2.0 architecture using CMDB, RabbitMQ, and Celery, standardizing metadata, providing self‑service APIs, and automating tasks such as cluster provisioning, online schema changes, and high‑availability failover, now handling hundreds of daily operations for over a thousand developers while planning further component‑library refactoring.

CMDB, DevOps, database automation
0 likes · 27 min read
dbaplus Community
Sep 20, 2016 · Databases

Upgrading SQL Server 2008 R2 to 2014 with AlwaysOn: A Hands‑On High‑Availability Guide

This article walks through a real‑world upgrade from SQL Server 2008 R2 to 2014, detailing background analysis, data collection, solution design, detailed investigation, testing, performance baselines, side‑by‑side migration, cluster setup, application changes, and common pitfalls such as CDC issues, index‑rebuild logging, and query slowdowns.

AlwaysOn, Performance Testing, SQL Server
0 likes · 13 min read
Baidu Maps Tech Team
Sep 20, 2016 · Databases

Baidu Maps Reverse Geocoding: Grid Indexing & Incremental Updates

This article explains how Baidu Maps’ reverse‑geocoding service converts coordinates into addresses using point, line, and polygon data mapped onto a grid index, describes the incremental indexing mechanism that enables rapid data updates, and highlights the system’s high availability and performance characteristics.

Spatial Data, grid indexing, high availability
0 likes · 6 min read