Tagged articles
Tencent Cloud Developer
Nov 6, 2018 · Databases

Design and Implementation of High‑Availability MySQL

The talk by Tencent’s senior MySQL engineer explains high‑availability concepts—99.95% uptime, RPO/RTO metrics, backup methods, and replication modes—while comparing single‑node, shared‑storage, and share‑nothing architectures, detailing failover tools (Keepalived, MMM, MHA), cluster solutions (PXC, MGC, Group Replication) and NewSQL examples such as Aurora, PolarDB and CynosDB.

Backup · Database Architecture · RPO
0 likes · 15 min read
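As a rough illustration of what the 99.95% uptime target mentioned in the summary implies, the following minimal sketch converts an availability percentage into a downtime budget. The function name and the 365-day year are assumptions made for this example; they are not taken from the talk.

```python
# Assumption for illustration: a non-leap year of 365 days.
HOURS_PER_YEAR = 365 * 24


def allowed_downtime_hours(availability_percent: float,
                           period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the downtime budget, in hours, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be between 0 and 100")
    # The unavailable fraction of the period is (100 - availability) / 100.
    return period_hours * (100 - availability_percent) / 100


if __name__ == "__main__":
    hours = allowed_downtime_hours(99.95)
    print(f"99.95% availability allows about {hours:.2f} hours of downtime per year")
```

Under these assumptions, 99.95% availability leaves a budget of roughly 4.4 hours of downtime per year.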
Architects' Tech Alliance
Nov 5, 2018 · Operations

Load Balancing: Concepts, Types, Advantages, and Algorithms

This article explains load balancing as a clustering technology that distributes network services across multiple devices or links to improve performance, scalability, reliability, and manageability, and it details various types, strategies, and algorithms used in modern networks.

Algorithms · Scalability · Server
0 likes · 13 min read
Zhongtong Tech
Nov 2, 2018 · Backend Development

How to Build a High‑Availability, Scalable E‑Commerce Backend for Mega Sales

This article explains the architectural challenges of large‑scale e‑commerce platforms during massive promotional events and provides a detailed, layer‑by‑layer guide to designing a highly available, horizontally scalable, stateless micro‑service backend with robust data handling, caching, messaging, and traffic‑management strategies.

backend-development · e‑commerce · high availability
0 likes · 10 min read
AntTech
Nov 2, 2018 · Information Security

Ant Group’s TRaaS: A Technological Risk‑Defense Platform for Financial Systems

Ant Group unveiled TRaaS (Technological Risk‑defense as a Service), a comprehensive platform that combines high‑availability, real‑time fund reconciliation and AI‑driven self‑healing capabilities to protect large‑scale financial systems against technical risks.

Distributed Systems · TRaaS · aiops
0 likes · 10 min read
DataFunTalk
Oct 19, 2018 · Databases

HBase Application and High‑Availability Practices

This article summarizes the current usage of HBase at Ping An Technology, the challenges it addresses, detailed client‑ and server‑side performance and stability optimizations, high‑availability mechanisms, data migration strategies, monitoring and repair practices, and future development plans.

Data Migration · HBase · high availability
0 likes · 9 min read
JD Tech
Oct 10, 2018 · Backend Development

Design and Architecture of JD's Virtual Order Center (Hamal)

The article explains the architecture and core mechanisms of JD's Virtual Order Center, describing how the Hamal service leverages MySQL binlog listening, Zookeeper coordination, fast TCP‑based consumption, read‑write separation, and multi‑level search to reliably process billions of virtual orders.

Backend · Binlog · data pipeline
0 likes · 7 min read
AntTech
Oct 9, 2018 · Cloud Computing

Technical Analysis of OceanBase Cloud Platform (OCP) 2.0 Architecture and Solutions

The article provides a comprehensive technical overview of OceanBase Cloud Platform (OCP) 2.0, detailing its redesigned architecture, reduced deployment complexity, high‑availability features, unified resource scheduling, monitoring, diagnostics, and how these innovations address infrastructure and business challenges while lowering costs.

Infrastructure · OCP 2.0 · OceanBase
0 likes · 11 min read
dbaplus Community
Oct 7, 2018 · Databases

How Alibaba Cloud Scaled SQL Server with AlwaysOn for Read/Write Separation

This article details Alibaba Cloud's evolution of SQL Server RDS, covering product growth, the challenges of read/write separation, technical evaluations of AlwaysOn versus Transactional Replication, cloud architecture iterations, and the final productized solution for high‑availability database services.

Alibaba Cloud · AlwaysOn · Database Replication
0 likes · 12 min read
DataFunTalk
Sep 29, 2018 · Big Data

Applying HBase in a Risk‑Control System and High‑Availability Practices

This article summarizes Guo Dongdong’s presentation on leveraging HBase for a risk‑control platform, detailing its architecture, data import/export mechanisms, indexing, region server recovery challenges, monitoring, SQL interception, dual‑cluster high‑availability, and future enhancements for large‑scale, low‑latency big‑data services.

Distributed Systems · HBase · Phoenix
0 likes · 13 min read
AntTech
Sep 25, 2018 · Databases

OceanBase 2.0 Release: Technical Overview and Innovations

The article presents a comprehensive technical overview of OceanBase 2.0, detailing its evolution from a single‑node financial database to a distributed system, the three major migration challenges, new features such as global snapshots, global indexes, load‑balancing, high‑availability mechanisms, operability enhancements, performance improvements, and compatibility extensions, all illustrated with real‑world financial use cases and the upcoming Double‑Eleven stress test.

Compatibility · Financial Services · Global Snapshot
0 likes · 19 min read
iQIYI Technical Product Team
Sep 21, 2018 · Databases

Case Study of iQIYI’s Adoption of TiDB for Scalable High‑Availability Database Services

iQIYI migrated its critical Edge Control, Video Transcoding, and User Login services from MySQL to TiDB, gaining automatic sharding, high‑availability multi‑datacenter replication, and stable query performance that eliminated storage bottlenecks, complex sharding logic and frequent downtime, while enabling future OLTP/OLAP integration.

Data Migration · Scalability · TiDB
0 likes · 10 min read
HomeTech
Sep 18, 2018 · Backend Development

Design and Implementation of a High‑Availability Red Envelope (Coupon) System

This article presents a comprehensive design of a high‑availability red‑envelope system, covering its data model, generation, distribution, verification, query optimization, and product tagging strategies, while detailing the use of Redis, distributed locks, MQ, and caching to meet complex business scenarios.

Backend Architecture · Coupon System · Red Envelope
0 likes · 15 min read
High Availability Architecture
Sep 18, 2018 · Backend Development

Design and Operation of Zhihu's Redis Platform: Architecture, High Availability, and Scaling

The article details Zhihu's internally built Redis platform, covering its architecture, instance types, high‑availability mechanisms, migration from client‑side sharding to Twemproxy, deployment on Kubernetes, scaling strategies, monitoring tools, and future upgrades, providing valuable insights for backend engineers.

high availability · redis
0 likes · 23 min read
dbaplus Community
Sep 16, 2018 · Databases

How Meituan Dianping Built a Reliable MySQL Group Replication HA Architecture

This article details Meituan Dianping's practical experience deploying MySQL Group Replication (MGR) for CMDB high availability, covering background, MGR fundamentals, configuration limits, parameter tuning, architecture design, deployment timeline, typical issues, a custom Python client, and daily operational practices.

Database operations · Group Replication · MGR
0 likes · 11 min read
Youzan Coder
Sep 14, 2018 · Big Data

Elasticsearch Optimization and Index Splitting Strategies in the Youzan Search System

The Youzan search system uses middleware‑driven Elasticsearch optimizations—segment merging, larger buffers, routing, and rollover—to cut index files and document scans, splits large indices into business‑specific or hot‑cold sub‑indices, and adds asynchronous cross‑datacenter replication with soft‑delete versioning for high‑availability and scalable performance.

Elasticsearch · Hot/Cold Isolation · Index Optimization
0 likes · 10 min read
21CTO
Sep 12, 2018 · Operations

How to Build a High‑Availability Web Service on CentOS 7 with Keepalived & LVS

This guide walks you through setting up a highly available web service on CentOS 7 by using Vagrant to create four virtual machines, installing Keepalived and OpenResty, configuring VRRP and LVS for load balancing, binding a virtual IP, and testing failover to ensure continuous service delivery.

CentOS · LVS · Vagrant
0 likes · 13 min read
Architects' Tech Alliance
Sep 6, 2018 · Databases

Design and Implementation of a DB2 pureScale GDPC Dual‑Active Database Platform

The article analyzes the shortcomings of traditional disaster‑recovery methods, explains why DB2 pureScale GDPC was chosen for a dual‑active database solution, and provides detailed design guidelines covering site selection, arbitration node, network architecture, storage layout, resource sizing, client connectivity, and the solution’s advantages and limitations.

DB2 · Database design · Dual-Active
0 likes · 14 min read
Architects' Tech Alliance
Aug 27, 2018 · Fundamentals

Design Principles and Architecture of Distributed File Systems

This article provides a comprehensive overview of distributed file systems, covering their historical evolution, essential requirements, architectural models (centralized and decentralized), persistence strategies, scalability, high availability, performance optimization, security mechanisms, and additional considerations such as space allocation, file deletion, small‑file handling, and fingerprint‑based deduplication.

Consistency · Distributed Systems · Scalability
0 likes · 19 min read
Big Data and Microservices
Aug 27, 2018 · Industry Insights

What Makes Large‑Scale Websites Tick? Architecture Principles and Best Practices

This article outlines the key characteristics of large‑scale websites and presents a comprehensive set of architectural goals, patterns, and techniques—including performance tuning, high availability, scalability, extensibility, security, and agile operations—to guide the design of robust, user‑centric online platforms.

Scalability · agile · architecture
0 likes · 10 min read
Alibaba Cloud Native
Aug 21, 2018 · Cloud Native

Inside Alibaba’s Sigma: How a Cloud‑Native Scheduler Powers 280× Double‑11 Growth

The article details Alibaba’s Sigma scheduling and cluster management platform—its three‑layer architecture, data and state consistency strategies, real‑world case studies, Go‑based redesign, integration with Kubernetes APIs, and lessons on concurrency, high availability, and pod dispersion for massive Double 11 traffic.

Go · Kubernetes · Scheduler
0 likes · 20 min read
dbaplus Community
Aug 19, 2018 · Databases

Redis Deployment Options: Pros & Cons of Single, Replication, Sentinel, Cluster, and Custom Solutions

This article examines five common Redis deployment patterns—single instance, master‑slave replication, Sentinel, Cluster, and custom high‑availability solutions—detailing their architectures, advantages, drawbacks, and practical configuration tips to help engineers choose the most suitable setup for their workloads.

Cluster · Replication · database
0 likes · 12 min read
ITFLY8 Architecture Home
Aug 14, 2018 · Databases

How 58.com Scales Its Database: Architecture, High Availability, and Performance Tricks

This article explains 58.com’s database architecture, covering availability through replication and dual‑master setups, read‑performance enhancements with indexing, read replicas and caching, consistency solutions, rapid horizontal scaling methods, and a review of Codd’s twelve rules for relational design.

Consistency · SQL Optimization · Scalability
0 likes · 14 min read
ITPUB
Aug 14, 2018 · Operations

Master Linux Multi‑NIC Bonding: Modes, Configuration Steps & Best Practices

This guide explains Linux network interface bonding, detailing all seven bonding modes, their characteristics, required switch configurations, and provides step‑by‑step instructions for setting up bond0 with configuration files, modprobe options, and verification commands to achieve high‑availability and load‑balanced networking.

Linux · NIC · Network Bonding
0 likes · 13 min read
dbaplus Community
Aug 11, 2018 · Databases

Achieving Multi‑Active Disaster Recovery with Distributed Databases in Finance

Amid rising cloud outages and strict financial regulations, this article examines traditional multi‑active database solutions such as Oracle RAC and IBM GDPS, contrasts them with modern distributed database designs, and details SequoiaDB’s multi‑active architecture and concrete disaster‑recovery procedures for single‑node, site‑wide, and network failures.

Distributed Systems · Financial Services · SequoiaDB
0 likes · 13 min read
21CTO
Aug 9, 2018 · Operations

How GitHub’s Open‑Source GLB Load Balancer Achieves High‑Performance Scaling

GitHub created the open‑source GitHub Load Balancer (GLB), a bare‑metal load‑balancing solution that uses Rendezvous hashing to smoothly add or remove nodes, offering high availability, DDoS resilience, and scalable performance for massive connection workloads, and is now publicly available on GitHub.

GitHub · Rendezvous hashing · high availability
0 likes · 5 min read
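The summary above credits Rendezvous hashing with letting nodes be added or removed smoothly. The sketch below shows the generic technique (highest-random-weight hashing), not GLB's actual implementation: the function names, the SHA-256 scoring, and the node labels are all assumptions chosen for illustration.

```python
import hashlib


def _score(key: str, node: str) -> int:
    """Deterministic pseudo-random weight for a (key, node) pair."""
    digest = hashlib.sha256(f"{node}|{key}".encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def pick_node(key: str, nodes: list[str]) -> str:
    """Return the node with the highest weight for this key."""
    if not nodes:
        raise ValueError("at least one node is required")
    return max(nodes, key=lambda node: _score(key, node))


if __name__ == "__main__":
    nodes = ["node-a", "node-b", "node-c", "node-d"]  # hypothetical names
    keys = [f"conn-{i}" for i in range(1000)]
    before = {k: pick_node(k, nodes) for k in keys}

    # Remove one node: only keys that lived on it should move.
    remaining = [n for n in nodes if n != "node-c"]
    moved = sum(1 for k in keys if pick_node(k, remaining) != before[k])
    on_removed = sum(1 for k in keys if before[k] == "node-c")
    print(f"keys on removed node: {on_removed}, keys remapped: {moved}")
```

Because each key's weights for the surviving nodes do not change, removing a node remaps only the keys that node owned, which is the property that makes membership changes non-disruptive.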
High Availability Architecture
Aug 9, 2018 · Operations

GitHub GLB Director: Open‑Source High‑Performance Data‑Center Load Balancer

GitHub’s GLB Director is an open‑source, layer‑4 load balancer designed for data‑center environments that scales a single IP across thousands of servers, uses ECMP, a stateless director layer, DPDK‑accelerated packet processing, and health‑check mechanisms to provide high‑availability without disrupting existing connections.

DPDK · ECMP · data center
0 likes · 19 min read
Qunar Tech Salon
Aug 8, 2018 · Backend Development

Design and Implementation of a Fault Injection Platform for High‑Availability Backend Systems

This article describes the motivation, architecture, and implementation details of a fault‑injection platform that uses Java Instrumentation and dynamic bytecode weaving to validate high‑availability strategies, isolate failures, and support zero‑cost, runtime fault injection for complex distributed backend services.

Backend · Fault Injection · Java Instrumentation
0 likes · 12 min read
Full-Stack Internet Architecture
Aug 3, 2018 · Databases

Understanding Redis Cluster: Architecture, Slot Sharding, Node Management, and High Availability

Redis Cluster is a distributed system that partitions its keyspace into 16,384 hash slots across up to 16,384 nodes, enabling automatic sharding, slot migration, replication, and high‑availability features such as automatic node discovery, master‑slave election, and online resharding without service interruption.

Cluster · Slot Migration · database
0 likes · 5 min read
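To make the 16,384-slot sharding in the summary above concrete, here is a minimal sketch of how a key maps to a hash slot. It follows the rule in the Redis Cluster specification (CRC16 of the key, modulo 16384, with optional `{...}` hash tags); the function name `key_slot` is an assumption for this example, and the hash-tag handling comes from the specification rather than from the summary.

```python
import binascii

SLOT_COUNT = 16384  # number of hash slots in Redis Cluster


def key_slot(key: bytes) -> int:
    """Map a key to its hash slot: CRC16(key) mod 16384."""
    # Hash tags: if the key contains a non-empty "{...}" section, only the
    # text between the first "{" and the next "}" is hashed.
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end != -1 and end != start + 1:
            key = key[start + 1:end]
    # binascii.crc_hqx uses the CRC-CCITT polynomial 0x1021; with an initial
    # value of 0 it matches the XMODEM variant that Redis Cluster specifies.
    return binascii.crc_hqx(key, 0) % SLOT_COUNT


if __name__ == "__main__":
    for k in (b"user:1000", b"{user1000}.following", b"{user1000}.followers"):
        print(k.decode(), "->", key_slot(k))
```

Keys that share a hash tag land in the same slot, which is what allows multi-key operations on related keys within a cluster.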
DevOps
Aug 3, 2018 · Operations

How to Choose the Right TFS Network Topology and Plan a Data Storage Strategy

This article explains how to select an appropriate Team Foundation Server (TFS) deployment topology and design a data storage strategy, covering single, dual, and cluster deployments, hardware recommendations by team size, high‑availability options, performance testing results, and best practices for managing TFS databases.

Deployment · DevOps · TFS
0 likes · 12 min read
AntTech
Jul 31, 2018 · Databases

High Availability and Disaster Recovery Strategies for OceanBase Distributed Database

This article reviews traditional database high‑availability techniques, explains the advantages of distributed multi‑replica consistency (Paxos/Raft) used by OceanBase, and compares various deployment topologies—from single‑site three‑replica to multi‑city five‑replica designs—highlighting their trade‑offs and best‑practice recommendations.

OceanBase · Paxos · Replication
0 likes · 23 min read
dbaplus Community
Jul 30, 2018 · Backend Development

Understanding Redis Persistence: RDB vs AOF and High‑Availability Strategies

This article explains Redis high‑availability concepts, focusing on persistence mechanisms—RDB snapshots and AOF logging—detailing their triggers, execution flows, file formats, configuration options, performance trade‑offs, and practical guidance for choosing and tuning persistence in production environments.

AOF · Configuration · Persistence
0 likes · 29 min read
Architects' Tech Alliance
Jul 28, 2018 · Databases

Technical Requirements and Architectural Directions for Cloud Databases

The article explains the key technical requirements of cloud databases, such as elastic scaling, compute‑storage separation, multi‑model support and self‑management, and discusses emerging architectural trends like storage‑SQL separation, multi‑model engines, and disaster‑recovery/multi‑active designs for various enterprise scenarios.

cloud database · dbPaaS · elastic scaling
0 likes · 16 min read
MaGe Linux Operations
Jul 27, 2018 · Operations

Mastering Keepalived: Step‑by‑Step Server Load Balancing on Linux

This guide walks through planning the server and software environment, installing and configuring keepalived, setting up master‑backup VRRP instances, monitoring logs, handling failover, checking virtual IPs, troubleshooting common errors, and adding a Tomcat service script for high‑availability Linux deployments.

Linux · Tomcat · VRRP
0 likes · 16 min read
JD Retail Technology
Jul 24, 2018 · Operations

Stability and Operational Practices for Large‑Scale Kubernetes Clusters

This article shares practical experience and best‑practice guidelines for operating large‑scale Kubernetes clusters, covering stability checks, component failure impact, recovery strategies, alerting mechanisms, data collection, visualization, and the suite of operational tools that help ensure reliable, high‑performance cloud‑native infrastructure.

Kubernetes · cluster operations · high availability
0 likes · 10 min read
ITPUB
Jul 15, 2018 · Databases

How Meituan Dianping Evolved MySQL HA: From MMM to MHA+Zebra and Beyond

This article traces Meituan Dianping's MySQL high‑availability journey, detailing the legacy MMM system, its migration to MHA, integration with Zebra and Proxy middleware, and future architectural ideas such as distributed agents, semi‑sync replication, and MySQL Group Replication.

Distributed Systems · MHA · Zebra
0 likes · 12 min read
Architecture Digest
Jul 11, 2018 · Cloud Native

Understanding Modern Distributed Architecture: SOA, Microservices, Service Mesh, CAP & BASE Theories, and High‑Availability Design

This article explains the evolution and core concepts of mainstream distributed architectures—including SOA, microservices, and service mesh—covers fundamental consistency theories such as CAP and BASE, and outlines practical high‑availability and scalability techniques for building resilient cloud‑native systems.

BASE theory · CAP theorem · SOA
0 likes · 17 min read
Efficient Ops
Jul 9, 2018 · Databases

How YY Scaled Its Database Platform: From Manual Ops to Intelligent Automation

This article details YY's journey in transforming its database operations—from early quality and efficiency challenges to a multi‑stage platform that automates resource pooling, high‑availability proxy, cost control, quality monitoring, and security, outlining future intelligent extensions.

Cost Optimization · Database operations · Resource Management
0 likes · 16 min read
JD Tech
Jul 5, 2018 · Backend Development

Design and Optimization of JD's High‑Availability Open Gateway System

This article describes how JD's open gateway handles billions of requests during major sales events by employing a multi‑layer architecture, Nginx + Lua unified access, NIO asynchronous processing, service isolation, dynamic routing, degradation, rate‑limiting, circuit‑breaking, fast‑fail mechanisms, and comprehensive monitoring to ensure high performance and reliability.

Circuit Breaking · asynchronous processing · gateway
0 likes · 16 min read
Java Backend Technology
Jul 4, 2018 · Backend Development

Designing High‑Availability Distributed Systems: SOA, Microservices & Service Mesh

This article explores the evolution and core concepts of modern distributed architectures—including SOA, microservices, and service mesh—explains key theories such as CAP and BASE, and provides practical guidelines for achieving high availability, scalability, and efficient content delivery through techniques like load balancing, CDN, and gray‑release strategies.

CAP theorem · Distributed Systems · Microservices
0 likes · 18 min read
ITPUB
Jun 22, 2018 · Databases

How to Build a Highly Available Redis Service with Sentinel and Virtual IP

This article explains how to design and implement a highly available Redis deployment using master‑slave replication, multiple Redis Sentinel instances, and a virtual IP to provide seamless failover while maintaining simple client connectivity, covering failure scenarios, architecture choices, and practical configuration tips.

database · failover · high availability
0 likes · 12 min read
Java Captain
Jun 17, 2018 · Fundamentals

Understanding Distributed Systems and Cluster Architecture: Concepts, Types, and Differences

This article explains the distinction between distributed systems and clusters, outlines the key features of clusters such as scalability and high availability, describes essential capabilities like load balancing and error recovery, details core technologies, classifies common Linux cluster types, and provides examples to illustrate their operation.

HPC · computing fundamentals · high availability
0 likes · 10 min read
Architecture Digest
Jun 7, 2018 · Backend Development

Technical Summary of Large-Scale Distributed E‑Commerce Website Architecture

This article provides a comprehensive technical overview of large distributed website architecture, covering performance, high availability, scalability, security, and agility, and illustrates the evolution, design patterns, and practical optimization techniques for modern e‑commerce platforms.

Distributed Systems · Scalability · architecture
0 likes · 32 min read
21CTO
Jun 6, 2018 · Operations

From Single Machines to Distributed Architecture: Tracing the Evolution of IT Systems

This article outlines the four major stages of IT architecture evolution—from single‑machine setups, through dual‑machine hot‑standby, multi‑node active clusters, to fully distributed systems—explaining the motivations, challenges, and technologies that drive each transition.

Distributed Systems · IT infrastructure · architecture
0 likes · 8 min read
Architecture Digest
Jun 6, 2018 · Operations

Evolution of System Architecture: From Single‑Machine to Distributed Solutions

The article outlines the four major stages of enterprise IT architecture—single‑machine, dual‑machine hot‑standby, multi‑node active‑active, and distributed architectures—explaining their motivations, advantages, limitations, and how businesses should choose the appropriate model based on performance, availability, and scalability requirements.

Distributed Systems · Scalability · System Design
0 likes · 8 min read
ITPUB
Jun 5, 2018 · Operations

How Meituan Achieved Near‑Zero Downtime for Its Account Service

This article details Meituan's practical approaches to boosting account service reliability, covering MTBF/MTTR metrics, business‑level monitoring, flexible availability with circuit‑breaker patterns, cross‑region active‑active deployment, data synchronization techniques, and the measurable performance gains achieved.

Active-Active · Distributed Systems · circuit breaker
0 likes · 13 min read
Meituan Technology Team
May 31, 2018 · Operations

High‑Availability Practices for Account Services at Meituan/Dianping

Meituan/Dianping ensures its critical account service stays online by combining real‑time business monitoring, circuit‑breaker‑driven graceful degradation, and active‑active cross‑region deployment with isolated dependencies, versioned data sync, and automated cache updates, dramatically extending MTBF while cutting MTTR and latency.

data synchronization · fault tolerance · high availability
0 likes · 13 min read
Meituan Technology Team
May 31, 2018 · Mobile Development

High Availability Architecture for Meituan Waimai Mobile Client

Meituan Waimai’s mobile client employs a high‑availability architecture built on loosely‑coupled teams, comprehensive monitoring, encrypted logging, multi‑layer disaster recovery, gray‑release strategies, and an incident‑response workflow, enabling rapid detection and resolution of failures while supporting 20 million daily orders.

disaster recovery · high availability · logging
0 likes · 16 min read
ITPUB
May 24, 2018 · Operations

Mastering Modern Operations: From Deployment to Automation and High Availability

This article outlines the essential facets of modern IT operations, covering environment deployment, troubleshooting and performance tuning, backup strategies, high‑availability clustering, monitoring and alerting, security and auditing, as well as automation, DevOps practices, virtualization, and cloud services, providing practical insights and tool recommendations.

Deployment · automation · high availability
0 likes · 9 min read
ITPUB
May 16, 2018 · Databases

How RadonDB Merges Raft and MySQL for Scalable Cloud‑Native Databases

RadonDB is a next‑generation cloud‑native distributed relational database that combines the Raft consensus protocol with MySQL to deliver high availability, strong consistency, seamless scalability, and native support for OLTP, OLAP, distributed transactions, and comprehensive monitoring and backup features.

RadonDB · Raft · cloud-native
0 likes · 10 min read
21CTO
May 9, 2018 · Operations

How Alipay Built Seamless High Availability and Disaster Recovery for Millions of Transactions

This article examines Alipay's evolution from a simple single‑datacenter setup to a multi‑active‑active, unit‑based architecture, detailing the technical challenges of high availability, disaster recovery, failover design, blue‑green deployment, and how these solutions enable continuous service during massive traffic spikes like Double 11.

Alipay · Blue‑Green deployment · Distributed Systems
0 likes · 17 min read
Architecture Digest
May 9, 2018 · Operations

High Availability and Disaster Recovery Architecture: The Evolution of Alipay’s System Design

This article examines the importance of high‑availability and disaster‑recovery architectures, tracing Alipay’s evolution from a simple load‑balanced setup through multi‑datacenter, failover, and unit‑based designs that address scalability, data consistency, and continuous service delivery challenges.

Distributed Systems · Scalability · disaster recovery
0 likes · 16 min read
Huawei Cloud Developer Alliance
May 7, 2018 · Cloud Native

How ServiceComb’s ServiceCenter Guarantees High‑Availability for Cloud‑Native Microservices

This article explains how ServiceComb’s ServiceCenter component provides reliable microservice registration, discovery, and management through features like instance isolation, black‑white list control, asynchronous caching, heartbeat mechanisms, and self‑preservation, ensuring high availability in distributed cloud‑native environments.

Cloud Native · Microservices · high availability
0 likes · 14 min read
ITFLY8 Architecture Home
May 5, 2018 · Databases

How to Build a MySQL Master‑Slave Cluster: Step‑by‑Step Guide

This article walks readers through setting up MySQL replication, from the basic master‑slave model to a one‑master‑multiple‑slave cluster, covering configuration files, essential parameters, verification commands, performance tips, and common pitfalls for production deployments.

Cluster · Master‑Slave · Replication
0 likes · 15 min read
21CTO
May 5, 2018 · Backend Development

From Single Server to Scalable Architecture: Key Lessons from Large‑Scale Site Design

This comprehensive note distills the evolution of large‑website architecture—from single‑server setups to layered, distributed, and highly available systems—covering caching, clustering, read/write separation, CDN, NoSQL, business splitting, scalability, extensibility, and automation strategies.

Distributed Systems · high availability · large-scale architecture
0 likes · 20 min read
Architecture Digest
May 5, 2018 · Backend Development

Evolution and Core Principles of Large‑Scale Website Architecture

This article summarizes the evolution stages, architectural patterns, and key concerns such as performance, scalability, extensibility, high availability, and distributed design that large‑scale websites must address, providing practical insights and visual diagrams for each concept.

Distributed Systems · Scalability · caching
0 likes · 21 min read
Efficient Ops
Apr 24, 2018 · Operations

Guangdong Mobile's DevOps Turnaround: A Practical Operations Blueprint

Facing steep efficiency and quality gaps between traditional telecom operators and fast‑moving internet firms, Guangdong Mobile detailed its DevOps journey—identifying challenges, selecting mature frameworks, prioritizing six key processes, and sharing practical tactics for deployment, monitoring, high‑availability, and cultural change—to accelerate digital transformation.

DevOps · Digital Transformation · high availability
0 likes · 19 min read
dbaplus Community
Apr 24, 2018 · Databases

Scaling Baidu’s TSDB to Trillions of Points: Elastic, High‑Performance Architecture

Baidu’s TSDB processes over 20 million data points per second per node and tens of thousands of queries per second cluster‑wide by employing a stateless read/write‑separated elastic architecture, multi‑layer storage across Redis, HBase and Hadoop, minute‑level geo‑redundant self‑healing, and a modified Gorilla compression that cuts storage by 80% with minimal CPU overhead.

Big Data · TSDB · Time Series Database
0 likes · 8 min read
360 Zhihui Cloud Developer
Apr 24, 2018 · Databases

How Dynamo Achieves High‑Availability in Distributed Key‑Value Stores

This article explains Dynamo, the decentralized key‑value storage system, covering its design goals, consistent‑hashing partitioning with virtual nodes, replication strategies, quorum‑based consistency, conflict resolution with vector clocks, hinted handoff, Merkle‑tree synchronization, and gossip‑based failure detection.

Dynamo · Gossip Protocol · Replication
0 likes · 9 min read
Meitu Technology
Apr 23, 2018 · Backend Development

Design and Evolution of Live Streaming Bullet Comment System: From HTTP Polling to Long Connection

Meipai’s live‑stream bullet‑comment platform progressed from an initial HTTP‑polling design supporting one million users, through a high‑availability dual‑room architecture, to a scalable long‑connection system with gRPC routing, dynamic degradation, and caching, solving message ordering, Redis bottlenecks, and ensuring seamless user experience.

bullet comment system · high availability · live streaming
0 likes · 11 min read
ITPUB
Apr 23, 2018 · Databases

What’s New in MySQL 8.0 GA? Key Features and Upgrade Tips

MySQL 8.0 GA (8.0.11) brings up to twice the performance of 5.7 and adds NoSQL document storage, window functions, hidden and descending indexes, CTEs, enhanced JSON handling, improved InnoDB reliability, built‑in high availability, and stronger security; upgrading requires an in‑place upgrade and a backup beforehand.

8.0 · JSON · NoSQL
0 likes · 7 min read
Youzan Coder
Apr 20, 2018 · Databases

ZanDB: An Automated Database Management Platform for Large-Scale Operations

ZanDB is Youzan’s automated database‑management platform that standardizes OS and MySQL configurations, employs a Python/Django stack with a Go‑based agent and Celery scheduler, and provides unified modules for backup, host, instance, metadata, log and HA management, currently automating about 70% of manual operations while targeting full‑scale monitoring, diagnostics and sharding automation.

Backup Management · ZanDB · database automation
0 likes · 14 min read
Meituan Technology Team
Apr 19, 2018 · Operations

How Meituan‑Dianping Built a 100% High‑Availability Core Transaction System

This article analyzes the rapid growth challenges of Meituan‑Dianping's core payment flow, explains key availability metrics such as MTBF and MTTR, and presents a comprehensive set of architectural, operational, and tooling strategies—including dependency decoupling, timeout tuning, circuit breaking, and full‑link stress testing—to achieve stable, fault‑tolerant transactions.

Microservices · Operations · circuit breaker
0 likes · 20 min read
21CTO
Apr 5, 2018 · Backend Development

Scaling Tmall’s Double‑11 Pages: From Staticization to CDN Architecture

This article reviews how Tmall’s product‑detail and other browsing systems were transformed through static‑page generation, multi‑level caching, unified web cache layers and CDN deployment to handle Double‑11 traffic spikes while improving performance, availability, and cost efficiency.

CDN · high availability · staticization
0 likes · 17 min read
Architecture Digest
Apr 5, 2018 · Databases

Designing a Highly Available Redis Service Using Sentinel

This article explains how to build a highly available Redis deployment by defining HA requirements, analyzing failure scenarios, and progressively implementing solutions from a single instance to a three‑sentinel architecture with virtual IP failover for seamless client access.

failover · high availability · sentinel
0 likes · 11 min read
Architecture Digest
Mar 29, 2018 · Databases

Designing a High‑Availability Redis Service with Sentinel

This article explains how to build a highly available Redis deployment using Redis Sentinel, compares several architectural options, and details the final three‑sentinel design that tolerates node, process, and network failures while keeping client access simple.

Infrastructure · failover · high availability
0 likes · 12 min read
Architecture Digest
Mar 26, 2018 · Operations

Alipay’s Double 11 Architecture: Logical Data Centers, Distributed Transactions, and High‑Availability Strategies

The article details Alipay’s comprehensive architecture for the Double 11 shopping festival, covering its three‑layer IaaS/PaaS/SaaS model, logical data‑center design, multi‑active disaster‑recovery, blue‑green deployment, distributed data sharding, transaction processing, and the Ant Credit Pay service’s performance and risk‑control mechanisms.

Alipay · Big Data · Distributed Systems
0 likes · 16 min read
MaGe Linux Operations
Mar 25, 2018 · Backend Development

Mastering Nginx: Reverse Proxy, Master‑Worker Model, Hot Reload & High‑Availability

This article explains Nginx's role as a lightweight web and reverse‑proxy server, clarifies forward vs reverse proxy concepts, details the master‑worker architecture, hot deployment, high‑concurrency handling with epoll, and shows how to achieve high availability and load balancing using Keepalived, upstream blocks, and caching.

Nginx · high availability · hot-reload
0 likes · 10 min read
Efficient Ops
Mar 19, 2018 · Databases

Redis Deep Dive: Core Technologies, Evolution, and Real-World Practices

In this interview, Redis China User Group chair Zhang Donghong shares the database’s key features, version history, data types, high‑availability options, clustering mechanics, automation challenges, future trends, and practical advice for beginners, illustrating how Redis powers massive online services.

Cluster · Database Architecture · NoSQL
0 likes · 21 min read
Tencent Cloud Developer
Mar 14, 2018 · Cloud Computing

Business Continuity Solutions on Tencent Cloud: High Availability and Disaster Recovery

Tencent Cloud’s business continuity solutions combine high‑availability clusters, multi‑AZ load balancing, and cross‑region disaster‑recovery architectures—such as CLB‑CVM‑MySQL configurations, CDB hot‑standby instances, DNS‑based failover, and data‑sync services—to ensure continuous operation and rapid recovery from localized or regional failures.

HA · Tencent Cloud · business continuity
0 likes · 10 min read
MaGe Linux Operations
Mar 9, 2018 · Operations

Build a High‑Availability Web Cluster with Keepalived on Linux

This guide walks through installing, compiling, and configuring Keepalived across multiple Linux nodes to create a VRRP‑based high‑availability web service cluster, covering prerequisites, virtual IP setup, load‑balancing rules, and monitoring tools.

high availability · keepalived · load balancing
0 likes · 6 min read
dbaplus Community
Feb 1, 2018 · Databases

Building a Reliable Geo‑Active Dual‑Active Architecture for Massive Online Games

This article details a two‑stage approach for creating a geo‑distributed active‑active infrastructure for large‑scale games, covering a pseudo active‑active design with private lines and smart DNS, followed by a true active‑active solution using Redis Sentinel and MySQL clustering with performance comparisons.

Active-Active · Game Backend · MySQL Cluster
0 likes · 8 min read
dbaplus Community
Jan 4, 2018 · Operations

Understanding ZooKeeper Architecture and FastLeaderElection: A Deep Dive

This article explains ZooKeeper's distributed coordination architecture, the ZAB consensus protocol, server roles, write and read workflows, FastLeaderElection mechanics, configurable election algorithms, and how ZooKeeper can be used to implement reliable distributed locks and leader election.

Distributed Coordination · FastLeaderElection · ZAB
0 likes · 26 min read
Dada Group Technology
Dec 29, 2017 · Backend Development

Implementing a Custom Circuit Breaker in Distributed Systems

This article details the implementation of a custom circuit breaker to prevent system failures in distributed systems, covering design principles, Java and Python implementations, and its effectiveness during high traffic periods.

Distributed Systems · Python · System Architecture
0 likes · 12 min read
Efficient Ops
Dec 21, 2017 · Databases

Master MySQL: From Beginner Basics to Advanced High‑Availability and Backup Strategies

This comprehensive guide uses a gaming‑level analogy to walk readers through MySQL fundamentals, architecture, storage engines, memory structures, logging, backup and recovery methods, high‑availability designs, and advanced performance tuning, providing practical commands, diagrams, and best‑practice recommendations.

Database Architecture · high availability · mysql
0 likes · 19 min read
Architecture Digest
Dec 21, 2017 · Operations

Design and Implementation of an Open‑Source Load Balancing Solution Using Nginx and LVS

The article describes how a company replaced costly commercial load balancers with an open‑source architecture based on LVS for layer‑4 traffic and an Nginx cluster for layer 7, detailing project background, technology selection, redundant design, network and Nginx configurations, operational scripts, performance testing, and data analysis.

Operations · automation · high availability
0 likes · 11 min read
MaGe Linux Operations
Dec 21, 2017 · Operations

Mastering High Availability Clusters: Key Concepts, Resource Management, and Failure Handling

This article explains how high‑availability (HA) clusters provide redundancy for directors, real servers (RS), databases and storage, covering active‑passive node roles, resource stickiness, constraints, quorum voting, split‑brain avoidance, failure detection methods, and essential configuration tips.

Cluster · Operations · Resource Management
0 likes · 12 min read
21CTO
Nov 21, 2017 · Operations

How We Scaled WeChat Pay’s Transaction Records to Billions Daily

This article chronicles the evolution of WeChat Pay’s transaction record system—from early key/value storage bottlenecks and incomplete data to a distributed, tiered architecture that supports billions of daily records, improves query performance, ensures data security, and handles holiday traffic spikes through flexible throttling.

Distributed Systems · WeChat Pay · data security
0 likes · 11 min read
21CTO
Nov 20, 2017 · Operations

Mastering High Availability and Concurrency: Core Principles and Practical Techniques

This article distills essential guiding principles, high‑availability strategies, and high‑concurrency techniques for building resilient, scalable systems, covering stateless design, fault‑handling phases, replication, isolation, rate limiting, caching, async processing, multithreading, and scaling approaches.

System Design · fault tolerance · high availability
0 likes · 21 min read
Architecture Digest
Nov 19, 2017 · Operations

Guiding Principles and Practices for High Availability and High Concurrency in Large‑Scale Systems

The article outlines core guiding principles, high‑availability strategies, and high‑concurrency techniques—such as stateless design, replica and isolation, quota control, monitoring, degradation, rollback, and scaling—to help engineers build resilient, scalable web architectures for massive traffic.

Distributed Systems · Scalability · System Design
0 likes · 20 min read
Dada Group Technology
Nov 17, 2017 · Backend Development

Designing a High‑Availability Distributed ID Generator: From UUID to Snowflake

This article examines the requirements for globally unique IDs in distributed systems, compares classic generation schemes such as UUID, Flickr, Snowflake and TDDL, and details a customized Snowflake‑based implementation with ZooKeeper‑managed worker IDs, clock‑rollback handling, deployment optimizations, and JVM tuning to achieve high performance and reliability.

Backend · Distributed Systems · ID generation
0 likes · 15 min read
Suning Technology
Nov 17, 2017 · Operations

How Suning Scaled Its API Platform: Standards, High Availability, and O2O Event Readiness

This article explains how Suning built a standardized, high‑availability API gateway, detailing naming conventions, documentation practices, protocol choices, error‑code design, dynamic configuration, SDK automation, system refactoring, monitoring, intelligent alerting, and the specific preparations made for the O2O shopping festival.

api-design · cloud computing · high availability
0 likes · 16 min read
Java Backend Technology
Nov 13, 2017 · Backend Development

Transforming Monolithic Websites to Scalable, High‑Performance Distributed Systems

Learn how early monolithic websites evolve into distributed architectures by splitting applications, services, and data, implementing load balancers, reverse proxies, caching, CDN, database sharding, and security measures, while focusing on performance, high availability, scalability, and extensibility for robust, high‑traffic sites.

Distributed Systems · Scalability · high availability
0 likes · 11 min read
JD Retail Technology
Oct 30, 2017 · Operations

Ensuring High Availability and Scalability for Large‑Scale Promotions: Insights from a JD Senior Architect

The article explains how JD’s senior architect prepares for the 11.11 shopping festival by defining high‑availability goals, discussing scalability strategies, disaster‑recovery planning, performance optimization, and system resilience to ensure reliable service under massive traffic spikes.

Operations · Scalability · System Architecture
0 likes · 8 min read
Qunar Tech Salon
Oct 25, 2017 · Backend Development

Design and Optimization of a High‑Performance Flight Search and Pricing System

This article outlines the design, challenges, and performance optimizations of a large‑scale flight search and pricing platform, covering system requirements, architecture layers, caching strategies, indexing, real‑time data synchronization, memory reduction techniques, and high‑availability solutions to handle massive, low‑latency queries.

caching · flight search · high availability
0 likes · 19 min read
dbaplus Community
Oct 23, 2017 · Databases

How eBay Builds Resilient Multi‑Data‑Center Applications with MongoDB

The article explains eBay's use of MongoDB to create highly available, fault‑tolerant multi‑data‑center architectures, detailing design patterns, replica set configurations, read/write strategies, and recent MongoDB features that enable scalable, mission‑critical applications.

Database design · MongoDB · Multi-Data Center
0 likes · 8 min read