Tagged articles
1414 articles
Amap Tech
Oct 31, 2019 · Backend Development

Evolution of Amap's Billion-Scale Traffic Access Layer Services

Sun Wei outlines how Amap's traffic access layer evolved: it handles more than 600,000 QPS with sub‑2 ms latency through a fully asynchronous, stream‑based pipeline, experimented with reactive Vert.x and WebFlux, added API aggregation and traffic tagging, and is now heading toward distributed sidecar or SDK gateways for billion‑scale, low‑latency services.

Asynchronous Architecture, Distributed Systems, Service Mesh
0 likes · 11 min read
Huajiao Technology
Oct 29, 2019 · Backend Development

Building a Scalable Distributed Cron: Google‑Level Design Simplified for Startups

This article examines Google's high‑availability distributed cron design, distills its core requirements and algorithms, presents a streamlined implementation for a startup using etcd and Raft, and closes with a discussion of whether early‑stage companies should adopt a middle‑platform strategy.

Raft, backend infrastructure, distributed cron
0 likes · 10 min read
dbaplus Community
Oct 27, 2019 · Databases

How Weibo Scales Redis: Architecture, Optimizations, and Future Plans

This article details how Weibo leverages Redis across billions of requests, describing its massive scale, the challenges of trillion‑level reads/writes, the technical choices and customizations made—including LongSet, HA solutions, multi‑level caching, RocksDB integration—and outlines ongoing capacity and future development strategies.

RocksDB, Weibo, cache optimization
0 likes · 18 min read
Big Data Technology & Architecture
Oct 21, 2019 · Databases

High‑Availability Practices of Alibaba HBase: Large Clusters, MTTF/MTTR, Disaster Recovery, and Extreme Experience

This article reviews Alibaba HBase's evolution toward high availability, covering large‑cluster architecture, reliability metrics (MTTF/MTTR), disaster‑recovery strategies such as data replication and traffic switching, performance optimizations for extreme latency requirements, and lessons learned for building resilient distributed database services.

Distributed Systems, HBase, databases
0 likes · 20 min read
21CTO
Oct 21, 2019 · Backend Development

How Ant Financial Scales Transactions with Distributed Architecture and Microservices

This article summarizes the key concepts, advantages, and practical implementations of distributed architecture at Ant Financial, covering microservice migration, modular development, database vertical and horizontal sharding, high‑availability mechanisms, gray‑release, and fault‑tolerant settlement workflows.

Microservices, database sharding, distributed architecture
0 likes · 8 min read
dbaplus Community
Oct 20, 2019 · Databases

How to Migrate Legacy Systems to a Distributed MySQL Architecture: 8 Practical Strategies

This article walks through the complete evolution of a legacy monolithic database system to a distributed MySQL architecture, covering background analysis, four migration phases, eight concrete strategies—including functional transfer, system splitting, horizontal scaling, read/write separation, and performance tuning—while providing code examples, benchmark results, and deployment considerations.

Schema Refactoring, distributed architecture, high availability
0 likes · 18 min read
dbaplus Community
Oct 14, 2019 · Databases

Mastering Percona XtraDB Cluster: High Availability, Monitoring, and Backup Strategies

This comprehensive guide explains Galera‑based Percona XtraDB Cluster architecture, high‑availability mechanisms, state‑transfer methods, flow‑control, deployment patterns, routine inspection, monitoring variables, backup management, common failure scenarios, and real‑world case studies for MySQL clusters.

Backup, Galera, Percona XtraDB Cluster
0 likes · 36 min read
DevOps Cloud Academy
Oct 7, 2019 · Operations

GitLab High Availability Solution with DRBD

This guide details a step‑by‑step setup of a highly available GitLab service using two virtual machines, DRBD for block‑level replication, configuration of GitLab and PostgreSQL directories, DRBD resource creation, service start‑up, and manual primary‑secondary failover procedures.

DRBD, GitLab, Linux
0 likes · 8 min read
21CTO
Oct 1, 2019 · Backend Development

How Ant Financial Scales Payments with Distributed Microservices and Database Sharding

This article explains Ant Financial's practical implementation of a distributed architecture—including micro‑service migration, modular development, database vertical and horizontal sharding, high‑availability mechanisms, task‑scheduling platforms, gray‑release strategies, and full‑link stress testing—to achieve reliable, scalable payment processing.

Cloud Native, Microservices, database sharding
0 likes · 8 min read
Aikesheng Open Source Community
Sep 25, 2019 · Databases

MySQL Replication FAQ: Compatibility, Master‑Slave Behavior, Performance, and High‑Availability

This article provides a comprehensive MySQL replication FAQ covering cross‑OS and hardware compatibility, master‑slave connection behavior, monitoring lag, forcing master pause, bidirectional replication considerations, performance improvements, high‑availability setups, and how to exclude GRANT/REVOKE statements from replication.

Master‑Slave, Replication, database
0 likes · 10 min read
Architecture Digest
Sep 23, 2019 · Operations

Improving Application Availability: Practices, Monitoring, and Fault‑Tolerance in a Large‑Scale Payment System

The article describes how a high‑traffic payment platform achieves 99.999% availability by avoiding single points of failure, applying fail‑fast principles, implementing resource limits, building real‑time monitoring and alerting, and automating fault detection, routing, and recovery to keep the service running 24/7.

backend operations, fault tolerance, high availability
0 likes · 23 min read
Big Data Technology & Architecture
Sep 21, 2019 · Big Data

Deploying Apache Flink on Kubernetes: A Step‑by‑Step Guide

This tutorial explains how to run Apache Flink jobs on Kubernetes by building Docker images, deploying JobManager and TaskManager components with Kubernetes manifests, configuring high‑availability with ZooKeeper and HDFS, and using SavePoints and scaling techniques to manage and extend Flink streaming applications.

Big Data, Docker, Flink
0 likes · 14 min read
HomeTech
Sep 19, 2019 · Industry Insights

How Autohome Scaled Its 818 Global Car Night to Millions of QPS: A Technical Deep Dive

The article details how Autohome tackled a severe market downturn by launching the 818 Global Car Night, describing the background, massive technical challenges, infrastructure scaling, high‑availability architecture, full‑link stress testing, monitoring, security measures, and the lessons learned for future large‑scale online events.

Performance Testing, Scalability, cloud computing
0 likes · 30 min read
Alibaba Cloud Developer
Sep 17, 2019 · Cloud Native

How NBF’s FaaS Architecture Powers Serverless at Alibaba’s Mega Sales

This article explains how Alibaba's New‑Retail Business Framework (NBF) implements a non‑typical FaaS architecture that delivers full Serverless capabilities—including containerized bundle management, service publishing, routing, fault tolerance, millisecond‑level auto‑scaling, and rapid rollback—proving its reliability during large‑scale promotional events.

Auto Scaling, Container, FaaS
0 likes · 16 min read
Big Data Technology Architecture
Sep 16, 2019 · Operations

Evolution of the Elasticsearch Cluster Architecture in JD.com Order System

This article details how JD.com’s order center migrated its Elasticsearch cluster from a basic, mixed‑node setup to a real‑time, dual‑cluster architecture with increased replicas, physical isolation, version upgrades, and a robust data‑sync strategy to handle billions of documents and hundreds of millions of daily queries.

Cluster Architecture, Elasticsearch, data synchronization
0 likes · 13 min read
Tencent Cloud Developer
Sep 10, 2019 · Cloud Computing

Design and Practice of Multi‑Active Architecture on Public Cloud Infrastructure

Wang Xiaobo explains how public‑cloud services can simplify designing and implementing active‑active architectures, covering data‑center redundancy, real‑time synchronization, fault‑tolerant networking, micro‑service migration, and cost‑benefit trade‑offs, while urging incremental, cloud‑assisted approaches rather than full 100% multi‑active deployments.

Microservices, high availability, multi-active
0 likes · 18 min read
Youzan Coder
Sep 4, 2019 · Cloud Native

How Youzan Built a Highly Available Kubernetes Platform for Massive E‑commerce

This article explains why Youzan chose Kubernetes, describes their multi‑IDC, multi‑cluster architecture with high‑availability master components, logging and monitoring solutions, custom service exposure, image building process, lifecycle hooks, continuous delivery pipeline, operational challenges faced, and future plans such as operators and auto‑scaling.

Kubernetes, Multi-Cluster, ci/cd
0 likes · 11 min read
Big Data Technology Architecture
Aug 26, 2019 · Backend Development

Redis Distributed Lock Implementation: Design, Issues, and Lessons Learned

This article shares a practical experience of implementing a Redis‑based distributed lock, explains the lock acquisition and release processes, discusses common pitfalls such as expiration handling and concurrency bugs, and provides Q&A on design choices, high‑availability, and future improvements.

Lock design, concurrency, distributed-lock
0 likes · 6 min read
Architect's Tech Stack
Aug 17, 2019 · Databases

Industrial Bank IT Architecture Transformation: From Mainframe DB2 to Distributed MySQL Solutions

This article details Industrial Bank's multi‑year IT architecture transformation, describing the challenges of legacy mainframe‑based OLTP, the strategic shift to a distributed MySQL ecosystem, the implementation phases, high‑availability designs, containerization efforts, measurable outcomes, and future directions for cloud‑native and data‑exchange capabilities.

Banking, Cloud Native, IT Architecture
0 likes · 22 min read
ITFLY8 Architecture Home
Aug 16, 2019 · Operations

Building Scalable Degradation Plans: Lessons from Tong‑Cheng Yilong

At QCon Beijing 2019, senior architect Wang Junxiang shared Tong‑Cheng Yilong’s end‑to‑end degradation‑plan architecture, covering system design, data collection, metric computation, resource recovery, link‑level pre‑plan management, fault diagnosis, strategy extensibility, and high‑availability platform construction, offering practical insights for complex distributed systems.

Distributed Systems, degradation, high availability
0 likes · 4 min read
ITPUB
Aug 12, 2019 · Operations

How JD.com Scaled Its Order Search with a Real‑Time Dual Elasticsearch Cluster

This article details JD.com’s order center journey from a simple Elasticsearch deployment to a highly available, dual‑cluster architecture, covering isolation, replica tuning, hot‑cold data separation, version upgrades, and practical lessons on pagination, field data, and doc values.

Cluster Architecture, Elasticsearch, data synchronization
0 likes · 13 min read
Aikesheng Open Source Community
Aug 12, 2019 · Databases

Choosing Between NewSQL Databases and Middleware‑Based Sharding: A Comparative Analysis

This article objectively compares NewSQL distributed databases with middleware‑based sharding solutions, examining their architectures, distributed transaction handling, scalability, performance, high‑availability, and operational considerations, and provides guidance on selecting the appropriate approach based on workload, consistency, and organizational constraints.

CAP theorem, Distributed Transactions, NewSQL
0 likes · 19 min read
AntTech
Aug 6, 2019 · Databases

How OceanBase Guarantees Data Reliability and Service High‑Availability

The article explains how OceanBase, a distributed enterprise‑grade database, achieves strong data reliability and rapid service recovery on ordinary PC servers by combining Paxos‑based consensus, enhanced redo‑log verification, periodic checkpoint checks, and fine‑grained fail‑over mechanisms, surpassing traditional hardware‑dependent databases.

Data Reliability, OceanBase, Paxos
0 likes · 17 min read
Aikesheng Open Source Community
Aug 6, 2019 · Databases

MySQL Group Replication Version Compatibility Policies and Upgrade Guidelines

Starting with MySQL 8.0.17, Group Replication introduces patch‑level version compatibility policies that govern primary‑member election, write‑ability, donor selection, and upgrade procedures for mixed‑version clusters, ensuring safe operation during rolling upgrades and multi‑primary mode transitions.

Group Replication, Version Compatibility, database
0 likes · 12 min read
dbaplus Community
Jul 10, 2019 · Big Data

How Kuaishou Scales SQL on Hadoop: Architecture, Optimizations, and Lessons Learned

This article explains the SQL‑on‑Hadoop ecosystem—including Hive, Spark, SparkSQL, Presto and other solutions—then details Kuaishou's large‑scale platform architecture, performance bottlenecks, routing logic, high‑availability mechanisms, and a series of concrete optimizations that improve query speed, resource utilization, and operational stability.

SQL on Hadoop, Spark, high availability
0 likes · 19 min read
58 Tech
Jul 8, 2019 · Databases

Design and Implementation of WMHA: A Modified MySQL High‑Availability Solution

This article explains the need for high‑availability MySQL services, critiques the original in‑house HA approach, and details how the mature MHA framework was extended into WMHA with added VIP monitoring, enhanced failover procedures, richer notifications, and a reorganized deployment structure to improve reliability and reduce DBA intervention.

Database operations, MHA, WMHA
0 likes · 9 min read
High Availability Architecture
Jul 5, 2019 · Operations

Practices of Chaos Engineering in Distributed Service Architecture

This article presents a comprehensive overview of chaos engineering, covering its definition, value, principles, implementation steps, enterprise adoption strategies, the open‑source ChaosBlade tool and AHAS Chaos platform, and two detailed case studies demonstrating fault injection experiments in a distributed service environment.

AHAS, Alibaba, Fault Injection
0 likes · 15 min read
Big Data Technology & Architecture
Jul 3, 2019 · Backend Development

Deep Dive into Apache RocketMQ: Architecture, Routing, Storage, and High‑Availability Design

This article provides a comprehensive overview of Apache RocketMQ’s core architecture, including topic routing mechanisms, message storage file designs, high‑availability message sending, concurrent pull and consumption processes, HA synchronization, and transaction messaging, while offering practical learning steps and programming techniques for developers.

Distributed Systems, Message Queue, RocketMQ
0 likes · 14 min read
Java High-Performance Architecture
Jul 2, 2019 · Operations

How to Build Highly Available Systems: 8 Essential Strategies

This article outlines eight practical high‑availability techniques—multiple replicas, isolation, rate limiting, circuit breaking, degradation, gray releases with rollback, comprehensive monitoring, and proactive log alerting—to help engineers design systems that are both efficient and reliable under heavy load.

System Design, circuit breaker, degradation
0 likes · 7 min read
Architecture Digest
Jul 2, 2019 · Fundamentals

Key Practices for High Availability, Isolation, and Data Consistency in Large‑Scale Internet Systems

The article outlines essential techniques for building highly available internet services, covering system availability metrics, multi‑level caching, database and service isolation, concurrency control, gray‑release deployment, comprehensive monitoring, graceful degradation, asynchronous design, and data‑consistency scenarios for both real‑time and offline big‑data workloads.

Data Consistency, System Architecture, high availability
0 likes · 8 min read
Big Data Technology & Architecture
Jul 1, 2019 · Big Data

How to Ensure High Availability of Message Queues (RabbitMQ and Kafka)

This article explains the concept of high availability for message queues, analyzes interview expectations, and details the HA mechanisms of RabbitMQ (including single, normal cluster, and mirrored modes) and Kafka (partition replication and leader election), highlighting their advantages, drawbacks, and practical considerations.

Distributed Systems, Kafka, Message Queue
0 likes · 11 min read
Beike Product & Technology
Jun 28, 2019 · Backend Development

EPX: Real-Time MySQL Change Capture and Kafka Sync Architecture

EPX is a high‑availability, high‑performance data pipeline that captures MySQL binlog changes in real time, parses and filters them, and streams unified JSON events to Kafka for downstream services, while providing monitoring, alerting, backup, and migration capabilities across many business units.

Backend, Kafka, high availability
0 likes · 7 min read
Architecture Digest
Jun 24, 2019 · Backend Development

Evolution of Internet Architecture: From Single‑Server to Microservices

This article traces the evolution of internet architecture from simple single‑instance Java projects through Nginx load balancing, HA clusters, CDN, database read/write separation, NoSQL, distributed search, sharding, distributed file systems, service decomposition, and finally microservice architectures, explaining the motivations, techniques and trade‑offs of each stage.

architecture, high availability, load balancing
0 likes · 26 min read
ITPUB
Jun 22, 2019 · Databases

Master MySQL Replication, Sharding, and Distributed Deployment in 10 Minutes

This article provides a concise, ten‑minute guide to MySQL master‑slave and master‑master replication, data sharding principles and implementations, and various database deployment architectures—including single‑instance, replication‑based scaling, and sharding‑based scaling—while highlighting practical considerations, advantages, and common pitfalls.

Distributed Systems, Mycat, Replication
0 likes · 15 min read
Architecture Digest
Jun 21, 2019 · Backend Development

Design and Implementation of a High‑Availability Scalable IM Group‑Chat Messaging System

This article presents a comprehensive design and implementation of a high‑availability, horizontally scalable instant‑messaging group‑chat system, detailing its architecture, component interactions, scaling strategies, reliability mechanisms, and extensions for offline and single‑chat messaging.

IM, group chat, high availability
0 likes · 48 min read
Aikesheng Open Source Community
Jun 14, 2019 · Databases

Automatic Member Rejoin in MySQL Group Replication (MGR): Features, Configuration, and Monitoring

Starting with MySQL 8.0.16, Group Replication introduces an automatic member rejoin feature that allows expelled or disconnected nodes to attempt reconnection without manual intervention, configurable via the group_replication_autorejoin_tries variable, with monitoring via Performance Schema and trade‑offs compared to expel timeout.

Auto Rejoin, Group Replication, Performance Schema
0 likes · 11 min read
Architecture Digest
Jun 12, 2019 · Fundamentals

Comprehensive Guide to Distributed System Theory – Curated Article Collection

This resource compiles a complete series of articles on distributed system theory covering consistency, consensus, high availability, scalability, performance, testing, and operations, offering both quick overviews for newcomers and in‑depth readings for practitioners seeking to master modern distributed architectures.

Consistency, Scalability, architecture
0 likes · 8 min read
360 Quality & Efficiency
Jun 11, 2019 · Backend Development

NebulasFs: A Distributed High‑Availability Small‑File Storage System Developed by 360 Infrastructure Team

NebulasFs is a self‑developed distributed, highly available, and persistent storage system designed to efficiently store billions of small files, offering simple RESTful APIs, automatic request routing, multi‑tenant isolation, customizable replication, automated scaling, rebalancing, and fault‑tolerant replica recovery for large‑scale unstructured data workloads.

NebulasFs, Small Files, cloud
0 likes · 8 min read
360 Tech Engineering
Jun 11, 2019 · Databases

NebulasFs: A Distributed High‑Availability Small‑File Storage System

NebulasFs is a self‑developed distributed, highly available and durable storage system designed to efficiently store billions of small files by using a master‑datanode architecture, multi‑tenant isolation, customizable replication, automatic scaling, and automated replica repair, addressing the challenges of massive unstructured data generated by modern applications.

Cloud Native, NebulasFs, Replication
0 likes · 7 min read
Tencent Cloud Developer
May 24, 2019 · Cloud Computing

How Tencent Cloud Elasticsearch Enables Multi‑AZ Disaster Recovery

Tencent Cloud Elasticsearch now supports cross‑availability‑zone deployment, which requires an even number of data nodes, dedicated master nodes, and replica settings so that service continues when a zone fails. The article gives detailed quick‑setup steps and explains the region limitations.

Elasticsearch, Multi‑AZ, Tencent Cloud
0 likes · 6 min read
Architecture Talk
May 20, 2019 · Databases

From Zero to Redis Mastery: Why and How to Use Its Core Features

This article walks through Redis from a basic overview to advanced features such as persistence, Sentinel, clustering, data types, transactions, Lua scripting, pipelining, and distributed locks, illustrating each concept with practical examples and explaining when and why to use them in real‑world applications.

Data Types, Distributed Systems, Persistence
0 likes · 14 min read
21CTO
May 16, 2019 · Databases

How ICBC Transformed Its Legacy OLTP Systems with a Distributed MySQL Architecture

This article details the Industrial and Commercial Bank of China's multi‑year migration from mainframe‑based DB2 to a high‑availability, distributed MySQL solution, covering the challenges, strategic decisions, technical stack, containerization, operational improvements, and measurable business outcomes.

Cloud Native, banking IT, database migration
0 likes · 19 min read
Aikesheng Open Source Community
May 14, 2019 · Databases

Industrial and Commercial Bank of China's IT Architecture Transformation: A MySQL Distributed Enterprise Solution

This article details the Industrial and Commercial Bank of China's migration from traditional OLTP mainframe databases to a large‑scale, high‑availability MySQL distributed architecture, covering challenges, strategic goals, technology selection, implementation phases, performance improvements, and future directions.

Banking IT Transformation, containerization, database migration
0 likes · 19 min read
Big Data Technology Architecture
May 13, 2019 · Big Data

Problems Caused by Single-Point Region Assignment in HBase and Possible Solutions

The article analyzes how HBase regions being assigned to a single RegionServer create reliability issues such as jitter, service interruptions, and data loss, examines the underlying hardware, OS, and operational factors, and proposes system optimizations and replica-based high‑availability strategies to mitigate these problems.

Distributed Systems, HBase, Region
0 likes · 10 min read
Aikesheng Open Source Community
May 10, 2019 · Operations

Implementing Load Balancing for DBLE Using LVS and Keepalived

This article details the design and implementation of a high‑availability load‑balancing solution for the DBLE distributed middleware using LVS and Keepalived, covering environment setup, configuration, experimental scenarios, performance testing, and troubleshooting tips to ensure stable and balanced traffic distribution.

DBLE, LVS, high availability
0 likes · 13 min read
21CTO
Apr 24, 2019 · Databases

Which DB Architecture Wins? High Availability, Performance & Consistency Explained

This article examines core database architecture principles—high availability, performance, consistency, and scalability—and compares four common deployment patterns (primary‑standby, dual‑primary, primary‑replica with read/write separation, and a hybrid dual‑primary/replica design), followed by detailed consistency solutions and practical insights for real‑world implementation.

Consistency, Database Architecture, Read-Write Separation
0 likes · 11 min read
Architecture Digest
Apr 23, 2019 · Databases

Database Architecture Principles, Common Schemes, and Consistency Solutions

This article outlines core database architecture principles—high availability, performance, consistency, and scalability—examines four typical deployment schemes with their trade‑offs, and presents multiple consistency‑preserving strategies for both primary/replica and DB‑cache interactions.

Consistency, high availability, sharding
0 likes · 10 min read
Aikesheng Open Source Community
Apr 11, 2019 · Databases

MySQL Replication server_id and server_uuid: Pitfalls, Causes of Data Loss, and Best‑Practice Recommendations

This article explains how duplicate MySQL server_id or server_uuid values in replication topologies can cause data loss during high‑availability failover, illustrates the underlying mechanisms with diagrams, and provides practical configuration recommendations to avoid these issues.

Data loss, Replication, high availability
0 likes · 6 min read
Java Captain
Apr 3, 2019 · Backend Development

Understanding Message Queues: Benefits, Use Cases, and Challenges

This article explains what a message queue (MQ) is, why it is needed beyond in‑memory Java queues, and how it enables decoupling, asynchronous processing, peak‑shaving and rate‑limiting, while also discussing high‑availability, data‑loss, and consumer‑side considerations.

Decoupling, Message Queue, high availability
0 likes · 11 min read
MaGe Linux Operations
Mar 28, 2019 · Operations

Mastering Load Balancing: Principles, Types, and Algorithms Explained

Load balancing distributes incoming traffic across multiple servers to improve performance, ensure high availability, and enable scalability. This article explains its core principles, classifications such as DNS, IP, layer‑2, and hybrid methods, and common algorithms like round‑robin, least connections, hash, and weighted distribution.

DNS, Scalability, hardware load balancer
0 likes · 13 min read
Architects' Tech Alliance
Mar 25, 2019 · Cloud Computing

Tencent Cloud’s Intelligent Traffic Scheduling and High‑Redundancy Architecture Mitigate Shanghai Fiber‑Cut Outage

On March 23, a construction accident severed a fiber optic cable in Shanghai, causing widespread internet disruptions, but Tencent Cloud’s intelligent traffic scheduling system and four‑fiber‑three‑router high‑redundancy architecture automatically rerouted traffic, restoring services within two minutes and demonstrating robust cloud network resilience.

BGP · Network Resilience · Traffic Scheduling
0 likes · 6 min read
Architects' Tech Alliance
Mar 22, 2019 · Operations

Mastering Load Balancing: Principles, Types, and Algorithms Explained

This comprehensive guide explains why load balancing is essential for high‑traffic websites, details vertical and horizontal scaling, compares DNS, IP, link‑layer, and hybrid approaches, outlines common algorithms such as round‑robin and weighted, and reviews hardware versus software solutions.

Algorithms · Distributed Systems · Hardware
0 likes · 12 min read
Tencent Cloud Developer
Mar 21, 2019 · Cloud Computing

CynosDB Compute‑Intelligent Storage Architecture and High‑Availability Overview

The talk detailed CynosDB’s compute‑intelligent storage and multi‑read architecture, explaining TXSQL, Space Manager, DBStore, and Atlas’s two‑layer distributed storage with three‑replica nodes, high‑availability recovery, snapshot and migration features, and advanced data routing and I/O protocols for robust, fault‑tolerant database services.

CynosDB · high availability
0 likes · 11 min read
Efficient Ops
Mar 17, 2019 · Operations

Why Cold-Standby Disaster Recovery Fails and How Active‑Active Architecture Wins

Modern cloud outages reveal that cold‑standby or simple multi‑cloud promises often provide only psychological comfort; achieving true high availability requires active‑active designs with local traffic handling, data partitioning, and low‑latency synchronization, while balancing cost, complexity, and physical distance constraints.

Active-Active · Latency · data synchronization
0 likes · 10 min read
Xianyu Technology
Mar 14, 2019 · Operations

Ensuring High Availability of Search Engine Services: A Case Study of Xianyu's Search System

The article explains how Xianyu keeps its core Ha3‑based search engine highly available through independent gateway deployment, multi‑datacenter disaster recovery, traffic isolation, comprehensive monitoring, stress testing, gray releases, and automated and manual failover, enabling rapid issue detection, fast recovery, and continuous service stability.

System Architecture · disaster recovery · emergency response
0 likes · 19 min read
AntTech
Mar 12, 2019 · Databases

Evolution and Architecture of OceanBase Distributed Database

OceanBase, a fully proprietary distributed NewSQL database, has evolved over eight years to support high availability, strong consistency via Paxos, flexible replica management, OBProxy routing, LSM‑Tree storage, and migration tools, enabling seamless scaling, disaster recovery, and Oracle compatibility for large‑scale financial services.

Paxos · Replication · Scalability
0 likes · 15 min read
MaGe Linux Operations
Mar 8, 2019 · Operations

Mastering High‑Availability Clusters: Resources, Constraints, and Failure Handling

This article explains the principles and components of high‑availability (HA) clusters, covering active/standby nodes, resource stickiness and constraints, heartbeat and quorum mechanisms, split‑brain avoidance, failure detection methods, and the minimal setup required for a reliable web‑service HA deployment.

Heartbeat · Operations · Resource Management
0 likes · 14 min read
Tencent Cloud Developer
Mar 5, 2019 · Databases

Technical Sharing on Tencent Cloud's CynosDB: Architecture, High Availability, and Distributed Storage

Tencent Cloud’s CynosDB, a cloud‑native database compatible with MySQL and PostgreSQL, uses a compute‑storage separation architecture with computable intelligent storage, a one‑primary, multiple‑read‑replica design, and the distributed CynosStore to deliver high availability, fast recovery, elastic scaling, and pay‑as‑you‑go pricing for developers.

CynosDB · high availability · mysql
0 likes · 4 min read
Java Captain
Mar 3, 2019 · Databases

Redis Overview: Architecture, Persistence, High Availability, and Client Features

This article explains Redis as an in‑memory data store used for caching, database, and messaging, walks through its evolution from simple HTTP caching to dedicated servers, and details server‑side features like persistence, Sentinel, replication, clustering, as well as client‑side capabilities such as rich data types, transactions, Lua scripting, pipelining, and distributed locks.

Cluster · Distributed Systems · Persistence
0 likes · 11 min read
Architect's Tech Stack
Feb 26, 2019 · Databases

Database Architecture: Primary‑Backup, Master‑Slave, Read‑Write Splitting, and Consistency Solutions

This article explains fundamental database architecture principles, compares four common deployment patterns—including primary‑backup, dual‑primary, master‑slave with read‑write separation, and a hybrid dual‑primary/master‑slave design—analyzes their high‑availability, performance, consistency, and scalability characteristics, and presents practical consistency‑resolution techniques and personal insights.

Consistency · Database Architecture · caching
0 likes · 10 min read
Java High-Performance Architecture
Feb 19, 2019 · Backend Development

Mastering System Degradation: Keep Your Services Highly Available

This guide explains why degradation is a vital protection mechanism, outlines five strategies across automation, functional, and system‑level dimensions, and details practical implementations such as automatic and manual switches, read/write service fallback, and multi‑level degradation to maintain core functionality under heavy load.

backend reliability · high availability · service fallback
0 likes · 7 min read
Java Captain
Feb 9, 2019 · Databases

Evolution of Redis Cluster Architecture and Interview Tips

This article reviews the progression of Redis high‑availability solutions—from simple replication with Sentinel, through proxy‑based setups, to native Redis Cluster—explains their advantages and drawbacks, and provides practical interview Q&A tips for candidates.

Cluster · Proxy · database
0 likes · 9 min read
ITPUB
Jan 28, 2019 · Databases

How to Prevent Redis Single‑Point Failures with Sentinel and Master‑Slave Replication

This article explains Redis’s key features and common use cases, highlights the risks of single‑node deployments, compares the pros and cons of two master‑slave replication architectures, and details how Redis Sentinel provides automated monitoring, failover, and configuration to achieve high availability.

database · disaster recovery · high availability
0 likes · 10 min read
dbaplus Community
Jan 27, 2019 · Databases

How to Build a Highly Available Redis Service with Sentinel and Virtual IP

This article walks through the design of a highly available Redis deployment, explains common failure scenarios, compares single‑node, master‑slave, and multi‑Sentinel architectures, and shows how adding a virtual IP and three Sentinel instances can provide robust HA while keeping client usage simple.

database · high availability · redis
0 likes · 13 min read
Aikesheng Open Source Community
Jan 25, 2019 · Databases

Highlights from the 2019 MySQL Technical Exchange Conference in Shenzhen

The 2019 MySQL Technical Exchange Conference in Shenzhen, co‑hosted by Shanghai Aikesheng and Oracle, featured six expert talks covering secure MySQL platform design, large‑scale architecture optimization, backup and recovery strategies, open‑source middleware, automated operations, and InnoDB row‑lock mechanics, followed by community Q&A and a look ahead to open‑source initiatives.

Database Architecture · InnoDB · backup and recovery
0 likes · 5 min read
58 Tech
Jan 17, 2019 · Databases

Insights from the 58 Group Technical Salon: Database Operations Platform Construction and Practices

The article summarizes the 58 Group technical salon where experts from Tujia.com, Kingsoft Cloud, and 58 Group shared their experiences on building block‑based database automation systems, cloud database architectures, high‑availability designs, self‑service platforms, and intelligent operation practices for large‑scale database services.

DB Operations · Performance Monitoring · Self-Service Platform
0 likes · 14 min read
Zhuanzhuan Tech
Jan 4, 2019 · Backend Development

Design and Implementation of ZZLock Distributed Lock Service Based on Etcd

This article details the requirements analysis, design choices, architecture, code implementation, monitoring, and special considerations of ZZLock, a high‑performance distributed lock solution built on Etcd that ensures atomicity, consistency, and fault‑tolerance for backend services.

Backend · distributed-lock · etcd
0 likes · 10 min read
dbaplus Community
Dec 27, 2018 · Operations

How JD Daojia Scaled Its Order Search with a Real‑Time Dual Elasticsearch Cluster

This article details how JD Daojia’s order center migrated from MySQL‑only reads to a multi‑stage Elasticsearch architecture, describing each evolution step, data‑sync strategies, performance pitfalls, and the final real‑time active‑passive cluster that ensures high availability for billions of daily queries.

Cluster Architecture · Elasticsearch · data-sync
0 likes · 14 min read
Programmer DD
Dec 23, 2018 · Operations

How to Implement Service Degradation for High Availability

This article explains the concept of service degradation, why it is needed to maximize limited resources during traffic spikes, outlines common degradation strategies, and provides practical steps and code examples for ranking, sequencing, and implementing degradation in both front‑end and back‑end systems.

Operations · System Design · degradation
0 likes · 11 min read
Tongcheng Travel Technology Center
Dec 14, 2018 · Big Data

Design and Architecture of Jarvis: A DAG‑Based Big Data Scheduling Platform

The article describes the design goals, architecture, and key components of Jarvis, an internal DAG‑driven job scheduling platform for big‑data pipelines, covering timed‑shard and workflow schedulers, high‑availability mechanisms, task development for Hive and data‑transfer jobs, dependency handling, APIs, monitoring, and future enhancements.

DAG · Job Scheduling · high availability
0 likes · 17 min read
ITPUB
Dec 7, 2018 · Databases

Mastering Redis High Availability: Sentinel, Cluster, and Real‑World Architectures

This article provides a comprehensive guide to Redis high‑availability solutions, detailing Sentinel principles, multiple HA architectures such as DNS‑based, VIP‑based, Keepalived, Redis Cluster, Twemproxy and Codis, and shares practical best‑practice recommendations for deployment and failover.

Cluster · high availability · redis
0 likes · 14 min read
Java Backend Technology
Dec 4, 2018 · Databases

Mastering MySQL: A Practical Knowledge Map of Deployment Scenarios

This article presents a comprehensive knowledge map of MySQL deployment scenarios—including single‑master, master‑slave, master‑multiple‑slaves, horizontal and vertical clustering, and mixed modes—detailing backup methods, performance tuning, scaling strategies, and high‑availability considerations.

Backup Strategies · Database Architecture · high availability
0 likes · 8 min read
21CTO
Dec 3, 2018 · Operations

How JD Daojia Scaled Its Elasticsearch Cluster to Billions of Docs: Lessons and Pitfalls

This article details JD Daojia's order center Elasticsearch architecture evolution—from a chaotic initial deployment to a real‑time dual‑cluster backup—covering scaling strategies, data synchronization methods, and the practical pitfalls encountered along the way.

Cluster Architecture · Elasticsearch · data synchronization
0 likes · 14 min read
Dada Group Technology
Nov 30, 2018 · Big Data

Evolution of JD Daojia Order Center Elasticsearch Cluster: Architecture, Scaling, and Lessons Learned

This article details how JD Daojia's order center migrated from MySQL to a multi‑stage Elasticsearch cluster—covering initial deployment, isolation, replica tuning, primary‑secondary setup, real‑time dual‑cluster upgrades, data synchronization methods, and key pitfalls—to achieve massive scalability, high availability, and performance for billions of orders.

Cluster Architecture · Elasticsearch · Scalability
0 likes · 13 min read
Meituan Technology Team
Nov 22, 2018 · Backend Development

Evolution of Meituan Instant Logistics Distributed System Architecture and Technical Challenges

The article chronicles Meituan’s instant‑logistics system evolution from vertical services to micro‑services, detailing how massive order scale, ultra‑low latency, and fault‑intolerance drove a CAP‑compliant, stateless distributed architecture with AI‑enhanced pricing and matching, robust data‑sync via Databus, automated disaster recovery, and emerging AIOps challenges.

Logistics · Meituan · Microservices
0 likes · 16 min read
Alibaba Cloud Infrastructure
Nov 21, 2018 · Cloud Computing

Alibaba Data Center Network Architecture HAIL 5.1: High Availability, De‑stacking, and Low‑Latency RDMA Design

The article describes Alibaba's HAIL 5.1 data‑center network architecture introduced for the 2018 Double‑11 event, detailing its high‑availability de‑stacking design, low‑latency RDMA deployment, and future HAIL 2.0 evolution to support larger‑scale, intelligent, and high‑performance cloud networking.

Low latency · RDMA · data center
0 likes · 9 min read