Tagged articles
92 articles
Radish, Keep Going!
Jan 30, 2026 · Big Data

How Uber Scaled Data Replication to Petabytes Daily with Distcp Optimizations

Uber tackled the challenge of replicating over 350 PB of data across on‑premise and cloud lakes by redesigning Hadoop Distcp, moving intensive tasks to the Application Master, parallelising copy‑listing and commit phases, and leveraging Uber‑mapper jobs to dramatically cut latency and improve resource efficiency.

Big Data · Distcp · Hadoop
0 likes · 17 min read

Architecture Breakthrough
Dec 30, 2025 · Industry Insights

When to Replicate Data Locally vs. Rely on Central Services? A Deep Dive into Middle‑Platform Trade‑offs

This article analyzes the strategic decision of using local data copies or caches versus central middle‑platform services, examining performance, frequency, cost, technical complexity, and organizational impact through the lens of CAP theorem and modern cloud‑native architecture.

CAP theorem · Microservices · architecture
0 likes · 9 min read

Ctrip Technology
Dec 5, 2025 · Databases

How Ctrip’s DRC Enables High‑Performance Cross‑Region MySQL Replication

This article explains the design and implementation of Ctrip's Data Replication Center (DRC), a MySQL‑based high‑availability system that handles cross‑region replication loops, progress tracking, concurrency, DDL changes, and conflict resolution to achieve low‑latency, reliable data replication for global travel services.

Distributed Systems · GTID · cross-region
0 likes · 21 min read

NiuNiu MaTe
Sep 4, 2025 · Operations

Mastering Multi‑Active Distributed Systems: From Single Server to Global Fault Tolerance

This article walks developers through the evolution of distributed system architectures—from single‑machine deployments to master‑slave, same‑city active‑active, and finally true multi‑active setups—explaining core concepts, replication strategies, conflict resolution, fault detection, switch mechanisms, recovery methods, and interview tips for high‑availability design.

CAP theorem · Distributed Systems · Interview Preparation
0 likes · 26 min read

Ops Community
Aug 25, 2025 · Operations

How DRBD Can Save Your Production Data from Disasters

This article explains why most companies suffer long recovery times after data loss, introduces DRBD's real‑time block replication as a solution, and provides detailed architecture designs, deployment steps, monitoring scripts, performance tuning, cost analysis, common pitfalls, and future trends for reliable disaster recovery.

DRBD · Linux · data replication
0 likes · 9 min read

Tech Freedom Circle
Aug 4, 2025 · Operations

How Do Projects Achieve High Availability Without Multi‑Site Active‑Active? – A Meituan Interview Question

The article analyzes high‑availability concepts, from single‑machine risks to multi‑site active‑active architectures, compares cold and hot backup strategies, discusses network latency challenges, and presents Ele.me’s cell‑based, sharding‑driven multi‑region solution with concrete examples, tables, and code snippets.

cell-based architecture · data replication · disaster recovery
0 likes · 28 min read

Demystifying Consistency Models: From Linear to Eventual in Distributed Systems

This article explores the concept of consistency in distributed systems, breaking down various consistency models—including linear, sequential, causal, and eventual—explaining their definitions, practical implications, and how they guide the design of high‑availability architectures and data replication strategies.

Consistency · Distributed Systems · consistency models
0 likes · 13 min read

Cognitive Technology Team
Jun 21, 2025 · Fundamentals

Understanding Faults, Failures, and Fault Tolerance in Distributed Systems

This tutorial explains the definitions of faults and failures in distributed systems, explores their types and root causes, and presents fault‑tolerance mechanisms such as replication, checkpointing, redundancy, error detection, load balancing, and consensus algorithms to build resilient architectures.

Distributed Systems · consensus algorithms · data replication
0 likes · 10 min read

Why Data Replication Matters: Architectures, Formats, and Consistency Models

This article explores the principles of data replication, comparing shared memory, shared disk, and non‑shared storage architectures, detailing replication formats, consistency challenges, and various replication strategies such as synchronous, asynchronous, semi‑synchronous, and majority‑based approaches, helping engineers choose the right trade‑offs.

Asynchronous Replication · Consistency · data replication
0 likes · 12 min read

Cognitive Technology Team
Apr 13, 2025 · Backend Development

Understanding RocketMQ Master‑Slave Architecture and High‑Availability Mechanisms

This article explains how RocketMQ achieves high availability and data reliability through its master‑slave broker design, covering synchronous and asynchronous replication, flush strategies, transaction messaging, automatic failover with Dledger, and read‑write separation for load balancing in distributed systems.

Distributed Systems · Master‑Slave · RocketMQ
0 likes · 7 min read

Architect
Jan 3, 2025 · Operations

Designing Multi‑Active Distributed Systems: Overcoming Write Latency and Data Replication Challenges

This article analyzes the architectural impact of cross‑city multi‑active deployments, focusing on data‑layer design, write latency, sharding strategies, replication topologies, and routing considerations to achieve high availability, performance, and scalability in large‑scale distributed systems.

Distributed Systems · data replication · multi-active architecture
0 likes · 22 min read

dbaplus Community
Jan 1, 2025 · Backend Development

Mastering Multi-Active Data Architecture: Reducing Write Latency and Ensuring High Availability

This article examines the challenges of building multi‑active distributed systems, focusing on the data layer’s role in high availability, write‑latency, sharding, isolation, replication strategies, and routing decisions, and provides concrete architectural patterns and practical guidelines for robust backend design.

Distributed Systems · Latency · data replication
0 likes · 23 min read

Efficient Ops
Oct 23, 2024 · Databases

How NineData Boosts R&D Collaboration 5× with Multi‑Cloud Database Management

The NineData presentation at the 2024 GOPS Global Operations Conference in Shanghai detailed multi‑cloud, multi‑source database architecture trends, showcased their intelligent data management platform, explained data replication principles, DevOps challenges and AI‑enhanced solutions, and highlighted real‑world customer success stories across industries.

AI · Cloud Native · DevOps
0 likes · 11 min read

Architect
Oct 17, 2024 · Operations

Designing Multi‑Active Distributed Systems: Key Factors and Replication Strategies

This article analyzes the architectural challenges of building large‑scale distributed systems with multi‑active (cross‑city) capabilities, focusing on data‑layer design, write latency, replication models, sharding techniques, and routing impacts to guide reliable, high‑performance infrastructure decisions.

Distributed Systems · architecture · data replication
0 likes · 22 min read

Tencent Cloud Developer
Oct 15, 2024 · Industry Insights

Why Write Latency Drives Multi‑Active Distributed Architecture Design

This article analyzes how write latency, write volume, isolation, and data replication strategies influence the design of multi‑active distributed systems, offering practical guidance on sharding, synchronous and asynchronous replication, routing, and architecture selection for high availability and performance across regions.

Distributed Systems · data replication · high availability
0 likes · 23 min read

Architect
Mar 18, 2024 · Databases

MySQL vs PostgreSQL: Overview, Performance Benchmark, and Use‑Case Guidance

The article introduces MySQL and PostgreSQL, compares their performance through a benchmark on identical hardware, discusses each system’s strengths, weaknesses, and suitable application scenarios, and provides guidance on choosing the appropriate database for different workloads.

PostgreSQL · SQL · Use Cases
0 likes · 8 min read

Open Source Linux
Mar 1, 2024 · Operations

How Two‑Site Three‑Center Disaster Recovery Boosts Business Continuity with Oracle Data Guard

The two‑site three‑center disaster recovery model combines a production site, a same‑city backup, and a remote backup to ensure data integrity and rapid recovery, leveraging Oracle Data Guard for synchronous and asynchronous replication, thereby improving RPO and RTO across various disaster scenarios.

Operations · Oracle Data Guard · business continuity
0 likes · 4 min read

Didi Tech
Nov 14, 2023 · Databases

Didi's Multi-Active Redis Architecture: Design, Challenges, and Solutions

To achieve disaster-recovery and cross-data-center resilience, Didi progressed from a simple proxy double-write scheme to a sophisticated MQ-free multi-active Redis design that uses a dedicated syncer, shard-based loop prevention, op-id replay protection, conflict detection, and incremental AOF durability, ensuring low latency, no data loss, and consistent availability.

Didi · Distributed Systems · data replication
0 likes · 11 min read

FunTester
Jun 19, 2023 · Big Data

Kafka Architecture and Core Concepts: Brokers, Producers, Consumers, Topics, Partitions, Replicas, and Reliability

This article provides a comprehensive overview of Kafka's architecture and fundamental concepts, covering its overall structure, key components such as brokers, producers, consumers, topics, partitions, replicas, leader‑follower synchronization, offset handling, message storage at both logical and physical layers, as well as producer and consumer workflows, partition assignment strategies, rebalancing, log management, zero‑copy I/O, and reliability mechanisms.

Distributed Systems · Kafka · Log Management
0 likes · 22 min read

Programmer DD
Jun 3, 2023 · Databases

Master MySQL Binlog: Sync Data and Power Business Innovations

This article explains MySQL's binlog, its role in master‑slave replication, and how businesses can harness it for data heterogeneity, cache synchronization, and task dispatch, illustrating practical middleware designs that transform raw changes into valuable services.

Backend · Binlog · data replication
0 likes · 7 min read

ITPUB
Feb 13, 2023 · Fundamentals

How a Bat-Borne Virus Explains the Gossip Protocol in Distributed Systems

Using a fictional coronavirus carried by a bat, the article illustrates the Gossip protocol’s mechanisms—direct mail, anti-entropy, and epidemic spread—to explain how distributed systems achieve eventual consistency, highlighting advantages, drawbacks, and practical considerations for storage components like Cassandra.

Anti-entropy · Distributed Systems · Gossip Protocol
0 likes · 10 min read
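The anti-entropy mechanism named in this summary can be sketched as a small simulation. This is a minimal illustration and not code from the article: the function names (`gossip_round`, `rounds_until_converged`), the push-pull exchange, and the single `update-1` item are all assumptions chosen for the example.

```python
import random


def gossip_round(states, rng):
    """One anti-entropy round: every node syncs with one randomly chosen peer.

    Assumes at least two nodes, so that a peer always exists.
    """
    nodes = list(states)
    for node in nodes:
        peer = rng.choice([n for n in nodes if n != node])
        # Push-pull exchange: both sides end up with the union of what they knew.
        merged = states[node] | states[peer]
        states[node] = set(merged)
        states[peer] = set(merged)


def rounds_until_converged(num_nodes, seed=0, max_rounds=100):
    """Return how many rounds pass before every node holds the update, or None."""
    rng = random.Random(seed)  # seeded so that repeated runs give the same answer
    states = {i: set() for i in range(num_nodes)}
    states[0].add("update-1")  # only node 0 knows the update at the start
    for round_no in range(1, max_rounds + 1):
        gossip_round(states, rng)
        if all("update-1" in known for known in states.values()):
            return round_no
    return None
```

Calling `rounds_until_converged(16)` returns the number of rounds needed before all 16 simulated nodes hold the update; the exact count depends on the seed.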
dbaplus Community
Feb 8, 2023 · Big Data

How Bilibili Scaled Offline Processing Across Multiple Data Centers

This article details Bilibili's multi‑datacenter offline architecture, explaining the capacity challenges, the chosen scale‑out design, and the implementation of job placement, data replication, routing, versioning, throttling, and traffic analysis to efficiently handle massive batch workloads across geographically distributed clusters.

HDFS · bandwidth optimization · data replication
0 likes · 26 min read

Architects' Tech Alliance
Nov 5, 2022 · Databases

Data Replication: Fundamentals, Technologies, and Future Trends

This article explains the concept of data replication, its three-stage process, key principles of compliance, timeliness, and diversity, various replication methods, layered technologies across storage, operating system, and database levels, emerging cloud and big‑data solutions, and heterogeneous use‑case scenarios.

Big Data · data replication · databases
0 likes · 15 min read

DataFunTalk
Sep 4, 2022 · Big Data

Design and Implementation of Bilibili's Offline Multi‑Datacenter Solution

This article describes Bilibili's offline multi‑datacenter architecture, explaining why a scale‑out approach was chosen over scale‑up, and detailing the unit‑based design, job placement, data replication, routing, versioning, bandwidth throttling, traffic analysis, and the operational results and future directions.

Big Data · HDFS · Job Scheduling
0 likes · 24 min read

Architects' Tech Alliance
Aug 28, 2022 · Databases

Data Replication: Fundamentals, Technologies, and Industry Trends

The article explains data replication concepts, processes, and technologies across storage hardware, operating system, and database layers, outlines synchronous, asynchronous, and hybrid methods, discusses industry applications, trends such as hardware‑software decoupling, cloud replication, and big‑data real‑time copying, and highlights challenges and future directions.

Big Data · cloud · data replication
0 likes · 14 min read

Meituan Technology Team
Aug 25, 2022 · Databases

Data Replication in Distributed Systems – Part 1: Models, Challenges, and Design Considerations

The article surveys three data‑replication models—master‑slave, multi‑master, and leaderless—explains how they enable scalability and fault‑tolerance, and examines core distributed‑system challenges such as partial node failures, unreliable networks, and unsynchronized clocks, while stressing safety‑liveness trade‑offs and design techniques like quorum and timeouts.

Consistency · Distributed Systems · Kafka
0 likes · 37 min read
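The quorum technique mentioned in this summary is commonly stated as an overlap condition for leaderless replication: with N replicas, W write acknowledgements, and R read responses, every read set shares at least one replica with every write set when W + R > N. A minimal sketch, where the helper names and the `(version, value)` reply format are assumptions for illustration only:

```python
def quorums_overlap(n, w, r):
    """Return True if any W-replica write set and R-replica read set must share a replica."""
    return w + r > n


def read_latest(replies):
    """Return the value carrying the highest version among (version, value) replies."""
    version, value = max(replies, key=lambda reply: reply[0])
    return value


# With 3 replicas, writing to 2 and reading from 2 guarantees an overlap.
print(quorums_overlap(3, 2, 2))  # True
# Writing to 1 and reading from 1 does not.
print(quorums_overlap(3, 1, 1))  # False
# The reader keeps the newest version it saw.
print(read_latest([(1, "old"), (2, "new")]))  # new
```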
Efficient Ops
Jul 19, 2022 · Databases

How CDC Powers Real-Time Analytics Without Overloading Your Database

This article introduces the practice of Change Data Capture (CDC), explaining how capturing only data changes can feed downstream systems and data warehouses in near real‑time, reducing load on the source database, improving reporting latency, and supporting scalable, reliable analytics pipelines.

CDC · Change Data Capture · Real-time analytics
0 likes · 9 min read
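The core idea in this summary, shipping only row-level changes downstream instead of re-reading whole tables, can be sketched as follows. The event shape (`op`, `key`, `row`) and the function names `apply_change` and `replay` are hypothetical and do not correspond to any particular CDC tool.

```python
def apply_change(replica, event):
    """Apply one change event to a downstream copy held as a dict keyed by primary key."""
    op = event["op"]
    key = event["key"]
    if op in ("insert", "update"):
        replica[key] = event["row"]
    elif op == "delete":
        replica.pop(key, None)  # deleting a missing key is treated as a no-op
    else:
        raise ValueError(f"unknown operation: {op}")


def replay(events):
    """Build a downstream copy by applying the change events in order."""
    replica = {}
    for event in events:
        apply_change(replica, event)
    return replica


changes = [
    {"op": "insert", "key": 1, "row": {"name": "alice"}},
    {"op": "insert", "key": 2, "row": {"name": "bob"}},
    {"op": "update", "key": 1, "row": {"name": "alice2"}},
    {"op": "delete", "key": 2},
]
print(replay(changes))  # {1: {'name': 'alice2'}}
```

The downstream copy only ever processes four small events here; the source table is never scanned again.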
Bilibili Tech
Jul 5, 2022 · Big Data

Multi‑Datacenter Architecture for Offline Big Data Processing at Bilibili

To overcome rapid data growth and on‑premise capacity limits, Bilibili adopted a scale‑out, unit‑based multi‑datacenter architecture that isolates failures, intelligently places jobs, replicates data via an enhanced DistCp service, routes reads with an IP‑aware HDFS router, and throttles cross‑site traffic, enabling stable offline big‑data processing of hundreds of petabytes while preserving throughput.

HDFS · YARN · bandwidth optimization
0 likes · 28 min read

Top Architect
May 11, 2022 · Databases

An Introduction to Change Data Capture (CDC) Practices

This article introduces the concept and practice of Change Data Capture (CDC), explaining why CDC is needed for real‑time analytics, how it works by capturing DML changes, modern approaches using transaction logs, and key considerations for building a production‑ready CDC system.

CDC · Change Data Capture · Data Integration
0 likes · 8 min read

Aikesheng Open Source Community
Apr 19, 2022 · Databases

DTLE 4.22.04.0 Release Notes – New Features and Fixes

The DTLE 4.22.04.0 release introduces UTF‑32 support, Chinese table name handling, enhanced ColumnMapFrom conversion, expanded SqlFilter capabilities, additional logging, and numerous bug fixes including procedure handling, DDL panic prevention, and task list display issues, with links to documentation and related articles.

DTLE · Feature Updates · data replication
0 likes · 4 min read

Top Architect
Mar 22, 2022 · Big Data

Elasticsearch Cluster Architecture and Data Layer Design

This article explains Elasticsearch's cluster architecture, including nodes, indices, shards, and replicas, compares mixed and tiered deployment models, discusses the data storage layer and replication trade‑offs, and presents two typical distributed data system designs with their advantages and drawbacks.

Cluster Architecture · Elasticsearch · data replication
0 likes · 14 min read

Sanyou's Java Diary
Mar 20, 2022 · Operations

Unlocking Ultra‑High Availability: The Secrets of Geo‑Active Multi‑Active Architecture

This article explains what geo‑active multi‑active (异地多活) architecture is, why it is needed for ultra‑high availability, and walks through the step‑by‑step evolution from a single‑node system to sophisticated multi‑data‑center designs that use redundancy, disaster‑recovery, data synchronization, routing, and conflict‑resolution techniques.

data replication · disaster recovery · multi-active
0 likes · 31 min read

DeWu Technology
Jan 19, 2022 · Operations

Common High‑Availability Architecture Patterns and Multi‑Active Deployment Strategies

Covering essential high‑availability techniques, the article examines disaster‑recovery architectures from same‑city dual‑center to cross‑country active‑passive deployments, compares five patterns, details three multi‑active models, outlines required traffic‑scheduling, replication, and database layers, and provides design methodology, practical safeguards, and key HA metrics.

Distributed Systems · data replication · disaster recovery
0 likes · 23 min read

Architect
Dec 31, 2021 · Operations

Understanding Distributed System High Availability: From Single‑Node to Multi‑Active Architecture

This article explains the principles, evolution, and implementation details of high‑availability architectures—from basic single‑node setups to multi‑active, cross‑region deployments—covering redundancy, disaster recovery, data synchronization, routing strategies, and the challenges of achieving true geo‑distributed active‑active systems.

Active-Active · Distributed Systems · System Architecture
0 likes · 30 min read

dbaplus Community
Nov 7, 2021 · Databases

Why Distributed Databases Matter: From Early DBMS to Modern NewSQL

This article traces the evolution of database systems—from the first network and hierarchical models through relational databases, NoSQL sharding, and finally modern distributed SQL—explaining why distributed databases emerged, how they handle data distribution, consistency, SQL-to‑KV mapping, and transaction challenges.

NewSQL · SQL to KV · consensus algorithms
0 likes · 16 min read

360 Tech Engineering
Aug 25, 2021 · Big Data

Cross‑IDC Kafka Hot‑Standby with MirrorMaker 2: Architecture, Design, and Productization

This article explains how 360 Commercialization implements cross‑IDC hot‑standby for Kafka using MirrorMaker 2, covering MM2 fundamentals, architecture, internal topics, deployment on Kubernetes, design goals, solution details, challenges such as dynamic configuration and offset reverse‑mapping, and productized risk mitigation.

Kafka · MirrorMaker2 · cross-IDC
0 likes · 11 min read

Programmer DD
Jun 1, 2021 · Fundamentals

What Makes Distributed File Systems Tick? Design Principles and Architecture Explained

This article explores the core concepts, design requirements, architectural models, scalability, high availability, performance optimization, and security considerations of distributed file systems, comparing centralized and decentralized approaches while highlighting practical solutions for persistence, consistency, and fault tolerance.

Consistency · Distributed File System · Scalability
0 likes · 21 min read

Top Architect
May 4, 2021 · Big Data

Overview of CDC Tools: Canal, Maxwell, Databus, and Alibaba DTS

This article introduces four change‑data‑capture solutions—Canal, Maxwell, Databus, and Alibaba Data Transmission Service (DTS)—explaining their principles, processing steps, features, and practical advantages for real‑time data synchronization and migration in big‑data environments.

Alibaba DTS · Big Data · CDC
0 likes · 6 min read

Wukong Talks Architecture
Feb 24, 2021 · Fundamentals

Understanding the Gossip Protocol Through a Virus Analogy

The article uses a whimsical story of a coronavirus‑like virus transmitted from a bat to humans to illustrate the Gossip protocol, its three functions—direct mail, anti‑entropy, and epidemic spread—and discusses their advantages, drawbacks, and practical applications in achieving eventual consistency in distributed systems.

Anti-entropy · Distributed Systems · Gossip Protocol
0 likes · 10 min read

Tencent Cloud Developer
Dec 18, 2020 · Cloud Computing

Multi-AZ Deployment and High Availability Practices for Tencent Cloud Elasticsearch

The guide explains how to configure Tencent Cloud Elasticsearch clusters for multi‑AZ high availability by using zone‑aware node attributes, deploying data nodes in multiples of AZs, assigning three dedicated masters across zones, setting replica shards and force‑awareness rules, and safely upgrading single‑AZ clusters without service interruption.

Cluster Deployment · Elasticsearch · Multi‑AZ
0 likes · 11 min read

Architecture Digest
Nov 15, 2020 · Cloud Native

Ele.me's Multi‑Active Architecture: Design Principles, Core Components and Implementation Overview

This article explains how Ele.me built a multi‑active, geographically distributed system that enables elastic scaling and data‑center‑level disaster recovery by partitioning services, routing traffic, replicating data in real time, and enforcing strict consistency and availability principles.

Distributed Systems · data replication · multi-active
0 likes · 18 min read

IT Architects Alliance
Nov 1, 2020 · Industry Insights

What Are the Five Core Data Replication Techniques for Disaster Recovery?

This article breaks down the five major data replication approaches—application‑level, host‑level, database‑level, storage‑gateway, and storage‑media—detailing their principles, advantages, drawbacks, and typical use cases to help professionals design effective disaster‑recovery solutions.

Backup · Replication Techniques · data replication
0 likes · 12 min read

Architects' Tech Alliance
Nov 1, 2020 · Databases

Overview of Data Replication Technologies for Disaster Recovery

The article introduces the 2021 China Disaster Recovery Whitepaper and explains five layers of data replication—application, host, database, storage‑gateway, and storage‑media—detailing their mechanisms, advantages, limitations, and use cases in modern backup and business continuity solutions.

Backup · data replication · storage
0 likes · 12 min read

Architecture Digest
Oct 26, 2020 · Fundamentals

How to Systematically Learn Distributed Systems: Problems, Solutions, and Emerging Challenges

This article outlines why distributed systems are needed, explains how they address cost and high‑availability issues by coordinating cheap nodes, and discusses the new coordination challenges such as service discovery, load balancing, fault isolation, monitoring, data partitioning, replication, and distributed transactions, providing a roadmap for further study.

Distributed Systems · data replication
0 likes · 11 min read

JD Cloud Developers
Sep 22, 2020 · Fundamentals

Designing High‑Reliability Storage Systems: Strategies from JD Cloud & Intel

An in‑depth look at how JD Cloud’s high‑reliability storage architecture tackles data reliability challenges—covering replica management, redundancy, detection and repair mechanisms, tiered storage designs, and Intel Optane’s role in boosting performance—offering practical strategies for balancing cost and resilience.

Intel Optane · Reliability · cloud
0 likes · 17 min read

Architects' Tech Alliance
Sep 19, 2020 · Fundamentals

How to Systematically Learn Distributed Systems: Problems, Solutions, and Emerging Challenges

This article outlines why distributed systems are needed, explains how they address cost and high‑availability issues through coordinated nodes, and discusses the new challenges such as service discovery, load balancing, avalanche prevention, monitoring, data sharding, replication, and distributed transactions, while offering practical and theoretical learning paths.

CAP theorem · Distributed Systems · Learning Guide
0 likes · 10 min read

Java Captain
Aug 15, 2020 · Databases

Comprehensive SQL Server Cheat Sheet: Basics, Advanced Queries, Administration, and Replication

This article provides a detailed collection of SQL Server commands and techniques covering database creation, table and index management, common DML statements, advanced set operators, various join types, grouping, pagination, maintenance tasks, replication setup, linked‑server usage, and a synchronization stored procedure.

Database Administration · Linked Server · SQL
0 likes · 22 min read

Ctrip Technology
May 28, 2020 · Databases

Design and Implementation of Ctrip's Data Replicate Center (DRC) for MySQL Multi‑Active Replication

The article describes Ctrip's Data Replicate Center (DRC), a MySQL middleware that enables real‑time bidirectional replication across data‑center clusters, detailing its architecture, low‑latency optimizations, consistency mechanisms, DDL handling, monitoring, and future high‑availability improvements.

DRC · Database Middleware · GTID
0 likes · 16 min read

Ziru Technology
Sep 6, 2019 · Backend Development

How Alibaba Canal Enables Real-Time MySQL Binlog Replication and Incremental Data Sync

Canal, an open‑source Alibaba project, mimics MySQL slave behavior to subscribe to binlog events, parses them, and supports both standalone and ZooKeeper‑coordinated cluster deployments, offering flexible state storage, message processing pipelines, and integration options such as TCP, Kafka, and RocketMQ for real‑time data synchronization.

Binlog · Canal · Java
0 likes · 11 min read

Architects' Tech Alliance
Jul 31, 2019 · Databases

Overview of Five Common Data Replication Technologies

This article introduces the global data replication market, explains synchronous and asynchronous replication, and details five typical replication techniques—host‑based, application/middleware‑based, database‑based, storage‑gateway‑based, and storage‑media‑based—highlighting their principles, advantages, and trade‑offs for disaster‑recovery planning.

Asynchronous Replication · data replication · database
0 likes · 11 min read

JD Retail Technology
Jul 31, 2019 · Fundamentals

Consistency Levels and Consensus Algorithms: Paxos, ZAB, and Raft

This article explains distributed data consistency concepts, the CAP theorem, various consistency levels, and provides detailed overviews of three major consensus algorithms—Paxos, ZAB, and Raft—including their mechanisms, roles, and practical applications such as in CB‑SQL.

Distributed Systems · Paxos · Raft
0 likes · 18 min read

Big Data Technology & Architecture
Apr 3, 2019 · Big Data

Understanding RAID and Its Role in HDFS Architecture

This article explains the storage challenges of big data, introduces RAID technologies and their variants, and shows how the principles of RAID are applied in the Hadoop Distributed File System (HDFS) to achieve scalable, reliable, and high‑performance data storage and processing.

Big Data · HDFS · RAID
0 likes · 10 min read

21CTO
Jan 28, 2019 · Databases

How Alibaba’s TDDL Evolved from Cobar to Power Billions of Daily Queries

This article traces the evolution of Alibaba’s distributed data layer—from the early Cobar system to the modern TDDL framework and DRDS service—explaining their architectures, limitations, sharding principles, transaction‑boundary strategies, heterogeneous index tables, and the Jingwei data‑replication platform that together enable seamless scaling and high‑performance SQL processing across thousands of databases.

Alibaba · DRDS · Jingwei
0 likes · 29 min read
Aikesheng Open Source Community
Dec 30, 2018 · Databases

Introducing DTLE: An Open‑Source MySQL Data Transfer Middleware for CDC, Replication, and Cloud Synchronization

The article presents DTLE, an open‑source MySQL data‑transfer middleware that extends replication capabilities with high‑performance CDC, multi‑topology support, cloud‑to‑cloud synchronization, and robust cluster management, while comparing it with other open‑source solutions and showcasing real‑world demos.

CDC · DTLE · cloud sync
0 likes · 14 min read
ITPUB
Dec 3, 2018 · Databases

How Ele.me Deployed a Dual‑Active Database System in Just Three Months

This talk details Ele.me’s three‑month rollout of a dual‑active (multi‑active) database system, covering design principles, architecture, migration steps, challenges such as data consistency and DDL handling, the tools built (DRC, D‑Bus, DCP, EMHA), and the performance and reliability benefits achieved.

DBA · Ele.me · data replication
0 likes · 14 min read
Architects Research Society
Sep 30, 2018 · Backend Development

Microservice Architecture: Benefits, Challenges, and Trade‑offs

The article examines the advantages and disadvantages of microservice architectures, discussing flexibility, scalability, autonomy, monitoring, build and release complexity, security, and data replication, while highlighting practical trade‑offs and lessons learned from real‑world implementations.

Microservices · Security · autonomy
0 likes · 8 min read
Architects' Tech Alliance
Apr 14, 2018 · Industry Insights

How Veritas Velocity Accelerates Data Replica Management and Cuts Costs

The article details Veritas Velocity’s hybrid‑cloud data replica solution, its architecture, deployment steps, role‑based access, workflow automation with Oracle RMAN, and performance benchmarks that show up to 8.8× faster test‑environment provisioning while halving required manpower.

Backup · Performance Evaluation · Veritas Velocity
0 likes · 11 min read
Efficient Ops
Apr 1, 2018 · Backend Development

Ele.me’s Secret to Seamless Multi-Region Active-Active Architecture

This article details how Ele.me engineered a cross‑region active‑active system that scales elastically, tolerates whole‑data‑center failures, and maintains real‑time food‑delivery performance through geographic sharding, intelligent routing, and robust data‑replication middleware.

Distributed Systems · data replication · geographic sharding
0 likes · 18 min read
Java Backend Technology
Mar 19, 2018 · Fundamentals

Why Distributed Consistency Matters: From CAP to BASE Explained

This article explores the importance of data consistency in distributed systems, illustrating real‑world scenarios, explaining consistency models such as strong, weak and eventual, and detailing the challenges and theories like CAP and BASE that guide system designers in balancing consistency, availability, and partition tolerance.

BASE theory · CAP theorem · Consistency
0 likes · 18 min read
dbaplus Community
Jan 16, 2018 · Big Data

Kafka MirrorMaker Mastery: Real‑Time Sync, Tuning & Troubleshooting

Kafka MirrorMaker provides near‑real‑time cross‑data‑center replication by consuming from a source cluster and producing to a target cluster, and this guide explains its core features, new vs. old consumer APIs, partition assignment strategies, performance tuning, network considerations, and practical command‑line examples.

Consumer API · Kafka · MirrorMaker
0 likes · 13 min read
Architects' Tech Alliance
Sep 16, 2017 · Operations

Understanding NetApp MetroCluster: Architecture, Data Synchronization, and High‑Availability Solutions

The article explains NetApp MetroCluster’s clustered storage architecture, including dual‑site HA pairs, 4‑node active‑active designs, synchronization mechanisms such as SyncMirror, ClusterRemote and CRS, and the network and NVRAM strategies that enable seamless data protection and disaster recovery across distances up to 200 km.

MetroCluster · NetApp · data replication
0 likes · 9 min read
dbaplus Community
Jul 26, 2017 · Databases

How Ele.me Achieved Sub‑Second MySQL Multi‑Active Replication with DRC

This article details Ele.me's design and implementation of a MySQL bidirectional replication component (DRC) that enables sub‑second, high‑throughput data synchronization across Beijing and Shanghai data centers, addressing latency, consistency, and failover challenges in a multi‑active environment.

Distributed Systems · data replication · database-consistency
0 likes · 18 min read
21CTO
Jul 12, 2017 · Fundamentals

Why Logs Are the Hidden Backbone of Distributed Systems and Real‑Time Data

This note distills Jay Kreps' extensive blog on logs, explaining their core role in distributed databases, real‑time data pipelines, replication, and state‑machine consistency, and showing how logs unify concepts from version control to streaming architectures.

data replication · logs · real-time data
0 likes · 12 min read
Huawei Cloud Developer Alliance
Dec 8, 2016 · Operations

From DR 1.0 to DR 3.0: The Evolution of Disaster Recovery and Its Future

This article traces the history of disaster recovery from its 1970s origins through the rise of hot‑standby sites, the impact of internet expansion, and the shift to cloud‑based DRaaS, while forecasting future trends such as integrated business‑as‑disaster solutions, automated management platforms, and social‑media‑driven emergency communication.

DRaaS · IT Operations · business continuity
0 likes · 10 min read
MaGe Linux Operations
Nov 7, 2016 · Big Data

How HDFS Achieves Low Cost, High Reliability, and Fault Tolerance

This article explains how HDFS, inspired by Google’s GFS, provides a low‑cost, highly reliable, fault‑tolerant, and high‑performance distributed file system for big‑data workloads by using replication, standby NameNodes, block storage, rack awareness, and compute‑close‑to‑data strategies.

Big Data · Distributed File System · HDFS
0 likes · 7 min read
ITPUB
Nov 2, 2016 · Databases

Mastering Oracle GoldenGate: Architecture, Components, and Configuration Guide

This article provides a comprehensive overview of Oracle GoldenGate, detailing its supported databases, modular architecture, key components such as Extract, Data Pump, Replicat, Trails, Checkpoints, Manager and Collector, as well as processing types, group configuration, and commit sequence numbers for reliable data replication.

Change Data Capture · ETL · Oracle GoldenGate
0 likes · 20 min read
ITPUB
Jul 14, 2016 · Databases

Recovering Oracle GoldenGate After a Source Crash: Step‑by‑Step Fixes

This guide explains how to restore Oracle GoldenGate synchronization after a source‑side crash by repairing the extract service, restarting the pump process, and correcting the replicat on the target, with detailed commands and file‑selection procedures.

Extract · Oracle GoldenGate · Pump
0 likes · 5 min read
Qunar Tech Salon
May 13, 2016 · Big Data

Overview and Architecture of Hadoop Distributed File System (HDFS)

This article provides a comprehensive overview of Hadoop Distributed File System (HDFS), detailing its design goals, architecture components such as NameNode, DataNode and SecondaryNameNode, data block handling, replication strategies, communication protocols, and the read, write, and delete processes.

Big Data · Distributed File System · HDFS
0 likes · 18 min read
ITPUB
Apr 13, 2016 · Databases

Accelerating MySQL Data Repair with pt-table-sync: A Self‑Healing Solution

This article explains how the Tencent Game DBA team uses Percona's pt-table-sync to detect and automatically repair MySQL replication inconsistencies, achieving up to 30‑fold speed improvements, reducing resource usage, and enabling a data self‑healing service for large‑scale gaming databases.

DBA · checksum · data replication
0 likes · 13 min read
dbaplus Community
Apr 6, 2016 · Databases

Seamless DB2 Major Version Upgrade Using CDC Replication

This article explains why DB2 customers must upgrade to avoid end‑of‑support (EOS) risks, compares replication options, and provides a detailed CDC‑based step‑by‑step process to achieve minute‑level downtime, zero performance impact, and fast rollback for large‑scale DB2 version migrations.

CDC · DB2 · Version Upgrade
0 likes · 10 min read
Architect
Dec 18, 2015 · Fundamentals

Understanding Distributed Consistency: Importance, Models, and Challenges

The article explains why consistency is essential in distributed systems, describes the CAP theorem, outlines various consistency models such as strong, weak, and eventual consistency, and discusses the trade‑offs between data correctness and system performance.

CAP theorem · Consistency · Distributed Systems
0 likes · 9 min read
Architect
Nov 12, 2015 · Backend Development

JD.com Multi‑Center Transaction System Architecture for the 11.11 Shopping Festival

The article explains how JD.com designed and deployed a multi‑center transaction architecture, using a high‑performance data bus and strict consistency and routing controls, to handle the massive traffic spikes of the 11.11 e‑commerce event while ensuring scalability and disaster recovery.

Distributed Systems · backend scalability · data replication
0 likes · 9 min read
Alibaba Cloud Infrastructure
May 27, 2015 · Databases

Design and Application of Alibaba's Data Replication Center (DRC) for Active‑Active Scenarios

The article presents an overview of Alibaba's Data Replication Center (DRC), detailing its architecture, real‑time cross‑region synchronization capabilities, consistency and latency guarantees, deployment strategies, and its use cases on Alibaba Cloud such as RDS migration and multi‑active e‑commerce workloads.

Active-Active · Alibaba Cloud · DRC
0 likes · 10 min read
Alibaba Cloud Infrastructure
Apr 26, 2015 · Cloud Computing

Key Topics from the 2015 Beijing QCon: Asynchronous Processing, DRC Data Replication, High Availability, and Cloud Database Operations

The 2015 Beijing QCon highlighted four technical talks covering asynchronous processing in distributed systems, the DRC data‑replication infrastructure, minute‑level high‑availability fault recovery, and cloud‑era database operations, illustrating Alibaba's approaches to scalability and reliability in modern cloud platforms.

Distributed Systems · QCon · asynchronous processing
0 likes · 6 min read