Tagged articles

2122 articles

Jun 18, 2020 · Backend Development

Applying Message Queues for Decoupling in E‑commerce Architecture

The article explains why and how to use message queues to achieve low coupling, better performance, fault tolerance, and eventual consistency in an e‑commerce order‑processing flow, discusses common pitfalls such as message loss and duplication, and compares popular queue products such as RabbitMQ, Kafka, and RocketMQ.

Backend Architecture · Decoupling · Distributed Systems

0 likes · 10 min read

Full-Stack Internet Architecture

Jun 18, 2020 · Big Data

Kafka Interview Questions: High Availability, Reliability, Consistency, Performance, and Usage Rationale

This article explains common Kafka interview questions by analyzing the system's high‑availability design, reliability mechanisms, consistency model, performance tricks such as sequential writes and zero‑copy, and the reasons for using Kafka and message queues, providing both conceptual insight and practical details.

Consistency · Distributed Systems · Kafka

0 likes · 12 min read

Architecture Digest

Jun 16, 2020 · Backend Development

Why Use Distributed Locks? Implementation with Redis, Redisson, and Zookeeper

The article explains the inventory oversell problem in distributed e‑commerce systems and why native JVM locks fail across multiple machines, then presents practical implementations of distributed locks using Redis (including RedLock and Redisson) and ZooKeeper (with Curator), comparing their advantages and drawbacks.

Distributed Systems · Java concurrency · distributed-lock

0 likes · 17 min read

Big Data Technology Architecture

Jun 15, 2020 · Databases

Resolving Zookeeper and HBase Master Crash Caused by jute.maxbuffer Misconfiguration

The article details a step‑by‑step investigation of a ZooKeeper outage and subsequent HBase master failure caused by a bug in an outdated ZooKeeper version and an excessively large jute.maxbuffer setting, explaining how to identify the issue, adjust configurations, and improve region assignment performance.

Distributed Systems · HBase · ZooKeeper

0 likes · 5 min read

AntTech

Jun 13, 2020 · Cloud Native

Cloud‑Native Architecture and Quantifiable Benefits at MyBank Using Ant Group’s SOFAStack

The article details how MyBank leverages Ant Group’s cloud‑native SOFAStack platform to rebuild its core banking system, achieving faster onboarding, shorter development cycles, higher availability, and measurable cost and performance gains across millions of small‑business customers.

Banking · Digital Transformation · Distributed Systems

0 likes · 11 min read

Xiaokun's Architecture Exploration Notes

Jun 12, 2020 · Fundamentals

Master Distributed System Consistency: CAP, ACID, BASE & Transaction Protocols

This article explains core distributed‑system concepts—including the CAP theorem, ACID and BASE models, consistency guarantees, and the mechanics of 2PC, 3PC, and TCC transaction protocols—while also discussing availability strategies and practical design considerations.

ACID · BASE · CAP theorem

0 likes · 19 min read

MaGe Linux Operations

Jun 11, 2020 · Fundamentals

Why Git Matters: A Deep Dive into Distributed Version Control

This article explains what Git is, why version control is essential, compares local, centralized, and distributed systems, describes Git's storage mechanisms and object types, and outlines the typical Git workflow with practical examples.

DAG · Delta Storage · Distributed Systems

0 likes · 21 min read

dbaplus Community

Jun 10, 2020 · Backend Development

Why Distributed Core Banking Systems Are the Future of Legacy Bank Architecture

This article examines the evolution of bank core systems from monolithic designs to distributed architectures, detailing the technical rationale, modular decomposition, sharding strategies, data routing, transaction processing, and performance considerations essential for modernizing banking IT infrastructure.

Distributed Systems · core banking · data routing

0 likes · 44 min read

Top Architect

Jun 10, 2020 · Fundamentals

A Comprehensive Guide to Learning Distributed Systems

This article provides a thorough overview of distributed systems, explaining their definition, core challenges, key characteristics, essential components, common protocols, and practical implementations to help readers build a solid, structured learning path for mastering distributed architectures.

Distributed Systems · System Design · fault tolerance

0 likes · 16 min read

Xianyu Technology

Jun 9, 2020 · Backend Development

Xianyu Coin Service Migration: A Service‑Based Data Migration Approach

Alibaba migrated Xianyu Coin’s points from the legacy KingTower platform to the new Banliang service using a four‑phase, service‑based approach—preparation, active/passive data migration with distributed locks, dual‑write for consistency, reconciliation, and a gradual service switch—completing the transition in a month without user impact.

Backend Engineering · Data Migration · Distributed Systems

0 likes · 11 min read

Architect

Jun 7, 2020 · Fundamentals

Understanding Consistency Models and Distributed Consensus Protocols

This article explains the fundamentals of distributed consistency, covering weak and strong consistency, the CAP theorem, ACID and BASE models, and detailed overviews of 2PC, 3PC, Paxos, Raft, Gossip, NWR, Quorum, and Lease mechanisms, highlighting their trade‑offs and practical use cases.

2PC · CAP theorem · Consistency

0 likes · 16 min read

DataFunTalk

Jun 7, 2020 · Databases

ByteKV: Design and Implementation of a Strongly Consistent Range-Partitioned KV Store

ByteKV is a C++-based, strongly consistent, range-partitioned key‑value storage system built by ByteDance, featuring a multi‑Raft consensus layer, custom storage engines (RocksDB and BlockDB), automatic partition splitting/merging, load balancing, distributed transactions, and a SQL table layer for rich data models.

Consensus · Distributed Systems · Partitioning

0 likes · 47 min read

Architecture Digest

Jun 6, 2020 · Backend Development

Evolution of Project Architecture and Glossary of Common Distributed System Terms

This article explains the evolution of software project architectures—from single‑server monoliths to MVC, RPC, SOA, and micro‑services—while providing clear definitions of key terms such as clusters, load balancing, caching, and flow control for readers unfamiliar with high‑concurrency and distributed systems.

Architecture · Distributed Systems · Microservices

0 likes · 12 min read

Programmer DD

Jun 5, 2020 · Operations

Why ZooKeeper Fails as Service Discovery: Alibaba’s 10‑Year Lessons

This article examines a decade of Alibaba’s experience with ZooKeeper‑based service discovery, arguing that ZooKeeper’s strong consistency and limited scalability make it unsuitable as a registration center and outlining design principles that favor availability, eventual consistency, and richer health‑check mechanisms.

CAP theorem · Distributed Systems · registration center

0 likes · 20 min read

Architecture Digest

Jun 3, 2020 · Operations

Why ZooKeeper Is Not the Best Choice for Service Discovery: Design Considerations for a Registration Center

Drawing on Alibaba's decade‑long experience, this article analyses service‑discovery requirements, CAP trade‑offs, consistency versus availability, health‑check design, disaster recovery, and exception handling to argue that ZooKeeper, while excellent for coordination, is often unsuitable as the primary registration center for large‑scale microservice environments.

CAP theorem · Distributed Systems · registration center

0 likes · 18 min read

Cloud Native Technology Community

Jun 1, 2020 · Cloud Native

From Business Pain to a Fully Realized Cloud‑Native Architecture: A Step‑by‑Step Blueprint

This article walks through a practical, step‑by‑step transformation from a monolithic application to a cloud‑native, micro‑service architecture, covering planning, domain‑driven design, continuous integration, service registration, API gateways, databases, caching, logging, configuration management, containerization, performance monitoring, service governance, GitOps, traffic shading, service mesh, stress testing, and multi‑datacenter deployment.

CI/CD · DevOps · Distributed Systems

0 likes · 58 min read

Programmer DD

May 29, 2020 · Backend Development

Demystifying Clusters, Load Balancing & Caching in Modern Backend

This article walks through the evolution of project architectures—from single‑server MVC to RPC, SOA, and micro‑services—explaining key concepts such as clusters, load‑balancing strategies, and various caching mechanisms, helping readers grasp how high‑concurrency, distributed systems are designed and optimized.

Distributed Systems · caching · load balancing

0 likes · 12 min read

Java Backend Technology

May 28, 2020 · Backend Development

How to Build a Low‑Intrusion Transactional Message System with Spring Boot and RabbitMQ

This article details a lightweight transactional message solution for microservices that stores pending messages in a local MySQL table, defers RabbitMQ publishing until after transaction commit, and includes compensation logic with exponential back‑off to ensure reliable asynchronous communication.

Compensation · Distributed Systems · MySQL

0 likes · 23 min read

Java Backend Technology

May 26, 2020 · Backend Development

Choosing the Right Unique ID Strategy: From DB Auto‑Increment to Snowflake

This article reviews common unique identifier generation techniques—including database auto‑increment, UUID, Redis counters, Snowflake, Zookeeper, and MongoDB ObjectId—detailing their advantages, drawbacks, and practical implementation examples with C# code snippets.

Distributed Systems · ID generation · c++

0 likes · 14 min read

dbaplus Community

May 25, 2020 · Operations

Scaling CAT Monitoring at Ctrip: Thread Model, Client Computation & Memory Tweaks

This article details how Ctrip optimized the CAT monitoring system—covering its large‑scale deployment, thread‑model redesign, offloading calculations to clients, double‑buffered reporting, and string handling improvements—to dramatically cut CPU usage, GC pressure, and memory consumption while handling billions of messages daily.

Distributed Systems · Performance Optimization · Thread Model

0 likes · 25 min read

Yanxuan Tech Team

May 25, 2020 · Operations

How NetEase Cloud Music Built a Scalable Full‑Link Tracing System for Real‑Time Service Diagnosis

This article details the design, implementation, and evolution of NetEase Cloud Music's full‑link tracing platform, covering its motivations, architecture, low‑overhead data collection, multi‑dimensional analysis, service grooming, automated diagnosis, and future plans for AI‑driven anomaly detection and big‑data processing.

Distributed Systems · Observability · service monitoring

0 likes · 19 min read

macrozheng

May 21, 2020 · Big Data

Mastering Kafka: Core Concepts, Architecture, and Reliability Guarantees

This comprehensive guide covers Kafka's definition, publish/subscribe model, key components, storage mechanisms, producer and consumer strategies, and reliability features such as ACK levels, ISR, and exactly‑once semantics, providing a solid foundation for real‑time big‑data processing.

Big Data · Distributed Systems · Kafka

0 likes · 16 min read

Youzan Coder

May 20, 2020 · Backend Development

Real-Time Loss Prevention System: Architecture and Implementation at YouZan

YouZan’s real‑time loss‑prevention platform monitors database binlogs, transforms and verifies transaction data across five loosely coupled layers, handling 200 million daily messages and 60 million checks with dynamic sharding, caching and distributed locks to detect over‑charges, duplicate refunds, migration inconsistencies and unauthorized data changes.

Distributed Systems · HBase · Message Queue

0 likes · 12 min read

Alibaba Cloud Developer

May 17, 2020 · Big Data

Inside Alibaba’s Fuxi DAG 2.0: Boosting Big Data Workloads with Dynamic Scheduling

Alibaba’s Fuxi DAG 2.0 redesign separates logical and physical graphs, introduces dynamic scheduling, unified offline and near‑real‑time execution, and a flexible bubble mode, enabling massive big‑data jobs to run up to five times faster while dramatically reducing resource waste.

Big Data · DAG · Distributed Systems

0 likes · 38 min read

Xiaokun's Architecture Exploration Notes

May 15, 2020 · Backend Development

Mastering Distributed System Design: Key Principles, Techniques, and Best Practices

This comprehensive guide explains why distributed systems are needed, outlines design goals, explores essential technologies and architectural patterns, and provides practical strategies for scalability, high availability, service governance, DevOps automation, and monitoring to help engineers build robust distributed architectures.

Distributed Systems · Scalability · high availability

0 likes · 22 min read

Meituan Technology Team

May 14, 2020 · Cloud Native

Meituan Naming Service (MNS) 2.0: Architecture Evolution and Business Enablement

Meituan’s Naming Service 2.0 replaces the ZooKeeper‑based 1.0 design with a four‑layer, AP‑oriented architecture that leverages a service‑mesh sidecar, sharded KV storage, and a control service layer, delivering eight‑fold throughput gains, sub‑second latency, zero‑downtime migration for most services, and new business capabilities such as traffic isolation, elastic scaling, and data‑driven SLA monitoring.

Cloud Native · Distributed Systems · Microservices

0 likes · 25 min read

Tencent Tech

May 11, 2020 · Big Data

How Tencent Scaled Elasticsearch to Thousands of Nodes: Core Kernel Optimizations Revealed

This article details Tencent's large‑scale Elasticsearch deployment, covering its massive usage scenarios, the availability, performance, cost and scalability challenges faced, and the comprehensive kernel‑level optimizations—including memory‑based throttling, storage‑model merging, off‑heap caching, rollup and metadata improvements—that enable PB‑level clusters with high reliability and low expense.

Big Data · Distributed Systems · Elasticsearch

0 likes · 27 min read

Architecture Digest

May 11, 2020 · Backend Development

Ensuring Idempotency in Distributed Systems: Unique ID Generation Strategies

The article explains why idempotency is essential for reliable service calls, discusses using unique identifiers such as UUIDs and Snowflake algorithms, compares centralized and client‑side ID generation, and offers practical storage and query‑optimisation techniques to prevent duplicate orders and resource waste.

Distributed Systems · Idempotency · Unique ID

0 likes · 6 min read

Java Captain

May 8, 2020 · Big Data

Elasticsearch Adoption and Architecture Cases in Major Chinese Companies

The article surveys how leading Chinese tech firms such as JD Daojia, Ctrip, Qunar, 58.com, and Didi have adopted Elasticsearch for large‑scale search, real‑time analytics, and security, detailing their evolving cluster architectures, shard strategies, data volumes, and supporting services.

Architecture · Big Data · Distributed Systems

0 likes · 11 min read

Programmer DD

May 8, 2020 · Backend Development

How to Become a Middleware Engineer: Skills, Roadmap, and Tips

This article outlines what middleware development entails, the essential technical and professional qualities required, various types of middleware, and a step‑by‑step learning roadmap for Java developers aiming to break into middleware engineering.

DevOps · Distributed Systems · backend-development

0 likes · 8 min read

Tencent Cloud Developer

Apr 29, 2020 · Cloud Computing

Large-Scale Task Scheduling Architecture of Tencent Meeting and VStation

The talk explains how Tencent’s self‑developed VStation scheduler, integrated with TKE and using a hybrid sharding‑plus‑master‑worker architecture, enabled Tencent Meeting to scale to over 100 000 hosts and one million CPU cores, cutting provisioning time to under ten seconds while handling thousands of tasks per minute through DAG‑driven automation and fault‑tolerant mechanisms.

Distributed Systems · Tencent Meeting · VStation

0 likes · 24 min read

Programmer DD

Apr 29, 2020 · Operations

How to Keep Your Distributed System Running Even When Upstream Services Fail

The article explains why distributed systems must stay alive despite upstream or downstream failures, emphasizing rate limiting and circuit breaking as essential practices to prevent fault propagation and ensure service reliability, and it invites developers to assess their own safeguards.

Circuit Breaking · Distributed Systems · rate limiting

0 likes · 3 min read

Tencent Cloud Developer

Apr 28, 2020 · Big Data

Evolution of Ctrip Vacation Pricing Engine: Architecture, Challenges, and Optimizations

Ctrip’s vacation pricing engine evolved from a MySQL‑based synchronous queue to a Kafka‑driven, Spark‑parallelized architecture using HBase, cutting task generation from five hours to 1.5 hours and raising price accuracy above 90% while handling billions of calculations and working within external API constraints.

Distributed Systems · Kafka · Spark

0 likes · 18 min read

Top Architect

Apr 28, 2020 · Operations

Evolution of System Architecture: From Single‑Machine to Distributed Designs

The article outlines the historical evolution of IT architecture—from early single‑machine deployments through hot‑standby and multi‑node active clusters to modern distributed systems—explaining the motivations, trade‑offs, and key technologies that drive each generation and offering guidance on selecting the right architecture for business needs.

Distributed Systems · System Design · database

0 likes · 8 min read

Tencent Cloud Developer

Apr 26, 2020 · Backend Development

Design and Evolution of Ctrip Flight Search System: High‑Throughput Caching, Real‑Time Computing, Load Balancing and AI

Ctrip’s flight search service processes two billion daily queries by employing a multi‑level Redis cache, machine‑learning‑driven TTLs, distributed pooling and overload protection, AI‑based anti‑scraping, and robust load‑balancing across three data centers, delivering sub‑second latency, up to three‑fold throughput gains and significant cost reductions.

AI · Distributed Systems · Real‑Time Computing

0 likes · 23 min read

Java Backend Technology

Apr 26, 2020 · Databases

When to Shard Your Database? A Practical Guide to Partitioning Strategies

This article explains database bottlenecks caused by IO and CPU limits, introduces horizontal and vertical sharding for databases and tables, compares popular sharding tools, discusses challenges such as distributed transactions, cross‑node joins, pagination and global ID generation, and offers guidance on when and how to apply sharding in real‑world systems.

Distributed Systems · Partitioning · Scalability

0 likes · 14 min read

Top Architect

Apr 23, 2020 · Backend Development

Evolution and Core Principles of Large-Scale Website Architecture

The article outlines how large‑scale websites evolve from single‑server monoliths to layered, distributed architectures by separating application, service and data layers, adding caching, clustering, load balancing, CDNs, NoSQL, micro‑services and automation to achieve performance, high availability, scalability, extensibility and security.

Distributed Systems · website architecture

0 likes · 17 min read

360 Tech Engineering

Apr 21, 2020 · Backend Development

Using ETCD for Leader Election and High Availability: Architecture, Installation, and Go Implementation

This article explains ETCD's role as a distributed key‑value store, details its architecture and leader election mechanism, provides step‑by‑step cluster deployment on CentOS with systemd, and demonstrates a Go implementation of leader election to achieve high availability.

Distributed Systems · Systemd · high availability

0 likes · 10 min read

ITFLY8 Architecture Home

Apr 21, 2020 · Backend Development

8 Design Principles for Business Middle Platforms and Distributed Services

This article explains how to abstract business functions into middle‑platform models, outlines eight essential design principles for services, and describes key distributed mechanisms such as service registration, elastic scaling, rate limiting, gray‑release, message queues, and transaction handling to build robust, scalable backend systems.

Backend · Distributed Systems · Microservices

0 likes · 18 min read

Java Backend Technology

Apr 14, 2020 · Cloud Native

Why ZooKeeper Isn’t the Best Choice for Service Discovery: Design Insights

This article analyzes the limitations of ZooKeeper for service discovery, covering consistency, partition tolerance, scalability, persistence, health‑checking, disaster‑recovery, and operational complexities, and explains why modern registration centers should favor AP designs and richer health‑check mechanisms.

CAP theorem · Distributed Systems · ZooKeeper

0 likes · 19 min read

Big Data Technology & Architecture

Apr 12, 2020 · Big Data

Understanding Spark and Flink RPC Implementations: A Code Reading Guide

This article explains how to read and compare the RPC implementations of Spark and Flink, covering Actor Model concepts, Akka integration, message handling, threading models, and practical code‑reading techniques while providing detailed code excerpts and architectural analysis.

Distributed Systems · Flink · RPC

0 likes · 32 min read

Tencent Cloud Developer

Apr 10, 2020 · Backend Development

High‑Availability Architecture for a Flash‑Sale System in the Weishi Spring Festival Card‑Collect Event

The article details a high‑availability flash‑sale architecture for Weishi’s Spring Festival card‑collect event, describing three design models, a funnel‑style traffic‑filtering approach, product and client strategies, layered rate limiting, sharding, asynchronous order handling, multi‑region DB redundancy, and a three‑level degradation plan to sustain extreme concurrency.

Distributed Systems · flash sale · high availability

0 likes · 9 min read

Tencent Cloud Developer

Apr 9, 2020 · Cloud Computing

Twenty-Seven New Tencent Cloud TVP Experts Join to Advance Cloud Computing

The article announces that twenty‑seven distinguished professionals from fields such as blockchain, AI, big data, databases, and distributed systems have joined Tencent Cloud’s Valuable Professional program, highlighting their backgrounds, achievements, and commitment to advancing cloud computing and enriching the developer ecosystem.

Blockchain · Distributed Systems · TVP

0 likes · 19 min read

Top Architect

Apr 9, 2020 · Backend Development

Low‑Latency and High‑Availability Design of RocketMQ: Evolution, Optimizations, and Capacity Planning

This article reviews the evolution of Alibaba's Aliware message engine, analyzes the low‑latency and high‑availability challenges faced during Double 11, and details the architectural, JVM, memory, rate‑limiting, and multi‑replica solutions that enabled RocketMQ to achieve sub‑millisecond write latency and five‑nine availability.

Distributed Systems · Low latency · RocketMQ

0 likes · 29 min read

21CTO

Apr 6, 2020 · Operations

How Alipay Achieved Near‑Zero Downtime with Multi‑Datacenter Failover Architecture

This article explains the evolution of Alipay's high‑availability and disaster‑recovery architecture—from a simple single‑datacenter design to a multi‑datacenter, unit‑based system with failover and blue‑green deployment—highlighting the challenges, solutions, and operational benefits that enable continuous service during massive traffic spikes.

Alipay architecture · Blue‑Green deployment · Distributed Systems

0 likes · 17 min read

Wukong Talks Architecture

Apr 6, 2020 · Fundamentals

Fundamentals of Distributed Systems: Microservices, Clustering, Load Balancing, Service Registry, Configuration Center, Circuit Breaker, and API Gateway

This article introduces core concepts of distributed systems, covering microservices, clustering, remote invocation, load balancing algorithms, service registration and discovery, configuration management, circuit breaking, degradation strategies, and API gateways, providing a comprehensive overview for building resilient cloud-native applications.

Distributed Systems · circuit breaker · load balancing

0 likes · 6 min read

Continuous Delivery 2.0

Apr 3, 2020 · Operations

Scalable and Reliable Configuration Distribution at Facebook

This article explains how Facebook’s Configerator system achieves scalable, reliable configuration distribution using a push model, a hierarchical Zeus tree, Package Vessel for large data, and multi‑repo Git strategies to improve commit throughput and fault tolerance.

Configuration Management · Distributed Systems · Reliability

0 likes · 11 min read

Programmer DD

Mar 28, 2020 · Backend Development

Why Is Kafka So Fast? Uncover the 11 Performance Secrets

Kafka achieves its remarkable speed by combining sequential I/O, batch processing, compression, zero‑copy, careful client‑side work, and a design that avoids costly fsync and garbage collection, while maintaining durability, ordering, and at‑least‑once delivery, making it a high‑throughput, low‑latency event streaming platform.

Batch Processing · Distributed Systems · Kafka

0 likes · 15 min read

360 Quality & Efficiency

Mar 27, 2020 · Backend Development

Exploring JDK RMI: Architecture, Code Samples, and Service Invocation Process

This article introduces JDK RMI, explains its role in distributed Java applications, provides complete Maven project examples for interface, server, and client implementations, and details the underlying registry, stub, skeleton, and export mechanisms that enable remote method calls.

Distributed Systems · Java Example · RMI

0 likes · 10 min read

Programmer DD

Mar 27, 2020 · Big Data

How Leading Chinese Companies Scale Elasticsearch for Billions of Queries

This article surveys how major Chinese tech firms such as JD.com, Ctrip, Qunar, 58.com and Didi design, scale, and operate massive Elasticsearch clusters for search, real‑time analytics, and security, detailing architecture choices, shard strategies, data pipelines and performance optimizations.

Big Data · Distributed Systems · Elasticsearch

0 likes · 12 min read

Top Architect

Mar 24, 2020 · Backend Development

How Meituan Built Its Distributed High‑Concurrency Instant Logistics System

This article explains how Meituan’s instant logistics platform evolved from a simple point‑to‑point delivery model to a large‑scale, AI‑driven, distributed micro‑service architecture that ensures ultra‑low latency, high availability, and cost‑effective scaling for real‑time food delivery.

AI · Distributed Systems · Meituan

0 likes · 15 min read

Architects' Tech Alliance

Mar 23, 2020 · Fundamentals

What Really Defines Software Architecture? A Deep Dive into Concepts, Layers, and Evolution

This article explains the fundamental concepts of software architecture, distinguishes systems, subsystems, modules, components, and frameworks, outlines various architecture layers and classifications, describes architecture levels, tracks the evolution from monolithic to micro‑services, and discusses how to evaluate and avoid common architectural pitfalls.

Distributed Systems · Microservices · Software Architecture

0 likes · 21 min read

Programmer DD

Mar 23, 2020 · Operations

Mastering Chaos Engineering: Boost Confidence in Distributed Systems

This article explains chaos engineering as a systematic approach to experiment on distributed systems, identifies common failure modes, outlines a four‑step experimentation process, and presents advanced principles to help teams increase reliability and confidence in production environments.

Distributed Systems · Reliability · chaos engineering

0 likes · 7 min read

ITFLY8 Architecture Home

Mar 20, 2020 · Backend Development

From Monolith to Microservices: A Practical Guide to Architecture Evolution

This article explains the hierarchical architecture levels, contrasts strategic and tactical design, and walks through the evolution from monolithic applications to distributed services and micro‑services, highlighting benefits, drawbacks, and key metrics for evaluating a sound system architecture.

Distributed Systems · Microservices · Software Architecture

0 likes · 11 min read

From Monolith to Microservices: A Practical Guide to Architecture Evolution

Top Architect

Mar 19, 2020 · Fundamentals

Overview of Common Software Architecture Styles: Monolithic, Distributed, Microservices, and Serverless

The article explains four major software architecture patterns—monolithic, distributed, microservices, and serverless—detailing their structures, advantages, drawbacks, and suitable scenarios to help developers broaden their architectural knowledge and make informed design choices.

Distributed Systems · Serverless · Software Architecture

0 likes · 12 min read

Overview of Common Software Architecture Styles: Monolithic, Distributed, Microservices, and Serverless

Qunar Tech Salon

Mar 19, 2020 · Big Data

Apache Kafka Overview: Architecture, Features, and Usage

This article provides a comprehensive introduction to Apache Kafka, covering its high‑throughput distributed architecture, core concepts such as topics, partitions, brokers, producers and consumers, design goals, performance characteristics, deployment steps, configuration, and example code for producers, consumers, and Spring Boot integration.

Big Data · Distributed Systems · Kafka

0 likes · 39 min read

Apache Kafka Overview: Architecture, Features, and Usage

Alibaba Cloud Native

Mar 18, 2020 · Cloud Native

From Java Monolith to Serverless Kubernetes: Building a Custom Container Orchestration System

This article recounts a developer's journey from a single‑machine Java monolith to a full‑stack container orchestration platform, explaining why Kubernetes was needed, how master‑worker components like kube‑apiserver, scheduler, etcd, kubelet and kube‑proxy work together, and how the design evolves toward a serverless model.

Distributed Systems · Kubernetes · Master-Worker Architecture

0 likes · 12 min read

From Java Monolith to Serverless Kubernetes: Building a Custom Container Orchestration System

Top Architect

Mar 17, 2020 · Databases

Understanding Ant Financial’s LDC Architecture: Unitization, CAP Analysis, and High‑TPS Design

This article explains how Ant Financial’s massive Double‑11 payment traffic is handled through logical data centers (LDC), unit‑based architecture (RZone, GZone, CZone), traffic routing, disaster‑recovery strategies, and a CAP analysis that highlights the role of OceanBase’s Paxos‑based consensus in achieving high availability and eventual consistency.

CAP theorem · Distributed Systems · OceanBase

0 likes · 36 min read

Understanding Ant Financial’s LDC Architecture: Unitization, CAP Analysis, and High‑TPS Design

21CTO

Mar 16, 2020 · Backend Development

How Ant Financial Scales Payments with Distributed Architecture and OceanBase

The article summarizes Xu Wenqi's 2019 Alibaba Cloud Summit talk on Ant Financial's distributed architecture, covering the shift from monolithic to microservices, modular development, load‑balancing, database sharding, the distributed TA system, task scheduling, gray‑release, full‑link stress testing, and OceanBase high‑availability solutions.

Backend Architecture · Distributed Systems · Microservices

0 likes · 9 min read

How Ant Financial Scales Payments with Distributed Architecture and OceanBase

Java Backend Technology

Mar 13, 2020 · Backend Development

Idempotency Strategies: Preventing Duplicate Operations in High‑Traffic Systems

Idempotency ensures that repeated execution of an operation yields the same result as a single execution, and this article explains its importance in backend systems, outlines concepts, and presents practical techniques such as unique indexes, token mechanisms, pessimistic and optimistic locks, distributed locks, and API design for reliable, duplicate‑free processing.

Distributed Systems · Token · backend-development

0 likes · 9 min read

Idempotency Strategies: Preventing Duplicate Operations in High‑Traffic Systems
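The token and unique‑key techniques named in the summary above can be sketched roughly as follows. This is a minimal in‑memory illustration, not the article's implementation: the class `IdempotentProcessor`, the key `"order-42"`, and the `deduct_ten` operation are all hypothetical names, and a real system would typically keep the keys in a database unique index or a shared store rather than a Python dict.

```python
import threading


class IdempotentProcessor:
    """Runs an operation at most once per idempotency key (in-memory sketch)."""

    def __init__(self):
        self._results = {}             # key -> result of the first execution
        self._lock = threading.Lock()  # serializes check-and-execute

    def execute(self, key, operation):
        # Hold the lock across the check and the call so two concurrent
        # duplicates cannot both run the operation.
        with self._lock:
            if key in self._results:
                return self._results[key]  # duplicate: replay stored result
            result = operation()
            self._results[key] = result
            return result


# Hypothetical side-effecting operation: deduct 10 from a balance.
balance = {"amount": 100}


def deduct_ten():
    balance["amount"] -= 10
    return balance["amount"]


processor = IdempotentProcessor()
first = processor.execute("order-42", deduct_ten)
second = processor.execute("order-42", deduct_ten)  # duplicate request, not re-run
```

Both calls return the same value and the balance is deducted only once, which is the property the summary describes: repeating the request has the same effect as issuing it a single time.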

Tencent Tech

Mar 11, 2020 · Big Data

Scaling the Health Code: Tencent Cloud Elasticsearch at Billion-User Scale

Leveraging Tencent Cloud Elasticsearch, the nationwide COVID‑19 health code platform handled over 1.6 billion scans for more than 900 million users, achieving millisecond‑level search, seamless horizontal scaling, multi‑zone high availability, and robust security, while simplifying development through RESTful APIs and rich UI tools.

Big Data · Distributed Systems · Elasticsearch

0 likes · 12 min read

Scaling the Health Code: Tencent Cloud Elasticsearch at Billion-User Scale

Java Backend Technology

Mar 11, 2020 · Fundamentals

What Really Defines Software Architecture? Concepts, Layers, and Evolution Explained

This comprehensive guide explains the true essence of software architecture, covering definitions, system vs. subsystem, modules vs. components, frameworks vs. architecture, various architectural layers (business, application, data, code, technical, deployment), evolution from monolith to microservices, common pitfalls, and recommended reading, all aimed at helping architects design suitable, efficient, and maintainable systems.

Distributed Systems · Software Architecture · System Design

0 likes · 26 min read

What Really Defines Software Architecture? Concepts, Layers, and Evolution Explained

Architects' Tech Alliance

Mar 3, 2020 · Cloud Native

Understanding Service Mesh: Evolution, Concepts, and Challenges

This article traces the evolution of Service Mesh from early microservice communication challenges through multiple generations of architectures, explains its role as a transparent infrastructure layer for service-to-service traffic, and discusses its benefits, language-agnostic nature, and current performance and operational trade‑offs.

Distributed Systems · Microservices · Service Mesh

0 likes · 8 min read

Understanding Service Mesh: Evolution, Concepts, and Challenges

Java Captain

Mar 3, 2020 · Backend Development

A Curated List of Alibaba Open‑Source Projects for Distributed and Enterprise Development

This article presents a comprehensive collection of Alibaba’s open‑source projects—including Spring Cloud Alibaba, Ant Design, Druid, Dubbo, JStorm, Sentinel, and many others—detailing their core features and providing repository links to help developers build scalable, high‑performance backend and cloud‑native applications.

Alibaba · Distributed Systems · Microservices

0 likes · 16 min read

A Curated List of Alibaba Open‑Source Projects for Distributed and Enterprise Development

Architects' Tech Alliance

Mar 1, 2020 · Backend Development

From Single Server to Cloud Native: How Taobao Scaled to Millions of Requests

This article walks through the step‑by‑step evolution of a high‑traffic e‑commerce backend—from a single‑machine setup to distributed caching, load‑balancing, database sharding, microservices, and finally cloud‑native deployment—highlighting the key technologies and design principles at each stage.

Cloud Native · Distributed Systems · Microservices

0 likes · 20 min read

From Single Server to Cloud Native: How Taobao Scaled to Millions of Requests

Selected Java Interview Questions

Feb 28, 2020 · Databases

Asynchronous Flow‑Log Approach for Transaction Consistency in Sharded Databases

The article explains how to maintain transaction consistency between a primary database and sharded tables by using a flow‑log table and an asynchronous processor that ensures eventual consistency, handles ordering, retries, and avoids the complexity of transactional message queues.

Distributed Systems · flow logs · sharding

0 likes · 9 min read

Asynchronous Flow‑Log Approach for Transaction Consistency in Sharded Databases

21CTO

Feb 26, 2020 · Backend Development

Mastering Distributed Rate Limiting: Caching, Degradation, and Flow Control Techniques

This article explains how caching, degradation, and various rate‑limiting strategies—including semaphore‑based concurrency control, token‑bucket algorithms, Guava RateLimiter, custom annotations, Redis interceptors, and Nginx modules—protect high‑concurrency distributed systems, with practical Java code samples and configuration snippets.

Distributed Systems · caching · degradation

0 likes · 19 min read

Mastering Distributed Rate Limiting: Caching, Degradation, and Flow Control Techniques
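Of the strategies listed in the summary above, the token‑bucket algorithm is the easiest to show in a few lines. The sketch below is a single‑process illustration under stated assumptions: the class name `TokenBucket`, its parameters, and the injected `clock` are hypothetical, it is not thread‑safe, and it is unrelated to the Guava, Redis, or Nginx implementations the article covers.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)  # start full so short bursts are allowed
        self._clock = clock            # injectable clock, handy for testing
        self._last = clock()

    def allow(self, cost=1):
        # Add the tokens earned since the last call, capped at capacity.
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        # Spend tokens if enough are available; otherwise reject the request.
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Drive the bucket with a fake clock so the behaviour is deterministic.
fake_now = [0.0]
bucket = TokenBucket(rate=2, capacity=3, clock=lambda: fake_now[0])
burst = [bucket.allow() for _ in range(4)]         # bucket holds only 3 tokens
fake_now[0] += 1.0                                 # one second passes: +2 tokens
after_refill = [bucket.allow() for _ in range(3)]  # only 2 tokens were refilled
```

The capacity bounds the burst size while the refill rate bounds the sustained throughput; a distributed variant would need the token count to live in shared storage.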

ITPUB

Feb 26, 2020 · Backend Development

Why XXL‑JOB Is the Lightweight Distributed Scheduler Used by 290+ Companies

XXL‑JOB is an open‑source, lightweight distributed task scheduling platform adopted by over 290 enterprises, offering 35 features, a decoupled scheduling‑center and executor architecture, and a thriving GitHub community with more than 12 K stars and 5 K forks.

Backend · Distributed Systems · Open-source

0 likes · 5 min read

Why XXL‑JOB Is the Lightweight Distributed Scheduler Used by 290+ Companies

Aikesheng Open Source Community

Feb 25, 2020 · Databases

DBLE 2.19.11.0 Release Notes and Feature Overview

The DBLE 2.19.11.0 release introduces 13 new features, 28 bug fixes, and backward‑compatibility changes; the article gives detailed explanations of global table checks, new commands, and performance improvements in this enterprise‑grade open‑source distributed middleware.

DBLE · Database Middleware · Distributed Systems

0 likes · 10 min read

DBLE 2.19.11.0 Release Notes and Feature Overview

Big Data Technology & Architecture

Feb 24, 2020 · Big Data

Apache Ozone: Architecture, Design Principles, and Deployment Guide

This article introduces Apache Ozone, a scalable distributed object storage system for Hadoop, covering its background, core components, design principles, architecture, deployment steps, configuration examples, and basic command‑line operations for managing volumes, buckets, and keys.

Big Data · CLI · Deployment

0 likes · 18 min read

Apache Ozone: Architecture, Design Principles, and Deployment Guide

Big Data Technology & Architecture

Feb 22, 2020 · Big Data

Understanding Flink's Asynchronous Barrier Snapshot (ABS) Algorithm for Checkpointing

This article explains how Apache Flink implements fault‑tolerant checkpointing using the Asynchronous Barrier Snapshot (ABS) algorithm, a localized version of the Chandy‑Lamport distributed snapshot, covering barriers, snapshot alignment, exactly‑once versus at‑least‑once semantics, and handling of cyclic dataflow graphs.

Asynchronous Barrier Snapshot · Distributed Systems · Flink

0 likes · 9 min read

Understanding Flink's Asynchronous Barrier Snapshot (ABS) Algorithm for Checkpointing

360 Tech Engineering

Feb 21, 2020 · Backend Development

Understanding Message Middleware: Queue and Publish‑Subscribe Styles

This article explains how modern message middleware works by describing the two primary communication styles—message queuing and publish‑subscribe—illustrating each with examples, comparing their characteristics, and listing common middleware products to help developers choose the appropriate solution for their backend systems.

Distributed Systems · Message Queue · Publish-Subscribe

0 likes · 5 min read

Big Data Technology Architecture

Feb 21, 2020 · Databases

Analysis of Elasticsearch Write Operations and Underlying Mechanisms

This article examines how Elasticsearch implements write operations on top of Lucene, detailing the challenges of Lucene's write path and describing Elasticsearch's distributed design, near‑real‑time refresh, translog reliability, shard replication, partial updates, and the complete write workflow from coordinating node to primary and replica shards.

Distributed Systems · Elasticsearch · Shard

0 likes · 14 min read

Analysis of Elasticsearch Write Operations and Underlying Mechanisms

Big Data Technology & Architecture

Feb 19, 2020 · Fundamentals

Understanding CAP, Byzantine Fault Tolerance, PBFT, Paxos, and Raft Consensus Algorithms

This article explains the CAP theorem, illustrates the Byzantine Generals problem, and provides detailed overviews of PBFT, Paxos (including Multi‑Paxos), and Raft consensus algorithms, highlighting their phases, roles, and practical considerations for achieving consistency in distributed systems.

Byzantine Fault Tolerance · Consensus · Distributed Systems

0 likes · 10 min read

Understanding CAP, Byzantine Fault Tolerance, PBFT, Paxos, and Raft Consensus Algorithms

Top Architect

Feb 14, 2020 · Backend Development

Understanding Microservice Architecture: From Monolithic Three‑Tier to Distributed Services

This article explains the evolution from traditional three‑layer monolithic architecture to microservice architecture, detailing the concepts, advantages, disadvantages, key characteristics, and practical challenges such as distributed complexity, DevOps, testing, and dependency management.

Distributed Systems · Software Architecture · Three-tier

0 likes · 13 min read

Understanding Microservice Architecture: From Monolithic Three‑Tier to Distributed Services

Architects' Tech Alliance

Feb 10, 2020 · Fundamentals

Mastering Distributed System Fundamentals: Models, Replication, Consistency, and Protocols

This article provides a comprehensive overview of distributed system fundamentals, covering node modeling, replica concepts, consistency levels, data distribution strategies, centralized and decentralized replica protocols, lease mechanisms, quorum, two‑phase commit, MVCC, Paxos, and the CAP theorem, while analyzing their trade‑offs in availability, consistency, and partition tolerance.

Consistency · Distributed Systems · Replication

0 likes · 55 min read

Mastering Distributed System Fundamentals: Models, Replication, Consistency, and Protocols

Youzan Coder

Feb 5, 2020 · Backend Development

Configurable Data Reconciliation Platform at Youzan: Design, Architecture, and Implementation

Youzan built a configurable data reconciliation platform that integrates new scenarios, processes massive real‑time and batch data, offers visual monitoring, automated correction, and flexible Groovy‑based logic across four DDD layers, achieving 99.99% stability while simplifying detection and resolution of cross‑system inconsistencies.

Big Data · Data Reconciliation · Distributed Systems

0 likes · 15 min read

Configurable Data Reconciliation Platform at Youzan: Design, Architecture, and Implementation

Architecture Digest

Jan 31, 2020 · Backend Development

Design and Optimization of Large‑Scale Instant Messaging Backend Architecture

This article analyses the architecture of high‑traffic instant‑messaging services such as WeChat and Momo, detailing long‑connection handling, short‑ versus long‑lived HTTP/TCP connections, custom binary messaging, smart routing, load‑balancing, sharding, replication, and the engineering trade‑offs required for massive scalability and reliability.

Distributed Systems · IM · Scalability

0 likes · 12 min read

Design and Optimization of Large‑Scale Instant Messaging Backend Architecture

Architects Research Society

Jan 27, 2020 · Databases

CouchDB Final Consistency and Distributed System Design

This article explains CouchDB’s eventual consistency model, its use of MVCC, CAP theorem trade‑offs, incremental replication, and document validation, illustrating how these mechanisms enable scalable, high‑availability distributed databases without locking, and includes a practical case study of syncing Songbird playlists.

CouchDB · Distributed Systems · MVCC

0 likes · 15 min read

CouchDB Final Consistency and Distributed System Design

ITPUB

Jan 22, 2020 · Backend Development

Unlocking High‑Performance Global IDs and Limits with Coconut

This article explains how the open‑source Coconut cache server implements a high‑throughput global sequence ID generator and a lock‑free global limit manager, detailing their data formats, HTTP APIs, command‑line usage, performance benchmarks, and deployment instructions for distributed systems.

Backend · Distributed Systems · HTTP API

0 likes · 13 min read

Unlocking High‑Performance Global IDs and Limits with Coconut

Ctrip Technology

Jan 22, 2020 · Databases

Migrating Log Processing from Elasticsearch to ClickHouse: Architecture, Deployment, Optimization, and Benefits

This article details Ctrip's migration of large‑scale log processing from Elasticsearch to ClickHouse, explaining why ClickHouse was chosen, the high‑availability deployment architecture, data ingestion strategies, dashboard integration, performance gains, operational practices, and overall cost and reliability improvements.

ClickHouse · Distributed Systems · Elasticsearch

0 likes · 12 min read

Migrating Log Processing from Elasticsearch to ClickHouse: Architecture, Deployment, Optimization, and Benefits

Java Backend Technology

Jan 19, 2020 · Backend Development

7 Open-Source Middleware Projects on Gitee to Boost Your Backend

This article introduces seven open‑source middleware projects hosted on Gitee, detailing their features, use cases, and URLs, offering developers practical options for high‑performance, distributed, and reliable messaging, proxy, and push‑notification services in modern backend architectures.

Distributed Systems · Gitee · middleware

0 likes · 7 min read

7 Open-Source Middleware Projects on Gitee to Boost Your Backend

Big Data Technology & Architecture

Jan 16, 2020 · Big Data

Kafka Interview Guide: Core Concepts, Architecture, and Practical Tips

This article compiles essential Kafka interview material, covering its role as a message queue, usage scenarios, architectural components, storage mechanisms, consumer group rebalancing, high‑availability features, replication details, ordering guarantees, producer/consumer client design, topic management, log retention, performance optimizations, and key monitoring metrics.

Big Data · Distributed Systems · Kafka

0 likes · 16 min read

Kafka Interview Guide: Core Concepts, Architecture, and Practical Tips

Sohu Tech Products

Jan 15, 2020 · Fundamentals

Understanding Computation Balancing in Distributed Systems: The Effect of Replicas on Performance

The article explains how storing multiple replicas of data on a machine influences computation efficiency in distributed systems by describing partitioning, tablet replicas, two basic load‑balancing scenarios, and the overall performance trade‑offs compared to single‑node setups.

Backend · Distributed Systems · data partition

0 likes · 3 min read

Understanding Computation Balancing in Distributed Systems: The Effect of Replicas on Performance

Top Architect

Jan 9, 2020 · Fundamentals

Core Elements and Evolution of Large‑Scale Platform Architecture

This article outlines the five core elements of large‑scale platform architecture—performance, availability, scalability, extensibility, and security—and illustrates their evolution through ten architectural stages ranging from a single‑server LAMP setup to distributed micro‑service systems, accompanied by practical design patterns such as caching, load balancing, database sharding, CDN, and NoSQL.

Distributed Systems

0 likes · 10 min read

Core Elements and Evolution of Large‑Scale Platform Architecture

Tongcheng Travel Technology Center

Jan 7, 2020 · Big Data

Design and Implementation of XFlink: A Flink‑Based Data Migration System on Yarn

The article describes the evolution from the legacy XDATA tool to the new XFlink system, detailing its architecture, core plugins, parser and deployment modules, resource management with Yarn, monitoring via Prometheus and Grafana, and planned enhancements such as Flink SQL configuration and modular plugins.

Big Data · Data Migration · Distributed Systems

0 likes · 10 min read

Design and Implementation of XFlink: A Flink‑Based Data Migration System on Yarn

Architecture Digest

Jan 7, 2020 · Backend Development

Ensuring 100% Message Delivery with RabbitMQ: Reliability Steps and Idempotent Design

This article explains how to achieve guaranteed 100% message delivery in RabbitMQ by leveraging its acknowledgment mechanisms, implementing producer‑side confirmation steps, designing compensation and retry logic, and ensuring consumer‑side idempotency through unique identifiers and various ID‑generation strategies.

Distributed Systems · Idempotency · Message Delivery

0 likes · 7 min read

Ensuring 100% Message Delivery with RabbitMQ: Reliability Steps and Idempotent Design

Java Backend Technology

Jan 7, 2020 · Backend Development

Mastering Retry and Idempotency: Prevent Timeout Failures in High‑Concurrency Systems

This article examines a real‑world group‑buy scenario, explains why timeout‑prone interfaces need robust retry and idempotency handling, distinguishes read and write timeouts, outlines key idempotency practices for services and messages, and introduces Guava‑retrying and Spring‑retry as elegant solutions.

Distributed Systems · Operations · Retry

0 likes · 13 min read

Mastering Retry and Idempotency: Prevent Timeout Failures in High‑Concurrency Systems
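The retry handling described in the summary above can be illustrated with a small helper. This is a hedged sketch in Python rather than the Guava‑retrying or Spring‑retry APIs the article introduces; the function `retry`, its parameters, and the `flaky_read` operation are hypothetical. It assumes the wrapped call is safe to repeat, which is why retries are usually paired with idempotency.

```python
import time


def retry(operation, attempts=3, base_delay=0.1,
          retry_on=(TimeoutError,), sleep=time.sleep):
    """Call `operation`, retrying on the given exceptions with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # out of attempts: surface the last error to the caller
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


# Hypothetical read that times out twice before succeeding.
calls = {"n": 0}


def flaky_read():
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("simulated timeout")
    return "ok"


delays = []  # record requested delays instead of actually sleeping
result = retry(flaky_read, attempts=3, base_delay=0.1, sleep=delays.append)
```

Only exceptions listed in `retry_on` trigger another attempt; anything else propagates immediately, so non‑transient failures are not retried blindly.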

Tencent Cloud Developer

Jan 6, 2020 · Big Data

Overview of TubeMQ: Principles, Architecture, Performance, and Open‑Source Strategy for Big‑Data Message Queues

TubeMQ is a trillion‑level, Java‑based distributed message‑queue middleware designed for massive‑data ingestion, offering 140 k TPS with sub‑5 ms latency, high reliability, low cost, and horizontal scalability, and is being donated as open source to the Apache Software Foundation to foster community collaboration and future expansion beyond traditional MQ functions.

Big Data · Distributed Systems · Message Queue

0 likes · 15 min read

Overview of TubeMQ: Principles, Architecture, Performance, and Open‑Source Strategy for Big‑Data Message Queues

Top Architect

Jan 6, 2020 · Backend Development

Alipay’s LDC Architecture: High‑TPS Design, Unitization, and CAP Analysis

The article explains how Alipay’s Logical Data Center (LDC) architecture, with its RZone, GZone, and CZone unitization, combined with OceanBase’s Paxos‑based consensus, enables massive TPS growth, traffic diversion, and disaster‑recovery while navigating the CAP theorem constraints.

CAP theorem · Distributed Systems · High TPS

0 likes · 35 min read

Alipay’s LDC Architecture: High‑TPS Design, Unitization, and CAP Analysis

Programmer DD

Jan 2, 2020 · Operations

Mastering Zookeeper: From Basics to Advanced Coordination in Distributed Systems

This article provides a comprehensive guide to Zookeeper, covering its role in high‑concurrency distributed environments, core concepts, installation steps, key features such as ordering, replication, and watches, as well as practical command usage and session management.

Coordination Service · Distributed Systems · Installation

0 likes · 13 min read

Mastering Zookeeper: From Basics to Advanced Coordination in Distributed Systems

Top Architect

Jan 2, 2020 · Backend Development

Designing a High‑Concurrency Ticket‑Seckill System with Load Balancing, Pre‑Deduction, and Go Implementation

This article analyzes the challenges of handling millions of simultaneous train‑ticket purchase requests, presents a multi‑layer load‑balancing architecture, introduces a pre‑deduction inventory strategy using Redis and local memory, and demonstrates a complete Go implementation with performance testing and key architectural insights.

Distributed Systems · Go · high concurrency

0 likes · 18 min read

Designing a High‑Concurrency Ticket‑Seckill System with Load Balancing, Pre‑Deduction, and Go Implementation

Aikesheng Open Source Community

Dec 27, 2019 · Databases

Understanding Global Sequence Generation in DBLE: Snowflake and Offset‑Step Mechanisms

This article introduces DBLE's advanced global sequence features, explaining the Snowflake algorithm with its timestamp‑based ID generation and the offset‑step token allocator, detailing their designs, capacities, cluster considerations, and practical trade‑offs for distributed database systems.

DBLE · Distributed Systems · global sequence

0 likes · 5 min read

Understanding Global Sequence Generation in DBLE: Snowflake and Offset‑Step Mechanisms
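A timestamp‑based Snowflake generator of the kind the summary above mentions can be sketched as follows. The bit layout used here (41‑bit millisecond timestamp, 10‑bit worker ID, 12‑bit sequence) is the commonly cited Snowflake layout and is an assumption; DBLE's actual layout, epoch, and configuration may differ. The class name `SnowflakeGenerator` and the epoch value are hypothetical.

```python
import threading
import time


class SnowflakeGenerator:
    """Snowflake-style 64-bit IDs: timestamp | worker ID | per-ms sequence."""

    EPOCH_MS = 1577836800000  # 2020-01-01 00:00:00 UTC, an arbitrary custom epoch
    WORKER_BITS = 10
    SEQUENCE_BITS = 12
    MAX_SEQUENCE = (1 << SEQUENCE_BITS) - 1

    def __init__(self, worker_id):
        if not 0 <= worker_id < (1 << self.WORKER_BITS):
            raise ValueError("worker_id out of range")
        self.worker_id = worker_id
        self._last_ms = -1
        self._sequence = 0
        self._lock = threading.Lock()

    def next_id(self):
        with self._lock:
            now = int(time.time() * 1000)
            if now < self._last_ms:
                # A backwards clock could produce duplicate IDs; fail loudly.
                raise RuntimeError("clock moved backwards")
            if now == self._last_ms:
                self._sequence = (self._sequence + 1) & self.MAX_SEQUENCE
                if self._sequence == 0:
                    # Sequence exhausted in this millisecond: wait for the next.
                    while now <= self._last_ms:
                        now = int(time.time() * 1000)
            else:
                self._sequence = 0
            self._last_ms = now
            timestamp = now - self.EPOCH_MS
            return ((timestamp << (self.WORKER_BITS + self.SEQUENCE_BITS))
                    | (self.worker_id << self.SEQUENCE_BITS)
                    | self._sequence)


generator = SnowflakeGenerator(worker_id=7)
ids = [generator.next_id() for _ in range(5000)]
```

IDs from one generator are strictly increasing, and uniqueness across a cluster depends on every node being given a distinct worker ID, which is one of the cluster considerations the summary alludes to.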

macrozheng

Dec 26, 2019 · Fundamentals

How ZooKeeper Coordinates Distributed Systems: Nodes, Watchers, and Leader Election

This article explains ZooKeeper's core concepts—including ZNode data storage, node types, watcher mechanisms, session management, and the leader‑follower‑observer architecture—illustrating how it enables reliable coordination and atomic operations in distributed systems.

Distributed Systems · Session Management · Watcher

0 likes · 16 min read

How ZooKeeper Coordinates Distributed Systems: Nodes, Watchers, and Leader Election

dbaplus Community

Dec 25, 2019 · Backend Development

How NetEase Cloud Music Built a Custom High‑Availability Message Queue on RocketMQ

This article details NetEase Cloud Music's journey from evaluating RabbitMQ, Kafka, and RocketMQ to designing a fully controllable, high‑availability message queue with failover, tracing, monitoring, and numerous custom extensions that now serve hundreds of services and billions of messages daily.

Distributed Systems · Message Queue · RocketMQ

0 likes · 15 min read

How NetEase Cloud Music Built a Custom High‑Availability Message Queue on RocketMQ

Big Data Technology & Architecture

Dec 21, 2019 · Big Data

Kafka Offset Management and Replication Mechanisms Explained

This article provides a comprehensive technical overview of Kafka's offset handling, covering the request entry point, in‑memory offset sources, offset commit and fetch implementations, file storage layout, and the leader‑follower synchronization process that ensures data replication and high‑watermark updates.

Big Data · Distributed Systems · High Watermark

0 likes · 16 min read

Kafka Offset Management and Replication Mechanisms Explained

Alibaba Cloud Developer

Dec 20, 2019 · Operations

How We Traced a 48‑Hour Memory Leak in a Distributed Coordination Service

This article details a step‑by‑step investigation of repeated follower process alerts in a Paxos‑based distributed coordination service, revealing a Java GC pause‑induced memory leak in the front‑end Proxy and describing the rapid mitigation actions taken to restore system stability.

Distributed Systems · incident response · java-gc

0 likes · 12 min read

How We Traced a 48‑Hour Memory Leak in a Distributed Coordination Service

Big Data Technology & Architecture

Dec 19, 2019 · Big Data

Apache Kafka 2.4.0 Release: New Features and Improvements

Apache Kafka 2.4.0 introduces a range of new capabilities—including consumer replica fetching, incremental cooperative rebalancing, MirrorMaker 2.0, a new Java authorization API, KTable non‑key joins, administrative replica reassignment, protected REST endpoints, and offset deletion—along with numerous performance and stability improvements.

Apache Kafka · Big Data · Distributed Systems

0 likes · 3 min read

Apache Kafka 2.4.0 Release: New Features and Improvements

Java High-Performance Architecture

Dec 18, 2019 · Fundamentals

Understanding the CAP Theorem and Distributed Consistency: A Practical Guide

This article explains the CAP theorem and its trade-offs in distributed systems, compares consensus protocols such as ZAB and Raft, and discusses multi‑data‑center support, gossip protocols, watch mechanisms, multi‑language clients, DNS‑based service discovery, and health‑check strategies across tools such as Zookeeper, Consul, and Eureka.

Distributed Systems · Gossip Protocol

0 likes · 7 min read

Understanding the CAP Theorem and Distributed Consistency: A Practical Guide

ITPUB

Dec 17, 2019 · Backend Development

From Single Server to Cloud‑Native: 12 Steps of Scaling an E‑Commerce Backend

The article walks through the evolution of a high‑traffic e‑commerce backend—from a single‑machine setup to distributed databases, load‑balancing, micro‑services, and finally cloud‑native deployment—highlighting the technical challenges and design principles at each stage.

Cloud Native · Distributed Systems · Microservices

0 likes · 20 min read

From Single Server to Cloud‑Native: 12 Steps of Scaling an E‑Commerce Backend