Tagged articles
2122 articles
Page 17 of 22
dbaplus Community
May 22, 2019 · Operations

Designing a Scalable Monitoring System: From Data Collection to Alerting

This article explains how to build a comprehensive monitoring system for distributed applications by classifying monitoring functions, describing data quadrants, outlining core modules such as collection, processing, feature extraction, and visualization, and reviewing typical implementations for metrics, logs, tracing, alerting, and the key open‑source components involved.

Distributed Systems · Metrics · monitoring
0 likes · 18 min read
Architecture Talk
May 20, 2019 · Databases

From Zero to Redis Mastery: Why and How to Use Its Core Features

This article walks through Redis from a basic overview to advanced features such as persistence, Sentinel, clustering, data types, transactions, Lua scripting, pipelining, and distributed locks, illustrating each concept with practical examples and explaining when and why to use them in real‑world applications.

Data Types · Distributed Systems · Persistence
0 likes · 14 min read
Architecture Digest
May 15, 2019 · Backend Development

WeChat Backend Architecture: High Availability, Strong Consistency, and Scalable Microservices

This article summarizes the design of WeChat's massive‑scale backend, covering its evolution from early storage systems to a multi‑master PaxosStore architecture that delivers six‑nines availability, strong data consistency, rapid iteration, and a unified microservice framework for billions of daily operations.

Distributed Systems · Microservices · PaxosStore
0 likes · 14 min read
Big Data Technology Architecture
May 13, 2019 · Big Data

Problems Caused by Single-Point Region Assignment in HBase and Possible Solutions

The article analyzes how HBase regions being assigned to a single RegionServer create reliability issues such as jitter, service interruptions, and data loss, examines the underlying hardware, OS, and operational factors, and proposes system optimizations and replica-based high‑availability strategies to mitigate these problems.

Distributed Systems · HBase · Region
0 likes · 10 min read
Architecture Digest
May 11, 2019 · Cloud Native

Ant Financial’s Fifteen‑Year Technology Architecture Evolution and the Future of FinTech

In a QCon 2019 talk, Ant Financial’s deputy CTO Hu Xi outlines the company’s fifteen‑year journey reshaping payments and micro‑loans through blockchain, AI, security, IoT and cloud computing, and details the emerging cloud‑native, high‑availability, data‑intelligent architecture that will underpin the next generation of financial technology.

Artificial Intelligence · Big Data · Blockchain
0 likes · 16 min read
Alibaba Cloud Developer
May 10, 2019 · Cloud Native

How Ant Group Built a Cloud‑Native, Financial‑Grade Architecture Over 15 Years

Ant Group’s former CTO Hu Xi outlines the 15‑year evolution of its fintech architecture, highlighting the five BASIC technologies—blockchain, AI, security, IoT, and cloud computing—while detailing the shift to cloud‑native, distributed middleware, OceanBase, service mesh, risk‑auto‑recovery, and open‑intelligent data platforms.

Big Data · Blockchain · Distributed Systems
0 likes · 18 min read
Architects' Tech Alliance
May 9, 2019 · Databases

From Single‑Node to Distributed: The Evolution of Modern Database Services

This article traces the historical laws that drove computing growth, examines how Redis, MongoDB and Memcached evolved, compares client‑side, proxy and compute‑storage‑separated architectures, evaluates their trade‑offs, and answers common questions about cloud‑based distributed databases.

Cloud Databases · Compute-Storage Separation · Database Architecture
0 likes · 23 min read
AntTech
May 9, 2019 · Cloud Native

Ant Financial’s Fifteen‑Year Technology Architecture Evolution and the Future of FinTech

The article reviews Ant Financial’s fifteen‑year journey reshaping payments and micro‑loans through blockchain, AI, security, IoT and cloud computing, explains how distributed middleware, OceanBase, service‑mesh‑based cloud‑native infrastructure and open intelligent computing architectures enable high‑availability, scalable financial services, and introduces the BASIC College talent program.

Artificial Intelligence · Big Data · Blockchain
0 likes · 16 min read
High Availability Architecture
May 7, 2019 · Cloud Native

A Gentle Introduction to Apache Pulsar: Architecture, Features, and Use Cases

This article introduces Apache Pulsar, a cloud‑native pub/sub messaging platform that solves traditional messaging system limitations with a layered architecture, offering independent scaling, multi‑tenant isolation, cross‑region replication, zero rebalancing, unified queue‑and‑stream models, and built‑in functions and proxy support.

Apache Pulsar · Distributed Systems · Messaging
0 likes · 5 min read
Java Backend Technology
May 6, 2019 · Backend Development

Ensuring Zero Message Loss in RabbitMQ: Persistence, Confirm & Idempotency

This article explains how to guarantee that order service messages are reliably delivered to RabbitMQ by using durable queues, the confirm mechanism, early persistence with Redis and scheduled retries, and idempotent processing techniques to achieve near‑100% message safety in high‑concurrency environments.

Confirm Mechanism · Distributed Systems · Idempotency
0 likes · 10 min read
Architects' Tech Alliance
May 1, 2019 · Industry Insights

Why the Mid‑Platform Will Become the Backbone of the Future Industry Internet

The article analyzes the evolution of technology architecture, explains the origin and purpose of the mid‑platform concept from Alibaba and JD, outlines future software trends such as AI‑driven logic, IoT UI, blockchain data, and quantum infrastructure, and proposes a layered application model for the emerging industry internet.

Artificial Intelligence · Cloud Computing · Distributed Systems
0 likes · 11 min read
21CTO
Apr 29, 2019 · Big Data

How EasyScheduler Powers Scalable Big Data Workflow Management

EasyScheduler is an open‑source big‑data workflow scheduler that uses a decentralized architecture with Master and Worker nodes coordinated via ZooKeeper, supporting DAG‑based task definitions, various task types, fault tolerance, priority handling, distributed locks, and remote logging, all illustrated with detailed component diagrams.

Big Data · DAG · Distributed Systems
0 likes · 17 min read
Architect's Tech Stack
Apr 27, 2019 · Databases

Hybrid Hash‑Range Sharding Strategy with Group‑Based Allocation

This article presents a hybrid sharding approach that combines range partitioning to assign ID ranges to groups and hash modulo on the total number of tables to achieve uniform data distribution while avoiding hotspots and eliminating the need for data migration during scaling.

Distributed Systems · Hash · Scalability
0 likes · 7 min read
iQIYI Technical Product Team
Apr 26, 2019 · Operations

Design and Implementation of iQIYI CDN Inspection System

iQIYI built a three‑component CDN Inspection System that automatically generates tasks, centrally processes and analyzes results, and runs edge measurements to monitor millions of hybrid CDN servers in real time. It detects configuration errors, file mismatches and traffic anomalies, enabling proactive remediation and 100% local coverage.

CDN · Cloud Computing · Distributed Systems
0 likes · 11 min read
Efficient Ops
Apr 21, 2019 · Backend Development

Mastering Elasticsearch: From Inverted Index to Distributed Search

This article walks through the fundamentals of search engines, explaining inverted indexes, the explosion of index size, core Elasticsearch concepts, its distributed architecture, and how it powers the ELK stack for log analysis, all illustrated with clear diagrams and examples.

Backend · Distributed Systems · ELK
0 likes · 6 min read
Youzan Coder
Apr 17, 2019 · Big Data

Order Data Synchronization Architecture at YouZan: From MySQL to ES and HBase

YouZan’s order data synchronization moves changes from MySQL through Canal‑parsed binlogs into a message queue. Sequential SeqNo‑based optimistic locking and HBase’s column‑version timestamps guarantee ordering for both single‑ and multi‑table updates, while a Logstash‑style configurable pipeline feeds ES for search and HBase for detail queries. The design eliminates ordered‑queue bottlenecks and keeps data consistent at high throughput.

Binlog · Canal · Distributed Systems
0 likes · 12 min read
Java High-Performance Architecture
Apr 16, 2019 · Fundamentals

Why Distributed Systems Can’t Have It All: Unpacking the CAP and BASE Theories

The article explains the CAP theorem, detailing how distributed systems must trade off consistency, availability, and partition tolerance, and explores CP vs AP designs, then introduces the BASE model—Basically Available, Soft State, Eventual Consistency—as a practical complement for real‑world architectures.

BASE model · Distributed Systems
0 likes · 8 min read
Youzan Coder
Apr 12, 2019 · Industry Insights

How Youzan Scaled Its Log Platform to Handle Billions of Daily Logs

This article details Youzan's evolution from a simple Flume‑based log collector to a multi‑tenant, Kafka‑buffered, Spark‑processed, HBase‑backed logging architecture that now handles hundreds of billions of log entries per day, highlighting challenges, design decisions, and future improvements.

Distributed Systems · Elasticsearch · HBase
0 likes · 10 min read
Programmer DD
Apr 12, 2019 · Backend Development

Mastering Microservice Architecture: Core Principles and Practical Guidelines

This article outlines essential microservice principles, from business‑centric service design and standardised interaction protocols to practical splitting, aggregation, technology stack selection, API design, unified logging, error handling and legacy challenges, offering a comprehensive roadmap for successful backend service implementation.

Architecture · Distributed Systems · service standards
0 likes · 15 min read
21CTO
Apr 8, 2019 · Blockchain

Understanding Blockchain Architecture: Layers, Implementations, and Knowledge Map

This article explains the fundamental concepts of blockchain, outlines its three-layer architecture (protocol, extension, application), reviews typical language‑specific implementations, and presents a knowledge‑map that helps developers systematically study and build blockchain‑based products.

Architecture · Blockchain · Distributed Systems
0 likes · 16 min read
ITFLY8 Architecture Home
Apr 7, 2019 · Backend Development

Master High Cohesion & Low Coupling: Microservice Architecture Patterns and DDD Guide

This article explores how to achieve high cohesion and low coupling in distributed systems by examining clean, hexagonal, and CQRS architectures, detailing domain‑driven design layers, microservice boundary design, splitting principles, and practical strategies for data, caching, and resilience in modern backend development.

Distributed Systems · Microservices · Software Architecture
0 likes · 40 min read
Architecture Digest
Apr 4, 2019 · Backend Development

Design and Implementation of a Modern IM Message Synchronization and Storage Architecture Using the TableStore Timeline Model

This article explains the evolution from traditional to modern instant‑messaging system architectures, introduces a Timeline logical model for message sync and storage, discusses read‑ and write‑diffusion strategies, evaluates database requirements, and demonstrates a TableStore‑based implementation with sample code.

Distributed Systems · IM · Tablestore
0 likes · 14 min read
ITFLY8 Architecture Home
Mar 29, 2019 · Backend Development

How MVCC Beats Pessimistic Locks for Distributed Key‑Value Stores

This article examines a distributed system concurrency problem, compares lock‑based transactions with multiversion concurrency control (MVCC), and explains why MVCC often outperforms pessimistic locking in scenarios demanding high read responsiveness and low contention.

Concurrency Control · Distributed Systems · MVCC
0 likes · 10 min read
21CTO
Mar 28, 2019 · Operations

Master Load Balancing: Principles, Types, and Algorithms Explained

Load balancing distributes traffic across multiple servers to improve performance, ensure high availability, and enable scalability, covering concepts such as vertical and horizontal scaling, business partitioning, various load‑balancing methods (DNS, HTTP, IP, layer‑2, hybrid), and common algorithms like round‑robin, random, least connections, hash, and weighted distribution.

Distributed Systems · load balancing · network algorithms
0 likes · 12 min read
Alibaba Cloud Developer
Mar 28, 2019 · Operations

How ChaosBlade Empowers You to Build Resilient Cloud‑Native Systems

ChaosBlade is an open‑source chaos engineering tool from Alibaba that lets you repeatedly inject failures into distributed systems, helping you measure fault tolerance, validate orchestration, test platform robustness, verify monitoring alerts, and improve emergency response capabilities for more reliable cloud‑native applications.

DevOps · Distributed Systems · Open-source
0 likes · 9 min read
21CTO
Mar 27, 2019 · Backend Development

Designing Scalable Distributed Architecture for QQGame: Lessons in High‑Performance Backend

This article examines the challenges of handling millions of concurrent QQGame players, explains why client‑side room‑count replication is essential, and proposes a divide‑and‑conquer, scale‑out server cluster with autonomous room and region managers to achieve high availability and data consistency.

Distributed Systems · QQGame · Server Architecture
0 likes · 14 min read
Architects' Tech Alliance
Mar 22, 2019 · Operations

Mastering Load Balancing: Principles, Types, and Algorithms Explained

This comprehensive guide explains why load balancing is essential for high‑traffic websites, details vertical and horizontal scaling, compares DNS, IP, link‑layer, and hybrid approaches, outlines common algorithms such as round‑robin and weighted, and reviews hardware versus software solutions.

Algorithms · Distributed Systems · Hardware
0 likes · 12 min read
Aikesheng Open Source Community
Mar 18, 2019 · Databases

Improving MyCat Pitfalls with the Distributed Middleware DBLE: Technical Review and Comparison

This article reviews the shortcomings of MyCat in handling sharding, joins, inserts, and session variables, demonstrates how the DBLE middleware addresses those issues with better correctness, performance, security, and operational management, and discusses code‑quality improvements and automated testing practices.

DBLE · Distributed Systems · sql
0 likes · 19 min read
Java Backend Technology
Mar 15, 2019 · Backend Development

Mastering Dubbo: Core Concepts, Configurations, and Common Pitfalls

This article explains what Dubbo is, its key features such as transparent RPC, cluster fault tolerance, communication protocols, registration centers, serialization options, configuration details, load‑balancing strategies, security mechanisms, common issues and solutions, and compares it with Dubbox and other distributed frameworks.

Distributed Systems · Dubbo · RPC
0 likes · 14 min read
21CTO
Mar 13, 2019 · Backend Development

From Rejection to Mastery: How Deep Code Reading Boosted My Backend Career

The author shares a personal journey from multiple Alibaba interview rejections to mastering backend engineering through diligent source‑code study, open‑source contributions, algorithm training, and practical project experience, offering actionable advice for aspiring developers seeking growth and interview success.

Distributed Systems · System Design · algorithm training
0 likes · 10 min read
dbaplus Community
Mar 12, 2019 · Databases

Mastering HBase Cross‑Datacenter Migration: Snapshots, Architecture, and Real‑World Tips

This article provides a comprehensive technical guide on HBase, covering its core concepts, advantages and drawbacks, architecture layers, practical use cases, and a detailed step‑by‑step process for large‑scale cross‑datacenter migration using snapshot‑based strategies, with commands, diagrams, and lessons learned.

Big Data · Data Migration · Database Architecture
0 likes · 19 min read
Alibaba Cloud Native
Mar 7, 2019 · Cloud Native

How Kubernetes Scheduler Works: Inside the Core Scheduling Engine

This article explains the inner workings of the Kubernetes scheduler, covering its architecture, pod queue handling, filtering, prioritization, binding, preemption, and code-level details, while also discussing current limitations and future enhancements such as the V2 framework and gang scheduling extensions.

Distributed Systems · Go · Kubernetes
0 likes · 12 min read
Java Captain
Mar 3, 2019 · Databases

Redis Overview: Architecture, Persistence, High Availability, and Client Features

This article explains Redis as an in‑memory data store used for caching, database, and messaging, walks through its evolution from simple HTTP caching to dedicated servers, and details server‑side features like persistence, Sentinel, replication, clustering, as well as client‑side capabilities such as rich data types, transactions, Lua scripting, pipelining, and distributed locks.

Cluster · Distributed Systems · Persistence
0 likes · 11 min read
System Architect Go
Mar 2, 2019 · Backend Development

Mastering NSQ: Deep Dive into Architecture, Configurations, and Best Practices

This article provides a comprehensive overview of NSQ, covering its core components, message flow, producer and consumer interactions, key configuration parameters, handling of timeouts, backoff mechanisms, service discovery with nsqlookupd, and practical considerations for scaling and reliability.

Backend Architecture · Distributed Systems · Message Queue
0 likes · 10 min read
Java Backend Technology
Mar 2, 2019 · Operations

How Alibaba’s ‘MonkeyKing’ Uses Chaos Engineering to Strengthen System Reliability

Alibaba’s MonkeyKing, inspired by Netflix’s Chaos Monkey, employs intentional fault injection—from random node kills to simulated network outages—to test and improve system robustness across IaaS, PaaS, and SaaS layers, offering a comprehensive model for reliability engineering in complex distributed environments.

Alibaba · Distributed Systems · Fault Injection
0 likes · 8 min read
Java Captain
Feb 21, 2019 · Backend Development

Top Java Open‑Source Projects on GitHub in January

This article presents a curated list of the most popular Java open‑source projects on GitHub for January, summarizing each repository’s purpose, key features, and star count to help developers discover valuable resources for learning, building, and advancing Java applications.

Distributed Systems · GitHub · Open-source
0 likes · 6 min read
Youzan Coder
Feb 20, 2019 · Databases

HBase Read Path Analysis

The article first outlines HBase’s overall architecture and core components, then details the end‑to‑end read path—from client request routing to RegionServer processing, data organization and filtering—and finally presents practical client‑ and server‑side optimizations such as heterogeneous storage, HDFS short‑circuit, hedged reads, high‑availability reads, and warm‑up failure fixes, illustrated with Youzan’s production cluster.

Distributed Systems · HBase · Technical Guide
0 likes · 17 min read
Big Data Technology & Architecture
Feb 15, 2019 · Big Data

Big Data Mastery Roadmap

This article outlines a comprehensive series of over 500 planned tutorials covering Java advanced features, distributed theory, Hadoop, Spark, Flink, and various big‑data storage and processing technologies, designed to guide engineers transitioning into big‑data development from fundamentals to expert level.

Distributed Systems · Flink · Hadoop
0 likes · 4 min read
Big Data Technology & Architecture
Feb 12, 2019 · Big Data

Big Data Mastery Roadmap – Series Overview

An extensive roadmap series titled “Big Data Mastery Roadmap” outlines essential topics—from Java advanced features and JVM internals to Hadoop, Spark, Flink, and big-data algorithms—guiding engineers transitioning to big data development with curated references, updates, and author insights.

Distributed Systems · Learning Path · data engineering
0 likes · 5 min read
Java Captain
Jan 31, 2019 · Backend Development

A Curated List of Alibaba Open‑Source Projects for Java Development

This article presents a curated collection of Alibaba’s open‑source Java projects, ranging from distributed service frameworks like Spring Cloud Alibaba and Dubbo, database tools such as Druid and TDDL, to monitoring, messaging, and cloud‑native components, each with brief descriptions and GitHub links.

Alibaba · Distributed Systems · Open-source
0 likes · 13 min read
58 Tech
Jan 25, 2019 · Backend Development

Search Engineering Architecture: Lessons from Zhihu and 58 Group

The article summarizes the evolution and redesign of Zhihu's search engine, details 58 Group's high‑performance uesearch architecture, real‑time indexing mechanisms, cloud‑native deployment with Kubernetes, and highlights key technical insights and future directions for large‑scale search systems.

Architecture · Distributed Systems · Kubernetes
0 likes · 9 min read
Programmer DD
Jan 18, 2019 · Backend Development

Mastering Compensation: When to Rollback vs Retry in Distributed Systems

This article explains the purpose of compensation mechanisms in microservice architectures, compares rollback and retry approaches, outlines their implementation details, discusses idempotency concerns, and provides practical best‑practice recommendations for building resilient distributed systems.

Compensation · Distributed Systems · Idempotency
0 likes · 12 min read
Meituan Technology Team
Jan 17, 2019 · Information Security

Design and Architecture of a Scalable Host‑Based Intrusion Detection System (HIDS)

The paper presents a highly scalable, low‑overhead Host‑based Intrusion Detection System architecture designed for hundreds of thousands of servers, emphasizing cluster high‑availability, strong consistency via a CP‑oriented etcd backend, Go‑based agents with efficient resource management, modular sandboxing, and robust process monitoring to ensure reliable, secure operation at massive scale.

CAP theorem · Distributed Systems · HIDS
0 likes · 26 min read
Java Captain
Jan 17, 2019 · Backend Development

Understanding Java RPC: RMI, Hessian, and Dubbo with Code Examples

This article explains the concept of RPC in Java, compares three popular frameworks—RMI, Hessian, and Dubbo—describes their architectures, and provides complete code samples for interfaces, service implementations, clients, and servers to help developers build scalable distributed applications.

Distributed Systems · Dubbo · Hessian
0 likes · 6 min read
Architects Research Society
Jan 12, 2019 · Fundamentals

Architectural Trade‑offs: Why eBay and Amazon Avoid Distributed Transactions and Embrace BASE

The article examines how architects of large-scale systems like eBay and Amazon forgo traditional ACID transactions in favor of BASE principles, balancing consistency, availability, and scalability through application‑level designs, async processing, and strategic trade‑offs informed by the CAP theorem.

Architecture · BASE · Distributed Systems
0 likes · 6 min read
Youzan Coder
Jan 11, 2019 · Backend Development

Business Reconciliation Platform Architecture Design for Distributed Systems

The article describes YouZan's business reconciliation platform for distributed systems, which detects and quantifies data inconsistencies by offering easy plug‑in integration, a four‑step orchestrated workflow, high‑throughput offline processing with Spark, second‑level real‑time event handling, a three‑layer architecture, and health monitoring for transaction chains.

CAP theorem · Data Consistency · Distributed Systems
0 likes · 9 min read
Manbang Technology Team
Jan 10, 2019 · Backend Development

Mastering Apache Storm: Architecture, Components, and Real‑Time Processing Essentials

This article provides an in‑depth technical overview of Apache Storm, covering its core architecture, key components such as Nimbus, Supervisor, Worker, Executor, and Task, the role of ZooKeeper, high‑availability setup, API interfaces, code examples, grouping strategies, metrics, back‑pressure handling, and essential configuration parameters for building low‑latency stream processing topologies.

Apache Storm · Back-pressure · Bolt
0 likes · 12 min read
Java Captain
Jan 10, 2019 · Backend Development

Building a Simple Distributed Service with SpringBoot and Dubbo

This tutorial explains key concepts such as distributed systems, RPC, and Dubbo, then guides you through installing Zookeeper, creating Maven modules, configuring SpringBoot and Dubbo, implementing service interfaces, providers, and consumers, and testing a simple HelloWorld distributed service.

Distributed Systems · Dubbo · RPC
0 likes · 12 min read
Mike Chen's Internet Architecture
Jan 7, 2019 · Backend Development

Design Principles and Architecture of Message Queues

Message queues serve as essential decoupling and traffic‑control components in distributed systems, and this article outlines their overall design—including producers, brokers, consumers, communication protocols such as JMS, AMQP and Kafka, storage options, consumption relationship handling, and advanced features like ordering, reliability, persistence, and transaction support.

Distributed Systems · storage
0 likes · 6 min read
Java Captain
Dec 29, 2018 · Backend Development

Understanding Remote Procedure Call (RPC) and Its Implementation in Java

This article explains the concept of Remote Procedure Call (RPC), why distributed systems need remote invocation, the roles of provider, consumer and registry, essential technologies such as dynamic proxies, serialization and NIO, and introduces popular open‑source Java RPC frameworks.

Distributed Systems · Dynamic Proxy · RPC
0 likes · 8 min read
Architect's Tech Stack
Dec 22, 2018 · Backend Development

Understanding Idempotency: Concepts, Examples, and Implementation Techniques

This article explains the mathematical and programming concept of idempotency, provides real‑world examples such as duplicate form submissions and payment requests, and details practical techniques—including query/select operations, unique indexes, token mechanisms, pessimistic and optimistic locks, distributed locks, and API design—to ensure idempotent behavior in backend systems.

Distributed Systems · Idempotency · database
0 likes · 9 min read
Youzan Coder
Dec 21, 2018 · Operations

Building MAXIM: A Distributed Full-Link Load Testing Engine Based on Gatling

MAXIM is Youzan’s distributed full‑link load‑testing engine built on Gatling, featuring a central control center, multiple load injectors, a GUI for test orchestration, data‑parameter binding, real‑time injector monitoring, automated reporting with historical retention, and extensible architecture supporting Dubbo and centralized InfluxDB logging.

Distributed Systems · Gatling · Load Testing
0 likes · 10 min read
Programmer DD
Dec 21, 2018 · Backend Development

How Circuit Breakers Safeguard Distributed Systems from Cascading Failures

This article explains the concept of circuit breaking in distributed systems, outlines a four‑step implementation process with strategies for detecting unhealthy services, cutting off calls, probing recovery, and restoring normal operation, and shares best‑practice tips to minimize downtime and improve resilience.

Distributed Systems · circuit breaker · fault tolerance
0 likes · 10 min read
Programmer DD
Dec 19, 2018 · Backend Development

Mastering Rate Limiting: When to Use Fixed, Sliding, Leaky or Token Buckets

This article explains the difference between rate limiting and circuit breaking, shows how to determine system capacity, compares fixed‑window, sliding‑window, leaky‑bucket and token‑bucket algorithms with code examples, and offers best‑practice guidance for applying them in distributed backend systems.

Backend Architecture · Circuit Breaking · Distributed Systems
0 likes · 15 min read
Java Captain
Dec 17, 2018 · Backend Development

Overview of Traditional Three‑Tier, Cluster, Distributed, and Microservice Architectures for Java Web Applications

The article explains the evolution from classic three‑tier Java web architecture to cluster, distributed, and microservice designs, detailing each model’s components, load‑balancing mechanisms, session sharing, and the trade‑offs of using technologies such as Tomcat, Nginx, Dubbo, and Spring Cloud.

Cluster · Distributed Systems · Spring MVC
0 likes · 9 min read
Youzan Coder
Dec 14, 2018 · Operations

Youzan Full‑Link Load Testing Architecture and Implementation

Youzan’s full‑link load‑testing architecture combines a traffic generator, a data‑factory pipeline, and the Maxim platform to replay realistic e‑commerce user actions, tag and isolate test traffic via unified headers, route reads/writes to shadow storage, and integrate Gatling for capacity planning, degradation, alarm, disaster‑recovery and throttling drills.

Big Data · Data Isolation · Distributed Systems
0 likes · 13 min read
Architects' Tech Alliance
Dec 10, 2018 · Fundamentals

Why Consistency Matters in Distributed Systems: A Deep Dive

This article explains the fundamental reasons for building distributed systems, examines their inevitable side effects (especially data consistency challenges), analyzes the root causes of inconsistency, and walks through consistency models ranging from eventual consistency to linearizability, with clear examples and illustrations.

Data Consistency · Distributed Systems · Linearizability
0 likes · 10 min read
dbaplus Community
Dec 6, 2018 · Backend Development

Mastering RabbitMQ: Core Concepts, Patterns, and Advanced Features

This article provides a comprehensive guide to RabbitMQ, covering the fundamentals of message middleware, the P2P and Pub/Sub models, a comparison with Kafka and RocketMQ, detailed explanations of exchanges, queues, channels, and advanced features such as mandatory routing, backup exchanges, TTL, dead‑letter queues, delayed queues, priority queues, and RPC implementations for reliable distributed systems.

Dead Letter Queue · Distributed Systems · Message Queue
0 likes · 23 min read
21CTO
Nov 28, 2018 · Backend Development

How Uber Scaled Its Real-Time Ride-Sharing Platform: Architecture & Challenges

This article examines how Uber built and scaled its real-time ride-sharing platform, detailing the original simple PHP-MySQL architecture, subsequent extensions with message queues, MongoDB, Ringpop storage, TChannel communication, fault-tolerance strategies, latency challenges, and practical tools for distributed system design.

Distributed Systems · Microservices · Ringpop
0 likes · 18 min read
Hujiang Technology
Nov 26, 2018 · Backend Development

Ensuring Distributed Final Consistency: Heavy and Light Approaches, Principles and Practices

The article explains the challenges of achieving eventual ("final") consistency in distributed systems, compares heavyweight transaction frameworks with lightweight techniques such as idempotency, retries, state machines, recovery logs, and asynchronous verification, and outlines the CAP theorem, the BASE principles, and practical implementation steps for backend systems.

BASE · CAP theorem · Consistency
0 likes · 14 min read
Youzan Coder
Nov 23, 2018 · Cloud Computing

Transparent Multilevel Cache (TMC): Architecture, Hotspot Detection, and Local Cache in Youzan PaaS

Youzan’s Transparent Multilevel Cache (TMC) adds automatic hotspot detection and a 64 MB local cache to existing distributed caches via a Hermes‑SDK‑augmented Jedis client, delivering transparent Java integration, strong consistency, up to 80 % local‑hit rates, and improved QPS during high‑traffic events.

Cache · Distributed Systems · PaaS
0 likes · 15 min read
Qunar Tech Salon
Nov 14, 2018 · Big Data

Comparing Apache Pulsar and Apache Kafka: Message Models, Consumption, Acknowledgment, Retention, and Architecture

This article provides a detailed comparison between Apache Pulsar and Apache Kafka, covering their message consumption models (queue vs. stream), subscription types, acknowledgment mechanisms, retention policies, and underlying layered architecture, highlighting Pulsar's unified API and segment‑based storage advantages.

Apache Kafka · Apache Pulsar · Distributed Systems
0 likes · 21 min read
Architect's Tech Stack
Nov 13, 2018 · Databases

Understanding Hot Key Issues and Effective Solutions in Distributed Caching Systems

This article explains the causes of hot key problems in high‑traffic scenarios, outlines their potential impact on system performance, and presents multiple mitigation strategies—including server‑side caching, Memcache/Redis, local caches, read‑write separation, and proactive hot‑data detection—while comparing their advantages and trade‑offs.

Backend Performance · Distributed Systems · Hot Key
0 likes · 7 min read
DataFunTalk
Nov 9, 2018 · Backend Development

From Zero to One: Building and Optimizing Search Engines with Elasticsearch – Insights and Case Studies

This article presents a comprehensive overview of constructing a search engine using Elasticsearch, covering architecture components, data read/write mechanisms, shard management, caching strategies, and real‑world case studies that illustrate performance tuning, isolation, and deployment best practices.

Distributed Systems · Elasticsearch · Performance Optimization
0 likes · 14 min read
Alibaba Cloud Developer
Nov 8, 2018 · Databases

Why Alibaba’s Li Fei‑Fei Earned the ACM Distinguished Scientist Honor

Alibaba’s DAMO Academy chief database scientist Li Fei‑Fei was named an ACM Distinguished Scientist in 2018, recognizing his groundbreaking work on next‑generation distributed databases, unstructured data management, and data security that power Alibaba’s massive services and national smart‑city and meteorological platforms.

ACM Distinguished Scientist · Alibaba · Distributed Systems
0 likes · 5 min read
Alibaba Cloud Developer
Nov 6, 2018 · Databases

How Alibaba’s Database Tech Evolved for Double 11: 10 Years of Innovation

This article chronicles Alibaba's decade‑long journey of database innovations—from commercial and open‑source systems to a self‑developed distributed engine—highlighting the technical breakthroughs, performance optimizations, and intelligent features that powered successive Double 11 shopping festivals.

Alibaba · Distributed Systems · Double 11
0 likes · 23 min read
dbaplus Community
Nov 4, 2018 · Databases

How Spark Turns Traditional Databases into Powerful OLAP Engines

This article examines why traditional relational databases like MySQL struggle with analytical workloads, compares ROLAP and MOLAP approaches, explains Spark’s architecture and its advantages for OLAP, and details how Alibaba Cloud’s DRDS HTAP leverages a Spark‑based engine to deliver real‑time distributed query processing.

Distributed Systems · HTAP · MPP
0 likes · 11 min read
JD Tech
Nov 2, 2018 · Backend Development

Evolution and Optimization of JD B2B Platform Architecture

This article presents a comprehensive case study of JD's B2B platform, detailing its three development phases, the challenges of the initial monolithic architecture, and the step‑by‑step architectural refinements—including service decomposition, database sharding, messaging middleware, and distributed search—that culminated in the current modular 3.0 design.

Architecture · B2B · Distributed Systems
0 likes · 11 min read
AntTech
Nov 2, 2018 · Information Security

Ant Group’s TRaaS: A Technological Risk‑Defense Platform for Financial Systems

Ant Group unveiled TRaaS (Technological Risk‑defense as a Service), a comprehensive platform that combines high‑availability, real‑time fund reconciliation and AI‑driven self‑healing capabilities to protect large‑scale financial systems against technical risks.

Distributed Systems · TRaaS · aiops
0 likes · 10 min read
Node Underground
Oct 31, 2018 · Cloud Computing

Unlock 32 Cloud Design Patterns to Solve Distributed System Challenges

This article introduces the free classic book “Cloud Design Patterns,” which catalogs 32 design patterns across eight problem domains—availability, data management, design & implementation, messaging, management & monitoring, performance & scalability, resiliency, and security—offering practical solutions and C# examples for modern cloud and distributed systems.

Distributed Systems · resiliency · sharding
0 likes · 5 min read
Programmer DD
Oct 30, 2018 · Fundamentals

What Is Paxos? A Storytelling Guide to Distributed Consensus

This article uses a vivid allegorical story to introduce the Paxos algorithm, then explains its roles, two-phase protocol, fault assumptions, and why majority and multiple acceptors are essential for achieving reliable consensus in distributed systems.

Distributed Systems · Paxos · algorithm
0 likes · 10 min read
AntTech
Oct 26, 2018 · Backend Development

An Overview of SOFARPC: Design, Extensions, and Core Features

This article introduces SOFARPC, Ant Financial's financial‑grade Java RPC framework, covering its overall architecture, extension mechanism, link tracing, connection management, synchronous and asynchronous invocation models, thread model, fault‑tolerance, generic invocation, data passthrough, graceful shutdown, routing implementation, and annotation support for building robust microservice systems.

Distributed Systems · Microservices · RPC
0 likes · 13 min read
58 Tech
Oct 24, 2018 · Backend Development

Overview of the SCF RPC Framework: Architecture, Call Modes, Serialization, Service Registration, and Monitoring

This article introduces the SCF RPC framework developed by 58, covering its overall architecture, synchronous and callback call modes, timeout handling, custom serialization techniques, service registration and discovery using etcd, as well as data collection, storage, and monitoring mechanisms for large‑scale distributed services.

Distributed Systems · RPC · SCF
0 likes · 16 min read
AntTech
Oct 24, 2018 · Fundamentals

A Comprehensive Overview of Alibaba’s Open‑Source Projects Across Frontend, Backend, Mobile, Database, and System Domains

This article presents a curated collection of Alibaba Group’s open‑source projects, spanning frontend design systems, Java libraries, database engines, distributed file and messaging systems, as well as tutorials, highlighting each project's purpose, key features, and GitHub repository links for developers seeking robust, production‑grade solutions.

Alibaba · Backend · Distributed Systems
0 likes · 20 min read
Architects' Tech Alliance
Oct 15, 2018 · Databases

Data Sharding in Distributed Systems: Partitioning Strategies, Metadata Management, and Consistency Mechanisms

The article explains how distributed storage systems solve the fundamental problems of data sharding and redundancy by describing three sharding methods (hash, consistent‑hash, and range‑based), the criteria for choosing a shard key, the role of metadata servers, and consistency techniques such as leasing, all illustrated with concrete examples and code snippets.

Distributed Systems · consistent hashing · databases
0 likes · 25 min read
dbaplus Community
Oct 15, 2018 · Backend Development

Mastering Server‑Side Caching: Design Patterns, Redis Tips, and Best Practices

This article explores the fundamentals and practical details of server‑side caching in distributed web systems, covering local vs. distributed caches, Redis instance planning, key‑value modeling, persistence, eviction policies, and CRUD operations with performance‑focused recommendations.

Cache Design · Distributed Systems · Performance
0 likes · 14 min read
ITFLY8 Architecture Home
Oct 12, 2018 · Backend Development

From Zero to Scalable Logistics: A Real‑World Backend Architecture Evolution

This article narrates the step‑by‑step evolution of a cross‑border e‑commerce logistics platform, detailing how its backend architecture was repeatedly redesigned—from a simple 1.0 system to a robust, fault‑tolerant 3.0 solution—driven by growing business needs, performance challenges, and reliability lessons.

Backend Architecture · Distributed Systems · Microservices
0 likes · 16 min read
DataFunTalk
Sep 29, 2018 · Big Data

Applying HBase in a Risk‑Control System and High‑Availability Practices

This article summarizes Guo Dongdong’s presentation on leveraging HBase for a risk‑control platform, detailing its architecture, data import/export mechanisms, indexing, region server recovery challenges, monitoring, SQL interception, dual‑cluster high‑availability, and future enhancements for large‑scale, low‑latency big‑data services.

Distributed Systems · HBase · Phoenix
0 likes · 13 min read
Manbang Technology Team
Sep 28, 2018 · Backend Development

Global Unique Ordered ID Generation in Distributed Systems

This article presents Cantor, a highly available scheme for generating globally unique, ordered IDs. It addresses the limitations of UUIDs and auto-increment IDs in distributed systems by combining time-based and node-specific sequences for scalability and performance.

Cantor · Distributed Systems · Global ID Generation
0 likes · 11 min read
iQIYI Technical Product Team
Sep 21, 2018 · Backend Development

Microservice Practices and Lessons from iQIYI Video Backend Development Team

The iQIYI video backend team outlines their microservice journey, detailing service decomposition strategies, choosing Spring Cloud for its low migration cost and rich ecosystem, and building a shared platform of registries, configuration, gateways, monitoring, and CI/CD to boost efficiency, reliability, and scalability while planning future adoption of service mesh and domain‑driven design.

Cloud Native · Distributed Systems · Microservices
0 likes · 14 min read
Mike Chen's Internet Architecture
Sep 17, 2018 · Backend Development

Session Sharing Solutions in Distributed Environments

This article explains the concept of web sessions, the consistency challenges that arise in distributed server clusters, and evaluates four common solutions—session replication, session affinity, cookie‑based sessions, and dedicated session servers—detailing their use cases, advantages, and drawbacks.

Distributed Systems · Session · session replication
0 likes · 6 min read
Mike Chen's Internet Architecture
Sep 13, 2018 · Operations

Common Open‑Source Monitoring Systems and Zabbix Monitoring Process

The article introduces common open‑source monitoring tools such as Zabbix and Nagios, explains why distributed systems need proactive health checks, compares features, and provides a detailed Zabbix monitoring workflow including data collection, storage, visualization, alerting, and specific metrics for servers, networks, JVM and MySQL.

Distributed Systems · Nagios · Operations
0 likes · 8 min read