Tagged articles
5 articles
Architect-Kip
Oct 28, 2025 · Operations

Mastering Failure Recovery: Fast‑Fail, Auto‑Retry, and Resilience Patterns for Distributed Systems

This guide outlines core principles and practical solutions for building resilient backend systems, covering fast‑failure handling, automatic retries with exponential back‑off, circuit‑breaker usage, idempotency, batch job strategies, online transaction patterns, and robust message‑queue processing.

Batch Processing · Idempotency · Message Queue
0 likes · 17 min read
政采云技术
Aug 2, 2022 · Fundamentals

Understanding the Chandy‑Lamport Distributed Snapshot Algorithm

This article explains the Chandy‑Lamport algorithm for capturing consistent global snapshots in distributed systems, describes its assumptions and message‑marker rules, walks through a detailed example with three processes and channels, and relates it to Apache Flink's asynchronous checkpoint mechanism.

Apache Flink · Chandy-Lamport · Distributed Systems
0 likes · 14 min read
dbaplus Community
Apr 10, 2022 · Databases

Designing a High‑Performance Distributed KV Store for Bilibili

This article details the background, architecture, core features, and operational practices of a custom high‑reliability, high‑throughput key‑value storage system that combines Raft replication, flexible partitioning, binlog support, bulk loading, and multi‑active deployment for Bilibili's diverse data workloads.

Binlog · Partitioning · Raft replication
0 likes · 22 min read
Bilibili Tech
Mar 11, 2022 · Databases

Design and Architecture of Bilibili's High‑Performance Distributed KV Store

Bilibili’s high‑performance distributed KV store combines hash and range partitioning with Raft‑based multi‑replica consistency. A Metaserver manages a topology of pools, zones, nodes, tables, shards, and replicas. Features include partition splitting, binlog streaming, multi‑active replication, bulk loading, KV‑storage separation, and automated load, leader, and health balancing, all aimed at reliable, scalable data services.

Partitioning · Raft consensus · bulk load
0 likes · 22 min read