Tagged articles
High Availability Architecture
Jul 22, 2015 · Backend Development

Designing Uber’s High‑Availability Messaging System: Fault Tolerance, Sharding, and Multi‑Data‑Center Strategies

The article details Uber senior engineer Zhao Lei’s presentation on building a highly available messaging platform, covering single‑point‑of‑failure mitigation, sharding approaches, large‑scale outage handling, cross‑region failover, and the engineering practices and protocols used to keep the service online at scale.

Backend · Messaging · fault tolerance
0 likes · 16 min read
Art of Distributed System Architecture Design
May 27, 2015 · Backend Development

WhatsApp’s High‑Reliability Architecture for 450 Million Users

This article examines WhatsApp’s high‑reliability architecture that supports 450 million users, detailing its Erlang‑based backend, hardware choices, scaling techniques, performance metrics, monitoring tools, and lessons learned from achieving up to two million concurrent connections on a single server.

Erlang · Scalability · WhatsApp
0 likes · 18 min read

Understanding Kafka High Availability: Data Replication and Leader Election

The article explains why Kafka introduced high availability starting with version 0.8, detailing the need for data replication and leader election, describing replica distribution algorithms, replication mechanics, ISR handling, ZooKeeper structures, and the broker failover process to ensure fault‑tolerant streaming.

Kafka · ZooKeeper · high availability
0 likes · 19 min read
Alibaba Cloud Infrastructure
Apr 26, 2015 · Cloud Computing

Key Topics from the 2015 Beijing QCon: Asynchronous Processing, DRC Data Replication, High Availability, and Cloud Database Operations

The 2015 Beijing QCon highlighted four technical talks covering asynchronous processing in distributed systems, the DRC data‑replication infrastructure, minute‑level high‑availability fault recovery, and cloud‑era database operations, illustrating Alibaba's approaches to scalability and reliability in modern cloud platforms.

Distributed Systems · QCon · asynchronous processing
0 likes · 6 min read

Designing a High‑Availability, Auto‑Scaling KV Storage System Based on Memcached and Redis

This article examines common NoSQL key‑value stores such as Memcached and Redis, compares their strengths and limitations, and proposes a distributed architecture with routing, storage, management, and migration nodes that achieves high availability, automatic fault‑tolerance, load balancing, and elastic scaling.

KV Store · Memcached · elastic scaling
0 likes · 15 min read
MaGe Linux Operations
Oct 28, 2014 · Databases

Redis vs MySQL & Memcached: Key Differences, Use‑Cases, and HA Design

This article compares Redis with MySQL, outlines their similarities and differences, examines Redis alongside Memcached, EhCache, and OSCache, and proposes a simple high‑availability architecture for Redis, highlighting performance, data model, scalability, and operational considerations.

database comparison · high availability · mysql
0 likes · 6 min read
MaGe Linux Operations
Aug 22, 2014 · Operations

Understanding Linux Clusters: Differences, Types, and Key Features

This article explains what a Linux cluster is, contrasts it with distributed systems, outlines its two main characteristics—scalability and high availability—along with essential capabilities like load balancing and error recovery, and details common cluster types such as high‑availability, load‑balancing, and high‑performance computing clusters.

Cluster · HPC · Linux
0 likes · 10 min read
Baidu Tech Salon
Apr 22, 2014 · Operations

Baidu's Optimization of MooseFS and Redis: Architecture Improvements and Performance Enhancement

At Baidu’s 49th Technical Salon, Cheng Yishi explained how the company revamped its MooseFS and Redis systems by adding a Shadow Master to split reads from writes, introducing Slave nodes for failover, and deploying a Redis proxy middleware, thereby dramatically improving performance, scalability, and availability for critical services.

Baidu · MooseFS · Shadow Master
0 likes · 6 min read