Tagged articles

High Availability

1447 articles · Page 15 of 15
21CTO
Feb 9, 2016 · Databases

Understanding Database Clustering: Architectures, Benefits, and Challenges

This article explores the importance of database clustering in modern information systems, outlines the challenges of performance, availability, and scalability, and compares Shared‑Disk and Shared‑Nothing architectures along with their advantages, drawbacks, and real‑world implementations.

High Availability · database clustering · scalability
0 likes · 10 min read
21CTO
Feb 9, 2016 · Cloud Computing

How Tencent Cloud Powers WeChat’s Massive New Year Red Envelope Storm

During the Chinese New Year, Tencent Cloud handled billions of WeChat red‑envelope transactions by employing multi‑region load balancing, disaster‑recovery architectures, database sharding, and high‑throughput caching to ensure seamless, high‑availability service for millions of concurrent users.

High Availability · distributed systems
0 likes · 5 min read
21CTO
Feb 4, 2016 · Backend Development

Key Principles for Building Scalable Distributed Web Systems

This article outlines essential design principles for large‑scale web architectures—including availability, performance, reliability, scalability, manageability and cost—and demonstrates their application through a detailed image‑hosting service example, covering services, redundancy, partitioning, caching, proxies, indexing, load balancing, and queuing to achieve efficient, scalable data access.

Caching · Data Partitioning · High Availability
0 likes · 37 min read
21CTO
Jan 28, 2016 · Operations

How to Build High‑Availability Systems: Lessons from a Transaction Platform Evolution

This article shares practical insights on achieving high availability by understanding goals, decomposing requirements, designing resilient architectures, ensuring operability, testing rigorously, and reducing release risk, illustrated through the multi‑stage evolution of a transaction system.

High Availability · Microservices · Monitoring
0 likes · 14 min read
High Availability Architecture
Jan 21, 2016 · Databases

Design Considerations for a Scalable High‑Availability Weibo Distributed Storage System

The article outlines the design requirements, sharding strategies, capacity planning, cost trade‑offs, and indexing techniques for building a highly available, scalable distributed storage solution for Weibo, focusing on storage‑layer decisions without involving caching or ID generation.

Database Design · Distributed storage · High Availability
0 likes · 8 min read
High Availability Architecture
Jan 17, 2016 · Operations

High‑Availability Architecture and Scaling Practices of Meipai Short‑Video Platform

This article outlines Meipai’s evolution into a billion‑user short‑video platform, detailing its high‑availability, scalable architecture, service discovery via etcd, data storage challenges, CDN and cloud‑storage redundancy, fault‑tolerance mechanisms, and future directions such as H.265 adoption and P2P‑CDN hybrid delivery.

CDN · Etcd · High Availability
0 likes · 18 min read
ITPUB
Jan 8, 2016 · Databases

Mastering Oracle RAC: Best Practices, Common Pitfalls, and Real-World Cases

This technical session covers Oracle RAC high‑availability best practices, installation steps, daily operational commands, detailed case studies of auto‑start checks, version‑mix issues, addNode failures, network heartbeat problems, and client connection errors, plus a concise Q&A on uninstall, SCAN vs VIP, and split‑brain detection.

High Availability · Installation · Oracle
0 likes · 21 min read
21CTO
Dec 30, 2015 · Operations

Designing a 100k-Server Monitoring System: Architecture and Key Lessons

This article shares the architecture, design principles, challenges, and performance‑optimizing solutions behind a 100,000‑server monitoring system, covering data collection agents, distributed pipelines, real‑time alerts, high throughput, multi‑platform support, and practical lessons learned.

High Availability · Performance Optimization · Server monitoring
0 likes · 10 min read
High Availability Architecture
Dec 18, 2015 · Operations

Weibo's Multi-Data-Center (Active‑Active) Architecture: Experience, Challenges, and Best Practices

The article details Weibo's journey in building a multi‑data‑center active‑active architecture, covering its evolution, technical challenges such as latency and data synchronization, the adopted MCQ‑based messaging solution, operational best practices, and future directions for high‑availability deployments.

Caching · High Availability · Multi-Data Center
0 likes · 16 min read
MaGe Linux Operations
Dec 16, 2015 · Databases

Mastering MySQL Replication: Concepts, Architectures, and Advanced Strategies

This article explains MySQL replication fundamentals, detailing master‑slave and multi‑master architectures, thread roles, configuration parameters, read‑write splitting, multi‑source setups, and advanced scenarios such as multi‑level and circular replication, while highlighting common pitfalls and performance considerations.

Database Architecture · High Availability · Master‑Slave
0 likes · 7 min read
21CTO
Dec 5, 2015 · Cloud Computing

How Momo Scales to Billions: High‑Availability Communication Architecture for Mobile Social Apps

Facing 1.6 to 1.7 billion daily requests, Momo’s tech director explains how the company built a highly available, low‑latency communication protocol and distributed infrastructure, leveraging virtualization, OpenStack, and cloud servers, to overcome weak mobile networks and serve its massive social‑gaming user base.

Communication Protocol · High Availability · Mobile Networking
0 likes · 4 min read
ITPUB
Nov 30, 2015 · Backend Development

How 58.com Achieves Read/Write High Availability with Dual‑Master and Redis Caching

The article explains 58.com’s service‑oriented architecture that combines dual‑master MySQL replication, a standby database, and Redis caching to provide high‑availability read/write operations, outlines their scaling process, and details a four‑step cache‑consistency strategy.

High Availability · Service Architecture · database replication
0 likes · 6 min read
High Availability Architecture
Nov 25, 2015 · Backend Development

Unified Service Architecture for JD’s Billion-Scale Product Detail Page during Double Eleven

This article describes JD’s unified service architecture for the product detail page that handled tens of billions of requests during Double Eleven, detailing the multi‑level caching strategy, Nginx+Lua access layer, service isolation, degradation mechanisms, and front‑end logic offloading.

Caching · High Availability · Lua
0 likes · 22 min read
Efficient Ops
Nov 24, 2015 · Operations

Achieve Web High Availability and Static/Dynamic Separation with HAProxy & Keepalived

This article walks through implementing web high availability and static‑dynamic content separation using HAProxy combined with Keepalived, covering load‑balancing concepts, VRRP basics, step‑by‑step configuration of time sync, hostnames, SSH trust, installing required packages, and testing failover scenarios.

HAProxy · High Availability · Keepalived
0 likes · 13 min read
21CTO
Nov 14, 2015 · Fundamentals

Inside Taobao’s High‑Performance Distributed File System (TFS): Architecture & Scaling

This article explains the design, storage mechanisms, high‑availability architecture, scaling strategies, multi‑data‑center disaster recovery, operational management, and future plans of Taobao’s distributed file system (TFS), a highly available and scalable storage solution for massive unstructured data.

Distributed File System · High Availability · TFS
0 likes · 14 min read
21CTO
Nov 13, 2015 · Operations

Inside Taobao’s High‑Performance Distributed File System (TFS): Architecture & Scaling

Taobao’s File System (TFS) is a highly available, high‑performance distributed storage solution built on Linux servers, featuring name‑server and data‑server clusters, block‑level replication, HA mechanisms, client caching, seamless scaling, multi‑data‑center disaster recovery, and open‑source support for C++, Java, and Nginx integration.

Distributed File System · High Availability · Taobao
0 likes · 15 min read
Qunar Tech Salon
Nov 5, 2015 · Databases

Design and Architecture of TDSQL: A Distributed MySQL‑Based SQL System

The article describes how the limitations of an in‑memory NoSQL HOLD platform led to the creation of TDSQL, a distributed MySQL‑based SQL system featuring ZooKeeper‑coordinated Scheduler, Agent, and Gateway modules, automatic cross‑IDC disaster recovery, transparent horizontal scaling, strong synchronous replication, and future integration with container technologies.

High Availability · Sharding · TDSQL
0 likes · 19 min read
Qunar Tech Salon
Nov 4, 2015 · Backend Development

Evolution of 58.com Architecture: From Single‑Server All‑In‑One to Scalable Service‑Oriented System

The article chronicles how 58.com’s web architecture evolved from a tiny, single‑machine setup to a multi‑layer, Java‑based, highly available service‑oriented platform, detailing the technical decisions, scaling challenges, and automation practices adopted at each traffic milestone.

High Availability · architecture · scalability
0 likes · 14 min read
Efficient Ops
Sep 23, 2015 · Operations

How Tencent Powers Millions with SET‑Based NoSQL Clusters

Tencent’s operations team explains how its SET‑based NoSQL clusters deliver ultra‑low latency, high availability, and seamless disaster recovery for billions of users, detailing deployment models, synchronization mechanisms, cost‑saving techniques, and the Data‑as‑Service approach that underpins its massive social platforms.

Data as a Service · High Availability · NoSQL
0 likes · 12 min read
21CTO
Sep 16, 2015 · Databases

How TDSQL Achieves Scalable, High‑Availability Distributed SQL on MySQL

This article explains how TDSQL transforms MySQL into a distributed, high‑availability SQL system by addressing NoSQL limitations, introducing a Scheduler‑Agent‑Gateway architecture, automatic scaling, sharding, robust disaster‑recovery mechanisms, and future integration with container technologies.

Auto Scaling · High Availability · Sharding
0 likes · 19 min read
21CTO
Aug 31, 2015 · Backend Development

Scaling JD.com’s Product Detail Pages with Dynamic, High‑Performance Architecture

This article details the evolution and redesign of JD.com’s product detail page architecture, describing the transition from static HTML generation to a dynamic, high‑performance, multi‑datacenter system built on key‑value storage, Nginx + Lua, asynchronous processing, multi‑level caching, and robust scaling and reliability strategies.

Caching · High Availability · Key-Value Store
0 likes · 34 min read
High Availability Architecture
Aug 31, 2015 · Backend Development

High‑Availability Architecture for JD.com Product Detail Pages

This article describes how JD.com redesigned its product detail page system from a static, cache‑heavy architecture to a fully dynamic, multi‑level cached service using Nginx+Lua, JIMDB, and asynchronous workers, addressing scalability, performance, and high‑availability challenges for billions of daily page views.

Caching · High Availability · Jimdb
0 likes · 30 min read
Architect
Aug 26, 2015 · Backend Development

Design Considerations for Master/Slave Distributed Cache with Proxy and CAS

The article analyzes the use of a master/slave architecture for distributed caching, explains why two clusters, CAS, and proxy are employed, discusses consistency and availability challenges, and evaluates possible mitigation strategies for cache failures.

CAS · High Availability · Master‑Slave
0 likes · 7 min read
Java High-Performance Architecture
Aug 24, 2015 · Operations

How DRBD Enables Real-Time Block-Level Replication for High Availability

DRBD (Distributed Replicated Block Device) is a software‑based, network‑driven block replication solution that mirrors disks, partitions, or logical volumes across servers in real time, offering synchronous and asynchronous modes, transparent failover, and a middle‑layer between the filesystem and physical storage.

Asynchronous Replication · DRBD · High Availability
0 likes · 3 min read
21CTO
Aug 15, 2015 · Backend Development

Inside Weibo’s Third‑Generation Backend Architecture: Scalability and High‑Availability

An in‑depth look at Weibo’s evolution to its third‑generation backend system, detailing the orthogonal decomposition model, three‑tier horizontal layering, key middleware such as MCQ, Motan RPC, SSDCache, and the WatchMan tracing platform that together enable high‑availability, massive concurrency, and low‑latency services for billions of users.

Distributed Tracing · High Availability · Middleware
0 likes · 12 min read
High Availability Architecture
Aug 7, 2015 · Backend Development

Highlights from High‑Availability Architecture Discussions: Plugins, Kanban Tools, Scheduling Services, Massive Tables, and ID Generation

This article compiles recent high‑availability architecture discussions covering plugin‑based system design, simple kanban tools, independent scheduling services, challenges of managing 60‑billion‑row tables, and the trade‑offs between UUIDs and custom distributed ID generators, offering practical insights for backend engineers.

High Availability · Plugins · Scheduling
0 likes · 7 min read
High Availability Architecture
Jul 22, 2015 · Backend Development

Designing Uber’s High‑Availability Messaging System: Fault Tolerance, Sharding, and Multi‑Data‑Center Strategies

The article details Uber senior engineer Zhao Lei’s presentation on building a highly available messaging platform, covering single‑point failure mitigation, sharding approaches, large‑scale outage handling, cross‑region failover, and the practical engineering practices and protocols used to keep billions of users online.

High Availability · backend · fault tolerance
0 likes · 16 min read
Java High-Performance Architecture
Jul 19, 2015 · Operations

How Keepalived Enables Automatic Failover and Load Balancing for High‑Availability Clusters

Keepalived, built on LVS, automatically monitors service nodes, isolates failed machines, and performs seamless failover between load balancers using VRRPv2, allowing web servers to be added or removed without manual intervention, thus ensuring high availability and efficient load distribution.

Cluster Monitoring · High Availability · LVS
0 likes · 2 min read
Qunar Tech Salon
Jul 6, 2015 · Databases

Design, Deployment, and Lessons Learned from Codis and RebornDB: A Proxy‑Based Distributed Redis Solution

This article presents an in‑depth overview of Codis and its next‑generation project RebornDB, covering Redis, Redis Cluster, the proxy‑based architecture, consistency trade‑offs, production deployment experiences, operational pitfalls, and broader perspectives on distributed databases and architectures.

Codis · High Availability · RebornDB
0 likes · 20 min read
Art of Distributed System Architecture Design
May 27, 2015 · Backend Development

WhatsApp’s High‑Reliability Architecture for 450 Million Users

This article examines WhatsApp’s high‑reliability architecture that supports 450 million users, detailing its Erlang‑based backend, hardware choices, scaling techniques, performance metrics, monitoring tools, and lessons learned from achieving up to two million concurrent connections on a single server.

Erlang · High Availability · WhatsApp
0 likes · 18 min read

Understanding Kafka High Availability: Data Replication and Leader Election

The article explains why Kafka introduced high availability starting with version 0.8, detailing the need for data replication and leader election, describing replica distribution algorithms, replication mechanics, ISR handling, ZooKeeper structures, and the broker failover process to ensure fault‑tolerant streaming.

High Availability · Leader Election · Zookeeper
0 likes · 19 min read
Alibaba Cloud Infrastructure
Apr 26, 2015 · Cloud Computing

Key Topics from the 2015 Beijing QCon: Asynchronous Processing, DRC Data Replication, High Availability, and Cloud Database Operations

The 2015 Beijing QCon highlighted four technical talks covering asynchronous processing in distributed systems, the DRC data‑replication infrastructure, minute‑level high‑availability fault recovery, and cloud‑era database operations, illustrating Alibaba's approaches to scalability and reliability in modern cloud platforms.

Data Replication · High Availability · QCon
0 likes · 6 min read

Designing a High‑Availability, Auto‑Scaling KV Storage System Based on Memcached and Redis

This article examines common NoSQL key‑value stores such as Memcached and Redis, compares their strengths and limitations, and proposes a distributed architecture with routing, storage, management, and migration nodes that achieves high availability, automatic fault‑tolerance, load balancing, and elastic scaling.

Elastic Scaling · High Availability · KV store
0 likes · 15 min read
MaGe Linux Operations
Oct 28, 2014 · Databases

Redis vs MySQL & Memcached: Key Differences, Use‑Cases, and HA Design

This article compares Redis with MySQL, outlines their similarities and differences, examines Redis alongside Memcached, EhCache, and OSCache, and proposes a simple high‑availability architecture for Redis, highlighting performance, data model, scalability, and operational considerations.

Database Comparison · High Availability · Redis
0 likes · 6 min read
MaGe Linux Operations
Aug 22, 2014 · Operations

Understanding Linux Clusters: Differences, Types, and Key Features

This article explains what a Linux cluster is, contrasts it with distributed systems, outlines its two main characteristics—scalability and high availability—along with essential capabilities like load balancing and error recovery, and details common cluster types such as high‑availability, load‑balancing, and high‑performance computing clusters.

HPC · High Availability · Linux
0 likes · 10 min read
Baidu Tech Salon
Apr 22, 2014 · Operations

Baidu's Optimization of MooseFS and Redis: Architecture Improvements and Performance Enhancement

At Baidu’s 49th Technical Salon, Cheng Yishi explained how the company revamped its MooseFS and Redis systems by adding a Shadow Master to split reads from writes, introducing Slave nodes for failover, and deploying a Redis proxy middleware, thereby dramatically improving performance, scalability, and high‑availability for critical services.

Baidu · Distributed storage · High Availability
0 likes · 6 min read