Tagged articles
232 articles
IT Architects Alliance
Nov 9, 2021 · Operations

Why Scale and How: Hardware Expansion, AKF Splitting Principle, Distributed ID Generation, and Elastic Scaling

The article explains the reasons for scaling, outlines hardware and component expansion strategies, introduces the AKF splitting principle for distributed systems, discusses database clustering and distributed ID generation methods such as UUID and Snowflake, and describes elastic scaling challenges and solutions.

Distributed Systems · ID generation · capacity planning
0 likes · 14 min read
ITFLY8 Architecture Home
Nov 8, 2021 · Operations

How to Scale Your System: From Hardware Expansion to Distributed ID Strategies

This article explains why capacity expansion is necessary, outlines hardware and component scaling strategies, introduces the AKF splitting principle for Redis clusters, discusses challenges of distributed scaling such as data consistency and high concurrency, and reviews database clustering and distributed ID generation methods like UUID and Snowflake.

AKF principle · capacity planning · database clustering
0 likes · 14 min read
Java High-Performance Architecture
Nov 1, 2021 · Operations

Why Scaling Matters: Hardware Expansion, Distributed ID & Elastic Capacity Strategies

The article explains why performance optimization has limits and outlines practical scaling methods—including whole‑machine and component upgrades, AKF splitting, database clustering, distributed ID generation (UUID and Snowflake), and elastic scaling—while also discussing the challenges each approach introduces.

ID generation · capacity planning · database clustering
0 likes · 14 min read
21CTO
Oct 30, 2021 · Operations

Scaling Systems: Hardware Expansion, Distributed IDs, and Elastic Capacity

This article explains why capacity expansion is necessary, outlines hardware and component scaling strategies, introduces AKF splitting principles, discusses database clustering and distributed ID generation methods such as UUID and Snowflake, and highlights the benefits and challenges of elastic scaling.

capacity planning · distributed-id · elastic scaling
0 likes · 13 min read
IT Architects Alliance
Oct 24, 2021 · Databases

Database Capacity Planning and Scaling with ScyllaDB

This article explains why database capacity planning is challenging and presents a systematic approach—including workload analysis, performance modeling, consistency considerations, and node scaling decisions—using the open‑source NoSQL database ScyllaDB to guide accurate capacity estimation.

Consistency · NoSQL · Performance Modeling
0 likes · 14 min read
21CTO
Oct 21, 2021 · Databases

Why Is Database Capacity Planning So Hard? Simplify with ScyllaDB

This article explains why sizing a database cluster is challenging, outlines a step‑by‑step methodology for estimating workload, configuration and performance, discusses the impact of consistency levels, secondary indexes, materialized views and maintenance, and shows how ScyllaDB can be used to model and simplify capacity planning.

Consistency · Database Capacity · NoSQL
0 likes · 16 min read
Laravel Tech Community
Oct 19, 2021 · Backend Development

Redis Scaling Strategies: Partitioning, Master‑Slave Replication, Sentinel, and Cluster

This article introduces various Redis scaling solutions—including basic partitioning, master‑slave replication, Sentinel high‑availability, and Redis Cluster—explaining their concepts, typical usage patterns, configuration commands, advantages, and drawbacks to help developers choose the right approach for high‑traffic environments.

Cluster · Partitioning · Replication
0 likes · 12 min read
IT Xianyu
Sep 15, 2021 · Databases

Database Types, Bottlenecks, Optimization Strategies and Scaling Techniques

This article explains the classification of relational and NoSQL databases, analyzes common performance bottlenecks such as query latency, large fields, and write overhead, and presents practical optimization methods including caching, proper indexing, transaction handling, read‑write separation, and sharding for large‑scale systems.

caching · databases · indexing
0 likes · 17 min read
Architect
Sep 8, 2021 · Databases

Redis Scaling Solutions: Partitioning, Master‑Slave, Sentinel, and Cluster

This article explains how to extend Redis beyond a single instance by covering partitioning, master‑slave replication, Sentinel automatic failover, and Redis Cluster, describing their usage methods, advantages, and drawbacks for high‑traffic, high‑availability scenarios.

Cluster · Partitioning · redis
0 likes · 11 min read
Architects Research Society
Aug 23, 2021 · Fundamentals

Agile Architecture Strategies for Scaling Agile Development

This article explains how agile architecture differs from traditional approaches, outlines the full lifecycle of agile architecture, defines responsibilities, introduces the role of an architecture owner, and provides practical guidance for modeling, scaling, communicating, and evolving architecture in large‑scale agile projects while avoiding over‑engineering.

Agile Architecture · architecture modeling · scaling
0 likes · 40 min read
Full-Stack DevOps & Kubernetes
Jul 19, 2021 · Cloud Native

Mastering Kubernetes Node Isolation, Scaling, and Rolling Updates – Practical Commands and Tips

This guide walks through essential Kubernetes operations such as isolating and recovering nodes, expanding clusters with new nodes, dynamically scaling Pods, managing Labels, scheduling Pods to specific Nodes, performing rolling updates, and configuring high‑availability for etcd and Master components, all with concrete command‑line examples and YAML snippets.

Kubernetes · Node Management · Rolling Update
0 likes · 19 min read
Full-Stack Internet Architecture
Apr 24, 2021 · Databases

Deep Dive into Redis Cluster Architecture and Principles

This article provides a comprehensive analysis of Redis Cluster, covering node and slot assignment, command execution, resharding, redirection, fault‑tolerance, gossip communication, scaling strategies, configuration limits, and practical code examples for building and operating a high‑availability sharded Redis deployment.

Cluster · Gossip Protocol · failover
0 likes · 21 min read
ITFLY8 Architecture Home
Apr 22, 2021 · Operations

Designing Highly Available Stateless Services: Load Balancing and Scaling Strategies

This article explains how to build highly available stateless services by using redundant deployment, vertical and horizontal scaling, various load‑balancing algorithms, and automatic recovery mechanisms, while also covering monitoring, high‑concurrency identification, and the role of CDN and OSS in resilient architecture.

CDN · OSS · high availability
0 likes · 10 min read
Liangxu Linux
Apr 12, 2021 · Cloud Native

Understanding Kubernetes Architecture: From Master Nodes to Service Discovery

This article provides a concise yet comprehensive overview of Kubernetes, covering its core architecture, the workflow of creating deployments, pod fundamentals, scaling and rolling updates, networking basics, service discovery, and external access methods such as NodePort, LoadBalancer, and Ingress.

Kubernetes · Microservices · container orchestration
0 likes · 12 min read
Alibaba Terminal Technology
Feb 8, 2021 · Cloud Computing

How Alibaba’s Serverless FC Powers Double‑12 Traffic Peaks Without Ops

This article explains how Alibaba Cloud Function Compute (FC) was integrated into internal services to support the Double‑12 promotion, detailing the business requirements, network challenges, hybrid‑cloud solutions, container types, auto‑scaling, rate‑limiting, and the serverless platform’s deployment and gray‑release mechanisms.

Function Compute · Serverless · hybrid cloud
0 likes · 15 min read
21CTO
Feb 1, 2021 · Databases

Mastering Redis Cluster: Step‑by‑Step Setup, Scaling, and Failover Guide

This tutorial walks through building a Redis Cluster on Redis 6.0+, covering node startup, handshaking, slot assignment, master‑slave replication, command routing, failover handling, and practical scaling operations such as adding, rebalancing, and removing nodes using redis‑cli commands.

CLI · Cluster · failover
0 likes · 22 min read
Architecture Digest
Jan 28, 2021 · Databases

Practical Guide to Setting Up and Scaling a Redis Cluster (Redis 6.0+)

This article provides a step‑by‑step tutorial on building a Redis Cluster on a single server, covering node configuration, cluster handshaking, slot assignment, master‑slave replication, command routing, failover handling, and practical scaling operations such as adding and removing nodes using redis‑cli.

Cluster · Redis CLI · database
0 likes · 22 min read
DataFunTalk
Jan 23, 2021 · Artificial Intelligence

Feature Engineering: Mapping Raw Data to Machine‑Learning Features and Best Practices

This article explains how feature engineering transforms raw data into numerical representations for machine‑learning models, covering mapping of numeric and categorical values, one‑hot and multi‑hot encoding, sparse representations, scaling, handling outliers, binning, data quality checks, and feature interactions to capture non‑linear relationships.

data preprocessing · encoding · feature engineering
0 likes · 20 min read
21CTO
Jan 6, 2021 · Databases

Step-by-Step Guide to Building and Scaling a Redis Cluster on Redis 6.0+

This tutorial walks through setting up a Redis Cluster on a single server with six nodes, covering node startup, handshake, slot assignment, master‑slave replication, command routing, fault‑tolerance, and practical scaling operations such as adding, rebalancing, and removing nodes.

Cluster · Redis 6.0 · Redis CLI
0 likes · 25 min read
Code Ape Tech Column
Dec 23, 2020 · Fundamentals

Technical Concepts Illustrated Through Relationship Analogies

The article humorously maps various relationship scenarios to core IT concepts such as backup strategies, high‑availability mechanisms, scaling methods, security measures, cloud services, and big‑data techniques, providing an engaging overview of fundamental system design principles.

Backup · Big Data · System Design
0 likes · 8 min read
JD Cloud Developers
Nov 17, 2020 · Databases

How JD Cloud’s JCHDB Powered the 11.11 Shopping Festival’s Massive Data Surge

This article explains how JD Cloud’s JCHDB database handled PB‑level data growth during the 11.11 shopping festival, detailing the high‑availability architecture, performance optimizations, scaling techniques, and the eight‑step preparation process that enabled millions of queries per second and terabit‑level traffic.

cloud · e‑commerce · high-availability
0 likes · 8 min read
New Oriental Technology
Nov 17, 2020 · Frontend Development

Solving Double-Tap Zoom Issues in Touch Devices

This article explores the challenges and solutions for implementing double-tap zoom functionality in touch-enabled devices, addressing common problems with existing methods and proposing a mathematical approach to achieve accurate scaling and panning.

Touch Events · Touch Interaction · UI Development
0 likes · 9 min read
Full-Stack Internet Architecture
Oct 22, 2020 · Cloud Native

Kubernetes Overview, Architecture, and Hands‑On Deployment with Minikube

This article introduces Kubernetes fundamentals, explains its production‑grade nature, container concepts, orchestration features, core architecture, and provides step‑by‑step commands for installing Minikube, creating a cluster, deploying an Nginx application, exposing it as a service, scaling, updating, and deleting the deployment.

Deployment · Kubernetes · Service
0 likes · 16 min read
Architecture Digest
Sep 14, 2020 · Databases

Understanding the Underlying Mechanics of Elasticsearch and Lucene

This article provides a comprehensive, top‑down and bottom‑up explanation of Elasticsearch’s internal architecture, covering clusters, nodes, shards, Lucene segments, inverted indexes, stored fields, document values, caching, merging, routing, scaling, and query processing, while addressing common performance questions.

Elasticsearch · caching · lucene
0 likes · 11 min read
Programmer DD
Aug 23, 2020 · Databases

How to Overcome Database Bottlenecks with Sharding: Strategies and Tools

This article explains common I/O and CPU bottlenecks in databases, compares horizontal and vertical sharding techniques, outlines practical partitioning strategies, introduces popular sharding tools, and provides step‑by‑step guidance for implementing and scaling sharded architectures.

Performance Optimization · Vertical Partitioning · database sharding
0 likes · 10 min read
Architect
Aug 16, 2020 · Databases

Database Bottlenecks and Sharding: Strategies, Tools, and Implementation Steps

This article explains common I/O and CPU bottlenecks in databases, introduces horizontal and vertical sharding concepts, compares sharding tools, outlines practical sharding steps, discusses typical sharding issues such as non‑partition queries and expansion, and provides a concise summary and example implementation.

Partitioning · performance · scaling
0 likes · 10 min read
New Oriental Technology
Aug 11, 2020 · Backend Development

Engineering Case Study of New Oriental Cloud Classroom Backend Architecture and Scaling During the Pandemic

The article details how New Oriental's Cloud Classroom backend, built with Java, Spring, MySQL, Redis, Kafka, Sentinel, and other modern technologies, scaled to support millions of users and a hundred‑fold surge in demand during the pandemic through architectural optimizations, distributed caching, traffic control, and rapid performance improvements.

Distributed Systems · Java · Kafka
0 likes · 7 min read
Manbang Technology Team
Jun 8, 2020 · Cloud Native

Design and Implementation of a Zookeeper Operator for Kubernetes

This article outlines the design, functional requirements, CRD definition, architecture, deployment, scaling, monitoring, fault‑tolerance, and upgrade strategies of a Zookeeper operator on Kubernetes, including code examples, service configurations, and integration with Prometheus and OAM standards.

CRD · Cloud Native · Kubernetes
0 likes · 18 min read
Top Architect
May 29, 2020 · Databases

Redis Scaling Strategies: Partitioning, Master‑Slave Replication, Sentinel, and Cluster

This article explains how to extend Redis beyond single‑node limits by using partitioning, master‑slave replication, Sentinel for automatic failover, and Redis Cluster with hash slots, detailing their usage, advantages, drawbacks, and configuration examples for building high‑availability and scalable in‑memory data stores.

Cluster · Partitioning · Replication
0 likes · 11 min read
Tencent Tech
Apr 27, 2020 · Cloud Computing

How Tencent’s Cloud Disk Snapshots Enable 6000 SCF Servers in 1 Minute

This article explains how Tencent Cloud’s Serverless Cloud Function (SCF) leverages Cloud Disk Snapshot technology to achieve the concurrent creation of 6000 virtual machines within a minute, detailing the snapshot‑based creation method, system architecture, performance challenges, and the engineering solutions that dramatically improve latency and bandwidth usage.

cloud computing · performance · scaling
0 likes · 8 min read
ITFLY8 Architecture Home
Apr 26, 2020 · Backend Development

How to Build a Million‑Message‑Per‑Second RabbitMQ Cluster: Lessons from Google

This article explains how to design and scale a RabbitMQ cluster capable of handling millions of messages per second, covering core concepts, Google’s large‑scale test setup, sharding and federation plugins, mirror queues, reliability mechanisms, and practical tips for high‑availability and performance optimization.

Message Queue · RabbitMQ · clustering
0 likes · 25 min read
Java Backend Technology
Mar 31, 2020 · Databases

Mastering Database Bottlenecks: When and How to Shard Effectively

This article explains common database performance bottlenecks such as IO and CPU limits, introduces horizontal and vertical sharding concepts, compares sharding tools, outlines practical implementation steps, discusses common pitfalls, and provides scaling strategies to keep your data layer responsive under heavy load.

Performance Optimization · Vertical Partitioning · database sharding
0 likes · 11 min read
Programmer DD
Mar 31, 2020 · Databases

Boost Redis Performance: Practical Optimization Techniques

This article explains why Redis performance matters for high‑traffic services and provides a comprehensive set of practical optimizations—including network latency reduction, command pipelining, avoiding slow commands, tuning persistence, OS/hardware settings, and scaling with sharding—to help you keep Redis fast and reliable.

Network Latency · Persistence · optimization
0 likes · 17 min read
Cloud Native Technology Community
Mar 30, 2020 · Cloud Native

Building a Cloud‑Native Large‑Scale Distributed Monitoring System with Prometheus

This article explains how to design and implement a cloud‑native, large‑scale distributed monitoring system using Prometheus, covering its limitations, service‑level sharding, centralized storage, federation, and high‑availability strategies to overcome scaling challenges in Kubernetes environments.

Cloud Native · Federation · Prometheus
0 likes · 12 min read
Tencent Tech
Mar 17, 2020 · Cloud Native

How Tencent Meeting Scaled Rapidly with Cloud‑Native TKE: Architecture & Practices

This article explains how Tencent Meeting leveraged Tencent Cloud’s Kubernetes Engine (TKE) and a suite of cloud‑native extensions—dynamic routing, fixed networking, parallel scaling, and controlled batch upgrades—to achieve rapid, reliable version iteration and massive capacity growth during the COVID‑19 surge.

Kubernetes · Networking · TKE
0 likes · 14 min read
ITPUB
Mar 6, 2020 · Backend Development

Why Segment Ditched Microservices for a Monolith, and What We Learned

Segment’s engineering team recounts their evolution from a simple monolith to a sprawling micro‑service ecosystem and back again, detailing queue bottlenecks, repo fragmentation, shared‑library chaos, and how consolidating everything into a single codebase restored performance, scalability, and developer productivity.

Backend · Microservices · Service Architecture
0 likes · 16 min read
Tencent IMWeb Frontend Team
Mar 4, 2020 · Frontend Development

How Tencent Classroom’s Front‑End Team Survived Pandemic Traffic Surges

During the COVID‑19 pandemic, Tencent Classroom’s front‑end team faced unprecedented traffic spikes, forcing rapid decisions on domain stability, video streaming, data platforms, messaging, monitoring, and deployment pipelines, while sharing lessons on scaling, resilience, and collaborative development under extreme pressure.

Deployment · Tencent Classroom · Video Streaming
0 likes · 13 min read
Alibaba Cloud Native
Feb 10, 2020 · Cloud Native

Mastering Kubernetes StatefulSet: Deploy, Scale, and Upgrade Stateful Apps

This guide explains how Kubernetes StatefulSet solves the challenges of deploying stateful applications by providing stable network identities, persistent storage, ordered scaling, and flexible update strategies, and walks through example manifests, creation commands, status inspection, upgrade procedures, and scaling policies.

Kubernetes · StatefulSet · Upgrade Strategies
0 likes · 21 min read
Big Data Technology Architecture
Feb 5, 2020 · Big Data

Elasticsearch Index Design: Scaling to PB/TB Levels and Best Practices

This article provides a comprehensive guide on designing Elasticsearch indices for massive data volumes, covering shard and replica sizing, mapping strategies, rollover templates, curator cleanup, tokenization choices, query type selection, and multi‑table association techniques to achieve efficient, reliable search at PB‑scale.

Elasticsearch · Mapping · Rollover
0 likes · 24 min read
JD Retail Technology
Jan 8, 2020 · Operations

Comprehensive Guide to E‑commerce Promotion Traffic Management and System Preparation

This article explains how e‑commerce promotions differ from offline sales by offering lower participation thresholds and flexible discount tactics, outlines methods for estimating and handling traffic spikes, and provides detailed strategies for system capacity planning, load testing, monitoring, and incident response to ensure stable large‑scale promotional events.

Load Testing · capacity planning · e‑commerce
0 likes · 23 min read
360 Quality & Efficiency
Jan 7, 2020 · Backend Development

Scaling a Backend: From Single Server to Reverse Proxy, Load Balancing, Microservices, Caching, and Partitioning

This article explains how to evolve a simple single‑node backend by adding a reverse proxy, introducing load balancers, scaling databases, adopting micro‑services, leveraging caches and CDNs, using message queues, and applying partitioning techniques to handle massive traffic while maintaining consistency and reliability.

Message Queue · Microservices · architecture
0 likes · 10 min read
Youzan Coder
Dec 26, 2019 · Product Management

Youzan's Demand Backlog Management: From Single Product to Multi‑Product Scaling

Youzan scales demand backlog management from a single product to multiple product lines by aligning OKR‑driven strategic goals with stakeholder input and centralizing ownership in a product‑owner‑led backlog. The process uses user stories, impact mapping, and MoSCoW prioritization, runs on fixed Scrum/Kanban cycles, splits large backlogs by domain, and relies on electronic kanban tools while continuously refining granularity and closing the value loop.

Kanban · OKR · Product Owner
0 likes · 10 min read
Didi Tech
Dec 2, 2019 · Operations

Capacity Estimation Methodology for Growing Services

The article presents a systematic capacity‑estimation methodology that links service traffic to order volume, uses CPU‑Idle as a primary metric, predicts traffic growth and upper‑bound limits, validates predictions with load‑testing, and provides scaling recommendations while noting limitations of the CPU‑Idle baseline.

Traffic Prediction · capacity planning · resource utilization
0 likes · 9 min read
Efficient Ops
Nov 20, 2019 · Databases

Mastering Codis: Seamless Redis Scaling and High‑Availability Strategies

This comprehensive guide details how Codis extends Redis with a proxy‑based architecture to achieve transparent horizontal scaling, smooth data migration, high availability, fault tolerance, and operational best‑practices, while also covering common Redis pitfalls and performance tuning.

Codis · Distributed Systems · redis
0 likes · 26 min read
Alibaba Cloud Developer
Nov 5, 2019 · Cloud Native

How Xianyu Scaled to Millions of DAUs: Inside Its Architecture Evolution

This article chronicles Xianyu’s journey from a modest tea‑room startup to a platform with tens of millions of daily active users, detailing each architectural phase—trial, development, platform, and cloud‑native integration—and the technical decisions that enabled rapid scaling, cross‑platform development, and operational efficiency.

cloud-native · mobile-development · scaling
0 likes · 13 min read
dbaplus Community
Oct 20, 2019 · Big Data

Mastering Kafka: Concepts, Installation, Optimization, and Security

This comprehensive guide covers Kafka's core concepts, design principles, installation steps, configuration tweaks, performance optimizations, permission management, common operational commands, cluster scaling, log retention settings, and monitoring scripts to help you build and maintain a robust Kafka ecosystem.

Big Data · Configuration · Installation
0 likes · 20 min read
WeChat Backend Team
Sep 3, 2019 · Artificial Intelligence

How Tencent Scaled Massive n‑gram Language Models for Real‑Time Speech Recognition

This article presents a distributed system that efficiently supports large‑scale n‑gram language models for automatic speech recognition by introducing caching, a two‑level distributed index, batch processing, and a cascading fault‑tolerance mechanism, demonstrating robust scalability and low communication overhead in Tencent's WeChat ASR service.

Language Model · N-gram · caching
0 likes · 35 min read
58 Tech
Jul 25, 2019 · Databases

Design and Evolution of WTable’s Scaling Process Using RocksDB

This article explains how the WTable distributed key‑value store leverages RocksDB’s LSM‑tree architecture and slot‑based data distribution to redesign its scaling workflow, separating full and incremental data migration to reduce compaction overhead and achieve high‑speed, low‑impact cluster expansion.

Data Migration · RocksDB · WTable
0 likes · 8 min read
Programmer DD
Jul 4, 2019 · Backend Development

Why We Dropped 140+ Microservices for a Single Monolith—and What We Learned

The article recounts Segment's journey from a monolithic system to a sprawling micro‑service architecture, the operational pain points that emerged, and how consolidating over 140 services into a single codebase improved testing speed, deployment simplicity, and overall developer productivity while revealing new trade‑offs.

Backend · Service Architecture · monolith
0 likes · 16 min read
21CTO
Jun 27, 2019 · Operations

From Hundreds to Thousands: Scaling Operations and Building a Custom Monitoring System

This article recounts AdMaster's five‑year journey from a few dozen servers to thousands, detailing the evolution of their monitoring infrastructure, the challenges faced at each scale stage, and the design of a self‑built, distributed monitoring platform that delivers real‑time alerts, visualized data, and business‑level insights.

Infrastructure · Operations · scaling
0 likes · 14 min read
Java Backend Technology
Jun 19, 2019 · Backend Development

Enterprise Redis: Scaling, Monitoring, and Business Isolation

This article explores how enterprises can effectively use Redis by partitioning clusters for independent or shared use, addressing key naming conflicts, implementing graceful scaling with Zookeeper, monitoring performance via Open-Falcon, and quickly isolating problematic business traffic to maintain system stability.

Business Isolation · Cluster · monitoring
0 likes · 10 min read
Qunar Tech Salon
Feb 19, 2019 · Operations

Forbidden City Night Festival Ticketing Chaos and How to Recover a Crashed Website

The article recounts the Forbidden City’s first night‑time Lantern Festival event, the overwhelming demand that caused the museum’s ticketing website to crash, and includes an interview with a senior operations engineer who explains the causes of such overloads and outlines rapid mitigation and scaling strategies.

Operations · scaling · system reliability
0 likes · 6 min read
Ctrip Technology
Dec 26, 2018 · Databases

CTrip’s Large‑Scale Redis Containerization: Architecture, Practices, and Lessons Learned

This article details CTrip’s experience of containerizing a 200 TB+ Redis deployment with millions of queries per second, covering the motivations, architecture, Kubernetes strategies, performance testing, operational challenges, and the practical solutions they devised to achieve high scalability and resource efficiency.

Kubernetes · Resource Management · containerization
0 likes · 15 min read
Java Backend Technology
Dec 4, 2018 · Databases

Mastering MySQL: A Practical Knowledge Map of Deployment Scenarios

This article presents a comprehensive knowledge map of MySQL deployment scenarios—including single‑master, master‑slave, master‑multiple‑slaves, horizontal and vertical clustering, and mixed modes—detailing backup methods, performance tuning, scaling strategies, and high‑availability considerations.

Backup Strategies · Database Architecture · Performance Optimization
0 likes · 8 min read
DevOps
Nov 12, 2018 · R&D Management

Microsoft's Journey to Modern Software Engineering: Scaling Agile, DevOps, and Service Maturity

Microsoft's Core Services Engineering (CSE) team transformed from a waterfall development model to an agile, DevOps‑driven process using Visual Studio Team Services, introducing engineering fundamentals, a four‑level maturity model, a scaled agile framework, and a rotating Directly Responsible Individual role to accelerate delivery, improve quality, and enhance customer satisfaction.

Continuous Delivery · DevOps · Microsoft
0 likes · 40 min read
High Availability Architecture
Nov 9, 2018 · Backend Development

Scaling Coinbase’s Platform for Spikes in Customer Demand: Lessons, Monitoring, and Traffic Replay

Since 2017, Coinbase has faced rapid cryptocurrency‑driven traffic growth, prompting a series of backend engineering improvements—including database upgrades, monitoring enhancements, relationship refactoring, caching, and a custom traffic capture‑replay system—to ensure reliability and scalability during demand spikes.

Backend · MongoDB · caching
0 likes · 9 min read
Architects' Tech Alliance
Aug 14, 2018 · Backend Development

System Splitting and Architectural Evolution: Strategies for Scaling and Decoupling

To address growing business complexity and throughput demands, this article outlines systematic approaches to decompose monolithic systems—covering horizontal and vertical scaling, application and database sharding, service governance, caching, and the evolution toward microservices—highlighting practical techniques and real-world experiences.

architecture · database sharding · scaling
0 likes · 9 min read
Tencent Cloud Developer
May 3, 2018 · Operations

Tencent Cloud Kafka Automated Operations Practices

Tencent Cloud’s senior engineer Yang Yuan explains how their managed Kafka service tackles version diversity, resource allocation, dynamic scaling, broker addition/removal, and partition migration using versioned clusters, bin‑packing algorithms, penalty weighting, and predictive scheduling to sustain trillions of messages and billions of messages per minute.

Kafka · Operations Automation · Resource Management
0 likes · 14 min read
Meitu Technology
Apr 27, 2018 · Frontend Development

Front-End Image Processing: Scaling, Cropping, and Rotation with Canvas

This article explains how to perform essential front‑end image processing with the HTML5 canvas—handling cross‑origin loading, ensuring images are fully loaded, then scaling, cropping, and rotating them while preserving aspect ratio and exporting the results as base64 strings, laying groundwork for later composition techniques.

Canvas · cropping · frontend
0 likes · 10 min read
ITPUB
Apr 19, 2018 · Databases

How Didi Scales MySQL: From Manual Ops to Full Automation

This article outlines Didi's MySQL database architecture, the challenges of managing thousands of instances, and the step‑by‑step automation framework—including dbproxy, high‑availability, backup, monitoring, and deployment modules—that reduces manual DBA work by over 70%.

DBA · Didi · Operations
0 likes · 14 min read
Architecture Digest
Mar 11, 2018 · Backend Development

Handling 1 Million Requests per Minute with Go: A Scalable Backend Architecture

The article describes how a Go‑based backend, using a two‑layer job/worker pattern with buffered channels and configurable worker pools, can reliably ingest millions of POST requests per minute, serialize payloads to Amazon S3, and dramatically reduce server count through Elastic Beanstalk auto‑scaling.

Backend · S3 · elasticbeanstalk
0 likes · 12 min read
Efficient Ops
Jan 10, 2018 · Databases

7 Proven Strategies to Supercharge MySQL Performance

This article explains why MySQL can become a bottleneck as load grows and presents seven practical techniques—using EXPLAIN, building proper indexes, tweaking defaults, caching data in memory, adopting SSDs, scaling horizontally, and improving visibility—to keep MySQL fast and reliable.

explain · mysql · optimization
0 likes · 15 min read
Qunar Tech Salon
Dec 28, 2017 · Databases

7 Essential Tips for Optimizing MySQL Performance

This article presents seven practical techniques—including using EXPLAIN, creating proper indexes, adjusting default settings, loading data into memory, leveraging SSD storage, scaling horizontally, and improving observability—to keep MySQL databases fast, stable, and responsive as workloads grow.

Configuration · SSD · indexing
0 likes · 14 min read
dbaplus Community
Dec 11, 2017 · Backend Development

How 58 Express Scaled from Startup to Industry Leader: Architecture, Sharding, and AI Dispatch

This article recounts the technical evolution of 58 Express from its early startup days through rapid growth to an intelligent dispatch era, detailing challenges, database sharding, service decomposition, big‑data analytics, AI‑driven order routing, monitoring, and lessons learned for building a high‑performance backend system.

System Architecture · database sharding · intelligent dispatch
0 likes · 21 min read
21CTO
Nov 21, 2017 · Backend Development

How Uber Scales Its Real-Time Ride‑Sharing Platform: Architecture Secrets

This article examines Uber's rapid 38‑fold growth by detailing the design, scaling techniques, and fault‑tolerance mechanisms of its real‑time market platform, including geographic indexing, microservices, distributed storage, and the DISCO scheduling system.

Distributed Systems · Uber · real-time platform
0 likes · 19 min read
Node Underground
Oct 26, 2017 · Backend Development

Mastering Node.js Scaling: Cloning, Decomposing, and Splitting Strategies

This article explains how Node.js’s built‑in cluster module and external tools like PM2 can be used to improve stability and load capacity through three scaling strategies—cloning, decomposing, and splitting—allowing applications to fully leverage multi‑core CPUs and achieve zero‑downtime restarts.

Cluster · nodejs · performance
0 likes · 2 min read
MaGe Linux Operations
Oct 11, 2017 · Operations

When Celebrities Crash Weibo: Inside the Ops Battle and Hybrid Cloud Solution

A sudden surge of traffic triggered by a celebrity relationship announcement caused a Weibo outage, prompting frantic reactions from developers, operations, and management, and leading to an in‑depth analysis of high‑availability architecture, elastic scaling, hybrid‑cloud DCP platforms, and Docker‑based service deployment.

Operations · high availability · hybrid cloud
0 likes · 19 min read
High Availability Architecture
Aug 8, 2017 · Big Data

Practical Big Data Architecture Evolution and Lessons Learned

The article reviews the evolution of big‑data architectures from a simple RDB‑centric pipeline to a SaaS‑based solution, highlighting common bottlenecks such as scaling, integration, cost, and operational complexity, and shares practical experiences and best‑practice recommendations for building efficient, maintainable data platforms.

Big Data · SaaS · architecture
0 likes · 12 min read
ITFLY8 Architecture Home
Aug 5, 2017 · Databases

Mastering Database Schema: From Normalization to Sharding and Scaling

This comprehensive guide explores essential database design principles—including normalization, denormalization, data partitioning, routing, and scaling techniques—offering practical strategies to optimize schema structures, reduce redundancy, and improve performance for both relational and NoSQL systems.

Database design · Denormalization · Schema Optimization
0 likes · 27 min read
Efficient Ops
Aug 4, 2017 · Operations

How Tencent’s ZhiYun Platform Powered the “Military Photo” Campaign with 4,000 Servers

This article details how Tencent's SNG operations team leveraged the ZhiYun intelligent operations platform—through standardized processes, massive IaaS provisioning, CMDB management, automated workflows, and real‑time capacity monitoring—to support the high‑traffic “Military Photo” H5 campaign, scaling up to 4,000 servers and 24 GB bandwidth.

Automation · CMDB · IaaS
0 likes · 10 min read
Efficient Ops
Jun 15, 2017 · Operations

How Tencent Automated Operations for a Billion‑Red‑Packet Event

This article details how Tencent automated operations for the 2016 Chinese New Year QQ red-packet activity, describing the massive traffic challenge, the architectural design, the shift from manual to CMDB-driven one-click scaling, load testing, flexible protection strategies, and on-site monitoring that enabled rapid, reliable handling of billions of red-packet transactions.

Automation · CMDB · Operations
0 likes · 20 min read
Node Underground
May 12, 2017 · Cloud Native

How Joyent’s Autopilot Pattern Revolutionizes Container Deployment and Scaling

With containers becoming mainstream, Joyent introduced the Autopilot pattern—a one‑click deployment and real‑time scaling solution that leverages container orchestration, service discovery, automatic configuration refresh, and health checks, enabling seamless CI/CD integration and cloud‑agnostic application management.

Autopilot Pattern · Deployment Automation · DevOps
0 likes · 2 min read
Efficient Ops
May 9, 2017 · Backend Development

How Tencent Scaled QQ Red Packet to 100k QPS: Architecture & Lessons

This article details how Tencent analyzed its AMS system, estimated the expected traffic, and redesigned the system for high availability during the QQ Spring Festival Red Packet event, covering architecture mapping, scaling strategies, overload protection, flexible availability, disaster recovery, monitoring, and practical lessons learned.

Backend · disaster-recovery · high-availability
0 likes · 25 min read
High Availability Architecture
Apr 13, 2017 · Backend Development

Designing a High‑Availability Advertising System: Architecture, Scaling, and Real‑Time Monitoring at Weibo

This article examines the architecture of Weibo's high‑availability advertising platform, covering match service design with OpenResty, index sharding, business logic optimization, dynamic auto‑scaling, and a real‑time monitoring pipeline to ensure stable, high‑performance ad delivery at massive scale.

Advertising · Backend Architecture · OpenResty
0 likes · 11 min read
21CTO
Apr 10, 2017 · Operations

Alibaba’s Secret to Scaling GitLab: Distributed Sharding and Performance Boosts

This article details how Alibaba Group transformed its GitLab deployment from a single‑node bottleneck into a horizontally scalable, sharded architecture that handles millions of daily requests with high availability, improved performance, and robust data safety.

GitLab · Operations · distributed-systems
0 likes · 15 min read
Efficient Ops
Apr 9, 2017 · Cloud Native

How Ctrip Achieved Seconds‑Level Scaling with a Container Cloud

Ctrip built a private container cloud to handle massive seasonal traffic spikes, enabling rapid, automated scaling and shrinking of resources, improving deployment speed, resource utilization, and operational intelligence across more than 20 business units.

Ctrip · cloud-native · containerization
0 likes · 16 min read
Alibaba Cloud Infrastructure
Apr 6, 2017 · Operations

Scaling Alibaba Group GitLab: Distributed Architecture, Sharding, and Operational Practices

This article describes how Alibaba Group transformed its GitLab platform from a single‑node deployment to a distributed, sharded architecture with proxy services, performance optimizations, multi‑datacenter backup, and automated operations to support millions of users and dramatically increase request throughput.

GitLab · distributed-systems · high-availability
0 likes · 14 min read
Qunar Tech Salon
Mar 23, 2017 · Cloud Native

Ctrip Container Cloud: Architecture, Scaling, and Operational Practices

The article explains how Ctrip's rapid business growth drove the need for elastic scaling and the adoption of container technology to achieve second-level provisioning. It then covers the design of the container cloud platform, including deployment principles, network choices, orchestration evaluations, monitoring solutions, and an overview of CDOS, offering practical insights for large-scale cloud-native operations.

DevOps · Orchestration · cloud-native
0 likes · 16 min read