
Understanding Distributed Systems: CAP, BASE, Caching, Message Queues, and Practical Improvements in New Oriental's Mobile App

This article explains the fundamentals of distributed systems, covering the CAP theorem and BASE theory, caching strategies, message queues, database choices, JVM optimization, and the practical architectural improvements applied to New Oriental's mobile app to improve availability and performance.

New Oriental Technology

Preface: ancient philosophers asked about the ultimate origin of the world. The article draws a parallel to the ultimate question for an enterprise: who is the end user, and how should they be served?

Comparison of B2B (IBM) and B2C (Didi) technology: B2B focuses on functionality, long‑term contracts, and low emphasis on user experience, while B2C emphasizes user experience, rapid incident resolution, and continuous availability.

Discussion of distributed systems: from traditional bank teller systems to modern internet e‑commerce, highlighting the shift from centralized to distributed architectures and the importance of scalability.

CAP theorem explanation: of consistency (C), availability (A), and partition tolerance (P), a distributed system can guarantee at most two at the same time, so the design has to trade one off against the others.

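Availability (A) means every request gets a timely, non-error response; in practice it is measured by response time and correct status codes. Partition tolerance (P) means the system keeps working when network failures cut nodes off from each other; once data is replicated across nodes, partitions must be expected, and replicas can then diverge. Consistency (C) means all nodes see the same data at the same time. During a partition, the system therefore has to choose between C and A.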
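The choice between C and A during a partition can be made concrete with a small sketch. The `Replica` class, the `reachable` flag, and both write functions below are hypothetical names invented for illustration; real systems detect partitions through timeouts and usually require a majority rather than every replica.

```python
class PartitionError(Exception):
    """Raised when a write is refused to protect consistency."""


class Replica:
    """Toy replica: holds one value and a flag simulating network reachability."""

    def __init__(self):
        self.value = None
        self.reachable = True


def write_cp(replicas, value):
    """CP choice: refuse the write unless every replica can be updated."""
    if not all(r.reachable for r in replicas):
        raise PartitionError("partition detected; rejecting write")
    for r in replicas:
        r.value = value


def write_ap(replicas, value):
    """AP choice: accept the write on reachable replicas; the rest go stale."""
    for r in replicas:
        if r.reachable:
            r.value = value
```

With one replica unreachable, `write_cp` fails the request (consistency kept, availability lost), while `write_ap` succeeds and leaves the replicas disagreeing until they are reconciled.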

BASE theory builds on CAP: Basically Available, Soft state, and Eventual consistency. It describes how a system can tolerate temporary inconsistency between replicas while remaining usable, provided the replicas converge once updates stop arriving.
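A minimal sketch of soft state and eventual consistency, assuming each update carries a monotonically increasing version number assigned elsewhere (an assumption made for this example; real systems may use timestamps, vector clocks, or a replicated log instead). `VersionedReplica` and `reconcile` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VersionedReplica:
    version: int = 0
    value: Optional[str] = None

    def apply(self, version, value):
        """Accept an update only if it is newer than what this replica holds."""
        if version > self.version:
            self.version, self.value = version, value


def reconcile(replicas):
    """Background sync: bring every replica up to the newest known version."""
    newest = max(replicas, key=lambda r: r.version)
    for r in replicas:
        r.apply(newest.version, newest.value)
```

Between an update and the next `reconcile` call the replicas may return different values (soft state); after it, they agree (eventual consistency).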

The FLP impossibility result states that in a fully asynchronous system, no deterministic consensus protocol can guarantee that it always terminates if even one process may crash. This reinforces the need for practical trade-offs.

Caching: a centralized Redis cache adds one temporary replica of the data, while a local in-process Java cache such as Guava adds one replica per application instance, which makes consistency harder to maintain. Best practices include short TTLs, avoiding large keys and values, using pipelines to batch commands, improving hit rates, pre-loading hot data, and adding random jitter to expiration times so that many keys do not expire at the same moment.
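The short-TTL and random-jitter advice can be sketched with a small in-process cache. `TTLCache` and its parameters are hypothetical names for this illustration, not the Guava or Redis API; the clock and random source are injectable only so the behavior can be tested deterministically.

```python
import random
import time


class TTLCache:
    """Toy local cache whose entries expire after base_ttl_s plus or minus jitter."""

    def __init__(self, base_ttl_s, jitter_ratio=0.1, clock=time.monotonic, rng=None):
        self._base = base_ttl_s
        self._jitter = jitter_ratio
        self._clock = clock
        self._rng = rng or random.Random()
        self._store = {}

    def _ttl(self):
        # Spread expirations so keys written together do not all expire together.
        spread = self._base * self._jitter
        return self._base + self._rng.uniform(-spread, spread)

    def set(self, key, value):
        self._store[key] = (value, self._clock() + self._ttl())

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._store[key]  # expired: drop it so the caller reloads
            return None
        return value
```

With `base_ttl_s=60` and `jitter_ratio=0.1`, each entry lives somewhere between 54 and 66 seconds, so a batch of keys loaded at startup expires gradually rather than all at once.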

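Message queues: used for distributed transactions, decoupling services, and smoothing traffic spikes. Delivery is typically at-least-once, so consumers should be prepared for duplicates. In a pull model the consumer fetches messages at its own pace, which prevents overload; a push model is simpler but can overwhelm a slow consumer. Common solutions include Kafka and RabbitMQ.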
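Because at-least-once delivery can redeliver a message, the consumer usually needs to be idempotent. The sketch below assumes every message carries a unique ID; `IdempotentConsumer` is a hypothetical name, and the in-memory set stands in for what would need to be a durable store in production.

```python
class IdempotentConsumer:
    """Processes each message ID at most once, even if the broker redelivers it."""

    def __init__(self, handler):
        self._handler = handler
        self._seen = set()  # assumption: a real system would persist this

    def on_message(self, message_id, payload):
        """Return True if the payload was processed, False if it was a duplicate."""
        if message_id in self._seen:
            return False  # duplicate delivery: acknowledge without reprocessing
        self._handler(payload)
        # Record the ID only after the handler succeeds, so a failed attempt
        # is retried when the broker redelivers the message.
        self._seen.add(message_id)
        return True
```

If the handler raises, the ID is not recorded, so the redelivered message gets another attempt instead of being silently dropped.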

Databases: primary-replica read/write separation introduces temporary inconsistency because replicas lag behind the primary; when strong consistency is required, read from the primary. Sharding, partitioning, keeping indexes to a minimum, and avoiding heavy transactions improve performance and availability.
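A minimal routing sketch for read/write separation, assuming connections are represented by plain names. `ReadWriteRouter` and the `strong` flag are hypothetical; in practice this logic often lives in a database proxy or the data-access layer.

```python
import itertools


class ReadWriteRouter:
    """Sends writes to the primary and spreads ordinary reads across replicas."""

    def __init__(self, primary, replicas):
        self._primary = primary
        # Round-robin over replicas; None means there are no replicas to use.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def for_write(self):
        return self._primary

    def for_read(self, strong=False):
        # Reads that must see the latest write go to the primary,
        # because a replica may still be lagging behind it.
        if strong or self._replicas is None:
            return self._primary
        return next(self._replicas)
```

A balance check right after a payment would use `for_read(strong=True)`, while a course listing can tolerate replica lag and use the default.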

New Oriental's app uses TiDB (a CP system that uses Raft with three replicas) for money-related data, sacrificing some availability for strong consistency. A single-node MySQL behaves as a CA system; adding replicas means partitions between nodes must be tolerated, which comes at the cost of consistency.
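The availability cost of a Raft group follows from its majority rule: a write commits only once more than half of the replicas have accepted it. The arithmetic for three replicas can be sketched as follows (the function names are illustrative).

```python
def quorum(replica_count):
    """Smallest number of replicas that forms a strict majority."""
    return replica_count // 2 + 1


def tolerated_failures(replica_count):
    """How many replicas can be lost while a majority can still be formed."""
    return replica_count - quorum(replica_count)
```

With three replicas the quorum is two, so the group survives the loss of one replica; if two are lost or partitioned away, writes stop rather than risk inconsistency, which is the CP behavior described above.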

Circuit breaking, rate limiting, and fallback strategies: misconfigured circuit breakers around Redis caused full-app outages; Nginx rate limiting protected the downstream ERP services from traffic spikes caused by crawlers.
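A minimal circuit breaker with a fallback, to show the mechanism being configured. This is a sketch under assumptions, not the library or settings the team used: `CircuitBreaker`, its threshold, and its timeout are hypothetical, and the clock is injectable only for testing.

```python
import time


class CircuitBreaker:
    """Stops calling a failing dependency for a while and serves a fallback."""

    def __init__(self, failure_threshold=3, reset_timeout_s=30.0, clock=time.monotonic):
        self._threshold = failure_threshold
        self._reset_timeout = reset_timeout_s
        self._clock = clock
        self._failures = 0
        self._opened_at = None  # None means the circuit is closed

    def call(self, func, fallback):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self._reset_timeout:
                return fallback()  # open: skip the dependency entirely
            # Timeout elapsed: fall through and allow one trial call (half-open).
        try:
            result = func()
        except Exception:
            self._failures += 1
            if self._failures >= self._threshold:
                self._opened_at = self._clock()  # open, or re-open after a failed trial
            return fallback()
        # Success closes the circuit and clears the failure count.
        self._failures = 0
        self._opened_at = None
        return result
```

The important design choice is what the fallback does: for a cache, degrading to a default value or the database keeps the app usable, whereas a breaker that turns a cache failure into a request failure can take the whole app down.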

JVM optimization: avoid large single memory allocations (e.g., huge images or massive result sets) to prevent frequent Full GC pauses, thereby improving system availability.

Voice assignment feature reconstruction: the feature was migrated from an NFS-dependent architecture to Tencent Object Storage, with NFS kept as a fallback, applying BASE principles to achieve high availability and eventual consistency. The new design survived multiple cloud incidents without user-visible failures.
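The primary-plus-fallback storage pattern can be sketched as below. Everything here is an assumption made for illustration: `DictBackend` is an in-memory stand-in for real object-storage and NFS clients, and the `pending_sync` list with a later `resync` pass is one possible way to reach eventual consistency, not necessarily what the team built.

```python
class StorageError(Exception):
    """Raised by a backend that cannot serve the request."""


class DictBackend:
    """In-memory stand-in for a storage client (hypothetical)."""

    def __init__(self):
        self.data = {}
        self.healthy = True

    def put(self, key, blob):
        if not self.healthy:
            raise StorageError(key)
        self.data[key] = blob

    def get(self, key):
        if not self.healthy or key not in self.data:
            raise StorageError(key)
        return self.data[key]


class FallbackStore:
    """Writes and reads go to the primary first, then to the fallback."""

    def __init__(self, primary, fallback):
        self.primary = primary
        self.fallback = fallback
        self.pending_sync = []  # keys that exist only in the fallback so far

    def put(self, key, blob):
        try:
            self.primary.put(key, blob)
        except StorageError:
            self.fallback.put(key, blob)  # stay available during an outage
            self.pending_sync.append(key)

    def get(self, key):
        try:
            return self.primary.get(key)
        except StorageError:
            return self.fallback.get(key)

    def resync(self):
        """Copy fallback-only objects to the primary once it has recovered."""
        remaining = []
        for key in self.pending_sync:
            try:
                self.primary.put(key, self.fallback.get(key))
            except StorageError:
                remaining.append(key)
        self.pending_sync = remaining
```

During a primary outage, uploads and downloads keep working through the fallback (basically available), the two stores temporarily differ (soft state), and `resync` brings them back in line afterwards (eventual consistency).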

After a year of technical refactoring, the New Oriental app had zero production incidents during summer 2020, demonstrating the effectiveness of the distributed-system principles and practical optimizations described above.
