How We Scaled the Duosuo English App: Architecture Lessons from Day One to Four Months

This article details the technical background and evolution of the Duosuo English learning app, covering initial architecture, bandwidth estimation, risk control, database sharding, code refactoring, and operational lessons learned over four months of scaling.

21CTO
App Technical Background

Duosuo English is a free English learning app that combines AI interaction with real‑time foreign‑teacher speaking practice, supporting offline and online modes. It offers rich content, automatic speech recognition with scoring, and a complete listening‑speaking‑reading‑writing environment.

Initial Architecture Design

2‑1 Bandwidth and Traffic Estimation

Estimate the expected user volume up front from market feedback and the planned investment.

Model request arrivals with a Poisson distribution to estimate average concurrency and peak values.

Apply the 2/8 (80/20) rule to derive data traffic from the peak.
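The estimation steps above can be sketched roughly as follows. This is a minimal illustration, not the team's actual calculation: the function names, the 99.9% quantile, the reading of the 2/8 rule as "80% of daily requests arrive in 20% of the day", and the sample numbers are all assumptions.

```python
import math

def peak_rate_80_20(daily_requests: int) -> float:
    """Average requests/second in the busy window, assuming (hypothetically)
    that 80% of the day's requests arrive in 20% of the day."""
    busy_seconds = 0.2 * 24 * 3600
    return 0.8 * daily_requests / busy_seconds

def poisson_quantile(lam: float, q: float = 0.999) -> int:
    """Smallest k with P(X <= k) >= q for X ~ Poisson(lam).
    Fine for moderate lam; very large lam would underflow exp(-lam)."""
    pmf = math.exp(-lam)  # P(X = 0)
    cdf = pmf
    k = 0
    while cdf < q:
        k += 1
        pmf *= lam / k    # P(X = k) from P(X = k - 1)
        cdf += pmf
    return k

def bandwidth_mbps(requests_per_s: float, avg_response_kb: float) -> float:
    """Outbound bandwidth in Mbit/s for a given request rate and response size."""
    return requests_per_s * avg_response_kb * 8 / 1000

if __name__ == "__main__":
    avg = peak_rate_80_20(1_000_000)   # assumed 1M requests per day
    peak = poisson_quantile(avg)       # per-second count rarely exceeded
    print(round(avg, 1), peak, bandwidth_mbps(peak, 20))
```

The Poisson quantile gives a per-second request count that the busy window should exceed only about 0.1% of the time, which is a more defensible provisioning target than the plain average.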

2‑2 Risk Control

Encrypt data transmission.

Circuit-break and redirect malicious requests that hammer a single interface.

Real‑time monitoring and alerts for slow requests.
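As an illustration of circuit-breaking a single abused interface, here is a small sliding-window sketch. The class name, thresholds, and the (client, interface) key are hypothetical; the article does not describe how the real breaker was built, and a production version would more likely live in the gateway or in shared storage.

```python
import time
from collections import defaultdict, deque

class InterfaceBreaker:
    """Blocks a client on one interface after too many calls in a time window."""

    def __init__(self, max_calls: int, window_s: float, cooldown_s: float):
        self.max_calls = max_calls
        self.window_s = window_s
        self.cooldown_s = cooldown_s
        self._calls = defaultdict(deque)   # (client, interface) -> call times
        self._blocked_until = {}           # (client, interface) -> unblock time

    def allow(self, client_id: str, interface: str, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        key = (client_id, interface)
        if self._blocked_until.get(key, 0.0) > now:
            return False                   # breaker is open: reject or redirect
        calls = self._calls[key]
        while calls and now - calls[0] >= self.window_s:
            calls.popleft()                # drop calls outside the window
        if len(calls) >= self.max_calls:
            self._blocked_until[key] = now + self.cooldown_s
            calls.clear()
            return False                   # trip the breaker
        calls.append(now)
        return True
```

When `allow` returns `False`, the caller decides whether to return an error or redirect the request, as described above.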

2‑3 Server Architecture without Single Points of Failure

Database: master‑slave with multiple read replicas, vertical split, horizontal scaling.

LVS load-balances across multiple web front-ends per business line, so the web tier scales horizontally.

Storage: primary-secondary.

2‑4 Business‑Level Sharding

Isolate the major business data blocks from one another so that they stay loosely coupled.

Because early growth was hard to predict, coarse-grained business isolation was sufficient at first.

2‑5 Session Storage

Early sessions stored as files with multi‑level hash directories.
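A multi-level hash directory keeps any single directory from accumulating too many session files. The sketch below assumes two levels of two hex characters each, a SHA-256 digest, and a hypothetical `/var/sessions` root; the article does not state the actual layout.

```python
import hashlib
from pathlib import PurePosixPath

def session_path(session_id: str, root: str = "/var/sessions",
                 levels: int = 2) -> PurePosixPath:
    """Map a session ID to root/<aa>/<bb>/sess_<id> using a hash prefix.
    Assumes the session ID contains only filename-safe characters."""
    digest = hashlib.sha256(session_id.encode("utf-8")).hexdigest()
    # Each level uses the next two hex characters: 256 subdirectories per level.
    parts = [digest[2 * i: 2 * i + 2] for i in range(levels)]
    return PurePosixPath(root, *parts, f"sess_{session_id}")
```

With two levels the files spread over 65,536 leaf directories, which keeps directory listings short but still costs one file-system read per request.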

2‑6 Log Storage

Logs stored on a log server and analyzed with AWK, which was inconvenient for troubleshooting.

Architecture Evolution After Four Months

3‑1 Rapid Data Growth and Table Partitioning

Market response led to fast data growth and single‑table bottlenecks.

Implemented UID sharding and date‑based table splitting, with periodic archiving.

Partitioning reduces pressure but introduces multi‑table query challenges; DB middleware may help.
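To make the routing concrete, here is a minimal sketch of how a data-access layer might pick a table name. The table names, the 16-shard modulo scheme, and monthly date tables are assumptions for illustration; the article only says that UID sharding and date-based splitting were used.

```python
from datetime import date
from typing import List

def uid_table(base: str, uid: int, shards: int = 16) -> str:
    """Table for per-user data: shard chosen by uid modulo the shard count."""
    return f"{base}_{uid % shards:02d}"

def monthly_table(base: str, day: date) -> str:
    """Table for time-series data: one table per calendar month."""
    return f"{base}_{day:%Y%m}"

def monthly_tables_between(base: str, start: date, end: date) -> List[str]:
    """All monthly tables a date-range query has to touch, in order."""
    tables = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        tables.append(f"{base}_{year:04d}{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return tables
```

`monthly_tables_between` shows where the multi-table query problem comes from: a range query fans out to one query per table and the results have to be merged by the application or by DB middleware.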

3‑2 Multi‑Business Sharding Optimization

Version iterations increase business complexity, requiring clear technical and data separation.

Strong coupling between services hampers maintenance and scaling.

Business isolation starts with data‑layer splitting, producing modules like quests, tasks, user‑center.

3‑3 Code Architecture Refactoring

Initial code mixed all logic in one block, making it unreadable. A four‑layer design was introduced:

C layer: parameter filtering and request forwarding.

M layer: single‑interface business logic.

T layer: common parts for specific business interactions.

D layer: database interaction.

The result is high cohesion and low coupling.
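The four layers can be pictured with a toy "finish lesson" interface. Everything here is hypothetical (class names, the reward of 10 points, the in-memory store standing in for a database); it only illustrates how responsibilities might be divided between the C, M, T, and D layers.

```python
class UserDao:
    """D layer: the only place that touches storage (a dict stands in for the DB)."""
    def __init__(self):
        self._points = {}
    def get_points(self, uid: int) -> int:
        return self._points.get(uid, 0)
    def set_points(self, uid: int, points: int) -> None:
        self._points[uid] = points

class PointsToolkit:
    """T layer: logic shared by several interfaces of one business area."""
    def __init__(self, dao: UserDao):
        self.dao = dao
    def add_points(self, uid: int, delta: int) -> int:
        total = self.dao.get_points(uid) + delta
        self.dao.set_points(uid, total)
        return total

class FinishLessonModel:
    """M layer: business logic of exactly one interface."""
    REWARD = 10
    def __init__(self, points: PointsToolkit):
        self.points = points
    def run(self, uid: int, lesson_id: int) -> dict:
        total = self.points.add_points(uid, self.REWARD)
        return {"uid": uid, "lesson_id": lesson_id, "points": total}

class FinishLessonController:
    """C layer: validates parameters and forwards the request."""
    def __init__(self, model: FinishLessonModel):
        self.model = model
    def handle(self, params: dict) -> dict:
        try:
            uid, lesson_id = int(params["uid"]), int(params["lesson_id"])
        except (KeyError, TypeError, ValueError):
            return {"ok": False, "error": "invalid parameters"}
        if uid <= 0 or lesson_id <= 0:
            return {"ok": False, "error": "invalid parameters"}
        return {"ok": True, "data": self.model.run(uid, lesson_id)}
```

Because each layer only calls the one below it, the storage backend or a shared rule can change without touching the controllers.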

Current Service Architecture

The system remains layered with independent services for isolation.

Search service for fast data aggregation (e.g., real‑time PK selection).

Log service: collection (Fluentd), aggregation (Mongo + Elasticsearch), visualization (Kibana).

Multi‑level cache: Redis, MongoDB, Elasticsearch for hot data.

Queue: Redis‑queue; Cloud storage: Upyun.

Push notifications, SMS, the virtual-currency store, and chat are handled through third-party integrations.

Lessons Learned and Pitfalls

5‑1 Peak‑Shaving

Traffic spikes cluster around specific triggers such as push notifications; we mitigate them by diverting traffic.
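One common way to flatten a push-induced spike is to send the notification in staggered batches rather than to everyone at once. The sketch below is an assumption about what such diversion could look like, not a description of the team's mechanism; the function name, batch size, and interval are hypothetical.

```python
from typing import Iterator, List, Sequence, Tuple

def stagger_batches(user_ids: Sequence[int], batch_size: int,
                    interval_s: float) -> Iterator[Tuple[float, List[int]]]:
    """Yield (delay_in_seconds, batch) pairs so that each batch of users
    receives the push interval_s seconds after the previous batch."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(user_ids), batch_size):
        delay = (start // batch_size) * interval_s
        yield delay, list(user_ids[start:start + batch_size])
```

A scheduler or queue worker would consume these pairs and send each batch after its delay, so the users who open the app in response arrive spread over the whole window.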

5‑2 Offline‑Online Mode Issues

Duplicate uploads and submissions are common; timestamp isolation helps but is not optimal.
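A rough sketch of timestamp-based de-duplication follows. It assumes that "timestamp isolation" means treating the pair (user ID, client-side creation timestamp) as the identity of a record; that interpretation, the class name, and the in-memory store are all assumptions.

```python
class SubmissionStore:
    """Accepts each (uid, client timestamp) pair once and ignores replays."""

    def __init__(self):
        self._seen = {}   # (uid, client_ts_ms) -> payload

    def submit(self, uid: int, client_ts_ms: int, payload: dict) -> bool:
        """Return True if stored, False if this record was already uploaded."""
        key = (uid, client_ts_ms)
        if key in self._seen:
            return False          # duplicate upload after an offline retry
        self._seen[key] = payload
        return True
```

Under this scheme two genuinely different records created in the same millisecond would collide, and a client with a wrong clock can produce misleading keys, which may be part of why the approach is described as not optimal. A client-generated unique ID per record is a frequently used alternative.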

5‑3 Optimizing Single‑Table Capacity (up to 50 million rows)

Create appropriate indexes and choose suitable index types.

Choose the smallest data type that fits each column.

Avoid storing large fields.

Further table optimizations are possible.

5‑4 Data Archiving

Currently manual; automation is planned.
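An automated archiving job could be as small as the following sketch, which moves rows older than a cutoff into an archive table inside one transaction. SQLite is used only so the example runs standalone; the table names, columns, and cutoff are hypothetical and the article does not say how the planned automation will work.

```python
import sqlite3

def make_demo_db() -> sqlite3.Connection:
    """In-memory database with a live table and an identically shaped archive."""
    conn = sqlite3.connect(":memory:")
    cols = "(id INTEGER PRIMARY KEY, uid INTEGER, created_at TEXT)"
    conn.execute(f"CREATE TABLE study_log {cols}")
    conn.execute(f"CREATE TABLE study_log_archive {cols}")
    conn.executemany(
        "INSERT INTO study_log (uid, created_at) VALUES (?, ?)",
        [(1, "2016-01-05"), (2, "2016-02-10"), (1, "2016-03-15")],
    )
    conn.commit()
    return conn

def archive_before(conn: sqlite3.Connection, cutoff: str) -> int:
    """Move rows with created_at < cutoff (ISO date string) to the archive.
    Copy and delete share one transaction, so a failure leaves both tables intact."""
    with conn:
        conn.execute(
            "INSERT INTO study_log_archive "
            "SELECT * FROM study_log WHERE created_at < ?", (cutoff,))
        cur = conn.execute(
            "DELETE FROM study_log WHERE created_at < ?", (cutoff,))
    return cur.rowcount
```

Running `archive_before` from a scheduler (for example a nightly cron job) would replace the manual step; on a large production table the move would normally be done in small batches to avoid long locks.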

5‑5 Session Storage Improvements

Sessions were switched from file-system storage to a token-based approach to address storage and read-efficiency issues.
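The article does not say which kind of token is used. One possible variant is a stateless, HMAC-signed token that the server can verify without reading any session file; the sketch below assumes that variant, and the secret, field names, and TTL are hypothetical.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

SECRET = b"change-me"  # hypothetical; load from configuration in practice

def _sign(body: str) -> str:
    digest = hmac.new(SECRET, body.encode("utf-8"), hashlib.sha256).digest()
    return base64.urlsafe_b64encode(digest).decode("ascii")

def issue_token(uid: int, ttl_s: int = 3600, now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>' carrying the user ID and an expiry time."""
    now = time.time() if now is None else now
    claims = json.dumps({"uid": uid, "exp": int(now) + ttl_s})
    body = base64.urlsafe_b64encode(claims.encode("utf-8")).decode("ascii")
    return f"{body}.{_sign(body)}"

def verify_token(token: str, now: Optional[float] = None) -> Optional[int]:
    """Return the user ID if the token is authentic and unexpired, else None."""
    now = time.time() if now is None else now
    try:
        body, sig = token.split(".")
    except ValueError:
        return None                       # not exactly two parts
    # Constant-time comparison avoids leaking signature bytes through timing.
    if not hmac.compare_digest(sig.encode("utf-8"), _sign(body).encode("utf-8")):
        return None
    claims = json.loads(base64.urlsafe_b64decode(body))
    return claims["uid"] if claims["exp"] > now else None
```

The trade-off of a stateless token is that it cannot be revoked before it expires without extra server-side state; storing an opaque token in a cache such as Redis is the other common reading of "token-based".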

These experiences are shared for discussion with peers.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by

21CTO
