How WeChat Reading Scaled Its Backend Architecture Over a Decade

Marking ten years of WeChat Reading, this article details the backend's evolution from a monolithic service to a multi‑layered, micro‑service architecture with robust storage, RPC frameworks, book data platforms, account system redesign, and AI‑driven content retrieval, highlighting the technical challenges and solutions behind its scalability.


Introduction

This year marks the 10th anniversary of WeChat Reading, and its backend architecture has undergone several major iterations and upgrades. Deploying each component upgrade and architectural breakthrough on a system that has been running for a decade required decisive business collaboration and meticulous engineering.

Overall Architecture

WeChat Reading operates as an independent app from WeChat, with distinct development and operations environments. The backend implements a complete stack from the access layer to the storage layer:

Access Layer: Multiple CGI services are divided by business, achieving resource isolation. The CGI layer also provides routing, rate limiting, access‑layer caching, and long‑connection support.

Logic Layer: Built on the WRMesh framework, it hosts numerous micro‑services that are decoupled by business scenario. The framework supplies RPC, service discovery, overload protection, rate limiting, and monitoring reporting.

Storage Layer: Uses PaxosStore for user data, offering KV and K‑Table types with high availability and strong consistency. Customized caching middleware adapts to the performance requirements of different scenarios. BookStore stores book content, supporting chapter splitting, modification, and download. Tencent Cloud PaaS storage is also leveraged where appropriate.
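The rate limiting mentioned for the access layer is commonly implemented with a token bucket. The sketch below is illustrative only (the article does not specify the algorithm WeChat Reading uses); the class name and parameters are assumptions.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: refills `rate` tokens/sec,
    allows bursts up to `capacity`. An illustrative sketch, not the
    actual access-layer implementation."""
    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

bucket = TokenBucket(rate=100, capacity=10)
allowed = sum(bucket.allow() for _ in range(20))
print(f"{allowed} of 20 burst requests admitted")
```

A per-CGI-service bucket like this gives the resource isolation the article describes: one business line exhausting its budget cannot starve another.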

RPC Framework

The backend micro‑services originally stemmed from the Hikit framework written in C++, which proved its performance, disaster recovery, and monitoring capabilities in early production. As WeChat Reading grew, a heterogeneous recommendation system was deployed on TKE using Go, creating a need for unified service governance across languages.

WRMesh answers this with a Sidecar + Business model: the Sidecar process handles all network logic, while the Business process focuses on core functionality. The two communicate via UnixSocket, and the Sidecar can load plugins for special client logic, so services in any language can integrate with the Hikit service‑governance framework. Migration to the WXG container platform P6N introduced Proxy layers (e.g., Svrkit, WQueue) to bridge legacy services with new capabilities.
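The Sidecar + Business hand-off can be sketched as length-prefixed frames over a local socket. This is a minimal illustration of the pattern, not WRMesh's actual wire protocol; `socketpair` stands in for the Unix domain socket, and the frame format and payloads are assumptions.

```python
import socket, struct, threading

def send_frame(sock, payload: bytes):
    # Length-prefixed framing, a common convention over local sockets.
    sock.sendall(struct.pack(">I", len(payload)) + payload)

def recv_frame(sock) -> bytes:
    (length,) = struct.unpack(">I", sock.recv(4))
    data = b""
    while len(data) < length:
        data += sock.recv(length - len(data))
    return data

# socketpair stands in for the business <-> sidecar Unix socket.
business, sidecar = socket.socketpair()

def sidecar_loop():
    # The real sidecar would do service discovery, rate limiting, and the
    # outbound RPC; here it just tags the request to show the hand-off.
    req = recv_frame(sidecar)
    send_frame(sidecar, b"routed:" + req)

threading.Thread(target=sidecar_loop, daemon=True).start()
send_frame(business, b"GetBookInfo")
print(recv_frame(business))  # → b'routed:GetBookInfo'
```

Because the business process only ever speaks this simple local protocol, it can be written in any language while the Sidecar carries all the Hikit governance logic.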

Book Data Middle‑Platform Construction

Books are the foundation of WeChat Reading. Initially, the platform relied on Yuedu Group’s API for e‑book resources, which simplified onboarding but limited control. In recent years, the team shifted toward self‑signed copyrights and even self‑publishing, building a book‑data middle platform that exposes management APIs for operations teams. The workflow includes:

Formatting and proofreading (manual or partially automated).

Pre‑processing before publishing, handling versioning, user annotations, and progress migration.

EPUB parsing into internal formats (chapter splitting, image‑text separation, style extraction, offline package generation).

Generating BookInfo and BookData and persisting them in the StoreSvr service, which offers high‑availability, low‑latency APIs for book information retrieval and chapter download.
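The chapter-splitting step above can be sketched with a heading-based splitter. This is a toy stand-in for the real EPUB parsing pipeline (which also handles image‑text separation, styles, and offline packages); the regex and the output shape are assumptions.

```python
import re

# Matches Chinese-style ("第N章 ...") and English-style ("Chapter N ...") headings.
CHAPTER_RE = re.compile(r"^第[0-9一二三四五六七八九十百千]+章.*$|^Chapter\s+\d+.*$", re.M)

def split_chapters(text: str):
    """Split extracted plain text into {title, body} chapters on heading
    lines -- a simplified sketch of the BookData generation step."""
    matches = list(CHAPTER_RE.finditer(text))
    chapters = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        chapters.append({"title": m.group().strip(),
                         "body": text[m.end():end].strip()})
    return chapters

book = "Chapter 1 The Start\nOnce upon a time.\nChapter 2 The End\nIt ended."
for ch in split_chapters(book):
    print(ch["title"], "->", len(ch["body"]), "chars")
```

The resulting per-chapter records are what a service like StoreSvr can serve independently, enabling chapter-granular download.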

To maintain user experience when books are replaced, a dedicated UGC repair service recalculates offsets for highlights and reading progress using fuzzy full‑text search, updating multiple storage back‑ends. The service was refactored to split repair tasks into subtasks stored in Chubby, allowing multi‑machine consumption and reliable recovery after restarts. KV write middleware aggregates requests to reduce rate‑limit and version‑conflict errors.
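The offset-repair idea can be illustrated with fuzzy matching: find the old highlighted span in the revised chapter text and emit new offsets. This sketch uses `difflib` as a stand-in for the fuzzy full‑text search described above; the 0.6 similarity threshold is an assumption.

```python
import difflib

def repair_offset(old_text: str, new_text: str, start: int, end: int):
    """Relocate a highlight [start, end) from an old chapter to a revised
    one by fuzzy-matching the quoted span. Returns new offsets, or None
    if the span no longer exists and the annotation should be dropped."""
    quote = old_text[start:end]
    m = difflib.SequenceMatcher(None, new_text, quote, autojunk=False)
    match = m.find_longest_match(0, len(new_text), 0, len(quote))
    if match.size / max(len(quote), 1) < 0.6:  # assumed threshold
        return None
    new_start = match.a - match.b
    return (new_start, new_start + len(quote))

old = "It was the best of times, it was the worst of times."
new = "PREFACE. It was the best of times, it was the worst of times."
print(repair_offset(old, new, 11, 24))  # → (20, 33), i.e. "best of times"
```

Sharding such repairs into subtasks (as the article describes with Chubby) matters because every highlight in every affected book must be recomputed this way, and a restart must not lose track of which spans were already fixed.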

Account System Availability Engineering

The account system underpins login, session key generation, and user profile management. Historically, AccountSvr and MySQL ran on the same machine with master‑slave replication; occasional hacks added write capability to the standby, leading to complexity and several severe outages during scaling and data expansion. In 2024, the team decided to rebuild the system using Paxosmemkv, an in‑memory, multi‑replica, strongly consistent store. Migration challenges include handling massive authentication traffic without adding load, ensuring zero data loss to avoid user login failures, and preserving user ID allocation consistency. The migration plan emphasizes gray‑scale rollout, rigorous data‑consistency checks, and a complete redesign of the AccountSvr logic.
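A gray-scale rollout with consistency checking can be sketched as a double-read path: every request reads both stores, flags mismatches for repair, and only a deterministic cohort of users is served from the new store. This is an illustrative pattern, not the team's actual migration code; the function names and bucketing scheme are assumptions.

```python
import hashlib

def in_gray_scale(user_id: str, percent: int) -> bool:
    """Deterministically bucket users into the rollout by hashing the ID,
    so a user's serving path is stable across requests."""
    bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 100
    return bucket < percent

def read_account(user_id, old_store, new_store, percent, mismatches):
    """Double-read: compare both stores, record divergence, and serve the
    new store only for consistent users inside the gray-scale cohort."""
    old_val = old_store.get(user_id)
    new_val = new_store.get(user_id)
    if old_val != new_val:
        mismatches.append(user_id)  # flagged for repair before cut-over
    if in_gray_scale(user_id, percent) and user_id not in mismatches:
        return new_val
    return old_val  # safe fallback: legacy store remains authoritative
```

Because authentication traffic is massive, the comparison step would in practice be sampled or done asynchronously; serving falls back to the legacy store on any divergence, which is how zero data loss for logins is preserved during the cut-over.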

Content Recall Evolution

Search in WeChat Reading originally covered only book titles and authors via an in‑memory index service, while full‑text search relied on Elasticsearch with rule‑based segmentation. With the rise of large language models, the platform now supports Retrieval‑Augmented Generation (RAG) for AI‑driven Q&A, facing two main challenges:

Semantic search over millions of books and billions of user‑uploaded EPUBs requires paragraph‑level segmentation and embedding generation, which the existing architecture could not deliver at acceptable cost.

Private user‑generated content (UGC) must be searchable only by the owner, demanding a low‑cost, on‑demand indexing solution.

The solution splits data into globally searchable and user‑personal searchable sets. Global data (book full‑text, outlines, reviews, knowledge bases) undergoes semantic chunking with fine‑tuned open‑source models; chunks are stored in a forward index (fkv) and an inverted index built on Elasticsearch, with DiskANN balancing storage cost against recall efficiency. For low‑latency in‑app search, an in‑memory Searchsvr service provides millisecond‑level responses.
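Before embedding, text must be cut into chunks. The fine-tuned semantic chunker is proprietary, so the sketch below uses a simpler greedy paragraph-packing strategy to show the shape of the step; the size limit and function name are assumptions.

```python
def chunk_paragraphs(text: str, max_chars: int = 500):
    """Greedily pack whole paragraphs into chunks of at most `max_chars`.
    A crude stand-in for semantic chunking; each resulting chunk would
    then be embedded and written to the forward and inverted indexes."""
    chunks, current = [], ""
    for para in filter(None, (p.strip() for p in text.split("\n\n"))):
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)  # chunk full: flush and start a new one
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks
```

Keeping paragraph boundaries intact (rather than cutting at a fixed character offset) is what preserves enough local context for the embeddings to be useful at retrieval time; a semantic chunker takes this further by cutting at topic shifts.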

Personal data (imported books, private notes) is indexed per user using USearch or Xapian, stored as files on low‑cost COS. At query time, the index is loaded on demand, optionally pre‑warmed with a cache (CFS) to accelerate retrieval. Unused indexes are periodically evicted to control storage costs.
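The load-on-demand-with-eviction scheme can be sketched as an LRU cache over per-user index handles. This is an illustrative model only: `load_fn` stands in for downloading and opening a USearch/Xapian index file from COS, and the class name and capacity are assumptions.

```python
from collections import OrderedDict

class UserIndexCache:
    """Load per-user indexes on demand and evict the least-recently-used
    ones, mimicking COS-backed index files behind a CFS-style warm cache."""
    def __init__(self, load_fn, capacity: int = 2):
        self.load_fn = load_fn      # e.g. fetch-and-open an index from COS
        self.capacity = capacity
        self.cache = OrderedDict()  # insertion order doubles as LRU order

    def get(self, user_id):
        if user_id in self.cache:
            self.cache.move_to_end(user_id)   # cache hit: refresh LRU slot
        else:
            if len(self.cache) >= self.capacity:
                self.cache.popitem(last=False)  # evict the coldest index
            self.cache[user_id] = self.load_fn(user_id)
        return self.cache[user_id]

loads = []
cache = UserIndexCache(lambda uid: loads.append(uid) or f"index:{uid}", capacity=2)
cache.get("u1"); cache.get("u2"); cache.get("u1"); cache.get("u3")  # evicts u2
print(loads)  # → ['u1', 'u2', 'u3']
```

Since each user only queries their own index, cold loads are rare enough that the occasional fetch from COS is acceptable, while eviction keeps warm-cache storage bounded.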

Conclusion

After a decade of continuous development, the WeChat Reading backend has evolved through decisive architectural upgrades, robust service governance, scalable storage solutions, and AI‑enabled content retrieval, positioning the platform for the next ten years of growth.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Tags: backend architecture, data platform, WeChat Reading, service scalability, AI retrieval
Written by High Availability Architecture (official account).