
Content Ingestion System Refactoring: From Microservices to a High‑Performance Monolithic Plugin Architecture

This article details the comprehensive redesign of QQ Browser's content ingestion platform: the shortcomings of the legacy microservice architecture, the migration to a single-process monolithic design with a plugin framework, fault-tolerant Kafka integration, thread separation, and memory-allocator improvements, along with the resulting gains in throughput, CPU utilization, and development efficiency and the large reductions in latency.


The legacy content ingestion system for QQ Browser search consisted of over a hundred microservices, causing high RPC overhead, complex fault-tolerance logic, low CPU utilization (peaking at 40%), and slow feature iteration, because a single change required modifications across many services.

To address these issues, the team performed a zero‑base redesign, consolidating the pipeline into a single monolithic service that keeps data in‑memory, drastically reducing inter‑service communication. A plugin‑based framework was introduced to handle the diverse content types and processing steps, allowing new business requirements to be satisfied by simply adding or configuring plugins without code changes.

Key architectural improvements include:

Unified processing flow with four configurable pipelines (incremental update, feature update, incremental batch, feature batch).
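A configuration for such pipelines might look like the following. This shape is purely illustrative; the article does not show the actual config format:

```json
{
  "pipelines": {
    "incremental_update": ["url_normalizer", "title_extractor", "index_writer"],
    "feature_update":     ["feature_loader", "index_writer"],
    "incremental_batch":  ["url_normalizer", "title_extractor", "batch_writer"],
    "feature_batch":      ["feature_loader", "batch_writer"]
  }
}
```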

Kafka‑based fault‑tolerance: all HTTP/trpc pushes are first written to Kafka, ensuring no data loss during node failures.

Separation of consumption and computation threads using a lock‑free queue, enabling each Kafka partition to be consumed by a dedicated thread while multiple worker threads process the data, raising CPU utilization to 100% and achieving up to 13× higher QPS.

Memory-management optimizations such as replacing double-buffering with std::atomic<std::shared_ptr> and switching from RapidJSON to Sonic-JSON, cutting serialization overhead and improving throughput by 15%.

Performance benchmarks show single‑core QPS increasing from 13 to 172, batch‑load QPS improving from 13 to 230, and overall latency dropping by more than 70%. The refactor also reduced code size from 113k to 28k lines, eliminated 93 micro‑services, and cut the P80 lead‑time for new features from 5.72 days to under 1 day.

In addition to architectural changes, the team standardized CI/CD pipelines, code review processes, and documentation practices, further boosting development velocity and code quality.

Tags: backend, performance optimization, microservices, system design, plugin architecture, DevOps, C++
Written by

High Availability Architecture

Official account for High Availability Architecture.
