How Meitu Scaled Twemproxy with Multi‑Process Architecture and Live Reload

This article details Meitu's engineering of a Redis/Memcached proxy platform, describing why twemproxy was chosen, the limitations of its upstream version, the multi‑process redesign with live configuration reload, added latency metrics, reuse‑port handling, Redis master‑slave support, performance testing, and remaining challenges.

Background

Meitu began building a Redis/Memcached resource PaaS platform in the second half of 2017. To achieve seamless scaling, the team introduced Twitter's open‑source twemproxy as a gateway in November 2017.

Why Twemproxy?

Twemproxy reduces backend connection counts and provides horizontal cache scaling. It supports multiple hash sharding algorithms and automatic removal of failed nodes, which matched Meitu's need for a protocol‑level proxy for both Redis and Memcached.
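
Twemproxy pools are declared in a YAML file (nutcracker.yml). A minimal sketch of one pool, with illustrative values rather than Meitu's production settings, shows where the hash algorithm, the key distribution, and automatic ejection of failed nodes are configured:

```yaml
# Illustrative pool definition (nutcracker.yml); addresses and
# thresholds are placeholders.
alpha:
  listen: 0.0.0.0:22121
  hash: fnv1a_64          # one of several supported hash algorithms
  distribution: ketama    # consistent hashing for horizontal scaling
  auto_eject_hosts: true  # remove failed nodes automatically
  server_failure_limit: 3
  redis: true             # speak the Redis protocol to this pool
  servers:
    - 10.0.0.1:6379:1
    - 10.0.0.2:6379:1
```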

Limitations of the Upstream Version

Single‑threaded model cannot utilize multi‑core CPUs.

Configuration cannot be reloaded online.

No support for Redis master‑slave mode.

Absence of latency monitoring metrics.

Meitu’s Modifications

The team added a multi‑process architecture, online configuration reload, and latency monitoring while preserving the original proxy logic.

Core Architecture

Three connection objects drive the data flow:

Proxy connection: listens for client connections and creates a corresponding client connection for each one.

Client connection: parses client requests, selects a server based on the key and hash rule, and forwards the request to a server connection.

Server connection: sends the request to the backend cache, receives the response, and passes it back to the client connection.

A request thus flows from the proxy connection to a client connection, through a server connection to the backend, and back along the same path, as the sketch below illustrates.
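
The relay can be modeled in a few lines of C. This is a didactic sketch with made-up names, not twemproxy's actual code, which drives the same steps from an event loop:

```c
#include <stdio.h>

/* Hypothetical sketch of the request path through the three
 * connection objects; all names here are illustrative. */

static int hash_key(const char *key, int nserver) {
    /* stand-in for twemproxy's configurable hash + distribution */
    unsigned h = 0;
    while (*key != '\0') h = h * 31u + (unsigned char)*key++;
    return (int)(h % (unsigned)nserver);
}

static void server_conn_forward(int server_idx, const char *req) {
    /* server connection: send to the backend, read the reply,
     * and hand it back to the client connection */
    printf("server[%d] handles: %s\n", server_idx, req);
}

static void client_conn_recv(const char *req, const char *key, int nserver) {
    /* client connection: parse the request, shard by key, forward */
    server_conn_forward(hash_key(key, nserver), req);
}

int main(void) {
    /* the proxy connection accepted a client, which sends a request */
    client_conn_recv("GET user:42", "user:42", 4);
    return 0;
}
```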

Multi‑Process Design

The redesigned model mirrors Nginx: a master process manages worker processes, restarts crashed workers, and handles several signals:

SIGHUP – reload configuration.

SIGTTIN – increase log level.

SIGTTOU – decrease log level.

SIGUSR1 – reopen log file.

SIGTERM – graceful shutdown.

SIGINT – immediate termination.

Global settings such as worker_shutdown_timeout control how long an old worker remains alive after a reload, enabling seamless configuration changes.
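
The master loop can be approximated as follows. This is a minimal sketch with assumed helper names; the real code also retires old workers after worker_shutdown_timeout and implements the remaining signals:

```c
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reload = 0;

static void on_sighup(int sig) { (void)sig; reload = 1; }

static pid_t spawn_worker(void) {
    pid_t pid = fork();
    if (pid == 0) {
        pause();   /* placeholder for the worker's proxy event loop */
        _exit(0);
    }
    return pid;
}

int main(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sighup;   /* no SA_RESTART: waitpid wakes on SIGHUP */
    sigaction(SIGHUP, &sa, NULL);

    int nworker = 2;
    for (int i = 0; i < nworker; i++) spawn_worker();

    for (;;) {
        pid_t pid = waitpid(-1, NULL, 0);  /* reap; interrupted by SIGHUP */
        if (pid > 0) {
            spawn_worker();   /* a worker crashed or exited: restart it */
        }
        if (reload) {
            reload = 0;
            /* re-read the config and fork fresh workers; old workers
             * would be signaled to drain within worker_shutdown_timeout */
            for (int i = 0; i < nworker; i++) spawn_worker();
        }
    }
}
```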

Reuse Port

With reuse‑port, each worker process binds its own socket to the same listening port; the kernel then distributes incoming connections across the workers, removing the single accept bottleneck and avoiding the thundering‑herd problem. It is enabled by setting the SO_REUSEPORT option on each listening socket.
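
A minimal sketch of how such a listening socket is created; the helper name and parameters are illustrative:

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>

/* Each worker opens its own socket bound to the shared port with
 * SO_REUSEPORT (Linux 3.9+), letting the kernel spread incoming
 * connections across workers. Hypothetical helper: */
int listen_reuseport(uint16_t port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    int one = 1;
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEPORT, &one, sizeof(one)) < 0)
        return -1;

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) return -1;
    if (listen(fd, 511) < 0) return -1;
    return fd;
}
```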

Redis Master‑Slave Support

Although the original twemproxy treats Redis as a pure cache and omits replication, Meitu added a simple master‑slave configuration: if a server’s name is master, it is treated as the primary instance, and only one master is allowed per pool.
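
Assuming twemproxy's standard server-line syntax (host:port:weight followed by an optional name), a pool under this convention might look like the sketch below, where the server named master is the primary and the rest are replicas; the addresses are placeholders:

```yaml
# Illustrative master-slave pool; only one server may be named "master".
beta:
  listen: 127.0.0.1:22122
  hash: fnv1a_64
  distribution: ketama
  redis: true
  servers:
    - 10.0.0.1:6379:1 master
    - 10.0.0.2:6379:1 slave-1
    - 10.0.0.3:6379:1 slave-2
```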

Latency Metrics

Two latency metrics were introduced:

Request latency – time from client request arrival to response, covering both proxy and backend processing.

Server latency – time spent inside twemproxy communicating with the backend server.

Metrics are recorded in buckets (e.g., <1 ms, <10 ms) to facilitate alerting and root‑cause analysis.
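
Bucketed recording keeps per-request timing cheap. A minimal sketch, assuming microsecond timestamps; only the first two bucket bounds come from the article, the rest are chosen for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Upper bounds in microseconds: <1 ms, <10 ms, <100 ms, <1 s, plus
 * an implicit overflow bucket for anything slower. */
static const uint64_t bucket_us[] = { 1000, 10000, 100000, 1000000 };
#define NBUCKET (sizeof(bucket_us) / sizeof(bucket_us[0]) + 1)

static uint64_t latency_hist[NBUCKET];

/* Record one observed latency (request latency or server latency)
 * into the first bucket whose upper bound it falls under. */
static void latency_record(uint64_t us) {
    size_t i;
    for (i = 0; i < NBUCKET - 1; i++) {
        if (us < bucket_us[i]) break;
    }
    latency_hist[i]++;
}
```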

Remaining Issues

Connections waiting in a listening socket's backlog can be lost when the number of workers is reduced.

Unix sockets lack a reuse‑port‑like mechanism, so they remain single‑process.

Binary Memcached protocol is not supported.

The configured maximum number of client connections is not yet enforced.

Incomplete command support (missing key‑based and blocking commands).

Potential dirty data during configuration reloads.

Performance Benchmark

Tests were run on CentOS 6.6 with an Intel E5‑2660 (32 logical cores), 64 GB RAM, and a bonded 1 Gbps NIC. A single worker achieved roughly 100 k QPS; performance scaled linearly with additional workers until the NIC became the bottleneck (observed around 8 cores).

Open‑Source Release

The modified twemproxy code is hosted at https://github.com/meitu/twemproxy. The team also open‑sourced a Golang Kafka consumer group, a PHP Kafka consumer group, and an Ethereum‑based DPoS implementation.
