How Meitu Scaled Twemproxy with Multi‑Process Architecture and Live Reload
This article details Meitu's engineering of a Redis/Memcached proxy platform, describing why twemproxy was chosen, the limitations of its upstream version, the multi‑process redesign with live configuration reload, added latency metrics, reuse‑port handling, Redis master‑slave support, performance testing, and remaining challenges.
Background
Meitu began building a Redis/Memcached resource PaaS platform in the second half of 2017. To achieve seamless scaling, the team introduced Twitter's open‑source twemproxy as a gateway in November 2017.
Why Twemproxy?
Twemproxy reduces backend connection counts and provides horizontal cache scaling. It supports multiple hash sharding algorithms and automatic removal of failed nodes, which matched Meitu's need for a protocol‑level proxy for both Redis and Memcached.
Limitations of the Upstream Version
Single‑threaded model cannot utilize multi‑core CPUs.
Configuration cannot be reloaded online.
No support for Redis master‑slave mode.
Absence of latency monitoring metrics.
Meitu’s Modifications
The team added a multi‑process architecture, online configuration reload, and latency monitoring while preserving the original proxy logic.
Core Architecture
Three connection objects drive the data flow:
Proxy connection: Listens for client connections and creates a corresponding client connection for each one accepted.
Client connection: Parses client requests, selects a server based on the key and the hash rule, and forwards the request to a server connection.
Server connection: Sends the request to the backend cache, receives the response, and passes it back to the client connection.
A request thus flows from the proxy connection to a client connection and on to a server connection; the response returns along the reverse path.
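The key-to-server step can be sketched as below. This is a deliberately simplified modulo-on-hash scheme; twemproxy itself supports several hash functions (fnv1a, md5, crc32, ...) and distribution modes (ketama, modula, random), and the server list here is made up for illustration.

```python
import hashlib

# Hypothetical backend pool; real pools come from the twemproxy YAML config.
SERVERS = ["127.0.0.1:6379", "127.0.0.1:6380", "127.0.0.1:6381"]

def select_server(key: str) -> str:
    """Map a key to one backend: hash the key, then shard by modulo."""
    digest = hashlib.md5(key.encode()).digest()
    h = int.from_bytes(digest[:4], "little")
    return SERVERS[h % len(SERVERS)]
```

Every request for the same key lands on the same backend, which is what lets the proxy shard a keyspace across instances without any coordination.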
Multi‑Process Design
The redesigned model mirrors Nginx: a master process manages worker processes, restarts crashed workers, and handles several signals:
SIGHUP – reload configuration.
SIGTTIN – increase log level.
SIGTTOU – decrease log level.
SIGUSR1 – reopen log file.
SIGTERM – graceful shutdown.
SIGINT – immediate termination.
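A toy dispatch table for the master process's signals might look like the following. The handler bodies are placeholders standing in for the real actions (re-reading configuration, adjusting log verbosity), not the fork's actual implementation.

```python
import os
import signal

# Placeholder master-process state touched by the signal handlers below.
state = {"log_level": 5, "reloads": 0}

def on_sighup(signum, frame):
    state["reloads"] += 1                       # reload config, spawn new workers

def on_sigttin(signum, frame):
    state["log_level"] += 1                     # increase log verbosity

def on_sigttou(signum, frame):
    state["log_level"] = max(0, state["log_level"] - 1)  # decrease verbosity

signal.signal(signal.SIGHUP, on_sighup)
signal.signal(signal.SIGTTIN, on_sigttin)
signal.signal(signal.SIGTTOU, on_sigttou)

os.kill(os.getpid(), signal.SIGHUP)             # simulate `kill -HUP <master pid>`
```

An operator triggers a live reload by sending SIGHUP to the master, which re-reads the configuration and rolls workers over without dropping the listening socket.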
Global settings such as worker_shutdown_timeout control how long an old worker remains alive after a reload, enabling seamless configuration changes.
Reuse Port
Reuse‑port (SO_REUSEPORT) allows multiple sockets to listen on the same address and port, with the kernel load‑balancing incoming connections among them. Because each worker can accept on its own listening socket, this removes the single accept bottleneck and avoids the “thundering herd” problem. It is enabled by setting the SO_REUSEPORT option on each listening socket before binding.
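The mechanism is easy to demonstrate: two independent sockets bind and listen on the same port once SO_REUSEPORT is set on both. This sketch assumes Linux 3.9+ (where SO_REUSEPORT with kernel load balancing is available); in the real proxy each worker process would create its own such listener.

```python
import socket

def make_reuseport_listener(host="127.0.0.1", port=0):
    """Create a listening socket with SO_REUSEPORT set before bind()."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    s.bind((host, port))
    s.listen(128)
    return s

a = make_reuseport_listener()            # kernel picks a free port
port = a.getsockname()[1]
b = make_reuseport_listener(port=port)   # second listener on the SAME port
```

Without SO_REUSEPORT the second bind() would fail with EADDRINUSE; with it, the kernel spreads new connections across the listeners.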
Redis Master‑Slave Support
Although the original twemproxy treats Redis as a pure cache and omits replication, Meitu added a simple master‑slave configuration: if a server’s name is master, it is treated as the primary instance, and only one master is allowed per pool.
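A pool configuration using this convention might look like the fragment below. The pool name, addresses, and server aliases are invented for illustration, and the exact syntax accepted by the meitu/twemproxy fork may differ; the point is that the server whose alias is master is treated as the primary.

```yaml
# Hypothetical pool illustrating the master-slave naming convention.
alpha:
  listen: 127.0.0.1:22121
  redis: true
  servers:
    - 127.0.0.1:6379:1 master   # alias "master" marks the primary; one per pool
    - 127.0.0.1:6380:1 slave1
    - 127.0.0.1:6381:1 slave2
```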
Latency Metrics
Two latency metrics were introduced:
Request latency – time from client request arrival to response, covering both proxy and backend processing.
Server latency – time spent inside twemproxy communicating with the backend server.
Metrics are recorded in buckets (e.g., <1 ms, <10 ms) to facilitate alerting and root‑cause analysis.
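Bucketed latency recording can be sketched as a small fixed histogram. The bucket boundaries here are illustrative; the actual cut‑offs in Meitu's fork may differ.

```python
import bisect

# Upper edges of the latency buckets, in milliseconds (illustrative values);
# a final overflow bucket catches everything above the last edge.
BOUNDS = [1, 10, 50, 100, 500]
counts = [0] * (len(BOUNDS) + 1)

def record(latency_ms):
    """Increment the bucket whose range contains latency_ms."""
    counts[bisect.bisect_right(BOUNDS, latency_ms)] += 1
```

Fixed buckets make recording O(log n) with no allocation on the hot path, and the per-bucket counters map directly onto alert thresholds (e.g., alert when the >100 ms buckets grow).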
Remaining Issues
Connections queued in a listening socket's backlog may be lost when the number of workers is reduced.
Unix sockets lack a reuse‑port‑like mechanism, so they remain single‑process.
Binary Memcached protocol is not supported.
The configured maximum number of client connections is not yet enforced.
Incomplete command support (missing key‑based and blocking commands).
Potential dirty data during configuration reloads.
Performance Benchmark
Tests were run on CentOS 6.6 with an Intel E5‑2660 (32 logical cores), 64 GB RAM, and a bonded 1 Gbps NIC. A single worker achieved roughly 100 k QPS; performance scaled linearly with additional workers until the NIC became the bottleneck (observed around 8 cores).
Open‑Source Release
The modified twemproxy code is hosted at https://github.com/meitu/twemproxy. The team also open‑sourced a Golang Kafka consumer group, a PHP Kafka consumer group, and an Ethereum‑based DPoS implementation.
Meitu Technology
Curating Meitu's technical expertise, valuable case studies, and innovation insights. We deliver quality technical content to foster knowledge sharing between Meitu's tech team and outstanding developers worldwide.