
Design and Implementation of a High‑Concurrency Proxy for the Weibo Recommendation Engine

This article describes the background, challenges, technical research, architectural design, module implementation, performance testing, and lessons learned when building a Golang‑based, high‑concurrency, easily extensible proxy to replace twemproxy for Weibo's recommendation system.

Architect

1 Background

Data is the foundation of Weibo's recommendation engine; both raw data from platform departments and derived data such as keywords and candidate sets are constantly read and written. Over time the data layer was abstracted into import, storage, and external access models. Import uses a message middleware with memcache‑style pub/sub, storage relies on Redis and Lushan, and external access is mediated by twemproxy.

Layer‑7 business proxies route requests to backend App or Data server clusters; all cloud DB or distributed DB services pass through such proxies.

2 Problems

As Weibo's business grew, several issues with twemproxy emerged:

Lushan keys embed a DB number (e.g., "$db-$uid"); twemproxy can parse them only if the upstream hashes and concatenates the full key, coupling hash logic to callers.

twemproxy only supports Redis and memcache protocols; extending it to MySQL, MongoDB, HTTP, etc., is costly.

The asynchronous callback model tightly couples proxy internals with business code, scattering logic across many places.

Routing is single‑layer; mixed cache‑and‑DB scenarios (e.g., cache‑miss fallback to MySQL) require additional orchestration that twemproxy does not provide.

Technical drawbacks also include a single‑process design that underutilizes multi‑core servers and a C implementation with custom data structures that increase maintenance risk.

3 Technical Research

Driven by both business demand and technical considerations, two open‑source proxies were evaluated besides twemproxy:

McRouter (Facebook, C++0x) – memcache only, uses lightweight fibers, supports cross‑datacenter fault tolerance and logging.

Codis (Wandoujia, Go) – Redis only, leverages ZooKeeper for configuration, provides transparent sharding and resharding.

Both have merits but were deemed unsuitable for direct extension; a custom solution was preferred.

4 Proxy Design

The new proxy must support the Redis, memcache, and HTTP protocols and backend services including Redis, memcache, and MySQL, with configurable hash rules (modulo, consistent hash) and two‑tier storage (a hot cache layer plus a persistent layer).

4.1 Configuration

Inspired by twemproxy's YAML configuration, the proxy uses TOML. In the server section, connection‑pool lists are separated by four hyphens to mark different data‑center groups.
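As a rough illustration, such a TOML file might look like the sketch below; every key name and the comment marking the data‑center split are assumptions, not the proxy's actual schema:

```toml
# Hypothetical proxy configuration; key names are illustrative only.
[proxy]
listen      = "0.0.0.0:6390"
protocol    = "redis"          # redis | memcache | http
hash        = "consistent"     # modulo | consistent
worker_pool = 256

[[backend]]
name    = "cache-tier"
servers = [
  "10.0.0.1:6379",
  "10.0.0.2:6379",
  # ---- servers below this separator belong to a second data-center group
  "10.1.0.1:6379",
]
```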

4.2 Logging

IO‑intensive logging requires buffering and asynchronous writes; the Go library seelog is adopted for size‑ or time‑based log rotation and automatic retention.

4.3 Monitoring

A simple HTTP endpoint exposes metrics such as current connections, QPS, and request counts for the existing monitoring platform.

4.4 Proxy Modules

The proxy is organized into the following modules, each a Go package:

protocol : parses Redis and memcache commands using bufio.Reader.

hash : defines type HashFunc func(key string) (dbIndexNum int, serverPosition int, err error) and implements modulo, consistent hash, etc.

tunnel : handles a client connection, parses requests, and dispatches work using goroutine‑based workers.

entry : provides TCP and HTTP entry points; TCP uses a task queue with a goroutine pool, HTTP adds gzip/deflate support and context propagation.

conn‑pool : maintains fixed‑size pools per backend, handles failures, heartbeats, and cross‑datacenter failover.

common : utilities for logging, configuration, MySQL client, monitoring, error handling, etc.

business logic : main package loads config, starts services, and gracefully shuts down.

4.5 Business Workflow

For a multi‑key read (e.g., Redis MGET), keys are hashed into K groups, K goroutines issue parallel MGETs to the appropriate backends, each result is sent through a channel together with the original positions of its keys, and the responses are merged back into the caller's key order before replying.

The same pattern can be recursively applied for two‑tier storage, where a miss in the first tier triggers a second‑tier lookup and the result is written back to the cache.

5 Performance

Benchmarking progressed from redis‑benchmark and memtier_benchmark demos to high‑concurrency Python scripts; the ultimate bottleneck is the backend storage I/O, with in‑memory stores outperforming file‑system stores. The proxy aims to keep request latency comparable to direct storage access.

6 Summary

The proxy is now deployed across many services; most business changes can be accommodated with little or no code modification. Remaining challenges include abstracting the framework further (akin to Hadoop's MapReduce) and evolving the proxy beyond simple routing to support sharding, distributed transactions, and locks.

Tags: Distributed Systems, Proxy, golang, Recommendation Engine, high concurrency, Weibo