
Inside Efficient Ops: Docker, Redis Cluster, and Container Resource Management

The article records a Q&A session from the Efficient Operations talk, where guest Peng Zhef of Mango TV discusses Docker, Redis Cluster, container resource controls, and related operational challenges, providing practical insights into Python/Flask development, pod scaling, and Linux cgroup tuning.


This article compiles the discussion from the “Efficient Operations” talk series, featuring guest speaker Peng Zhef, core technology lead at Mango TV, and discussants Hou Junwei (Meituan) and Deng Lei (Touch).

Main Participants

Peng Zhef @ Mango TV (guest speaker)

Hou Junwei @ Meituan – Beijing (discussant)

Deng Lei @ Touch – Beijing (discussant)

Guest Introduction

Peng Zhef leads the core technology team of Mango TV's platform department, focusing on Docker and Redis Cluster infrastructure. He previously worked on Douban App Engine and Kingsoft Kuaipan, with extensive system engineering experience.

Q1: Web side language and architecture? Is it Flask? A1: We use Python and Flask.
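The web side is Python and Flask, as stated above. As a minimal sketch of what such a service entry point looks like (the route and payload here are illustrative, not Mango TV's actual code):

```python
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/health")
def health():
    # A typical liveness endpoint an ops-facing Flask service would expose.
    return jsonify(status="ok")

if __name__ == "__main__":
    app.run()
```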

Q2: How do server and client communicate – Socket or other? A2: If the client is the Eru API, we use HTTP.
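Since clients talk to the Eru API over plain HTTP, a caller only needs to construct an ordinary JSON request. The sketch below shows the shape of such a call; the endpoint path and field names are hypothetical, not Eru's actual API:

```python
import json
from urllib import request

def build_deploy_request(base_url, image, ncontainers):
    """Build an HTTP POST for a hypothetical Eru-style deploy endpoint.

    The "/api/deploy" path and the payload keys are illustrative
    placeholders, not the real Eru API surface.
    """
    payload = {"image": image, "ncontainer": ncontainers}
    return request.Request(
        url=base_url + "/api/deploy",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_deploy_request("http://eru.example.internal", "redis:3.0", 3)
```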

Q3: When creating a Redis Cluster, do you use a pre-built image or a Dockerfile? A3: We build the image once; subsequent deployments pull the image directly.

Q4: Does the system still have bottlenecks? Where and how to address them? A4: Currently we see no obvious bottlenecks.

Q5: What does “Eru gains container group resource control capability” mean? A5: It refers to controlling a group of Pods.

Q6: How many physical machines constitute a Pod group? A6: There is no strict limit; we have tested resource allocation for up to 10,000 machines, and the Redis Cluster currently runs on about 20 Pods.

Q7: Has the business side started customizing the scaling component? What challenges arose? A7: Redis Cluster scaling is done via the Eru API; the process is straightforward because we use simple HTTP and RESTful APIs.

Q8: For CPU limits on new instances, do you use --cpuset-cpus or --cpu-shares? Are "fragment cores" and "independent cores" referring to CPU threads? A8: Both --cpuset-cpus and --cpu-shares are used. A container with three CPUs may have one shared core limited to 20% usage; if no other containers are bound to that core, it can use the full core.

When other containers bind to the same core, the container can only use up to its allocated share during peak periods, guaranteeing at least the minimum share.
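This behaviour follows from how cpu-shares work: they are relative weights, not hard caps, so the guaranteed fraction under contention is one container's shares divided by the total shares on the core. A small sketch of that arithmetic (the share values are illustrative, chosen to reproduce the 20% floor described above):

```python
def share_under_contention(own_shares, other_shares):
    """Fraction of one core a container is guaranteed when every
    container bound to that core is busy.

    cpu-shares are relative weights: alone on the core, a container
    may use the whole core; under contention, it gets its weight's
    proportion of the total.
    """
    total = own_shares + sum(other_shares)
    return own_shares / total

# Alone on the core: shares impose no ceiling, the full core is usable.
alone = share_under_contention(205, [])
# One busy neighbour with the remaining weight: roughly the 20% floor.
contended = share_under_contention(205, [819])
```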

Q9: Are kernel parameters such as vm.overcommit_memory modified inside the Redis container? How? A9: Yes, we use a startup script run as root to adjust such parameters, employing some Docker "black magic". The script is available at the provided GitHub link.
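The linked script is not reproduced here, but the general pattern of such a root entrypoint is simple: before exec'ing redis-server, write the kernel settings Redis warns about into procfs. The sketch below is an assumption about the approach, not the actual script; inside a container it only works if /proc/sys (or an overlay of it) is writable:

```python
import os

# Settings Redis's startup log commonly warns about; the procfs
# paths are the standard ones, values are Redis's suggested defaults.
REDIS_SYSCTLS = {
    "vm/overcommit_memory": "1",
    "net/core/somaxconn": "511",
}

def apply_sysctls(settings, root="/proc/sys"):
    """Write each sysctl value under `root` (parameterised so the
    logic can be exercised against a scratch directory)."""
    for rel_path, value in settings.items():
        path = os.path.join(root, rel_path)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as f:
            f.write(value)
```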

Q10: Do you bind the container's /proc/sys to the host's /writable-proc/sys to modify host files? A10: No. Docker mounts /proc as read-only and the container's /proc is isolated; we use a writable-proc overlay so changes can be made inside the container without affecting the host.

Q11: How do you handle MooseFS single‑point failures? A11: In practice, we simply restart the service.

Q12: Do containers have individual IPs? How do you ensure Redis Cluster masters and slaves aren’t placed on the same physical machine? A12: Using macvlan, each container gets its own IP; our cluster uses a /16 subnet, theoretically supporting over 60 k containers.
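The "over 60k containers" figure follows directly from the /16 prefix: a /16 has 2^16 = 65,536 addresses, of which 65,534 are usable hosts. A quick check with the standard library (the example prefix is arbitrary, not their actual subnet):

```python
import ipaddress

# Any /16 network; the concrete prefix here is an illustrative choice.
subnet = ipaddress.ip_network("10.100.0.0/16")

# Subtract the network and broadcast addresses to get usable host IPs,
# one per macvlan-attached container.
usable_hosts = subnet.num_addresses - 2
```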

Q13: You rewrote the Redis tools for auto-scaling and monitoring; what issues did the native tools have? A13: The native tooling is written in Ruby; as a Python team we found it a poor fit, so we reimplemented what we needed in Python.
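Any such rewrite has to reproduce the core bookkeeping of the native cluster tooling, such as dividing the 16,384 hash slots evenly across masters. The sketch below shows that one step; it is an illustration of the problem a Python reimplementation faces, not Mango TV's actual tool:

```python
TOTAL_SLOTS = 16384  # fixed hash-slot count in Redis Cluster

def split_slots(n_masters):
    """Assign the 16384 slots to n_masters as contiguous ranges,
    spreading the remainder over the first few masters."""
    base, extra = divmod(TOTAL_SLOTS, n_masters)
    ranges, start = [], 0
    for i in range(n_masters):
        count = base + (1 if i < extra else 0)
        ranges.append((start, start + count - 1))
        start += count
    return ranges
```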

--- End of Q&A ---

Written by

Efficient Ops

This public account is maintained by Xiaotianguo and friends and regularly publishes widely read original technical articles. We focus on the transformation of operations work and aim to grow alongside you throughout your operations career.
