The Journey of Containerization at Ximalaya: Practices, Principles, and Lessons Learned
This article recounts Ximalaya's multi‑year containerization effort, detailing the evolution from early Docker templates and Marathon to Kubernetes, the development of internal tools like barge and k8s‑sync, health‑check strategies, deployment patterns, and the practical lessons gained from integrating containers with existing middleware.
Ximalaya's containerization journey began in late 2016 with a Docker-based project template on a Jenkins machine: developers cloned the template, modified the source, CPU, and memory settings, and triggered a shell script.
The initial pipeline used Maven to build a WAR or JAR, assembled a Dockerfile, built the image, pushed it to a registry, and deployed it via Marathon. This rudimentary version served as the foundation for later evolution.
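The steps of that early pipeline can be sketched roughly as follows. This is a minimal illustration, not the actual Ximalaya script: the helper names, the base image, the artifact path, and the assumption of a runnable JAR are all hypothetical. The Marathon app definition uses the standard fields (id, cpus, mem, instances, container) that Marathon accepts on its /v2/apps endpoint.

```python
def render_dockerfile(base_image, artifact):
    """Render a minimal Dockerfile that copies a Maven-built JAR into the image.

    Assumes a runnable JAR under target/; a WAR would need a servlet container.
    """
    lines = [
        f"FROM {base_image}",
        f"COPY target/{artifact} /app/{artifact}",
        f'CMD ["java", "-jar", "/app/{artifact}"]',
    ]
    return "\n".join(lines) + "\n"


def pipeline_commands(image):
    """Return the build steps, in order, as argument lists (not executed here)."""
    return [
        ["mvn", "clean", "package"],            # build the WAR/JAR
        ["docker", "build", "-t", image, "."],  # build the image from the Dockerfile
        ["docker", "push", image],              # push to the registry
    ]


def marathon_app(name, image, cpus, mem_mb, instances=1):
    """Build a Marathon app definition with the developer-chosen CPU and memory."""
    return {
        "id": f"/{name}",
        "cpus": cpus,
        "mem": mem_mb,
        "instances": instances,
        "container": {"type": "DOCKER", "docker": {"image": image}},
    }
```

In a sketch like this, the developer only supplies the project name, CPU, and memory; everything else is derived by the template, which matches the intent that developers should not have to write Dockerfiles themselves.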
Key principles were established early: developers should not need to write Dockerfiles or understand containers; in test environments, containers can be accessed directly, with IP connectivity across machines; the Kubernetes clusters (Test, UAT, Production) are kept separate, each with its own independent, environment-specific configuration; and containers that fail to start retain their state for debugging.
Migration from Marathon to Kubernetes was accompanied by an internal Docker release system that abstracted the differences between the two, and by the creation of the barge command-line tool. Developers add a barge.yaml file that defines the project name and settings, then run barge deploy to publish, or barge exec -m $projectName to enter the container for debugging. The tool uses Google Jib to build images directly from code, eliminating the need for Docker on developer machines.
The name follows from the Harbor image registry: a harbor is a port and a barge is a cargo boat, which reflects the metaphor of ships delivering containers.
To reconcile the “one process per container” ideal with early adoption constraints, Ximalaya used runit as an entrypoint to manage multiple processes such as SSH and a custom nile process that registers service IPs in ZooKeeper, enabling Nginx upstream updates.
Health‑check strategies evolved: each web service exposed a /healthcheck endpoint for readiness; RPC services later required port checks. After several iterations, readiness probes were simplified back to HTTP checks, while liveness probes were eventually omitted in favor of readiness‑only monitoring.
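As an illustration of the HTTP readiness approach, here is a minimal sketch of a service exposing /healthcheck together with a probe function that mimics how an HTTP readiness check decides success. The names ReadyState, make_server, and probe are hypothetical and not part of Ximalaya's code; the 200 to 399 success range matches how Kubernetes HTTP probes interpret status codes.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class ReadyState:
    """Mutable flag the service flips once it has finished starting up."""

    def __init__(self):
        self.ready = False


def make_server(state, port=0):
    """Create an HTTP server whose /healthcheck reflects the readiness flag."""

    class Handler(BaseHTTPRequestHandler):
        def do_GET(self):
            if self.path != "/healthcheck":
                code, body = 404, b"not found"
            elif state.ready:
                code, body = 200, b"ok"
            else:
                code, body = 503, b"starting"
            self.send_response(code)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the example quiet

    return HTTPServer(("127.0.0.1", port), Handler)


def probe(url, timeout=1.0):
    """Return True if the endpoint answers with a 2xx or 3xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 400
    except (urllib.error.URLError, OSError):
        return False  # HTTP errors, refused connections, and timeouts all fail


if __name__ == "__main__":
    state = ReadyState()
    server = make_server(state)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    url = f"http://127.0.0.1:{server.server_address[1]}/healthcheck"
    print(probe(url))   # not ready yet
    state.ready = True
    print(probe(url))   # ready
    server.shutdown()
```

A readiness-only setup like this removes a pod from traffic while it reports not ready, without restarting it, which is consistent with the decision described above to drop liveness probes.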
Deployment practices shifted to using two Deployments per project (old and new) to support gradual rollouts, with a plan to replace this pattern with the CloneSet CRD from Alibaba's OpenKruise project. The internal release platform hides Kubernetes details from developers and provides APIs for publishing, rollback, and scaling.
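The two-Deployment pattern amounts to shifting replicas from the old Deployment to the new one in batches. The following sketch computes the sequence of replica counts for such a rollout; the function name and the fixed batch size are assumptions for illustration, and the real platform may pace or gate batches differently.

```python
def rollout_plan(total, batch):
    """Return the (old_replicas, new_replicas) pair after each rollout step.

    The total replica count stays constant; each step moves up to `batch`
    replicas from the old Deployment to the new one.
    """
    if total < 0 or batch <= 0:
        raise ValueError("total must be >= 0 and batch must be > 0")
    plan = []
    new = 0
    while new < total:
        new = min(total, new + batch)
        plan.append((total - new, new))
    return plan


if __name__ == "__main__":
    for old, new in rollout_plan(total=5, batch=2):
        print(f"old={old} new={new}")
```

Under this model a rollback at any step is simply scaling the new Deployment back to zero and the old one back to the total, which is one reason keeping both Deployments around is convenient.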
The k8s‑sync component watches pod state changes, invokes upstream/downstream registration APIs, synchronizes pod information to MySQL, and is monitored via Prometheus alerts. It also ensures zero‑downtime upgrades by leveraging pod preStop hooks and optional traffic‑draining checks.
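The core of such a sync loop can be sketched as an event handler that registers a pod when it becomes able to serve and deregisters it otherwise. This is a simplified model, not the k8s-sync implementation: PodEvent, InMemoryRegistry, and handle are hypothetical names, the registry stands in for the real upstream/downstream registration APIs, and only the event type strings (ADDED, MODIFIED, DELETED) follow the Kubernetes watch API.

```python
from dataclasses import dataclass


@dataclass
class PodEvent:
    type: str                  # "ADDED", "MODIFIED", or "DELETED"
    app: str                   # project the pod belongs to
    ip: str                    # pod IP; empty until one is assigned
    ready: bool                # readiness probe result
    terminating: bool = False  # True once deletion has been requested


class InMemoryRegistry:
    """Stand-in for the registration APIs the sync component would call."""

    def __init__(self):
        self.endpoints = {}

    def register(self, app, ip):
        self.endpoints.setdefault(app, set()).add(ip)

    def deregister(self, app, ip):
        self.endpoints.get(app, set()).discard(ip)


def handle(event, registry):
    """Register pods that can serve traffic; deregister all others."""
    serving = (
        event.type != "DELETED"
        and event.ready
        and not event.terminating
        and bool(event.ip)
    )
    if serving:
        registry.register(event.app, event.ip)
    elif event.ip:
        registry.deregister(event.app, event.ip)
    return serving
```

Deregistering as soon as a pod is marked terminating, while a preStop hook delays the actual shutdown, is what gives in-flight traffic time to drain in a design of this kind.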
Reflecting on five years of practice, the team emphasizes the heavy effort required to integrate containers with existing middleware, the importance of developer‑friendly tooling (e.g., barge, container cloud platform, wrench diagnostics), and the cultural shift needed for successful container adoption.
Architect's Guide
Dedicated to sharing programmer-architect skills—Java backend, system, microservice, and distributed architectures—to help you become a senior architect.
