How Youzan Built a Highly Available Kubernetes Platform for Massive E‑commerce
This article explains why Youzan chose Kubernetes, describes their multi‑IDC, multi‑cluster architecture with high‑availability master components, logging and monitoring solutions, custom service exposure, image building process, lifecycle hooks, continuous delivery pipeline, operational challenges faced, and future plans such as operators and auto‑scaling.
Background
Youzan selected Kubernetes because it supports virtually all container workload types (stateless, stateful, batch, and DaemonSet) and has become the de facto standard for container orchestration. It improves resource utilization and makes development, testing, and operations more efficient.
Overall Architecture
The platform provides a web‑based operations console covering application deployment, scaling, rollback, blue‑green releases, CI/CD pipelines, and log and metric visualization.
Cluster Deployment
To achieve high availability, multiple IDC‑level Kubernetes clusters are deployed. Applications can be deployed across different IDC clusters simultaneously, and two clusters can run within the same IDC to avoid single‑cluster scheduling bottlenecks. One cluster may be self‑built while another uses a cloud provider, allowing rapid scaling during peak events like Double‑Eleven.
A custom component k8s-sync synchronizes container IPs to a unified ingress layer (yz7). If IP sync fails in one cluster, the operation fails fast without affecting other clusters.
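The per-cluster isolation described above can be sketched roughly as follows. This is a minimal illustration, not the actual k8s-sync code: the function name sync_all_clusters and the push_to_ingress callback are hypothetical stand-ins for whatever call updates yz7.

```python
from typing import Callable, Dict, List


def sync_all_clusters(
    cluster_ips: Dict[str, List[str]],
    push_to_ingress: Callable[[str, List[str]], None],
) -> Dict[str, str]:
    """Push each cluster's container IPs to the ingress layer independently.

    push_to_ingress is a hypothetical callback. A failure in one cluster is
    recorded immediately and does not block the remaining clusters.
    """
    results: Dict[str, str] = {}
    for cluster, ips in cluster_ips.items():
        try:
            push_to_ingress(cluster, ips)
            results[cluster] = "ok"
        except Exception as exc:
            # Fail fast for this cluster only; move on to the next one.
            results[cluster] = f"failed: {exc}"
    return results
```

Keeping the result per cluster makes it possible to alert on or retry only the cluster that failed.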
Master High Availability
etcd: A dedicated etcd cluster (3 or 5 nodes) stores all Kubernetes state. Running etcd on independent nodes simplifies rolling upgrades and reduces the maintenance burden.
kube-apiserver: Deployed behind a load balancer; because it is stateless, HA is straightforward.
kube-controller-manager and kube-scheduler: Multiple instances run; leader election via endpoint locking ensures a new leader takes over if the current one fails.
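The idea behind lock-based leader election can be illustrated with a small in-memory sketch. It is a simplification under stated assumptions: real Kubernetes components store the lock record in an API object and update it through the API server, whereas the LeaseLock class, its field names, and the 15-second lease used here are illustrative only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaseLock:
    """In-memory stand-in for a shared lock record (hypothetical)."""

    holder: Optional[str] = None
    renewed_at: float = 0.0
    lease_seconds: float = 15.0  # illustrative value

    def try_acquire(self, candidate: str, now: float) -> bool:
        """Acquire or renew the lock; return True if candidate is the leader."""
        expired = now - self.renewed_at > self.lease_seconds
        if self.holder is None or self.holder == candidate or expired:
            # Free, already ours, or the previous leader stopped renewing.
            self.holder = candidate
            self.renewed_at = now
            return True
        return False
```

Each instance calls try_acquire periodically. A standby instance only takes over once the current leader has failed to renew within the lease window.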
Logging
Two logging paths are used: stdout/stderr logs are collected by Filebeat and sent to Kafka; Java applications use a custom agent‑based solution originally built for VMs and adapted for containers.
Cluster Monitoring
Node monitoring continues to use Open‑Falcon. For containers, cAdvisor and kube‑state‑metrics collect resource data, while core components expose Prometheus metrics. All metrics are scraped by Prometheus and displayed in Grafana and the operations platform.
Application Monitoring
The platform provides CPU, memory, disk I/O, and network I/O metrics via cAdvisor and kube‑state‑metrics, plus alerts for container restarts, image pull failures, orphaned Pods, etc.
Service Exposure
Instead of Traefik, Youzan uses an internal ingress layer (yz7) with a custom k8s-sync component that watches Endpoints and syncs IPs to yz7. RPC services rely on macvlan to keep compatibility with legacy VM networking.
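A core step in any such sync loop is computing which IPs to add to and remove from the ingress layer whenever the Endpoints change. The sketch below shows only that diff step; the function name diff_endpoints is hypothetical and the sample addresses are made up.

```python
from typing import Iterable, Set, Tuple


def diff_endpoints(
    current: Iterable[str], desired: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Return (to_add, to_remove) so that current converges to desired.

    current: IPs registered in the ingress layer right now.
    desired: IPs listed in the latest Endpoints object.
    """
    current_set, desired_set = set(current), set(desired)
    to_add = desired_set - current_set
    to_remove = current_set - desired_set
    return to_add, to_remove
```

Applying only the difference keeps each update small and leaves unchanged backends untouched.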
Image Building
Images consist of three layers: OS, runtime, and application. For Python and Node.js apps, an app.yaml at the repo root defines these layers. Example configuration:
stack: youzanyun-centos6
runtime: python-2.7
entrypoint: gunicorn -c gunicorn_config.py wsgi:application
Pod Labels
Pods are labeled with application name, cluster name, environment, IDC, and release channel (gray/blue‑green) to facilitate management and future affinity/anti‑affinity rules.
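As a rough illustration of how such labels can be used, the sketch below builds a label set and checks it against an equality-based selector. The label keys (app, cluster, env, idc, release-channel) and the sample values are assumptions for this example; the article does not give the actual key names.

```python
from typing import Dict


def build_pod_labels(
    app: str, cluster: str, env: str, idc: str, channel: str
) -> Dict[str, str]:
    """Assemble the label set for a Pod (key names are hypothetical)."""
    return {
        "app": app,
        "cluster": cluster,
        "env": env,
        "idc": idc,
        "release-channel": channel,
    }


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Equality-based selection: every selector pair must be present."""
    return all(labels.get(key) == value for key, value in selector.items())


labels = build_pod_labels("trade-api", "cluster-1", "prod", "idc-a", "gray")
print(matches({"app": "trade-api", "release-channel": "gray"}, labels))
```

The same matching rule is what lets a release tool pick out, for example, only the gray-release Pods of one application in one IDC.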
Lifecycle Hooks
PostStart hooks run preload and online scripts after health checks pass; PreStop hooks execute offline and stop scripts to gracefully shut down services.
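In a Pod spec, these hooks live under the container's lifecycle field as postStart and preStop handlers. The following sketch generates such a fragment. The script paths are hypothetical placeholders, and chaining the scripts with && is one possible design choice, not necessarily what Youzan does.

```python
import json
from typing import Dict, List


def lifecycle_hooks(post_start: List[str], pre_stop: List[str]) -> Dict:
    """Build the lifecycle fragment of a container spec.

    Scripts in each list run in order; && stops the chain on the first failure.
    """

    def chain(scripts: List[str]) -> Dict:
        return {"exec": {"command": ["/bin/sh", "-c", " && ".join(scripts)]}}

    return {"postStart": chain(post_start), "preStop": chain(pre_stop)}


# Script paths below are placeholders for illustration only.
hooks = lifecycle_hooks(
    ["/opt/app/preload.sh", "/opt/app/online.sh"],
    ["/opt/app/offline.sh", "/opt/app/stop.sh"],
)
print(json.dumps({"lifecycle": hooks}, indent=2))
```

Running the offline script before the stop script lets the instance be taken out of traffic before the process exits, provided the Pod's termination grace period is long enough for both scripts to finish.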
Continuous Delivery
CI/CD is implemented per project environment, with each environment deployed to a separate Kubernetes cluster and isolated by namespaces.
Multi‑Cluster Management
A custom management platform allows administrators to create clusters, add or remove nodes, tag nodes, and monitor resource usage across both self‑built and public‑cloud clusters.
Issues Encountered
Java workloads saw an inaccurate CPU core count inside containers. This was first worked around with an LXCFS hack and later resolved by upgrading the JDK.
CrashLoopBackOff debugging: a special debug mode disables health checks and lifecycle hooks so that the Pod starts and can be inspected.
Problematic Pods are isolated by applying special labels that detach them from their Deployment while preserving their state for investigation.
Container dependency ordering: some sidecar containers must be running before the main business container starts; a “rich container” approach ensures ordered startup.
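The ordered-startup requirement in the last item can be sketched as a small launcher that waits for the sidecar before starting the main process. This is only an illustration of the ordering logic, assuming a readiness probe is available as a callable. The function names and default timings are hypothetical and do not describe Youzan's rich-container implementation.

```python
import time
from typing import Callable


def wait_until_ready(
    probe: Callable[[], bool],
    timeout: float = 30.0,
    interval: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll probe until it returns True or the timeout elapses."""
    deadline = clock() + timeout
    while True:
        if probe():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def start_in_order(
    sidecar_ready: Callable[[], bool],
    start_main: Callable[[], None],
    **wait_options,
) -> bool:
    """Start the main process only after the sidecar reports ready."""
    if not wait_until_ready(sidecar_ready, **wait_options):
        return False  # sidecar never became ready; main is not started
    start_main()
    return True
```

Returning False instead of starting the main process anyway makes a missing sidecar visible as a startup failure rather than a silent partial start.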
Future Outlook
Plans include adopting Kubernetes Operators for better workload management, enabling Horizontal and Vertical Pod Autoscaling, and improving scheduling granularity to boost cluster utilization.
Youzan Coder
Official Youzan tech channel, delivering technical insights and occasional daily updates from the Youzan tech team.
