
How Youzan Built a Highly Available Kubernetes Platform for Massive E‑commerce

This article explains why Youzan chose Kubernetes, describes their multi‑IDC, multi‑cluster architecture with high‑availability master components, logging and monitoring solutions, custom service exposure, image building process, lifecycle hooks, continuous delivery pipeline, operational challenges faced, and future plans such as operators and auto‑scaling.

Youzan Coder

Background

Youzan selected Kubernetes because it supports virtually all container workload types (stateless services, stateful services, batch jobs, and per-node daemons via DaemonSet) and has become the de facto standard for container orchestration. It also improves resource utilization and makes development, testing, and DevOps work more efficient.

Overall Architecture

The platform provides a web‑based operations console covering application deployment, scaling, rollback, blue‑green releases, CI/CD pipelines, and log and metric visualization.

Cluster Deployment

To achieve high availability, Youzan deploys multiple Kubernetes clusters at the IDC level. An application can be deployed to clusters in different IDCs at the same time, and two clusters can run within the same IDC so that no single cluster becomes a scheduling bottleneck. One cluster may be self-built while another runs on a cloud provider, which allows rapid scale-out during peak events such as Double Eleven (the November 11 shopping festival).

A custom component k8s-sync synchronizes container IPs to a unified ingress layer (yz7). If IP sync fails in one cluster, the operation fails fast without affecting other clusters.
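The isolation property described above can be sketched in a few lines. This is a minimal illustration, not Youzan's k8s-sync code: the `SyncError` type, the `push` callback, and the cluster names are all hypothetical stand-ins for whatever the real component uses to talk to yz7.

```python
from typing import Callable, Dict, List


class SyncError(Exception):
    """Raised when pushing a cluster's IPs to the ingress layer fails."""


def sync_clusters(
    cluster_ips: Dict[str, List[str]],
    push: Callable[[str, List[str]], None],
) -> Dict[str, str]:
    """Push each cluster's container IPs independently.

    A failure ends that cluster's sync immediately (no retries here),
    and the loop still processes the remaining clusters.
    """
    results: Dict[str, str] = {}
    for cluster, ips in cluster_ips.items():
        try:
            push(cluster, ips)
            results[cluster] = "ok"
        except SyncError as exc:
            results[cluster] = f"failed: {exc}"
    return results


def demo_push(cluster: str, ips: List[str]) -> None:
    # Hypothetical stand-in for the real call into yz7.
    if cluster == "idc-b":
        raise SyncError("ingress unreachable")


status = sync_clusters(
    {"idc-a": ["10.0.0.1"], "idc-b": ["10.1.0.1"], "idc-c": ["10.2.0.1"]},
    demo_push,
)
```

Here `idc-b` reports a failure while `idc-a` and `idc-c` still complete, which is the behavior the paragraph describes.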

Master High Availability

etcd: A dedicated etcd cluster (3 or 5 nodes) stores all Kubernetes state. Using independent etcd nodes simplifies rolling upgrades and reduces maintenance burden.

kube-apiserver: Deployed behind a load balancer; being stateless makes HA straightforward.

kube-controller-manager and kube-scheduler: Multiple instances run; leader election via endpoint locking ensures a new leader takes over if the current one fails.
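The leader-election idea for kube-controller-manager and kube-scheduler can be illustrated with a small in-memory model. This is a simplified sketch of lease-based locking in general, not the actual Kubernetes implementation: the `LockRecord` class, the candidate names, and the 15-second lease are illustrative assumptions, and the real mechanism also relies on atomic compare-and-update writes against the API server, which are omitted here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LockRecord:
    """Shared record that every candidate reads and writes."""
    holder: Optional[str] = None
    renew_time: float = 0.0
    lease_seconds: float = 15.0  # illustrative lease duration


def try_acquire_or_renew(lock: LockRecord, candidate: str, now: float) -> bool:
    """Return True if `candidate` holds the lock after this attempt."""
    expired = now - lock.renew_time > lock.lease_seconds
    if lock.holder is None or lock.holder == candidate or expired:
        lock.holder = candidate
        lock.renew_time = now
        return True
    return False


lock = LockRecord()
first = try_acquire_or_renew(lock, "scheduler-a", now=0.0)    # a takes the lock
blocked = try_acquire_or_renew(lock, "scheduler-b", now=5.0)  # lease still valid
# scheduler-a crashes and stops renewing, so its lease runs out.
takeover = try_acquire_or_renew(lock, "scheduler-b", now=21.0)
```

The standby instance is refused while the lease is fresh and takes over once the previous leader has failed to renew for longer than the lease duration.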

Logging

Two logging paths are used: stdout/stderr logs are collected by Filebeat and sent to Kafka; Java applications use a custom agent‑based solution originally built for VMs and adapted for containers.

Cluster Monitoring

Node monitoring continues to use Open‑Falcon. For containers, cAdvisor and kube‑state‑metrics collect resource data, while core components expose Prometheus metrics. All metrics are scraped by Prometheus and displayed in Grafana and the operations platform.

Application Monitoring

The platform provides CPU, memory, disk I/O, and network I/O metrics via cAdvisor and kube‑state‑metrics, plus alerts for container restarts, image pull failures, orphaned Pods, etc.
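As a rough illustration of a container-restart alert, the sketch below parses Prometheus text-format samples of `kube_pod_container_status_restarts_total` (a metric exposed by kube-state-metrics) and flags Pods above a threshold. The sample data, namespace, Pod names, and threshold are invented for the example, and comparing the raw counter is a simplification: a production rule would more likely evaluate the increase over a time window inside Prometheus itself.

```python
import re
from typing import Dict, List

# Invented sample data in the Prometheus text exposition format.
SAMPLE = '''\
# TYPE kube_pod_container_status_restarts_total counter
kube_pod_container_status_restarts_total{namespace="prod",pod="shop-api-1",container="app"} 7
kube_pod_container_status_restarts_total{namespace="prod",pod="shop-api-2",container="app"} 0
kube_pod_container_status_restarts_total{namespace="prod",pod="cart-1",container="app"} 3
'''

LINE = re.compile(
    r'^kube_pod_container_status_restarts_total\{(?P<labels>[^}]*)\}\s+(?P<value>\S+)$'
)
LABEL = re.compile(r'(\w+)="([^"]*)"')


def pods_over_threshold(text: str, threshold: int) -> List[str]:
    """Return "namespace/pod" for every sample whose restart count exceeds threshold."""
    flagged: List[str] = []
    for line in text.splitlines():
        match = LINE.match(line)
        if not match:
            continue  # skip comments and unrelated metrics
        labels: Dict[str, str] = dict(LABEL.findall(match.group("labels")))
        if float(match.group("value")) > threshold:
            flagged.append(f'{labels["namespace"]}/{labels["pod"]}')
    return flagged


flagged = pods_over_threshold(SAMPLE, threshold=2)
```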

Service Exposure

Instead of Traefik, Youzan uses an internal ingress layer (yz7) with a custom k8s-sync component that watches Endpoints and syncs IPs to yz7. RPC services rely on macvlan to keep compatibility with legacy VM networking.
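The core of such a sync step is a set difference between the backends the ingress already knows and the ready addresses in the latest Endpoints object. The sketch below assumes that shape; the function names are hypothetical, the IPs are made up, and only the `subsets[].addresses[].ip` layout comes from the Kubernetes Endpoints API. How k8s-sync actually talks to yz7 is not described in the source.

```python
from typing import Set, Tuple


def ready_ips(endpoints: dict) -> Set[str]:
    """Collect ready Pod IPs from an Endpoints object, ignoring notReadyAddresses."""
    ips: Set[str] = set()
    for subset in endpoints.get("subsets") or []:
        for address in subset.get("addresses") or []:
            ips.add(address["ip"])
    return ips


def diff_backends(known: Set[str], observed: Set[str]) -> Tuple[Set[str], Set[str]]:
    """Return (to_add, to_remove) so the ingress matches the observed state."""
    return observed - known, known - observed


# Hypothetical Endpoints payload as delivered by a watch event.
endpoints = {
    "subsets": [
        {
            "addresses": [{"ip": "10.0.0.5"}, {"ip": "10.0.0.7"}],
            "notReadyAddresses": [{"ip": "10.0.0.9"}],
        }
    ]
}
to_add, to_remove = diff_backends({"10.0.0.5", "10.0.0.6"}, ready_ips(endpoints))
```

Only ready addresses are registered, so a Pod that is still failing its readiness check (10.0.0.9 here) never receives traffic from the ingress layer.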

Image Building

Images consist of three layers: OS, runtime, and application. For Python and Node.js apps, an app.yaml at the repo root defines these layers. Example configuration:

stack: youzanyun-centos6
runtime: python-2.7
entrypoint: gunicorn -c gunicorn_config.py wsgi:application
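To show how a build tool might read this file, here is a minimal sketch that parses the flat key/value form above and lists the three layers in build order. The mapping of `stack` to the OS layer, `runtime` to the runtime layer, and `entrypoint` to the application layer is an assumption inferred from the description, and the tiny parser handles only this flat format, not general YAML.

```python
from typing import Dict, List, Tuple

APP_YAML = """\
stack: youzanyun-centos6
runtime: python-2.7
entrypoint: gunicorn -c gunicorn_config.py wsgi:application
"""


def parse_flat_yaml(text: str) -> Dict[str, str]:
    """Parse "key: value" lines; split on the first colon only."""
    config: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        config[key.strip()] = value.strip()
    return config


def image_layers(config: Dict[str, str]) -> List[Tuple[str, str]]:
    """Return the layers bottom-up (assumed mapping, see lead-in)."""
    return [
        ("os", config["stack"]),
        ("runtime", config["runtime"]),
        ("application", config["entrypoint"]),
    ]


layers = image_layers(parse_flat_yaml(APP_YAML))
```

Splitting on the first colon matters because the entrypoint value itself contains one (`wsgi:application`).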

Pod Labels

Pods are labeled with application name, cluster name, environment, IDC, and release channel (gray/blue‑green) to facilitate management and future affinity/anti‑affinity rules.
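A small sketch of such a label set follows, together with the equality-based selector string accepted by `kubectl get pods -l`. The label keys (`app`, `cluster`, `env`, `idc`, `release-channel`) and all values are hypothetical; the source names the dimensions but not the exact keys.

```python
from typing import Dict


def pod_labels(app: str, cluster: str, env: str, idc: str, channel: str) -> Dict[str, str]:
    """Build the label set for a Pod (key names are illustrative)."""
    return {
        "app": app,
        "cluster": cluster,
        "env": env,
        "idc": idc,
        "release-channel": channel,
    }


def selector(labels: Dict[str, str]) -> str:
    """Render an equality-based label selector such as "app=shop-api,idc=idc-a"."""
    return ",".join(f"{key}={value}" for key, value in sorted(labels.items()))


labels = pod_labels("shop-api", "k8s-01", "prod", "idc-a", "gray")
gray_in_idc_a = selector({"idc": "idc-a", "release-channel": "gray"})
```

The resulting string could be used as `kubectl get pods -l idc=idc-a,release-channel=gray` to list only gray-release Pods in one IDC.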

Lifecycle Hooks

PostStart hooks run the preload and online scripts once the application's health check passes; PreStop hooks run the offline and stop scripts so that the service shuts down gracefully.
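The hook wiring can be sketched as a container spec fragment built in Python and serialized to JSON, which the Kubernetes API accepts alongside YAML. The field names `lifecycle`, `postStart`, `preStop`, and `exec.command` are real Kubernetes API fields; the image name and every script path are hypothetical. Since Kubernetes runs postStart immediately after the container is created, this sketch assumes a wrapper script that waits for the health check before going online.

```python
import json
from typing import Dict


def lifecycle_hooks(post_start_cmd: str, pre_stop_cmd: str) -> Dict[str, dict]:
    """Build the `lifecycle` section of a container spec using exec handlers."""
    return {
        "postStart": {"exec": {"command": ["/bin/sh", "-c", post_start_cmd]}},
        "preStop": {"exec": {"command": ["/bin/sh", "-c", pre_stop_cmd]}},
    }


container = {
    "name": "app",
    "image": "registry.example.com/shop-api:1.0",  # hypothetical image
    "lifecycle": lifecycle_hooks(
        # Hypothetical scripts: wait for health, warm up, then take traffic.
        "/opt/scripts/wait-healthy.sh && /opt/scripts/preload.sh && /opt/scripts/online.sh",
        # Hypothetical scripts: leave the load balancer, then stop the process.
        "/opt/scripts/offline.sh && /opt/scripts/stop.sh",
    ),
}

manifest_fragment = json.dumps(container, indent=2)
```

Chaining with `&&` means the service is only put online if the preload step succeeded, and is only stopped after it has been taken offline.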

Continuous Delivery

CI/CD is implemented per project environment, with each environment deployed to a separate Kubernetes cluster and isolated by namespaces.

Multi‑Cluster Management

A custom management platform allows administrators to create clusters, add or remove nodes, tag nodes, and monitor resource usage across both self‑built and public‑cloud clusters.

Issues Encountered

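Inaccurate CPU core count for Java workloads: Java applications read the wrong number of CPU cores inside containers. An initial workaround used LXCFS; the issue was ultimately resolved by upgrading the JDK.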
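The mismatch is easier to see with numbers. The sketch below derives an effective CPU count from a CFS quota and period, the values cgroup v1 exposes as `cpu.cfs_quota_us` and `cpu.cfs_period_us`. The function name and the round-up policy are assumptions made for illustration; they are not taken from the source or from any particular JDK version.

```python
import math


def effective_cpus(quota_us: int, period_us: int, host_cpus: int) -> int:
    """Estimate how many CPUs a container may actually use.

    A quota of -1 (or any non-positive value) means no limit is set,
    in which case the host's core count applies.
    """
    if quota_us <= 0 or period_us <= 0:
        return host_cpus
    limit = math.ceil(quota_us / period_us)  # round up partial cores (assumed policy)
    return max(1, min(host_cpus, limit))


limited = effective_cpus(quota_us=200_000, period_us=100_000, host_cpus=64)
half_core = effective_cpus(quota_us=50_000, period_us=100_000, host_cpus=64)
unlimited = effective_cpus(quota_us=-1, period_us=100_000, host_cpus=64)
```

A runtime that sizes its thread pools from the host's 64 cores while being limited to 2 would create far more threads than the container can schedule, which is the kind of inaccuracy this issue refers to.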

Debugging CrashLoopBackOff: a special debug mode disables health checks and lifecycle hooks so that a crashing Pod can still start and be inspected.
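One way such a debug mode could be implemented is to copy the Pod spec and drop the probe and hook fields before creating the Pod. The sketch below assumes that approach; the function name and sample spec are hypothetical, while `livenessProbe`, `readinessProbe`, `startupProbe`, and `lifecycle` are real container fields in the Kubernetes API.

```python
import copy

# Container fields removed in debug mode.
DEBUG_STRIPPED_FIELDS = ("livenessProbe", "readinessProbe", "startupProbe", "lifecycle")


def to_debug_pod_spec(pod_spec: dict) -> dict:
    """Return a copy of the spec with health checks and lifecycle hooks removed."""
    debug_spec = copy.deepcopy(pod_spec)  # leave the original spec untouched
    for container in debug_spec.get("containers", []):
        for field in DEBUG_STRIPPED_FIELDS:
            container.pop(field, None)
    return debug_spec


original = {
    "containers": [
        {
            "name": "app",
            "image": "registry.example.com/shop-api:1.0",  # hypothetical image
            "livenessProbe": {"httpGet": {"path": "/health", "port": 8080}},
            "lifecycle": {"preStop": {"exec": {"command": ["/opt/scripts/stop.sh"]}}},
        }
    ]
}
debug = to_debug_pod_spec(original)
```

Without a liveness probe the kubelet no longer restarts the container for failing its health check, which leaves time to open a shell in it and investigate.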

Isolating problematic Pods: applying special labels detaches a Pod from its Deployment while preserving its faulty state for investigation.
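The detachment works because controllers own Pods through label selectors: once a Pod's labels stop matching, the controller releases it and creates a replacement, and the original keeps running untouched. The sketch below models only the matching logic. The label keys, values, and the `-isolated` suffix are hypothetical; the source does not say which labels Youzan changes.

```python
from typing import Dict


def matches(selector: Dict[str, str], labels: Dict[str, str]) -> bool:
    """Equality-based selector match: every selector pair must be present in labels."""
    return all(labels.get(key) == value for key, value in selector.items())


def isolate(labels: Dict[str, str], selector_key: str, suffix: str = "-isolated") -> Dict[str, str]:
    """Return new labels in which one selector label no longer matches."""
    patched = dict(labels)
    patched[selector_key] = labels[selector_key] + suffix
    return patched


deployment_selector = {"app": "shop-api"}
pod = {"app": "shop-api", "env": "prod"}
isolated_pod = isolate(pod, "app")

owned_before = matches(deployment_selector, pod)
owned_after = matches(deployment_selector, isolated_pod)
```

If a Service selects on the same label, the isolated Pod also drops out of its endpoints, so it stops receiving traffic while it is being examined.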

Container dependency ordering: sidecar containers must be ready before the main business container starts; a “rich container” approach ensures ordered startup.
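Ordered startup of this kind can be sketched as a small supervisor loop: start a component, wait until it reports ready, then move on to the next. This is only an illustration of the ordering idea; the `Step` type, the `start_in_order` function, and the in-memory readiness flags are hypothetical, and a real supervisor would launch processes and sleep between readiness polls.

```python
from typing import Callable, List, Tuple

# (name, start function, readiness check)
Step = Tuple[str, Callable[[], None], Callable[[], bool]]


def start_in_order(steps: List[Step], max_polls: int = 10) -> List[str]:
    """Start each step only after the previous one has become ready."""
    started: List[str] = []
    for name, start, is_ready in steps:
        start()
        for _ in range(max_polls):
            if is_ready():
                break
        else:
            # The readiness check never passed within the allowed polls.
            raise RuntimeError(f"{name} did not become ready")
        started.append(name)
    return started


state = {"sidecar": False, "app": False}
order = start_in_order(
    [
        ("sidecar", lambda: state.update(sidecar=True), lambda: state["sidecar"]),
        ("app", lambda: state.update(app=True), lambda: state["sidecar"] and state["app"]),
    ]
)
```

If the sidecar never becomes ready, the loop raises before the business process is started, so the application cannot come up without its dependency.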

Future Outlook

Plans include adopting Kubernetes Operators for better workload management, enabling Horizontal and Vertical Pod Autoscaling, and improving scheduling granularity to boost cluster utilization.
