How Kuaishou Built a Private Kubernetes Cloud with CI/CD and Service Mesh
This talk shares Kuaishou's practical experience in building a private Kubernetes container cloud, optimizing Docker image layers, using Helm for application deployment, implementing GitLab CI/CD pipelines, and adopting Istio service mesh for multi‑cluster service governance, highlighting challenges and solutions for real‑world adoption.
1. Building a Kubernetes‑based Container Cloud
Kuaishou faced two main challenges: instability of IDC networks and the need to integrate container technology with existing systems while maintaining service quality and resource utilization.
Key issues included cross‑IDC network jitter, disaster recovery, debugging difficulties in containers, short container lifecycles, and data migration problems.
To address these, Kuaishou adopted a pragmatic approach that reused existing tools and platforms, prioritizing simplicity, a gentle learning curve, and foolproof workflows for developers.
High availability / disaster recovery: HAProxy/DPVS
Data backup: etcd
Network: Calico/Flannel
DNS: CoreDNS
Proxy mode: IPVS
Storage driver: direct‑LVM
CI/CD: GitLab CI/CD on Kubernetes
Service orchestration: Kubernetes + Helm
Service Mesh: Istio
Application configuration management: Helm
Application store: Helm charts
Architecture diagrams illustrate the private cloud components and their interactions.
2. Optimizing Docker Image Construction
Reducing Dockerfile layers significantly speeds up image builds; merging commands can save tens of seconds per build, especially when many layers are present.
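To make the layer-merging idea concrete, here is a minimal Python sketch that collapses consecutive RUN instructions in a Dockerfile into a single RUN chained with `&&`. It is an illustration only, not Kuaishou's tooling: the function names and the sample Dockerfile are hypothetical, and the sketch assumes each instruction fits on one line (no backslash continuations in the input).

```python
def merge_run_instructions(dockerfile_text: str) -> str:
    """Collapse consecutive single-line RUN instructions into one RUN.

    Assumption: the input has no backslash line continuations.
    """
    merged = []
    pending = []  # commands collected from the current streak of RUN lines

    def flush():
        # Emit the collected commands as one RUN chained with '&&'.
        if pending:
            merged.append("RUN " + " && \\\n    ".join(pending))
            pending.clear()

    for line in dockerfile_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("RUN "):
            pending.append(stripped[4:].strip())
        else:
            flush()
            merged.append(line)
    flush()
    return "\n".join(merged)


def count_layer_instructions(dockerfile_text: str) -> int:
    """Count the instructions that add filesystem layers (RUN, COPY, ADD)."""
    prefixes = ("RUN ", "COPY ", "ADD ")
    return sum(
        1
        for line in dockerfile_text.splitlines()
        if line.strip().upper().startswith(prefixes)
    )


# Hypothetical Dockerfile used only for illustration.
SAMPLE = """FROM alpine:3.18
RUN apk update
RUN apk add --no-cache curl
RUN mkdir -p /app
COPY app.jar /app/app.jar
"""

if __name__ == "__main__":
    merged = merge_run_instructions(SAMPLE)
    print(merged)
    print("layers before:", count_layer_instructions(SAMPLE))
    print("layers after:", count_layer_instructions(merged))
```

In this sample the three RUN lines become one, so the layer-producing instructions drop from four to two. In practice the merge is usually done by hand, grouping commands that change together so the build cache stays useful.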
Using Alpine Linux as a base image reduces image size, but Alpine ships musl libc rather than glibc, so workloads that depend on glibc (for example, JDK 8) require additional steps such as installing glibc.
3. Deploying Applications with Helm
Helm provides templated, configurable deployments for complex Kubernetes applications, enabling versioned releases, history inspection, and easy rollbacks.
ChartMuseum can be used to host Helm charts, and Helm’s templating allows environment‑specific configurations without modifying the templates.
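The way environment-specific values layer on top of a chart's defaults can be sketched in Python. This is a simplified model of how Helm combines values files (nested maps are merged key by key, and later values replace earlier ones, with lists replaced wholesale); it does not cover every Helm rule, such as deleting a key by setting it to null. The chart values, registry name, and function name below are hypothetical.

```python
import copy


def merge_values(base: dict, override: dict) -> dict:
    """Return base with override layered on top, merging nested maps."""
    result = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            # Both sides are maps: merge recursively instead of replacing.
            result[key] = merge_values(result[key], value)
        else:
            # Scalars and lists from the override replace the base value.
            result[key] = copy.deepcopy(value)
    return result


# Hypothetical chart defaults (what a values.yaml might contain).
DEFAULTS = {
    "replicaCount": 1,
    "image": {"repository": "registry.example.com/demo", "tag": "1.0.0"},
    "resources": {"limits": {"cpu": "500m"}},
}

# Hypothetical production overrides (what a values-prod.yaml might contain).
PROD = {
    "replicaCount": 4,
    "image": {"tag": "1.2.3"},
}

if __name__ == "__main__":
    print(merge_values(DEFAULTS, PROD))
```

The templates themselves stay untouched: only the values change per environment, which is what allows the same chart to serve test and production releases.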
4. Implementing CI/CD with GitLab
GitLab CI/CD automates the pipeline from code commit through compilation, testing, image building, and deployment, integrating with Helm to produce versioned releases.
Quality gates such as unit tests, code style checks, and coverage thresholds are enforced before proceeding to packaging and deployment.
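A quality gate of this kind can be expressed as a small check that a pipeline job runs before packaging. The Python sketch below is illustrative only: the report fields, the 80% coverage threshold, and the function names are assumptions, not values from the talk.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PipelineReport:
    """Hypothetical summary produced by the test and lint stages."""
    unit_tests_failed: int
    style_violations: int
    coverage_percent: float


def failed_gates(report: PipelineReport, min_coverage: float = 80.0) -> List[str]:
    """Return the names of the gates that the report does not pass."""
    failures = []
    if report.unit_tests_failed > 0:
        failures.append("unit-tests")
    if report.style_violations > 0:
        failures.append("code-style")
    if report.coverage_percent < min_coverage:
        failures.append("coverage")
    return failures


def may_package(report: PipelineReport, min_coverage: float = 80.0) -> bool:
    """Packaging and deployment proceed only when every gate passes."""
    return not failed_gates(report, min_coverage)


if __name__ == "__main__":
    report = PipelineReport(unit_tests_failed=0, style_violations=2, coverage_percent=76.5)
    print(failed_gates(report))
    print(may_package(report))
```

A CI job could call such a check and exit with a non-zero status when any gate fails, which stops the later packaging and deployment stages from running.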
5. Service Mesh with Istio
Istio is adopted for service governance, supporting gRPC traffic, multi‑cluster service discovery, and advanced routing strategies.
Service mesh enables gradual rollouts, traffic splitting, and observability across hybrid environments that include both containerized and non‑containerized services.
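Traffic splitting for a gradual rollout is typically declared in an Istio VirtualService whose routes carry weights. The Python sketch below builds such a manifest as a plain dictionary and validates that the weights add up to 100. The service name, host, and subset names are hypothetical, and the sketch assumes the subsets are defined in a matching DestinationRule, which is not shown.

```python
import json
from typing import Dict


def weighted_virtual_service(name: str, host: str, subset_weights: Dict[str, int]) -> dict:
    """Build a VirtualService manifest that splits HTTP traffic across subsets."""
    if sum(subset_weights.values()) != 100:
        raise ValueError("route weights must add up to 100")
    routes = [
        {"destination": {"host": host, "subset": subset}, "weight": weight}
        for subset, weight in subset_weights.items()
    ]
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": name},
        "spec": {"hosts": [host], "http": [{"route": routes}]},
    }


if __name__ == "__main__":
    # Hypothetical canary: 90% of traffic to v1, 10% to v2.
    manifest = weighted_virtual_service(
        name="demo-rollout",
        host="demo.default.svc.cluster.local",
        subset_weights={"v1": 90, "v2": 10},
    )
    print(json.dumps(manifest, indent=2))
```

Shifting the rollout forward then amounts to regenerating the manifest with new weights (for example 50/50, then 0/100) and applying it.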
6. Multi‑Cluster Scheduling and Traffic Management
Kubernetes scheduling strategies are used to allocate resources across clusters within the same or different data centers, ensuring high availability and balanced load.
Istio’s traffic management features handle cross‑cluster communication and hybrid deployment scenarios.
Overall, Kuaishou’s practices demonstrate how to adopt a container cloud, CI/CD, Helm, and a service mesh efficiently in a large‑scale production environment.
Efficient Ops
This public account is maintained by Xiaotianguo and friends, regularly publishing widely-read original technical articles. We focus on operations transformation and accompany you throughout your operations career, growing together happily.