
Building a Custom Kubernetes Cluster on AWS for Hybrid‑Cloud Container Deployment at Ctrip

The article describes Ctrip's motivation, design choices, and implementation details for deploying a self‑managed Kubernetes cluster on AWS, covering network architecture with VPC and ENI, custom image distribution, logging and monitoring, data synchronization, and operational challenges such as ELB loopback and kubelet configuration.

Ctrip Technology

Introduction – As Ctrip expands internationally, its hybrid‑cloud platform needed a faster, more efficient way to deploy applications. Deploying containers on Kubernetes in AWS cut provisioning time from the minutes a VM takes to seconds, and simplified operations across multiple public‑cloud providers.

Why Build Kubernetes on AWS Instead of Using Managed Services – Ctrip required that containers be directly reachable by IP and keep fixed IP addresses. Neither the native Kubernetes abstractions (Service, Ingress) nor existing CNI plugins (Flannel, Calico) fully satisfied both requirements. A custom solution also kept the control plane under Ctrip's own management and aligned with the private‑cloud IDC stack, so Ctrip decided to build its own cluster rather than use Amazon EKS.

Network Design – The network is built on VPC and Elastic Network Interfaces (ENIs). Each ENI carries a primary private IP, optional secondary private IPs, a MAC address, and its own security‑group bindings. Two options were evaluated:

Single NIC (ENI) with multiple IPs – gives high pod density, but requires complex scheduling and security‑group handling.

Single NIC (ENI) with a single IP – isolates each pod on its own ENI, which is simpler and easier to manage, at the cost of lower pod density.

Ctrip selected the single‑NIC‑single‑IP approach because it is simpler and still gives sufficient resource utilization for its Java workloads. A global IP address management module (GIPAM) stores a one‑to‑one mapping between pod names (stable, because the pods belong to StatefulSets) and IPs, so a pod keeps the same IP across redeployments.
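The name‑to‑IP mapping can be illustrated with a minimal Python sketch. It is not Ctrip's GIPAM: the class name, the in‑memory storage, and the example CIDR and pod names are all assumptions made for illustration, and a real module would need persistent storage and would have to reserve the addresses that AWS keeps for itself in each subnet.

```python
import ipaddress


class StaticIPAllocator:
    """Hypothetical in-memory stand-in for GIPAM: one fixed IP per pod name."""

    def __init__(self, cidr):
        network = ipaddress.ip_network(cidr)
        # hosts() skips the network and broadcast addresses of the range.
        self._free = [str(ip) for ip in network.hosts()]
        self._by_pod = {}

    def allocate(self, pod_name):
        """Return the pod's existing IP, or bind the next free one to it."""
        if pod_name in self._by_pod:
            return self._by_pod[pod_name]
        if not self._free:
            raise RuntimeError("address pool exhausted")
        ip = self._free.pop(0)
        self._by_pod[pod_name] = ip
        return ip

    def release(self, pod_name):
        """Drop the binding, e.g. when the workload is removed for good."""
        ip = self._by_pod.pop(pod_name, None)
        if ip is not None:
            self._free.append(ip)


gipam = StaticIPAllocator("10.0.0.0/29")  # illustrative range
first = gipam.allocate("orders-0")        # illustrative StatefulSet pod names
second = gipam.allocate("orders-1")
# A redeployed pod keeps its StatefulSet name, so it gets the same IP back.
assert gipam.allocate("orders-0") == first
assert first != second
```

Keying the mapping on the pod name works only because StatefulSet pods keep a stable, ordinal‑based name when they are recreated; pods with generated names would receive a fresh binding on every redeployment.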

Image Management – Each private‑cloud IDC runs its own Harbor instance, and together they form a federated Harbor cluster. DNS hijacking directs Docker pushes to the Harbor in the local IDC, which then syncs asynchronously to the Harbors in the other IDCs. In AWS, a dedicated Harbor backed by S3 stores the images, and DNS hijacking again ensures that the same image domain is used in every environment.

Logging and Monitoring – The public‑cloud side mirrors the private‑cloud stack using Prometheus, Telegraf, InfluxDB, Grafana, and Hickwall. Metrics and alerts are processed in‑cloud and integrated with the NOC for 24/7 monitoring.

Data Synchronization from Private to Public Cloud – Large data transfers go over the public internet to avoid the cost of dedicated lines. The services involved are placed in public subnets with public IPs, so that their traffic does not pass through a NAT gateway and incur its data‑processing fees. In this architecture, data collection agents in public subnets push the data to AWS.

Operational Issues Encountered

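When targets are registered with an AWS ELB by instance ID, loopback routing is not supported: a request that a node sends through the load balancer cannot be delivered back to that same node. This caused API‑server timeouts; the issue was resolved by registering targets by IP address.

The kubelet max‑pods setting must be tuned to respect the limited number of ENIs per instance; otherwise the scheduler can place more pods on a node than there are ENIs to give them.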
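The max‑pods arithmetic can be sketched as follows, under the single‑NIC‑single‑IP design. The function names are hypothetical, and two things are assumptions rather than details from the article: that exactly one ENI is kept for the node itself, and that host‑network pods (which need no ENI of their own but still count toward the kubelet limit) are added on top. The ENI limit depends on the instance type and should be taken from AWS documentation; the value 8 below is only illustrative.

```python
def max_pods_single_ip_per_eni(eni_limit, node_enis=1, host_network_pods=0):
    """Pod limit for a node when every regular pod occupies a whole ENI.

    eni_limit:         ENIs the instance type can attach (from AWS docs).
    node_enis:         ENIs kept for the node itself (assumed: the primary ENI).
    host_network_pods: pods that share the node's network and need no ENI,
                       but still count toward the kubelet's pod limit.
    """
    if eni_limit <= node_enis:
        raise ValueError("no ENIs left over for pods")
    return (eni_limit - node_enis) + host_network_pods


def kubelet_config(eni_limit, host_network_pods=0):
    """Render a minimal KubeletConfiguration fragment with maxPods set."""
    limit = max_pods_single_ip_per_eni(
        eni_limit, host_network_pods=host_network_pods
    )
    return "\n".join([
        "apiVersion: kubelet.config.k8s.io/v1beta1",
        "kind: KubeletConfiguration",
        f"maxPods: {limit}",
    ])


# With an illustrative limit of 8 ENIs, one stays with the node and 7 go to pods.
print(kubelet_config(8))
```

The same value can be passed through the kubelet's --max-pods flag instead of the configuration file. Either way, the limit has to be derived per instance type, because ENI limits differ between instance sizes.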

Conclusion – The article outlines the background, requirements, and detailed design of Ctrip's self‑managed Kubernetes on AWS, covering network, image distribution, monitoring, data sync, and lessons learned, and invites further community exchange.

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
