How We Built a Scalable Multi‑Lane Testing Environment with Kubernetes and Istio
To keep pace with rapid business growth, the team at Shouqianba (收钱吧) evolved its testing environments through several generations, from physically isolated setups to a Kubernetes-based multi-lane system built on Istio and custom operators. The result was lower resource usage, faster deployments, and better scalability and flexibility.
In fast‑moving development projects, multiple environments (development, testing, pre‑release) are essential for efficiency and risk reduction. 收钱吧 iterated its testing‑environment strategy over several versions to address scaling challenges.
Early Testing Environment
Before 2018, two physically isolated test environments were used, each with dedicated hardware and separate configuration files. Deployment relied on Jenkins + Docker, which was simple but suffered from lengthy manual processes, limited parallelism, and high hardware overhead.
Key pain points included:
Manual creation of Jenkins jobs and domain configurations for new services.
Long build‑and‑deploy cycles when switching branches.
Only two concurrent test versions per service.
Risk of configuration errors causing cross‑environment calls.
Difficulty adding third or fourth environments due to hardware and config constraints.
The team wanted faster deployments that need little specialist knowledge, easier branch switching, isolation between developers, and less repetitive work.
Multi‑Lane Environment 1.0
In 2018 the services migrated to a Kubernetes cluster and CI moved from Jenkins to GitLab CI, making the build pipeline more transparent. However, the core requirement—running multiple versions of the same service concurrently—remained unsolved.
Kubernetes-native networking offers two ways to run several versions of a service, and neither suits testing:
Put the same label on every versioned Pod behind one Service, in which case traffic is spread across the versions at random.
Create a separate Service per version, which forces downstream services to change their configuration.
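The trade-off between the two options can be seen in a small model of label-selector matching. This is a sketch in plain Python rather than the Kubernetes API, and the Pod names and labels are hypothetical.

```python
# Toy model of Kubernetes label-selector matching (not the real API).
# Pod names and labels below are hypothetical.

def selects(selector: dict, labels: dict) -> bool:
    """A Service selects a Pod when every selector pair appears in the Pod's labels."""
    return all(labels.get(key) == value for key, value in selector.items())

pods = {
    "reviews-v1-abc": {"app": "reviews", "version": "v1"},
    "reviews-v2-def": {"app": "reviews", "version": "v2"},
}

# Option 1: one Service selecting on the shared label matches every version,
# so a caller cannot choose which version answers.
shared = sorted(name for name, labels in pods.items()
                if selects({"app": "reviews"}, labels))

# Option 2: one Service per version isolates traffic, but each version then
# has its own Service name that callers must be reconfigured to use.
v2_only = sorted(name for name, labels in pods.items()
                 if selects({"app": "reviews", "version": "v2"}, labels))
```

Here `shared` contains both Pods while `v2_only` contains only the v2 Pod, which is why neither option alone gives per-request version selection behind a single name.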
Concept of Environment Lanes
Inspired by Alibaba’s “feature environment”, a logical “lane” concept was introduced. Services in the same lane can communicate directly; cross‑lane calls require a special identifier. Requests without a lane identifier default to the base lane.
Benefits of lane‑based design:
Multiple versions share a single domain; discovery is handled by the underlying platform.
Lanes can be created or removed on demand.
Traffic must carry a lane flag to reach a specific lane.
Each service propagates the lane identifier.
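Propagating the lane identifier means every service copies it from the inbound request onto its outbound calls. The sketch below shows one way this could look, assuming the identifier travels in an HTTP header named x-env-flag (the name used in this article's routing pseudocode). The function name is hypothetical, and the article does not describe how the RPC library actually does this.

```python
from typing import Optional

LANE_HEADER = "x-env-flag"  # header name taken from the article's routing pseudocode

def outbound_headers(inbound: dict, extra: Optional[dict] = None) -> dict:
    """Build headers for a downstream call, copying the lane flag if present."""
    headers = dict(extra or {})
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in inbound.items():
        if name.lower() == LANE_HEADER:
            headers[LANE_HEADER] = value
            break
    return headers
```

A request that arrives without the flag produces outbound calls without it, so the whole chain stays in the base lane.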
Implementation with Istio
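This design maps naturally onto a service mesh. After evaluating the options, the team chose Istio. The lane routing relies on two of Istio's traffic-management CRDs: VirtualService, which defines routing rules, and DestinationRule, which defines the subsets those rules route to.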
Example VirtualService configuration:
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: reviews-route
spec:
  hosts:
  - reviews.prod.svc.cluster.local
  http:
  - name: "reviews-v2-routes"
    match:
    - uri:
        prefix: "/wpcatalog"
    route:
    - destination:
        host: reviews.prod.svc.cluster.local
        subset: v2
  - name: "reviews-v1-route"
    route:
    - destination:
        host: reviews.prod.svc.cluster.local
        subset: v1
A DestinationRule defines the subsets that routes refer to. This example is for a different service (ratings) and shows the shape:
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: bookinfo-ratings
spec:
  host: ratings.prod.svc.cluster.local
  subsets:
  - name: testversion
    labels:
      version: v3
    trafficPolicy:
      loadBalancer:
        simple: ROUND_ROBIN
Routing logic simplified to:
if headers.x-env-flag == v2:
    route(subset_v2)
else if request.sourceLabels.version == v2:
    request.headers.x-env-flag = v2
    route(subset_v2)
else:
    route(subset_v1)
This approach required minimal changes to existing services: only an upgrade of the RPC library.
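The pseudocode above can be written as a small runnable function. This is only a sketch of the decision logic, not how Istio evaluates rules internally. The function name and parameters are hypothetical, with v2 as the lane and v1 as the default subset, as in the pseudocode.

```python
from typing import Mapping

LANE_HEADER = "x-env-flag"

def choose_subset(headers: dict, source_labels: Mapping[str, str],
                  lane: str = "v2", base: str = "v1") -> str:
    """Return the subset a request should be routed to."""
    # Case 1: the request already carries the lane flag.
    if headers.get(LANE_HEADER) == lane:
        return lane
    # Case 2: the caller itself runs in the lane, so stamp the flag
    # on the request and keep the call inside the lane.
    if source_labels.get("version") == lane:
        headers[LANE_HEADER] = lane
        return lane
    # Case 3: everything else goes to the default subset.
    return base
```

Note that the second case mutates `headers`, mirroring the assignment in the pseudocode, so later hops see the flag even when the original caller did not set it.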
A management platform named Volac was built to create lanes and generate the corresponding VirtualService and DestinationRule resources.
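As an illustration of what generating these resources might involve, the sketch below builds both resources as plain Python dictionaries for one service and one lane. It is not Volac's actual code: the function name, the resource naming scheme, the subset name `base`, and the use of a `version` label are all assumptions. The `headers`, `sourceLabels`, and `headers.request.set` fields are standard Istio VirtualService fields.

```python
def lane_resources(service, host, lane, base="base", header="x-env-flag"):
    """Return (VirtualService, DestinationRule) dicts for one service and one lane."""

    def route_to(subset):
        return [{"destination": {"host": host, "subset": subset}}]

    virtual_service = {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": f"{service}-lanes"},  # hypothetical naming scheme
        "spec": {
            "hosts": [host],
            "http": [
                # The request already carries the lane flag.
                {"match": [{"headers": {header: {"exact": lane}}}],
                 "route": route_to(lane)},
                # The caller runs in the lane: set the flag, then route to the lane.
                {"match": [{"sourceLabels": {"version": lane}}],
                 "headers": {"request": {"set": {header: lane}}},
                 "route": route_to(lane)},
                # Everything else goes to the base lane.
                {"route": route_to(base)},
            ],
        },
    }
    destination_rule = {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "DestinationRule",
        "metadata": {"name": f"{service}-lanes"},  # hypothetical naming scheme
        "spec": {
            "host": host,
            # One subset per lane, selected by a version label (an assumption).
            "subsets": [{"name": s, "labels": {"version": s}} for s in (base, lane)],
        },
    }
    return virtual_service, destination_rule
```

Calling `lane_resources("reviews", "reviews.prod.svc.cluster.local", "v2")` yields a VirtualService with three ordered HTTP routes and a DestinationRule with two subsets. Rule order matters because Istio applies the first matching route.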
Multi‑Lane Environment 2.0
While 1.0 solved many early issues, new challenges emerged, such as tight coupling between business logic and Istio, reliance on a single Service Mesh, and configuration storage that hindered multi‑cluster migration.
The management platform evolved into Next, an internal platform for application release, project management, and efficiency reporting, while the lane implementation moved to elastic-env-operator, which follows the more Kubernetes-native operator pattern.
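The article does not describe the operator's internals, but the general operator pattern is a reconcile step that compares desired state with observed state and produces the actions needed to converge. A minimal, generic sketch follows. The function name and lane names are hypothetical and are not taken from elastic-env-operator.

```python
def reconcile(desired: set, observed: set) -> dict:
    """Compute the actions that bring the observed lanes in line with the desired ones."""
    return {
        "create": sorted(desired - observed),  # declared but not yet running
        "delete": sorted(observed - desired),  # running but no longer declared
    }
```

Because the step is driven only by the difference between the two states, running it again after convergence produces no actions. That property is what lets an operator retry safely.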
The final control flow, illustrated in the diagram, provides:
Routing to a specific lane when the request carries the appropriate flag.
Fallback to the base lane when no matching service exists in the lane.
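The two behaviors above amount to a lookup with a default. The sketch below models them in plain Python. The function name, the `deployments` mapping, and the lane and service names are hypothetical.

```python
from typing import Dict, Optional, Set

def resolve_lane(service: str, lane: Optional[str],
                 deployments: Dict[str, Set[str]], base: str = "base") -> str:
    """Pick the lane that should serve a call to `service`."""
    # Use the requested lane only if the service is actually deployed there.
    if lane and service in deployments.get(lane, set()):
        return lane
    # No flag, unknown lane, or service missing from the lane: use the base lane.
    return base

deployments = {
    "base": {"gateway", "order", "pay"},
    "feature-x": {"order"},  # only the changed service is deployed in the lane
}
```

With this data, a flagged call to `order` resolves to `feature-x`, while a flagged call to `pay` falls back to `base`. A lane therefore only needs to contain the services that actually changed.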
Results and Benefits
Since the 1.0 launch in 2019, the multi‑lane environment has delivered direct benefits:
Nearly 50% reduction in cloud server resources by eliminating a fixed environment.
Eliminated duplicate test configurations, saving manpower and reducing errors.
On‑demand scaling now supports ~600 concurrent environments, removing queue delays and accelerating delivery.
Indirect benefits include automated environment management that integrates smoothly with CI/CD pipelines, lower operational costs, and easier migration to multi‑cloud deployments.
Conclusion
The progression from isolated physical test setups to a Kubernetes‑based, Istio‑driven multi‑lane system demonstrates how logical environment isolation, operator automation, and Service Mesh capabilities can dramatically improve testing efficiency, scalability, and resource utilization.
