Service Mesh Practice at iQIYI: Istio Deployment, Traffic Management, Custom Dashboard, and Monitoring

iQIYI adopted Istio on Kubernetes to build isolated mesh clusters, integrated its Kong gateway through Consul, and built a custom dashboard that bundles Deployments, Services, and Istio resources into high‑level “App” objects for one‑click releases. It also implemented monitoring, logging, and alerting with Prometheus, Grafana, and Flink. The result is improved traffic control, observability, and security; future work targets cross‑cluster routing and policy enhancements.

iQIYI Technical Product Team

In recent years, cloud‑native technologies, with service mesh as a prominent example, have become a hot topic among developers. Service mesh lowers the technical barrier and reduces the code intrusion common in traditional micro‑service architectures. iQIYI's technical team has been exploring service mesh, and part of its business now runs on one. This article shares iQIYI's practical experience.

Background

Legacy monolithic applications become unwieldy as business grows, prompting the shift to micro‑services. While micro‑services improve development speed, managing a large number of services creates new challenges such as tracing, circuit breaking, monitoring, and service discovery. Service mesh addresses these issues by introducing a dedicated infrastructure layer that transparently proxies all service‑to‑service traffic.

According to William Morgan, a service mesh is a dedicated infrastructure layer that handles service‑to‑service communication, removing the need for developers to embed networking logic in business code.

Practical Application

iQIYI built isolated test and production service‑mesh clusters on Kubernetes and deployed a monitoring system and a custom Istio Dashboard. Some backend services have already been migrated to the mesh.

After evaluating several mesh solutions, iQIYI chose Istio because it aligns well with Kubernetes and is backed by Google, IBM, and Lyft. Istio’s rapid iteration and rich ecosystem made it a suitable choice.

Traffic Management

The mesh’s traffic‑control components include:

Gateway – entry point at the mesh edge.

VirtualService – core Istio resource for routing, retries, canary releases, etc.

DestinationRule – defines subsets of service versions and the traffic policies applied to them after routing.

Service – Kubernetes Service that forwards requests to Pods.

Pod – smallest deployable unit in Kubernetes.

Istio enables traffic routing, multi‑version deployments, traffic mirroring, retries, fault injection, circuit breaking, and HTTP redirects and rewrites, all through declarative YAML configuration.
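To make the relationship between VirtualService and DestinationRule concrete, here is a minimal sketch of a canary (gray) release expressed as the two resources. It builds the manifests as plain Python dictionaries and prints them as JSON, which the Kubernetes API accepts alongside YAML. The service name `reviews`, the subsets `v1` and `v2`, the 90/10 split, and the retry settings are illustrative assumptions, not values from iQIYI's setup.

```python
import json


def destination_rule(host, versions):
    """Build a DestinationRule with one subset per `version` label value."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "DestinationRule",
        "metadata": {"name": f"{host}-subsets"},
        "spec": {
            "host": host,
            "subsets": [{"name": v, "labels": {"version": v}} for v in versions],
        },
    }


def canary_virtual_service(host, weights, retries=3):
    """Build a VirtualService that splits traffic across subsets by weight."""
    if sum(weights.values()) != 100:
        raise ValueError("route weights must sum to 100")
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-routes"},
        "spec": {
            "hosts": [host],
            "http": [{
                "route": [
                    {"destination": {"host": host, "subset": subset}, "weight": weight}
                    for subset, weight in weights.items()
                ],
                # Retry settings are hypothetical example values.
                "retries": {"attempts": retries, "perTryTimeout": "2s"},
            }],
        },
    }


if __name__ == "__main__":
    # Hypothetical service: send 90% of traffic to v1 and 10% to the canary v2.
    print(json.dumps(destination_rule("reviews", ["v1", "v2"]), indent=2))
    print(json.dumps(canary_virtual_service("reviews", {"v1": 90, "v2": 10}), indent=2))
```

Shifting the canary forward then amounts to changing the weights and re-applying the VirtualService; the Deployments and the DestinationRule stay untouched.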

iQIYI also integrated its existing Kong/Mashape API gateway with the mesh using Consul Agent, preserving authentication, rate‑limiting, and routing while adding mesh‑level capabilities.

Dashboard

Because Kubernetes’ native Dashboard is insufficient for mesh‑level operations, iQIYI built a custom Istio Dashboard that abstracts low‑level resources into an “App” concept. An App bundles Deployment, Service, VirtualService, DestinationRule, and Gateway, allowing developers to manage the entire lifecycle with a few clicks.

Key features of the Dashboard include:

App‑level operations – users interact with high‑level App objects instead of individual mesh resources.

One‑click deployment – a single API call deploys the whole application and exposes it via a domain.

Full traffic‑control UI – configure gray releases, retries, and routing visually.

The backend uses the open‑source Istio client to translate UI actions into HTTP/RPC calls to the Kubernetes API, eliminating the need to write YAML manually.
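The “App” abstraction can be pictured as a small object that expands into the five underlying resources. The sketch below is not the Dashboard's actual implementation, which is not shown in this article; it only illustrates the idea under assumed field names (`name`, `image`, `port`, `domain`, `version`, `replicas`) and produces manifest dictionaries instead of calling the Kubernetes API.

```python
from dataclasses import dataclass

ISTIO_API = "networking.istio.io/v1alpha3"


@dataclass
class App:
    """Hypothetical high-level App; all field names are assumptions."""
    name: str
    image: str
    port: int
    domain: str
    version: str = "v1"
    replicas: int = 2

    def manifests(self):
        """Expand the App into Deployment, Service, and Istio resources."""
        labels = {"app": self.name, "version": self.version}
        gateway_name = f"{self.name}-gateway"
        deployment = {
            "apiVersion": "apps/v1", "kind": "Deployment",
            "metadata": {"name": f"{self.name}-{self.version}"},
            "spec": {
                "replicas": self.replicas,
                "selector": {"matchLabels": labels},
                "template": {
                    "metadata": {"labels": labels},
                    "spec": {"containers": [{
                        "name": self.name, "image": self.image,
                        "ports": [{"containerPort": self.port}],
                    }]},
                },
            },
        }
        service = {
            "apiVersion": "v1", "kind": "Service",
            "metadata": {"name": self.name},
            "spec": {"selector": {"app": self.name},
                     "ports": [{"name": "http", "port": self.port}]},
        }
        destination_rule = {
            "apiVersion": ISTIO_API, "kind": "DestinationRule",
            "metadata": {"name": self.name},
            "spec": {"host": self.name,
                     "subsets": [{"name": self.version,
                                  "labels": {"version": self.version}}]},
        }
        virtual_service = {
            "apiVersion": ISTIO_API, "kind": "VirtualService",
            "metadata": {"name": self.name},
            "spec": {
                "hosts": [self.domain],
                "gateways": [gateway_name],
                "http": [{"route": [{"destination": {
                    "host": self.name, "subset": self.version,
                    "port": {"number": self.port}}}]}],
            },
        }
        gateway = {
            "apiVersion": ISTIO_API, "kind": "Gateway",
            "metadata": {"name": gateway_name},
            "spec": {
                # Assumes the default Istio ingress gateway label.
                "selector": {"istio": "ingressgateway"},
                "servers": [{"port": {"number": 80, "name": "http", "protocol": "HTTP"},
                             "hosts": [self.domain]}],
            },
        }
        return [deployment, service, destination_rule, virtual_service, gateway]
```

A one-click deployment would then submit each returned manifest to the cluster in order. Keeping the expansion in one place means a release only needs the App's handful of fields rather than five hand-written YAML files.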

Monitoring

In the Mixer‑based Istio releases described here, the sidecar proxy (Envoy) reports telemetry for each request to the Mixer component; application traffic itself does not pass through Mixer. Mixer’s adapters forward these metrics to monitoring and APM back‑ends. Istio ships with a Prometheus adapter by default, so metrics can be collected out of the box.
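As an illustration of what the collected metrics allow, the sketch below builds a PromQL expression for the share of 5xx responses a service returns. `istio_requests_total` and its `destination_service` and `response_code` labels are Istio's standard request metric; whether iQIYI queries it this way is an assumption, and the service name in the usage line is hypothetical.

```python
def error_ratio_query(destination_service, window="5m"):
    """Return a PromQL expression for the 5xx ratio of one destination service."""
    selector = f'destination_service="{destination_service}"'
    # Rate of requests answered with a 5xx status code.
    errors = (
        f'sum(rate(istio_requests_total{{{selector},response_code=~"5.."}}[{window}]))'
    )
    # Rate of all requests to the same service.
    total = f'sum(rate(istio_requests_total{{{selector}}}[{window}]))'
    return f"{errors} / {total}"


if __name__ == "__main__":
    # Hypothetical fully qualified service name.
    print(error_ratio_query("reviews.default.svc.cluster.local"))
```

The resulting string can be pasted into a Grafana panel or used as the expression of an alert rule with a threshold.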

iQIYI also deploys third‑party exporters (node_exporter, kube‑state‑metrics) to gather node and resource metrics, which Prometheus scrapes. Grafana visualizes CPU, memory, and other cluster metrics, and alerts are routed via Webhook to an internal alerting platform (email, chat, SMS).

Log collection is handled by the company’s Venus system, which stores logs in Elasticsearch. A custom “critical‑node alert” service built on Flink consumes these logs, detects error patterns, and notifies subscribers within five minutes.
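The kind of windowed error detection such a service performs can be sketched in a few lines. This is plain Python standing in for a Flink job, not Flink code. The event shape, the `"ERROR"` substring match, and the threshold are all assumptions, and using five minutes as the window size is likewise a guess based on the notification target above.

```python
from collections import Counter

WINDOW_SECONDS = 300  # assumed window size: five minutes


def detect_error_bursts(events, threshold, window=WINDOW_SECONDS):
    """Count error logs per (window, service) and report those at or above threshold.

    events: iterable of (epoch_seconds, service, message) tuples (hypothetical shape).
    Returns a sorted list of (window_start, service, error_count).
    """
    counts = Counter()
    for timestamp, service, message in events:
        if "ERROR" in message:
            # Align the timestamp to the start of its tumbling window.
            window_start = int(timestamp) - int(timestamp) % window
            counts[(window_start, service)] += 1
    return sorted(
        (start, service, count)
        for (start, service), count in counts.items()
        if count >= threshold
    )


if __name__ == "__main__":
    sample = [
        (0, "gateway", "ERROR upstream timeout"),
        (10, "gateway", "ERROR upstream timeout"),
        (299, "gateway", "request ok"),
        (300, "gateway", "ERROR upstream timeout"),
    ]
    for burst in detect_error_bursts(sample, threshold=2):
        print(burst)
```

A streaming engine performs the same grouping incrementally and emits each window as it closes, which is what bounds the notification delay.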

Other Features

Istio also provides strong security (TLS certificates injected into Envoy), extensibility via custom adapters, and supports chaos engineering by modifying VirtualService and DestinationRule to inject latency or failures. Kubernetes CRDs make it easy to add further extensions such as custom gateways.
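A chaos experiment of this kind can be written as a VirtualService with a `fault` section. The sketch below builds one as a Python dictionary; the host name, delay, status code, and percentages are illustrative assumptions, and the `percentage.value` form assumes an Istio release that supports it.

```python
def fault_injection_virtual_service(host, delay="5s", delay_percent=10.0,
                                    abort_status=503, abort_percent=5.0):
    """Build a VirtualService that delays and aborts a share of HTTP requests."""
    return {
        "apiVersion": "networking.istio.io/v1alpha3",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-fault"},
        "spec": {
            "hosts": [host],
            "http": [{
                "fault": {
                    # Add a fixed delay to delay_percent of requests.
                    "delay": {"fixedDelay": delay,
                              "percentage": {"value": delay_percent}},
                    # Fail abort_percent of requests with the given HTTP status.
                    "abort": {"httpStatus": abort_status,
                              "percentage": {"value": abort_percent}},
                },
                "route": [{"destination": {"host": host}}],
            }],
        },
    }


if __name__ == "__main__":
    import json
    # Hypothetical target service for the experiment.
    print(json.dumps(fault_injection_virtual_service("ratings"), indent=2))
```

Because the fault lives in mesh configuration, ending the experiment only requires removing the `fault` section; the application itself is never modified.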

Conclusion and Future Plans

Service mesh is a key direction for micro‑service evolution, but challenges remain, including blurred responsibilities between developers and operators and the need for higher technical expertise. iQIYI’s initial exploration has proven the mesh’s benefits in traffic control, observability, and decoupling business code from infrastructure. Future work will focus on reliability, controllability, and usability, such as cross‑cluster routing, network policies for specific business scenarios, and extending the Dashboard.
