
Optimizing Gray Release for iQIYI Mobile Backend Using Dogfooding

iQIYI’s mobile backend employs dogfooding‑driven gray releases with cloud‑controlled traffic, gray‑tag propagation, comprehensive front‑end and back‑end metrics, device white‑lists, and downstream service integration, allowing internal users to quickly verify code and configuration changes and catch issues before full production rollout.

iQIYI Technical Product Team

Gray release is an important means of ensuring service quality. The usual approach is to select a subset of machines as a gray environment, deploy code there first, observe for a period, and then roll out to all machines if no issues are found. To make gray release more effective, several challenges need to be addressed:

(1) How can verification be integrated easily into the gray environment?
(2) How can feedback on gray environment issues be obtained quickly?
(3) How can both front‑end and back‑end metrics be covered comprehensively in the gray environment?
(4) How can gray environments be interconnected across different services?

This article introduces how iQIYI's mobile backend leverages a gray environment to solve the above problems.

Dogfooding Introduction

The term “dogfooding” is commonly traced to a 1970s Alpo dog food TV commercial. In the IT industry, it was first used around 1988, when Microsoft senior manager Paul Maritz wrote an email titled “Eating our own Dogfood” to encourage employees to use the company's own products.

Dogfooding means that internal employees use the product themselves so that issues are discovered early. Combined with a backend gray environment, this means code is first released to an environment that serves internal users, their feedback is collected, and only after validation is the change rolled out to production.

Gray Environment Optimization Goals

iQIYI’s mobile business iterates quickly, so issues need to be detected and reported as early as possible, before a change reaches production. The goals are:

(1) Direct access for internal users to the gray environment for quick verification.
(2) Support for device white‑lists to include devices in gray traffic.
(3) Comprehensive gray metrics covering front‑end crash rate, back‑end success rate, response time, and error logs.
(4) Transparent propagation of gray identifiers across services.
(5) Support for configuration‑type services to be released via gray deployment.

Implementation Plan

1. Cloud‑controlled gray traffic: At startup, the client asks a backend service whether it belongs to the gray group. If it does, every subsequent request carries a gray identifier in its headers, which lets the server separate gray traffic from production traffic and collect metrics for each.
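The startup check described above might look roughly like the following sketch. The header name `X-Gray-Tag`, the `GrayPolicy` class, and the white‑list lookup are hypothetical stand‑ins, since the article does not name the real header or API; the network call is replaced by an in‑process lookup so the example runs on its own.

```python
from dataclasses import dataclass, field

# Hypothetical header name; the article does not specify the real one.
GRAY_HEADER = "X-Gray-Tag"


@dataclass
class GrayPolicy:
    """Illustrative cloud-side description of who belongs to the gray group."""
    device_whitelist: set = field(default_factory=set)
    tag: str = "gray"

    def lookup(self, device_id):
        """Return the gray tag for a device, or None if it is not in the gray group."""
        return self.tag if device_id in self.device_whitelist else None


class Client:
    """Illustrative mobile client that asks once, at startup, whether it is gray."""

    def __init__(self, device_id, policy):
        self.device_id = device_id
        # Stand-in for the startup request to the backend service.
        self.gray_tag = policy.lookup(device_id)

    def build_headers(self, base=None):
        """Return request headers, adding the gray identifier only for gray devices."""
        headers = dict(base or {})
        if self.gray_tag is not None:
            headers[GRAY_HEADER] = self.gray_tag
        return headers
```

With this shape, a white‑listed device sends the gray header on every request, while any other device sends its headers unchanged, so the server side needs no per‑device knowledge.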

2. Gray tag propagation: Once the gray audience has been configured, the app calls a backend API at launch. If the device matches, the gray tag is attached to its request headers. Downstream gateways use this tag to route each request to either gray or production machines, and they pass the tag along to services further downstream.
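A minimal sketch of the gateway side of this step follows. The header name `X-Gray-Tag` and both function names are hypothetical, and falling back to production when no gray machine is available is an assumption of this sketch rather than something the article states. Real HTTP header names are case‑insensitive; the plain dictionary lookup here ignores that for brevity.

```python
# Hypothetical header name; the article does not specify the real one.
GRAY_HEADER = "X-Gray-Tag"


def pick_pool(headers, gray_pool, prod_pool):
    """Choose the machine pool for a request based on its gray tag.

    Assumption: if the gray pool is empty, gray requests are served by production.
    """
    if headers.get(GRAY_HEADER) and gray_pool:
        return gray_pool
    return prod_pool


def downstream_headers(incoming, extra=None):
    """Build headers for a downstream call, copying the gray tag through if present."""
    outgoing = dict(extra or {})
    tag = incoming.get(GRAY_HEADER)
    if tag:
        outgoing[GRAY_HEADER] = tag
    return outgoing
```

Keeping the copy‑through logic in one helper means every outgoing call from a service propagates the tag the same way, which is what makes the propagation transparent to business code.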

3. Gray traffic metrics: Front‑end metrics (crash and error rates) and back‑end metrics (success rate, response time, QPS, error logs) are all recorded together with the gray tag, so gray data and production data can be told apart clearly.
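One way to picture tagging metrics with the gray identifier is an aggregator that keys every counter by environment. The class below is an illustrative sketch only: the name `MetricsByEnv`, the two environment labels, and the choice of success rate and average latency are assumptions, not the team's actual metrics pipeline.

```python
from collections import defaultdict


class MetricsByEnv:
    """Illustrative aggregator that keeps gray and production metrics apart."""

    def __init__(self):
        self._total = defaultdict(int)
        self._ok = defaultdict(int)
        self._latency_ms = defaultdict(float)

    def record(self, gray_tag, ok, latency_ms):
        """Record one request; any non-empty gray tag counts as gray traffic."""
        env = "gray" if gray_tag else "prod"
        self._total[env] += 1
        if ok:
            self._ok[env] += 1
        self._latency_ms[env] += latency_ms

    def success_rate(self, env):
        """Return the fraction of successful requests, or None if none were recorded."""
        total = self._total.get(env, 0)
        return self._ok.get(env, 0) / total if total else None

    def avg_latency_ms(self, env):
        """Return the mean latency in milliseconds, or None if none were recorded."""
        total = self._total.get(env, 0)
        return self._latency_ms.get(env, 0.0) / total if total else None
```

Because the environment is part of the key, a regression that only affects the gray build shows up as a gap between the gray and production series instead of being averaged away.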

4. Configuration‑type gray release: Configuration changes are first written to a gray store. Configuration services read from the gray store first and fall back to the official store if no gray value is available. After validation, the configuration is promoted to the official store. A QR‑code preview feature is also provided so that changes can be verified safely and conveniently.
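The gray‑first read with fallback and promotion can be sketched as below. The class and method names are hypothetical, both stores are plain dictionaries, and the sketch assumes that only gray‑tagged requests consult the gray store, which the article implies but does not spell out.

```python
class ConfigService:
    """Illustrative configuration service with a gray store and an official store."""

    def __init__(self):
        self.gray_store = {}
        self.official_store = {}

    def stage(self, key, value):
        """Write a change to the gray store only; production readers are unaffected."""
        self.gray_store[key] = value

    def read(self, key, is_gray):
        """Gray requests read the gray store first, then fall back to the official store.

        Assumption: non-gray requests never see values from the gray store.
        """
        if is_gray and key in self.gray_store:
            return self.gray_store[key]
        return self.official_store.get(key)

    def promote(self, key):
        """Move a validated value into the official store (KeyError if not staged)."""
        self.official_store[key] = self.gray_store.pop(key)
```

A typical sequence under these assumptions would be `stage`, verification by gray users through `read(..., is_gray=True)`, and then `promote`, after which all users read the new value.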

5. Downstream service integration: Two approaches are supported. (a) The downstream gateway splits traffic based on the gray tag in the request headers, which suits HTTP services. (b) The downstream service exposes separate gray and production endpoints (for example, different domains or RPC groups), and the upstream service selects the appropriate endpoint based on the gray tag, which suits RPC or storage services.
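Approach (b) amounts to a small lookup in the upstream service. The sketch below is illustrative: the `Endpoints` type, the registry contents, the service names, and the `.example` host names are all made up, and using the production endpoint when a service has no gray endpoint is an assumption of this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Endpoints:
    """Illustrative pair of endpoints for one downstream service."""
    prod: str
    gray: str = ""  # empty string means the service has no gray endpoint


# Hypothetical registry; service and host names are placeholders.
REGISTRY = {
    "user-profile": Endpoints(prod="profile.prod.example", gray="profile.gray.example"),
    "search": Endpoints(prod="search.prod.example"),
}


def select_endpoint(service, gray_tag, registry=REGISTRY):
    """Pick the gray endpoint for gray requests when one exists, else production."""
    endpoints = registry[service]
    if gray_tag and endpoints.gray:
        return endpoints.gray
    return endpoints.prod
```

Falling back to production for services without a gray endpoint lets downstream teams join the gray environment one service at a time, at the cost of gray requests exercising production code for those services.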

Conclusion

This article explained the dogfooding concept and outlined what applying it to gray releases is expected to achieve. It then described how iQIYI’s mobile backend optimized its gray environment, covering cloud‑controlled traffic, gray tag propagation, metric collection, configuration‑type gray release, and downstream integration. The solution is already used daily for code and configuration releases, and it has uncovered multiple hidden issues that were promptly fixed.

Future improvements include simplifying gray environment onboarding (e.g., easier white‑list integration and streamlined downstream service access) and reducing the impact of gray traffic on online users by lowering the proportion of real users in gray experiments and rotating the gray user pool regularly.
