
How Shifting MR Testing Left Supercharged Our Rendering Engine Quality

This article details how the KE rendering engine team tackled explosive task growth, stability pressure, and testing bottlenecks with a shift‑left testing strategy built on mandatory pre‑merge unit tests, an automated CI pipeline, and an efficient regression platform, improving both quality and developer productivity.

Qunhe Technology Quality Tech

Aug 28, 2025


Background

The KE rendering engine is a fully in‑house, ray‑tracing‑based engine. Over the past year, driven by business expansion, rendering volume has grown explosively, bringing significant testing challenges.

Stability pressure from rapid business growth:

Since the second half of 2024, daily tasks grew from 300k to 2.5M, putting huge pressure on version testing and risking high‑severity production failures if coverage is insufficient.

The number of business partners increased from 1 to 3, requiring 100% automated coverage of all engine capabilities.

Balancing iteration speed and quality:

To optimize rendering performance and GPU usage, KE needs fast iteration.

Each version changes 30%‑50% of the code. The development cycle is 4 weeks with only a ~10‑day testing window, and the development‑to‑test ratio is 8:1.

High complexity of problem diagnosis:

Effect changes often interact, so locating a root cause requires a binary search across all commits in the version.

Because scene data is large, each local debugging run takes 5‑30 minutes, which slows issue resolution.

Solution Approach

To address insufficient test coverage and low diagnosis efficiency, we introduced a "shift‑left testing" strategy that requires every developer to pass basic effect and functional verification before merging to the main branch. The strategy is built on three core modules:

Comprehensive and effective unit test cases

Construct test suites covering core engine functions and common scenarios.

Add test cases for new features and integrate them into the automated regression flow.

Automated CI process

Package the MR source branch and its baseline branch, so that the comparison between the two builds contains only the MR's diffs.

Moon automatically triggers the configured regression test set after packaging.

Automatically trigger MR regression and notify developers of the results.

Stable and efficient regression testing platform

Provide visual effect diff display and integrate branch management, packaging, and task distribution.

Support consumption and post‑processing of MR regression tasks.

Improve usability so developers can self‑diagnose issues.

With this system, problems are discovered early, test quality improves, and development efficiency rises.

MR Left‑Shift Test Case Design

Challenges and Ideas

As non‑graphics specialists, we face a knowledge barrier regarding rendering concepts such as textures, models, and materials. Designing comprehensive test cases therefore requires collaboration among testing, product, development, and design teams.

Limitations of Traditional Test Cases

Coverage is hard to quantify.

Functional differences are difficult to detect automatically; the changes they cause in SSIM and PSNR are often too small to stand out.

The same problem is diagnosed redundantly across multiple scenes.

Material parameter coverage is incomplete.

Example: even with SSIM = 0.9888 and PSNR > 40 dB, functional defects may still exist and be hard to spot visually.
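The effect described above can be reproduced with a small synthetic sketch. It uses only PSNR (SSIM is omitted for brevity), and the image size, gray level, and 4x4 defect patch are assumptions chosen for illustration, not data from the KE engine.

```python
import math


def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio (dB) between two equally sized grayscale images."""
    total = 0.0
    count = 0
    for row_a, row_b in zip(img_a, img_b):
        for a, b in zip(row_a, row_b):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def max_abs_diff(img_a, img_b):
    """Largest per-pixel absolute difference between the two images."""
    return max(
        abs(a - b)
        for row_a, row_b in zip(img_a, img_b)
        for a, b in zip(row_a, row_b)
    )


# Hypothetical 256x256 flat gray reference render.
SIZE = 256
reference = [[128] * SIZE for _ in range(SIZE)]

# Candidate render with a localized defect: a 4x4 patch rendered black.
candidate = [row[:] for row in reference]
for y in range(100, 104):
    for x in range(100, 104):
        candidate[y][x] = 0

score = psnr(reference, candidate)        # global metric, averaged over all pixels
worst = max_abs_diff(reference, candidate)  # local metric, worst single pixel
```

Here the global score stays above 40 dB (about 42.1 dB) even though 16 pixels are off by 128 gray levels, which is why a per‑pixel view of the difference is more useful for functional checks than a single averaged number.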

Unit Test Case Design Strategy

Coverage of existing functionality: assign test‑case ownership to developers by engine module.

Simplified scene design: designers collaborate with developers to create lightweight scenes for faster rendering.

Parameterized material ball generation: a script creates PBR materials by combining the parameters of all material nodes.

Template application: map generated material parameters to template balls, covering many test points with a single render.

New feature coverage: require new test cases before each MR submission.
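The parameterized generation and template mapping steps above might look roughly like the following. This is a minimal sketch: the parameter names, their values, the full Cartesian product, and the grid layout are all assumptions for illustration, not the team's actual script.

```python
import itertools

# Hypothetical material node parameters and the values to sample for each.
PARAMETER_VALUES = {
    "base_color": ["white", "red"],
    "metallic": [0.0, 1.0],
    "roughness": [0.0, 0.5, 1.0],
}


def generate_materials(parameter_values):
    """Build one material per combination of parameter values."""
    names = sorted(parameter_values)
    value_lists = [parameter_values[name] for name in names]
    return [dict(zip(names, combo)) for combo in itertools.product(*value_lists)]


def assign_to_template(materials, columns):
    """Map each material to a (row, column) ball slot in a template scene."""
    return {
        divmod(index, columns): material
        for index, material in enumerate(materials)
    }


materials = generate_materials(PARAMETER_VALUES)   # 2 * 2 * 3 = 12 materials
layout = assign_to_template(materials, columns=4)  # 3 rows of 4 balls
```

With this layout a single render of the template scene exercises all 12 combinations. A real script would likely sample rather than enumerate once the number of parameters grows, since the full product grows multiplicatively.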

MR Left‑Shift Workflow

Pre‑Shift Engine Workflow

GitLab: code management and branch control.

Moon: manual packaging platform, supports webhook triggers.

Rendering regression platform: batch add test cases, execute different sets, view results.

All modules operate independently, requiring manual triggers.

Changes After Left‑Shift

GitLab: merge the latest main branch into the MR branch before submitting the MR, to reduce conflicts.

Moon: automatic packaging triggered by webhook, with a specified test set.

Regression platform: manage MR baselines and auto‑create tasks.

Testing process: developers self‑diagnose, fix, and re‑trigger tests before merging.
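The flow after the shift can be sketched as a single function that packages both builds, runs the regression set, and reports back. The services are injected as plain callables so the sketch runs on its own. Every name here (`MergeRequest`, `run_mr_regression`, the fake services, the test case name) is hypothetical and does not reflect the real Moon or regression platform APIs.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class MergeRequest:
    iid: int
    head_commit: str  # latest commit on the MR source branch
    base_commit: str  # commit on main that the MR is compared against


def run_mr_regression(
    mr: MergeRequest,
    package: Callable[[str], str],
    run_test_set: Callable[[str, str], List[str]],
    notify: Callable[[int, List[str]], None],
) -> bool:
    """Package baseline and candidate, run the regression set, report failures."""
    baseline_build = package(mr.base_commit)
    candidate_build = package(mr.head_commit)
    failed_cases = run_test_set(baseline_build, candidate_build)
    notify(mr.iid, failed_cases)
    return not failed_cases  # True means the MR is safe to merge


# In-memory stand-ins for the packaging, regression, and notification services.
notifications = []


def fake_package(commit):
    return f"build-{commit}"


def fake_run_test_set(baseline_build, candidate_build):
    # Pretend one case regresses only for this particular candidate build.
    return ["glass_refraction"] if candidate_build == "build-bad123" else []


def fake_notify(mr_iid, failed_cases):
    notifications.append((mr_iid, failed_cases))


# First attempt fails; the developer fixes the issue and re-triggers.
first_try = run_mr_regression(
    MergeRequest(7, "bad123", "base000"), fake_package, fake_run_test_set, fake_notify
)
second_try = run_mr_regression(
    MergeRequest(7, "fix456", "base000"), fake_package, fake_run_test_set, fake_notify
)
```

The two calls mirror the self‑diagnose, fix, and re‑trigger loop: the first run reports one failed case, and the second run against the fixed commit passes.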

MR Left‑Shift Tool Implementation

GitLab: call the MR API to obtain the branch, commit, description, status, and base commit for packaging and baseline management.

Moon: invoke the packaging API with the branch and commit; the webhook carries the regression platform task ID.

Regression platform: implement webhook logic that avoids duplicate packaging, support unit‑test task types (image, video, channel map), and provide a diff algorithm that colors pixel differences by magnitude and marks a task as failed when the differences exceed thresholds.
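As a rough illustration of the first and third items, the sketch below pulls the relevant fields out of a GitLab merge request API response and keys packaging tasks by branch and commit, so a repeated webhook for the same commit reuses the existing task. The field names (`source_branch`, `sha`, `description`, `state`, `diff_refs.base_sha`) follow GitLab's merge request API; the deduplication class, the sample payload, and the task ID format are assumptions, not the platform's actual implementation.

```python
def extract_mr_info(mr_payload):
    """Pick the fields needed for packaging and baseline management."""
    diff_refs = mr_payload.get("diff_refs") or {}
    return {
        "branch": mr_payload["source_branch"],
        "commit": mr_payload["sha"],
        "description": mr_payload.get("description") or "",
        "status": mr_payload["state"],
        "base_commit": diff_refs.get("base_sha"),
    }


class PackagingDeduplicator:
    """Remember which (branch, commit) pairs already have a packaging task."""

    def __init__(self):
        self._tasks = {}

    def task_for(self, branch, commit, create_task):
        key = (branch, commit)
        if key not in self._tasks:
            # Only the first webhook for this commit triggers packaging.
            self._tasks[key] = create_task(branch, commit)
        return self._tasks[key]


# Hypothetical, trimmed-down API response for one merge request.
sample_payload = {
    "source_branch": "feature/soft-shadows",
    "sha": "abc123",
    "description": "Tune soft shadow sampling",
    "state": "opened",
    "diff_refs": {"base_sha": "000aaa", "head_sha": "abc123", "start_sha": "000aaa"},
}

created = []


def fake_create_task(branch, commit):
    created.append((branch, commit))
    return f"task-{len(created)}"


info = extract_mr_info(sample_payload)
dedup = PackagingDeduplicator()
first = dedup.task_for(info["branch"], info["commit"], fake_create_task)
repeat = dedup.task_for(info["branch"], info["commit"], fake_create_task)
```

A production version would need persistent storage and expiry for the task table; the in‑memory dictionary is only meant to show the keying idea.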

Improving Test Result Readability

Diff logic now compares pixel values and colors differences, making subtle changes obvious.

A task is now marked as failed when any diff fails, and failed cases are displayed first.
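A minimal sketch of this kind of diff logic follows. The color buckets, the thresholds, and the result record layout are illustrative assumptions; the platform's real values are not given in this article.

```python
def diff_map(reference, candidate):
    """Per-pixel absolute difference between two grayscale images."""
    return [
        [abs(a - b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(reference, candidate)
    ]


def colorize(diffs, low=8, high=32):
    """Color each pixel by difference magnitude (RGB tuples)."""
    def color(d):
        if d == 0:
            return (0, 0, 0)      # identical: black
        if d < low:
            return (0, 255, 0)    # small difference: green
        if d < high:
            return (255, 255, 0)  # medium difference: yellow
        return (255, 0, 0)        # large difference: red
    return [[color(d) for d in row] for row in diffs]


def case_failed(diffs, pixel_threshold=32, max_bad_pixels=0):
    """Fail the case when too many pixels differ by at least the threshold."""
    bad = sum(1 for row in diffs for d in row if d >= pixel_threshold)
    return bad > max_bad_pixels


def failed_first(results):
    """Order results so failed cases appear before passing ones."""
    return sorted(results, key=lambda r: not r["failed"])


reference = [[10, 10], [10, 10]]
candidate = [[10, 12], [10, 200]]

diffs = diff_map(reference, candidate)
heatmap = colorize(diffs)
failed = case_failed(diffs)

results = failed_first([
    {"case": "wood_floor", "failed": False},
    {"case": "glass_ball", "failed": True},
    {"case": "metal_ball", "failed": False},
])
```

Because `sorted` is stable, passing cases keep their original relative order behind the failed ones.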

Practice Results

Before vs. After Comparison

Code verification process: before, code was merged without testing; after, each commit automatically triggers CI to verify core functionality.

Problem discovery stage: shifted from SIT (system integration testing) to the feature stage, reducing defect leakage by 42%.

Defect localization difficulty: reduced by 50% because only the MR's changes are tested.

Test case accumulation: test cases are now integrated before release, improving engine coverage.

Quality Data

Daily offline rendering volume grew from 300k (June 2024) to 2.5M (April 2025) with no P3+ incidents caused by missed coverage.

Since August 2024, MR regression has lowered monthly bug counts; 152 MR regressions have run, with 26% of commits requiring repeated MR testing.

Regression speed improved by 50%, enabling hot‑fix verification within half a day.

Future Outlook

While MR regression left‑shift secures rendering quality, performance testing remains a gap. As KE scales and real‑time demands rise, performance left‑shift will become essential, leveraging server usage dashboards, diversified performance regression suites, and AI‑assisted analysis.