How We Cut Risk‑Control Test Regression from Hours to Minutes with a Global Feature Graph

This article details how a risk‑control team built a global feature relationship graph, precise testing platform, and full‑domain interception to dramatically reduce regression testing time, improve data quality, and boost overall testing efficiency across the organization.

Huolala Tech

Jan 23, 2024

1. Background and Challenges

As Huolala’s business expands rapidly, identifying and avoiding risks becomes critical. The risk‑control system monitors billions of requests daily, relying on feature data sourced from upstream services and enriched through cleaning and completion. Ensuring high‑quality features is essential to prevent mis‑judgments.

Figure: Source of features and risk‑control dependency

Note ①: Risk‑control strategy judgment assembles feature fields, matches conditions, and decides subsequent actions.

Note ②: Features are auxiliary information for strategy conditions, originating from request fields or derived through cleaning and completion.

Note ③: Completion enriches existing fields via service calls, code functions, statistics, or condition matching.

1.1 Feature Quality Assurance Focus

Risk‑control requirements fall into three types, each with a distinct testing emphasis.

1. Risk‑control code changes

When code changes, feature quality assurance focuses on the impact range of feature modifications and the correctness of feature logic processing.

2. No code change, configuration change

When only configuration changes, assurance centers on impact range and data‑fetch accuracy.

3. No code or config change, new business integration

Here, assurance primarily ensures accurate data fetching for the new business.

1.2 Feature Quality Assurance Difficulties

Three main challenges remain despite clear focus areas:

No effective means of identifying the impact range

Complex dependencies between strategies, features, and completions make visualisation difficult, leading to time‑consuming and error‑prone impact analysis.

Feature changes trigger extensive regression work with low efficiency.

Validating feature logic correctness is inefficient

Features depend on various external data sources, requiring extra learning to construct valid test data.

Risk‑control sits downstream; testing requires upstream triggers.

Numerous feature configurations demand re‑learning for regression testing.

Feature data collection accuracy can only be assured passively

The system does not validate upstream data quality, and core production strategies are not configured in test environments, causing data issues to surface only after release.

Downstream placement and many upstream partners make timely detection of field errors difficult.

2. Solution and Goals

Figure: Risk‑control feature quality assurance system

We adopted three solutions:

Identify impact range: map data dependencies, visualise them, and improve testing efficiency.

Validate logic correctness: standardise feature data construction to lower data‑creation cost.

Ensure data collection accuracy: intercept non‑compliant upstream data, turning passive problems into proactive detection.

To achieve these goals, we built a Feature Test Platform that integrates real‑time risk, data factories, mock services, and requirement data, providing precise, scenario, and full‑domain testing capabilities, thereby improving quality and efficiency throughout the development lifecycle.

3. Risk‑Control Specific Precise Testing

Risk‑control decisions depend heavily on feature fields, which may reference other features or completions. When features are added or changed, we must assess the impact on all dependent data. Since the risk‑control system lacks direct visualisation of these dependencies, manual statistics or complex SQL are usually required.

3.1 Global Relationship Graph

We built a global relationship graph by parsing all strategies, features, and completions via API calls, establishing bidirectional references stored in a database.

Figure: One‑way parsing and two‑way referencing
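The one‑way parsing and two‑way referencing idea can be sketched in a few lines. This is a minimal in‑memory illustration, not the team's implementation: the article stores the references in a database, and the class name `ReferenceGraph`, its methods, and the node names used with it are assumptions made for the example.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory sketch of a bidirectional reference index. */
class ReferenceGraph {
    // node -> nodes it references (what one-way parsing of a definition yields)
    private final Map<String, Set<String>> outgoing = new HashMap<>();
    // node -> nodes that reference it (derived by reversing each parsed edge)
    private final Map<String, Set<String>> incoming = new HashMap<>();

    /** Record that {@code from} references {@code to}, indexing both directions. */
    void addReference(String from, String to) {
        outgoing.computeIfAbsent(from, k -> new LinkedHashSet<>()).add(to);
        incoming.computeIfAbsent(to, k -> new LinkedHashSet<>()).add(from);
    }

    /** Nodes that {@code node} depends on. */
    Set<String> referencesOf(String node) {
        return outgoing.getOrDefault(node, Collections.emptySet());
    }

    /** Nodes that depend on {@code node}, i.e. those affected when it changes. */
    Set<String> referencedBy(String node) {
        return incoming.getOrDefault(node, Collections.emptySet());
    }
}
```

Storing the reverse edge at parse time is what makes the "who is affected by this change" lookup a single query instead of a scan over every strategy and feature.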

3.1.1 Double‑Sided Reference Tree Rendering

We render a three‑part tree on the frontend: the centre node (searched feature), the left subtree (incoming references), and the right subtree (outgoing references). Using the relation-graph library (compatible with Vue2, Vue3, React), we create nodes and lines arrays, assign hierarchy intervals, and render the graph.

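/**
 * Append one related node and its connecting line to the rendered graph.
 * Vue component method; the method name and the `item` parameter are
 * illustrative, since the original snippet showed only the body.
 * @param id node id of the node being expanded
 * @param hierarchy hierarchy level of the node being expanded
 * @param nodeType node type: dataCompletion, feature, strategy
 * @param relateType query level type: 0 child, 1 parent
 * @param item related record returned by the query (must carry an id)
 */
appendNode(id, hierarchy, nodeType, relateType, item) {
  // Children sit one level below the expanded node, parents one level above.
  const currentHierarchy = relateType === 0 ? hierarchy + 1 : hierarchy - 1;
  const newId = 'feature-' + currentHierarchy + item.id;
  const node = { id: newId, data: { hierarchy: currentHierarchy } };
  // For a child the line runs from the expanded node to the new node;
  // for a parent it runs the other way and is flagged as reversed.
  const line = {
    from: relateType === 0 ? id : newId,
    to: relateType === 0 ? newId : id,
    isReverse: relateType !== 0,
  };
  const newJsonData = { nodes: [node], lines: [line] };
  this.$refs.seeksRelationGraph.appendJsonData(newJsonData, (seeksRGGraph) => {});
}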

3.1.2 Retrieve Relationship Graph

After building the global graph, any strategy, feature, or completion can be queried, displaying its related nodes and allowing rapid impact identification. This reduces regression case analysis from 4 hours to 5 minutes.

Figure: Retrieved relationship graph

3.2 Precise Testing

Even with the global graph, change detection remains challenging. Unexpected configuration changes or unnoticed code modifications can escape testing. We therefore combine data‑change monitoring and code‑change monitoring to achieve precise testing.

3.2.1 Data Change Monitoring

We poll the risk‑control decision system’s data interface, compare updated_at timestamps, and determine whether a record is new, unchanged, or modified.

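import java.util.Objects;

/**
 * Determine whether a record fetched from the Zeus API is unchanged.
 * Returns true only when a stored record exists with the same update timestamp.
 * (The method name is illustrative; the original snippet showed only the body.)
 */
private boolean isUnchanged(ZeusApiModel zeusApiModel) {
    Integer zeusId = zeusApiModel.getZeusId();
    ZeusApiModel findZeusApi = findByZeusId(zeusId);
    // Guard against both a null result and an empty placeholder object.
    if (findZeusApi != null && findZeusApi.getId() != null) {
        // Null-safe comparison of the update timestamps.
        if (Objects.equals(zeusApiModel.getZeusUpdate(), findZeusApi.getZeusUpdate())) {
            return true; // no change
        }
        // Modified: reuse the stored primary key so the existing record is updated.
        zeusApiModel.setId(findZeusApi.getId());
        return false;
    }
    // New: insert the record.
    addZeusApi(zeusApiModel);
    return false;
}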
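The three outcomes named above (new, unchanged, modified) can also be made explicit with an enum. The following is a self‑contained sketch under stated assumptions: `ChangeDetector`, `ChangeType`, and the in‑memory map standing in for the stored records are hypothetical names, and the timestamp is treated as an opaque string.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch: classify a polled record by comparing update timestamps. */
class ChangeDetector {
    enum ChangeType { NEW, UNCHANGED, MODIFIED }

    // record id -> last seen updated_at value (stands in for the stored copy)
    private final Map<Integer, String> lastSeen = new HashMap<>();

    /** Classifies the record and remembers its timestamp for the next poll. */
    ChangeType classify(int id, String updatedAt) {
        Objects.requireNonNull(updatedAt, "updatedAt");
        String previous = lastSeen.put(id, updatedAt);
        if (previous == null) {
            return ChangeType.NEW;
        }
        return previous.equals(updatedAt) ? ChangeType.UNCHANGED : ChangeType.MODIFIED;
    }
}
```

Only records classified as NEW or MODIFIED would then need to be fed into the impact analysis.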

3.2.2 Code Change Monitoring

We leverage Huolala’s existing Jacoco‑based incremental code scan platform. It provides class and method lists for changed code, which we map to corresponding completions.

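import java.util.ArrayList;
import java.util.List;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

/** Recursively collect class names from the HTML coverage report at the given URL. */
public List<String> getClassPathByUrl(String url) {
    List<String> classList = new ArrayList<>();
    Document doc = Jsoup.parse(sendGet(url));
    // Pages whose title contains "release" are treated as the report root and
    // contribute no package prefix; any other title is used as the package name.
    String packageName = doc.title().contains("release") ? "" : doc.title();
    String prefix = packageName.isEmpty() ? "" : packageName + ".";
    // Class links on the current page become fully qualified class names.
    for (Element classNode : doc.select("a.el_class")) {
        classList.add(prefix + classNode.text());
    }
    // Package links are followed recursively into each package's index page.
    for (Element packageNode : doc.select("a.el_package")) {
        String urlNext = url + "/" + packageNode.text() + "/index.html";
        classList.addAll(getClassPathByUrl(urlNext));
    }
    return classList;
}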
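Once the changed classes are known, mapping them to completions is a lookup. The sketch below assumes a registry from completion name to implementing class; the article does not describe how that mapping is kept, so `ChangeMapper`, `affectedCompletions`, and the registry shape are all hypothetical.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: map changed classes to the completions they implement. */
class ChangeMapper {
    /**
     * @param completionToClass assumed registry: completion name -> implementing class
     * @param changedClasses    fully qualified names of classes reported as changed
     * @return completions whose implementing class changed, in registry iteration order
     */
    static List<String> affectedCompletions(Map<String, String> completionToClass,
                                            Collection<String> changedClasses) {
        Set<String> changed = new HashSet<>(changedClasses);
        List<String> affected = new ArrayList<>();
        for (Map.Entry<String, String> entry : completionToClass.entrySet()) {
            if (changed.contains(entry.getValue())) {
                affected.add(entry.getKey());
            }
        }
        return affected;
    }
}
```

The resulting completion names are natural starting nodes for the traversal of the reference graph.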

3.2.3 Traversing Referenced Tree

We perform a breadth‑first search on the left (referenced) subtree to collect all features or completions affected by a change.

Breadth‑first search visits each node once, level by level.

// Maps node name -> node in visit order; it also serves as the visited set.
LinkedHashMap<String, RiskNode> linkedHashMap = new LinkedHashMap<>();
Queue<RiskNode> queue = new LinkedList<>();
queue.add(startNode);
while (!queue.isEmpty()) {
    RiskNode node = queue.poll();
    // Skip nodes already visited, so shared or cyclic references are
    // processed only once and the loop is guaranteed to terminate.
    if (linkedHashMap.putIfAbsent(node.getName(), node) != null) {
        continue;
    }
    queue.addAll(getNextNodes(lines, node));
}

The traversal reduces impact analysis time from 4 hours to 10 minutes.

4. Scenario‑Based Testing

Risk‑control testing is divided into three stages: risk‑control testing, integration testing, and regression testing. We built direct testing, scenario‑based testing, historical replay, and requirement scenario libraries to streamline the process.

4.1 Direct Risk‑Control Testing

4.1.1 Quick Test

We fetch risk‑control configuration, parse input parameters, and invoke the risk engine directly without upstream traffic, adding automatic assertions.

Figure: Quick test

Test result:

Figure: Quick test page

4.1.2 Completion Mock

We mock completion calls using Huolala’s Java mock platform (JVM‑sandbox bytecode enhancement) to bypass external data sources.

Figure: Mock configuration

4.2 Scenario‑Based Testing

We employ component‑based, keyword‑driven, and data‑driven automation to construct reusable test actions.

Component‑based automation

Keyword‑driven automation

Data‑driven automation

Components from the data factory are wrapped as keywords, enabling rapid scenario composition.

Figure: Keyword configuration

4.2.1 Components and Keywords

External system functionalities and risk‑control actions are encapsulated as components and then exposed as keywords for scenario building.

Figure: Keyword configuration

4.2.2 Scenario Orchestration

We add expected steps (keywords), adjust order, and provide input parameters for each step.

Figure: Scenario orchestration

4.2.3 Scenario Execution

During execution, scenario inputs become global variables; each step reads and updates these variables, achieving data‑driven flow.

Figure: Scenario execution
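The execution model described above, where scenario inputs seed shared variables and each keyword step reads and updates them, can be sketched as follows. This is an illustration under assumptions rather than the platform's code: `ScenarioRunner`, `register`, and `run` are hypothetical names, and keywords are modelled as plain `Consumer` callbacks.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch of keyword-driven, data-driven scenario execution. */
class ScenarioRunner {
    // keyword name -> action that reads and updates the shared variables
    private final Map<String, Consumer<Map<String, Object>>> keywords = new HashMap<>();

    void register(String name, Consumer<Map<String, Object>> action) {
        keywords.put(name, action);
    }

    /** Runs the steps in order; a copy of the scenario inputs seeds the shared variables. */
    Map<String, Object> run(List<String> steps, Map<String, Object> inputs) {
        Map<String, Object> vars = new HashMap<>(inputs);
        for (String step : steps) {
            Consumer<Map<String, Object>> action = keywords.get(step);
            if (action == null) {
                throw new IllegalArgumentException("Unknown keyword: " + step);
            }
            action.accept(vars); // each step sees what earlier steps wrote
        }
        return vars;
    }
}
```

A scenario is then just an ordered list of keyword names plus an input map, for example a step that creates an order and writes its id, followed by a step that calls the risk engine using that id.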

4.3 Scenario Replay

4.3.1 Historical Scenario Replay

Failed cases are recorded with their inputs and steps; replaying them validates bug fixes.

4.3.2 Requirement Scenario Library

Requirement‑level configurations (strategies, features, completions) are stored, enabling impact queries and reuse across regressions.

Figure: Requirement scenario library

5. Global Interception

Risk‑control accuracy depends on upstream field completeness. To proactively catch non‑compliant data, we introduced a full‑domain interception mechanism in the pre‑environment.

5.1 Global Interception Solution

All changes first pass through a pre‑environment where a full‑traffic control policy validates fields against compliance rules, returning intercept codes and notifying testers.

Figure: Global interception solution

5.2 Global Risk‑Control Strategy

Compliance conditions are derived from interface docs (e.g., user_type == 2 && ep_id is empty) and combined into interception policies that block non‑compliant requests.
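A policy of this kind can be sketched as a list of rules, each pairing a condition with an intercept code. The sketch assumes the example condition (`user_type == 2` with an empty `ep_id`) describes the case to be intercepted; `InterceptionPolicy`, the rule structure, and the code `EP_ID_MISSING` used with it are hypothetical, not the real policy engine.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

/** Hypothetical sketch of a field-compliance interception policy. */
class InterceptionPolicy {
    /** One rule: a violation condition plus the code returned when it matches. */
    static final class Rule {
        final String interceptCode;
        final Predicate<Map<String, Object>> violation;

        Rule(String interceptCode, Predicate<Map<String, Object>> violation) {
            this.interceptCode = interceptCode;
            this.violation = violation;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    void addRule(String interceptCode, Predicate<Map<String, Object>> violation) {
        rules.add(new Rule(interceptCode, violation));
    }

    /** Returns the code of the first violated rule, or empty if the request complies. */
    Optional<String> check(Map<String, Object> request) {
        for (Rule rule : rules) {
            if (rule.violation.test(request)) {
                return Optional.of(rule.interceptCode);
            }
        }
        return Optional.empty();
    }

    /** Treats a missing, null, or blank value as empty. */
    static boolean isEmpty(Object value) {
        return value == null || value.toString().trim().isEmpty();
    }
}
```

The example condition would be registered as `addRule("EP_ID_MISSING", req -> Integer.valueOf(2).equals(req.get("user_type")) && InterceptionPolicy.isEmpty(req.get("ep_id")))`, and a non‑empty result from `check` is what would trigger the notification to testers.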

Figure: Strategy rollout process

5.3 Issue Tracking

Intercepted flows trigger Feishu notifications; testers create follow‑up tickets to investigate and resolve data quality defects.

Figure: Interception follow‑up ticket

5.4 Intercept Bug Tracking

Long‑standing issues are catalogued and periodically pushed to product owners for resolution.

Figure: Pending bug list

6. Results and Benefits

6.1 End‑to‑End Efficiency Gains

Risk‑control scenario testing: 100+ data‑bound scenarios and 2000+ executions, saving 600+ hours of testing effort.

Full‑process improvement: total effort reduced from 26 hours to 2.58 hours.

6.2 Data Quality Improvement

Global interception: 60+ policies blocked >70 k abnormal flows, generated 80+ tickets, uncovered 40+ data defects.

Product documentation standards: identified 20+ doc issues, improving upstream data contracts.

Online data quality: mitigated mis‑judgment risk for 8.71% of traffic.

7. Future Outlook

We plan to further enhance risk‑control testing with AI‑driven capabilities.

Intelligent labelling: apply deep learning to auto‑label abnormal traffic, reducing manual effort.

Intelligent case generation: auto‑generate functional and automation test cases from the strategy graph.

Intelligent testing: bind cases to scenarios and combine them with traffic replay for AI‑assisted validation.
