
Precise Testing Technology: Definition, Implementation, and Practice

Precise testing technology uses static code scanning and dynamic tracing to build a call graph in Neo4j, then automatically recommends test scopes and test cases through diff analysis and weighted relationships (call count, module, text similarity, and GCN), improving test adequacy, shortening regression cycles, and sharply reducing test execution time.

Bilibili Tech

Background

Traditional software testing relies heavily on testers' understanding of business logic, which often leads to insufficient coverage due to limited experience, system complexity, and gaps between assumptions and real data. With the rise of distributed microservice architectures and big-data technologies, software has grown more complex and release cycles faster, compounding the testing challenge. Precise testing technology emerged to provide more accurate and efficient testing methods.

Definition

Precise testing aims to solve three core problems: measuring the adequacy of a limited test set, selecting and executing that set effectively, and automating the process with higher precision. Its key characteristics are:

Completeness – measuring test adequacy via code‑coverage metrics.

Accuracy – using precise recommendations to replace manual impact analysis for regression.

Speed – automatic recommendation of test cases and rapid failure localization.

Thus, precise testing is defined as a method that measures and improves test adequacy based on the relationships between code and test cases.

Implementation Approach

The solution consists of two parallel pipelines:

Static Scan + Test‑Scope Recommendation

1. Statically scan the source code to obtain a basic function call graph.
2. Parse the raw scan data and store the results in Neo4j.
3. Use git diff to capture version differences and query the graph for impacted interfaces.
4. Recommend a test scope based on the analysis.
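The core of the scope recommendation is walking the call graph upward from changed functions to the interfaces that reach them. The article stores the graph in Neo4j, but the traversal itself can be sketched in plain Python; all function names here are illustrative:

```python
from collections import deque

def impacted_interfaces(call_graph, changed, interfaces):
    """Walk the call graph upward from changed functions to find
    which entry interfaces are impacted.

    call_graph: dict mapping caller -> set of callees
    changed:    set of functions reported changed by `git diff`
    interfaces: set of functions that serve as entry points
    """
    # Reverse the edges so we can walk from a callee to its callers.
    callers = {}
    for caller, callees in call_graph.items():
        for callee in callees:
            callers.setdefault(callee, set()).add(caller)

    seen, queue = set(changed), deque(changed)
    while queue:
        fn = queue.popleft()
        for caller in callers.get(fn, ()):
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return seen & interfaces

graph = {
    "api_get_user": {"load_user"},
    "api_list_feed": {"load_feed", "load_user"},
    "load_user": {"query_db"},
    "load_feed": {"query_db"},
}
print(impacted_interfaces(graph, {"query_db"},
                          {"api_get_user", "api_list_feed"}))
```

In the real pipeline the same reachability question is answered by a Cypher query over the `calls` relationships rather than an in-memory dictionary.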

Dynamic Trace + Test‑Case Recommendation

1. Instrument the business code.
2. Execute business flows or automated test cases on the instrumented code.
3. Collect weighted "test case → function call chain" data.
4. Use git diff to detect version changes.
5. Recommend the relevant test cases.
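The article does not show the instrumentation layer, but the "test case → function call chain" collection step can be sketched with a profiling hook that records which functions each test case enters (a minimal stand-in for real bytecode or agent-based instrumentation; the case and function names are made up):

```python
import sys
from collections import Counter

def trace_case(case_name, case_fn, registry):
    """Run one test case under a profile hook and record how many
    times each Python function is entered during the run."""
    counts = Counter()

    def profiler(frame, event, arg):
        if event == "call":  # Python-level function entries only
            counts[frame.f_code.co_name] += 1

    sys.setprofile(profiler)
    try:
        case_fn()
    finally:
        sys.setprofile(None)
    registry[case_name] = counts

# Hypothetical business function and test case:
def load_user():
    return {"id": 1}

def test_user_profile():
    load_user()
    load_user()

registry = {}
trace_case("test_user_profile", test_user_profile, registry)
print(registry["test_user_profile"])  # per-function call counts
```

The recorded counts are exactly the raw material for the call-count weighting described later.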

Technical Architecture

Key components include:

Static code scanning (custom algorithm + AST scanning) to build a call‑graph.

Storing the graph in Neo4j with relationships: (package)-[contains]->(file)-[contains]->(function)-[calls]->(function).

Version‑diff extraction via the Git open API to identify changed functions.

Graph traversal to locate impacted interfaces.

Visualization of version differences and impacted components.
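The version-diff step boils down to mapping changed line ranges from a unified diff onto the function line ranges already captured in the graph. A minimal sketch, assuming the function ranges come from the AST scan (all names illustrative):

```python
import re

# Hunk headers in a unified diff look like "@@ -a,b +c,d @@".
HUNK = re.compile(r"@@ -\d+(?:,\d+)? \+(\d+)(?:,(\d+))? @@")

def changed_lines(diff_text):
    """Collect the new-file line numbers touched by each hunk."""
    lines = set()
    for start, count in HUNK.findall(diff_text):
        start, count = int(start), int(count or 1)
        lines.update(range(start, start + count))
    return lines

def changed_functions(diff_text, functions):
    """functions: dict name -> (first_line, last_line), as recorded
    for each function node during scanning."""
    touched = changed_lines(diff_text)
    return {name for name, (lo, hi) in functions.items()
            if touched & set(range(lo, hi + 1))}

diff = "@@ -10,3 +10,4 @@ def load_user\n"
funcs = {"load_user": (8, 15), "load_feed": (20, 30)}
print(changed_functions(diff, funcs))
```

The resulting changed-function set is the seed for the graph traversal that locates impacted interfaces.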

Static Scanning Details

Two rounds of scanning are performed:

Custom algorithm extracts the basic call chain (function nodes and call relations).

AST scanning enriches nodes with additional metadata.

AST (Abstract Syntax Tree) represents the syntactic structure of source code as a tree.
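As a concrete illustration of the AST round, Python's standard `ast` module can extract function nodes with their metadata (line numbers, parameters) and their outgoing calls in a few lines; the sample source is made up:

```python
import ast

source = """
def query_db(sql):
    return []

def load_user(uid):
    return query_db("select ...")
"""

tree = ast.parse(source)
call_graph = {}
for node in ast.walk(tree):
    if isinstance(node, ast.FunctionDef):
        # Collect simple-name calls made anywhere inside this function.
        callees = {
            sub.func.id
            for sub in ast.walk(node)
            if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name)
        }
        call_graph[node.name] = {
            "lineno": node.lineno,
            "args": [a.arg for a in node.args.args],
            "calls": callees,
        }
print(call_graph["load_user"])
```

A production scanner for another language would use that language's own parser, but the shape of the extracted data (node metadata plus call edges) is the same.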

Data Persistence

After obtaining the call‑graph, it is transformed into Cypher statements and stored in Neo4j. The graph captures package → file → function relationships along with path, location, parameters, comments, and line numbers.
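The transformation into Cypher can be sketched as a simple template that emits idempotent `MERGE` statements for one package → file → function path; the node labels and property names below are illustrative, not the article's actual schema:

```python
def to_cypher(pkg, file, fn, meta):
    """Render one package -> file -> function path as a Cypher
    MERGE statement (labels and property names are assumptions)."""
    return (
        f"MERGE (p:Package {{name: '{pkg}'}}) "
        f"MERGE (f:File {{path: '{file}'}}) "
        f"MERGE (fn:Function {{name: '{fn}', "
        f"lineno: {meta['lineno']}, params: '{meta['params']}'}}) "
        "MERGE (p)-[:contains]->(f) "
        "MERGE (f)-[:contains]->(fn)"
    )

stmt = to_cypher("user", "user/service.go", "LoadUser",
                 {"lineno": 42, "params": "uid int64"})
print(stmt)
```

`MERGE` rather than `CREATE` keeps re-scans idempotent: rescanning an unchanged function matches the existing node instead of duplicating it. Real code would use parameterized queries via a Neo4j driver instead of string interpolation.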

Test‑Scope Recommendation

A visual page shows code differences, impacted server interfaces, and client components.

Test‑Case Recommendation Practice

Weighting strategies for the "test‑case → function" relationship include:

Call‑Count Weighting : counts how many times a function is invoked by a test case.

Module Weighting : binds functions under the same business module to a test case.

Text‑Similarity Weighting : computes TF‑IDF similarity between test‑case texts.

GCN (Graph Convolutional Network) Weighting : builds a sparse C×N matrix (C = number of test cases, N = number of functions), normalizes it, and trains a two‑layer GCN. The output embeddings are reduced with t‑SNE for Euclidean distance calculation and stored back in Neo4j.
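The text-similarity weighting can be sketched without any library: compute TF-IDF vectors for the test-case texts and compare them with cosine similarity (the tokenization and case texts below are illustrative):

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weights for a list of tokenized documents."""
    n = len(docs)
    df = Counter(t for doc in docs for t in set(doc))  # document frequency
    vecs = []
    for doc in docs:
        tf = Counter(doc)
        vecs.append({t: (c / len(doc)) * math.log(n / df[t])
                     for t, c in tf.items()})
    return vecs

def cosine(u, v):
    dot = sum(u[t] * v[t] for t in u if t in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

cases = [
    "verify user login with valid password".split(),
    "verify user login with expired password".split(),
    "upload video and check transcode status".split(),
]
vecs = tfidf_vectors(cases)
print(cosine(vecs[0], vecs[1]), cosine(vecs[0], vecs[2]))
```

Two login-related cases score much closer to each other than either does to the unrelated upload case, which is what lets similar test cases share recommendation weight.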

Examples of each weighting method are illustrated with diagrams in the original material.
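For the GCN weighting, the forward pass of a two-layer GCN with the standard symmetrically normalized adjacency can be sketched in NumPy. As a simplifying assumption (the article works directly on the C×N case-function matrix), the bipartite links are folded here into one (C+N)-node graph, and training plus the t-SNE reduction step are omitted:

```python
import numpy as np

def normalize_adj(a):
    """Symmetrically normalize an adjacency matrix with self-loops:
    D^(-1/2) (A + I) D^(-1/2), the standard GCN normalization."""
    a_hat = a + np.eye(a.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(a_hat.sum(axis=1))
    return a_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def gcn_forward(a_norm, x, w1, w2):
    """Two-layer GCN: ReLU(A X W1), then A H W2 as the embedding."""
    h = np.maximum(a_norm @ x @ w1, 0.0)   # layer 1 + ReLU
    return a_norm @ h @ w2                 # layer 2 (node embeddings)

rng = np.random.default_rng(0)
# 2 test cases, 3 functions; links[i, j] = 1 if case i exercises fn j.
links = np.array([[1, 1, 0],
                  [0, 1, 1]], dtype=float)
a = np.block([[np.zeros((2, 2)), links],
              [links.T, np.zeros((3, 3))]])
x = np.eye(5)                              # one-hot node features
emb = gcn_forward(normalize_adj(a), x,
                  rng.standard_normal((5, 8)),
                  rng.standard_normal((8, 2)))
print(emb.shape)                           # one embedding row per node
```

After training, distances between the embedding rows (case vs. function) give the learned weights that are written back to Neo4j.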

Future Outlook

Planned improvements include scaling GCN accuracy with more incremental test cases and adding call‑chain code coloring with visualizations to aid debugging.

Impact

The platform has reduced iteration cycles from three weeks to two weeks, cut automated test execution by 80% in early stages, decreased regression test volume by 60%, and lowered coverage‑related code statistics by 90%.

Tags: Code Coverage, graph database, software testing, static analysis, GCN, Dynamic Analysis, test case recommendation
Written by

Bilibili Tech

Provides introductions and tutorials on Bilibili-related technologies.
