
Intelligent Merchant‑Side Diagnostic System: Architecture, Rule Engine, and Data Center

The article describes an intelligent merchant‑side diagnostic platform that unifies ad‑operation data in a centralized lake, uses a low‑code rule engine with arithmetic, code, and Java class modes to orchestrate reusable SOPs, and employs an acceleration layer for fast large‑scale queries, achieving over 90% coverage and outlining future expansion.

Alimama Tech

The article introduces an intelligent diagnostic system for merchants that aims to streamline the handling of time‑consuming ad‑operation tickets. By combining data, metrics, rules, and a service‑oriented approach, the system provides rapid SOP configuration and automated diagnosis, thereby reducing manual effort and building a reusable knowledge base.

Construction Goals

1) Framework: Use SQL or a rule engine with low‑code techniques to improve iteration speed for both business and algorithm teams.

2) Data: Consolidate replay, operation logs, and performance data into a standardized advertising diagnosis data center.

3) Knowledge Base: Create reusable SOP rules (e.g., insufficient balance, campaign offline) that can be shared across multiple diagnostic flows.

Technical Solution

The system is built on a merchant‑side engine that standardizes data storage, service interfaces, and core operators (exposed as SQL and as functions). It uses the Dolphin data lake for unified storage and querying, and a rule engine for SOP orchestration.

Standardization: unified data access, service APIs, core operators.

Scalability: extensible data and rule layers.

Low‑code: abstracted data reads (SQL‑like), modular flow configuration.

Traceability: logging and monitoring.

Rule Engine

The engine abstracts SOP processes into standardized flows, separating business decisions from system logic. It supports three rule modes:

Simple arithmetic

Expressions built from the four basic arithmetic operations and conditional comparisons, which evaluate to true or false.

if (100.0 * (competition_times - adReplayInfo.getSn_real_competition_times()) / competition_times > 80)
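As a worked illustration of the arithmetic mode, the sketch below evaluates the same expression in plain Java. The `AdReplayInfo` class is a hypothetical stand-in (only the getter name appears in the article), and the guard against a zero denominator is an assumption, since the article does not say how that case is handled.

```java
// Hypothetical stand-in for the replay record used by the rule.
// Only the getter name comes from the article; the field type is assumed.
final class AdReplayInfo {
    private final long snRealCompetitionTimes;

    AdReplayInfo(long snRealCompetitionTimes) {
        this.snRealCompetitionTimes = snRealCompetitionTimes;
    }

    long getSn_real_competition_times() {
        return snRealCompetitionTimes;
    }
}

final class ArithmeticRuleDemo {
    // Same arithmetic as the rule: percentage gap between the two counters.
    static double gapPercent(long competitionTimes, AdReplayInfo adReplayInfo) {
        return 100.0 * (competitionTimes - adReplayInfo.getSn_real_competition_times())
                / competitionTimes;
    }

    // The rule "hits" when the gap exceeds 80 percent.
    static boolean ruleHits(long competitionTimes, AdReplayInfo adReplayInfo) {
        if (competitionTimes <= 0) {
            return false; // assumed guard: the article does not cover this case
        }
        return gapPercent(competitionTimes, adReplayInfo) > 80;
    }

    public static void main(String[] args) {
        System.out.println(ruleHits(1000, new AdReplayInfo(150))); // gap is 85%, prints true
        System.out.println(ruleHits(1000, new AdReplayInfo(400))); // gap is 60%, prints false
    }
}
```

With 1000 competition opportunities and 150 real competitions, the gap is 100.0 * 850 / 1000 = 85, so the rule returns true.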

Code mode

For complex data structures, users write code snippets using QLExpress syntax.

// Check every complaint bidword against the ad's category.
for (i = 0; i < bidwordList.size(); i++) {
    bidword = bidwordList.get(i);
    // Look up the category record for this bidword; null means no record exists.
    entity = keywordCatInfo.get(bidword);
    if (entity == null) {
        subSopDiagnosis.appendDetails(String.format("Complaint word [%s] does not match AD category.", bidword));
    } else if (!entity.getCate_map().containsKey(cate_id)) {
        // A record exists, but the ad's category is not among the bidword's categories.
        subSopDiagnosis.appendDetails(String.format("Complaint word [%s] mismatched, recommended category [%s].", bidword, entity.getMain_cate_full_name()));
    } else {
        goodWords.add(bidword);
        subSopDiagnosis.appendDetails(String.format("Complaint word [%s] matches AD category.", bidword));
    }
}
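For readers who want to run the logic outside the rule engine, here is a rough plain-Java equivalent of the snippet above. The `KeywordCateInfo` class, the `check` method, and the sample data are hypothetical; only the two getter names and the message texts come from the article, and a plain list stands in for `subSopDiagnosis`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical shape of the keyword-category record; only the getters appear in the article.
final class KeywordCateInfo {
    private final Map<String, String> cateMap;
    private final String mainCateFullName;

    KeywordCateInfo(Map<String, String> cateMap, String mainCateFullName) {
        this.cateMap = cateMap;
        this.mainCateFullName = mainCateFullName;
    }

    Map<String, String> getCate_map() { return cateMap; }
    String getMain_cate_full_name() { return mainCateFullName; }
}

final class CategoryMatchDemo {
    // Returns the bidwords whose categories include cateId and
    // appends one detail line per bidword to 'details'.
    static List<String> check(List<String> bidwordList,
                              Map<String, KeywordCateInfo> keywordCatInfo,
                              String cateId,
                              List<String> details) {
        List<String> goodWords = new ArrayList<>();
        for (String bidword : bidwordList) {
            KeywordCateInfo entity = keywordCatInfo.get(bidword);
            if (entity == null) {
                details.add(String.format("Complaint word [%s] does not match AD category.", bidword));
            } else if (!entity.getCate_map().containsKey(cateId)) {
                details.add(String.format("Complaint word [%s] mismatched, recommended category [%s].",
                        bidword, entity.getMain_cate_full_name()));
            } else {
                goodWords.add(bidword);
                details.add(String.format("Complaint word [%s] matches AD category.", bidword));
            }
        }
        return goodWords;
    }

    public static void main(String[] args) {
        // Made-up sample data covering all three branches.
        Map<String, KeywordCateInfo> info = Map.of(
                "sneakers", new KeywordCateInfo(Map.of("c1", "Shoes"), "Shoes"),
                "laptop", new KeywordCateInfo(Map.of("c9", "Computers"), "Computers"));
        List<String> details = new ArrayList<>();
        List<String> good = check(List.of("sneakers", "laptop", "unknown"), info, "c1", details);
        System.out.println(good); // prints [sneakers]
        details.forEach(System.out::println);
    }
}
```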

Java class mode

Users can provide a custom rule class that is dynamically loaded at runtime.
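The article does not show how custom classes are loaded, so the following is only a minimal sketch of one common approach: resolve the class by name with reflection and check that it implements a rule interface. `DiagnosisRule`, `InsufficientBalanceRule`, and `RuleLoader` are hypothetical names, and a production system would more likely use a dedicated class loader for user-supplied code.

```java
import java.util.Map;

// Hypothetical contract that every custom rule class would implement.
interface DiagnosisRule {
    boolean evaluate(Map<String, Object> context);
}

// Example custom rule: hits when the account balance is zero or negative.
final class InsufficientBalanceRule implements DiagnosisRule {
    @Override
    public boolean evaluate(Map<String, Object> context) {
        Object balance = context.get("balance");
        return balance instanceof Number && ((Number) balance).doubleValue() <= 0;
    }
}

final class RuleLoader {
    // Resolves a rule class by its fully qualified name and instantiates it
    // through its no-argument constructor.
    static DiagnosisRule load(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            if (!DiagnosisRule.class.isAssignableFrom(clazz)) {
                throw new IllegalArgumentException(className + " does not implement DiagnosisRule");
            }
            return (DiagnosisRule) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("Cannot load rule class " + className, e);
        }
    }

    public static void main(String[] args) {
        DiagnosisRule rule = load(InsufficientBalanceRule.class.getName());
        System.out.println(rule.evaluate(Map.of("balance", 0)));    // prints true
        System.out.println(rule.evaluate(Map.of("balance", 25.5))); // prints false
    }
}
```

Checking the interface before instantiating keeps a misconfigured class name from reaching the diagnostic flow as a cast failure.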

SOP Chain

The diagnostic chain is divided into four stages: parameter check, data read, rule organization, and result return. Each stage binds specific rule sets to form a complete diagnostic workflow.
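The four stages can be sketched as a small pipeline. Everything named here (`SopChain`, `SopResult`, `bind`, the stub data reader, the parameter and field names) is an assumption made for illustration; the two rule names are borrowed from the SOP examples mentioned under Construction Goals.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.function.Predicate;

// Outcome of one diagnostic run: whether the input was valid, plus messages.
final class SopResult {
    final boolean valid;
    final List<String> messages;

    SopResult(boolean valid, List<String> messages) {
        this.valid = valid;
        this.messages = messages;
    }
}

final class SopChain {
    private final List<String> requiredParams;
    private final Function<Map<String, Object>, Map<String, Object>> dataReader;
    // LinkedHashMap keeps rules in the order they were bound.
    private final Map<String, Predicate<Map<String, Object>>> rules = new LinkedHashMap<>();

    SopChain(List<String> requiredParams,
             Function<Map<String, Object>, Map<String, Object>> dataReader) {
        this.requiredParams = requiredParams;
        this.dataReader = dataReader;
    }

    SopChain bind(String name, Predicate<Map<String, Object>> rule) {
        rules.put(name, rule);
        return this;
    }

    SopResult run(Map<String, Object> params) {
        // Stage 1: parameter check.
        for (String p : requiredParams) {
            if (!params.containsKey(p)) {
                return new SopResult(false, List.of("missing parameter: " + p));
            }
        }
        // Stage 2: data read.
        Map<String, Object> data = new HashMap<>(params);
        data.putAll(dataReader.apply(params));
        // Stage 3: rule organization, evaluating the bound rules in order.
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Predicate<Map<String, Object>>> e : rules.entrySet()) {
            if (e.getValue().test(data)) {
                hits.add(e.getKey());
            }
        }
        // Stage 4: result return.
        return new SopResult(true, hits);
    }

    // Demo chain with a stub reader that always reports an empty balance.
    static SopChain demoChain() {
        return new SopChain(List.of("campaignId"),
                params -> Map.<String, Object>of("balance", 0.0, "online", true))
                .bind("insufficient balance", d -> ((Double) d.get("balance")) <= 0)
                .bind("campaign offline", d -> !((Boolean) d.get("online")));
    }

    public static void main(String[] args) {
        System.out.println(demoChain().run(Map.of("campaignId", 42L)).messages); // prints [insufficient balance]
        System.out.println(demoChain().run(Map.of()).messages); // prints [missing parameter: campaignId]
    }
}
```

Because rules are bound by name, the same predicate can be attached to several chains, which is one way the reusable SOP rules described earlier could be shared.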

Data Center

A unified SQL query engine offers both synchronous and asynchronous queries across Dolphin, IGraph, Hologres, HTTP, HSF, etc., supporting cross‑engine queries. Data is categorized into six groups (material, performance, delivery, intervention, operation, audience) and integrated from multiple sources (BP, SDS, ad engine, logs, DMP). Both offline (ETL to ODPS) and real‑time streams feed the data lake, enabling derived metrics and multi‑metric calculations.
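To make the synchronous and asynchronous query paths concrete, here is a toy sketch of a facade that dispatches a query to a named engine. The `UnifiedQuery` class, its methods, and the in-memory stub executors are all hypothetical; no real Dolphin or Hologres client is used, and the "cross-engine" step simply concatenates two result sets.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;

final class UnifiedQuery {
    // Engine name -> executor that turns a SQL string into rows.
    private final Map<String, Function<String, List<Map<String, Object>>>> engines = new HashMap<>();

    void register(String engine, Function<String, List<Map<String, Object>>> executor) {
        engines.put(engine, executor);
    }

    // Synchronous path: run the query on the calling thread.
    List<Map<String, Object>> query(String engine, String sql) {
        Function<String, List<Map<String, Object>>> executor = engines.get(engine);
        if (executor == null) {
            throw new IllegalArgumentException("unknown engine: " + engine);
        }
        return executor.apply(sql);
    }

    // Asynchronous path: run the same query on the common fork-join pool.
    CompletableFuture<List<Map<String, Object>>> queryAsync(String engine, String sql) {
        return CompletableFuture.supplyAsync(() -> query(engine, sql));
    }

    // Toy cross-engine query: run both queries concurrently and concatenate the rows.
    List<Map<String, Object>> queryBoth(String engineA, String sqlA, String engineB, String sqlB) {
        return queryAsync(engineA, sqlA)
                .thenCombine(queryAsync(engineB, sqlB), (a, b) -> {
                    List<Map<String, Object>> merged = new ArrayList<>(a);
                    merged.addAll(b);
                    return merged;
                })
                .join();
    }

    // In-memory stubs standing in for real engine clients.
    static UnifiedQuery demo() {
        UnifiedQuery q = new UnifiedQuery();
        q.register("dolphin", sql -> List.of(Map.<String, Object>of("campaign_id", 1L, "impressions", 1200L)));
        q.register("hologres", sql -> List.of(Map.<String, Object>of("campaign_id", 1L, "balance", 0.0)));
        return q;
    }

    public static void main(String[] args) {
        UnifiedQuery q = demo();
        System.out.println(q.query("dolphin", "SELECT ...").size()); // prints 1
        System.out.println(q.queryBoth("dolphin", "SELECT ...", "hologres", "SELECT ...").size()); // prints 2
    }
}
```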

Acceleration Engine

To handle low‑QPS but large‑volume queries, the system uses external tables to read data directly from HDFS, returning results within seconds even for queries that span long time ranges. Optimizations include intelligent column ordering, dynamic row‑group sizing, and local caching of indexes and data.
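The article does not describe how the local cache works. As one possible reading, the sketch below shows a small least-recently-used (LRU) cache built on `LinkedHashMap`; the LRU policy, the `LocalBlockCache` name, and the string keys are assumptions, not details from the source.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Minimal LRU cache: keeps at most 'capacity' entries and evicts the
// least recently accessed one when a new entry would exceed that limit.
final class LocalBlockCache<K, V> {
    private final LinkedHashMap<K, V> map;
    private int loads = 0;

    LocalBlockCache(int capacity) {
        // accessOrder = true makes iteration order follow recency of access.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity;
            }
        };
    }

    // Returns the cached value, or loads it (for example from remote storage) on a miss.
    synchronized V get(K key, Function<K, V> loader) {
        V value = map.get(key);
        if (value == null) {
            value = loader.apply(key);
            loads++;
            map.put(key, value);
        }
        return value;
    }

    synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) {
        LocalBlockCache<String, String> cache = new LocalBlockCache<>(2);
        Function<String, String> loader = k -> "data-for-" + k; // stub for a remote read
        cache.get("a", loader); // miss
        cache.get("b", loader); // miss
        cache.get("a", loader); // hit, "a" becomes most recent
        cache.get("c", loader); // miss, evicts "b"
        cache.get("b", loader); // miss again, evicts "a"
        System.out.println(cache.loadCount()); // prints 4
    }
}
```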

Summary & Outlook

The system currently covers display fluctuation, traffic inaccuracy, and relevance optimization SOPs, achieving high coverage and accuracy (>90%). Future work includes expanding to more ticket types, front‑ending the service layer, and leveraging multi‑dimensional time‑series data for proactive attribution analysis.
