Rule‑Based NLQ vs LLMs: How ChatBI’s MQL Engine Delivers Precise BI Queries

This article explains how ChatBI's rule-based NLQ component replaces a large language model with a dictionary-driven architecture: a custom Metrics Query Language (MQL) transforms natural-language business questions into accurate SQL. It highlights the approach's stability, low cost, and transparency, along with its limitations compared to LLM solutions.

Big Data Technology & Architecture

Core Mechanism

Instead of a generative "brain" like a large language model, the NLQ engine acts as a meticulous "traffic control tower" that relies on a pre‑populated, highly detailed "city traffic manual"—the dictionary and domain knowledge base.

Dictionary Structure

The dictionary is the soul of NLQ and consists of several layered vocabularies:

Data‑Metadata Dictionary: defines tables (e.g., the order table), fields (customer name, order amount), and data types (text, number, date). This forms the basic map.

Business Semantic Dictionary:

Dimension Words: dimensions such as year, month, province, product category.

Metric Words: business metrics like sales amount or monthly active users; each metric can bind a calculation formula (e.g., sales = unit_price * quantity).

Constant Words: enumerated values like city names that map to internal IDs.

Query Logic Dictionary:

Comparison Words: >, ≤, between, each mapped to an expression (e.g., ?1 > ?2).

Aggregation Words: sum, avg, max, corresponding to SQL functions.

Invalid Words: filler phrases that are filtered out.
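The layered vocabularies above can be sketched as plain data. The structure and all names below are illustrative assumptions, since the article does not show ChatBI's actual schema:

```python
# Toy version of the layered dictionary (all names are hypothetical).
DICTIONARY = {
    # Data-metadata layer: tables, fields, data types.
    "metadata": {
        "orders": {"fields": {"customer_name": "text",
                              "order_amount": "number",
                              "order_date": "date"}},
    },
    # Business semantic layer.
    "dimensions": {"year": "orders.order_date", "city": "orders.ship_city"},
    "metrics": {"sales": "SUM(unit_price * quantity)"},   # metric binds a formula
    "constants": {"Beijing": 110000, "Qingdao": 370200},  # city name -> internal ID
    # Query logic layer.
    "comparisons": {">": "?1 > ?2", "between": "?1 BETWEEN ?2 AND ?3"},
    "aggregations": {"sum": "SUM", "avg": "AVG", "max": "MAX"},
    "invalid_words": {"please", "show", "me"},            # filler, filtered out
}
```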

From Natural Language to MQL

In BI scenarios, most queries can be abstracted as combinations of dimensions, metrics, and conditions. NLQ therefore defines a dedicated Metrics Query Language (MQL) as an intermediate representation: user input is first parsed into a structured MQL statement, which is then translated into SQL for execution.

Examples:

"Name, age, city, and province of employees over 40" (40 岁以上雇员姓名、年龄、城市和省) → NLQ identifies the age filter and returns the requested fields.

"Number of orders per month" (每月订单数) → MQL automatically groups by month and counts orders.

"Order ID, product name, supplier name, and city" (订单编码、商品名称、供应商名称和城市) → NLQ resolves multi-table relationships and returns aligned data.

"Payment count and total sales amount per year" (每年的付款数和总销售金额) → MQL aligns metrics from different fact tables on a common time dimension.

"Female employees whose total order amount exceeds 200,000 yuan" (订单金额总和大于 20 万元的女员工) → NLQ performs a sub-table aggregation and then filters employees.
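Since the article does not show MQL's concrete syntax, the sketch below models an MQL statement as a structured object of dimensions, metrics, and conditions, with a minimal translator to SQL; every name here is a hypothetical stand-in, not ChatBI's real API:

```python
# Hypothetical MQL shape: table + dimensions + metrics + conditions.
def mql_to_sql(mql: dict) -> str:
    """Translate a toy MQL statement into a SQL string."""
    select = mql["dimensions"] + [f"{expr} AS {name}"
                                  for name, expr in mql["metrics"].items()]
    sql = f"SELECT {', '.join(select)} FROM {mql['table']}"
    if mql.get("conditions"):
        sql += " WHERE " + " AND ".join(mql["conditions"])
    if mql["dimensions"]:
        sql += " GROUP BY " + ", ".join(mql["dimensions"])
    return sql

# "Number of orders per month" as a structured MQL statement.
mql = {
    "table": "orders",
    "dimensions": ["month(order_date)"],
    "metrics": {"order_count": "COUNT(*)"},
    "conditions": [],
}
print(mql_to_sql(mql))
# SELECT month(order_date), COUNT(*) AS order_count FROM orders GROUP BY month(order_date)
```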

Concrete Query Process

Tokenization & Filtering: The sentence "last year's orders shipped from Beijing to Qingdao" (去年北京发往青岛的订单) is split into tokens 去年 (last year), 北京 (Beijing), 发往 (shipped to), 青岛 (Qingdao), 订单 (orders), and stop words are removed.

Dictionary Matching & Semantic Linking:

去年 (last year) → matches the "year" dimension; the expression becomes year(ADDYEARS(now(), -1)).

北京 / 青岛 (Beijing / Qingdao) → constant words under the "city" dimension, mapped to internal city IDs.

发往 (shipped to) → identified as a verb linked to the "shipping" field cluster, which includes shipping city, receiving city, shipping time, etc.

订单 (orders) → matches the "order" entity, establishing the primary table and default fields.

MQL Generation: All matched elements are assembled into a single MQL statement describing the query logic.

Execution & Return: The MQL engine converts the MQL into executable SQL, runs it on the database, and returns the result set.
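The matching steps above can be sketched end-to-end. A pre-segmented token list stands in for a real Chinese segmenter, and the toy dictionary (with English tokens for readability) is an assumption:

```python
# Toy pipeline: filter stop words, then match every token against the
# dictionary; unmatched terms fail loudly instead of being guessed at.
STOP_WORDS = {"the", "of", "show"}
DICT = {
    "last year":  ("dimension", "year(ADDYEARS(now(), -1))"),
    "Beijing":    ("constant", 110000),
    "Qingdao":    ("constant", 370200),
    "shipped to": ("verb", "shipping_field_cluster"),
    "orders":     ("entity", "orders"),
}

def parse(tokens: list) -> list:
    matched = []
    for tok in tokens:
        if tok in STOP_WORDS:              # step 1: drop invalid words
            continue
        if tok not in DICT:                # no hallucination: report it
            raise ValueError(f"unrecognizable term: {tok}")
        matched.append((tok, *DICT[tok]))  # step 2: dictionary matching
    return matched  # step 3 would assemble these elements into one MQL statement

print(parse(["last year", "Beijing", "shipped to", "Qingdao", "orders"]))
```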


Hard‑Core Advantages

Stability & No Hallucination: If a term is missing from the dictionary, NLQ reports "unrecognizable" instead of fabricating an answer.

Low Cost & Easy Deployment: Rule-engine computation runs on ordinary CPUs, avoiding the expensive GPU clusters required by LLMs.

Transparent & Maintainable Knowledge: Business rule changes (e.g., adjusting the sales formula) are made by editing the metric dictionary, unlike the opaque fine-tuning required for LLMs.
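To illustrate that last point: assuming metrics live in an editable dictionary (a simplification of whatever store ChatBI actually uses), a business rule change is a one-line data edit rather than a model retrain:

```python
# Metric dictionary before the rule change (formulas are illustrative).
metric_dictionary = {"sales": "unit_price * quantity"}

# Business rule change: sales should now exclude discounts.
metric_dictionary["sales"] = "unit_price * quantity - discount"

# Every future query referencing "sales" immediately uses the new formula;
# nothing is retrained or fine-tuned.
print(metric_dictionary["sales"])
```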

Limitations

Flexibility: NLQ cannot handle highly informal queries like "most popular products"; it requires relatively structured language.

Manual Knowledge Updates: New business metrics must be added to the dictionary by administrators; the system does not learn autonomously.

Hybrid Architecture Idea

A practical design combines the strengths of both approaches: an LLM serves as an "intelligent front‑desk" to converse freely with users and translate informal utterances into the more formal language that NLQ can understand. After user confirmation, NLQ executes the precise query, delivering accurate results at low cost.

This synergy offers the conversational friendliness of LLMs while retaining the deterministic, cost‑effective execution of the rule‑based NLQ engine.
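A minimal sketch of this division of labor, with a hypothetical `llm_rephrase` function (here faked by a lookup table) standing in for a real LLM call:

```python
# Hypothetical stand-in for an LLM that normalizes informal phrasing
# into the structured language NLQ understands.
def llm_rephrase(utterance: str) -> str:
    canned = {"most popular products": "top 10 products by order count"}
    return canned.get(utterance, utterance)

def hybrid_query(utterance, nlq_engine, confirm=lambda formal: True):
    formal = llm_rephrase(utterance)   # LLM as the "intelligent front desk"
    if not confirm(formal):            # user confirms the formal rewrite
        raise RuntimeError("user rejected the rephrased query")
    return nlq_engine(formal)          # deterministic rule-based execution

# Usage with a dummy NLQ engine that just echoes the query it received:
print(hybrid_query("most popular products", lambda q: f"SQL for: {q}"))
# SQL for: top 10 products by order count
```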

Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.

Written by Wang Zhiwu (Big Data Technology & Architecture), a big data expert dedicated to sharing big data technology.