How LLMs Are Revolutionizing Natural Language to SQL for Intelligent Data Queries
This article explores how large language models break the natural‑language‑to‑SQL barrier, outlines the challenges of NLP‑driven data retrieval, compares Text2SQL and Text2DSL approaches, and proposes a unified data service and metric platform to power enterprise‑grade ChatBI solutions.
Background
In the AI wave, large language models (LLMs) can translate natural language into SQL, unlocking the previously unreachable "intelligent data query" scenario and promising faster decision‑making for enterprises.
At first glance, general-purpose NLP-to-SQL seems to solve everything, letting business users without SQL knowledge obtain accurate information quickly.
In practice, the "intelligent query" problem is not just about LLMs understanding language and translating it to SQL; user questions often carry complex metric intent and internal terminology, which makes the task much harder.
ChatBI Industry Approach
ChatBI (chat‑based business intelligence) enables users to interact with data via natural language, simplifying business data queries.
1. NLP→x→SQL
Two main solutions dominate the market: Text2SQL and Text2DSL.
Drawbacks of Text2SQL: generating accurate, executable SQL is challenging.
- The model must understand SQL syntax, the target schema, and vendor dialects, and needs clear table and field descriptions.
- Complex business queries (cross-table joins, custom metrics) demand strong semantic understanding and logical reasoning.
- Wide tables with hundreds of columns cause performance problems, pushing response times past 3 seconds and hurting the user experience.
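To make the schema-grounding requirement concrete, here is a minimal sketch of how a Text2SQL prompt might be assembled from DDL and field comments. The function name and prompt wording are illustrative assumptions, not a reference implementation:

```python
# Sketch: ground the model in DDL plus per-field comments before asking for SQL.
# `build_text2sql_prompt` and its wording are hypothetical.
def build_text2sql_prompt(question: str, ddl: str, field_notes: dict[str, str]) -> str:
    notes = "\n".join(f"- {col}: {desc}" for col, desc in field_notes.items())
    return (
        "You are a SQL generator. Use ONLY the tables and columns below.\n"
        f"Schema:\n{ddl}\n"
        f"Field notes:\n{notes}\n"
        f"Question: {question}\n"
        "Return a single executable SQL statement."
    )

prompt = build_text2sql_prompt(
    "Total orders per city last week",
    "CREATE TABLE orders (id BIGINT, city VARCHAR(64), created_at DATETIME);",
    {"city": "city where the order was placed", "created_at": "order creation time"},
)
```

The prompt would then be sent to the LLM; the point is that every drawback above (dialects, field semantics, wide tables) shows up as more context the prompt must carry.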
Text2DSL instead converts natural language to a domain-specific language (DSL) that abstracts metric, dimension, and filter configuration, a much more constrained output that is easier to generate reliably.
Text2DSL essentially follows a Text→DSL→SQL pipeline; DSL abstracts SQL, and SQL executes the data retrieval. While Text2DSL is easier to implement and can provide more accurate, timely results when the metric system covers most user needs, it still faces challenges such as reliance on a mature metric system and limited query flexibility.
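The Text→DSL→SQL pipeline can be illustrated with a toy compiler: the model emits a constrained structure (metric, dimensions, filters) and deterministic code renders the SQL. The DSL shape and table names here are assumptions for illustration:

```python
# Toy DSL-to-SQL step: the LLM only has to fill in a small dict;
# SQL generation itself is deterministic code, not model output.
def dsl_to_sql(dsl: dict) -> str:
    dims = dsl.get("dimensions", [])
    cols = ", ".join(dims + [dsl["metric"]])
    sql = f"SELECT {cols} FROM {dsl['table']}"
    filters = dsl.get("filters", {})
    if filters:
        sql += " WHERE " + " AND ".join(f"{k} = '{v}'" for k, v in filters.items())
    if dims:
        sql += " GROUP BY " + ", ".join(dims)
    return sql

q = dsl_to_sql({
    "table": "fact_orders",
    "metric": "SUM(amount) AS gmv",
    "dimensions": ["city"],
    "filters": {"dt": "2024-05-01"},
})
# q == "SELECT city, SUM(amount) AS gmv FROM fact_orders WHERE dt = '2024-05-01' GROUP BY city"
```

Because the SQL template is fixed, accuracy depends on the metric system covering the question, which is exactly the limitation noted above.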
2. Knowledge Base Enhancement
NLP‑to‑SQL requires database DDL and business data knowledge. Vector databases or MCP servers can supplement information for the NLP step.
| Knowledge Base Type | Content Example | Enhancement Goal |
| --- | --- | --- |
| Structured metadata | Database DDL, data lineage, field comments | Resolve technical ambiguities (e.g., same-named fields with different meanings) |
| Business semantic | Metric definitions, business term dictionary, dimension hierarchy | Resolve business ambiguities (e.g., user-spoken terms vs. internal terminology) |
| Contextual | Historical dialogs, user permissions, common query patterns | Address personalization needs (e.g., user preferences, data permissions) |
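A dependency-free sketch of the enhancement step: score stored knowledge snippets against the user question and prepend the best matches to the prompt. A production system would use embeddings and a vector database; the keyword overlap below is only a stand-in:

```python
# Hypothetical knowledge-base retrieval: rank snippets by word overlap
# with the question. Real systems would use embedding similarity instead.
def retrieve(question: str, kb: list[dict], top_k: int = 2) -> list[dict]:
    words = set(question.lower().split())
    scored = sorted(kb, key=lambda s: -len(words & set(s["text"].lower().split())))
    return scored[:top_k]

kb = [
    {"type": "structured", "text": "orders.city: delivery city of the order"},
    {"type": "semantic", "text": "gmv means SUM(amount) over completed orders"},
    {"type": "context", "text": "user can only query region = 'South'"},
]
hits = retrieve("how do we compute gmv per city", kb)
# top hits: the field comment and the metric definition
```

The three knowledge-base types in the table would each contribute candidate snippets to this retrieval step.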
The workflow then proceeds as illustrated in the following diagram:
3. Unified Data Service
In ChatBI, a unified data query engine acts as the "central nervous system" linking NLP‑to‑SQL with heterogeneous data sources, abstracting underlying technical differences and automatically adapting generated SQL to various back‑ends.
| Heterogeneous Source | Traditional Pain Point | Unified Engine Solution |
| --- | --- | --- |
| Relational DB | Manual dialect adaptation (MySQL vs. PostgreSQL) | Automatic dialect rewriting (e.g., LIMIT → FETCH FIRST) |
| OLAP engine | Manual aggregation optimization (ClickHouse vs. Doris) | Smart aggregation conversion (GROUP_CONCAT → array_agg) |
| NoSQL DB | Custom query logic required (MongoDB pipeline) | Translate SQL to the native query language |
| API/REST | Manual HTTP request + JSON parsing | Auto-generate REST calls and parse responses |
| Data lake | Manual switching between Spark/Presto engines | |
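The dialect-rewriting idea from the table can be sketched as a small rule table. Real unified engines (e.g., Apache Calcite) rewrite a full SQL AST rather than using regexes; the rules below are illustrative assumptions:

```python
# Sketch of automatic dialect rewriting via per-backend rules.
# The rule set is hypothetical; production engines work on a parsed AST.
import re

RULES = {
    "oracle": [(r"\bLIMIT (\d+)\b", r"FETCH FIRST \1 ROWS ONLY")],
    "postgres": [(r"\bGROUP_CONCAT\(", "array_agg(")],
}

def adapt_sql(sql: str, dialect: str) -> str:
    for pattern, repl in RULES.get(dialect, []):
        sql = re.sub(pattern, repl, sql)
    return sql

rewritten = adapt_sql("SELECT name FROM t LIMIT 10", "oracle")
# rewritten == "SELECT name FROM t FETCH FIRST 10 ROWS ONLY"
```

The same hook point is where NoSQL translation or REST-call generation would plug in for non-SQL back-ends.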
4. Metric Service
Metric services standardize company metrics, reduce duplicate calculations, and provide a single source of truth for self‑service analytics. Complex, mutable business logic is extracted from the NLP‑generated SQL and handled by a dedicated metric platform (MCP).
| Aspect | Traditional NLP-to-SQL | NLP + Metric Service (MCP) | Advantage |
| --- | --- | --- | --- |
| NLP core task | Generate full, complex SQL | Identify metrics, dimensions, filters and map to standard APIs | Reduces NLP difficulty, improves accuracy |
| Business logic | Embedded in generated SQL, scattered, volatile | Centralized in MCP with version control | Ensures metric consistency, simplifies changes |
| Scalability | New metrics require retraining the NLP model | New metrics only need definition in MCP; NLP just recognizes names | Faster response to business needs |
| Security | Relies on coarse DB permission control | Fine-grained metric/dimension permissions, supports masking | More secure data access |
| Maintenance | NLP must understand business logic, high cost | NLP and metric development decoupled, lower maintenance | Higher development efficiency |
| Consistency | Different SQL styles may yield inconsistent results | MCP provides a single source of truth | Improves data trustworthiness |
| Reusability | SQL is usually single-use | Metrics can be reused across BI, APIs, reports | Maximizes data asset value |
| Applicability | Simple queries, exploratory analysis, immature metric system | Mature metric system, core business metrics, high-frequency queries | Better suited for enterprise-grade ChatBI |
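The "identify and map to standard APIs" row can be sketched as follows. The metric registry, endpoint paths, and intent shape are all hypothetical; the point is that the model output is validated against a registry instead of becoming raw SQL:

```python
# Sketch: map an extracted intent onto a metric-platform API.
# REGISTRY, endpoints, and metric names are illustrative assumptions.
REGISTRY = {
    "gmv": {"api": "/metrics/gmv", "allowed_dims": {"city", "dt"}},
    "order_cnt": {"api": "/metrics/order_cnt", "allowed_dims": {"city"}},
}

def build_metric_request(intent: dict) -> dict:
    spec = REGISTRY[intent["metric"]]  # unknown metric -> KeyError, surfaced to the user
    dims = [d for d in intent["dimensions"] if d in spec["allowed_dims"]]
    return {"endpoint": spec["api"], "group_by": dims, "filters": intent.get("filters", {})}

req = build_metric_request(
    {"metric": "gmv", "dimensions": ["city", "driver_id"], "filters": {"dt": "2024-05-01"}}
)
# driver_id is silently dropped: it is not an allowed dimension for gmv
```

This is also where the table's security row applies: the registry can enforce per-user metric and dimension permissions before any query runs.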
5. Code Query
After obtaining metric results, many scenarios require secondary processing. Converting this logic into Python code and leveraging a code executor with self‑correction capabilities enables powerful post‑processing.
The overall workflow follows an Agentic pattern: intent decomposition → tool execution → result aggregation.
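A minimal sketch of the self-correcting executor: run model-generated Python, and on failure feed the traceback back for a retry. The `ask_llm` callable is a hypothetical model client; here a stub "fixes" the code so the loop is testable:

```python
# Self-correcting code-executor loop (sketch). `ask_llm` is a hypothetical
# LLM client that receives the failing code and its traceback.
import traceback

def run_with_retry(code: str, ask_llm, max_attempts: int = 3):
    for _ in range(max_attempts):
        scope = {}
        try:
            exec(code, scope)  # sandbox this in production!
            return scope.get("result")
        except Exception:
            code = ask_llm(code, traceback.format_exc())  # ask the model for a fix
    raise RuntimeError("could not produce runnable code")

# Stub model: the first attempt divides by zero; the "fix" repairs it.
fixer = lambda code, tb: "result = sum([1, 2, 3]) / 3"
out = run_with_retry("result = sum([1, 2, 3]) / 0", fixer)
# out == 2.0
```

In the Agentic pattern above, this executor is one tool among several; the aggregation step then merges its output with the metric results.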
Technical Approach
We decompose the process into a workflow, as shown below:
Actual Dify workflow:
Routing Prompt
User intent still needs to be identified by the LLM; this step extracts the metric, dimension, and filter parameters.
The enum values for driver vehicle types are: A, B, C, D, etc.
The enum values for business regions are: A, B, C, D, etc.
Based on the user's question, precisely match the enums above and extract the corresponding driver vehicle type and business region parameters.
## Requirements
If multiple region values are present, separate them with commas (,).
If multiple driver vehicle types are present, split them into separate elements and store them in a list.
If no driver vehicle type is specified, default to all enum values.
Intent routing thus consists of two parts: providing the LLM with preset operational knowledge, and specifying the output format and fallback logic.
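The output-format and fallback contract can be sketched as a small parser on the application side. The JSON field names are assumptions about what the routing prompt asks the model to emit:

```python
# Sketch: parse the routing step's JSON output and apply the fallback rule
# (no vehicle type mentioned -> all enum values). Field names are hypothetical.
import json

VEHICLE_TYPES = ["A", "B", "C", "D"]

def parse_routing(raw: str) -> dict:
    data = json.loads(raw)
    types = data.get("vehicle_types") or VEHICLE_TYPES  # fallback: all enum values
    regions = [r for r in data.get("regions", []) if r]
    return {"vehicle_types": types, "regions": regions}

out = parse_routing('{"vehicle_types": [], "regions": ["A", "B"]}')
# out == {"vehicle_types": ["A", "B", "C", "D"], "regions": ["A", "B"]}
```

Keeping the fallback in code rather than in the prompt makes the default behavior deterministic even when the model omits a field.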
Problem Splitting Prompt
After intent routing, the task is split into deterministic data‑processing units to ensure high success rates.
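One plausible splitting strategy, assuming the routed intent carries region and vehicle-type lists, is to emit one deterministic sub-task per combination, each bound to a fixed query template:

```python
# Sketch: split a routed intent into deterministic sub-tasks, one per
# (region, vehicle_type) pair. The template name is a hypothetical key.
from itertools import product

def split_tasks(intent: dict) -> list[dict]:
    return [
        {"region": r, "vehicle_type": v, "template": "metric_by_region_type"}
        for r, v in product(intent["regions"], intent["vehicle_types"])
    ]

tasks = split_tasks({"regions": ["A", "B"], "vehicle_types": ["C"]})
# two tasks: (A, C) and (B, C)
```

Each sub-task is small and template-bound, which is what keeps the per-unit success rate high.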
Effect
Data‑Application‑Business Boundary
What Should Data Do?
Data teams must provide appropriate granularity (detail‑level wide tables or atomic metric tables). To avoid overly complex SQL templates and hallucinations, results are usually a single table with metrics laid out horizontally for downstream derivations.
Metadata (field meanings, dimension/metric definitions) helps LLM comprehension, and performance tuning (indexes, materialized views, caches) ensures query latency stays low.
What Should Application Do?
Applications maintain an operational knowledge base and a set of workflows that map user semantics to execution plans. They then perform secondary data processing on the detail‑level results.
What Can Business Say?
Business domain knowledge: custom metrics, industry terms (e.g., special car models).
Operational rules: e.g., "if within range a‑b, apply strategy 1".
Real‑time Q&A: "Scan all cities' xxx model A metric for the last 7 days and apply strategy X".
Preset knowledge and live queries must be distinguished from each other and handled through different paths.
Extensibility
Metric systems should evolve from static, manually defined metrics to dynamic, AI‑driven ones, enabling automatic metric generation, intelligent recommendation, real‑time calculation, anomaly detection, root‑cause analysis, explainability, predictive metrics, personalized services, data‑quality governance, and knowledge‑base enrichment.
| Evolution Direction | Traditional Solution | AI-Enhanced Solution |
| --- | --- | --- |
| Dynamic metric generation | Fixed, manually defined metrics | AI auto-identifies data patterns and creates new metrics (e.g., anomaly or forecast metrics) |
| Intelligent metric recommendation | Expert-driven static reports | AI suggests high-value metrics based on user behavior and history |
| Real-time metric computation | Batch T+1 or simple streaming | AI optimizes real-time paths and resource allocation |
| Metric anomaly detection | Static thresholds (e.g., 3σ) | AI detects complex anomaly patterns across multiple metrics |
| Automated root-cause analysis | Manual log and data tracing | AI links metric anomalies to underlying causes (business, infra, data) |
| Metric explainability | Charts + text, manual verification | AI generates natural-language explanations for metric changes |
| Predictive metrics | Historical ratios only | AI forecasts future values (e.g., 7-day churn, inventory demand) |
| Personalized metric service | One-size-fits-all reports | Custom views for different roles (operations, executives) |
| Metric data-quality governance | Manual rule checks | AI auto-detects drift, missing data, and consistency issues and repairs logic |
| Metric knowledge consolidation | Keyword-based document search | Enterprise-level metric knowledge graph with semantic search |
The evolution turns the metric system from a "business mirror" into a "business brain", enabling proactive decision‑making.
Generalizing to Other Scenarios
To extend this workflow to other scenarios, one should:
Decompose operational processes into reusable workflows.
Structure preset knowledge in knowledge bases or code prompts for better LLM understanding.
Provide fine‑grained detail tables when metric services are immature, reducing query complexity.
Future work will move from workflow‑based pipelines to fully Agentic patterns that can autonomously decide, select tools, and act without human intervention.
Product-Research Team: Li Ming, Lu Huandian, Yang Shulin, Bao Hengbin
Author Bio: Li Ming – Big Data expert, former Tencent engineer, now at Huolala, focusing on data-driven AI applications.
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
