Enterprise Text2SQL with Qwen3.5‑Plus: Let Business Users Query Databases Directly
This article walks through building an enterprise‑grade Text2SQL system using Qwen3.5‑Plus, covering model selection, schema injection, system architecture, code integration, security checks, accuracy engineering, common pitfalls, and future outlook for data democratization.
Problem: Data democratization’s last mile
Business users often need answers to simple questions like “What were last week’s sales by region?”, yet they must rely on data engineers to write multi‑table JOIN SQL, leading to slow, error‑prone cycles.
Why Qwen3.5‑Plus?
Released on 2026‑02‑16, Qwen3.5‑Plus marks a breakthrough for code/SQL generation in Chinese large models.
Total parameters: 397B (MoE high‑sparsity architecture)
Active parameters: 17B (low inference cost)
Context window: 256K tokens (expandable to 1M), enough to inject dozens of table schemas
Attention mechanism: Gated DeltaNet + Gated Attention (NeurIPS 2025 best paper)
Inference throughput: up to 19× the previous generation, crucial for high‑concurrency enterprise workloads
VRAM savings: 60% reduction, lowering private‑deployment barriers
License: Apache 2.0 (commercial use and fine‑tuning allowed)
Why these specs matter for Text2SQL
The 256K context enables full‑schema injection for medium‑size databases (20‑50 tables), eliminating the need for a separate retrieval step that often returns wrong tables. The deep visual‑language‑code fusion gives the model superior understanding of SQL semantics compared with ordinary dialogue models. Dual‑mode inference (Thinking Mode for complex JOINs, Flash Mode for simple queries) lets enterprises switch dynamically based on query complexity.
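Dual‑mode switching implies a routing step before generation. Below is a minimal sketch of one possible heuristic; the function name and marker list are illustrative assumptions, not part of any Qwen API.

```python
# Assumption: multi-table joins, aggregation, and analytic phrasing warrant
# deeper reasoning; everything here is an illustrative heuristic to tune.
COMPLEX_MARKERS = ("join", "group by", "having", "over (", "union",
                   "subquery", "nested", "year-over-year")

def needs_thinking_mode(question: str, referenced_tables: list[str]) -> bool:
    """Route to Thinking Mode when the question spans several tables
    or mentions analytically complex operations; otherwise use Flash Mode."""
    q = question.lower()
    if len(referenced_tables) >= 3:
        return True
    return any(marker in q for marker in COMPLEX_MARKERS)
```

In practice the table count can come from the schema‑retrieval step, so the router adds no extra model call.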
Enterprise Text2SQL Architecture
┌─────────────────────────────────────────────┐
│ User Interaction Layer │
│ Natural language → Multi‑turn dialogue → │
│ Result visualization │
├─────────────────────────────────────────────┤
│ Understanding Enhancement Layer │
│ Intent detection → Context completion → │
│ Ambiguity elimination │
├─────────────────────────────────────────────┤
│ SQL Generation Core (Qwen3.5‑Plus) │
│ Schema injection → CoT reasoning → SQL gen │
│ Validation & correction │
├─────────────────────────────────────────────┤
│ Data Execution Layer │
│ Permission check → SQL execution → │
│ Result formatting → Audit logging │
└─────────────────────────────────────────────┘
The core layer is the focus of the implementation.
Schema Injection: The often‑underestimated step
Simply feeding raw DDL to the model introduces noise (character sets, engines, indexes). Instead, use the M‑Schema format, which provides clean, business‑oriented descriptions.
Table: t_order (order master table)
Field	Type	Description	Example
order_id	BIGINT	Unique order ID (primary key)	10086
user_id	BIGINT	User ID → links to t_user.user_id	20001
region_code	VARCHAR	Region code → links to t_region.code	'BJ','SH','GZ'
amt	DECIMAL	Paid amount (CNY, tax included)	299.00
status	TINYINT	Order status: 1=pending payment, 2=completed, 3=cancelled	2
create_time	DATETIME	Creation time	2026-04-15 10:30:00
Three essential elements for a reliable M‑Schema:
Business meaning of each field (not just the name).
Example values, especially for enumerations.
Explicit table relationships (which column links to which table).
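A small renderer can enforce all three elements at once. The sketch below assumes column metadata has already been collected; `ColumnInfo` and `render_m_schema` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass

@dataclass
class ColumnInfo:
    name: str
    dtype: str
    comment: str        # business meaning, e.g. "paid amount (CNY, tax incl.)"
    example: str        # example value, critical for enum fields
    ref: str = ""       # explicit relation, e.g. "t_user.user_id"

def render_m_schema(table: str, table_comment: str, cols: list[ColumnInfo]) -> str:
    """Render one table as an M-Schema block: name, type, business meaning,
    example value, and an explicit arrow for cross-table relations."""
    lines = [f"Table: {table} ({table_comment})",
             "column\ttype\tmeaning\texample"]
    for c in cols:
        meaning = f"{c.comment} → links to {c.ref}" if c.ref else c.comment
        lines.append(f"{c.name}\t{c.dtype}\t{meaning}\t{c.example}")
    return "\n".join(lines)
```

Keeping the renderer separate from DDL extraction makes it easy to review and hand‑correct the business comments, which is where most accuracy is won.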
Automatic Table‑Relation Extraction
```python
def extract_table_relations(tables: list[str], db_conn) -> dict[str, str]:
    """Infer foreign-key relations from naming conventions.
    Rules: {entity}_id → t_{entity}.id, {entity}_code → t_{entity}.code"""
    relations = {}
    for table in tables:
        for col in get_columns(db_conn, table):
            if col.endswith('_id') or col.endswith('_code'):
                entity, suffix = col.rsplit('_', 1)
                candidate = f"t_{entity}"
                if candidate in tables:
                    # e.g. user_id → t_user.id, region_code → t_region.code
                    target = 'id' if suffix == 'id' else 'code'
                    relations[f"{table}.{col}"] = f"{candidate}.{target}"
    return relations
```
Core Code: Accessing Qwen3.5‑Plus
Option 1 – Alibaba Cloud DashScope (recommended for enterprises)
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="YOUR_DASHSCOPE_API_KEY"
)

SYSTEM_PROMPT = """You are a professional SQL generation assistant. Generate accurate SELECT statements based on the user's natural-language question and the provided database schema.
Rules:
1. Only SELECT queries; no INSERT/UPDATE/DELETE/DROP.
2. Use standard MySQL 8.0 syntax.
3. Use BETWEEN or >=/<= for time ranges.
4. Wrap the SQL in a ```sql code block.
5. Precede the SQL with a brief 1-2 sentence explanation of the logic."""

def text2sql(question: str, schema: str, thinking_mode: bool = False) -> str:
    user_message = f"""Database schema:
{schema}

User question: {question}
Generate the corresponding SQL."""
    extra_params = {}
    if thinking_mode:
        # Thinking Mode is requested via extra_body on the
        # OpenAI-compatible endpoint
        extra_params["extra_body"] = {"enable_thinking": True}
    response = client.chat.completions.create(
        model="qwen3.5-plus",
        messages=[{"role": "system", "content": SYSTEM_PROMPT},
                  {"role": "user", "content": user_message}],
        temperature=0.1,
        max_tokens=2048,
        **extra_params  # unpack: the SDK has no extra_params argument
    )
    return response.choices[0].message.content
```
Option 2 – Local deployment (private data scenarios)
```bash
pip install vllm

python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen3.5-Plus \
    --tensor-parallel-size 4 \
    --max-model-len 32768 \
    --enable-lora    # if fine-tuning is needed
```
After local deployment, simply change the base_url to the local address; the rest of the code stays identical.
SQL Security Execution Layer
The biggest risk after deployment is data security, not model accuracy.
```python
import sqlparse
from sqlparse.tokens import DDL, DML

class SQLSecurityChecker:
    FORBIDDEN_KEYWORDS = {'INSERT', 'UPDATE', 'DELETE', 'DROP', 'TRUNCATE',
                          'CREATE', 'ALTER', 'GRANT', 'REVOKE', 'EXEC'}

    def is_safe(self, sql: str) -> tuple[bool, str]:
        """Reject write/DDL statements; cap result size on the rest."""
        parsed = sqlparse.parse(sql)[0]
        for token in parsed.flatten():
            if token.ttype in (DDL, DML) and token.normalized.upper() in self.FORBIDDEN_KEYWORDS:
                return False, f"Prohibited {token.normalized} operation"
        # Append a LIMIT when the model omitted one
        if 'LIMIT' not in sql.upper():
            sql = sql.rstrip(';') + ' LIMIT 1000'
        return True, sql
```
The execution flow wraps generation, security check, and actual query execution, returning results or detailed error messages.
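The retry code in section 7.3 calls an `execute_text2sql` helper that is not shown. Below is a minimal sketch of how the three stages could be wired together, assuming the `text2sql` and `SQLSecurityChecker` snippets above; `run_query` is a placeholder for your own DB access layer.

```python
import re

def extract_sql(model_output: str) -> str:
    """Pull the SQL out of the ```sql fenced block the system prompt requests;
    fall back to the raw output if no fence is found."""
    m = re.search(r"```sql\s*(.*?)```", model_output, re.DOTALL | re.IGNORECASE)
    return (m.group(1) if m else model_output).strip()

def execute_text2sql(question: str, db_conn, schema: str) -> dict:
    """Sketch of the full pipeline: generate → security-check → execute."""
    raw = text2sql(question, schema)          # from the snippet above
    sql = extract_sql(raw)
    safe, checked_sql = SQLSecurityChecker().is_safe(sql)
    if not safe:
        return {"error": checked_sql, "sql": sql}
    try:
        rows = run_query(db_conn, checked_sql)  # placeholder executor
        return {"sql": checked_sql, "rows": rows}
    except Exception as exc:
        # Surface the DB error so the self-repair loop can feed it back
        return {"error": str(exc), "sql": checked_sql}
```

Returning the failing SQL alongside the error message is what makes the later self‑repair loop possible.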
Accuracy Engineering: From 70% to 90%+
7.1 Few‑shot examples
```python
FEW_SHOT_EXAMPLES = """The following are reference examples:

Question: Total order count and total amount for the Beijing region last month?
SQL: SELECT COUNT(*) AS order_cnt, SUM(amt) AS total_amt
     FROM t_order o JOIN t_region r ON o.region_code = r.code
     WHERE r.name = '北京'
       AND create_time >= DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 MONTH), '%Y-%m-01')
       AND create_time < DATE_FORMAT(NOW(), '%Y-%m-01')
       AND o.status = 2

Question: Which salesperson closed the most deals this year?
SQL: SELECT s.name, COUNT(*) AS deal_cnt
     FROM t_order o JOIN t_sales s ON o.sales_id = s.id
     WHERE YEAR(o.create_time) = YEAR(NOW()) AND o.status = 2
     GROUP BY s.id, s.name
     ORDER BY deal_cnt DESC LIMIT 10"""
```
Embedding these business‑specific Q&A pairs in the system prompt dramatically improves model alignment with the database's naming conventions.
7.2 Self‑consistency voting
```python
def reliable_text2sql(question: str, schema: str, n_samples: int = 3) -> str:
    """Sample several candidate SQLs, execute each, and vote:
    the SQL whose result set appears most often wins."""
    candidates = [text2sql(question, schema) for _ in range(n_samples)]
    results: dict[int, list[str]] = {}
    for sql in candidates:
        result_hash = hash(str(execute_query(sql)))
        results.setdefault(result_hash, []).append(sql)
    best = max(results.values(), key=len)
    return best[0]
```
7.3 Error self‑repair
```python
def text2sql_with_retry(question: str, schema: str, db_conn, max_retries: int = 2) -> dict:
    result = execute_text2sql(question, db_conn, schema)
    if "error" not in result:
        return result
    for attempt in range(max_retries):
        fix_prompt = f"""The previous SQL raised an error:
Error: {result['error']}
SQL: {result.get('sql', 'unknown')}
Please analyze the cause and regenerate a correct SQL."""
        fixed_output = client.chat.completions.create(
            model="qwen3.5-plus",
            messages=[
                {"role": "system", "content": SYSTEM_PROMPT},
                {"role": "user", "content": f"Database schema:\n{schema}\n\nOriginal question: {question}"},
                {"role": "assistant", "content": result.get('sql', '')},
                {"role": "user", "content": fix_prompt},
            ],
            temperature=0.1,
        ).choices[0].message.content
        # Execute the repaired SQL (extract → security-check → run),
        # rather than regenerating from scratch and discarding the fix
        result = execute_generated_sql(fixed_output, db_conn)
        if "error" not in result:
            break
    return result
```
Model Selection Matrix
Not every scenario needs the most powerful model. Recommended layering:
Simple single‑table queries (WHERE/ORDER BY) → Qwen‑Plus (high cost‑effectiveness, $0.12 per M input tokens).
Medium complexity (2‑3 table JOIN, GROUP BY) → Qwen3‑Max (strong reasoning).
Complex multi‑table nested sub‑queries → Qwen3.5‑Plus (Thinking Mode gives higher success rate).
Historical query reuse → vector retrieval + any model (SQL example library fills parameters).
Highly sensitive data (no external exposure) → locally fine‑tuned Qwen3.5‑27B (runs on a single RTX 4090).
Principle: use the cheapest model that meets the task; upgrade only when the capability ceiling is reached.
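That layering can be sketched as a simple router. The model identifiers follow the list above; the thresholds and the `sensitive` flag are assumptions to tune per workload.

```python
def pick_model(n_tables: int, has_subquery: bool, sensitive: bool) -> str:
    """Route each query to the cheapest model that can handle it.
    Thresholds are illustrative starting points, not benchmarks."""
    if sensitive:
        return "local-qwen3.5-27b"   # private deployment, no external calls
    if has_subquery or n_tables > 3:
        return "qwen3.5-plus"        # Thinking Mode for complex queries
    if n_tables >= 2:
        return "qwen3-max"           # medium-complexity JOIN / GROUP BY
    return "qwen-plus"               # cheap single-table queries
```

A router like this also gives a natural place to log which tier each query landed in, which is useful when deciding whether a cheaper tier is underperforming.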
Known Pitfalls and Mitigations
Pitfall 1: Too many tables exceed context. Mitigation: embed table names + descriptions with an embedding model (e.g., Milvus + bge‑m3), retrieve top‑10 relevant tables, then inject full schema.
Pitfall 2: Model generates nonexistent field names. Mitigation: ensure every field appears in M‑Schema with business meaning and example values; never omit rarely used columns.
Pitfall 3: Ambiguous time‑range expressions. Mitigation: define “last month” explicitly in the system prompt as the first to last day of the previous calendar month.
Pitfall 4: Multi‑turn context loss. Mitigation: pass the full dialogue history (question + generated SQL + result summary) to the next turn.
Pitfall 5: Model refuses to generate SQL due to sensitive keywords. Mitigation: prepend a statement in the system prompt confirming that all queries have internal authorization.
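For Pitfall 1, the retrieval step might look like the sketch below. A bag‑of‑words similarity stands in for the real embedding model so the example is self‑contained; in production you would replace `embed` with bge‑m3 behind Milvus as suggested above.

```python
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in embedding (bag of words). Replace with a real embedding
    model such as bge-m3 for production retrieval."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_tables(question: str, table_docs: dict[str, str], top_k: int = 10) -> list[str]:
    """Rank tables by similarity between the question and each table's
    name + description; inject only the top_k full schemas."""
    qv = embed(question)
    ranked = sorted(table_docs,
                    key=lambda t: cosine(qv, embed(table_docs[t])),
                    reverse=True)
    return ranked[:top_k]
```

Retrieving on short name‑plus‑description documents keeps the index small, while the full M‑Schema is injected only for the tables that survive the cut.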
Future Outlook
Text2SQL is not a universal solution. It excels at standardized historical data queries, clear OLAP scenarios, and internal BI self‑service, but struggles with complex financial calculations, cross‑system joins, and real‑time streaming data. The code capabilities of Qwen3.5‑Plus were unattainable six months ago; by the end of 2026, accuracy is expected to surpass the 90% industrial threshold, turning Text2SQL from a helpful assistant into a trusted data entry point.
Repository Structure
text2sql-enterprise/
├── config/
│ ├── db_config.yaml # Database connection settings
│ └── model_config.yaml # Model API key, etc.
├── schema/
│ ├── extractor.py # Automatic M‑Schema extraction
│ └── schemas/ # Per‑domain schema files (sales.md, finance.md, hr.md)
├── core/
│ ├── text2sql.py # Core SQL generation logic
│ ├── security.py # SQL security checker
│ ├── executor.py # SQL executor
│ └── retry.py # Error self‑repair module
├── api/
│ └── server.py # FastAPI interface
└── examples/
    └── few_shots.yaml # Business example SQL library
This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact us and we will review it promptly.
Lao Guo's Learning Space
AI learning, discussion, and hands‑on practice with self‑reflection