Boosting Text-to-SQL Accuracy: J‑Schema, Iterative DPO, and Self‑Consistency

This article presents a study of three techniques for improving Text-to-SQL performance: J‑Schema, a structured schema representation; iterative Direct Preference Optimization (DPO) training; and self‑consistency voting. Together they achieve up to a 12% accuracy gain on the BIRD benchmark.

JD Cloud Developers

Technical background: Text2SQL converts natural language queries to SQL, evolving through rule‑based, neural, pretrained language model, and large language model stages. Current challenges are prompt optimization, model training, and inference robustness, investigated on the BIRD dataset.

Text2SQL Challenges

Text‑to‑SQL (NL2SQL) aims to generate executable SQL from natural language, enabling non‑expert users to query complex databases. The field has progressed through four stages: rule‑based, neural network, pretrained language models, and large language models, each addressing increasing complexity.

Three major difficulties remain: prompt optimization (designing prompts and schema presentation), model training (enhancing base capabilities), and inference enhancement (stabilizing LLM outputs).

Prompt & J‑Schema

We propose J‑Schema, a fully structured representation of the database schema that uses special markers such as #DB_ID, #Table, and #Foreign keys. For each table we list its name, per‑column information (via basic_info), and example values, with type‑specific rules limiting how many example values are shown for date, float, integer, and text columns.
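A serializer in this style might look like the following sketch. The article names the markers (#DB_ID, #Table, #Foreign keys) but not the exact layout; the per-type example limits and the parenthesized column format here are assumptions for illustration.

```python
def j_schema(db_id, tables, foreign_keys, max_examples=3):
    """Render a database schema as a J-Schema-style prompt string (sketch).

    tables: {table_name: [(col_name, col_type, [example_values]), ...]}
    foreign_keys: [("src_table.src_col", "dst_table.dst_col"), ...]
    """
    lines = [f"#DB_ID: {db_id}"]
    for name, cols in tables.items():
        lines.append(f"#Table: {name}")
        for col, ctype, examples in cols:
            # Assumed rule: text columns show fewer example values than
            # numeric/date columns, to keep the prompt compact.
            limit = 1 if ctype == "text" else max_examples
            shown = ", ".join(map(str, examples[:limit]))
            lines.append(f"  ({col}, {ctype}, examples: [{shown}])")
    lines.append("#Foreign keys:")
    for src, dst in foreign_keys:
        lines.append(f"  {src} = {dst}")
    return "\n".join(lines)
```

The key idea is that every schema element is introduced by an explicit marker, so the model never has to infer where one table ends and the next begins.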

Training Method: Iterative DPO

Iterative Direct Preference Optimization (DPO) repeatedly samples chain‑of‑thought reasoning steps and final answers, builds positive and negative example pools, forms preference pairs, and fine‑tunes the model. Multiple iterations increase execution accuracy, peaking at the third stage.
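One iteration of this pipeline can be sketched as follows. The article does not specify the pairing criterion, so this sketch assumes candidates are split into positive and negative pools by execution correctness; `sample_candidates` and `executes_correctly` are hypothetical stand-ins for the model sampler and the SQL execution checker.

```python
import itertools
import random

def build_preference_pairs(questions, sample_candidates, executes_correctly,
                           max_pairs_per_question=2, seed=0):
    """Build DPO preference pairs for one iteration (sketch)."""
    rng = random.Random(seed)
    pairs = []
    for q in questions:
        # Sample several chain-of-thought + SQL completions for this question.
        candidates = sample_candidates(q)
        # Assumed split: execution-correct answers form the positive pool.
        positives = [c for c in candidates if executes_correctly(q, c)]
        negatives = [c for c in candidates if not executes_correctly(q, c)]
        # Pair each positive with each negative, then cap per question.
        combos = list(itertools.product(positives, negatives))
        rng.shuffle(combos)
        for chosen, rejected in combos[:max_pairs_per_question]:
            pairs.append({"prompt": q, "chosen": chosen, "rejected": rejected})
    return pairs
```

After DPO fine-tuning on these pairs, the updated model samples fresh candidates, and the cycle repeats; the article reports accuracy peaking at the third iteration.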

Execution accuracy on the BIRD benchmark improves from 63.69% (baseline) to 67.60% after the third iterative DPO stage.

Hyperparameter Scan

We vary the DPO β hyperparameter, which scales the implicit reward margin in the DPO loss, from 0.1 to 0.6, training two epochs per setting. The highest execution accuracy (≈68%) is achieved at β = 0.5.
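To make β's role concrete, here is the per-pair DPO loss written out numerically: -log σ(β·[(log π(chosen) − log π_ref(chosen)) − (log π(rejected) − log π_ref(rejected))]). The log-probabilities in the test are illustrative numbers, not values from the article.

```python
import math

def dpo_loss(logp_chosen, logp_rejected, ref_chosen, ref_rejected, beta=0.5):
    """Per-pair DPO loss: -log sigmoid(beta * reward margin)."""
    # Margin between the policy's log-prob advantage (over the reference
    # model) on the chosen vs. the rejected completion.
    margin = (logp_chosen - ref_chosen) - (logp_rejected - ref_rejected)
    # Numerically plain -log(sigmoid(beta * margin)).
    return math.log(1.0 + math.exp(-beta * margin))
```

A larger β sharpens the objective: for the same positive margin the loss drops faster, but it also penalizes drift from the reference model more aggressively, which is why a sweep over β is worthwhile.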

Self‑Consistency

Self‑consistency generates multiple candidate SQL answers per query and selects the best via hard or soft voting. Soft voting, which considers answer similarity, consistently outperforms hard voting, yielding over 1% absolute accuracy gains.
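The two voting schemes can be sketched as below. Hard voting picks the most frequent exact answer; soft voting scores each candidate by its total similarity to all candidates, so near-duplicate SQL strings reinforce one another. The token-level Jaccard similarity here is an illustrative stand-in; the article does not specify its similarity measure.

```python
from collections import Counter

def hard_vote(candidates):
    """Return the most frequent exact answer string."""
    return Counter(candidates).most_common(1)[0][0]

def token_similarity(a, b):
    """Jaccard similarity over lowercased whitespace tokens (assumed metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(1, len(ta | tb))

def soft_vote(candidates):
    """Return the candidate with the highest total similarity to all others."""
    return max(candidates,
               key=lambda c: sum(token_similarity(c, o) for o in candidates))
```

Soft voting helps precisely when no two candidates match exactly but several are minor variants of the same query, a common situation with generated SQL.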

For the iterative stage‑3 model, execution accuracy rises from 67.60% (no self‑consistency) to 68.97% with soft voting.

Future Directions

We plan to construct higher‑quality data from the million‑scale SynSQL‑2.5M dataset for BIRD, explore alternative training methods such as GRPO, and evaluate on additional benchmarks like Spider, ScienceBenchmark, and EHRSQL.

Tags: LLM, Text-to-SQL, Self-Consistency, Iterative DPO, J-Schema, Database QA
Written by JD Cloud Developers

JD Cloud Developers (the developer platform of JD Technology) is a JD Technology Group platform for technical sharing and communication among AI, cloud computing, IoT, and related developers. It publishes JD product technical information, industry content, and tech event news, with the motto of embracing technology and partnering with developers to envision the future.
