How I Integrated Antlr into Seata’s SQL Parser to Fix Transaction Parsing Issues
In 2020 I joined Alibaba's Summer of Code (ASoC) program and worked on Seata's SQL parser. I explored the existing Druid‑based parser, replaced it with an Antlr‑generated one, and fixed spacing bugs by overriding visitTerminal. This article also reflects on the experience of contributing to the open‑source project behind Seata's distributed‑transaction capabilities.
Background
Seata is an open‑source distributed‑transaction framework. Its SqlParser module originally relied on Alibaba Druid, which limited consistent SQL parsing across micro‑services.
Why Antlr
Antlr offers flexible grammar definitions and is used by projects such as Hive, Spark, and NetBeans. It enables generation of a dedicated lexer and parser for the MySQL dialect, providing better extensibility and maintainability than Druid.
Implementation Details
Created MySQL lexer and parser grammars (MySqlLexer.g4, MySqlParser.g4) compatible with Antlr v4.0.
Implemented a visitor that traverses the Antlr parse tree and builds Seata’s internal AST (e.g., InsertSpecificationSql).
Replaced the previous look‑ahead (LA) traversal with Antlr’s CommonTokenStream to avoid case‑conversion errors.
Adapted the existing SqlParser API so that method signatures remain unchanged while delegating to the Antlr‑generated parser.
Addressed loss of whitespace when rendering the parsed SQL back to text. Overrode visitTerminal and added a shouldAddSpace helper to insert spaces only when syntactically required.
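The spacing rule from the last step can be tried out without an Antlr dependency. The sketch below is an illustration only, not Seata's code: the class name SpacingDemo and the render helper are hypothetical, and a plain list of strings stands in for the terminal nodes a visitor would receive. It assumes the rule is "no space after '.', ',' or '(' and no space before '.', ',' or ')'".

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone demo: it mirrors the spacing rule but does not use Antlr.
final class SpacingDemo {

    // Decide whether a space belongs between the text rendered so far and the next token.
    static boolean shouldAddSpace(StringBuilder sb, String token) {
        if (sb.length() == 0) {
            return false; // never start the output with a space
        }
        char last = sb.charAt(sb.length() - 1);
        if (last == '.' || last == ',' || last == '(') {
            return false; // no space right after these characters
        }
        char first = token.charAt(0);
        // no space right before '.', ',' or ')'
        return first != '.' && first != ',' && first != ')';
    }

    // Join tokens into SQL text, inserting spaces only where the rule allows them.
    static String render(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            String text = token.trim();
            if (text.isEmpty()) {
                continue; // skip blank tokens
            }
            if (shouldAddSpace(sb, text)) {
                sb.append(' ');
            }
            sb.append(text);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "INSERT", "INTO", "t1", "(", "id", ",", "name", ")",
            "VALUES", "(", "?", ",", "?", ")");
        System.out.println(render(tokens)); // prints: INSERT INTO t1 (id,name) VALUES (?,?)
    }
}
```

Because the rule only looks at the last rendered character and the first character of the next token, it stays independent of the grammar. One consequence is that a space is still emitted before an opening parenthesis, so the tokens COUNT, (, *, ) render as COUNT (*).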
Space‑Preserving Rendering
@Override
public StringBuilder visitTerminal(TerminalNode node) {
    // sb is the visitor's StringBuilder field holding the SQL rendered so far.
    String text = node.getText();
    if (text != null && !text.trim().isEmpty()) {
        if (shouldAddSpace(text.trim())) {
            sb.append(" ");
        }
        sb.append(text);
    }
    return sb;
}

private boolean shouldAddSpace(String text) {
    // Never start the output with a space.
    if (sb.length() == 0) {
        return false;
    }
    // No space directly after '.', ',' or '('.
    char lastChar = sb.charAt(sb.length() - 1);
    switch (lastChar) {
        case '.': case ',': case '(':
            return false;
        default:
            break;
    }
    // No space directly before '.', ',' or ')'.
    switch (text.charAt(0)) {
        case '.': case ',': case ')':
            return false;
        default:
            break;
    }
    return true;
}
Supported SQL Statements
The Antlr‑based parser successfully handles MySQL INSERT, SELECT, UPDATE, DELETE statements and batch operations. The design allows straightforward extension for additional MySQL syntax.
Integration Outcome
The new parser was merged into the Seata GitHub repository, improving the framework’s SQL parsing flexibility, reducing dependency on Druid, and simplifying future grammar extensions.
Illustration of the Whitespace Issue
