How I Integrated Antlr into Seata’s SQL Parser to Fix Transaction Parsing Issues

In 2020 I joined Alibaba's ASoC program and worked on Seata's SQL parser. I explored the existing Druid‑based parser, replaced it with an Antlr‑generated one, and fixed spacing bugs in the rendered SQL by overriding visitTerminal. This article summarizes that work and what the contribution added to Seata's distributed‑transaction support.

Alibaba Cloud Native

Background

Seata is an open‑source distributed‑transaction framework. Its SqlParser module originally relied on Alibaba Druid, which limited consistent SQL parsing across micro‑services.

Why Antlr

Antlr offers flexible grammar definitions and is used by projects such as Hive, Spark, and NetBeans. It enables generation of a dedicated lexer and parser for the MySQL dialect, providing better extensibility and maintainability than Druid.

Implementation Details

Created MySQL lexer and parser grammars (MySqlLexer.g4, MySqlParser.g4) compatible with Antlr v4.0.

Implemented a visitor that traverses the Antlr parse tree and builds Seata’s internal AST (e.g., InsertSpecificationSql).

Replaced the previous look‑ahead (LA) traversal with Antlr’s CommonTokenStream to avoid case‑conversion errors.

Adapted the existing SqlParser API so that method signatures remain unchanged while delegating to the Antlr‑generated parser.

Addressed loss of whitespace when rendering the parsed SQL back to text. Overrode visitTerminal and added a shouldAddSpace helper to insert spaces only when syntactically required.
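The API‑preserving step in the list above can be pictured as a thin facade over a swappable backend. The following is only a minimal sketch under stated assumptions: SqlRecognizer, ParserBackend, ToyAntlrBackend and SqlParserFacade are hypothetical names chosen for illustration, not Seata's actual classes, and the toy backend uses naive string splitting where the real one would walk the Antlr parse tree with a visitor.

```java
import java.util.Locale;

public class ParserAdapterSketch {

    /** Hypothetical stand-in for the stable, caller-facing result type. */
    interface SqlRecognizer {
        String getSqlType();
        String getTableName();
    }

    /** Hypothetical backend contract; a Druid-based or Antlr-based parser could implement it. */
    interface ParserBackend {
        SqlRecognizer parse(String sql);
    }

    /** Toy backend for illustration only: splits on whitespace instead of using a generated parser. */
    static class ToyAntlrBackend implements ParserBackend {
        @Override
        public SqlRecognizer parse(String sql) {
            String[] words = sql.trim().split("\\s+");
            final String type = words[0].toUpperCase(Locale.ROOT);
            final String table = findTable(words, type);
            return new SqlRecognizer() {
                @Override public String getSqlType() { return type; }
                @Override public String getTableName() { return table; }
            };
        }

        /** Returns the word following the keyword that introduces the table name, or null. */
        private static String findTable(String[] words, String type) {
            String marker;
            if (type.equals("INSERT")) {
                marker = "INTO";
            } else if (type.equals("UPDATE")) {
                marker = "UPDATE";
            } else {
                marker = "FROM"; // SELECT and DELETE
            }
            for (int i = 0; i < words.length - 1; i++) {
                if (words[i].equalsIgnoreCase(marker)) {
                    return words[i + 1];
                }
            }
            return null;
        }
    }

    /** Facade with the signature callers already use; only the injected backend changes. */
    static class SqlParserFacade {
        private final ParserBackend backend;

        SqlParserFacade(ParserBackend backend) {
            this.backend = backend;
        }

        SqlRecognizer parse(String sql) {
            return backend.parse(sql);
        }
    }

    public static void main(String[] args) {
        SqlParserFacade parser = new SqlParserFacade(new ToyAntlrBackend());
        SqlRecognizer r = parser.parse("update t_order set status = 1 where id = 2");
        System.out.println(r.getSqlType() + " " + r.getTableName());
    }
}
```

Because callers only see the facade and the recognizer interface, swapping the backend does not require changes at the call sites, which is the property the adaptation step aims for.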

Space‑Preserving Rendering

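// sb is the visitor's output buffer (a StringBuilder field declared elsewhere in the class).
@Override
public StringBuilder visitTerminal(TerminalNode node) {
    String text = node.getText();
    // Skip terminals whose text is empty or whitespace only.
    if (text != null && !"".equals(text.trim())) {
        // Separate this token from the previous one only when the rule below allows it.
        if (shouldAddSpace(text.trim())) {
            sb.append(" ");
        }
        sb.append(text);
    }
    return sb;
}

private boolean shouldAddSpace(String text) {
    // Never emit a leading space at the start of the output.
    if (sb.length() == 0) return false;
    // No space after '.', ',' or '('.
    char lastChar = sb.charAt(sb.length() - 1);
    switch (lastChar) {
        case '.': case ',': case '(':
            return false;
        default:
            break;
    }
    // No space before '.', ',' or ')'.
    switch (text.charAt(0)) {
        case '.': case ',': case ')':
            return false;
        default:
            break;
    }
    return true;
}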
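
To see what the spacing rule produces, it can be exercised without Antlr at all. The sketch below is an illustration under assumptions: SpaceRuleDemo and render are hypothetical names, and the token list is written by hand rather than produced by a lexer. It applies the same checks as shouldAddSpace above to each token in turn.

```java
import java.util.Arrays;
import java.util.List;

public class SpaceRuleDemo {

    /** Joins tokens, inserting a space only where the spacing rule allows it. */
    static String render(List<String> tokens) {
        StringBuilder sb = new StringBuilder();
        for (String token : tokens) {
            String text = token.trim();
            if (text.isEmpty()) {
                continue; // ignore whitespace-only tokens
            }
            if (shouldAddSpace(sb, text)) {
                sb.append(' ');
            }
            sb.append(text);
        }
        return sb.toString();
    }

    /** Same rule as the visitor: no space after . , ( and none before . , ) */
    static boolean shouldAddSpace(StringBuilder sb, String text) {
        if (sb.length() == 0) {
            return false;
        }
        char last = sb.charAt(sb.length() - 1);
        if (last == '.' || last == ',' || last == '(') {
            return false;
        }
        char first = text.charAt(0);
        return !(first == '.' || first == ',' || first == ')');
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
            "INSERT", "INTO", "t", "(", "id", ",", "name", ")",
            "VALUES", "(", "?", ",", "?", ")");
        // prints: INSERT INTO t (id,name) VALUES (?,?)
        System.out.println(render(tokens));
    }
}
```

Keywords and identifiers stay separated, while qualified names such as a.b and parenthesized lists are emitted without stray spaces. Note that the rule also leaves no space after a comma, so lists render as (id,name) rather than (id, name).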

Supported SQL Statements

The Antlr‑based parser successfully handles MySQL INSERT, SELECT, UPDATE, DELETE statements and batch operations. The design allows straightforward extension for additional MySQL syntax.

Integration Outcome

The new parser was merged into the Seata GitHub repository, improving the framework’s SQL parsing flexibility, reducing dependency on Druid, and simplifying future grammar extensions.

Illustration of the Whitespace Issue

[Figure: SQL parsing issue illustration]
Written by

Alibaba Cloud Native
