How Didi and Ant Financial Co‑Built SQLFlow to Bring AI to Data Analysts

The article describes how Didi's data science team partnered with Ant Financial to open‑source SQLFlow, a tool that translates SQL into Python for AI model training and inference, enabling analysts to use familiar SQL to run deep‑learning, XGBoost, and clustering models across diverse business scenarios.

ITPUB

Background

Oracle highlighted in 2018 that while many developers build AI services with Python or C++, most business analysts work primarily with SQL. In July 2019 Ant Financial open‑sourced SQLFlow, a framework that translates SQL programs into Python code, invokes database and AI engines, and delivers end‑to‑end AI capabilities directly from SQL.
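To make the translation idea concrete, here is a minimal sketch of how a single training statement could be split into parts and rendered as Python-like code. It is a toy under stated assumptions, not SQLFlow's actual parser or code generator: the statement shape is modelled on SQLFlow's extended syntax, while the regular expression and the names `parse_train_statement`, `render_python`, `run_query`, and `save` are hypothetical and exist only for this illustration.

```python
import re

# Hypothetical pattern for a simplified training statement of the form:
#   SELECT ... FROM table TO TRAIN Model WITH k = v, ... LABEL col INTO name;
# Illustration only; this is not SQLFlow's real grammar.
TRAIN_PATTERN = re.compile(
    r"^\s*(?P<select>SELECT\s.+?)\s+TO\s+TRAIN\s+(?P<model>\w+)"
    r"(?:\s+WITH\s+(?P<attrs>.+?))?"
    r"\s+LABEL\s+(?P<label>\w+)"
    r"\s+INTO\s+(?P<into>[\w.]+)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)


def parse_train_statement(sql: str) -> dict:
    """Split a simplified training statement into its parts."""
    match = TRAIN_PATTERN.match(sql)
    if match is None:
        raise ValueError("not a recognised training statement")
    attrs = {}
    if match.group("attrs"):
        # Split on commas that are not inside square brackets.
        for pair in re.split(r",\s*(?![^\[]*\])", match.group("attrs")):
            key, value = pair.split("=", 1)
            attrs[key.strip()] = value.strip()
    return {
        "select": match.group("select").strip(),
        "model": match.group("model"),
        "attrs": attrs,
        "label": match.group("label"),
        "into": match.group("into"),
    }


def render_python(plan: dict) -> str:
    """Render a Python-like training stub (run_query and save are placeholders)."""
    kwargs = ", ".join(f"{k.split('.')[-1]}={v}" for k, v in plan["attrs"].items())
    return (
        f"rows = run_query({plan['select']!r})\n"
        f"model = {plan['model']}({kwargs})\n"
        f"model.fit(rows, label={plan['label']!r})\n"
        f"save(model, {plan['into']!r})"
    )


statement = (
    "SELECT * FROM iris.train TO TRAIN DNNClassifier "
    "WITH model.n_classes = 3, model.hidden_units = [10, 20] "
    "LABEL class INTO sqlflow_models.my_dnn_model;"
)
plan = parse_train_statement(statement)
print(render_python(plan))
```

The point of the sketch is the division of labour: the ordinary `SELECT` part is handed to the database unchanged, and only the training clause is turned into model code.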

Collaboration between Didi and Ant Financial

The joint effort follows a three‑step plan:

Didi contributes deep domain knowledge of its transportation and finance products.

Didi contributes three high‑value models—a DNN classification model, an explainable model, and an unsupervised clustering model—to SQLFlow.

Didi joins the broader SQLFlow open‑source community to co‑build the ecosystem, sharing models, culture, and practices.

Contributed Models and Use Cases

DNN classification model: powers product‑growth recommendations and targeted user recommendations.

Explainable model: provides SHAP‑based visual explanations for predictions, improving interpretability for operations and marketing.

Unsupervised clustering model: analyses driver activity patterns to generate dispatch suggestions and balance supply and demand.
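As an illustration of what clustering driver activity can look like, the sketch below runs plain k-means on a handful of made-up records. It is not the clustering model Didi contributed, whose internals the article does not describe; the features (hours online and trips per day), the sample values, and the function name `kmeans` are all assumptions chosen for this example.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Cluster 2-D points with plain k-means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(
                range(k),
                key=lambda c: (p[0] - centroids[c][0]) ** 2
                + (p[1] - centroids[c][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(m[0] for m in members) / len(members),
                    sum(m[1] for m in members) / len(members),
                )
    return centroids, labels


# Hypothetical driver records: (hours online per day, trips per day).
drivers = [(2.0, 3.0), (2.5, 4.0), (3.0, 3.5), (9.0, 20.0), (10.0, 22.0), (9.5, 21.0)]
centroids, labels = kmeans(drivers, k=2)
print(labels)
```

On this toy data the part-time and full-time drivers end up in separate groups, which is the kind of segmentation a dispatch or supply-demand analysis could then build on.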

Technical Enhancements Added by Didi

Integrated XGBoost, enabling tree‑based models alongside deep‑learning models.

Added support for unsupervised‑learning workflows, allowing clustering analysis within SQLFlow.

Implemented SHAP‑based explanation visualizations for both deep‑learning and tree models.

Provided compatibility with Didi’s Hive data warehouse.
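SHAP explanations are built on Shapley values, which assign each feature its average marginal contribution to a prediction. The sketch below computes them exactly by brute force for a tiny model. It is meant only to show the idea: the SHAP library relies on much faster algorithms, a single baseline row stands in here for a background dataset, and the toy car-price function, its coefficients, and the name `shapley_values` are assumptions made for this example.

```python
from itertools import permutations


def shapley_values(predict, instance, baseline):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    n = len(instance)
    totals = [0.0] * n
    orderings = list(permutations(range(n)))
    for order in orderings:
        current = list(baseline)
        previous = predict(current)
        for feature in order:
            # Switch one feature from its baseline value to the instance value.
            current[feature] = instance[feature]
            value = predict(current)
            totals[feature] += value - previous
            previous = value
    return [t / len(orderings) for t in totals]


def toy_car_price(x):
    """Hypothetical price model in thousands; x = [age_years, mileage_k, automatic]."""
    age, mileage, automatic = x
    return 20.0 - 1.5 * age - 0.05 * mileage + 1.0 * automatic - 0.01 * age * mileage


instance = [3, 40, 1]  # the car being explained
baseline = [6, 80, 0]  # reference car the explanation is measured against
phi = shapley_values(toy_car_price, instance, baseline)
print(phi)
```

The contributions sum to the difference between the prediction for the instance and the prediction for the baseline, which is the property that lets an explanation chart split a single prediction across its features.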

Architecture Vision

The “SQL garden” concept treats SQLFlow as a foundational platform where domain‑specific plug‑ins (e.g., transportation) can be added, creating an open‑source marketplace of reusable SQL‑driven AI solutions.

Resources

Website: https://sqlflow.org

GitHub repository: https://github.com/sql-machine-learning/sqlflow

Docker command to run the car‑price prediction example (XGBoost model with SHAP explanations):

docker run -p 8888:8888 sqlflow/sqlflow:didi
Republication Notice

This article has been distilled and summarized from source material, then republished for learning and reference. If you believe it infringes your rights, please contact admin@besthub.dev and we will review it promptly.
