
OceanBase Breaks Records at ICDE 2025 with Six Papers and a Best Industry Award

At ICDE 2025 in Hong Kong, OceanBase had six papers accepted, one of which received the Best Industry and Application Paper Runner‑Up award. The company also hosted a technical symposium on AI‑era databases. The six papers cover distributed database architecture, secure query processing, federated learning, hypergraph clustering, query equivalence, and workload generation for benchmarking cardinality estimation.

AntTech

ICDE 2025 Overview

The 41st IEEE International Conference on Data Engineering (ICDE 2025) was held in Hong Kong from May 19 to 23, attracting over 850 leading scholars, industry experts, and practitioners to discuss cutting‑edge advances and practical applications in data engineering.

OceanBase Achievements

OceanBase had six papers accepted, two as first author and four co‑authored with universities, which is a new record for the company at a top‑tier conference. One paper, "OceanBase Unitization: Building the Next Generation of Online Map Applications," earned the Best Industry and Application Paper Runner‑Up award, OceanBase’s first major prize at ICDE.

Highlighted Papers

OceanBase Unitization: Building the Next Generation of Online Map Applications

Authors from OceanBase, Ant Group, AMap, Alibaba, and Cornell University propose a unit‑based distributed database architecture that isolates services and operations onto individual machines, ensuring data replication and seamless failover. By unitizing read/write operations with a hybrid centralized‑unit approach and dynamically optimizing for OLTP and OLAP workloads, the system was deployed on AMap’s online map platform, demonstrating strong fault‑tolerance and performance gains in both write‑intensive and read‑intensive benchmarks.

How to Answer Secure and Private SQL Queries?

The paper explores a combination of secure computation and differential privacy to protect both the query execution process and the resulting data. It introduces a modular decomposition of complex SQL statements for efficient secure computation and shows how differential privacy can be layered on top to provide end‑to‑end protection while allowing adjustable privacy‑utility trade‑offs.
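The differential‑privacy half of this idea can be illustrated with the textbook Laplace mechanism applied to a COUNT query. This is a minimal sketch and not the paper's system: the secure‑computation layer is omitted entirely, and the names `laplace_noise` and `private_count` are hypothetical. The parameter `epsilon` is the usual privacy budget; smaller values add more noise, which is the privacy‑utility trade‑off mentioned above.

```python
import random

def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_count(rows, predicate, epsilon, rng):
    """Return a COUNT(*) answer perturbed for epsilon-differential privacy."""
    true_count = sum(1 for row in rows if predicate(row))
    # Adding or removing one row changes a COUNT by at most 1 (sensitivity 1),
    # so the Laplace scale is 1 / epsilon.
    return true_count + laplace_noise(1.0 / epsilon, rng)

if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed only to make the demo repeatable
    salaries = [{"dept": "db", "salary": s} for s in (90, 120, 150, 80, 200)]
    for eps in (0.1, 1.0, 10.0):
        answer = private_count(salaries, lambda r: r["salary"] > 100, eps, rng)
        print(f"epsilon={eps}: noisy count = {answer:.2f}")
```

In a real deployment the noise would have to be generated inside the secure computation so that no party sees the exact count; the sketch only shows how the budget controls accuracy.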

Hounding Data Diversity: Towards Participant Selection in Vertical Federated Learning

To address the high cost and loss of diversity in participant selection for vertical federated learning, the authors present VFPS‑SM, a framework whose selection objective is submodular. Using KNN as a proxy model, the selection problem is transformed into submodular function maximization. Combined with top‑k query algorithms and homomorphic encryption, VFPS‑SM improves selection efficiency by up to 365× and training efficiency by up to 35×, while improving model accuracy by up to 6%.
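Submodular maximization under a size limit is commonly approached with a greedy loop that repeatedly adds the candidate with the largest marginal gain. The sketch below shows that generic technique, not VFPS‑SM itself. As a stand‑in for the paper's KNN‑based objective it uses a simple coverage function (how many distinct feature groups the selected participants contribute), which is an assumption made for illustration. The function name `greedy_select` and the sample data are hypothetical, and no encryption is involved.

```python
from typing import Dict, List, Set

def greedy_select(candidates: Dict[str, Set[int]], k: int) -> List[str]:
    """Pick up to k participants, each time taking the largest marginal gain.

    The utility of a set of participants is the number of distinct items
    they cover, a standard monotone submodular function.
    """
    selected: List[str] = []
    covered: Set[int] = set()
    remaining = dict(candidates)
    for _ in range(min(k, len(candidates))):
        # Iterate names in sorted order so ties are broken deterministically.
        best = max(sorted(remaining), key=lambda name: len(remaining[name] - covered))
        if not remaining[best] - covered:
            break  # nobody adds anything new; stop early
        selected.append(best)
        covered |= remaining.pop(best)
    return selected

if __name__ == "__main__":
    participants = {
        "A": {1, 2, 3},
        "B": {3, 4},
        "C": {1, 2},
        "D": {5},
    }
    print(greedy_select(participants, 2))
```

Participant C is never chosen once A is selected, because everything C offers is already covered. That is the diversity effect a submodular objective is meant to capture.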

Efficient Structural Clustering over Hypergraphs

To address the limitations of SCAN when applied to hypergraphs, the paper introduces HSCAN, a structural clustering model designed for hypergraphs. It uses sequential indexing for fast extraction of key information and lightweight bucket indexing to reduce index cost. Two index‑based query algorithms, one sequential and one parallel, are proposed. Experiments show that HSCAN outperforms existing models and that the query algorithms achieve speedups of up to three orders of magnitude.
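For readers unfamiliar with SCAN, its core quantity is the structural similarity of two vertices, computed from the overlap of their closed neighborhoods. The sketch below carries that formula over to a hypergraph under one simple assumption: two vertices are neighbors if they share at least one hyperedge. This is only an illustration of the SCAN‑style measure. The article does not give HSCAN's actual similarity definition or index layout, and the function names are hypothetical.

```python
import math
from collections import defaultdict

def closed_neighborhoods(hyperedges):
    """Map each vertex to itself plus every vertex it shares a hyperedge with."""
    nbrs = defaultdict(set)
    for edge in hyperedges:
        for v in edge:
            nbrs[v] |= set(edge)  # the edge contains v, so v is included
    return dict(nbrs)

def structural_similarity(nbrs, u, v):
    """SCAN-style similarity: |N[u] & N[v]| / sqrt(|N[u]| * |N[v]|)."""
    common = len(nbrs[u] & nbrs[v])
    return common / math.sqrt(len(nbrs[u]) * len(nbrs[v]))

if __name__ == "__main__":
    hyperedges = [{1, 2, 3}, {3, 4}, {4, 5, 6}]
    nbrs = closed_neighborhoods(hyperedges)
    for u, v in [(1, 2), (3, 4), (1, 5)]:
        print(u, v, round(structural_similarity(nbrs, u, v), 3))
```

SCAN‑style clustering then groups vertices whose similarity exceeds a threshold; precomputing and indexing these values is what makes repeated queries with different thresholds fast.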

Query Weak Equivalence and Its Verification in Analytical Databases

The authors define “query weak equivalence” and introduce Query Lattice to identify queries that produce identical results in OLAP workloads even though they are not semantically equivalent. Reusing stored queries in this way reduces redundant plan generation. Benchmarks on TPC‑H and TPC‑H Skew show up to a 44.95% performance improvement over vanilla PostgreSQL.
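A small example makes the distinction concrete. The sketch below assumes that "identical results" means identical on the data currently stored, which is one reading of the summary above and not the paper's formal definition. The table, queries, and helper `run` are hypothetical, and the example uses Python's built‑in `sqlite3` module, not PostgreSQL or the Query Lattice structure.

```python
import sqlite3

def run(conn, sql):
    """Execute a query and return all rows as a list of tuples."""
    return conn.execute(sql).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, price INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 5), (2, 8), (3, 20), (4, 35)],
)

# The two predicates differ, so the queries are not semantically equivalent.
Q1 = "SELECT id FROM orders WHERE price > 10 ORDER BY id"
Q2 = "SELECT id FROM orders WHERE price > 15 ORDER BY id"

# No stored price lies in (10, 15], so both queries return the same rows.
same_now = run(conn, Q1) == run(conn, Q2)

# A new row with price 12 separates them again.
conn.execute("INSERT INTO orders VALUES (5, 12)")
same_after = run(conn, Q1) == run(conn, Q2)

if __name__ == "__main__":
    print("identical before insert:", same_now)
    print("identical after insert:", same_after)
```

The second comparison shows why such reuse has to be verified against the data and not assumed from query text alone.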

Artemis: A Customizable Workload Generation Toolkit for Benchmarking Cardinality Estimation

Artemis is a workload generator that creates diverse SQL workloads featuring data dependencies, complex query structures, and varied cardinalities. It uses deterministic primary‑key‑based data generation and a search‑based method to construct queries of differing complexity, while constraint optimization reduces cardinality‑estimation cost.
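Deterministic, primary‑key‑based generation means that every column value can be recomputed from the row's key alone, without storing or scanning the data. The sketch below shows one common way to get that property by hashing the key. The hash‑based scheme, the function names, and the column domains are all assumptions made for illustration; the article does not describe the generation functions Artemis actually uses.

```python
import hashlib

def column_value(table, column, pk, domain_size, seed=0):
    """Derive a value in [0, domain_size) purely from (seed, table, column, pk)."""
    key = f"{seed}:{table}:{column}:{pk}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % domain_size

def generate_row(table, pk, columns, seed=0):
    """Build one row; `columns` maps a column name to its domain size."""
    row = {"id": pk}
    for name, domain_size in columns.items():
        row[name] = column_value(table, name, pk, domain_size, seed)
    return row

if __name__ == "__main__":
    schema = {"region": 5, "quantity": 100}
    for pk in range(1, 4):
        print(generate_row("orders", pk, schema))
```

Because a row depends only on its key, any shard or worker can regenerate the same row independently, and a query generator can reason about which keys satisfy a predicate without materializing the table.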

Technical Symposium on AI‑Era Databases

During the conference, OceanBase hosted a symposium titled “Databases in the AI Era,” featuring talks from leading academics and industry pioneers on topics such as graph‑neural integration for relational databases, vector database evolution, cross‑modal retrieval, and AI‑driven business intelligence. The symposium highlighted OceanBase’s vision for an integrated AI‑enabled data foundation.

Beyond the symposium, OceanBase participated in paper presentations, round‑table discussions, and tutorial sessions, showcasing the latest research directions and practical innovations in database technology, and reinforcing its role as a leading distributed database platform in the AI era.
