OceanBase Sets a Company Record at ICDE 2025 with Six Papers and a Best Industry Paper Runner‑Up Award
At ICDE 2025 in Hong Kong, OceanBase had six papers accepted, one of which received the Best Industry and Application Paper Runner‑Up award, and hosted a technical symposium on AI‑era databases. The papers cover distributed database architecture, secure query processing, vertical federated learning, hypergraph clustering, query equivalence, and workload generation for benchmarking cardinality estimation.
ICDE 2025 Overview
The 41st IEEE International Conference on Data Engineering (ICDE 2025) was held in Hong Kong from May 19 to 23, attracting over 850 leading scholars, industry experts, and practitioners to discuss cutting‑edge advances and practical applications in data engineering.
OceanBase Achievements
OceanBase had six papers accepted, two first‑authored by the company and four co‑authored with universities, setting a company record at a top‑tier conference. One paper, "OceanBase Unitization: Building the Next Generation of Online Map Applications," earned the Best Industry and Application Paper Runner‑Up award, OceanBase’s first major prize at ICDE.
Highlighted Papers
OceanBase Unitization: Building the Next Generation of Online Map Applications
Authors from OceanBase, Ant Group, AMap, Alibaba, and Cornell University propose a unit‑based distributed database architecture that isolates services and operations onto individual machines, ensuring data replication and seamless failover. By unitizing read/write operations with a hybrid centralized‑unit approach and dynamically optimizing for OLTP and OLAP workloads, the system was deployed on AMap’s online map platform, demonstrating strong fault‑tolerance and performance gains in both write‑intensive and read‑intensive benchmarks.
How to Answer Secure and Private SQL Queries?
The paper explores a combination of secure computation and differential privacy to protect both the query execution process and the resulting data. It introduces a modular decomposition of complex SQL statements for efficient secure computation and shows how differential privacy can be layered on top to provide end‑to‑end protection while allowing adjustable privacy‑utility trade‑offs.
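To make the differential-privacy layer concrete, here is a minimal sketch of the textbook Laplace mechanism applied to a COUNT query. It is not the paper's method: the function names `laplace_noise` and `private_count` are hypothetical, the secure-computation half is omitted entirely, and the sketch assumes each individual contributes at most one row, so the count has sensitivity 1.

```python
import random
from typing import Callable, Iterable


def laplace_noise(scale: float, rng: random.Random) -> float:
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def private_count(rows: Iterable[dict],
                  predicate: Callable[[dict], bool],
                  epsilon: float,
                  rng: random.Random) -> float:
    """Return a noisy COUNT(*) WHERE predicate under epsilon-DP.

    Assumption: adding or removing one individual changes the count by
    at most 1 (sensitivity 1), so the noise scale is 1 / epsilon.
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    true_count = sum(1 for row in rows if predicate(row))
    return true_count + laplace_noise(1.0 / epsilon, rng)


if __name__ == "__main__":
    rng = random.Random(2025)
    table = [{"age": age} for age in range(100)]
    # Smaller epsilon means more noise and stronger privacy.
    for eps in (0.1, 1.0, 10.0):
        print(eps, private_count(table, lambda r: r["age"] >= 40, eps, rng))
```

The `epsilon` parameter is the adjustable privacy-utility knob: lowering it widens the noise distribution, which protects individuals more strongly at the cost of a less accurate answer.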
Hounding Data Diversity: Towards Participant Selection in Vertical Federated Learning
To address the high cost and loss of data diversity in participant selection for vertical federated learning, the authors present VFPS‑SM, a framework built on submodular optimization. Using KNN as a proxy model, they transform the selection problem into submodular function maximization. Combined with top‑k query algorithms and homomorphic encryption, VFPS‑SM improves selection efficiency by up to 365× and training efficiency by up to 35×, while improving model accuracy by up to 6%.
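The reduction to submodular maximization matters because a simple greedy loop is known to achieve a (1 − 1/e) approximation for monotone submodular objectives under a cardinality constraint. The sketch below illustrates that greedy loop only. It swaps in a set-coverage objective as a stand-in for the paper's KNN-based objective, and the names `greedy_select` and `coverage_gain` as well as the toy participant data are hypothetical.

```python
from typing import Callable, Dict, List, Set


def greedy_select(candidates: List[str],
                  k: int,
                  marginal_gain: Callable[[List[str], str], float]) -> List[str]:
    """Pick up to k candidates, each time taking the largest marginal gain."""
    selected: List[str] = []
    remaining = set(candidates)
    while remaining and len(selected) < k:
        # Sorting first makes tie-breaking deterministic (alphabetical).
        best = max(sorted(remaining), key=lambda c: marginal_gain(selected, c))
        if marginal_gain(selected, best) <= 0:
            break  # nothing left adds value
        selected.append(best)
        remaining.remove(best)
    return selected


def coverage_gain(features: Dict[str, Set[int]]) -> Callable[[List[str], str], float]:
    """Stand-in objective: number of not-yet-covered feature groups added."""
    def gain(selected: List[str], candidate: str) -> float:
        covered = set().union(*(features[s] for s in selected))
        return len(features[candidate] - covered)
    return gain


if __name__ == "__main__":
    # Toy data: each participant holds some feature groups (hypothetical).
    participants = {"A": {1, 2, 3}, "B": {3, 4}, "C": {4, 5, 6, 7}, "D": {1, 2}}
    print(greedy_select(list(participants), 2, coverage_gain(participants)))
```

With the toy data, the first round picks the participant covering the most feature groups, and the second round favors whichever participant adds the most groups not already covered, which is how a submodular objective rewards diversity over redundancy.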
Efficient Structural Clustering over Hypergraphs
To address the limitations of SCAN when applied to hypergraphs, the paper introduces HSCAN, a structural clustering model designed for hypergraphs. It uses sequential indexing for fast extraction of key information and lightweight bucket indexing to reduce index cost. Two index‑based query algorithms (one sequential, one parallel) are proposed, and experiments show that HSCAN outperforms existing models, with the query algorithms achieving speedups of up to three orders of magnitude.
Query Weak Equivalence and Its Verification in Analytical Databases
The authors define “query weak equivalence” and introduce Query Lattice to identify queries that produce identical results in OLAP workloads despite not being semantically equivalent. By reusing stored queries, the approach reduces redundant plan generation. Benchmarks on TPC‑H and TPC‑H Skew demonstrate up to a 44.95% performance improvement over vanilla PostgreSQL.
Artemis: A Customizable Workload Generation Toolkit for Benchmarking Cardinality Estimation
Artemis is a workload generator that creates diverse SQL workloads featuring data dependencies, complex query structures, and varied cardinalities. It uses deterministic primary‑key‑based data generation and a search‑based method to construct queries of differing complexity, while constraint optimization reduces cardinality‑estimation cost.
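One way to picture deterministic, primary-key-based data generation is to make every column value a pure function of the row's primary key, so any row can be regenerated on demand without storing the table. The sketch below assumes a hash-based mapping. It is an illustration of the general idea, not Artemis's actual generator, and `column_value`, `generate_row`, and `true_cardinality` are hypothetical names.

```python
import hashlib
from typing import Callable, Dict, Iterable


def column_value(table: str, column: str, pk: int, domain_size: int) -> int:
    """Derive a value in [0, domain_size) purely from (table, column, pk)."""
    digest = hashlib.sha256(f"{table}.{column}:{pk}".encode()).digest()
    return int.from_bytes(digest[:8], "big") % domain_size


def generate_row(table: str, pk: int, schema: Dict[str, int]) -> dict:
    """Build one row; schema maps column name -> domain size."""
    row = {"id": pk}
    for column, domain_size in schema.items():
        row[column] = column_value(table, column, pk, domain_size)
    return row


def true_cardinality(table: str,
                     pks: Iterable[int],
                     schema: Dict[str, int],
                     predicate: Callable[[dict], bool]) -> int:
    """Count matching rows by regenerating them from their keys."""
    return sum(1 for pk in pks if predicate(generate_row(table, pk, schema)))


if __name__ == "__main__":
    schema = {"region": 10, "status": 3}
    print(generate_row("orders", 42, schema))
    print(true_cardinality("orders", range(1000), schema,
                           lambda r: r["status"] == 0))
```

Because generation is deterministic, two runs produce the same rows, which keeps a benchmark reproducible and lets the ground-truth cardinality of a predicate be recomputed from the key range alone.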
Technical Symposium on AI‑Era Databases
During the conference, OceanBase hosted a symposium titled “Databases in the AI Era,” featuring talks from leading academics and industry pioneers on topics such as graph‑neural integration for relational databases, vector database evolution, cross‑modal retrieval, and AI‑driven business intelligence. The symposium highlighted OceanBase’s vision for an integrated AI‑enabled data foundation.
Beyond the symposium, OceanBase participated in paper presentations, round‑table discussions, and tutorial sessions, showcasing the latest research directions and practical innovations in database technology, and reinforcing its role as a leading distributed database platform in the AI era.
