Highlights of Ant Group Papers Presented at VLDB 2024
From August 26 to 30, 2024, the VLDB 2024 conference was held in Guangzhou. Ant Group presented eight papers there, seven of them as oral presentations, spanning databases, graph computing, data analysis, knowledge graphs, and deep learning. The papers cover knowledge-graph warehouses, graph analytics, joint clustering and imputation, secure multi-party queries, distributed logging, deep recommendation model training, cloud autoscaling, and LLM-driven database interaction.

KGFabric: A Scalable Knowledge Graph Warehouse for Enterprise Data Interconnection
Category: Industry Paper & Oral
Source: Independent
Fields: Knowledge Graph, Massive Data Management, Distributed Storage and Computing
Abstract: Based on Ant Group's diverse application scenarios, the Ant Knowledge Graph Platform (AKGP) manages trillions of structured knowledge graphs serving search, recommendation, and risk control. Existing relational or graph DBMSs cannot meet the growing workload demands, so KGFabric is proposed as an industrial-scale knowledge graph management system built on a distributed file system (DFS). It provides a near-real-time storage engine with a semantic-enhanced programmable graph model, stores data persistently on DFS (e.g., HDFS), supports native mixed graph and storage formats, and includes a graph-fusion framework that minimizes data duplication while ensuring safety. KGFabric can manage petabyte-scale data and supports over 100 billion graph relationships, achieving >90% storage reduction, 21× graph-fusion speedup, and 100× improvement in multi-hop semantic graph analysis compared to popular DBMSs and graph databases.

Enabling Window-Based Monotonic Graph Analytics with Reusable Transitional Results for Pattern-Consistent Queries (MergeGraph)
Category: Research Paper & Oral
Source: University-Industry Collaboration
Field: Graph Computing
Abstract: Many graph datasets grow continuously (e.g., billions of Alipay transactions daily). Queries often target a time window, making on-the-fly analysis costly. Observing that most queries share patterns and exhibit monotonicity, the authors compute transitional results during slice generation for reuse. The proposed MergeGraph framework enables window-based monotonic graph analytics by reusing these transitional results, achieving up to 11.3× speedup over state-of-the-art methods across four typical graph applications.
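
The reuse idea in the MergeGraph abstract can be illustrated with a small sketch. It is not MergeGraph's algorithm or API. It assumes one particular monotonic query (counting connected components, where adding edges can only merge components) and a hypothetical per-slice summary, a spanning forest, standing in for the "transitional result". All function and class names here are invented for illustration.

```python
from typing import Dict, Hashable, Iterable, List, Tuple

Edge = Tuple[Hashable, Hashable]


class UnionFind:
    """Minimal disjoint-set structure; vertices are created on first use."""

    def __init__(self) -> None:
        self.parent: Dict[Hashable, Hashable] = {}

    def find(self, x: Hashable) -> Hashable:
        self.parent.setdefault(x, x)
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]  # path halving
            x = self.parent[x]
        return x

    def union(self, a: Hashable, b: Hashable) -> bool:
        """Merge the two sets; return True only if they were separate."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        self.parent[ra] = rb
        return True


def summarize_slice(edges: Iterable[Edge]) -> List[Edge]:
    """Hypothetical transitional result: a spanning forest of one time slice.

    Edges that close a cycle inside the slice are dropped, because they can
    never change connectivity in any window that contains the slice.
    """
    uf = UnionFind()
    return [(u, v) for u, v in edges if uf.union(u, v)]


def components_in_window(summaries: List[List[Edge]], start: int, end: int) -> int:
    """Count components over slices start..end (inclusive) from summaries only.

    Vertices with no edge in the window are not counted.
    """
    uf = UnionFind()
    for summary in summaries[start:end + 1]:
        for u, v in summary:
            uf.union(u, v)
    return len({uf.find(x) for x in list(uf.parent)})


# Three time slices of edges; summaries are built once, at slice creation.
slices = [
    [("a", "b"), ("b", "c"), ("a", "c")],
    [("d", "e")],
    [("c", "d")],
]
summaries = [summarize_slice(s) for s in slices]
```

Each window query then touches only the summaries of the slices it covers, for example `components_in_window(summaries, 0, 1)`, instead of rescanning the raw edges. The sketch depends on monotonicity: because components only ever merge as edges are added, a per-slice summary stays valid in every window that includes the slice.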

Win-Win: On Simultaneous Clustering and Imputing over Incomplete Data
Category: Research Paper & Oral
Source: University-Industry Collaboration
Fields: Database, Missing-Data Imputation, Clustering, Integer Programming
Abstract: Traditional pipelines first impute missing values and then cluster, yielding only locally optimal solutions. This work formulates clustering and imputation as a single joint problem, proves it NP-hard, and transforms it into an integer linear program that can be solved exactly. Approximation algorithms based on LP relaxation and local-neighbor methods are designed with provable bounds, and experiments demonstrate their significant superiority.
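
To show what treating clustering and imputation together can look like, here is a deliberately simple sketch. It is not the paper's integer linear program or its LP-relaxation algorithm. A k-means-style alternating heuristic is swapped in, with distances computed over observed attributes only and missing cells filled from the assigned centroid. The function name, the `None` encoding of missing values, and the fixed initial centroids are all assumptions.

```python
from typing import List, Optional, Sequence, Tuple

Row = Sequence[Optional[float]]  # None marks a missing cell


def cluster_and_impute(
    rows: Sequence[Row],
    initial_centroids: Sequence[Sequence[float]],
    iterations: int = 10,
) -> Tuple[List[int], List[List[float]]]:
    """Alternate between assigning rows to centroids and refitting centroids.

    Missing cells never contribute to a distance, and each centroid
    coordinate is the mean of the observed values in its cluster.
    """
    centroids = [list(c) for c in initial_centroids]
    labels = [0] * len(rows)

    def observed_distance(row: Row, centroid: Sequence[float]) -> float:
        return sum((v - centroid[d]) ** 2 for d, v in enumerate(row) if v is not None)

    for _ in range(iterations):
        # Assignment step: nearest centroid on the observed attributes.
        for i, row in enumerate(rows):
            labels[i] = min(
                range(len(centroids)),
                key=lambda k: observed_distance(row, centroids[k]),
            )
        # Update step: refit each coordinate from observed values only.
        for k, centroid in enumerate(centroids):
            for d in range(len(centroid)):
                observed = [
                    row[d]
                    for row, label in zip(rows, labels)
                    if label == k and row[d] is not None
                ]
                if observed:  # keep the old value if nothing was observed
                    centroid[d] = sum(observed) / len(observed)

    # Imputation: fill each missing cell from the row's own cluster centroid.
    imputed = [
        [centroids[label][d] if v is None else v for d, v in enumerate(row)]
        for row, label in zip(rows, labels)
    ]
    return labels, imputed


rows = [(1.0, 1.0), (1.2, None), (8.0, 9.0), (None, 9.4)]
labels, imputed = cluster_and_impute(rows, [(0.0, 0.0), (10.0, 10.0)])
```

A heuristic like this can still stop at a local optimum, which is the gap the paper's exact formulation and bounded approximations are meant to address.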

SecretFlow-SCQL: A Secure Collaborative Query Platform
Category: Industry Paper & Oral
Source: Independent
Fields: Secure Multi-Party Computation, Data Analytics
Abstract: Growing demand for joint data analysis across domains (e.g., medical, finance) clashes with privacy regulations, creating data silos. SCQL is built to support general SQL queries over multi-party data while optimizing underlying MPC protocols and database relational operations, delivering a secure, easy-to-use, and efficient collaborative analytics system.
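
For readers unfamiliar with MPC, the sketch below shows one textbook building block, additive secret sharing, applied to a SUM over values held by different parties. It is not SCQL's API, and the abstract does not say which protocols SCQL uses. The modulus, function names, and single-process simulation of the parties are all assumptions made for illustration.

```python
import secrets
from typing import List

MODULUS = 2**61 - 1  # assumed share space; inputs must be in [0, MODULUS)


def share(value: int, n_parties: int) -> List[int]:
    """Split a value into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recombine shares; any proper subset alone looks uniformly random."""
    return sum(shares) % MODULUS


def secure_sum(private_values: List[int]) -> int:
    """Simulate a multi-party SUM in which no party reveals its own value."""
    n = len(private_values)
    # Party i splits its value and sends share j to party j.
    distributed = [share(v, n) for v in private_values]
    # Party j locally adds the shares it received (one from each party).
    partials = [sum(distributed[i][j] for i in range(n)) % MODULUS for j in range(n)]
    # Only the partial sums are published and combined.
    return reconstruct(partials)


total = secure_sum([120, 45, 300])
```

Addition works on shares without any interaction beyond the initial exchange. Operations such as comparisons and joins need heavier protocols, which is where the protocol and relational-operator optimization mentioned in the abstract matters.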

PALF: Replicated Write-Ahead Logging for Distributed Databases
Category: Industry Paper & Oral
Source: Independent
Fields: Distributed Database, Consensus Protocols
Abstract: Distributed databases require robust write-ahead logging (WAL) for transaction support. PALF introduces a Paxos-based append-only log file system that co-designs database and distributed log primitives, enabling rich database functionality and serving as a building block for other distributed systems. Experiments show PALF outperforms existing consensus-based systems and meets performance requirements of distributed databases; PALF code is open-sourced with OceanBase 4.0.
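
The core rule behind a consensus-replicated log is that a record counts as committed once a majority of replicas hold it. The sketch below shows only that rule. It is not PALF's protocol or interface: leader election, proposal numbers, and durable storage are omitted, the leader is assumed to be always reachable, and every class and method name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Replica:
    name: str
    online: bool = True
    log: List[bytes] = field(default_factory=list)

    def sync(self, leader_log: List[bytes]) -> bool:
        """Copy any missing suffix from the leader; acknowledge if reachable."""
        if not self.online:
            return False
        self.log.extend(leader_log[len(self.log):])
        return True


class ReplicatedLog:
    """Single-leader append-only log with majority acknowledgement."""

    def __init__(self, replicas: List[Replica]) -> None:
        self.leader = replicas[0]  # assumption: fixed, always-online leader
        self.followers = replicas[1:]
        self.committed = 0  # length of the committed log prefix

    def append(self, record: bytes) -> bool:
        """Append a record; return True once a majority holds the whole log.

        A False result means "not committed yet": the record stays in the
        leader's log and is committed by a later successful append.
        """
        self.leader.log.append(record)
        acks = 1 + sum(f.sync(self.leader.log) for f in self.followers)
        majority = (1 + len(self.followers)) // 2 + 1
        if acks >= majority:
            self.committed = len(self.leader.log)
            return True
        return False


a, b, c = Replica("a"), Replica("b"), Replica("c")
wal = ReplicatedLog([a, b, c])
results = [wal.append(b"t1")]
b.online = False
results.append(wal.append(b"t2"))  # two of three replicas still acknowledge
c.online = False
results.append(wal.append(b"t3"))  # only the leader holds it: not committed
b.online = True
results.append(wal.append(b"t4"))  # b catches up, which commits t3 and t4
```

A database built on such a log would acknowledge a transaction only after its record falls inside the committed prefix, so a minority of failed replicas cannot lose it.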

DLRover-RM: Resource Optimization for Deep Recommendation Models Training in the Cloud
Category: Industry Paper & Oral
Source: University-Industry Collaboration
Field: Deep Learning
Abstract: Deep Learning Recommendation Models (DLRM) rely on large embedding tables, increasing GPU/CPU/memory usage. The authors investigate Ant Group's DLRM training platform, identifying low resource utilization and instability in cloud environments. DLRover-RM is introduced as an elastic training framework that improves resource utilization and handles cloud instability, achieving higher efficiency and being adopted by over ten companies.

OptScaler: A Collaborative Framework for Robust Autoscaling in the Cloud
Category: Industry Paper & Oral
Source: Independent
Fields: Cloud Resource Scheduling, Model Predictive Control, Time-Series Forecasting, Robust Optimization
Abstract: Autoscaling balances resource cost and Service Level Objective (SLO) compliance. Existing proactive (prediction-based) and reactive (feedback-based) methods suffer from prediction errors and latency, respectively. OptScaler integrates both via mathematical optimization, using a high-reliability load predictor and a Widrow-Hoff-based reactive module, combined with model predictive control (MPC) and chance constraints. Simulations show OptScaler reduces SLO violations by at least 36% compared to representative baselines.
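
The Widrow-Hoff rule named in the abstract is the classic least-mean-squares (LMS) update, w ← w + η(d − w·x)x. The sketch below applies it to a toy feedback task: learning a linear correction from predicted load to observed load. How OptScaler actually uses the rule is not described in the abstract, so the features, learning rate, and function names here are assumptions.

```python
from typing import List, Sequence, Tuple


def lms_step(
    weights: Sequence[float],
    features: Sequence[float],
    target: float,
    learning_rate: float,
) -> Tuple[List[float], float]:
    """One Widrow-Hoff update: move the weights along error * features."""
    prediction = sum(w * x for w, x in zip(weights, features))
    error = target - prediction
    updated = [w + learning_rate * error * x for w, x in zip(weights, features)]
    return updated, error


def fit_correction(
    samples: Sequence[Tuple[float, float]],
    learning_rate: float = 0.1,
    epochs: int = 2000,
) -> List[float]:
    """Learn observed ≈ scale * predicted + offset from feedback samples."""
    weights = [0.0, 0.0]  # [scale, offset]
    for _ in range(epochs):
        for predicted_load, observed_load in samples:
            # The constant 1.0 feature lets the rule learn an offset term.
            weights, _error = lms_step(
                weights, [predicted_load, 1.0], observed_load, learning_rate
            )
    return weights


# Toy feedback data in which the predictor underestimates systematically:
# observed = 2 * predicted + 1 (normalized load units, hypothetical).
samples = [(0.0, 1.0), (0.5, 2.0), (1.0, 3.0)]
scale, offset = fit_correction(samples)
```

Because each update uses only the latest error, the rule is cheap enough to run online. The same property makes a purely feedback-driven scaler lag behind sudden load changes, which is the latency problem the abstract attributes to such methods.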

Demonstration of DB-GPT: Next Generation Data Interaction System Empowered by Large Language Models
Category: Demo
Source: University-Industry Collaboration
Fields: Database, AI
Abstract: Recent breakthroughs in large language models (LLMs) enable transformative data interaction. DB-GPT is a production-ready Python library that integrates LLMs into traditional data interaction tasks, supporting natural-language-to-SQL, multi-agent workflows, and secure private-LLM deployment. The system works in local, distributed, and cloud settings, and its code has garnered over 11k GitHub stars.