
Ant Group’s 11 Papers Accepted at ICML 2024 Cover AI Efficiency, Security, Multimodal Learning, and More

At ICML 2024 in Vienna, Ant Group had eleven papers accepted, spanning topics such as quantization-aware secure inference for transformers, multimodal contrastive captioners, self-cognitive denoising with noisy labels, directed graph embedding, GAN improvement via score matching, and trustworthy alignment of retrieval-augmented large language models.


From July 21–27, 2024, the International Conference on Machine Learning (ICML 2024) was held in Vienna, Austria. The conference received a record 9,473 paper submissions and accepted 2,609 (27.5% acceptance rate). Ant Group had eleven papers accepted, covering a range of cutting‑edge AI and machine learning topics.

1. Ditto: Quantization‑aware Secure Inference of Transformers – This work builds on the Secretflow‑SPU framework to enable quantization‑aware, secure multi‑party computation (MPC) inference for large transformer models. By designing a layer‑wise static dual‑quantization scheme and a compiler that automatically selects precision, Ditto achieves 2–4× inference speedup on BERT and GPT‑2 without sacrificing model utility.
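Ditto's MPC protocol itself is beyond a short snippet, but the static quantization it builds on is easy to illustrate. The sketch below shows generic per-layer symmetric int8 quantization with a scale calibrated offline; it is an assumption-laden simplification, not Ditto's actual dual-quantization scheme.

```python
import numpy as np

def quantize_static(x, scale):
    """Symmetric static quantization to int8 with a precomputed scale."""
    return np.clip(np.round(x / scale), -127, 127).astype(np.int8)

def dequantize(q, scale):
    """Map int8 values back to float32."""
    return q.astype(np.float32) * scale

# Calibrate a per-layer scale offline ("static"), then reuse it at inference.
w = np.random.randn(4, 4).astype(np.float32)
scale = np.abs(w).max() / 127.0
w_hat = dequantize(quantize_static(w, scale), scale)
assert np.abs(w - w_hat).max() <= scale  # error bounded by one quantization step
```

Because the scale is fixed ahead of time, the expensive arithmetic inside the secure computation can run on low-bit integers, which is where the reported speedup comes from.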

2. SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking – The paper proposes a symmetric contrastive captioner that introduces bidirectional global and local interactions between image and text representations. An attention‑based masking strategy selects effective image patches, leading to notable gains on multimodal tasks such as image‑text retrieval, captioning, and visual question answering.
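The masking idea above can be sketched as score-based patch selection: keep only the image patches with the highest attention scores. This is a generic top-k heuristic under assumed shapes, not SyCoCa's exact strategy.

```python
import numpy as np

def select_patches(attn_scores, patch_embeds, keep_ratio=0.5):
    """Keep the patches with the highest attention scores
    (a generic score-based masking heuristic)."""
    k = max(1, int(len(attn_scores) * keep_ratio))
    keep = np.argsort(attn_scores)[-k:]   # indices of the top-k patches
    return patch_embeds[np.sort(keep)]    # preserve the original patch order

patches = np.random.randn(16, 8)          # 16 patches, 8-dim embeddings
scores = np.random.rand(16)               # e.g., text-to-image attention weights
kept = select_patches(scores, patches, keep_ratio=0.25)
assert kept.shape == (4, 8)
```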

3. Self‑cognitive Denoising for Multiple Noisy Label Sources (SDM) – The authors analyze the self‑cognitive ability of neural networks to distinguish noisy from clean samples across multiple label sources. They introduce a self‑cognitive denoising method and a selective distillation module, demonstrating superior performance on several noisy‑label benchmarks.

4. DUPLEX: Dual GAT for Complex Embedding of Directed Graphs – DUPLEX leverages Hermitian adjacency matrix decomposition and a dual‑graph attention encoder to capture directed neighbor information. It decouples training from downstream tasks with two parameter‑free decoders, achieving state‑of‑the‑art results on sparse directed graphs.
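The Hermitian adjacency matrix mentioned above has a standard construction: a one-way edge u→v contributes i, the reverse direction −i, and a reciprocal edge 1. The sketch below builds it for a small graph; how DUPLEX decomposes and encodes this matrix is specific to the paper.

```python
import numpy as np

def hermitian_adjacency(A):
    """Hermitian adjacency of a directed graph: i for u->v only,
    -i for v->u only, 1 for a bidirectional edge."""
    A = np.asarray(A, dtype=float)
    forward = (A > 0) & (A.T == 0)   # one-way edges u -> v
    both = (A > 0) & (A.T > 0)       # reciprocal edges
    return 1j * forward - 1j * forward.T + 1.0 * both

A = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 1, 0]])            # edges: 0->1, 1<->2
H = hermitian_adjacency(A)
assert np.allclose(H, H.conj().T)    # H is Hermitian by construction
```

Because H equals its own conjugate transpose, it has a real spectrum and complex eigenvectors, which is what lets complex embeddings carry edge direction.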

5. SMaRt: Improving GANs with Score Matching Regularity – By adding a score‑matching regularity term, SMaRt addresses the failure of standard GAN losses to cover the full data manifold. Experiments show improved FID (e.g., from 8.87 to 7.11 on ImageNet 64×64) and performance competitive with consistency models.

6. KPOD: Keypoint‑based Progressive Chain‑of‑Thought Distillation for LLMs – KPOD introduces a weighted token loss and a value‑function‑driven progressive distillation schedule, enabling student models to learn critical reasoning steps more effectively. The method yields up to 5% accuracy improvements on chain‑of‑thought benchmarks.
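The weighted token loss can be sketched as a per-token-weighted cross-entropy over the teacher's rationale, with keypoint tokens upweighted. The weights here are illustrative constants; KPOD learns them (via a mask-infilling module), and the name `weighted_token_loss` is ours, not the paper's.

```python
import numpy as np

def weighted_token_loss(log_probs, weights):
    """Cross-entropy over rationale tokens, with per-token weights so
    that keypoint tokens contribute more to the distillation loss."""
    weights = np.asarray(weights, dtype=float)
    return -(weights * log_probs).sum() / weights.sum()

# Student's log-probabilities for the teacher's rationale tokens.
lp = np.log(np.array([0.9, 0.5, 0.8, 0.2]))
uniform = weighted_token_loss(lp, [1, 1, 1, 1])
keypoint = weighted_token_loss(lp, [1, 1, 1, 4])  # upweight the hard keypoint
assert keypoint > uniform  # a mispredicted keypoint token is penalized more
```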

7. Mind the Boundary: Coreset Selection via Reconstructing the Decision Boundary – The authors propose a geometry‑aware coreset construction that selects samples to reconstruct the decision boundary of deep networks. Their approach achieves a 50% data pruning rate on ImageNet‑1K with less than 1% accuracy loss.
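A common proxy for "distance to the decision boundary" is the margin between the top two logits: samples with small margins sit near the boundary. The sketch below uses that heuristic to pick a coreset; the paper's geometry-aware criterion is more involved, so treat this as a simplified stand-in.

```python
import numpy as np

def margin_scores(logits):
    """Gap between the top two logits per sample; small gaps mean
    the sample lies near the decision boundary."""
    part = np.sort(logits, axis=1)
    return part[:, -1] - part[:, -2]

def select_coreset(logits, keep_ratio=0.5):
    """Keep the samples closest to the decision boundary."""
    k = max(1, int(len(logits) * keep_ratio))
    return np.argsort(margin_scores(logits))[:k]

logits = np.array([[5.0, 0.1, 0.2],   # confident: far from the boundary
                   [2.0, 1.9, 0.1],   # near the boundary
                   [3.0, 0.5, 2.8]])  # near the boundary
idx = select_coreset(logits, keep_ratio=2 / 3)
assert set(idx) == {1, 2}
```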

8. FastEGNN: Improving Equivariant GNNs on Large Geometric Graphs via Virtual Nodes – FastEGNN introduces a set of ordered virtual nodes to approximate large unordered graphs, preserving E(3) symmetry while improving efficiency and accuracy on N‑body, protein, and water‑molecule datasets.

9. CCM: Real‑Time Controllable Visual Content Creation Using Text‑to‑Image Consistency Models – The paper adapts ControlNet for consistency models, enabling real‑time, condition‑controlled image synthesis across various modalities (sketch, depth, pose, etc.) while maintaining high visual fidelity.

10. Trustworthy Alignment of Retrieval‑Augmented LLMs via Reinforcement Learning – A reinforcement‑learning‑based “trustworthy alignment” algorithm aligns retrieval‑augmented LLMs to rely on external evidence, achieving up to 55% EM improvement on Natural Questions and reducing hallucination rates.

11. Unifying Bayesian Flow Networks and Diffusion Models through Stochastic Differential Equations – The work connects Bayesian Flow Networks (BFNs) with diffusion models via linear SDEs, showing equivalence between BFN regression loss and denoising score matching. The proposed BFN‑Solvers achieve 5–20× speedups while improving sample quality.
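The SDE framing referenced above can be written generically. A linear SDE and the denoising score-matching objective take the familiar forms below; the paper's contribution is identifying the specific drift and diffusion coefficients under which BFN training matches this loss, which this generic sketch does not reproduce.

```latex
% Generic linear SDE (drift f, diffusion g) and the denoising
% score-matching loss used in diffusion models.
\begin{align}
  \mathrm{d}\mathbf{x}_t &= f(t)\,\mathbf{x}_t\,\mathrm{d}t
    + g(t)\,\mathrm{d}\mathbf{w}_t,\\
  \mathcal{L}_{\mathrm{DSM}} &= \mathbb{E}_{t,\,\mathbf{x}_0,\,\mathbf{x}_t}
    \Big[\lambda(t)\,\big\| s_\theta(\mathbf{x}_t, t)
      - \nabla_{\mathbf{x}_t}\log p_t(\mathbf{x}_t \mid \mathbf{x}_0)\big\|^2\Big].
\end{align}
```

Once BFN training is recognized as this loss under a particular linear SDE, fast diffusion samplers can be ported over, which is where the reported 5–20× speedups come from.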

Collectively, these papers demonstrate Ant Group’s broad research contributions across AI efficiency, security, multimodal learning, graph representation, generative modeling, and trustworthy large‑language‑model alignment.

Tags: machine learning, multimodal learning, AI security, Ant Group, ICML 2024
Written by

AntTech

Technology is the core driver of Ant's future.
