AAAI‑2024 Highlights: Alibaba Cloud’s Deep Tabular Learning & Multi‑Modal Fusion
Alibaba Cloud’s AI platform PAI presented four papers at AAAI‑2024: AMFormer for deep tabular learning via arithmetic feature interaction, MuLTI for efficient video‑language understanding, M2SD for few‑shot class‑incremental learning, and M2Doc for multi‑modal document layout analysis.
Recent papers from Alibaba Cloud’s AI platform PAI were accepted at AAAI‑2024, one of the most prestigious international conferences in artificial intelligence, highlighting the platform’s advances in fundamental and applied AI research.
Unlocking Deep Tabular Learning (AMFormer)
The authors identify arithmetic feature interaction as a crucial inductive bias for deep tabular learning. By embedding this bias into a Transformer‑based architecture called AMFormer, they achieve superior modeling accuracy, data‑efficiency, and generalization on both synthetic and real‑world tabular datasets.
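To make "arithmetic feature interaction" concrete, the sketch below contrasts an additive combination of features with a multiplicative one computed as a weighted sum in log space. It is only an illustration of the inductive bias, not AMFormer's architecture: the function names, the fixed weights, and the assumption that all features are strictly positive are hypothetical choices made for this example.

```python
import math

def additive_interaction(features, weights):
    """Weighted sum of features: can express terms such as x1 - x2."""
    return sum(w * x for w, x in zip(weights, features))

def multiplicative_interaction(features, weights):
    """Weighted product of features, computed as exp(sum(w * log(x))).

    Can express terms such as x1 * x2 / x3.
    Assumption for this sketch: every feature is strictly positive.
    """
    log_sum = sum(w * math.log(x) for w, x in zip(weights, features))
    return math.exp(log_sum)

# One toy tabular row with three numeric columns (illustrative values).
row = [2.0, 4.0, 8.0]

difference = additive_interaction(row, [1, -1, 0])    # x1 - x2
ratio = multiplicative_interaction(row, [1, 1, -1])   # x1 * x2 / x3

print(difference, ratio)
```

In a learned model the weights would be produced by the network (for example by attention) rather than fixed by hand. The point of the sketch is only that sums and products of columns are separate operations a tabular model may need to represent.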
MuLTI: Efficient Video‑Language Understanding
MuLTI addresses the high computational cost of multimodal video‑language models by introducing a Text‑Guided MultiWay‑Sampler and a multiple‑choice modeling pre‑training task. These innovations reduce GPU memory usage while preserving performance, achieving state‑of‑the‑art results on several video‑question‑answering and retrieval benchmarks.
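The general idea of letting text guide how video features are condensed can be sketched with a simple attention pooling step. This is not MuLTI's Text‑Guided MultiWay‑Sampler, whose details are not given here. It is a minimal illustration that assumes each frame and the text query are already embedded as vectors of the same size, and `text_guided_pool` is a hypothetical helper name.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def text_guided_pool(frame_feats, text_query):
    """Condense many frame vectors into one, weighted by relevance to the text.

    Each frame is scored by its dot product with the text query, and the
    softmax of those scores weights the average of the frames.
    """
    scores = [sum(q * f for q, f in zip(text_query, frame)) for frame in frame_feats]
    weights = softmax(scores)
    dim = len(frame_feats[0])
    pooled = [
        sum(w * frame[d] for w, frame in zip(weights, frame_feats))
        for d in range(dim)
    ]
    return pooled, weights

# Three toy frame embeddings and one toy text embedding (illustrative values).
frames = [[1.0, 0.0], [0.0, 1.0], [0.7, 0.7]]
query = [2.0, 0.0]

pooled, weights = text_guided_pool(frames, query)
print(pooled, weights)
```

Condensing a long sequence of frame features into a few text‑relevant vectors before fusion shortens the sequence that later layers must process, which is one way memory use can be reduced.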
M2SD: Multiple Mixing Self‑Distillation for Few‑Shot Class‑Incremental Learning
M2SD proposes a dual‑branch architecture with virtual classes and a multiple‑mixing self‑distillation strategy to expand the feature space for new categories while preserving knowledge of old ones. Extensive experiments on few‑shot class‑incremental benchmarks demonstrate significant improvements in accuracy and robustness.
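One common way to synthesize a "virtual" sample between two classes is mixup, which linearly blends two inputs and their labels. The sketch below shows that basic operation only. Whether M2SD mixes samples in exactly this way is an assumption of this example, and the function name, the one‑hot labels, and the mixing coefficient `lam` are all illustrative.

```python
def mixup(x_a, x_b, y_a, y_b, lam):
    """Blend two samples and their label vectors with coefficient lam in [0, 1].

    lam = 1 returns sample a unchanged; lam = 0 returns sample b.
    """
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x_mixed = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y_mixed = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x_mixed, y_mixed

# Two toy samples from different classes with one-hot labels (illustrative values).
x_cat, y_cat = [1.0, 0.0], [1.0, 0.0]
x_dog, y_dog = [0.0, 1.0], [0.0, 1.0]

x_virtual, y_virtual = mixup(x_cat, x_dog, y_cat, y_dog, lam=0.25)
print(x_virtual, y_virtual)
```

Training on such blended samples fills regions of feature space between existing classes, which is the intuition behind reserving room for categories that arrive later.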
M2Doc: Plug‑in Multi‑Modal Fusion for Document Layout Analysis
M2Doc introduces early‑fusion and late‑fusion modules that combine visual and textual features within existing object detectors. This plug‑in design yields consistent performance gains on document layout benchmarks such as DocLayNet and M6Doc, achieving state‑of‑the‑art results when combined with detectors like DINO.
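The difference between early and late fusion can be sketched in a few lines. This is a toy illustration rather than M2Doc's modules: it assumes early fusion adds a text feature grid to a visual feature grid of the same shape before detection, and late fusion concatenates per‑region visual and text features after candidate boxes exist. Both function names and both fusion operators (sum, concatenation) are assumptions of this example.

```python
def early_fusion(visual_map, text_map):
    """Element-wise sum of two equally shaped 2-D feature grids.

    The fused grid would then be fed to the detector in place of the
    visual-only grid.
    """
    if len(visual_map) != len(text_map):
        raise ValueError("feature grids must have the same number of rows")
    return [
        [v + t for v, t in zip(v_row, t_row)]
        for v_row, t_row in zip(visual_map, text_map)
    ]

def late_fusion(box_visual_feat, box_text_feat):
    """Concatenate the visual and text features of one candidate region."""
    return list(box_visual_feat) + list(box_text_feat)

# Toy 2x2 feature grids and one toy region (illustrative values).
visual = [[0.1, 0.2], [0.3, 0.4]]
text = [[1.0, 0.0], [0.0, 1.0]]

fused_grid = early_fusion(visual, text)
fused_box = late_fusion([0.5, 0.6], [1.0])
print(fused_grid, fused_box)
```

Because both steps only transform features going into or coming out of the detector, they can in principle be attached to an existing detector without redesigning it, which is what makes a plug‑in approach attractive.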
Alibaba Cloud Big Data AI Platform
The Alibaba Cloud Big Data AI Platform builds on Alibaba’s leading cloud infrastructure, big‑data and AI engineering capabilities, scenario algorithms, and extensive industry experience to offer enterprises and developers a one‑stop, cloud‑native big‑data and AI capability suite. It boosts AI development efficiency, enables large‑scale AI deployment across industries, and drives business value.
