Three Ant Group Papers Featured at EMNLP 2024: Dynamic Transformers, Plug‑and‑Play Visual Reasoner, and Efficient Fine‑Tuning of Large Language Models
This announcement introduces three Ant Group papers accepted at EMNLP 2024—Mixture‑of‑Modules for dynamic Transformer assembly, a plug‑and‑play visual reasoning framework built via data synthesis, and a layer‑wise importance‑aware efficient fine‑tuning method for large language models—highlighting their innovations and upcoming live presentations.
With the rapid development of artificial intelligence, large language models and visual reasoning have become foundational to intelligent technology. Several Ant Group papers accepted at the upcoming EMNLP 2024 conference showcase the latest advances in these areas; here we preview three representative works.
1. Mixture-of-Modules: Reinventing Transformers as Dynamic Assemblies of Modules
Traditional Transformers use a fixed computation path for every token, limiting flexibility and efficiency. The Mixture-of-Modules (MoM) framework instead composes Transformer modules dynamically, assigning the most suitable computation module to each token. This design improves flexibility and reduces computation while maintaining strong performance across tasks.
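To make the idea of per-token module routing concrete, here is a minimal, hypothetical PyTorch sketch. The candidate modules, the hard argmax router, and all sizes are illustrative assumptions, not the MoM paper's actual architecture.

```python
import torch
import torch.nn as nn

class DynamicModuleLayer(nn.Module):
    """Routes each token to one of several candidate computation modules."""

    def __init__(self, d_model: int = 64):
        super().__init__()
        # Candidate modules a token can be assigned to: a cheap skip path,
        # a small FFN, and a larger FFN (purely illustrative choices).
        self.candidates = nn.ModuleList([
            nn.Identity(),
            nn.Sequential(nn.Linear(d_model, d_model), nn.GELU()),
            nn.Sequential(nn.Linear(d_model, 4 * d_model), nn.GELU(),
                          nn.Linear(4 * d_model, d_model)),
        ])
        # Router scores each token against each candidate module.
        self.router = nn.Linear(d_model, len(self.candidates))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model). Hard argmax routing keeps the sketch
        # short; real training needs a differentiable or specially trained router.
        choice = self.router(x).argmax(dim=-1)                # (batch, seq_len)
        out = torch.zeros_like(x)
        for idx, module in enumerate(self.candidates):
            mask = (choice == idx).unsqueeze(-1).to(x.dtype)  # (batch, seq_len, 1)
            out = out + mask * module(x)                      # keep only routed tokens
        return out

layer = DynamicModuleLayer()
tokens = torch.randn(2, 8, 64)
print(layer(tokens).shape)  # torch.Size([2, 8, 64])
```

Because each token takes only the path it is routed to, cheap tokens can skip the expensive module entirely, which is where the computation savings come from.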
2. From the Least to the Most: Building a Plug-and-Play Visual Reasoner via Data Synthesis
Visual reasoning requires models to understand complex image information. The authors propose a new paradigm that leverages a collaborative visual-language model with problem decomposition and tool invocation. To address data scarcity, they introduce a data-synthesis pipeline that automatically generates multi-step visual reasoning data, releasing a dataset of one million examples. Fine-tuned on this data, the visual reasoner significantly boosts performance on various visual-question-answering tasks.
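To illustrate what a synthesized multi-step reasoning example might look like, here is a small, hypothetical Python sketch. The step fields, tool names, and hard-coded contents are illustrative assumptions and do not mirror the paper's actual pipeline or data schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class ReasoningStep:
    sub_question: str   # decomposed "least-to-most" sub-problem
    tool: str           # tool invoked to answer it (e.g. grounding, OCR)
    tool_output: str    # intermediate observation passed to the next step

@dataclass
class SyntheticExample:
    image_id: str
    question: str
    steps: List[ReasoningStep] = field(default_factory=list)
    answer: str = ""

def synthesize_example(image_id: str, question: str) -> SyntheticExample:
    """Toy stand-in for a synthesis pipeline: decompose, call tools, record steps."""
    example = SyntheticExample(image_id=image_id, question=question)
    # In a real pipeline the sub-questions would come from a model decomposing
    # the question and the outputs from actual vision tools; these are stubs.
    example.steps.append(ReasoningStep(
        sub_question="Where is the object mentioned in the question?",
        tool="grounding",
        tool_output="bounding box of the red cup",
    ))
    example.steps.append(ReasoningStep(
        sub_question="What text appears on that object?",
        tool="ocr",
        tool_output="'COFFEE'",
    ))
    example.answer = "COFFEE"
    return example

print(synthesize_example("img_0001", "What is written on the red cup?"))
```

Each record pairs a question with an explicit chain of sub-questions, tool calls, and observations, which is the kind of supervision a plug-and-play reasoner can be fine-tuned on.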
3. Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models
While large language models store extensive knowledge, downstream tasks often require fine-tuning. Existing parameter-efficient fine-tuning (PEFT) methods treat all layers uniformly, ignoring differences in layer importance. The proposed Importance-aware Sparse Tuning (IST) scores the importance of each layer and fine-tunes only the most important subset, reducing memory usage while delivering superior performance. IST is compatible with various PEFT techniques, comes with a theoretical convergence guarantee, and shows strong empirical results.
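The core idea, spending fine-tuning memory only on the most important layers, can be sketched in a few lines of PyTorch. The importance proxy (adapter gradient norm on a probe batch) and the top-k selection rule below are illustrative assumptions, not the exact IST procedure.

```python
import torch
import torch.nn as nn

d, n_layers, k = 32, 6, 2  # tune adapters in only k of the n_layers blocks

class Block(nn.Module):
    def __init__(self):
        super().__init__()
        self.frozen = nn.Linear(d, d)               # pretrained weight, never updated
        self.adapter = nn.Linear(d, d, bias=False)  # lightweight PEFT module
        nn.init.zeros_(self.adapter.weight)
    def forward(self, x):
        return self.frozen(x) + self.adapter(x)

model = nn.Sequential(*[Block() for _ in range(n_layers)])
for block in model:
    block.frozen.weight.requires_grad_(False)
    block.frozen.bias.requires_grad_(False)

# 1) Score layer importance on a probe batch (proxy: adapter gradient norm).
x, y = torch.randn(16, d), torch.randn(16, d)
loss = nn.functional.mse_loss(model(x), y)
loss.backward()
scores = [block.adapter.weight.grad.norm().item() for block in model]
model.zero_grad()

# 2) Keep adapters trainable only in the top-k most important layers; the rest
#    need no gradients or optimizer state, which is where memory is saved.
selected = sorted(range(n_layers), key=lambda i: scores[i], reverse=True)[:k]
for i, block in enumerate(model):
    block.adapter.weight.requires_grad_(i in selected)

print("importance scores:", [round(s, 4) for s in scores])
print("layers selected for tuning:", selected)
```

The same selection step can sit on top of LoRA-style or other PEFT adapters, since it only decides which layers receive updates.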
These three papers advance intelligent technology on three fronts: dynamic Transformer composition, data-driven visual reasoning, and memory-efficient fine-tuning of large language models.
We have invited the first authors—Gong Zhuocheng, Cheng Chuanqi, and Yao Kai—to share their work live on October 31, 2024, from 18:00 to 20:30 via the "Paper Show Live #8" stream on WeChat Channels (Ant Technology Research Institute), Ant Technology AntTech, and Bilibili. Viewers are welcome to interact with the authors and learn about their research ideas and experiments.
Live Viewing Guide
Time: 2024-10-31, 18:00-20:30
Platforms: WeChat Video Channels (Ant Technology Research Institute, Ant Technology AntTech) and Bilibili (Ant Technology Research Institute).