
Ant Group’s 24 Papers Featured at CVPR2024: Topics and Abstracts

The IEEE CVPR2024 conference in Seattle accepted 2,719 of 11,532 submissions. Ant Group contributed 24 papers covering computer vision, deep learning, digital humans, large models, multimodal remote sensing, vision‑language distillation, federated incremental learning, model‑stealing defense, and more. One of them was selected as a highlight paper.

AntTech

Jun 18, 2024


The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR2024) opened in Seattle on June 17. The conference received 11,532 valid submissions and accepted 2,719 papers, an acceptance rate of 23.6%.

Ant Group had 24 papers accepted, one of which was highlighted by the program committee. The papers span a wide range of topics, including computer vision, deep learning, digital humans, large models, multimodal remote sensing, vision‑language model distillation, federated incremental learning, model‑stealing defense, and 3D scene generation.

Selected papers include:

CoDeF: A novel video representation using a canonical content field and a temporal deformation field to enable temporally consistent video processing.

SkySense: A multi‑modal remote sensing foundation model that leverages billions of time‑series images for universal Earth observation interpretation.

PromptKD: An unsupervised prompt‑distillation method for transferring knowledge from large CLIP models to lightweight vision‑language students.

Efficient Replay in Federated Incremental Learning: A simple framework (Re‑Fed) that caches important samples for replay to mitigate catastrophic forgetting in federated settings.

Model‑Stealing Defense with Noise Transition Matrix: A low‑overhead perturbation technique that injects noise into model predictions to protect against unauthorized cloning.

DynTet: A dynamic tetrahedra approach that combines explicit mesh representations with neural rendering for high‑quality talking‑head synthesis.
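
The one-line summary of the model‑stealing defense does not say how the noise is injected. As a rough illustration of the general idea only, the sketch below multiplies a classifier's output probabilities by a row‑stochastic transition matrix before releasing them. The function names, the uniform off‑diagonal construction, and the `keep` parameter are all assumptions made for this example and are not taken from the paper.

```python
# Hypothetical illustration: perturb released class probabilities with a
# row-stochastic "noise transition matrix" T. The construction below is an
# assumption for this sketch, not the method from the CVPR2024 paper.

def make_transition_matrix(num_classes: int, keep: float) -> list[list[float]]:
    """Build a row-stochastic matrix with `keep` on the diagonal and the
    remaining probability mass spread evenly over the other classes."""
    off = (1.0 - keep) / (num_classes - 1)
    return [[keep if i == j else off for j in range(num_classes)]
            for i in range(num_classes)]

def perturb(probs: list[float], T: list[list[float]]) -> list[float]:
    """Return the vector-matrix product probs @ T, i.e. the noisy output."""
    n = len(probs)
    return [sum(probs[i] * T[i][j] for i in range(n)) for j in range(n)]

clean = [0.7, 0.2, 0.1]                  # the model's true prediction
T = make_transition_matrix(3, keep=0.8)  # 0.8 on the diagonal, 0.1 elsewhere
noisy = perturb(clean, T)                # what an API caller would receive
print(noisy)
```

With this particular matrix the released vector is still a valid probability distribution and the top‑1 class is unchanged, while the exact confidence values a would‑be cloner observes are distorted. A practical defense would choose the matrix more carefully than this uniform example does.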

Other papers address topics such as multimodal large models (Pink), vision‑inspired VIVL modules, sparse token reduction (SparseFormer), and 4D scene understanding for autonomous driving (DriveWorld). Together, the 24 papers show the breadth of the research Ant Group presented at CVPR2024.
