Meituan Technical Team Shares CVPR 2021 Pre-lecture: Five Papers on Video Instance Segmentation, Facial Expression Recognition, Real-time Semantic Segmentation, Weakly Supervised Semantic Segmentation, and Multi-source Domain Adaptation

At a CVPR 2021 pre-lecture, Meituan's Visual Intelligence Center presented five papers: VisTR, a transformer-based method for video instance segmentation; a feature decomposition and reconstruction approach to facial expression recognition; a faster BiSeNet for real-time semantic segmentation; an embedded discriminative attention mechanism for weakly supervised semantic segmentation; and a partial feature selection framework for multi-source domain adaptation. The session also highlighted the company's large AI R&D team, its university collaborations, real-world deployment across its services, and ongoing recruitment.

Meituan Technology Team

Wei Xiaolin, head of the Meituan Visual Intelligence Center, opened the session by highlighting Meituan's AI-driven R&D team of over 10,000, its collaborations with more than 20 universities, and its diverse application scenarios, which range from online search, recommendation, advertising, and content safety to offline delivery, smart stores, and autonomous vehicles.

He emphasized that Meituan provides a sustainable research environment where CVPR 2021 papers are grounded in real‑world business scenarios.

The team actively explores frontier technologies such as self‑supervised learning, multimodal learning, Visual Transformers, and AutoML, encouraging publication in top international venues.

Paper 1 – End‑to‑End Video Instance Segmentation with Transformers (VisTR) by Wang Yuqing (Meituan Unmanned Delivery Center). Introduces VisTR, the first transformer‑based method for video instance segmentation, treating the task as parallel sequence decoding; achieves state‑of‑the‑art performance on YouTube‑VIS without tricks.

Paper 2 – Feature Decomposition and Reconstruction Learning for Effective Facial Expression Recognition by Ruan Delian (Xiamen University). Proposes a method that splits facial expression information into shared and expression-specific features via decomposition and reconstruction networks, showing superior performance on the CK+, MMI, Oulu-CASIA, RAF-DB, and SFEW datasets.

Paper 3 – Rethinking BiSeNet For Real-time Semantic Segmentation by Fan Mingyuan (Meituan Visual Intelligence Center). Improves BiSeNet with an efficient Short-Term Dense Concatenate (STDC) backbone and a detail-guided decoder, yielding a speed-up of more than 45% at equal accuracy; already used at Meituan for watermark removal and map scene parsing.

Paper 4 – Embedded Discriminative Attention Mechanism for Weakly Supervised Semantic Segmentation by Wu Tong (Meituan Visual Intelligence Center intern, Beijing Institute of Technology). Introduces EDAM, which integrates class‑activation‑map generation into the classification network via a Discriminative Activation layer and Collaborative Multi‑Attention mechanism, achieving 70.6% mIoU on PASCAL VOC 2012 test set.
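
The mIoU figure quoted above is the mean, over classes, of the intersection-over-union between predicted and ground-truth labels. The following is a minimal sketch of that metric on flat label lists; the function name `mean_iou` and the toy data are illustrative assumptions, and the official benchmark accumulates counts over the whole test set rather than a single short list.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes that occur in pred or target.

    pred and target are equal-length sequences of integer class labels
    (one label per pixel). This is an illustrative sketch, not the
    official PASCAL VOC evaluation code.
    """
    intersection = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel is in both the predicted and true region of class p.
            intersection[p] += 1
            union[p] += 1
        else:
            # Pixel is in the predicted region of p and the true region of t.
            union[p] += 1
            union[t] += 1
    # Skip classes that appear in neither prediction nor ground truth.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious)


pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
print(mean_iou(pred, target, num_classes=3))  # per-class IoU: 1/3, 2/3, 1/2
```

Averaging per class rather than per pixel means that small object classes weigh as much as large background regions in the final score.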

Paper 5 – Partial Feature Selection and Alignment for Multi-Source Domain Adaptation by Zhang Ming (Meituan Dianping intern, University of Electronic Science and Technology of China). Formulates multi-source partial domain adaptation (MSPDA) and proposes the PFSA framework, which selects source features relevant to the target domain and aligns them via multiple losses, delivering leading results on both the traditional MSDA setting and the new MSPDA setting.
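
To give a rough feel for what "selecting source features relevant to the target domain" can mean, here is a toy sketch. It is not the PFSA algorithm from the paper: the gating rule (down-weighting feature dimensions whose source and target means disagree), the function names, and the `temperature` parameter are all assumptions made for illustration.

```python
import math


def column_means(rows):
    """Mean of each feature dimension over a batch of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def selection_weights(source_feats, target_feats, temperature=1.0):
    """Hypothetical per-dimension weights in (0, 1].

    A weight is close to 1 when the source and target batch means agree
    on that dimension and decays toward 0 as they diverge.
    """
    src_mean = column_means(source_feats)
    tgt_mean = column_means(target_feats)
    return [math.exp(-abs(s - t) / temperature)
            for s, t in zip(src_mean, tgt_mean)]


def select_features(source_feats, weights):
    """Scale every source feature vector by the selection weights."""
    return [[w * x for w, x in zip(weights, row)] for row in source_feats]


source = [[1.0, 5.0], [1.0, 7.0]]   # dimension 1 is active only in the source
target = [[1.0, 0.0], [1.0, 0.0]]
weights = selection_weights(source, target)
filtered = select_features(source, weights)
print(weights)    # first weight is 1.0, second is close to 0
print(filtered)   # second dimension of each source vector is suppressed
```

In this toy setup the dimension that the target domain never uses is almost entirely suppressed, so any later alignment loss would act mainly on the shared dimension.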

The article concludes with recruitment information for Meituan’s Visual Intelligence Center and Unmanned Delivery Center, inviting candidates to scan QR codes for details.


Tags: AI, Transformers, domain adaptation, semantic segmentation, Video Instance Segmentation, Facial Expression Recognition, CVPR2021, Meituan Research