Meituan Vision AI Research Highlights and Open‑Source Releases
The Meituan Vision AI Research Highlights compiles recent breakthroughs alongside open‑source releases: championship‑winning street‑scene segmentation at CVPR 2023; summaries of eight CVPR 2023 papers; the YOLOv6 3.0 release, which outperforms YOLOv7‑E6E; the large‑scale Food2K dataset with a progressive region‑enhancement network; GPU inference service optimizations; quantized YOLOv6 deployment; YOLOv6 2.0 model specs; six CVPR 2022 papers; the Twins attention model; short‑video understanding techniques; and earlier CVPR 2019 and ICDAR 2019 contributions.
This collection presents a series of Meituan technology articles and paper summaries covering recent advances in computer vision and artificial intelligence.
It highlights the street‑scene segmentation techniques that earned two first‑place finishes and one runner‑up at CVPR 2023, and provides a detailed introduction to the methods used.
Eight selected CVPR 2023 papers from Meituan are summarized, covering self‑supervised learning, domain adaptation, federated learning, object detection, tracking, segmentation, and low‑level vision.
The release of YOLOv6 3.0 is announced, showcasing technical innovations that push its overall performance beyond YOLOv7‑E6E.
A large‑scale Food2K dataset and a progressive region‑enhancement network for food image recognition are introduced, with results published in T‑PAMI 2023.
An engineering case study on GPU inference service architecture optimization is described, raising GPU utilization from 40% to 100% and more than tripling QPS.
Eight Meituan papers accepted at ACM MM and ECCV 2022 are overviewed, illustrating AI applications in content creation, moderation, and distribution.
The quantized deployment of YOLOv6 at Meituan is detailed, achieving high inference speed while preserving accuracy.
YOLOv6 2.0 is released, featuring lightweight and medium/large models with COCO AP of 49.5 %/52.5 % and T4 GPU speeds of 233/121 FPS (batch size = 32).
A collection of six CVPR 2022 papers is summarized, covering model compression, video object segmentation, 3D vision, image captioning, model security, and cross‑modal video retrieval.
The Twins visual attention model, co‑developed with the University of Adelaide and accepted at NeurIPS 2021, is introduced with its design and deployment insights.
Short‑video content understanding and generation techniques applied in Meituan’s business scenarios are presented.
A recap of the CVPR 2019 trajectory‑prediction competition champion method is provided.
An ICDAR 2019 paper on natural‑scene text detection using a pyramid network and 8‑neighbor connections is described.
Meituan Technology Team
Over 10,000 engineers powering China’s leading lifestyle services e‑commerce platform, serving hundreds of millions of consumers and millions of merchants across 2,000+ industries. This is the public channel for the tech teams behind Meituan, Dianping, Meituan Waimai, Meituan Select, and related services.
