
VICFace: High-Precision Face Detection in Natural Scenes

VICFace, a Meituan Vision Intelligence Center model, combines advanced data augmentation, a ResNet‑152‑based SRN architecture, specialized anchor design, and multi‑task loss functions to achieve state‑of‑the‑art face detection accuracy on WIDER FACE, enabling robust natural‑scene applications such as content filtering and safety checks.

Meituan Technology Team

Feb 6, 2020


The article introduces VICFace, a high‑accuracy face detection model developed by Meituan’s Vision Intelligence Center for natural‑scene applications.

Background : Face detection in unconstrained environments must cope with varying illumination, pose, occlusion, and scale. Accurate detection is essential for downstream tasks such as face recognition, attribute analysis, and privacy protection.

Technical Development : Traditional methods (e.g., Viola‑Jones) rely on handcrafted features and scale poorly to large, varied datasets. Deep‑learning‑based detectors now dominate and fall into three families: cascade‑based, two‑stage, and single‑stage (anchor‑based). Single‑stage methods such as SSD, RetinaNet, SRN, and DSFD offer a good trade‑off between speed and accuracy.

Optimization Strategies :

1. Data augmentation and sampling : VICFace builds on ISRN’s augmentation, adds mixup, and applies dynamic weighting to hard, tiny, or blurred faces.

2. Model architecture : It adopts the SRN detection framework, enhances feature fusion with weighted channels (example for P4 shown in the original figure), and uses a ResNet‑152 backbone with modified convolutions.

3. Prediction module : Combines dilated convolutions and 1×k/k×1 convolutions as a context module, and incorporates a Maxout layer to improve recall.

4. Anchor design and sample assignment : Uses a mixed anchor set (e.g., {2S, 4S} on the C3/P3 layers and {4S} on the other layers, where S is presumably the stride of the layer) with aspect ratio 0.8, and assigns positive and negative samples by IoU thresholds.

5. Loss functions : Employs Focal Loss for classification, Complete IoU Loss for bounding‑box regression, and auxiliary tasks (key‑point detection, segmentation) to boost performance.
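The mixup step mentioned in item 1 can be illustrated with a minimal sketch. It assumes the common detection variant of mixup: two images are blended pixel by pixel, and the face boxes of both are kept, each tagged with the blend weight of its source image. The function name `mixup_detection`, the Beta parameter `alpha=1.5`, and the per-box weights are illustrative assumptions; the article does not describe how VICFace implements mixup.

```python
import random


def mixup_detection(img_a, boxes_a, img_b, boxes_b, alpha=1.5, rng=random):
    """Blend two equally sized images and merge their face boxes.

    img_*   : 2-D list (rows of pixel intensities); both share one shape.
    boxes_* : list of (x1, y1, x2, y2) tuples.
    Returns (mixed_image, targets, lam), where targets is a list of
    (box, weight) pairs and weight is the blend coefficient of the
    image the box came from (an assumed weighting scheme).
    """
    if len(img_a) != len(img_b) or len(img_a[0]) != len(img_b[0]):
        raise ValueError("images must share the same shape")

    # Blend ratio drawn from Beta(alpha, alpha); alpha is a hypothetical value.
    lam = rng.betavariate(alpha, alpha)

    # Pixel-wise convex combination of the two images.
    mixed = [
        [lam * pa + (1.0 - lam) * pb for pa, pb in zip(row_a, row_b)]
        for row_a, row_b in zip(img_a, img_b)
    ]

    # Keep every box from both images, weighted by its image's share.
    targets = [(box, lam) for box in boxes_a]
    targets += [(box, 1.0 - lam) for box in boxes_b]
    return mixed, targets, lam
```

Passing a seeded `random.Random` instance as `rng` makes the blend ratio reproducible, which helps when debugging an augmentation pipeline.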
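The arithmetic behind the context module in item 3 can be worked through in a few lines. Two standard facts are used: a k‑tap kernel with dilation d spans d·(k−1)+1 positions, and replacing a k×k convolution with a 1×k convolution followed by a k×1 convolution cuts the weight count from k² to 2k per channel pair. The channel width of 256 and kernel size of 5 are hypothetical values chosen for illustration; the article does not state the sizes VICFace uses.

```python
def effective_kernel(k, dilation):
    """Span covered by a k-tap kernel with the given dilation rate."""
    return dilation * (k - 1) + 1


def conv_weights(kh, kw, c_in, c_out):
    """Weight count of a kh x kw convolution (bias ignored)."""
    return kh * kw * c_in * c_out


channels = 256  # hypothetical channel width
k = 5           # hypothetical kernel size

# One full k x k convolution versus a 1 x k followed by a k x 1 convolution.
full = conv_weights(k, k, channels, channels)
factored = (conv_weights(1, k, channels, channels)
            + conv_weights(k, 1, channels, channels))

print(effective_kernel(3, 2))  # 5: a dilated 3-tap kernel spans 5 positions
print(factored / full)         # 0.4: the factored pair needs 2k / k^2 of the weights
```

Both tricks enlarge the receptive field cheaply, which is presumably why the prediction module combines them to gather context around small faces.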
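The anchor layout and IoU‑based sample assignment from item 4 can be sketched as follows. The sketch assumes that S is the stride of the feature level, that an anchor of scale n has height n·S, and that the aspect ratio 0.8 means width divided by height. The thresholds `pos_thr=0.5` and `neg_thr=0.4` are placeholders, since the article does not give the values VICFace uses, and all function names are hypothetical.

```python
def make_anchors(feat_h, feat_w, stride, scales=(2, 4), aspect_ratio=0.8):
    """Return (cx, cy, w, h) anchors for every cell of one feature map.

    Assumption: height = scale * stride and width = aspect_ratio * height.
    """
    anchors = []
    for row in range(feat_h):
        for col in range(feat_w):
            cx = (col + 0.5) * stride  # cell centre in image coordinates
            cy = (row + 0.5) * stride
            for scale in scales:
                h = scale * stride
                w = aspect_ratio * h
                anchors.append((cx, cy, w, h))
    return anchors


def to_corners(anchor):
    """Convert (cx, cy, w, h) to (x1, y1, x2, y2)."""
    cx, cy, w, h = anchor
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def assign(anchors, gt_boxes, pos_thr=0.5, neg_thr=0.4):
    """Label each anchor 1 (face), 0 (background) or -1 (ignored).

    The thresholds are hypothetical placeholders.
    """
    labels = []
    for anchor in anchors:
        box = to_corners(anchor)
        best = max((iou(box, gt) for gt in gt_boxes), default=0.0)
        if best >= pos_thr:
            labels.append(1)
        elif best < neg_thr:
            labels.append(0)
        else:
            labels.append(-1)  # between thresholds: excluded from the loss
    return labels
```

For a hypothetical stride‑8 level, `make_anchors(80, 80, 8, scales=(2, 4))` would give the {2S, 4S} set, while `scales=(4,)` gives the {4S} set used on the other levels.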
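The two main losses named in item 5 follow published formulas, which can be written out in plain Python. Focal Loss is FL(p_t) = −α_t (1 − p_t)^γ log(p_t), and Complete IoU Loss is 1 − IoU + ρ²/c² + αv, where ρ is the distance between box centres, c is the diagonal of the smallest enclosing box, and v measures aspect‑ratio mismatch. The defaults α = 0.25 and γ = 2 come from the original Focal Loss paper and are not necessarily the settings VICFace uses; the function names and the `eps` guards are this sketch's own.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one anchor.

    p : predicted face probability in [0, 1]
    y : 1 for a face anchor, 0 for background
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    p_t = min(max(p_t, eps), 1.0)  # avoid log(0)
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def ciou_loss(pred, gt, eps=1e-9):
    """Complete IoU loss for two (x1, y1, x2, y2) boxes with positive size."""
    px1, py1, px2, py2 = pred
    gx1, gy1, gx2, gy2 = gt
    pw, ph = px2 - px1, py2 - py1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Plain IoU term.
    iw = max(0.0, min(px2, gx2) - max(px1, gx1))
    ih = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = iw * ih
    union = pw * ph + gw * gh - inter
    iou = inter / (union + eps)

    # Squared centre distance over squared enclosing-box diagonal.
    rho2 = ((px1 + px2) / 2 - (gx1 + gx2) / 2) ** 2 \
         + ((py1 + py2) / 2 - (gy1 + gy2) / 2) ** 2
    cw = max(px2, gx2) - min(px1, gx1)
    ch = max(py2, gy2) - min(py1, gy1)
    c2 = cw ** 2 + ch ** 2 + eps

    # Aspect-ratio consistency term and its trade-off weight.
    v = (4 / math.pi ** 2) * (math.atan(gw / (gh + eps))
                              - math.atan(pw / (ph + eps))) ** 2
    alpha = v / (1.0 - iou + v + eps)

    return 1.0 - iou + rho2 / c2 + alpha * v
```

With γ = 2, a confidently correct anchor (p_t = 0.9) contributes 100 times less than it would under plain weighted cross‑entropy, so training concentrates on hard examples. The centre‑distance term of CIoU still yields a useful gradient when the predicted and ground‑truth boxes do not overlap at all.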

Results : On the WIDER FACE benchmark, VICFace reports state‑of‑the‑art AP on the Easy, Medium, and Hard subsets, ahead of the other leading detectors it is compared against.

Business Applications : VICFace is deployed across Meituan services for UGC image filtering, POI image display, and safety checks (e.g., detecting hats and masks on kitchen staff). Future work includes exploring anchor‑free detectors and further efficiency improvements.

References : The article lists over 50 citations covering face detection benchmarks, classic algorithms, and recent deep‑learning advances.
