Inside Suning’s Scalable Real‑Time Face Recognition Architecture and Algorithms
Suning’s face recognition solution combines front‑end detection, optimal photo selection, alignment, and cloud‑based feature extraction and matching, leveraging deep‑learning models, weight and feature normalization, angular margins, and triplet loss, while optimizing hardware, bandwidth, and data quality for large‑scale 1:N deployments.
System Overview
A mature face recognition system typically includes modules for face detection, optimal photo selection, face alignment, feature extraction, and feature comparison.
Application Scenarios
Face recognition applications are divided into 1:1 verification (authentication) and 1:N identification (search). Both share the same underlying technology, but 1:N systems face higher false‑accept rates as the database size grows, leading to a trade‑off between false‑accept and false‑reject rates.
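A rough way to see why database size matters is sketched below. It assumes every comparison against the gallery is independent and shares the same per-comparison false-accept rate; this is a simplification made for illustration, not something the article states, and the function name is hypothetical.

```python
def one_to_n_false_accept(far_1to1: float, gallery_size: int) -> float:
    """Probability of at least one false match when a probe is compared
    against gallery_size identities, assuming independent comparisons
    (an illustrative simplification)."""
    return 1.0 - (1.0 - far_1to1) ** gallery_size


# The same 1:1 operating point looks very different as the gallery grows.
for n in (1, 1_000, 100_000):
    print(n, one_to_n_false_accept(1e-5, n))
```

Under this model, keeping the 1:N false-accept rate constant as the gallery grows means tightening the match threshold, which in turn raises the false-reject rate.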
Key Technological Advances
The breakthrough came in 2014, when DeepFace applied deep learning to face feature extraction and made the technology commercially viable. Current research focuses on improving network architectures and loss functions, while large face databases remain a competitive advantage.
Suning’s Deployment Strategy
Suning emphasizes real‑time performance and resource efficiency. It uses deep‑learning‑based detectors such as MTCNN and RSA and places detection, tracking, and optimal photo selection on the front end, while modeling and matching run on central servers. In some cases, detection and tracking are offloaded to embedded devices.
Hardware Considerations
Standard security cameras mounted high often fail to capture frontal faces with sufficient resolution. For reliable recognition, the inter‑pupil distance in captured images should exceed 80 pixels, requiring careful selection of focal length and sensor resolution.
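The relationship between focal length, sensor resolution, and the 80-pixel target can be estimated with a pinhole camera model. The sketch below is an approximation only: it assumes a frontal face, ignores lens distortion, and uses a typical adult inter-pupil distance of about 63 mm. The function names and the example lens and sensor values are hypothetical.

```python
def pupil_distance_pixels(focal_length_mm: float, subject_distance_m: float,
                          pixel_pitch_um: float, ipd_mm: float = 63.0) -> float:
    """Projected inter-pupil distance in pixels under a pinhole model.
    ipd_mm defaults to a typical adult value (an assumption)."""
    # Size of the inter-pupil segment on the sensor, in millimetres.
    image_size_mm = focal_length_mm * ipd_mm / (subject_distance_m * 1000.0)
    return image_size_mm / (pixel_pitch_um / 1000.0)


def min_focal_length_mm(target_pixels: float, subject_distance_m: float,
                        pixel_pitch_um: float, ipd_mm: float = 63.0) -> float:
    """Shortest focal length that reaches target_pixels between the pupils."""
    return (target_pixels * (pixel_pitch_um / 1000.0)
            * subject_distance_m * 1000.0 / ipd_mm)


# Hypothetical setup: 8 mm lens, subject 3 m away, 2.9 um pixels.
print(pupil_distance_pixels(8.0, 3.0, 2.9))
print(min_focal_length_mm(80.0, 3.0, 2.9))
```

With these example values the 8 mm lens falls short of 80 pixels at 3 m, so a longer lens, a finer pixel pitch, or a closer mounting position would be needed.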
Optimal Photo Quality Scoring
Quality assessment considers pose, blur, expression, and occlusion, assigning a score to each captured face image.
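One simple way to turn these factors into a single score is a weighted sum, then keeping the best-scoring frame of each tracked face. The sketch below assumes each factor has already been estimated and scaled to the range 0 to 1; the field names and weights are hypothetical and are not Suning's actual scoring model.

```python
from dataclasses import dataclass


@dataclass
class FaceQuality:
    frame_id: int
    pose: float        # 1.0 = frontal, 0.0 = extreme yaw or pitch
    sharpness: float   # 1.0 = sharp, 0.0 = heavily blurred
    neutrality: float  # 1.0 = neutral expression
    visibility: float  # 1.0 = no occlusion


# Hypothetical weights; a real system would tune them against recognition accuracy.
WEIGHTS = {"pose": 0.4, "sharpness": 0.3, "neutrality": 0.1, "visibility": 0.2}


def quality_score(q: FaceQuality) -> float:
    """Weighted sum of the four quality factors."""
    return (WEIGHTS["pose"] * q.pose
            + WEIGHTS["sharpness"] * q.sharpness
            + WEIGHTS["neutrality"] * q.neutrality
            + WEIGHTS["visibility"] * q.visibility)


def best_frame(track: list) -> FaceQuality:
    """Return the highest-scoring capture from one tracked face."""
    return max(track, key=quality_score)


track = [
    FaceQuality(1, pose=0.5, sharpness=0.9, neutrality=0.8, visibility=1.0),
    FaceQuality(2, pose=0.95, sharpness=0.8, neutrality=0.9, visibility=1.0),
    FaceQuality(3, pose=0.9, sharpness=0.3, neutrality=0.9, visibility=0.6),
]
print(best_frame(track).frame_id)
```

Sending only the selected frame to the server is what lets front-end selection reduce both bandwidth and back-end load.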
Bandwidth Optimization
Face images are resized to about 150 pixels before feature extraction. JPEG compression at 75 % quality yields negligible performance loss while reducing bandwidth several‑fold; WebP further cuts bandwidth by 20‑30 %.
Algorithmic Foundations
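The classic softmax loss is defined as:
$$L = -\frac{1}{N}\sum_{i=1}^{N} \log \frac{e^{w_{y_i}^{\top} x_i + b_{y_i}}}{\sum_{j=1}^{C} e^{w_j^{\top} x_i + b_j}}$$
Here, x_i denotes the feature of sample i, y_i its class label, w_j the weight vector for class j, and b_j the corresponding bias; N is the number of samples and C the number of classes. Face recognition is an open‑set problem, requiring feature spaces that pull same‑class samples together and push different‑class samples apart.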
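The softmax loss described above can be written out directly. The following is a minimal, dependency-free sketch for small inputs; the function and parameter names are illustrative, and a real training setup would use a deep-learning framework instead.

```python
import math


def softmax_loss(features, labels, weights, biases):
    """Mean softmax (cross-entropy) loss over a batch.

    features: list of feature vectors x_i
    labels:   list of class indices y_i
    weights:  list of class weight vectors w_j
    biases:   list of class biases b_j
    """
    total = 0.0
    for x, y in zip(features, labels):
        # Logit for class j is w_j . x + b_j.
        logits = [sum(wk * xk for wk, xk in zip(w, x)) + b
                  for w, b in zip(weights, biases)]
        # Subtract the maximum before exponentiating for numerical stability.
        m = max(logits)
        log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
        total += log_sum_exp - logits[y]
    return total / len(features)


# With all-zero weights every class is equally likely, so the loss is log(2).
print(softmax_loss([[1.0, 2.0]], [0], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]))
```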
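Several losses have been proposed to improve discriminative power. Center loss, contrastive loss, and triplet loss add explicit constraints on intra‑class compactness or inter‑class separation, while SphereFace, AM‑Softmax, and ArcFace build multiplicative or additive margins on the angle (or its cosine) directly into the softmax loss.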
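For reference, the three angular-margin methods differ mainly in how they modify the logit of the true class. In the notation assumed here, θ_{y_i} is the angle between the feature x_i and the weight vector of its true class, s is a scale factor, and m is the margin. The forms below follow the respective papers; SphereFace is shown only on the interval where cos(mθ) is monotonic, and the paper extends it piecewise beyond that.

```latex
\begin{aligned}
\text{SphereFace:} &\quad \lVert x_i \rVert \cos(m\,\theta_{y_i}), \qquad \theta_{y_i} \in [0, \pi/m] \\
\text{AM-Softmax:} &\quad s\,(\cos\theta_{y_i} - m) \\
\text{ArcFace:}    &\quad s\,\cos(\theta_{y_i} + m)
\end{aligned}
```

In each case the logits of the other classes stay at their unmodified cosine form, so the true class has to win by at least the margin.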
Weights Normalization
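SphereFace normalizes the class weights and removes the bias, so each logit reduces to ‖x_i‖ cos θ_j and classification depends only on the angle between the feature and each class weight. Writing the original logit as ‖w_j‖ ‖x_i‖ cos θ_j highlights the three factors that later methods manipulate: the weight norm, the feature norm, and the angle θ together with its margin.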
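The decomposition of a logit into weight norm, feature norm, and angle can be checked numerically. The sketch below uses a two-dimensional toy example; the function name is hypothetical.

```python
import math


def decompose_logit(w, x):
    """Split the dot product w . x into (||w||, ||x||, theta)."""
    dot = sum(a * b for a, b in zip(w, x))
    w_norm = math.sqrt(sum(a * a for a in w))
    x_norm = math.sqrt(sum(a * a for a in x))
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, dot / (w_norm * x_norm)))
    return w_norm, x_norm, math.acos(cos_theta)


w_norm, x_norm, theta = decompose_logit([3.0, 4.0], [2.0, 0.0])
raw_logit = w_norm * x_norm * math.cos(theta)   # equals the dot product w . x
normalized_logit = x_norm * math.cos(theta)     # logit once ||w|| is fixed to 1
print(raw_logit, normalized_logit)
```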
Feature Normalization
Studies show that high‑quality frontal faces produce larger feature L2‑norms, while blurred faces yield smaller norms. Normalizing features and scaling them before softmax concentrates the distribution, strengthening gradients for low‑quality samples.
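A small sketch of feature normalization followed by scaling is shown below. It illustrates that two features pointing in the same direction but with different norms (standing in for a sharp and a blurred capture of the same face) produce identical logits once normalized. The scale value of 30 and the function names are assumptions chosen for illustration.

```python
import math


def l2_normalize(v):
    """Rescale a vector to unit L2 norm."""
    n = math.sqrt(sum(a * a for a in v))
    return [a / n for a in v]


def scaled_cosine_logits(feature, class_weights, scale=30.0):
    """Logits s * cos(theta_j) from a normalized feature and normalized weights."""
    f = l2_normalize(feature)
    return [scale * sum(a * b for a, b in zip(f, l2_normalize(w)))
            for w in class_weights]


class_weights = [[1.0, 0.0], [0.0, 1.0]]
sharp = [4.0, 3.0]     # large norm, as reported for high-quality faces
blurred = [0.4, 0.3]   # same direction, small norm
print(scaled_cosine_logits(sharp, class_weights))
print(scaled_cosine_logits(blurred, class_weights))
```

Because the norm no longer affects the logits, the network cannot lower the loss simply by growing the norm of easy samples, and low-norm samples contribute on equal terms.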
Angular Triplet Loss
Traditional triplet loss operates in Euclidean space, which can degrade performance after angular margin learning. Suning replaces the Euclidean distance constraint Dist(A,P)+margin<Dist(A,N) with an angular version angle(A,P)+margin<angle(A,N), preserving the learned feature distribution.
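The angular constraint above can be turned into a hinge loss in the same way as the Euclidean version. The following is a minimal sketch; the margin of 0.3 radians is a hypothetical value, not Suning's setting.

```python
import math


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norms)))


def angular_triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on angle(A, P) + margin < angle(A, N); zero when satisfied."""
    return max(0.0, angle(anchor, positive) + margin - angle(anchor, negative))


anchor, positive = [1.0, 0.0], [1.0, 0.0]
print(angular_triplet_loss(anchor, positive, [0.0, 1.0]))  # negative far away
print(angular_triplet_loss(anchor, positive, [1.0, 0.1]))  # negative too close
```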
Hard‑example mining considers low‑quality images, under‑trained samples, and samples that are close to other classes, often reflected by small feature norms or small inter‑class angles.
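Two of these signals are straightforward to compute. The sketch below picks the different-class sample with the smallest angle to an anchor and flags samples whose feature norm, taken before any normalization layer, falls below a threshold. The function names and the threshold are hypothetical, and this is only one possible selection rule.

```python
import math


def angle(u, v):
    """Angle in radians between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.acos(max(-1.0, min(1.0, dot / norms)))


def hardest_negative(anchor, anchor_label, samples):
    """samples: list of (feature, label). Returns the sample from another
    class that has the smallest angle to the anchor."""
    negatives = [s for s in samples if s[1] != anchor_label]
    return min(negatives, key=lambda s: angle(anchor, s[0]))


def low_norm_indices(features, norm_threshold):
    """Indices of features whose L2 norm is below norm_threshold."""
    return [i for i, f in enumerate(features)
            if math.sqrt(sum(a * a for a in f)) < norm_threshold]


samples = [([1.0, 0.1], "a"), ([0.9, 0.2], "b"), ([0.0, 1.0], "c")]
print(hardest_negative([1.0, 0.0], "a", samples))
print(low_norm_indices([[3.0, 4.0], [0.3, 0.4]], norm_threshold=1.0))
```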
Data Acquisition and Cleaning
Public datasets (MS‑Celeb‑1M, VGGFace2, CASIA‑WebFace) are noisy and biased toward Western faces. Suning built an Asian‑focused face database to improve performance in its market.
Cleaning pipeline:
1. Coarse filtering with simple classifiers to remove obvious noise.
2. Clustering based on intra‑class compactness to merge duplicate identities.
3. Model‑based prediction with confidence thresholds.
4. Manual verification of the filtered set before fine‑tuning.
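Two of the steps above, merging duplicate identities and filtering by prediction confidence, can be sketched as follows. The merge rule here is a simplified stand-in for the article's clustering: identities are merged when the cosine similarity between their mean feature vectors reaches a threshold. All names and threshold values are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u))
                  * math.sqrt(sum(b * b for b in v)))


def merge_duplicate_identities(centers, threshold=0.7):
    """centers: dict of identity -> mean feature vector.
    Returns a dict mapping each identity to a canonical identity."""
    ids = sorted(centers)
    parent = {i: i for i in ids}

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for pos, a in enumerate(ids):
        for b in ids[pos + 1:]:
            if cosine(centers[a], centers[b]) >= threshold:
                ra, rb = find(a), find(b)
                if ra != rb:
                    parent[max(ra, rb)] = min(ra, rb)
    return {i: find(i) for i in ids}


def keep_confident(predictions, min_confidence=0.9):
    """predictions: list of (image_id, predicted_id, labeled_id, confidence).
    Keeps images whose label is confirmed by a confident prediction."""
    return [img for img, pred, label, conf in predictions
            if pred == label and conf >= min_confidence]


centers = {"id_1": [1.0, 0.0], "id_2": [0.99, 0.05], "id_3": [0.0, 1.0]}
print(merge_duplicate_identities(centers))
```

Images that fail the confidence filter would then go to the manual verification step rather than being discarded outright.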
Future Directions
To address challenges such as spoofing and extreme or low‑light conditions, Suning is researching 3D structured‑light face recognition using VCSEL (vertical‑cavity surface‑emitting laser) sensors. These provide depth information that is robust to illumination changes and helps resist spoofing attacks.
References
Guo Y, Zhang L. One‑shot face recognition by promoting under‑represented classes. arXiv:1707.05574, 2017.
Parde CJ, Castillo C, Hill MQ, et al. Deep Convolutional Neural Network Features and the Original Image. arXiv:1611.01751, 2016.
Ranjan R, Castillo CD, Chellappa R. L2‑constrained softmax loss for discriminative face verification. arXiv:1703.09507, 2017.
Wang F, Liu W, Liu H, et al. Additive Margin Softmax for Face Verification. arXiv:1801.05599, 2018.
Deng J, Guo J, Zafeiriou S. ArcFace: Additive Angular Margin Loss for Deep Face Recognition. arXiv:1801.07698, 2018.
Wu X, He R, Sun Z, et al. A Light CNN for Deep Face Representation with Noisy Labels. arXiv:1511.02683, 2015.
Suning Technology
Official Suning Technology account. Explains cutting-edge retail technology and shares Suning's tech practices.
