How Deep Learning Transformed Face Recognition: From Images to Real‑Time Video
This article surveys the evolution of face recognition from early statistical methods to modern deep‑learning approaches, outlines key researchers, open‑source projects, popular APIs, core processing steps, the DeepFace architecture, datasets, and experimental results, providing a comprehensive guide for practitioners and researchers.
Traditional face‑recognition systems focused on image capture, preprocessing, identity verification, and search. Recent advances have extended the technology to driver monitoring, pedestrian tracking, and real‑time video processing. Methods have shifted from statistical approaches such as PCA to deep‑learning models such as CNNs and R‑CNN, and interest in 3‑D face recognition is growing.
Key Researchers
Prof. Shan Shiguang (Institute of Computing Technology, Chinese Academy of Sciences)
Prof. Li Ziqing (Biometrics Research Institute, Chinese Academy of Sciences)
Prof. Su Guangda (Tsinghua University)
Prof. Tang Xiaoou (The Chinese University of Hong Kong)
Ross B. Girshick
Major Open‑Source Projects
SeetaFace Engine – a BSD‑2 licensed C++ face‑recognition engine developed by the Chinese Academy of Sciences. https://github.com/seetaface/SeetaFaceEngine
Popular APIs/SDKs
Face++ – a cloud service offering free face detection, recognition, and attribute analysis, backed by Megvii Technology.
Skybiometry – provides face detection, recognition, and grouping services.
Common Face Image Datasets
Publicly available datasets include LFW (Labeled Faces in the Wild) and YTF (YouTube Faces). LFW is the primary benchmark for image‑based recognition, where reported accuracy has reached 99%.
Face‑Recognition Process
The pipeline consists of four major stages: face detection, face alignment, face verification, and face identification.
Face Detection : Detect faces in an image and draw bounding boxes; OpenCV provides Haar cascades based on the Viola‑Jones algorithm.
Face Alignment : Correct pose using 2D or 3D alignment; 3D alignment leverages 67 facial landmarks and Delaunay triangulation to produce a frontal view.
Face Verification : Pair‑matching to decide whether two faces belong to the same person, often used in small‑office access control systems.
Face Identification : Classify a detected and aligned face into one of many known identities, typically using a deep neural network.
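The difference between verification (1:1) and identification (1:N) can be sketched in a few lines. This is a minimal illustration, not the method of any particular system: the Euclidean distance, the 0.6 threshold, and the toy three‑dimensional features are all assumptions chosen for readability.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify(feat_a, feat_b, threshold=0.6):
    """1:1 verification: True if the two faces are judged to be the same person.

    The threshold is an assumed value; in practice it is tuned on held-out pairs.
    """
    return euclidean(feat_a, feat_b) < threshold

def identify(probe, gallery):
    """1:N identification: return the gallery name whose feature is closest."""
    return min(gallery, key=lambda name: euclidean(probe, gallery[name]))

# Toy 3-D features (hypothetical); real systems use far more dimensions.
gallery = {"alice": [0.9, 0.1, 0.0], "bob": [0.1, 0.8, 0.3]}
probe = [0.85, 0.15, 0.05]

is_alice = verify(probe, gallery["alice"])   # pair-matching decision
best_match = identify(probe, gallery)        # closest known identity
```

Verification answers a yes/no question about one pair, while identification searches the whole gallery, so its cost and error rate grow with the number of enrolled identities.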
Face‑Recognition Categories
Current methods are divided into three categories: image‑based, video‑based, and 3‑D face recognition.
DeepFace
DeepFace, introduced by Facebook, paved the way for subsequent models such as DeepID and FaceNet. It demonstrates how deep learning can achieve near‑human performance in face verification.
1. DeepFace Basic Framework
1.1 Face‑Recognition Workflow
face detection → face alignment → face verification → face identification
1.2 Face Detection
Existing Techniques
Haar Classifier : Implemented in OpenCV, based on the Viola‑Jones algorithm.
AdaBoost Cascade : See the paper "Robust Real‑Time Face Detection" by Viola and Jones.
Method Used in DeepFace
Fiducial‑point detector using six landmarks (eye centers, nose tip, and three mouth points) learned via LBP features and SVR.
Select six reference points.
Learn their positions with LBP‑based SVR.
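To make the LBP feature concrete, here is a minimal sketch of the basic 3×3 local binary pattern and its histogram. It covers only the descriptor; the SVR step that regresses landmark positions from such histograms is not shown. The neighbour ordering and the `>=` comparison are one common convention, assumed here rather than taken from the DeepFace paper.

```python
def lbp_code(patch):
    """Return the 8-bit LBP code of a 3x3 grayscale patch (list of 3 rows)."""
    center = patch[1][1]
    # Neighbours visited clockwise, starting at the top-left corner.
    offsets = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]
    code = 0
    for bit, (r, c) in enumerate(offsets):
        if patch[r][c] >= center:   # neighbour at least as bright as the centre
            code |= 1 << bit
    return code

def lbp_histogram(image):
    """256-bin histogram of LBP codes over all interior pixels of a 2-D image."""
    hist = [0] * 256
    for r in range(1, len(image) - 1):
        for c in range(1, len(image[0]) - 1):
            patch = [row[c - 1:c + 2] for row in image[r - 1:r + 2]]
            hist[lbp_code(patch)] += 1
    return hist

# Hypothetical pixel intensities.
patch = [
    [6, 5, 2],
    [7, 6, 1],
    [9, 8, 7],
]
code = lbp_code(patch)
```

A regressor such as SVR would take histograms like the one returned by `lbp_histogram` as its input and predict the coordinates of each reference point.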
1.3 Face Alignment
2D Alignment : Crop, scale, rotate, and translate the detected face to six anchor locations.
3D Alignment :
Fit a 3‑D model to the 2‑D face using 67 landmarks.
Apply Delaunay triangulation and deform the mesh to obtain a frontal view.
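The 2D alignment step amounts to fitting a similarity transform (scale, rotation, translation) that moves detected landmarks onto fixed anchor positions. The sketch below is a simplification: it solves the transform exactly from two points (the eye centers) instead of fitting all six, and the pixel coordinates and anchor positions are hypothetical.

```python
import cmath
import math

def similarity_from_two_points(src, dst):
    """Solve z' = a*z + b (scale + rotation + translation) from two point pairs.

    Points are (x, y) tuples, treated as complex numbers x + iy.
    """
    s1, s2 = (complex(*p) for p in src)
    d1, d2 = (complex(*p) for p in dst)
    a = (d2 - d1) / (s2 - s1)   # |a| is the scale, arg(a) is the rotation
    b = d1 - a * s1             # translation
    return a, b

def transform_point(a, b, point):
    """Apply the similarity transform to one (x, y) point."""
    z = a * complex(*point) + b
    return (z.real, z.imag)

# Detected eye centres in a tilted face (hypothetical coordinates).
detected_eyes = [(40.0, 60.0), (80.0, 100.0)]
# Anchor positions for the eyes in the aligned crop (assumed values).
anchor_eyes = [(50.0, 60.0), (102.0, 60.0)]

a, b = similarity_from_two_points(detected_eyes, anchor_eyes)
scale = abs(a)
angle_deg = math.degrees(cmath.phase(a))
```

With more than two landmarks the same transform would be estimated by least squares rather than solved exactly. The 3D step goes further by warping each triangle of the mesh separately, which a single global transform cannot express.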
1.4 Face Representation (Verification)
Existing Techniques
LBP & Joint Bayesian : Combine high‑dimensional LBP features with Joint Bayesian modeling.
DeepID Series : Fuse seven Joint Bayesian models with an SVM to reach 99.15% accuracy on LFW.
Method Used in DeepFace
The network is trained on a multi‑class face‑recognition task. After 3‑D alignment, images are resized to 152×152×3 and fed into the following architecture:
Conv1: 32 filters of size 11×11×3
Max‑pooling: 3×3, stride 2
Conv2: 16 filters of size 9×9×16
Local‑Conv layers (non‑shared weights) with sizes 9×9, 7×7, 5×5 (16 each)
Fully‑connected (F7): 4096 units
Fully‑connected (F8) with softmax: 4030 units, one per identity
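A short calculation shows why locally connected layers are so much heavier than ordinary convolutions. The sketch assumes no padding and stride 1. The Conv1 numbers come from the list above, while the 63×63 input map for the local layer is an assumed value used only for illustration.

```python
def conv_output_size(in_size, kernel, stride=1):
    """Spatial output size of a convolution without padding."""
    return (in_size - kernel) // stride + 1

def shared_conv_weights(kernel, in_ch, out_ch):
    """Weights in an ordinary convolution: one filter bank reused at every location."""
    return kernel * kernel * in_ch * out_ch

def local_conv_weights(out_size, kernel, in_ch, out_ch):
    """Weights in a locally connected layer: a separate filter bank per output location."""
    return out_size * out_size * shared_conv_weights(kernel, in_ch, out_ch)

# Conv1 as listed above: 152x152x3 input, 32 filters of 11x11x3.
c1_out = conv_output_size(152, 11)
c1_weights = shared_conv_weights(11, 3, 32)

# Hypothetical locally connected layer: 9x9 filters, 16 -> 16 channels,
# applied to an assumed 63x63 input map.
local_out = conv_output_size(63, 9)
local_weights = local_conv_weights(local_out, 9, 16, 16)
```

Under these assumptions Conv1 needs about twelve thousand weights, while a single local layer needs tens of millions. Such layers are only practical when a large labeled training set is available.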
The locally connected layers capture region‑specific features, since different regions of an aligned face have different local statistics. The two final fully‑connected layers (F7, F8) learn correlations between distant parts of the face, such as eye‑mouth relationships. The output of F8 passes through a K‑way softmax that yields a probability for each identity.
Training minimizes cross‑entropy loss via stochastic gradient descent; ReLU activation yields sparse top‑layer features (≈75% zeros) and dropout is applied to F7.
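The loss and activation named above can be written out directly. This is a generic sketch of softmax cross‑entropy and ReLU in plain Python, not DeepFace's training code; the logits and activations are made‑up numbers.

```python
import math

def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(logits)                       # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(logits, true_class):
    """Negative log-probability assigned to the correct identity."""
    return -math.log(softmax(logits)[true_class])

def relu(values):
    """max(0, x) per component; negative activations become exactly zero."""
    return [max(0.0, v) for v in values]

logits = [2.0, 1.0, 0.1]                  # hypothetical scores for three identities
probs = softmax(logits)
loss = cross_entropy(logits, true_class=0)

activations = relu([-1.0, 0.5, -3.0, 2.0])
sparsity = activations.count(0.0) / len(activations)   # fraction of zeros
```

Because ReLU clamps every negative pre‑activation to zero, a large share of the components ends up exactly zero, which is where the sparsity of the top‑layer features comes from.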
Feature vectors are first normalized per dimension by the maximum value in the training set, then L2‑normalized.
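The two normalization steps can be sketched as follows. The tiny training set and the `eps` guard against division by zero are assumptions added for the example.

```python
import math

def max_normalize(feature, train_max, eps=1e-12):
    """Divide each component by the largest value seen for it in the training set."""
    return [x / max(m, eps) for x, m in zip(feature, train_max)]

def l2_normalize(feature, eps=1e-12):
    """Scale the vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in feature))
    return [x / max(norm, eps) for x in feature]

# Hypothetical training features (two samples, three dimensions).
train_features = [[0.0, 2.0, 4.0], [3.0, 0.0, 8.0]]
train_max = [max(col) for col in zip(*train_features)]   # per-dimension maximum

feature = [3.0, 0.0, 4.0]
scaled = max_normalize(feature, train_max)   # every component now lies in [0, 1]
unit = l2_normalize(scaled)                  # unit-length feature vector
```

The first step puts all dimensions on a comparable [0, 1] scale, and the second removes differences in overall magnitude between images.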
2. Verification Metric
2.1 Chi‑Square Distance
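DeepFace features share properties with histogram‑based descriptors: all values are non‑negative, sparse, and lie in [0, 1]. A weighted chi‑square distance is therefore a natural similarity measure: χ²(f1, f2) = Σ_i w_i (f1[i] − f2[i])² / (f1[i] + f2[i]), where f1 and f2 are the two feature vectors and the weights w_i are learned with a linear SVM.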
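A minimal sketch of a weighted chi‑square distance between two non‑negative feature vectors follows. The uniform default weights and the rule of skipping dimensions where both values are zero are assumptions for the example; in a trained system the weights would be learned.

```python
def chi_square_distance(f1, f2, weights=None, eps=1e-12):
    """Weighted chi-square distance: sum_i w_i * (f1[i] - f2[i])**2 / (f1[i] + f2[i])."""
    if weights is None:
        weights = [1.0] * len(f1)         # assumed default: all dimensions count equally
    total = 0.0
    for w, a, b in zip(weights, f1, f2):
        denom = a + b
        if denom > eps:                   # skip dimensions where both values are zero
            total += w * (a - b) ** 2 / denom
    return total

# Hypothetical sparse, non-negative features in [0, 1].
f1 = [0.5, 0.0, 0.25]
f2 = [0.5, 0.0, 0.75]

d_uniform = chi_square_distance(f1, f2)
d_weighted = chi_square_distance(f1, f2, weights=[1.0, 1.0, 2.0])
```

Dividing by the sum of the two values makes a given absolute difference count for more when both values are small, which suits sparse features where most components are near zero.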
2.2 Siamese Network
After training, the network processes two input images, computes the absolute difference of their feature vectors, and feeds it to a fully‑connected layer that outputs a binary same/different decision.
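The decision head described above can be sketched as an absolute difference followed by a single logistic unit. The weights and bias here are made‑up values standing in for learned parameters, and the features are toy vectors.

```python
import math

def siamese_same_probability(f1, f2, weights, bias):
    """Probability that two feature vectors belong to the same person.

    Takes the element-wise absolute difference, then applies one logistic unit.
    """
    diff = [abs(a - b) for a, b in zip(f1, f2)]
    logit = sum(w * d for w, d in zip(weights, diff)) + bias
    return 1.0 / (1.0 + math.exp(-logit))

# Hypothetical parameters: negative weights so that larger differences
# push the output towards "different".
weights = [-4.0, -4.0, -4.0]
bias = 2.0

p_same = siamese_same_probability([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], weights, bias)
p_diff = siamese_same_probability([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], weights, bias)
```

Because both images pass through the same feature extractor, only this small head has to be trained for the pair‑matching task.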
3. Experimental Evaluation
3.1 Datasets
Social Face Classification (SFC): 4.4 M faces, 4030 identities
LFW: 13 233 faces, 5749 identities (restricted, unrestricted, unsupervised splits)
YouTube Faces (YTF): 3425 videos, 1595 identities
[Figure: results on LFW]
[Figure: results on YTF]
DeepFace differs from later methods by aligning faces before feeding them to the CNN, which stabilizes feature locations and enables effective convolutional learning.
GitHub source code: https://github.com/ageitgey/face_recognition#face-recognition
MaGe Linux Operations
