
Face Recognition with OpenCV, Python, and Deep Learning

This tutorial explains how to implement high‑accuracy face recognition using OpenCV, Python, and deep learning by leveraging dlib's deep metric learning, creating a custom dataset, encoding facial embeddings, and performing real‑time identification on images and video streams.


Face ID has sparked a surge in facial recognition research, and this guide shows how to use OpenCV, Python, and deep learning to build a real‑time, high‑accuracy system based on deep metric learning.

Deep metric learning trains a network to output a 128‑dimensional embedding for each face; during training, triplets of images (two of the same person and one of a different person) are used to pull matching embeddings together and push non‑matching ones apart.
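The triplet idea can be sketched in a few lines of plain Python. This is an illustrative toy, not dlib's actual loss implementation: `euclidean` and `triplet_loss` are hypothetical helpers, and the vectors stand in for real 128-dimensional embeddings.

```python
import math

def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Hinge-style triplet loss: zero when the matching pair is already
    at least `margin` closer than the non-matching pair."""
    d_pos = euclidean(anchor, positive)   # same-person distance
    d_neg = euclidean(anchor, negative)   # different-person distance
    return max(0.0, d_pos - d_neg + margin)

# Toy embeddings: the positive is near the anchor, the negative is far away
a = [0.0, 0.0, 1.0]
p = [0.1, 0.0, 1.0]
n = [1.0, 1.0, 0.0]
print(triplet_loss(a, p, n))  # -> 0.0 (the margin is already satisfied)
```

When the triplet violates the margin (e.g. swapping `p` and `n`), the loss is positive, and the gradient of that loss is what pulls matching embeddings together and pushes non-matching ones apart during training.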

First, install the required libraries in a virtual environment:

$ workon myenv
$ pip install dlib face_recognition imutils opencv-python

Optionally, compile dlib from source with CUDA support if you want GPU acceleration.

Create a dataset of facial images (e.g., characters from Jurassic Park) using the Bing Image Search API, resulting in a directory structure where each sub‑folder is a person’s name.
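With that layout, each image's label is simply its parent folder. A minimal sketch of the idea, assuming a hypothetical path such as `dataset/alan_grant/00000001.jpg`:

```python
import os

def label_from_path(image_path):
    """Derive the person's name from the image's parent folder,
    e.g. dataset/alan_grant/00000001.jpg -> 'alan_grant'."""
    return image_path.split(os.path.sep)[-2]

example = os.path.join("dataset", "alan_grant", "00000001.jpg")
print(label_from_path(example))  # -> alan_grant
```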

Encode the faces with encode_faces.py:

# import the necessary packages
from imutils import paths
import face_recognition
import argparse
import pickle
import cv2
import os

# parse command-line arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--dataset", required=True)
ap.add_argument("-e", "--encodings", required=True)
args = vars(ap.parse_args())

# load images, compute embeddings, and serialize
...

The script loads each image, detects faces, computes the 128‑dim embeddings, and saves them to a pickle file.
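The serialization step can be sketched with toy data. The dict keys and the pickle round-trip mirror what the later scripts expect; the encodings and names here are stand-ins, not real embeddings:

```python
import pickle

# Toy stand-ins for the 128-dim embeddings and labels built in the loop
known_encodings = [[0.1] * 128, [0.9] * 128]
known_names = ["alan_grant", "ellie_sattler"]

# Serialize everything the recognition scripts will need
data = {"encodings": known_encodings, "names": known_names}
with open("encodings.pickle", "wb") as f:
    f.write(pickle.dumps(data))

# The recognition scripts load it back the same way
restored = pickle.loads(open("encodings.pickle", "rb").read())
print(restored["names"])  # -> ['alan_grant', 'ellie_sattler']
```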

Recognize faces in a single image with recognize_faces_image.py:

# load encodings
data = pickle.loads(open(args["encodings"], "rb").read())
# load input image and convert to RGB
image = cv2.imread(args["image"])
rgb = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# detect faces and compute embeddings
boxes = face_recognition.face_locations(rgb, model=args["detection_method"])
encodings = face_recognition.face_encodings(rgb, boxes)
# compare with known encodings and draw boxes
...
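The elided comparison step boils down to a vote: each known encoding that matches the candidate face credits its associated name, and the most-voted name wins. A sketch of that logic, where `name_from_matches` is a hypothetical helper and `matches` stands in for the boolean list returned by `face_recognition.compare_faces`:

```python
from collections import Counter

def name_from_matches(matches, known_names, unknown="Unknown"):
    """Each True in `matches` credits the corresponding known name;
    return the most-voted name, or `unknown` if nothing matched."""
    votes = Counter(name for hit, name in zip(matches, known_names) if hit)
    return votes.most_common(1)[0][0] if votes else unknown

# In the real script, `matches` comes from
# face_recognition.compare_faces(data["encodings"], encoding)
names = ["alan_grant", "alan_grant", "ellie_sattler"]
print(name_from_matches([True, True, False], names))    # -> alan_grant
print(name_from_matches([False, False, False], names))  # -> Unknown
```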

For real‑time video, recognize_faces_video.py streams frames from a webcam, resizes them, detects faces, matches embeddings, draws bounding boxes with names, optionally writes the output video, and displays the stream:

# import the necessary packages
from imutils.video import VideoStream
import face_recognition
import imutils
import cv2

# start the video stream
vs = VideoStream(src=0).start()
while True:
    frame = vs.read()
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    rgb = imutils.resize(rgb, width=750)
    boxes = face_recognition.face_locations(rgb, model=args["detection_method"])
    encodings = face_recognition.face_encodings(rgb, boxes)
    # match and draw
    ...
    if args["display"] > 0:
        cv2.imshow("Frame", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break

# clean up
cv2.destroyAllWindows()
vs.stop()
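Note that detection runs on the resized RGB frame while the boxes are drawn on the full-size frame, so the coordinates must be scaled back by the ratio of the two widths. A sketch of that step, where `rescale_box` is a hypothetical helper operating on the `(top, right, bottom, left)` tuples returned by `face_locations`:

```python
def rescale_box(box, ratio):
    """Scale a (top, right, bottom, left) box detected on the resized
    frame back to the original frame's coordinate system."""
    top, right, bottom, left = box
    return (int(top * ratio), int(right * ratio),
            int(bottom * ratio), int(left * ratio))

# e.g. detection ran at width 750 on a 1500-px-wide original: ratio = 2.0
print(rescale_box((50, 300, 150, 100), 2.0))  # -> (100, 600, 300, 200)
```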

Performance notes: the CNN detector requires a GPU for real‑time speeds; on CPU it runs slowly (<0.5 FPS). On a Raspberry Pi you must use the faster HOG detector or OpenCV’s Haar cascades, but even then the frame rate is limited to 1–2 FPS.

In summary, by combining OpenCV, Python, dlib, and the face_recognition library you can build an accurate, GPU‑accelerated facial recognition pipeline that works on images, live video, and video files.

Tags: computer vision, Python, deep learning, face recognition, OpenCV, dlib, face_recognition
Written by

Qunar Tech Salon

Qunar Tech Salon is a learning and exchange platform for Qunar engineers and industry peers. We share cutting-edge technology trends and topics, providing a free platform for mid-to-senior technical professionals to exchange and learn.
