Face Detection with OpenCV and Dlib in Python
This tutorial builds a face‑detection pipeline in Python with OpenCV and Dlib. It covers three approaches: Haar cascade classifiers (for faces, eyes, and smiles), Dlib's Histogram of Oriented Gradients (HOG) detector, and Dlib's convolutional neural network (CNN) detector, with step‑by‑step code for both static images and real‑time video streams.
Installation
First install the required libraries:
pip install opencv-python
pip install dlib
Locate the pre‑trained model files (paths may vary by Python version):
/usr/local/lib/python3.7/site-packages/cv2/data
Haar Cascade Detection
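Hardcoded site-packages paths tend to break from one machine to the next. As a minimal sketch, the helper below (a hypothetical `resolve_cascades`, not part of OpenCV) checks that each expected XML file exists in a given directory before any classifier is built. With the opencv-python wheel, `cv2.data.haarcascades` should point to that directory, but this is an assumption worth verifying on your own installation.

```python
import os

# File names used later in this tutorial.
CASCADE_FILES = {
    "face": "haarcascade_frontalface_default.xml",
    "eye": "haarcascade_eye.xml",
    "smile": "haarcascade_smile.xml",
}

def resolve_cascades(data_dir, files=CASCADE_FILES):
    """Return {name: full_path}; raise FileNotFoundError listing missing files."""
    paths = {name: os.path.join(data_dir, fname) for name, fname in files.items()}
    missing = [p for p in paths.values() if not os.path.isfile(p)]
    if missing:
        raise FileNotFoundError("Missing cascade files: " + ", ".join(missing))
    return paths

# Example usage (requires opencv-python; shown as comments so the
# helper itself runs without OpenCV installed):
#   import cv2
#   paths = resolve_cascades(cv2.data.haarcascades)
#   faceCascade = cv2.CascadeClassifier(paths["face"])
```

Failing early with a clear message is easier to debug than a classifier that silently loads nothing.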
Load the cascade XML files and create classifiers:
import cv2

cascPath = "/usr/local/lib/python3.7/site-packages/cv2/data/haarcascade_frontalface_default.xml"
eyePath = "/usr/local/lib/python3.7/site-packages/cv2/data/haarcascade_eye.xml"
smilePath = "/usr/local/lib/python3.7/site-packages/cv2/data/haarcascade_smile.xml"
# A wrong path does not raise an error; faceCascade.empty() returns True if loading failed.
faceCascade = cv2.CascadeClassifier(cascPath)
eyeCascade = cv2.CascadeClassifier(eyePath)
smileCascade = cv2.CascadeClassifier(smilePath)
Detect faces in a grayscale image and draw rectangles:
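# gray is a single-channel image, e.g. gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = faceCascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, flags=cv2.CASCADE_SCALE_IMAGE)
for (x, y, w, h) in faces:
    cv2.rectangle(gray, (x, y), (x+w, y+h), (255, 255, 255), 3)
The same pattern is used for eyes and smiles, adjusting scaleFactor and minNeighbors as needed. scaleFactor controls how much the image is shrunk between scales (smaller values search more scales but run slower), and minNeighbors sets how many overlapping detections a candidate needs before it is kept (higher values reduce false positives).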
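To get a feel for `scaleFactor`, the sketch below lists the window sizes a multi-scale search would visit. It is a simplified model, assuming a 24x24 base window (an assumption about the frontal-face cascade) that grows by `scaleFactor` each step until it no longer fits in the image. OpenCV's actual implementation has further details, such as `minSize` and `maxSize`, that are ignored here, and `window_sizes` is a hypothetical helper.

```python
def window_sizes(image_w, image_h, scale_factor=1.1, base=24):
    """List the square window sizes visited by a simplified multi-scale scan."""
    if scale_factor <= 1.0:
        raise ValueError("scale_factor must be greater than 1")
    sizes = []
    factor = 1.0
    while True:
        size = int(round(base * factor))
        if size > min(image_w, image_h):
            break  # the window no longer fits inside the image
        sizes.append(size)
        factor *= scale_factor
    return sizes

coarse = window_sizes(640, 480, scale_factor=1.3)
fine = window_sizes(640, 480, scale_factor=1.1)
# A smaller scale factor visits more scales, which costs more time per image.
print(len(coarse), len(fine))
```

This is why a value close to 1 tends to find more faces at the price of speed, while a larger value does the opposite.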
Haar Feature Theory
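Each Haar feature is the difference between the pixel sums of adjacent rectangular regions. With an integral image, any rectangle sum takes four lookups regardless of the rectangle's size, which is what lets the cascade evaluate a very large number of candidate windows quickly.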
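The idea can be sketched in plain Python. This is an illustration of integral images and one two-rectangle feature, not OpenCV's implementation; the function names and the 4x4 patch are made up for the example.

```python
def integral_image(img):
    """Summed-area table with an extra zero row and column.

    ii[y][x] holds the sum of all pixels above and to the left of (x, y).
    """
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii

def rect_sum(ii, x, y, w, h):
    """Sum of the rectangle with top-left corner (x, y), using four lookups."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]

def two_rect_feature(ii, x, y, w, h):
    """Horizontal two-rectangle feature: left half minus right half (w even)."""
    half = w // 2
    return rect_sum(ii, x, y, half, h) - rect_sum(ii, x + half, y, half, h)

# A 4x4 patch that is bright on the left and dark on the right.
patch = [
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
    [9, 9, 1, 1],
]
ii = integral_image(patch)
print(rect_sum(ii, 0, 0, 4, 4))          # sum of all pixels: 80
print(two_rect_feature(ii, 0, 0, 4, 4))  # 72 - 8 = 64
```

A large feature value signals a strong vertical edge, the kind of contrast pattern a cascade stage thresholds on.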
Dlib HOG Detector
Initialize the HOG‑based detector and apply it to a grayscale frame:
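import dlib
from imutils import face_utils  # rect_to_bb comes from the imutils package (pip install imutils)

face_detect = dlib.get_frontal_face_detector()
rects = face_detect(gray, 1)  # second argument: upsample the image once to find smaller faces
for rect in rects:
    (x, y, w, h) = face_utils.rect_to_bb(rect)
    cv2.rectangle(gray, (x, y), (x+w, y+h), (255, 255, 255), 3)
The HOG approach is fast and works well for moderate‑size faces.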
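Dlib reports rectangles as corner coordinates, while the drawing code above works with (x, y, w, h). The sketch below mirrors the conversion that `face_utils.rect_to_bb` is understood to perform. It uses a small stand-in class (`FakeRect`, hypothetical) so it runs without Dlib or imutils installed.

```python
class FakeRect:
    """Stand-in exposing the same accessor methods as a dlib rectangle."""

    def __init__(self, left, top, right, bottom):
        self._left, self._top = left, top
        self._right, self._bottom = right, bottom

    def left(self):
        return self._left

    def top(self):
        return self._top

    def right(self):
        return self._right

    def bottom(self):
        return self._bottom

def rect_to_bb(rect):
    """Convert corner coordinates to an (x, y, w, h) bounding box."""
    x = rect.left()
    y = rect.top()
    w = rect.right() - x
    h = rect.bottom() - y
    return (x, y, w, h)

def bb_to_corners(x, y, w, h):
    """Inverse conversion, handy when a function expects two corner points."""
    return (x, y, x + w, y + h)

print(rect_to_bb(FakeRect(10, 20, 110, 140)))  # (10, 20, 100, 120)
```

Keeping both conversions around avoids off-by-width mistakes when mixing OpenCV and Dlib boxes in one script.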
Dlib CNN Detector
Download the pre‑trained CNN model (mmod_human_face_detector.dat) and load it:
dnnFaceDetector = dlib.cnn_face_detection_model_v1("mmod_human_face_detector.dat")
rects = dnnFaceDetector(gray, 1)
for rect in rects:
    # Each CNN detection wraps its bounding box in the .rect attribute.
    x1 = rect.rect.left()
    y1 = rect.rect.top()
    x2 = rect.rect.right()
    y2 = rect.rect.bottom()
    cv2.rectangle(gray, (x1, y1), (x2, y2), (255, 255, 255), 3)
The CNN detector is the most accurate of the three, especially for small or occluded faces, at the cost of slower processing.
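Each CNN detection also carries a `confidence` score next to its `rect`, and Dlib boxes can extend past the image border. Below is a minimal post-processing sketch, assuming a hypothetical threshold of 0.5 and plain tuples in place of Dlib objects; `clamp_box` and `keep_confident` are illustrative names, not Dlib functions.

```python
def clamp_box(x1, y1, x2, y2, width, height):
    """Clip corner coordinates so the box stays inside the image."""
    x1 = max(0, min(x1, width - 1))
    y1 = max(0, min(y1, height - 1))
    x2 = max(0, min(x2, width - 1))
    y2 = max(0, min(y2, height - 1))
    return (x1, y1, x2, y2)

def keep_confident(detections, width, height, min_conf=0.5):
    """Filter (x1, y1, x2, y2, confidence) tuples and clamp the survivors."""
    kept = []
    for x1, y1, x2, y2, conf in detections:
        if conf < min_conf:
            continue  # drop weak detections
        kept.append(clamp_box(x1, y1, x2, y2, width, height) + (conf,))
    return kept

# Made-up detections for a 640x480 image.
raw = [(-5, 10, 50, 700, 0.9), (0, 0, 10, 10, 0.2)]
print(keep_confident(raw, 640, 480))  # [(0, 10, 50, 479, 0.9)]
```

With Dlib, the input tuples could be built as `(d.rect.left(), d.rect.top(), d.rect.right(), d.rect.bottom(), d.confidence)` for each detection `d`. The right threshold depends on your data and is worth tuning.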
Real‑Time Video Detection
Capture video from the webcam, convert each frame to grayscale, run the chosen detector, draw bounding boxes, and display the result. Press q to quit:
import cv2

# Assumes faceCascade was created as shown in the Haar cascade section.
video_capture = cv2.VideoCapture(0)
while True:
    ret, frame = video_capture.read()
    if not ret:  # no frame returned (camera unavailable or stream ended)
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    rects = faceCascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5, flags=cv2.CASCADE_SCALE_IMAGE)
    for (x, y, w, h) in rects:
        cv2.rectangle(frame, (x, y), (x+w, y+h), (0, 255, 0), 2)
    cv2.imshow('Video', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
video_capture.release()
cv2.destroyAllWindows()
Model Selection
When speed is critical, the HOG detector is the fastest; the Haar cascade offers comparable speed with slightly lower accuracy, while the CNN yields the best detection rates but is slower. Choose based on your real‑time requirements.
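Relative speed depends on hardware, image size, and detector parameters, so it is worth measuring on your own frames. The helper below is a rough sketch: `time_detector` and `compare` are hypothetical names, and each detector is any callable that takes a frame, such as a small wrapper around one of the detectors above.

```python
import time

def time_detector(detect, frames, warmup=1):
    """Return the average seconds per frame for a detector callable."""
    frames = list(frames)
    if not frames:
        raise ValueError("frames must not be empty")
    for frame in frames[:warmup]:
        detect(frame)  # warm-up calls are not timed
    start = time.perf_counter()
    for frame in frames:
        detect(frame)
    elapsed = time.perf_counter() - start
    return elapsed / len(frames)

def compare(detectors, frames):
    """detectors: {name: callable}. Returns [(name, avg_seconds)], fastest first."""
    frames = list(frames)
    results = [(name, time_detector(fn, frames)) for name, fn in detectors.items()]
    return sorted(results, key=lambda item: item[1])

# Example wiring with the objects defined earlier (shown as comments so the
# helper runs without OpenCV or Dlib installed):
#   detectors = {
#       "haar": lambda f: faceCascade.detectMultiScale(f, scaleFactor=1.1, minNeighbors=5),
#       "hog": lambda f: face_detect(f, 1),
#       "cnn": lambda f: dnnFaceDetector(f, 1),
#   }
#   for name, seconds in compare(detectors, gray_frames):
#       print(name, round(seconds * 1000, 1), "ms per frame")
```

Timing a few dozen representative frames is usually enough to see whether a detector fits a real-time budget.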