Building a Smart Face‑Recognition Attendance System with FastAPI and OpenCV
This article walks through the complete design and implementation of an intelligent attendance system built with the Python face_recognition library, OpenCV, FastAPI, SQLModel, and WebSocket communication. It covers face detection, face-encoding comparison, and liveness verification, with end-to-end code samples for the backend APIs, the database schema, and the interactive front-end pages.
6.1 Requirement and Design Analysis
The core functions of the smart attendance system are accurate face recognition and anti‑spoof liveness detection, together with a web interface for querying and exporting attendance data.
Overall Implementation Idea
Face data acquisition: the front-end can use the getUserMedia API together with a canvas element in JavaScript to capture webcam frames, or Python OpenCV can read the camera directly. Both approaches are described later.
Face recognition: face_recognition.face_locations() returns face bounding boxes; multiple faces are supported.
Face annotation: the detected face box is drawn on the video stream using OpenCV in Python or a DIV overlay in JavaScript.
Face comparison: each enrolled face is encoded into a 128‑dimensional vector with face_recognition.face_encodings(). Comparison uses face_recognition.compare_faces() with a configurable tolerance; a minimal sketch follows this list.
Data management: three MySQL tables (users, faces, checks) are accessed via SQLModel.
Liveness detection: eye‑blink, mouth‑open, nod and shake actions are derived from the 68 facial landmarks provided by face_recognition.face_landmarks().
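Before walking through the full implementation, the encoding-and-comparison step can be illustrated with a minimal sketch (the image file names are placeholders, not files from the project):
import face_recognition
# Enroll: encode a known face into a 128-dimensional vector
known_image = face_recognition.load_image_file("alice.jpg")
known_encoding = face_recognition.face_encodings(known_image)[0]
# Verify: encode the probe image and compare it against all enrolled encodings
probe_image = face_recognition.load_image_file("snapshot.jpg")
probe_encoding = face_recognition.face_encodings(probe_image)[0]
matches = face_recognition.compare_faces([known_encoding], probe_encoding, tolerance=0.5)
distances = face_recognition.face_distance([known_encoding], probe_encoding)
print(matches, distances)  # a smaller distance means a closer match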
6.2 Python Implementation
Face‑Recognition API Usage
# Install required libraries
pip install opencv-python dlib face_recognition
import face_recognition
import cv2
# Load an image (load_image_file returns an RGB numpy array)
image = face_recognition.load_image_file("../faces/conference-room.jpeg")
# Detect face locations: each location is a (top, right, bottom, left) tuple
locations = face_recognition.face_locations(image)
# Draw a rectangle around each face
for top, right, bottom, left in locations:
    cv2.rectangle(image, (left, top), (right, bottom), (0, 0, 255), thickness=2)
# Note: cv2.imshow expects BGR, so colors in an RGB array may look swapped
cv2.imshow("Face", image)
cv2.waitKey(0)
cv2.imwrite("../faces/conference-room-2.jpeg", image)
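face_locations() uses dlib's fast HOG detector by default. Where accuracy matters more than speed and a CUDA-enabled build of dlib is available, the CNN detector can be requested instead; a minimal sketch (optional, the rest of the article keeps the default):
# Upsample once and use the slower but more accurate CNN detector
locations = face_recognition.face_locations(image, number_of_times_to_upsample=1, model="cnn")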
OpenCV Camera Capture (opencv_capture.py)
import face_recognition, cv2, time
camera = cv2.VideoCapture(0)  # Default laptop camera
while True:
    ret, frame = camera.read()
    locations = face_recognition.face_locations(frame)
    if len(locations) > 0:
        print(f"Detected {len(locations)} face(s)")
        top, right, bottom, left = locations[0]
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), thickness=2)
    else:
        print("No face detected")
    cv2.imshow("Camera", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    time.sleep(0.1)
camera.release()
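One caveat: OpenCV delivers frames in BGR channel order, while face_recognition's models expect RGB. Detection often still works on BGR frames, but converting first is safer; a one-line sketch to apply before calling the detector:
# Convert the OpenCV BGR frame to RGB before handing it to face_recognition
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
locations = face_recognition.face_locations(rgb_frame)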
Face Comparison and Attendance Recording (face_compare.py)
import face_recognition, cv2, time, pyttsx3, numpy as np
from model import engine, Faces, Checks
from sqlmodel import Session, select, and_
from datetime import datetime, timedelta
def get_encodings():
    with Session(engine) as session:
        faces = select(Faces).where(Faces.facecode != None)
        results = session.execute(faces).all()
        user_encodings = []
        for r in results:
            # facecode was stored as str(encoding.tolist()), so eval() rebuilds the list
            user_encodings.append({"userid": r.Faces.userid, "facecode": eval(r.Faces.facecode)})
        return user_encodings
def insert_checks(userid):
    with Session(engine) as session:
        today = time.strftime("%Y-%m-%d")
        nowtime = time.strftime("%H:%M:%S")
        checks_date = select(Checks).where(and_(Checks.userid == userid, Checks.checkdate == today))
        result = session.execute(checks_date).first()
        if not result:
            # First recognition of the day: record the check-in time
            first = Checks(userid=userid, checkdate=today, checkstart=nowtime)
            session.add(first)
            session.commit()
            return "Checkin-OK"
        else:
            now_dt = datetime.strptime(nowtime, "%H:%M:%S")
            start_dt = datetime.strptime(str(result.Checks.checkstart), "%H:%M:%S")
            # Only allow a check-out at least 10 minutes after check-in
            status = (now_dt - start_dt).seconds >= 600
            if result.Checks.checkend:
                end_dt = datetime.strptime(str(result.Checks.checkend), "%H:%M:%S")
                # ... and at least 10 minutes after the previous check-out
                status = status and (now_dt - end_dt).seconds >= 600
            if status:
                result.Checks.checkend = nowtime
                result.Checks.hours = round((now_dt - start_dt).seconds / 3600, 1)
                session.commit()
                return "Checkin-OK"
            else:
                return "Checkin-Repeated"
    return "Checkin-NOK"
def check_faces(image):
    user_encodings = get_encodings()
    coding_list = [np.array(u["facecode"]) for u in user_encodings]
    try:
        # Encode the probe face and compare it against every enrolled encoding
        encoding = face_recognition.face_encodings(image, model='large')[0]
        match = face_recognition.compare_faces(coding_list, encoding, tolerance=0.5)
        if True in match:
            idx = match.index(True)
            userid = user_encodings[idx]["userid"]
            return insert_checks(userid)
    except Exception:
        pass
    return "Checkin-NOK"
def check_camera():
    cam = cv2.VideoCapture(0)
    while True:
        ret, frame = cam.read()
        locations = face_recognition.face_locations(frame)
        if len(locations) >= 1:
            print(f"Detected {len(locations)} face(s)")
            top, right, bottom, left = locations[0]
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), thickness=2)
            result = check_faces(frame)
            # Announce the result with text-to-speech
            if "Checkin-OK" in result:
                pyttsx3.speak("Attendance successful")
            elif "Checkin-Repeated" in result:
                pyttsx3.speak("Do not repeat attendance")
            else:
                pyttsx3.speak("Attendance failed")
        cv2.imshow("Camera", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        time.sleep(0.5)
    cam.release()
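get_encodings() above rebuilds each stored vector with eval(), which works because facecode is written as str(encoding.tolist()), but evaluating strings from the database is risky. A hedged alternative under the same schema is to serialize the encoding as JSON instead:
import json, numpy as np
# Store: serialize the 128-dimensional encoding as a JSON array string
facecode = json.dumps(encoding.tolist())
# Load: parse it back without eval()
vector = np.array(json.loads(facecode))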
Database Model (model.py)
from sqlmodel import create_engine, Field, SQLModel
from typing import Optional
from datetime import datetime, date, time
engine = create_engine("mysql+pymysql://root:[email protected]:3306/checkin")
class Users(SQLModel, table=True):
    userid: Optional[int] = Field(default=None, primary_key=True)
    username: str
    usersex: str
    department: str
    createtime: datetime
class Faces(SQLModel, table=True):
    faceid: Optional[int] = Field(default=None, primary_key=True)
    userid: Optional[int] = Field(default=None, foreign_key="users.userid")
    facecode: str
class Checks(SQLModel, table=True):
    checkid: Optional[int] = Field(default=None, primary_key=True)
    userid: Optional[int] = Field(default=None, foreign_key="users.userid")
    checkdate: date
    checkstart: time
    checkend: Optional[time] = None
    hours: Optional[float] = None
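If the MySQL tables do not exist yet, SQLModel can create them from these models. A small sketch, assuming it is run once as a standalone script:
# create_tables.py - run once to create the users, faces and checks tables
from sqlmodel import SQLModel
from model import engine  # importing model also registers the Users, Faces and Checks tables
SQLModel.metadata.create_all(engine)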
FastAPI Endpoints for User Query and Face Upload (main_1.py)
from fastapi import FastAPI, Request, Form, UploadFile, File
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles
import uvicorn
from model import engine, Users, Faces, Checks
from sqlmodel import Session, select
import face_recognition, cv2, numpy as np
app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")
@app.get('/user')
def user(request: Request):
    with Session(engine) as session:
        # Left-join Faces so users without an enrolled face still appear
        stmt = select(Users, Faces.facecode).join(Faces, isouter=True).order_by(Users.userid)
        results = session.execute(stmt).mappings().all()
        return templates.TemplateResponse(request=request, name="user.html", context={"results": results})
@app.post('/face/add')
def face_add(userid: int = Form(), file: UploadFile = File()):
    # Decode the uploaded image into an OpenCV array
    data = file.file.read()
    arr = np.frombuffer(data, np.uint8)
    img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
    locations = face_recognition.face_locations(img)
    if len(locations) == 0:
        return "Face-Not-Exist"
    # Encode the face and store the 128-dimensional vector as a string
    encoding = face_recognition.face_encodings(img, model='large')[0]
    encoding_str = str(encoding.tolist())
    with Session(engine) as session:
        face = Faces(userid=userid, facecode=encoding_str)
        session.add(face)
        session.commit()
        return "Face-Added"
if __name__ == '__main__':
    uvicorn.run(app)
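A quick way to exercise the /face/add endpoint without the browser page is the requests library; a hedged example (the user id and image path are placeholders):
# test_face_add.py - upload a face photo for user 1
import requests
with open("faces/alice.jpg", "rb") as f:
    resp = requests.post("http://127.0.0.1:8000/face/add", data={"userid": 1}, files={"file": f})
print(resp.text)  # "Face-Added" or "Face-Not-Exist"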
Front‑End Template for User List and Face Upload (user.html)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Upload Face and Link</title>
<style>
table {width: 1000px; margin: auto; border: 1px solid gray; border-spacing: 0;}
thead {font-weight: bold; height: 40px; background-color: lightseagreen;}
td {border: 1px solid gray; text-align: center; height: 35px;}
</style>
<script>
function browseImage(obj) {obj.parentNode.querySelector('input').click();}
function doChange(obj) {let filename = obj.parentNode.querySelector('input').files[0].name; obj.parentNode.querySelectorAll('button')[0].style.display='none'; obj.parentNode.querySelector('span').textContent = filename;}
function doAdd(obj, userid) {
let fileInput = obj.parentNode.querySelector('input');
let formData = new FormData();
formData.append('userid', userid);
formData.append('file', fileInput.files[0]);
fetch('/face/add', {method: 'POST', body: formData})
.then(r => r.text())
.then(d => {if (d.includes('Face-Added')) {alert('Face added successfully'); location.reload();} else if (d.includes('Face-Not-Exist')) {alert('No valid face in image');}})
.catch(err => console.log('Request failed', err));
}
</script>
</head>
<body>
<table>
<thead>
<tr><td width="15%">ID</td><td width="15%">Name</td><td width="15%">Gender</td><td width="15%">Dept</td><td width="15%">Face Status</td><td width="25%">Action</td></tr>
</thead>
<tbody>
{% for result in results %}
<tr>
<td>{{result.Users.userid}}</td>
<td>{{result.Users.username}}</td>
<td>{{result.Users.usersex}}</td>
<td>{{result.Users.department}}</td>
<td>{% if result.facecode == None %}Not Enrolled{% else %}Enrolled{% endif %}</td>
<td>
<input type="file" onchange="doChange(this)" style="display:none">
<span></span>
<button onclick="browseImage(this)">Browse</button>
<button onclick="doAdd(this, {{result.Users.userid}})">Upload</button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</body>
</html>
6.3 Web Front‑End Implementations
Video Capture Page (video_capture.html) – shows live webcam video and a button to capture a snapshot.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Capture Video Frame</title>
<script>
async function doStart() {
const devices = await navigator.mediaDevices.enumerateDevices();
const videoDevices = devices.filter(d => d.kind === 'videoinput');
const stream = await navigator.mediaDevices.getUserMedia({audio:false, video:{width:{ideal:640}, height:{ideal:360}}});
document.getElementById('preview').srcObject = stream;
}
function doCapture() {
const preview = document.getElementById('preview');
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
canvas.width = preview.videoWidth;
canvas.height = preview.videoHeight;
ctx.drawImage(preview, 0, 0, canvas.width, canvas.height);
const imgUrl = canvas.toDataURL('image/jpeg', 0.9);
const img = document.createElement('img');
img.src = imgUrl;
img.style.width = '144px';
img.style.height = '81px';
img.style.marginLeft = '15px';
document.getElementById('screenshot').appendChild(img);
}
document.addEventListener('DOMContentLoaded', doStart);
</script>
</head>
<body>
<video id="preview" autoplay muted width="640" height="360"></video><br>
<button onclick="doCapture()">Capture</button>
<div id="screenshot" style="margin-top:20px"></div>
</body>
</html>
WebSocket Real‑Time Detection (ws_detect.py)
import websockets, numpy as np, json, face_recognition, cv2, asyncio
async def detect_image(websocket):
    try:
        while True:
            # Receive a binary JPEG frame from the browser and decode it
            image_data = await websocket.recv()
            img_arr = np.frombuffer(image_data, np.uint8)
            frame = cv2.imdecode(img_arr, cv2.IMREAD_COLOR)
            locations = face_recognition.face_locations(frame)
            loc_list = []
            for loc in locations:
                top, right, bottom, left = loc
                x = left
                y = top
                w = right - left
                h = bottom - top
                loc_list.append({"x": x, "y": y, "width": w, "height": h})
            # Send the bounding boxes back as JSON
            await websocket.send(json.dumps(loc_list))
    except websockets.ConnectionClosed:
        print("WebSocket connection closed.")
    except Exception as e:
        print(f"Error: {e}")
async def main():
    server = await websockets.serve(detect_image, "localhost", 8765)
    print("WebSocket server started at ws://localhost:8765")
    await server.wait_closed()
if __name__ == "__main__":
    asyncio.run(main())
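Before wiring up the browser page, the server can be exercised from Python. A hedged test client that sends a single JPEG and prints the returned boxes (the image path is a placeholder):
# ws_client_test.py - send one frame to ws_detect.py and print the detected boxes
import asyncio, json, websockets
async def send_one_frame():
    async with websockets.connect("ws://localhost:8765") as ws:
        with open("faces/test.jpg", "rb") as f:
            await ws.send(f.read())
        print(json.loads(await ws.recv()))
asyncio.run(send_one_frame())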
Front‑End Page Using WebSocket (detect.html) – streams video, sends frames every 500 ms, and draws red rectangles at the positions returned by the server.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Face Detection and Annotation</title>
<script>
document.addEventListener('DOMContentLoaded', function(){
const preview = document.getElementById('preview');
const ws = new WebSocket('ws://localhost:8765');
ws.onopen = () => doStart();
ws.onmessage = (event) => {
const oldRects = document.getElementsByClassName('rectDiv');
while(oldRects.length) oldRects[0].remove();
const locations = JSON.parse(event.data);
const left = preview.offsetLeft;
const top = preview.offsetTop;
for (let i in locations) {
const r = document.createElement('div');
r.className = 'rectDiv';
r.style.width = locations[i].width + 'px';
r.style.height = locations[i].height + 'px';
r.style.position = 'fixed';
r.style.border = 'solid 1px red';
r.style.left = (locations[i].x + left) + 'px';
r.style.top = (locations[i].y + top) + 'px';
document.body.appendChild(r);
}
};
async function doStart(){
const stream = await navigator.mediaDevices.getUserMedia({audio:false, video:{width:{ideal:960}, height:{ideal:540}}});
preview.srcObject = stream;
// No MediaRecorder is needed: frames are captured from a canvas in doCapture()
setInterval(doCapture, 500);
}
function doCapture(){
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
canvas.width = preview.videoWidth;
canvas.height = preview.videoHeight;
ctx.drawImage(preview, 0, 0, canvas.width, canvas.height);
canvas.toBlob(blob => { if (ws && ws.readyState===WebSocket.OPEN) ws.send(blob); }, 'image/jpeg', 0.8);
}
});
</script>
</head>
<body>
<div id="main" style="border:1px solid red; width:962px; height:542px; margin:auto">
<video id="preview" autoplay muted width="960" height="540"></video>
</div>
</body>
</html>
6.4 Liveness Detection to Prevent Spoofing
The system uses the 68 facial landmarks to compute an eye-aspect ratio (EAR) for blink detection; the same landmark data could also drive a mouth-aspect ratio for open-mouth detection and geometric checks for nods and head shakes. In the implementation below, the coefficient of variation of the EAR (standard deviation divided by mean) over a short burst of frames decides whether a blink occurred, and therefore whether the face is live.
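Only the blink check is implemented here. As an illustration, an open-mouth check could be built the same way from the top_lip and bottom_lip landmark groups; a hedged sketch (the indices below pick the inner-lip midpoints and inner mouth corners and are assumptions, not part of the original code):
import numpy as np
def calc_mar(landmarks):
    # Mouth aspect ratio: vertical opening of the inner lips divided by the inner mouth width
    top = np.array(landmarks[0]['top_lip'])
    bottom = np.array(landmarks[0]['bottom_lip'])
    opening = np.linalg.norm(top[9] - bottom[9])   # inner upper lip vs inner lower lip midpoint
    width = np.linalg.norm(top[11] - top[7])       # left vs right inner mouth corner
    return opening / width                         # larger values mean the mouth is open wider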
EAR Calculation (livecheck.py)
import face_recognition, numpy as np
from face_compare import check_faces
def start_check(live_images):
    numbers = []
    for img in live_images:
        landmarks = face_recognition.face_landmarks(img)
        ear = calc_ear(landmarks)
        numbers.append(ear)
    # Coefficient of variation of the EAR: a blink makes the EAR dip, raising the spread
    ear_ratio = np.std(numbers) / np.average(numbers)
    print(ear_ratio)
    if ear_ratio > 0.2:
        # Use the frame with the widest-open eyes for face comparison
        idx = numbers.index(max(numbers))
        return check_faces(live_images[idx])
    else:
        return "Checkin-Not-Live"
def calc_ear(landmarks):
    left_eye = np.array(landmarks[0]['left_eye'])
    right_eye = np.array(landmarks[0]['right_eye'])
    left_ear = (np.linalg.norm(left_eye[1]-left_eye[5]) + np.linalg.norm(left_eye[2]-left_eye[4])) / (2 * np.linalg.norm(left_eye[0]-left_eye[3]))
    right_ear = (np.linalg.norm(right_eye[1]-right_eye[5]) + np.linalg.norm(right_eye[2]-right_eye[4])) / (2 * np.linalg.norm(right_eye[0]-right_eye[3]))
    return (left_ear + right_ear) / 2
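The page below posts the captured frames to /checkin/live, an endpoint that is not shown in the original listing. A hedged sketch of what it could look like when added to main_1.py (the route name follows the front-end code; the decoding logic is an assumption):
import base64
from fastapi import Body
from livecheck import start_check
@app.post('/checkin/live')
def checkin_live(images: list[str] = Body()):
    # The page sends a JSON array of base64-encoded JPEG frames
    frames = []
    for b64 in images:
        arr = np.frombuffer(base64.b64decode(b64), np.uint8)
        frames.append(cv2.imdecode(arr, cv2.IMREAD_COLOR))
    return start_check(frames)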
Front‑End Liveness Demo (live.html) – captures eight rapid frames while prompting the user to blink, then sends the base64 images to /checkin/live for liveness verification and attendance recording.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Liveness Detection Attendance</title>
<style>
body {font-family:Arial; text-align:center; padding:20px;}
video {border:2px solid #333; border-radius:10px; background:#000;}
button {margin-top:20px; padding:10px 30px; font-size:18px; background:#4CAF50; color:#fff; border:none; border-radius:5px; cursor:pointer;}
button:hover {background:#45a049;}
#screenshot {margin-top:20px; display:flex; flex-wrap:wrap; justify-content:center; max-width:1400px; margin:auto;}
#screenshot img {width:320px; height:180px; margin:5px; border:1px solid #ccc; border-radius:5px; box-shadow:0 2px 5px rgba(0,0,0,0.2);}
</style>
</head>
<body>
<h1>Liveness Detection Attendance System</h1>
<video id="preview" autoplay muted width="640" height="360"></video><br>
<button id="startButton">Start Attendance</button>
<button id="stopCameraButton" style="background-color:#f44336; margin-left:20px;">Stop Camera</button>
<div id="screenshot"></div>
<script>
const synth = window.speechSynthesis;
let captureInterval = null, captureCount = 0, captureImages = [], stream = null;
const preview = document.getElementById('preview');
const startBtn = document.getElementById('startButton');
const stopCamBtn = document.getElementById('stopCameraButton');
const screenshotDiv = document.getElementById('screenshot');
function speak(text){let u=new SpeechSynthesisUtterance(text);synth.cancel();synth.speak(u);}
async function initCamera(){
try{
stream = await navigator.mediaDevices.getUserMedia({audio:false, video:{width:{ideal:960}, height:{ideal:540}}});
preview.srcObject = stream;
}catch(e){alert('Cannot access camera');}
}
function captureImage(){
const canvas=document.createElement('canvas');
const ctx=canvas.getContext('2d');
canvas.width=preview.videoWidth; canvas.height=preview.videoHeight;
ctx.drawImage(preview,0,0,canvas.width,canvas.height);
const imgUrl=canvas.toDataURL('image/jpeg',0.75);
const img=document.createElement('img'); img.src=imgUrl; screenshotDiv.appendChild(img);
captureImages.push(imgUrl.split(',')[1]);
captureCount++;
if(captureCount>=8){stopCapture();}
}
function startCapture(){
screenshotDiv.innerHTML=''; captureCount=0; captureImages=[]; speak('Please blink quickly');
setTimeout(()=>{captureInterval=setInterval(captureImage,200);},1500);
}
function stopCapture(){clearInterval(captureInterval); captureInterval=null; doCheckin();}
function doCheckin(){
speak('Processing');
fetch('/checkin/live',{method:'POST', headers:{'Content-Type':'application/json'}, body:JSON.stringify(captureImages)})
.then(r=>r.text())
.then(d=>{if(d.includes('Liveness-Pass')||d.includes('Checkin-OK')){speak('Attendance successful'); setTimeout(()=>location.reload(),3000);} else if(d.includes('Liveness-Fail')||d.includes('Checkin-Not-Live')){speak('Liveness failed');} else {speak('Check failed: '+d);}})
.catch(()=>speak('Network request failed'));
}
function stopCamera(){
if(stream){stream.getTracks().forEach(t=>t.stop()); stream=null;}
if(captureInterval){clearInterval(captureInterval); captureInterval=null;}
preview.srcObject=null;
}
startBtn.addEventListener('click', startCapture);
stopCamBtn.addEventListener('click',()=>{stopCamera(); alert('Camera stopped');});
window.addEventListener('beforeunload', stopCamera);
initCamera();
</script>
</body>
</html>
All components together provide a complete pipeline: data collection, face encoding storage, real‑time detection, liveness verification, attendance logging, and web‑based management.
Woodpecker Software Testing
The Woodpecker Software Testing public account, founded by Gu Xiang (www.3testing.com), shares software testing knowledge and connects testing enthusiasts. Gu Xiang is the author of five books, including "Mastering JMeter Through Case Studies".