Building a Smart Face‑Recognition Attendance System with FastAPI and OpenCV
This article walks through the complete design and implementation of an intelligent attendance system built with the Python face_recognition library, OpenCV, FastAPI, SQLModel, and WebSocket communication. It covers face detection, face-encoding comparison, and liveness verification, with end-to-end code samples for the backend APIs, the database schema, and the interactive front-end pages.
6.1 Requirement and Design Analysis
The core functions of the smart attendance system are accurate face recognition and anti‑spoof liveness detection, together with a web interface for querying and exporting attendance data.
Overall Implementation Idea
Face data acquisition: the front-end can use the getUserMedia API together with a canvas element in JavaScript to capture webcam frames, or Python OpenCV can read the camera directly. Both approaches are described later.
Face recognition: face_recognition.face_locations() returns face bounding boxes; multiple faces are supported.
Face annotation: the detected face box is drawn on the video stream using OpenCV in Python or a DIV overlay in JavaScript.
Face comparison: each enrolled face is encoded into a 128‑dimensional vector with face_recognition.face_encodings(). Comparison uses face_recognition.compare_faces() with a configurable tolerance; a minimal sketch follows this list.
Data management: three MySQL tables (users, faces, checks) are accessed via SQLModel.
Liveness detection: eye‑blink, mouth‑open, nod and shake actions are derived from the 68 facial landmarks provided by face_recognition.face_landmarks().
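Before walking through the full implementation, the encoding-and-comparison step can be illustrated with a minimal sketch (the image file names are placeholders, not files from the project):
import face_recognition
# Enroll: encode a known face into a 128-dimensional vector
known_image = face_recognition.load_image_file("alice.jpg")
known_encoding = face_recognition.face_encodings(known_image)[0]
# Verify: encode the probe image and compare it against all enrolled encodings
probe_image = face_recognition.load_image_file("snapshot.jpg")
probe_encoding = face_recognition.face_encodings(probe_image)[0]
matches = face_recognition.compare_faces([known_encoding], probe_encoding, tolerance=0.5)
distances = face_recognition.face_distance([known_encoding], probe_encoding)
print(matches, distances)  # a smaller distance means a closer match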
6.2 Python Implementation
Face‑Recognition API Usage
# Install required libraries
pip install opencv-python dlib face_recognition
import face_recognition
import cv2
# Load an image (load_image_file returns an RGB numpy array)
image = face_recognition.load_image_file("../faces/conference-room.jpeg")
# Detect face locations: each location is a (top, right, bottom, left) tuple
locations = face_recognition.face_locations(image)
# Draw a rectangle around each face
for top, right, bottom, left in locations:
    cv2.rectangle(image, (left, top), (right, bottom), (0, 0, 255), thickness=2)
# Note: cv2.imshow expects BGR, so colors in an RGB array may look swapped
cv2.imshow("Face", image)
cv2.waitKey(0)
cv2.imwrite("../faces/conference-room-2.jpeg", image)
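face_locations() uses dlib's fast HOG detector by default. Where accuracy matters more than speed and a CUDA-enabled build of dlib is available, the CNN detector can be requested instead; a minimal sketch (optional, the rest of the article keeps the default):
# Upsample once and use the slower but more accurate CNN detector
locations = face_recognition.face_locations(image, number_of_times_to_upsample=1, model="cnn")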
OpenCV Camera Capture (opencv_capture.py)
import face_recognition, cv2, time
camera = cv2.VideoCapture(0)  # Default laptop camera
while True:
    ret, frame = camera.read()
    locations = face_recognition.face_locations(frame)
    if len(locations) > 0:
        print(f"Detected {len(locations)} face(s)")
        top, right, bottom, left = locations[0]
        cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), thickness=2)
    else:
        print("No face detected")
    cv2.imshow("Camera", frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):
        break
    time.sleep(0.1)
camera.release()
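One caveat: OpenCV delivers frames in BGR channel order, while face_recognition's models expect RGB. Detection often still works on BGR frames, but converting first is safer; a one-line sketch to apply before calling the detector:
# Convert the OpenCV BGR frame to RGB before handing it to face_recognition
rgb_frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
locations = face_recognition.face_locations(rgb_frame)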
Face Comparison and Attendance Recording (face_compare.py)
import face_recognition, cv2, time, pyttsx3, numpy as np
from model import engine, Faces, Checks
from sqlmodel import Session, select, and_
from datetime import datetime, timedelta
def get_encodings():
    with Session(engine) as session:
        faces = select(Faces).where(Faces.facecode != None)
        results = session.execute(faces).all()
        user_encodings = []
        for r in results:
            # facecode was stored as str(encoding.tolist()), so eval() rebuilds the list
            user_encodings.append({"userid": r.Faces.userid, "facecode": eval(r.Faces.facecode)})
        return user_encodings
def insert_checks(userid):
    with Session(engine) as session:
        today = time.strftime("%Y-%m-%d")
        nowtime = time.strftime("%H:%M:%S")
        checks_date = select(Checks).where(and_(Checks.userid == userid, Checks.checkdate == today))
        result = session.execute(checks_date).first()
        if not result:
            # First recognition of the day: record the check-in time
            first = Checks(userid=userid, checkdate=today, checkstart=nowtime)
            session.add(first)
            session.commit()
            return "Checkin-OK"
        else:
            now_dt = datetime.strptime(nowtime, "%H:%M:%S")
            start_dt = datetime.strptime(str(result.Checks.checkstart), "%H:%M:%S")
            # Only allow a check-out at least 10 minutes after check-in
            status = (now_dt - start_dt).seconds >= 600
            if result.Checks.checkend:
                end_dt = datetime.strptime(str(result.Checks.checkend), "%H:%M:%S")
                # ... and at least 10 minutes after the previous check-out
                status = status and (now_dt - end_dt).seconds >= 600
            if status:
                result.Checks.checkend = nowtime
                result.Checks.hours = round((now_dt - start_dt).seconds / 3600, 1)
                session.commit()
                return "Checkin-OK"
            else:
                return "Checkin-Repeated"
    return "Checkin-NOK"
def check_faces(image):
    user_encodings = get_encodings()
    coding_list = [np.array(u["facecode"]) for u in user_encodings]
    try:
        # Encode the probe face and compare it against every enrolled encoding
        encoding = face_recognition.face_encodings(image, model='large')[0]
        match = face_recognition.compare_faces(coding_list, encoding, tolerance=0.5)
        if True in match:
            idx = match.index(True)
            userid = user_encodings[idx]["userid"]
            return insert_checks(userid)
    except Exception:
        pass
    return "Checkin-NOK"
def check_camera():
    cam = cv2.VideoCapture(0)
    while True:
        ret, frame = cam.read()
        locations = face_recognition.face_locations(frame)
        if len(locations) >= 1:
            print(f"Detected {len(locations)} face(s)")
            top, right, bottom, left = locations[0]
            cv2.rectangle(frame, (left, top), (right, bottom), (0, 0, 255), thickness=2)
            result = check_faces(frame)
            # Announce the result with text-to-speech
            if "Checkin-OK" in result:
                pyttsx3.speak("Attendance successful")
            elif "Checkin-Repeated" in result:
                pyttsx3.speak("Do not repeat attendance")
            else:
                pyttsx3.speak("Attendance failed")
        cv2.imshow("Camera", frame)
        if cv2.waitKey(1) & 0xFF == ord('q'):
            break
        time.sleep(0.5)
    cam.release()
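get_encodings() above rebuilds each stored vector with eval(), which works because facecode is written as str(encoding.tolist()), but evaluating strings from the database is risky. A hedged alternative under the same schema is to serialize the encoding as JSON instead:
import json, numpy as np
# Store: serialize the 128-dimensional encoding as a JSON array string
facecode = json.dumps(encoding.tolist())
# Load: parse it back without eval()
vector = np.array(json.loads(facecode))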
Database Model (model.py)
from sqlmodel import create_engine, Field, SQLModel
from typing import Optional
from datetime import datetime, date, time
engine = create_engine("mysql+pymysql://root:[email protected]:3306/checkin")
class Users(SQLModel, table=True):
    userid: Optional[int] = Field(default=None, primary_key=True)
    username: str
    usersex: str
    department: str
    createtime: datetime
class Faces(SQLModel, table=True):
    faceid: Optional[int] = Field(default=None, primary_key=True)
    userid: Optional[int] = Field(default=None, foreign_key="users.userid")
    facecode: str
class Checks(SQLModel, table=True):
    checkid: Optional[int] = Field(default=None, primary_key=True)
    userid: Optional[int] = Field(default=None, foreign_key="users.userid")
    checkdate: date
    checkstart: time
    checkend: Optional[time] = None
    hours: Optional[float] = None
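If the MySQL tables do not exist yet, SQLModel can create them from these models. A small sketch, assuming it is run once as a standalone script:
# create_tables.py - run once to create the users, faces and checks tables
from sqlmodel import SQLModel
from model import engine  # importing model also registers the Users, Faces and Checks tables
SQLModel.metadata.create_all(engine)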
FastAPI Endpoints for User Query and Face Upload (main_1.py)
from fastapi import FastAPI, Request, Form, UploadFile, File
from fastapi.templating import Jinja2Templates
from fastapi.staticfiles import StaticFiles
import uvicorn
from model import engine, Users, Faces, Checks
from sqlmodel import Session, select
import face_recognition, cv2, numpy as np
app = FastAPI()
app.mount("/static", StaticFiles(directory="static"), name="static")
templates = Jinja2Templates(directory="templates")
@app.get('/user')
def user(request: Request):
    with Session(engine) as session:
        # Left-join Faces so users without an enrolled face still appear
        stmt = select(Users, Faces.facecode).join(Faces, isouter=True).order_by(Users.userid)
        results = session.execute(stmt).mappings().all()
        return templates.TemplateResponse(request=request, name="user.html", context={"results": results})
@app.post('/face/add')
def face_add(userid: int = Form(), file: UploadFile = File()):
    # Decode the uploaded image into an OpenCV array
    data = file.file.read()
    arr = np.frombuffer(data, np.uint8)
    img = cv2.imdecode(arr, cv2.IMREAD_COLOR)
    locations = face_recognition.face_locations(img)
    if len(locations) == 0:
        return "Face-Not-Exist"
    # Encode the face and store the 128-dimensional vector as a string
    encoding = face_recognition.face_encodings(img, model='large')[0]
    encoding_str = str(encoding.tolist())
    with Session(engine) as session:
        face = Faces(userid=userid, facecode=encoding_str)
        session.add(face)
        session.commit()
        return "Face-Added"
if __name__ == '__main__':
    uvicorn.run(app)
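A quick way to exercise the /face/add endpoint without the browser page is the requests library; a hedged example (the user id and image path are placeholders):
# test_face_add.py - upload a face photo for user 1
import requests
with open("faces/alice.jpg", "rb") as f:
    resp = requests.post("http://127.0.0.1:8000/face/add", data={"userid": 1}, files={"file": f})
print(resp.text)  # "Face-Added" or "Face-Not-Exist"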
Front‑End Template for User List and Face Upload (user.html)
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Upload Face and Link</title>
<style>
table {width: 1000px; margin: auto; border: 1px solid gray; border-spacing: 0;}
thead {font-weight: bold; height: 40px; background-color: lightseagreen;}
td {border: 1px solid gray; text-align: center; height: 35px;}
</style>
<script>
function browseImage(obj) {obj.parentNode.querySelector('input').click();}
function doChange(obj) {let filename = obj.parentNode.querySelector('input').files[0].name; obj.parentNode.querySelectorAll('button')[0].style.display='none'; obj.parentNode.querySelector('span').textContent = filename;}
function doAdd(obj, userid) {
let fileInput = obj.parentNode.querySelector('input');
let formData = new FormData();
formData.append('userid', userid);
formData.append('file', fileInput.files[0]);
fetch('/face/add', {method: 'POST', body: formData})
.then(r => r.text())
.then(d => {if (d.includes('Face-Added')) {alert('Face added successfully'); location.reload();} else if (d.includes('Face-Not-Exist')) {alert('No valid face in image');}})
.catch(err => console.log('Request failed', err));
}
</script>
</head>
<body>
<table>
<thead>
<tr><td width="15%">ID</td><td width="15%">Name</td><td width="15%">Gender</td><td width="15%">Dept</td><td width="15%">Face Status</td><td width="25%">Action</td></tr>
</thead>
<tbody>
{% for result in results %}
<tr>
<td>{{result.Users.userid}}</td>
<td>{{result.Users.username}}</td>
<td>{{result.Users.usersex}}</td>
<td>{{result.Users.department}}</td>
<td>{% if result.facecode == None %}Not Enrolled{% else %}Enrolled{% endif %}</td>
<td>
<input type="file" onchange="doChange(this)" style="display:none">
<span></span>
<button onclick="browseImage(this)">Browse</button>
<button onclick="doAdd(this, {{result.Users.userid}})">Upload</button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</body>
</html>
6.3 Web Front‑End Implementations
Video Capture Page (video_capture.html) – shows live webcam video and a button to capture a snapshot.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Capture Video Frame</title>
<script>
async function doStart() {
const devices = await navigator.mediaDevices.enumerateDevices();
const videoDevices = devices.filter(d => d.kind === 'videoinput');
const stream = await navigator.mediaDevices.getUserMedia({audio:false, video:{width:{ideal:640}, height:{ideal:360}}});
document.getElementById('preview').srcObject = stream;
}
function doCapture() {
const preview = document.getElementById('preview');
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
canvas.width = preview.videoWidth;
canvas.height = preview.videoHeight;
ctx.drawImage(preview, 0, 0, canvas.width, canvas.height);
const imgUrl = canvas.toDataURL('image/jpeg', 0.9);
const img = document.createElement('img');
img.src = imgUrl;
img.style.width = '144px';
img.style.height = '81px';
img.style.marginLeft = '15px';
document.getElementById('screenshot').appendChild(img);
}
document.addEventListener('DOMContentLoaded', doStart);
</script>
</head>
<body>
<video id="preview" autoplay muted width="640" height="360"></video><br>
<button onclick="doCapture()">Capture</button>
<div id="screenshot" style="margin-top:20px"></div>
</body>
</html>
WebSocket Real‑Time Detection (ws_detect.py)
import websockets, numpy as np, json, face_recognition, cv2, asyncio
async def detect_image(websocket):
    try:
        while True:
            # Receive a binary JPEG frame from the browser and decode it
            image_data = await websocket.recv()
            img_arr = np.frombuffer(image_data, np.uint8)
            frame = cv2.imdecode(img_arr, cv2.IMREAD_COLOR)
            locations = face_recognition.face_locations(frame)
            loc_list = []
            for loc in locations:
                top, right, bottom, left = loc
                x = left
                y = top
                w = right - left
                h = bottom - top
                loc_list.append({"x": x, "y": y, "width": w, "height": h})
            # Send the bounding boxes back as JSON
            await websocket.send(json.dumps(loc_list))
    except websockets.ConnectionClosed:
        print("WebSocket connection closed.")
    except Exception as e:
        print(f"Error: {e}")
async def main():
    server = await websockets.serve(detect_image, "localhost", 8765)
    print("WebSocket server started at ws://localhost:8765")
    await server.wait_closed()
if __name__ == "__main__":
    asyncio.run(main())
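Before wiring up the browser page, the server can be exercised from Python. A hedged test client that sends a single JPEG and prints the returned boxes (the image path is a placeholder):
# ws_client_test.py - send one frame to ws_detect.py and print the detected boxes
import asyncio, json, websockets
async def send_one_frame():
    async with websockets.connect("ws://localhost:8765") as ws:
        with open("faces/test.jpg", "rb") as f:
            await ws.send(f.read())
        print(json.loads(await ws.recv()))
asyncio.run(send_one_frame())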
Front‑End Page Using WebSocket (detect.html) – streams video, sends frames every 500 ms, and draws red rectangles at the positions returned by the server.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Face Detection and Annotation</title>
<script>
document.addEventListener('DOMContentLoaded', function(){
const preview = document.getElementById('preview');
const ws = new WebSocket('ws://localhost:8765');
ws.onopen = () => doStart();
ws.onmessage = (event) => {
const oldRects = document.getElementsByClassName('rectDiv');
while(oldRects.length) oldRects[0].remove();
const locations = JSON.parse(event.data);
const left = preview.offsetLeft;
const top = preview.offsetTop;
for (let i in locations) {
const r = document.createElement('div');
r.className = 'rectDiv';
r.style.width = locations[i].width + 'px';
r.style.height = locations[i].height + 'px';
r.style.position = 'fixed';
r.style.border = 'solid 1px red';
r.style.left = (locations[i].x + left) + 'px';
r.style.top = (locations[i].y + top) + 'px';
document.body.appendChild(r);
}
};
async function doStart(){
const stream = await navigator.mediaDevices.getUserMedia({audio:false, video:{width:{ideal:960}, height:{ideal:540}}});
preview.srcObject = stream;
// No MediaRecorder is needed: frames are captured from a canvas in doCapture()
setInterval(doCapture, 500);
}
function doCapture(){
const canvas = document.createElement('canvas');
const ctx = canvas.getContext('2d');
canvas.width = preview.videoWidth;
canvas.height = preview.videoHeight;
ctx.drawImage(preview, 0, 0, canvas.width, canvas.height);
canvas.toBlob(blob => { if (ws && ws.readyState===WebSocket.OPEN) ws.send(blob); }, 'image/jpeg', 0.8);
}
});
</script>
</head>
<body>
<div id="main" style="border:1px solid red; width:962px; height:542px; margin:auto">
<video id="preview" autoplay muted width="960" height="540"></video>
</div>
</body>
</html>
6.4 Liveness Detection to Prevent Spoofing
The system uses the 68 facial landmarks to compute an eye-aspect ratio (EAR) for blink detection; the same landmark data could also drive a mouth-aspect ratio for open-mouth detection and geometric checks for nods and head shakes. In the implementation below, the coefficient of variation of the EAR (standard deviation divided by mean) over a short burst of frames decides whether a blink occurred, and therefore whether the face is live.
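Only the blink check is implemented here. As an illustration, an open-mouth check could be built the same way from the top_lip and bottom_lip landmark groups; a hedged sketch (the indices below pick the inner-lip midpoints and inner mouth corners and are assumptions, not part of the original code):
import numpy as np
def calc_mar(landmarks):
    # Mouth aspect ratio: vertical opening of the inner lips divided by the inner mouth width
    top = np.array(landmarks[0]['top_lip'])
    bottom = np.array(landmarks[0]['bottom_lip'])
    opening = np.linalg.norm(top[9] - bottom[9])   # inner upper lip vs inner lower lip midpoint
    width = np.linalg.norm(top[11] - top[7])       # left vs right inner mouth corner
    return opening / width                         # larger values mean the mouth is open wider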
EAR Calculation (livecheck.py)
import face_recognition, numpy as np
from face_compare import check_faces
def start_check(live_images):
    numbers = []
    for img in live_images:
        landmarks = face_recognition.face_landmarks(img)
        ear = calc_ear(landmarks)
        numbers.append(ear)
    # Coefficient of variation of the EAR: a blink makes the EAR dip, raising the spread
    ear_ratio = np.std(numbers) / np.average(numbers)
    print(ear_ratio)
    if ear_ratio > 0.2:
        # Use the frame with the widest-open eyes for face comparison
        idx = numbers.index(max(numbers))
        return check_faces(live_images[idx])
    else:
        return "Checkin-Not-Live"
def calc_ear(landmarks):
    left_eye = np.array(landmarks[0]['left_eye'])
    right_eye = np.array(landmarks[0]['right_eye'])
    left_ear = (np.linalg.norm(left_eye[1]-left_eye[5]) + np.linalg.norm(left_eye[2]-left_eye[4])) / (2 * np.linalg.norm(left_eye[0]-left_eye[3]))
    right_ear = (np.linalg.norm(right_eye[1]-right_eye[5]) + np.linalg.norm(right_eye[2]-right_eye[4])) / (2 * np.linalg.norm(right_eye[0]-right_eye[3]))
    return (left_ear + right_ear) / 2
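The page below posts the captured frames to /checkin/live, an endpoint that is not shown in the original listing. A hedged sketch of what it could look like when added to main_1.py (the route name follows the front-end code; the decoding logic is an assumption):
import base64
from fastapi import Body
from livecheck import start_check
@app.post('/checkin/live')
def checkin_live(images: list[str] = Body()):
    # The page sends a JSON array of base64-encoded JPEG frames
    frames = []
    for b64 in images:
        arr = np.frombuffer(base64.b64decode(b64), np.uint8)
        frames.append(cv2.imdecode(arr, cv2.IMREAD_COLOR))
    return start_check(frames)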
Front‑End Liveness Demo (live.html) – captures eight rapid frames while prompting the user to blink, then sends the base64 images to /checkin/live for liveness verification and attendance recording.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Liveness Detection Attendance</title>
<style>
body {font-family:Arial; text-align:center; padding:20px;}
video {border:2px solid #333; border-radius:10px; background:#000;}
button {margin-top:20px; padding:10px 30px; font-size:18px; background:#4CAF50; color:#fff; border:none; border-radius:5px; cursor:pointer;}
button:hover {background:#45a049;}
#screenshot {margin-top:20px; display:flex; flex-wrap:wrap; justify-content:center; max-width:1400px; margin:auto;}
#screenshot img {width:320px; height:180px; margin:5px; border:1px solid #ccc; border-radius:5px; box-shadow:0 2px 5px rgba(0,0,0,0.2);}
</style>
</head>
<body>
<h1>Liveness Detection Attendance System</h1>
<video id="preview" autoplay muted width="640" height="360"></video><br>
<button id="startButton">Start Attendance</button>
<button id="stopCameraButton" style="background-color:#f44336; margin-left:20px;">Stop Camera</button>
<div id="screenshot"></div>
<script>
const synth = window.speechSynthesis;
let captureInterval = null, captureCount = 0, captureImages = [], stream = null;
const preview = document.getElementById('preview');
const startBtn = document.getElementById('startButton');
const stopCamBtn = document.getElementById('stopCameraButton');
const screenshotDiv = document.getElementById('screenshot');
function speak(text){let u=new SpeechSynthesisUtterance(text);synth.cancel();synth.speak(u);}
async function initCamera(){
try{
stream = await navigator.mediaDevices.getUserMedia({audio:false, video:{width:{ideal:960}, height:{ideal:540}}});
preview.srcObject = stream;
}catch(e){alert('Cannot access camera');}
}
function captureImage(){
const canvas=document.createElement('canvas');
const ctx=canvas.getContext('2d');
canvas.width=preview.videoWidth; canvas.height=preview.videoHeight;
ctx.drawImage(preview,0,0,canvas.width,canvas.height);
const imgUrl=canvas.toDataURL('image/jpeg',0.75);
const img=document.createElement('img'); img.src=imgUrl; screenshotDiv.appendChild(img);
captureImages.push(imgUrl.split(',')[1]);
captureCount++;
if(captureCount>=8){stopCapture();}
}
function startCapture(){
screenshotDiv.innerHTML=''; captureCount=0; captureImages=[]; speak('Please blink quickly');
setTimeout(()=>{captureInterval=setInterval(captureImage,200);},1500);
}
function stopCapture(){clearInterval(captureInterval); captureInterval=null; doCheckin();}
function doCheckin(){
speak('Processing');
fetch('/checkin/live',{method:'POST', headers:{'Content-Type':'application/json'}, body:JSON.stringify(captureImages)})
.then(r=>r.text())
.then(d=>{if(d.includes('Liveness-Pass')||d.includes('Checkin-OK')){speak('Attendance successful'); setTimeout(()=>location.reload(),3000);} else if(d.includes('Liveness-Fail')||d.includes('Checkin-Not-Live')){speak('Liveness failed');} else {speak('Check failed: '+d);}})
.catch(()=>speak('Network request failed'));
}
function stopCamera(){
if(stream){stream.getTracks().forEach(t=>t.stop()); stream=null;}
if(captureInterval){clearInterval(captureInterval); captureInterval=null;}
preview.srcObject=null;
}
startBtn.addEventListener('click', startCapture);
stopCamBtn.addEventListener('click',()=>{stopCamera(); alert('Camera stopped');});
window.addEventListener('beforeunload', stopCamera);
initCamera();
</script>
</body>
</html>
All components together provide a complete pipeline: data collection, face encoding storage, real‑time detection, liveness verification, attendance logging, and web‑based management.
Woodpecker Software Testing
The Woodpecker Software Testing public account, founded by Gu Xiang (www.3testing.com), shares software testing knowledge and connects testing enthusiasts. Gu Xiang is the author of five books, including "Mastering JMeter Through Case Studies".