
Abdelrahman Almatrooshi

Deploy snapshot from main b7a59b11809483dfc959f196f1930240f2662c49

22a6915 22 days ago

models

Feature extraction, geometric scoring, and ML model training. Shared modules at the top level handle the core computer vision pipeline; subdirectories contain model-specific training and sweep scripts.

Inference pipeline

Webcam frame
  |
  v
MediaPipe Face Mesh (face_mesh.py) --> 478 landmarks
  |
  +---> HeadPoseEstimator (head_pose.py)    --> yaw, pitch, roll, s_face
  +---> EyeBehaviourScorer (eye_scorer.py)  --> EAR, s_eye, MAR
  +---> GazeRatio (eye_scorer.py)           --> h_gaze, v_gaze, gaze_offset
  +---> TemporalTracker (collect_features.py) --> PERCLOS, blink_rate, closure_dur
  |
  v
17-feature vector --> clip --> select 10 --> ML model or geometric scorer
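
The temporal stage of the diagram can be illustrated with a rolling window. The following is a minimal sketch, not the repository's `TemporalTracker`: the class name `PerclosWindow` is hypothetical, and treating a frame as "closed" when EAR is at or below 0.16 is an assumption borrowed from the closed-eye value quoted for `eye_scorer.py`. Only the 60-frame window comes from this README.

```python
from collections import deque


class PerclosWindow:
    """Rolling fraction of closed-eye frames (a PERCLOS-style measure)."""

    def __init__(self, window_frames=60, closed_ear=0.16):
        # window_frames=60 follows the README; closed_ear is an assumed cut-off.
        self.closed_ear = closed_ear
        self.flags = deque(maxlen=window_frames)

    def update(self, ear):
        """Record one frame's EAR and return the current closed fraction."""
        self.flags.append(ear <= self.closed_ear)
        return sum(self.flags) / len(self.flags)


window = PerclosWindow()
# 45 open-eye frames followed by 15 closed-eye frames fill the 60-frame window.
for ear in [0.30] * 45 + [0.10] * 15:
    perclos = window.update(ear)
print(perclos)
```

Because the deque has a fixed `maxlen`, the oldest frame drops out automatically once the window is full, so each update is constant-time apart from the sum.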

Shared modules

| File | Purpose |
| --- | --- |
| `face_mesh.py` | MediaPipe Face Landmarker wrapper (478 landmarks including 10 iris points) |
| `head_pose.py` | `HeadPoseEstimator`: solvePnP on 6 anatomical landmarks (nose tip, chin, eye corners, mouth corners), cosine-decay face orientation score with max_angle=22 deg and roll down-weighted 50% |
| `eye_scorer.py` | `EyeBehaviourScorer`: EAR from 6 landmarks per eye (open=0.30, closed=0.16), iris-based gaze scoring (cosine decay, max_offset=0.28), MAR yawn detection (threshold=0.55) |
| `collect_features.py` | 17-feature extraction with `TemporalTracker` (PERCLOS over 60 frames, blink rate over 30s window); webcam labelling CLI for data collection |
| `gaze_calibration.py` | `GazeCalibration`: 9-point polynomial (degree-2) mapping from raw L2CS gaze angles to normalised screen coordinates, with IQR outlier filtering and centre-point bias correction |
| `gaze_eye_fusion.py` | `GazeEyeFusion`: fuses calibrated gaze position with EAR for continuous focus scoring; sustained eye closure veto (4+ frames) |
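
For readers unfamiliar with EAR, here is a sketch of the widely used six-landmark eye aspect ratio. It assumes the standard ordering (p1 and p4 are the eye corners, p2/p6 and p3/p5 are the upper/lower lid pairs); which MediaPipe landmark indices `eye_scorer.py` actually uses is not stated here, so the coordinates below are made-up illustrative values.

```python
import math


def eye_aspect_ratio(points):
    """EAR from six (x, y) eye landmarks ordered p1..p6.

    p1 and p4 are the horizontal corners; (p2, p6) and (p3, p5) are the
    two upper/lower lid pairs.
    """
    p1, p2, p3, p4, p5, p6 = points
    vertical = math.dist(p2, p6) + math.dist(p3, p5)
    horizontal = math.dist(p1, p4)
    return vertical / (2.0 * horizontal)


# Illustrative coordinates only: a wide-open eye and a nearly closed one.
open_eye = [(0, 0), (1, 0.45), (2, 0.45), (3, 0), (2, -0.45), (1, -0.45)]
closed_eye = [(0, 0), (1, 0.2), (2, 0.2), (3, 0), (2, -0.2), (1, -0.2)]

print(eye_aspect_ratio(open_eye), eye_aspect_ratio(closed_eye))
```

The ratio is scale-invariant because lid separation is divided by eye width, which is why fixed open/closed reference values such as 0.30 and 0.16 can work across face sizes and camera distances.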

Subdirectories

| Directory | Contents |
| --- | --- |
| `mlp/` | PyTorch MLP (10-64-32-2, ~2,850 params): training, evaluation, Optuna sweep |
| `xgboost/` | XGBoost (600 trees, depth 8, lr 0.1489): training, evaluation, ClearML + Optuna sweeps |
| `L2CS-Net/` | Vendored L2CS-Net gaze estimator (ResNet50 pretrained on Gaze360) |
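
The quoted MLP size can be checked by hand. This sketch assumes the 10-64-32-2 network consists only of fully connected layers with biases (no normalisation layers or other learnable parameters); the helper name `dense_param_count` is illustrative.

```python
def dense_param_count(layer_sizes):
    """Weights plus biases for a fully connected stack of the given widths."""
    pairs = zip(layer_sizes, layer_sizes[1:])
    return sum((n_in + 1) * n_out for n_in, n_out in pairs)


# (10+1)*64 + (64+1)*32 + (32+1)*2 = 704 + 2080 + 66
print(dense_param_count([10, 64, 32, 2]))
```

Under that assumption the total is 2,850, matching the figure in the table.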

Data collection

python -m models.collect_features --name <participant>

Records a webcam session with real-time binary labelling (spacebar toggles focused/unfocused). Outputs .npz files to data/collected_<participant>/ containing the 17-feature vector and labels per frame. Quality guidance is displayed during recording (class balance warnings, transition count).

Nine participants each recorded 5-10 minute sessions across varied environments, totalling 144,793 frames (61.5% focused, 38.5% unfocused). Only extracted feature vectors are stored; raw video is never saved.

Geometric scoring formulas

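Face orientation score: s_face = 0.5 * (1 + cos(pi * min(d / 22, 1))), where d = sqrt(yaw^2 + pitch^2 + (0.5*roll)^2) is the combined head rotation in degrees, with roll down-weighted by 50%. The score is 1 for a frontal face and falls to 0 once d reaches 22 degrees.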
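
The face orientation formula translates directly into a few lines of Python. This is a sketch of the formula as written, not the code in `head_pose.py`; the function and parameter names are illustrative, and angles are assumed to be in degrees.

```python
import math


def face_orientation_score(yaw, pitch, roll, max_angle=22.0, roll_weight=0.5):
    """Cosine-decay score: 1.0 when frontal, 0.0 at or beyond max_angle."""
    # Roll contributes at half weight to the combined deviation d.
    d = math.sqrt(yaw**2 + pitch**2 + (roll_weight * roll) ** 2)
    return 0.5 * (1.0 + math.cos(math.pi * min(d / max_angle, 1.0)))


print(face_orientation_score(0, 0, 0))    # frontal face
print(face_orientation_score(11, 0, 0))   # halfway to the limit
print(face_orientation_score(30, 0, 0))   # beyond the limit
```

The `min(..., 1)` clamp keeps the cosine from rising again past the limit, so any deviation larger than `max_angle` scores zero.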

Eye behaviour score: s_eye = ear_score * gaze_score, where ear_score maps EAR linearly from [0.16, 0.30] (closed to open) onto [0, 1], and gaze_score applies the same cosine decay as s_face with max_offset=0.28 in place of the 22-degree limit.
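
A sketch of the eye behaviour score under two assumptions that the formula leaves implicit: EAR values outside [0.16, 0.30] are clamped to the ends of the range, and the gaze offset is treated as a non-negative magnitude. Function names are illustrative rather than taken from `eye_scorer.py`.

```python
import math


def ear_score(ear, closed=0.16, open_=0.30):
    """Linear map of EAR from [closed, open_] to [0, 1], clamped (assumed)."""
    t = (ear - closed) / (open_ - closed)
    return min(max(t, 0.0), 1.0)


def gaze_score(offset, max_offset=0.28):
    """Cosine decay: 1.0 at zero offset, 0.0 at or beyond max_offset."""
    ratio = min(abs(offset) / max_offset, 1.0)
    return 0.5 * (1.0 + math.cos(math.pi * ratio))


def eye_behaviour_score(ear, offset):
    """s_eye = ear_score * gaze_score."""
    return ear_score(ear) * gaze_score(offset)


print(eye_behaviour_score(0.30, 0.0))   # open eye, centred gaze
print(eye_behaviour_score(0.30, 0.14))  # open eye, gaze halfway to the limit
print(eye_behaviour_score(0.16, 0.0))   # closed eye
```

Because the two terms are multiplied, either a closed eye or a gaze at the offset limit is enough to drive the score to zero.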