# models/

Feature extraction and training for FocusGuard (17 features, reduced to 10 for the MLP/XGBoost models; the geometric/hybrid/L2CS paths are described in the root `README.md`).

## What is here

- `face_mesh.py`: MediaPipe landmarks
- `head_pose.py`: yaw/pitch/roll and face-orientation scores
- `eye_scorer.py`: EAR, gaze offsets, MAR
- `collect_features.py`: writes per-session `.npz` feature files
- `mlp/`: MLP training and utilities
- `xgboost/`: XGBoost training and utilities

## 1) Setup

From the repo root:

```bash
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
```

## 2) Collect training data (if needed)

```bash
python -m models.collect_features --name <name>
```

This writes files under `data/collected_<name>/`.

## 3) Train models

Both training scripts read their configuration from `config/default.yaml` (split ratios, seeds, hyperparameters).

### MLP

```bash
python -m models.mlp.train
```

Outputs:

- checkpoint: `checkpoints/mlp_best.pt` (best by validation F1)
- scaler/meta: `checkpoints/scaler_mlp.joblib`, `checkpoints/meta_mlp.npz`
- log: `evaluation/logs/face_orientation_training_log.json`

### XGBoost

```bash
python -m models.xgboost.train
```

Outputs:

- checkpoint: `checkpoints/xgboost_face_orientation_best.json`
- log: `evaluation/logs/xgboost_face_orientation_training_log.json`

## 4) Run evaluation after training

```bash
python -m evaluation.justify_thresholds
python -m evaluation.grouped_split_benchmark --quick
python -m evaluation.feature_importance --quick --skip-lofo
```

Generated reports:

- `evaluation/THRESHOLD_JUSTIFICATION.md`
- `evaluation/GROUPED_SPLIT_BENCHMARK.md`
- `evaluation/feature_selection_justification.md`

## 5) Optional: ClearML tracking

Run training with ClearML logging:

```bash
USE_CLEARML=1 python -m models.mlp.train
USE_CLEARML=1 python -m models.xgboost.train
```

Remote execution via an agent queue:

```bash
USE_CLEARML=1 CLEARML_QUEUE=gpu python -m models.mlp.train
USE_CLEARML=1 CLEARML_QUEUE=gpu python -m models.xgboost.train
```
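
## 6) Optional: inspect collected features

A quick way to sanity-check the per-session `.npz` files from step 2 is to load them with NumPy. This is a minimal sketch only: the array key names (`features`, `labels`) and the synthetic file it writes are assumptions for illustration — check `models/collect_features.py` for the actual layout your sessions use.

```python
import tempfile
from pathlib import Path

import numpy as np

# Write a synthetic session file in the assumed format:
# one row of 17 raw features per frame, plus a binary label per frame.
# (Key names "features"/"labels" are assumed, not confirmed by the repo.)
rng = np.random.default_rng(0)
session_path = Path(tempfile.mkdtemp()) / "session_demo.npz"
np.savez(
    session_path,
    features=rng.random((120, 17)).astype(np.float32),
    labels=rng.integers(0, 2, size=120),
)

# Load it back and inspect shapes before training.
data = np.load(session_path)
X, y = data["features"], data["labels"]
print(X.shape, y.shape)  # -> (120, 17) (120,)
```

If the shapes or keys differ from what you expect, fix the collection step before training; both `models.mlp.train` and `models.xgboost.train` consume these files downstream.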