# models/xgboost
Gradient-boosted tree ensemble for binary focus classification. Primary ML model in the FocusGuard pipeline. Uses the same 10 selected features as the MLP.
## Configuration
Final hyperparameters selected from a 40-trial Optuna sweep:
| Parameter | Value | Source |
|---|---|---|
| n_estimators | 600 | xgboost.n_estimators |
| max_depth | 8 | xgboost.max_depth |
| learning_rate | 0.1489 | xgboost.learning_rate |
| subsample | 0.9625 | xgboost.subsample |
| colsample_bytree | 0.9013 | xgboost.colsample_bytree |
| reg_alpha | 1.1407 | xgboost.reg_alpha |
| reg_lambda | 2.4181 | xgboost.reg_lambda |
| eval_metric | logloss | xgboost.eval_metric |
## Training

```bash
python -m models.xgboost.train
```
Reads all parameters from `config/default.yaml`. Uses `XGBClassifier` with early stopping on validation logloss.
## Results

### Pooled random split (70/15/15)
| Accuracy | F1 | ROC-AUC |
|---|---|---|
| 95.87% | 0.959 | 0.991 |
### LOPO cross-validation (9 participants)
| Metric | Value |
|---|---|
| LOPO AUC | 0.870 |
| Optimal threshold (Youden's J) | 0.280 |
| F1 at optimal threshold | 0.855 |
| F1 at default 0.50 | 0.832 |
| Improvement from threshold tuning | +2.3 pp |
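Threshold tuning via Youden's J picks the cut-off t* that maximises TPR − FPR on validation predictions. A minimal sketch with toy scores (not the project's data) illustrates how a low threshold like 0.280 can emerge when positive-class probabilities skew low:

```python
import numpy as np

def youden_threshold(y_true, y_prob):
    """Return the threshold t* maximizing Youden's J = TPR - FPR."""
    pos = y_true == 1
    neg = ~pos
    best_t, best_j = 0.5, -1.0
    for t in np.unique(y_prob):          # every observed score is a candidate cut-off
        pred = y_prob >= t
        j = pred[pos].mean() - pred[neg].mean()  # TPR - FPR at this threshold
        if j > best_j:
            best_j, best_t = j, t
    return best_t

# Toy scores: positives clustered at low probabilities pull t* below 0.5.
y_true = np.array([0, 0, 0, 1, 1, 1, 1, 0])
y_prob = np.array([0.05, 0.1, 0.2, 0.3, 0.35, 0.6, 0.7, 0.4])
print(youden_threshold(y_true, y_prob))  # → 0.3
```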
The ~12 pp drop from pooled to LOPO reflects temporal data leakage and underscores why person-independent evaluation matters.
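LOPO itself is just a grouped split: every window from one participant is held out per fold. A stdlib-only sketch (with a hypothetical row structure) shows the mechanic:

```python
participants = ["Abdelrahman", "Jarek", "Junhao", "Kexin", "Langyuan",
                "Mohamed", "Yingtao", "Ayten", "Saba"]

def lopo_folds(rows):
    """Yield (held_out, train, test) with one participant fully excluded from train."""
    for held_out in participants:
        train = [r for r in rows if r["participant"] != held_out]
        test = [r for r in rows if r["participant"] == held_out]
        yield held_out, train, test

# Hypothetical windows: three rows per participant.
rows = [{"participant": p, "window": i} for i, p in enumerate(participants * 3)]
folds = list(lopo_folds(rows))
print(len(folds))  # → 9, one fold per held-out participant
```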
### Per-person LOPO (at t* = 0.280)
| Held-out | Acc | F1 | Prec | Rec |
|---|---|---|---|---|
| Abdelrahman | 0.864 | 0.900 | 0.904 | 0.896 |
| Jarek | 0.872 | 0.903 | 0.902 | 0.904 |
| Junhao | 0.890 | 0.901 | 0.841 | 0.971 |
| Kexin | 0.738 | 0.747 | 0.778 | 0.717 |
| Langyuan | 0.655 | 0.677 | 0.548 | 0.888 |
| Mohamed | 0.881 | 0.894 | 0.843 | 0.952 |
| Yingtao | 0.855 | 0.909 | 0.926 | 0.894 |
| Ayten | 0.841 | 0.905 | 0.861 | 0.954 |
| Saba | 0.923 | 0.925 | 0.956 | 0.896 |
| Mean ± std | 0.835 ± 0.080 | 0.862 ± 0.082 | 0.840 ± 0.115 | 0.897 ± 0.070 |
95% CI for mean F1: [0.799, 0.926]
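An interval like this is consistent with a percentile bootstrap over the nine per-person F1 scores (an assumption; the CI method is not stated here). Stdlib-only sketch:

```python
import random
import statistics

# Per-person LOPO F1 scores from the table above.
f1 = [0.900, 0.903, 0.901, 0.747, 0.677, 0.894, 0.909, 0.905, 0.925]

def bootstrap_ci(values, n_boot=10_000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean: resample with replacement,
    then take the alpha/2 and 1 - alpha/2 quantiles of the resampled means."""
    rng = random.Random(seed)
    means = sorted(
        statistics.mean(rng.choices(values, k=len(values))) for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

print(round(statistics.mean(f1), 3))  # → 0.862
print(bootstrap_ci(f1))               # an interval around the mean F1
```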
## Feature importance (XGBoost gain)
Top 5: `s_face` (10.27), `ear_right` (9.54), `head_deviation` (8.83), `ear_avg` (6.96), `perclos` (5.68)
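In xgboost, gain importances come from `clf.get_booster().get_score(importance_type="gain")`. Since raw gains have no absolute scale, a quick way to read them is as relative shares; the snippet below does that for the reported top five:

```python
# Reported top-5 gain values from above.
gains = {
    "s_face": 10.27,
    "ear_right": 9.54,
    "head_deviation": 8.83,
    "ear_avg": 6.96,
    "perclos": 5.68,
}
total = sum(gains.values())
shares = {name: round(g / total, 3) for name, g in gains.items()}
print(shares["s_face"])  # → 0.249, i.e. ~25% of the top-5 gain
```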
## ClearML integration

```bash
USE_CLEARML=1 python -m models.xgboost.train
```
Same enrichment as MLP: hyperparameters, per-round scalars, confusion matrices, ROC curves, model registration, dataset stats, and reproducibility artifacts.
## Sweeps

### ClearML HPO (remote)

```bash
USE_CLEARML=1 python -m models.xgboost.sweep
```
Launches a HyperParameterOptimizer controller on ClearML that clones the base training task and runs grid/random search across workers.
### Local Optuna sweep

```bash
python -m models.xgboost.sweep_local
```
40-trial TPE sampler, optimising LOPO F1. Search space: `n_estimators` [100-1000], `max_depth` [3-10], `learning_rate` [0.01-0.3], `subsample` [0.6-1.0], `colsample_bytree` [0.6-1.0], `reg_alpha`/`reg_lambda` [0-5].
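The same search space, written out as code. Plain random sampling stands in for Optuna's TPE here, and sampling `learning_rate` log-uniformly is an assumption (a common choice for learning rates, but not stated above):

```python
import math
import random

def sample_params(rng):
    """Draw one configuration from the sweep's search space."""
    return {
        "n_estimators": rng.randint(100, 1000),
        "max_depth": rng.randint(3, 10),
        # Assumed log-uniform; linear-uniform over [0.01, 0.3] is also possible.
        "learning_rate": math.exp(rng.uniform(math.log(0.01), math.log(0.3))),
        "subsample": rng.uniform(0.6, 1.0),
        "colsample_bytree": rng.uniform(0.6, 1.0),
        "reg_alpha": rng.uniform(0.0, 5.0),
        "reg_lambda": rng.uniform(0.0, 5.0),
    }

rng = random.Random(42)
trials = [sample_params(rng) for _ in range(40)]  # 40 trials, as in the sweep
print(len(trials))  # → 40
```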
### Export local sweep results from ClearML

```bash
python -m models.xgboost.fetch_sweep_results
```

Writes `models/xgboost/sweep_results_all_40.csv` using the metrics logged by `sweep_local.py` (`val_loss`, `val_accuracy`, `val_f1`) plus each trial's hyperparameters.
Default export behavior is strict and deterministic:

- only tasks named like `XGBoost Sweep Trial #...`
- only the latest 40 matching tasks
- skips tasks with missing or zero `val_loss`/`val_accuracy`/`val_f1`
- ranks by `val_f1` (then `val_loss`, then `val_accuracy`)
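The filter-and-rank logic amounts to the following (illustrative in-memory records; the real script queries the ClearML server):

```python
# Hypothetical task records as the exporter might see them.
trials = [
    {"name": "XGBoost Sweep Trial #12", "val_f1": 0.851, "val_loss": 0.31, "val_accuracy": 0.84},
    {"name": "XGBoost Sweep Trial #07", "val_f1": 0.862, "val_loss": 0.29, "val_accuracy": 0.86},
    {"name": "XGBoost Sweep Trial #03", "val_f1": 0.862, "val_loss": 0.27, "val_accuracy": 0.85},
    {"name": "Unrelated task",          "val_f1": 0.900, "val_loss": 0.20, "val_accuracy": 0.90},
    {"name": "XGBoost Sweep Trial #21", "val_f1": 0.0,   "val_loss": 0.50, "val_accuracy": 0.70},
]

PREFIX = "XGBoost Sweep Trial #"
# Keep only matching names with all three metrics present and non-zero.
kept = [t for t in trials
        if t["name"].startswith(PREFIX)
        and all(t[m] for m in ("val_loss", "val_accuracy", "val_f1"))]
# Rank: best val_f1 first; ties broken by lower val_loss, then higher val_accuracy.
ranked = sorted(kept, key=lambda t: (-t["val_f1"], t["val_loss"], -t["val_accuracy"]))
print([t["name"] for t in ranked])  # #03 first: tie on val_f1 broken by lower val_loss
```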
Useful options:

```bash
python -m models.xgboost.fetch_sweep_results --limit 0 --keep-zero-metrics
python -m models.xgboost.fetch_sweep_results --name-prefix "XGBoost Sweep Trial #" --limit 40
python -m models.xgboost.fetch_sweep_results --compute-missing-val-accuracy --sort-by val_f1
```
## Outputs
| File | Location |
|---|---|
| Best model | checkpoints/xgboost_face_orientation_best.json |
| Scaler | checkpoints/scaler_xgboost.joblib |
| Test predictions | evaluation/logs/xgboost_test_predictions.csv |
| Test metrics | evaluation/logs/xgboost_test_metrics_summary.json |
| Feature importance | evaluation/logs/xgboost_feature_importance.json |