Abdelrahman Almatrooshi

# models/xgboost

Gradient-boosted tree ensemble for binary focus classification. Primary ML model in the FocusGuard pipeline. Uses the same 10 selected features as the MLP.

## Configuration

Final hyperparameters selected from a 40-trial Optuna sweep:

| Parameter | Value | Source |
|---|---|---|
| `n_estimators` | 600 | `xgboost.n_estimators` |
| `max_depth` | 8 | `xgboost.max_depth` |
| `learning_rate` | 0.1489 | `xgboost.learning_rate` |
| `subsample` | 0.9625 | `xgboost.subsample` |
| `colsample_bytree` | 0.9013 | `xgboost.colsample_bytree` |
| `reg_alpha` | 1.1407 | `xgboost.reg_alpha` |
| `reg_lambda` | 2.4181 | `xgboost.reg_lambda` |
| `eval_metric` | logloss | `xgboost.eval_metric` |

## Training

```bash
python -m models.xgboost.train
```

Reads all parameters from `config/default.yaml`. Uses `XGBClassifier` with early stopping on validation logloss.

## Results

### Pooled random split (70/15/15)

| Accuracy | F1 | ROC-AUC |
|---|---|---|
| 95.87% | 0.959 | 0.991 |

### LOPO cross-validation (9 participants)

| Metric | Value |
|---|---|
| LOPO AUC | 0.870 |
| Optimal threshold (Youden's J) | 0.280 |
| F1 at optimal threshold | 0.855 |
| F1 at default 0.50 | 0.832 |
| Improvement from threshold tuning | +2.3 pp |

The ~12 pp AUC drop from the pooled split to LOPO reflects temporal data leakage in the pooled split (temporally adjacent windows from the same participant land in both train and test) and underscores why person-independent evaluation matters.
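The optimal threshold in the table above is the point on the ROC curve that maximises Youden's J statistic (TPR − FPR). A minimal sketch with scikit-learn, using synthetic labels and scores in place of the real validation predictions:

```python
import numpy as np
from sklearn.metrics import roc_curve

rng = np.random.default_rng(0)
# Synthetic labels and predicted probabilities (illustrative stand-ins).
y_true = rng.integers(0, 2, size=1000)
y_score = 0.6 * y_true + 0.4 * rng.random(1000)

fpr, tpr, thresholds = roc_curve(y_true, y_score)
j = tpr - fpr                      # Youden's J at each candidate threshold
t_star = thresholds[np.argmax(j)]  # threshold maximising J

# Classify at the tuned threshold instead of the default 0.50.
y_pred = (y_score >= t_star).astype(int)
```

The same `t_star` found on the pooled validation folds is then applied unchanged to each held-out participant, which is what the per-person table below reports.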

### Per-person LOPO (at t* = 0.280)

| Held-out | Acc | F1 | Prec | Rec |
|---|---|---|---|---|
| Abdelrahman | 0.864 | 0.900 | 0.904 | 0.896 |
| Jarek | 0.872 | 0.903 | 0.902 | 0.904 |
| Junhao | 0.890 | 0.901 | 0.841 | 0.971 |
| Kexin | 0.738 | 0.747 | 0.778 | 0.717 |
| Langyuan | 0.655 | 0.677 | 0.548 | 0.888 |
| Mohamed | 0.881 | 0.894 | 0.843 | 0.952 |
| Yingtao | 0.855 | 0.909 | 0.926 | 0.894 |
| Ayten | 0.841 | 0.905 | 0.861 | 0.954 |
| Saba | 0.923 | 0.925 | 0.956 | 0.896 |
| Mean ± std | 0.835 ± 0.080 | 0.862 ± 0.082 | 0.840 ± 0.115 | 0.897 ± 0.070 |

95% CI for mean F1: [0.799, 0.926]
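One common way to obtain such an interval is a nonparametric bootstrap over the nine per-person F1 scores. Whether the project used exactly this procedure is an assumption; the sketch below only illustrates the idea, so its interval will not match the reported one digit-for-digit:

```python
import numpy as np

# Per-person LOPO F1 scores from the table above.
f1_scores = np.array([0.900, 0.903, 0.901, 0.747, 0.677,
                      0.894, 0.909, 0.905, 0.925])

rng = np.random.default_rng(42)
# Resample participants with replacement, recording the mean F1 each time.
boot_means = np.array([
    rng.choice(f1_scores, size=f1_scores.size, replace=True).mean()
    for _ in range(10_000)
])
ci_low, ci_high = np.percentile(boot_means, [2.5, 97.5])
```

With only nine participants the interval is wide, which is why the per-person spread (notably Kexin and Langyuan) matters more than the pooled headline numbers.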

### Feature importance (XGBoost gain)

Top 5: `s_face` (10.27), `ear_right` (9.54), `head_deviation` (8.83), `ear_avg` (6.96), `perclos` (5.68)

## ClearML integration

```bash
USE_CLEARML=1 python -m models.xgboost.train
```

Same ClearML enrichment as the MLP: hyperparameters, per-round scalars, confusion matrices, ROC curves, model registration, dataset stats, and reproducibility artifacts.

## Sweeps

### ClearML HPO (remote)

```bash
USE_CLEARML=1 python -m models.xgboost.sweep
```

Launches a `HyperParameterOptimizer` controller on ClearML that clones the base training task and runs grid/random search across workers.

### Local Optuna sweep

```bash
python -m models.xgboost.sweep_local
```

40-trial TPE sampler, optimising LOPO F1. Search space: `n_estimators` [100, 1000], `max_depth` [3, 10], `learning_rate` [0.01, 0.3], `subsample` [0.6, 1.0], `colsample_bytree` [0.6, 1.0], `reg_alpha`/`reg_lambda` [0, 5].

### Export local sweep results from ClearML

```bash
python -m models.xgboost.fetch_sweep_results
```

Writes `models/xgboost/sweep_results_all_40.csv` using the metrics logged by `sweep_local.py` (`val_loss`, `val_accuracy`, `val_f1`) plus each trial's hyperparameters.

Default export behavior is strict and deterministic:

- only tasks named like `XGBoost Sweep Trial #...`
- only the latest 40 matching tasks
- skips tasks with missing or zero `val_loss` / `val_accuracy` / `val_f1`
- ranks by `val_f1` (then `val_loss`, then `val_accuracy`)

Useful options:

```bash
python -m models.xgboost.fetch_sweep_results --limit 0 --keep-zero-metrics
python -m models.xgboost.fetch_sweep_results --name-prefix "XGBoost Sweep Trial #" --limit 40
python -m models.xgboost.fetch_sweep_results --compute-missing-val-accuracy --sort-by val_f1
```

## Outputs

| File | Location |
|---|---|
| Best model | `checkpoints/xgboost_face_orientation_best.json` |
| Scaler | `checkpoints/scaler_xgboost.joblib` |
| Test predictions | `evaluation/logs/xgboost_test_predictions.csv` |
| Test metrics | `evaluation/logs/xgboost_test_metrics_summary.json` |
| Feature importance | `evaluation/logs/xgboost_feature_importance.json` |