evaluation/
Training logs, threshold and weight analysis, and metrics. The analysis uses LOPO cross-validation (9 folds, one per participant), Youden’s J, and a weight grid search; see justify_thresholds.py.
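As a rough illustration of the LOPO scheme, the sketch below builds one fold per participant, holding out all of that participant's samples for testing. It is not taken from justify_thresholds.py: the function name `lopo_folds` and the participant labels are hypothetical, and the real script may construct its folds differently.

```python
def lopo_folds(participant_ids):
    """Yield (held_out, train_idx, test_idx), one fold per unique participant.

    participant_ids holds one participant label per sample (hypothetical input).
    """
    for held_out in sorted(set(participant_ids)):
        # Train on every sample that does not belong to the held-out participant.
        train_idx = [i for i, p in enumerate(participant_ids) if p != held_out]
        # Test only on the held-out participant's samples.
        test_idx = [i for i, p in enumerate(participant_ids) if p == held_out]
        yield held_out, train_idx, test_idx


# Made-up labels for illustration: three participants, six samples.
participants = ["p1", "p1", "p2", "p3", "p3", "p3"]
folds = list(lopo_folds(participants))
for held_out, train_idx, test_idx in folds:
    print(held_out, train_idx, test_idx)
```

With 9 participants this yields 9 folds, and no participant's samples ever appear in both the training and test side of the same fold.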
Contents: logs/ (JSON from training runs), plots/ (ROC, weight search, EAR/MAR), justify_thresholds.py, feature_importance.py, and the generated markdown reports.
Logs: MLP training writes face_orientation_training_log.json and XGBoost training writes xgboost_face_orientation_training_log.json, both under evaluation/logs/.
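To inspect these logs from a script or notebook, something like the following may be enough. The helper `load_training_logs` is hypothetical and assumes only that each log is a valid JSON file whose name ends in `_training_log.json`; the internal schema of the logs is not documented here, so the sketch makes no assumptions about their keys.

```python
import json
from pathlib import Path


def load_training_logs(log_dir="evaluation/logs"):
    """Return {file name: parsed JSON} for every *_training_log.json in log_dir."""
    logs = {}
    for path in sorted(Path(log_dir).glob("*_training_log.json")):
        with path.open(encoding="utf-8") as fh:
            logs[path.name] = json.load(fh)
    return logs
```

Calling `load_training_logs()` from the repository root should pick up both the MLP and the XGBoost log if they have been written; it returns an empty dict when no matching files exist.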
Threshold report: Generate THRESHOLD_JUSTIFICATION.md and plots with:
python -m evaluation.justify_thresholds
This runs LOPO over 9 participants, Youden’s J, and the weight grid search, and takes roughly 10–15 minutes. Plots are written to plots/ and the report to THRESHOLD_JUSTIFICATION.md.
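For readers unfamiliar with Youden’s J, the statistic is J = TPR − FPR (equivalently sensitivity + specificity − 1), and the chosen threshold is the one that maximises it. The sketch below is a minimal pure-Python version, assuming a sample is predicted positive when its score is at or above the threshold; the function name `youden_threshold` and the example data are hypothetical and not taken from justify_thresholds.py.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximising J = TPR - FPR.

    A sample is predicted positive when score >= threshold. Labels are 0 or 1.
    On ties, the lowest threshold reaching the maximum J is kept.
    """
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present to compute Youden's J")

    best_threshold, best_j = None, float("-inf")
    # Every distinct score is a candidate threshold.
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        j = tp / positives - fp / negatives
        if j > best_j:
            best_threshold, best_j = t, j
    return best_threshold, best_j


# Made-up scores and labels for illustration only.
scores = [0.1, 0.4, 0.35, 0.8]
labels = [0, 0, 1, 1]
threshold, j = youden_threshold(scores, labels)
print(threshold, j)
```

In a LOPO setting the threshold would typically be computed on pooled or per-fold held-out scores; which of the two the script uses is not stated here.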
Feature importance: Run python -m evaluation.feature_importance for full XGBoost gain + leave-one-feature-out LOPO (slow).
Fast iteration mode: python -m evaluation.feature_importance --quick --skip-lofo (channel ablation + gain only).
Grouped benchmark: Run python -m evaluation.grouped_split_benchmark for full run, or python -m evaluation.grouped_split_benchmark --quick for faster approximate numbers.
Who writes here: models.mlp.train, models.xgboost.train, evaluation.justify_thresholds, evaluation.feature_importance, and the notebooks.