Space Trainer Validation Log
Date (UTC): 2026-02-28 10:24:36 UTC
Scope Reviewed
Reviewed the full space_trainer/ implementation surface used by the Hugging Face Space runtime:
- space_trainer/app.py
- space_trainer/README.md
- space_trainer/PRODUCTION.md
- space_trainer/.env.example
- space_trainer/requirements.txt
- space_trainer/configs/math_conjecture_sota.yaml
- space_trainer/scripts/preflight_check.py
- space_trainer/scripts/train_sota.py
- space_trainer/scripts/eval_sota.py
- space_trainer/tests/test_core_utils.py
- Existing workspace runtime/run artifacts under space_trainer/workspace/
Issues Found
- UI result badge mapping treated "preflight passed" as neutral because "_" was converted to spaces before class lookup.
- Unit tests failed when run from repository root due to import path assumptions (ModuleNotFoundError: app).
Fixes Applied
space_trainer/app.py
- Normalized run result strings in _run_result_badge_class() to handle underscore/space/hyphen variants.
- Updated recent-runs badge rendering to classify by raw result key and only prettify the display label.
- Kept Gradio theme/css/head in launch() (Gradio 6.6 recommended path), and set queue configuration once at module load with demo.queue(default_concurrency_limit=1).
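The normalization described above can be illustrated with a minimal sketch. The function names, the class names (badge-ok and so on), and the result keys other than the preflight one are hypothetical and are not taken from app.py; only the idea of collapsing underscore/space/hyphen variants before lookup, and prettifying the label separately, comes from this log.

```python
import re

# Hypothetical mapping from canonical result keys to CSS classes.
_BADGE_CLASSES = {
    "preflight_passed": "badge-ok",
    "completed": "badge-ok",
    "failed": "badge-fail",
    "cancelled": "badge-warn",
}


def normalize_result_key(raw: str) -> str:
    """Collapse runs of spaces, hyphens, and underscores into single underscores."""
    return re.sub(r"[\s_\-]+", "_", raw.strip().lower()).strip("_")


def run_result_badge_class(raw: str) -> str:
    """Classify by the normalized key; unknown results stay neutral."""
    return _BADGE_CLASSES.get(normalize_result_key(raw), "badge-neutral")


def display_label(raw: str) -> str:
    """Prettify only the label shown to the user, never the lookup key."""
    return normalize_result_key(raw).replace("_", " ").title()
```

Keeping the lookup key and the display label as two separate derivations of the raw value is what prevents the original bug: the prettified string never reaches the class lookup.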
space_trainer/tests/test_core_utils.py
- Added deterministic sys.path insertion for space_trainer/ root so tests pass from both:
  - repo root (python -m unittest discover -s space_trainer/tests -v)
  - space_trainer/ directory (python -m unittest discover -s tests -v)
- Added regression test for preflight badge-class normalization.
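A deterministic sys.path insertion of the kind described above might look like the following sketch. The helper name ensure_on_sys_path and the assumed layout (tests living one directory below the package root) are illustrative assumptions, not the code in test_core_utils.py.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(root: Path) -> str:
    """Put `root` at the front of sys.path exactly once and return its string form."""
    resolved = str(root.resolve())
    # Drop earlier copies so repeated calls keep the position deterministic.
    sys.path[:] = [entry for entry in sys.path if entry != resolved]
    sys.path.insert(0, resolved)
    return resolved


# In a test module at <root>/tests/test_*.py (assumed layout), the package root
# is the parent of the tests directory:
# ensure_on_sys_path(Path(__file__).resolve().parents[1])
```

Because the path is derived from the test file's own location rather than the current working directory, the import resolves the same way from the repo root and from space_trainer/.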
Validation Commands and Results
- Preflight checks:
  - Command: .venv/bin/python space_trainer/scripts/preflight_check.py --json
  - Result: PASS ("ok": true)
- Unit tests from repo root:
  - Command: .venv/bin/python -m unittest discover -s space_trainer/tests -v
  - Result: PASS (Ran 15 tests, OK)
- Unit tests from space_trainer/:
  - Command: ../.venv/bin/python -m unittest discover -s tests -v
  - Result: PASS (Ran 15 tests, OK)
- Python syntax compile check:
  - Command: ../.venv/bin/python -m py_compile app.py scripts/preflight_check.py scripts/train_sota.py scripts/eval_sota.py tests/test_core_utils.py
  - Result: PASS
- Gradio app object/config smoke check:
  - Command: ../.venv/bin/python - <<'PY' ... app.demo.get_config_file() ... PY
  - Result: PASS (mode=blocks, components=44, dependencies=3, queue_set=True)
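For automating the preflight check above, a small helper could decide PASS/FAIL from the script's JSON output. This is a sketch under the assumption that preflight_check.py --json prints a single JSON object with a top-level "ok" field; only the "ok": true fragment is confirmed by this log, and the helper name is hypothetical.

```python
import json


def preflight_passed(stdout: str) -> bool:
    """Return True only if stdout parses as a JSON object whose "ok" field is true."""
    try:
        report = json.loads(stdout)
    except json.JSONDecodeError:
        return False
    return isinstance(report, dict) and report.get("ok") is True
```

The strict `is True` comparison means a truthy but non-boolean value such as "yes" would not count as a pass.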
Environment Notes
- CUDA warning appears in this environment (cudaGetDeviceCount OS unsupported). This is expected on non-GPU hosts and handled by app CPU fallback logic.
- Fast tokenizer fallback warning (protobuf missing) is already handled by project fallback code and validated by tests.
- Direct local app.py server launch in this sandbox cannot bind any Gradio ports (Cannot find empty port...). This is an execution-environment limitation, not a code-level validation failure.
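A CPU fallback of the kind mentioned in the first note is commonly written along these lines. This is a generic sketch, not the app's actual fallback logic; it assumes PyTorch is the framework in use, and pick_device is a hypothetical name.

```python
def pick_device() -> str:
    """Return "cuda" when a usable GPU is visible, otherwise "cpu"."""
    try:
        import torch  # assumed framework; treated as optional in this sketch
    except ImportError:
        return "cpu"
    try:
        return "cuda" if torch.cuda.is_available() else "cpu"
    except Exception:
        # Treat any failure while probing CUDA as "no GPU available".
        return "cpu"
```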
Current Status
- UI telemetry classification bug fixed.
- Test reliability improved.
- Preflight + tests + compile checks are passing.
- Space runtime code path is consistent and ready for deployment validation inside Hugging Face Spaces.
Rewrite Session
Date (UTC): 2026-02-28 11:56:17 UTC
Objective
- Reprogram app.py from scratch.
- Switch UI to a full monochrome theme.
- Preserve full end-to-end pipeline functionality in a newly structured implementation.
Implementation Summary
- Replaced space_trainer/app.py entirely with a new architecture and new UI/CSS/HTML structure.
- Kept all major operational capabilities:
  - dataset download and cache handling
  - runtime config generation
  - staged training subprocess orchestration
  - optional post-training evaluation fallback path
  - quality gate + push status surfacing
  - continuous auto-restart with cooldown and circuit breaker
  - cancellation controls
  - run history persistence and recent-runs panel
- Kept compatibility for existing tests and tooling contracts (e.g., helper function names used by tests and preflight checks).
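To make the "continuous auto-restart with cooldown and circuit breaker" capability concrete, here is a minimal sketch of such a loop. All names, the default cooldown, and the failure threshold are assumptions for illustration; the log does not state how app.py implements it.

```python
import time
from typing import Callable


def run_continuously(
    run_once: Callable[[], bool],
    should_stop: Callable[[], bool],
    cooldown_s: float = 30.0,
    max_consecutive_failures: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Restart run_once until cancelled or the circuit breaker trips.

    Returns "cancelled" or "circuit_open".
    """
    consecutive_failures = 0
    while not should_stop():
        succeeded = run_once()
        # A success resets the breaker; failures accumulate.
        consecutive_failures = 0 if succeeded else consecutive_failures + 1
        if consecutive_failures >= max_consecutive_failures:
            return "circuit_open"
        sleep(cooldown_s)  # cooldown between restarts
    return "cancelled"
```

Injecting `sleep` and `should_stop` as callables keeps the loop testable without real delays and lets a cancellation control flip the stop condition between runs.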
Monochrome Redesign
- New monochrome command-center visual language with grayscale-only palette.
- New telemetry card layout, stage timeline, recent-runs view, and loss sparkline styling.
- New hero header and runtime timestamp script in UI_HEAD.
Verification Executed
- Syntax check: ../.venv/bin/python -m py_compile app.py
  - Result: PASS
- Preflight: ../.venv/bin/python scripts/preflight_check.py --json
  - Result: PASS ("ok": true)
- Unit tests: ../.venv/bin/python -m unittest discover -s tests -v
  - Result: PASS (Ran 15 tests, OK)
- Gradio config smoke check: ../.venv/bin/python - <<'PY' ... app.demo.get_config_file() ... PY
  - Result: PASS (mode=blocks, components=44, dependencies=3, stage_count=4)
Footer + Continuous Enforcement Session
Date (UTC): 2026-02-28 12:45:36 UTC
Requested Changes
- Remove default Gradio footer controls (Use via API, logo, settings) from footer area.
- Place API/settings access in a better UI location.
- Ensure training runs in continuous mode.
Implementation
- Footer controls removed from Gradio launch:
  - Added footer_links=[] in demo.launch(...).
- API/settings moved into hero section:
  - Added .mono-link-row with:
    - /gradio_api/docs
    - https://huggingface.co/spaces/NorthernTribe-Research/math_trainer/settings
  - Added matching CSS styles for the new header links.
- Continuous mode enforced:
  - Runtime enforcement in run_pipeline(...): continuous_mode = not bool(preflight_only)
  - UI control locked to enforced-on: Continuous Auto-Restart (Enforced) with interactive=False.
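The runtime enforcement rule above reduces to a single expression. The sketch below isolates it in a hypothetical helper (resolve_continuous_mode is not a name from app.py) to show how non-boolean inputs such as None are handled.

```python
def resolve_continuous_mode(preflight_only: object) -> bool:
    """Continuous mode is on for every run except preflight-only runs."""
    # bool() coerces None, empty strings, and 0 to False before the negation.
    return not bool(preflight_only)
```

Since the result depends only on preflight_only, whatever value the locked checkbox reports has no effect on the run.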
Verification
- ../.venv/bin/python -m py_compile app.py -> PASS
- ../.venv/bin/python scripts/preflight_check.py --json -> PASS
- ../.venv/bin/python -m unittest discover -s tests -v -> PASS (Ran 15 tests, OK)
Deployment
- Space: NorthernTribe-Research/math_trainer
- Commit: c8a24f966d710173764da0355e56632af9e66c40
- Runtime after deploy: RUNNING
  - https://northerntribe-research-math-trainer.hf.space/config -> 200 JSON