Commit 4393177 by ArchCoder · verified · 1 parent: 8f5aed5

Update README.md

  1. README.md +76 -6
README.md CHANGED
@@ -1,11 +1,81 @@
  ---
- title: Openenv
- emoji: πŸ‘
  colorFrom: yellow
- colorTo: indigo
  sdk: docker
- pinned: false
- license: mit
  ---

- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
  ---
+ title: PyTorch Debug Env
+ emoji: 🔥
  colorFrom: yellow
+ colorTo: red
  sdk: docker
+ app_port: 7860
+ short_description: RL environment for diagnosing broken PyTorch training jobs
+ tags:
+ - openenv
+ - pytorch
+ - reinforcement-learning
+ - debugging
+ - ml-training
+ - agent
+ pinned: true
  ---

+ # PyTorch Debug Env 🔥
+
+ An [OpenEnv](https://meta-pytorch.org/OpenEnv/) environment, built for the **Meta PyTorch Hackathon**, in which an AI agent investigates and diagnoses broken PyTorch training jobs.
+
+ ## Quick Start
+
+ ```python
+ from openenv import AutoEnv, AutoAction
+
+ env = AutoEnv.from_env("ArchCoder/pytorch-debug-env")
+ Action = AutoAction.from_env("ArchCoder/pytorch-debug-env")
+
+ with env.sync() as client:
+     # Start a new episode on the easy task.
+     result = client.reset(task_id="easy")
+     # Submit an intermediate hypothesis without committing a final diagnosis.
+     action = Action(
+         current_hypothesis={
+             "bug_type": "missing_zero_grad",
+             "affected_file": "train.py",
+             "confidence": 0.7
+         },
+         commit_diagnosis=False
+     )
+     step_result = client.step(action)
+ ```
+
+ ## API Endpoints
+
+ | Endpoint | Method | Description |
+ |----------|--------|-------------|
+ | `/` | GET | Environment info |
+ | `/health` | GET | Health check |
+ | `/reset?task_id=easy` | POST | Start new episode |
+ | `/step` | POST | Submit hypothesis + action |
+ | `/state` | GET | Current episode state |
+
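As a rough illustration of how the endpoints in the table could be called over plain HTTP, here is a minimal sketch using only the Python standard library. The base URL is a placeholder (only the port `7860` comes from the Space configuration), and the `/step` request body is an assumption that mirrors the fields of the Quick Start action; the actual request and response schemas are not documented in this README.

```python
import json
from urllib import parse, request

# Assumption: placeholder base URL; replace it with the address of your running Space or container.
BASE_URL = "http://localhost:7860"


def build_request(path, method="GET", query=None, body=None):
    """Build (but do not send) a urllib Request for one of the endpoints in the table."""
    url = BASE_URL + path
    if query:
        url += "?" + parse.urlencode(query)
    data = json.dumps(body).encode("utf-8") if body is not None else None
    headers = {"Content-Type": "application/json"} if data is not None else {}
    return request.Request(url, data=data, headers=headers, method=method)


def send(req, timeout=10):
    """Send a prepared request and decode a JSON response (the response schema is not documented here)."""
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


health_req = build_request("/health")
reset_req = build_request("/reset", method="POST", query={"task_id": "easy"})
# Assumption: the /step body mirrors the fields of the Quick Start action.
step_req = build_request(
    "/step",
    method="POST",
    body={
        "current_hypothesis": {
            "bug_type": "missing_zero_grad",
            "affected_file": "train.py",
            "confidence": 0.7,
        },
        "commit_diagnosis": False,
    },
)
```

Calling `send(reset_req)` followed by `send(step_req)` against a running instance would exercise one reset and one step; building the requests separately keeps the sketch inspectable without a live server.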
+ ## Tasks
+
+ | Task | Difficulty | Description |
+ |------|-----------|-------------|
+ | `easy` | ⭐ | Single-file bug: missing `zero_grad`, wrong loss |
+ | `medium` | ⭐⭐ | Multi-file root cause: data leakage, scheduler mismatch |
+ | `hard` | ⭐⭐⭐ | Silent failure: memory leak, AMP overflow, red herrings |
+
+ ## Reward Structure
+
+ - **Hypothesis delta** (60%): reward for improving your bug hypothesis each step
+ - **Investigation** (20%): reward for inspecting the right files
+ - **Final diagnosis** (20%): accuracy of the committed diagnosis against the ground truth
+
+ Scores range from `0.0` to `1.0`. On hard tasks, partial credit is awarded for identifying the correct bug category.
+
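One way to read the percentages above is as weights in a linear combination of three component scores. The sketch below assumes exactly that: each component is a number in `[0.0, 1.0]`, and the total is their weighted sum. The names `WEIGHTS` and `combine_reward` are hypothetical, and the environment's actual scoring code may differ.

```python
# Weights come from the list above; combining them linearly is an assumption.
WEIGHTS = {"hypothesis_delta": 0.6, "investigation": 0.2, "final_diagnosis": 0.2}


def combine_reward(components):
    """Return the weighted sum of component scores, each clamped to [0.0, 1.0]."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        # Missing components count as 0.0; out-of-range values are clamped.
        score = min(1.0, max(0.0, components.get(name, 0.0)))
        total += weight * score
    return total


# Half credit on hypothesis improvement, full credit on investigation, no final diagnosis yet.
example = combine_reward(
    {"hypothesis_delta": 0.5, "investigation": 1.0, "final_diagnosis": 0.0}
)
```

Because the weights sum to 1 and each component is clamped, the combined value stays within the documented `0.0` to `1.0` range.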
+ ## Environment State
+
+ Each episode provides a synthetic PyTorch repo with:
+ - Source files (`train.py`, `model/`, `data/`, `config/`)
+ - Loss curves and GPU memory profiles
+ - Training logs with realistic noise and red herrings
+
+ The agent reveals files progressively across up to 5–6 steps, refining its hypothesis before committing a final diagnosis.
+
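The refine-then-commit pattern described above can be sketched as a bounded loop. Everything in this example is illustrative: `StubClient` is a hypothetical stand-in that only counts steps, the action is a plain dict mirroring the Quick Start fields, and the idea that an episode ends once a diagnosis is committed is an assumption.

```python
MAX_STEPS = 6  # upper end of the "5–6 steps" mentioned above


class StubClient:
    """Hypothetical stand-in for the real client; it only counts steps."""

    def __init__(self):
        self.steps = 0

    def reset(self, task_id):
        self.steps = 0
        return {"task_id": task_id, "done": False}

    def step(self, action):
        self.steps += 1
        # Assumption: the episode ends when a diagnosis is committed.
        return {"done": action["commit_diagnosis"], "step": self.steps}


def run_episode(client, task_id="easy", max_steps=MAX_STEPS):
    """Submit one hypothesis per step and commit the diagnosis on the last allowed step."""
    client.reset(task_id)
    history = []
    for step in range(1, max_steps + 1):
        action = {
            "current_hypothesis": {
                "bug_type": "missing_zero_grad",
                "affected_file": "train.py",
                # Placeholder: confidence grows as more files are inspected.
                "confidence": round(step / max_steps, 2),
            },
            # Commit only on the final allowed step.
            "commit_diagnosis": step == max_steps,
        }
        result = client.step(action)
        history.append(result)
        if result["done"]:
            break
    return history


history = run_episode(StubClient())
```

A real agent would replace the fixed hypothesis with one updated from the files and logs revealed at each step; the loop structure and the step budget are the only parts taken from the description.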
+ ## Author
+
+ **Priyansh Saxena**, IIIT Gwalior