---
title: SQLEnv
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
base_path: /web
---
# SQLEnv: Teaching Agents to Explore Databases


SQLEnv is an interactive RL environment for text-to-SQL reasoning. Instead of producing one-shot SQL, agents learn to think like data analysts: inspect schema, sample rows, run exploratory queries, and submit a final answer with confidence.

Built for the [OpenEnv Challenge](https://github.com/meta-pytorch/OpenEnv), this project packages environment runtime, dense rewards, evaluation, and training hooks so others can reproduce results and iterate quickly.
## Quick Start
Run these three commands to install, validate, and smoke-test the environment:
```bash
uv sync
uv run openenv validate --verbose
uv run pytest tests/ -v
```
Local server run:
```bash
uv run uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
```
Docker run:
```bash
docker build -t sql-env:latest -f server/Dockerfile .
docker run -p 8000:8000 sql-env:latest
```
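
Once either server is up, a small readiness probe can confirm it responds before an agent connects. The sketch below assumes the `/health` endpoint referenced elsewhere in this README answers with HTTP 200 when the server is ready; the helper names `health_url` and `is_healthy` are illustrative and not part of the project.

```python
import urllib.error
import urllib.request


def health_url(base_url):
    """Build the health-check URL for a server base URL."""
    return base_url.rstrip("/") + "/health"


def is_healthy(base_url, timeout=5.0):
    """Return True if GET <base_url>/health answers with HTTP 200.

    Assumption: a ready server replies 200 on /health. Connection errors,
    timeouts, and non-2xx responses are all treated as "not healthy".
    """
    try:
        with urllib.request.urlopen(health_url(base_url), timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        return False
```

For the local or Docker run above, `is_healthy("http://localhost:8000")` would be the natural call; for a deployed Space, pass the Space URL instead.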
## Why SQLEnv
Static text-to-SQL benchmarks reward final outputs, not reasoning quality. SQLEnv turns SQL generation into an interactive decision process with feedback at each step, making it suitable for RL training and behavior analysis.
## Architecture
```text
+-------------+      WebSocket       +----------------------+      SQLite
| RL Agent    | <------------------> | SQLEnvClient         | <----------------+
| (GRPO/TRL)  |                      | (client.py)          |                  |
+-------------+                      +----------+-----------+                  |
                                         HTTP/WebSocket                        |
                                                |                              |
                                                v                              |
                                   +--------------------------+                |
                                   | FastAPI Server           |                |
                                   | (server.app:app)         |                |
                                   +------------+-------------+                |
                                                |                              |
                                                v                              |
                                   +--------------------------+                |
                                   | SQLEnvironment           |----------------+
                                   | step/reset/reward/verify |
                                   +--------------------------+
```
## How It Works
Each episode begins with a natural language question mapped to a hidden Spider database. The agent acts through four environment actions:

| Action | Purpose | Typical Output |
|--------|---------|----------------|
| `DESCRIBE table_name` | Inspect schema and column metadata | Column names, types, row count |
| `SAMPLE table_name` | Inspect representative rows | Small row sample |
| `QUERY sql_string` | Execute read-only SQL in sandbox | Query result rows or SQL error |
| `ANSWER value` | Submit final answer | Terminal reward and completion |

Episode flow:
1. `reset()` returns question context and available tables.
2. `step()` executes one exploration action at a time.
3. `ANSWER` ends the episode with correctness-based terminal reward.
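
The episode flow can be illustrated with a toy, in-memory stand-in. This is a sketch under stated assumptions, not the project's `SQLEnvironment`: the `ToyEnv` class, the `singer` table, the observation dictionary keys, and the 0/1 terminal reward are all invented for illustration, and the real environment's dense step rewards are not modeled. Read-only behavior is approximated with SQLite's `PRAGMA query_only`.

```python
import sqlite3


class ToyEnv:
    """Hypothetical stand-in for illustration; not the project's SQLEnvironment."""

    def __init__(self, question, gold_answer):
        self.question = question
        self.gold_answer = str(gold_answer)
        self.done = False
        self.conn = sqlite3.connect(":memory:")
        # Made-up data; the real environment loads Spider databases.
        self.conn.executescript(
            """
            CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
            INSERT INTO singer (name, age) VALUES ('Ana', 31), ('Bo', 45), ('Cy', 27);
            """
        )
        # From here on SQLite rejects writes, approximating a read-only sandbox.
        self.conn.execute("PRAGMA query_only = ON")

    def _tables(self):
        rows = self.conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
        return [name for (name,) in rows]

    def reset(self):
        """Start an episode: return the question and the available tables."""
        self.done = False
        return {"question": self.question, "tables": self._tables()}

    def step(self, action):
        """Execute one action string such as 'DESCRIBE singer'."""
        if self.done:
            raise RuntimeError("episode is over; call reset()")
        verb, _, arg = action.strip().partition(" ")
        verb, arg = verb.upper(), arg.strip()
        if verb == "ANSWER":
            # Terminal step: reward depends only on correctness in this toy.
            self.done = True
            reward = 1.0 if arg == self.gold_answer else 0.0
            return {"result": None, "reward": reward, "done": True}
        if verb in ("DESCRIBE", "SAMPLE"):
            # Only known table names are interpolated into SQL.
            if arg not in self._tables():
                return {"result": f"unknown table: {arg}", "reward": 0.0, "done": False}
            sql = (
                f"PRAGMA table_info({arg})"
                if verb == "DESCRIBE"
                else f"SELECT * FROM {arg} LIMIT 3"
            )
        elif verb == "QUERY":
            sql = arg
        else:
            return {"result": f"unknown action: {verb}", "reward": 0.0, "done": False}
        try:
            result = self.conn.execute(sql).fetchall()
        except sqlite3.Error as exc:
            # SQL errors are returned as observations instead of raised.
            result = f"SQL error: {exc}"
        return {"result": result, "reward": 0.0, "done": False}


env = ToyEnv("How many singers are there?", gold_answer=3)
print(env.reset())
for action in ["DESCRIBE singer", "SAMPLE singer",
               "QUERY SELECT COUNT(*) FROM singer", "ANSWER 3"]:
    print(action, "->", env.step(action))
```

Returning SQL errors as observations, instead of raising them, keeps a failed query inside the episode so the agent can read the message and try again.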
## Train an Agent
Use the GRPO training pipeline artifacts from F006 and run the notebook workflow:
- Notebook: `notebooks/train_grpo.ipynb`
- Training support modules: `training/`
- Evaluation utilities: `evaluation/`

This setup is designed for Colab and local CPU/GPU environments.
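
For readers new to GRPO, its core idea is that each sampled completion is scored relative to the other completions drawn for the same prompt. The sketch below shows that group-relative normalization in plain Python. It is not the project's training code: the function name and `eps` value are illustrative, and it uses the population standard deviation, whereas library implementations may use a different variant.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize one group's episode rewards to zero mean and unit scale.

    `rewards` holds the total reward of each completion sampled for the
    same question. `eps` avoids division by zero when all rewards match.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Two of four sampled episodes answered correctly.
print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

When every episode in a group earns the same reward, all advantages are zero and that group contributes no learning signal, which is one reason per-step feedback can help when correct answers are rare.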
## HuggingFace Space
- Live Space: `https://huggingface.co/spaces/<your-org-or-user>/sql-env` (update after push)
- Health check: `curl https://<space-url>/health`
- Deploy command: `uv run openenv push`
## Project Structure
```text
sql-env/
|- __init__.py
|- client.py
|- models.py
|- openenv.yaml
|- server/
| |- app.py
| |- sql_environment.py
| |- reward.py
| |- verifier.py
| `- Dockerfile
|- data/
| |- databases/
| `- questions/
|- training/
|- evaluation/
|- notebooks/
| `- train_grpo.ipynb
|- specs/
|- docs/
`- tests/
```
## Deployment Checklist
1. `uv run openenv validate --verbose`
2. `uv run openenv build`
3. `uv run openenv push`
4. Verify `/health` and run one full episode through the client.
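
The three command steps of the checklist can be scripted so that a failure stops the sequence before anything is pushed. This is a minimal sketch: the commands are the ones listed above, while `CHECKLIST` and `run_checklist` are hypothetical names, and the injectable `runner` parameter exists only to make the helper testable without invoking `uv`.

```python
import subprocess

# Steps 1-3 of the checklist, in order.
CHECKLIST = [
    ["uv", "run", "openenv", "validate", "--verbose"],
    ["uv", "run", "openenv", "build"],
    ["uv", "run", "openenv", "push"],
]


def run_checklist(commands, runner=subprocess.run):
    """Run commands in order; return the first failing command, or None.

    `runner` must accept an argument list and return an object with a
    `returncode` attribute, as `subprocess.run` does.
    """
    for cmd in commands:
        completed = runner(cmd)
        if completed.returncode != 0:
            return cmd  # stop here so later steps (e.g. push) never run
    return None
```

Calling `run_checklist(CHECKLIST)` from the repository root runs the real commands; step 4 (checking `/health` and running a full episode) still needs to be done against the deployed Space.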
## Links
- OpenEnv framework: https://github.com/meta-pytorch/OpenEnv
- OpenEnv docs: https://meta-pytorch.org/OpenEnv/
- Spider dataset: https://huggingface.co/datasets/xlangai/spider
- TRL OpenEnv docs: https://huggingface.co/docs/trl/openenv
- Verification plan: `specs/F007-VERIFICATION_SPEC.md`