---
title: SQLEnv
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
base_path: /web
---

# SQLEnv: Teaching Agents to Explore Databases

![Python](https://img.shields.io/badge/python-3.12-blue.svg)
![License](https://img.shields.io/badge/license-MIT-green.svg)

SQLEnv is an interactive RL environment for text-to-SQL reasoning. Instead of producing one-shot SQL, agents learn to think like data analysts: inspect schema, sample rows, run exploratory queries, and submit a final answer with confidence.

Built for the [OpenEnv Challenge](https://github.com/meta-pytorch/OpenEnv), this project packages environment runtime, dense rewards, evaluation, and training hooks so others can reproduce results and iterate quickly.

## Quick Start

Run these three commands to install, validate, and smoke-test the environment:

```bash
uv sync
uv run openenv validate --verbose
uv run pytest tests/ -v
```

Run the server locally:

```bash
uv run uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
```

Run the server with Docker:

```bash
docker build -t sql-env:latest -f server/Dockerfile .
docker run -p 8000:8000 sql-env:latest
```

## Why SQLEnv

Static text-to-SQL benchmarks reward final outputs, not reasoning quality. SQLEnv turns SQL generation into an interactive decision process with feedback at each step, making it suitable for RL training and behavior analysis.

## Architecture

```text
+-------------+      WebSocket       +----------------------+      SQLite
| RL Agent    | <------------------> | SQLEnvClient         | <----------------+
| (GRPO/TRL)  |                      | (client.py)          |                 |
+-------------+                      +----------+-----------+                 |
                                              HTTP/WebSocket                  |
                                                     |                         |
                                                     v                         |
                                       +--------------------------+            |
                                       | FastAPI Server           |            |
                                       | (server.app:app)         |            |
                                       +------------+-------------+            |
                                                    |                          |
                                                    v                          |
                                       +--------------------------+            |
                                       | SQLEnvironment           |------------+
                                       | step/reset/reward/verify |
                                       +--------------------------+
```

## How It Works

Each episode begins with a natural-language question mapped to a hidden Spider database. The agent interacts with the environment through four actions:

| Action | Purpose | Typical Output |
|--------|---------|----------------|
| `DESCRIBE table_name` | Inspect schema and column metadata | Column names, types, row count |
| `SAMPLE table_name` | Inspect representative rows | Small row sample |
| `QUERY sql_string` | Execute read-only SQL in sandbox | Query result rows or SQL error |
| `ANSWER value` | Submit final answer | Terminal reward and completion |
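
To make the exploration actions concrete, here is a minimal sketch of how `DESCRIBE`, `SAMPLE`, and `QUERY` could map onto SQLite calls. It is an illustration under stated assumptions, not the code in `server/sql_environment.py`: the helper names (`make_demo_db`, `describe`, `sample`, `query`), the demo `singer` table, and the use of `PRAGMA query_only` as the read-only guard are all hypothetical choices made for this example.

```python
import sqlite3


def make_demo_db() -> sqlite3.Connection:
    """Build a tiny in-memory database standing in for a hidden Spider database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE singer (id INTEGER PRIMARY KEY, name TEXT, age INTEGER);
        INSERT INTO singer (name, age) VALUES ('Ana', 31), ('Bo', 45), ('Cy', 27);
        """
    )
    # Assumption: writes are blocked with query_only; the real sandbox may differ.
    conn.execute("PRAGMA query_only = ON")
    return conn


def _check_table(conn: sqlite3.Connection, table: str) -> str:
    """Only accept table names that actually exist, so they are safe to interpolate."""
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
    if table not in {row[0] for row in rows}:
        raise ValueError(f"unknown table: {table}")
    return table


def describe(conn: sqlite3.Connection, table: str) -> dict:
    """DESCRIBE: column names, declared types, and row count."""
    table = _check_table(conn, table)
    info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
    columns = [(row[1], row[2]) for row in info]  # (name, declared type)
    row_count = conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
    return {"columns": columns, "row_count": row_count}


def sample(conn: sqlite3.Connection, table: str, limit: int = 2) -> list:
    """SAMPLE: a small number of rows from the table."""
    table = _check_table(conn, table)
    return conn.execute(f'SELECT * FROM "{table}" LIMIT ?', (limit,)).fetchall()


def query(conn: sqlite3.Connection, sql: str):
    """QUERY: result rows on success, or the SQL error as text."""
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as exc:
        return f"SQL error: {exc}"
```

With this sketch, `query(conn, "DELETE FROM singer")` returns an error string instead of modifying data, because SQLite rejects writes while `query_only` is on.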

Episode flow:
1. `reset()` returns question context and available tables.
2. `step()` executes one exploration action at a time.
3. `ANSWER` ends the episode with correctness-based terminal reward.
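
The episode flow above can be sketched as a simple loop. Everything in this example is a hypothetical stand-in: `ToyEnv`, `StepResult`, and `run_episode` are not part of the SQLEnv API, the toy supports only `QUERY` and `ANSWER`, and it returns a reward of 0.0 for every non-terminal step, whereas the real environment provides dense step rewards.

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class StepResult:
    observation: str
    reward: float
    done: bool


class ToyEnv:
    """Hypothetical stand-in for the real environment; not the SQLEnv API."""

    def __init__(self) -> None:
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            "CREATE TABLE singer (name TEXT, age INTEGER);"
            "INSERT INTO singer VALUES ('Ana', 31), ('Bo', 45), ('Cy', 27);"
        )
        self.question = "How many singers are older than 30?"
        self.gold = "2"

    def reset(self) -> StepResult:
        # Step 1: return the question context and the available tables.
        return StepResult(f"{self.question} Tables: singer", 0.0, False)

    def step(self, action: str) -> StepResult:
        verb, _, arg = action.partition(" ")
        if verb == "ANSWER":
            # Step 3: terminal reward depends only on correctness in this toy.
            reward = 1.0 if arg.strip() == self.gold else 0.0
            return StepResult("episode finished", reward, True)
        if verb == "QUERY":
            # Step 2: one exploration action per call.
            try:
                rows = self.conn.execute(arg).fetchall()
                return StepResult(str(rows), 0.0, False)
            except sqlite3.Error as exc:
                return StepResult(f"SQL error: {exc}", 0.0, False)
        return StepResult(f"unsupported action in this toy: {verb}", 0.0, False)


def run_episode(env: ToyEnv, actions: list[str]) -> list[StepResult]:
    """Reset, then apply actions one at a time until the episode ends."""
    result = env.reset()
    trajectory = [result]
    for action in actions:
        result = env.step(action)
        trajectory.append(result)
        if result.done:
            break
    return trajectory
```

A scripted run such as `run_episode(ToyEnv(), ["QUERY SELECT COUNT(*) FROM singer WHERE age > 30", "ANSWER 2"])` stops as soon as `ANSWER` is submitted; any actions listed after it are never executed.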

## Train an Agent

Use the GRPO training pipeline artifacts from F006 and run the notebook workflow:

- Notebook: `notebooks/train_grpo.ipynb`
- Training support modules: `training/`
- Evaluation utilities: `evaluation/`

This setup is designed for Colab and local CPU/GPU environments.

## HuggingFace Space

- Live Space: `https://huggingface.co/spaces/<your-org-or-user>/sql-env` (update after push)
- Health check: `curl https://<space-url>/health`
- Deploy command: `uv run openenv push`

## Project Structure

```text
sql-env/
|- __init__.py
|- client.py
|- models.py
|- openenv.yaml
|- server/
|  |- app.py
|  |- sql_environment.py
|  |- reward.py
|  |- verifier.py
|  `- Dockerfile
|- data/
|  |- databases/
|  `- questions/
|- training/
|- evaluation/
|- notebooks/
|  `- train_grpo.ipynb
|- specs/
|- docs/
`- tests/
```

## Deployment Checklist

1. `uv run openenv validate --verbose`
2. `uv run openenv build`
3. `uv run openenv push`
4. Verify `/health` and run one full episode through the client.
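
For step 4, the health check can also be scripted. The sketch below uses only the Python standard library and assumes nothing about the response body: it treats any HTTP 200 from `/health` as healthy. The function name `is_healthy` is hypothetical.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, TimeoutError, ConnectionError):
        # Non-2xx responses, refused connections, and timeouts all count as unhealthy.
        return False
```

Against a local server started as in Quick Start, the call would be `is_healthy("http://localhost:8000")`.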

## Links

- OpenEnv framework: https://github.com/meta-pytorch/OpenEnv
- OpenEnv docs: https://meta-pytorch.org/OpenEnv/
- Spider dataset: https://huggingface.co/datasets/xlangai/spider
- TRL OpenEnv docs: https://huggingface.co/docs/trl/openenv
- Verification plan: `specs/F007-VERIFICATION_SPEC.md`