---
title: SQLEnv
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
base_path: /web
---
# SQLEnv: Teaching Agents to Explore Databases
SQLEnv is an interactive RL environment for text-to-SQL reasoning. Instead of producing one-shot SQL, agents learn to work like data analysts: inspect the schema, sample rows, run exploratory queries, and submit a final answer with confidence.
Built for the OpenEnv Challenge, this project packages the environment runtime, dense rewards, evaluation, and training hooks so others can reproduce results and iterate quickly.
## Quick Start

Run these three commands to install, validate, and smoke-test the environment:

```bash
uv sync
uv run openenv validate --verbose
uv run pytest tests/ -v
```
Run the server locally:

```bash
uv run uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
```
Or run it in Docker:

```bash
docker build -t sql-env:latest -f server/Dockerfile .
docker run -p 8000:8000 sql-env:latest
```
## Why SQLEnv
Static text-to-SQL benchmarks reward final outputs, not reasoning quality. SQLEnv turns SQL generation into an interactive decision process with feedback at each step, making it suitable for RL training and behavior analysis.
## Architecture

```text
+-------------+      WebSocket       +----------------------+     SQLite
|  RL Agent   | <------------------> |     SQLEnvClient     | <----------------+
| (GRPO/TRL)  |                      |     (client.py)      |                  |
+-------------+                      +----------+-----------+                  |
                                      HTTP/WebSocket                           |
                                                |                              |
                                                v                              |
                                 +--------------------------+                  |
                                 |      FastAPI Server      |                  |
                                 |     (server.app:app)     |                  |
                                 +------------+-------------+                  |
                                              |                                |
                                              v                                |
                                 +--------------------------+                  |
                                 |      SQLEnvironment      |------------------+
                                 | step/reset/reward/verify |
                                 +--------------------------+
```
## How It Works
Each episode begins with a natural language question mapped to a hidden Spider database. The agent acts through four environment actions:
| Action | Purpose | Typical Output |
|---|---|---|
| `DESCRIBE table_name` | Inspect schema and column metadata | Column names, types, row count |
| `SAMPLE table_name` | Inspect representative rows | Small row sample |
| `QUERY sql_string` | Execute read-only SQL in a sandbox | Query result rows or SQL error |
| `ANSWER value` | Submit final answer | Terminal reward and completion |
Episode flow:

- `reset()` returns the question context and available tables.
- `step()` executes one exploration action at a time.
- `ANSWER` ends the episode with a correctness-based terminal reward.
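The exploration actions map naturally onto read-only SQLite operations. The sketch below is a minimal, self-contained illustration against an in-memory database; the real environment (`server/sql_environment.py`) works on hidden Spider databases and adds sandboxing and reward computation, and all names here are illustrative:

```python
import sqlite3

# Toy in-memory database standing in for a hidden Spider database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (name TEXT, population INTEGER)")
conn.executemany("INSERT INTO city VALUES (?, ?)",
                 [("Oslo", 709000), ("Bergen", 286000)])

def describe(table: str) -> list[tuple]:
    # DESCRIBE: column names and declared types from SQLite table metadata.
    return [(r[1], r[2]) for r in conn.execute(f"PRAGMA table_info({table})")]

def sample(table: str, n: int = 3) -> list[tuple]:
    # SAMPLE: a few representative rows.
    return conn.execute(f"SELECT * FROM {table} LIMIT {n}").fetchall()

def query(sql: str):
    # QUERY: read-only execution; SQL errors come back as feedback, not crashes.
    try:
        return conn.execute(sql).fetchall()
    except sqlite3.Error as e:
        return f"SQL error: {e}"

print(describe("city"))                           # [('name', 'TEXT'), ('population', 'INTEGER')]
print(query("SELECT MAX(population) FROM city"))  # [(709000,)]
```

Returning SQL errors as observations rather than raising lets the agent recover from a bad query within the same episode.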
## Train an Agent

Use the GRPO training pipeline artifacts from F006 and run the notebook workflow:

- Notebook: `notebooks/train_grpo.ipynb`
- Training support modules: `training/`
- Evaluation utilities: `evaluation/`
This setup is designed for Colab and local CPU/GPU environments.
## HuggingFace Space

- Live Space: `https://huggingface.co/spaces/<your-org-or-user>/sql-env` (update after push)
- Health check: `curl https://<space-url>/health`
- Deploy command: `uv run openenv push`
## Project Structure

```text
sql-env/
|- __init__.py
|- client.py
|- models.py
|- openenv.yaml
|- server/
|  |- app.py
|  |- sql_environment.py
|  |- reward.py
|  |- verifier.py
|  `- Dockerfile
|- data/
|  |- databases/
|  `- questions/
|- training/
|- evaluation/
|- notebooks/
|  `- train_grpo.ipynb
|- specs/
|- docs/
`- tests/
```
## Deployment Checklist

- `uv run openenv validate --verbose`
- `uv run openenv build`
- `uv run openenv push`
- Verify `/health` and run one full episode through the client.
## Links

- OpenEnv framework: https://github.com/meta-pytorch/OpenEnv
- OpenEnv docs: https://meta-pytorch.org/OpenEnv/
- Spider dataset: https://huggingface.co/datasets/xlangai/spider
- TRL OpenEnv docs: https://huggingface.co/docs/trl/openenv
- Verification plan: `specs/F007-VERIFICATION_SPEC.md`