---
title: SQLEnv
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: docker
pinned: false
base_path: /web
---

# SQLEnv: Teaching Agents to Explore Databases


SQLEnv is an interactive RL environment for text-to-SQL reasoning. Instead of producing one-shot SQL, agents learn to think like data analysts: inspect schema, sample rows, run exploratory queries, and submit a final answer with confidence.

Built for the OpenEnv Challenge, this project packages the environment runtime, dense rewards, evaluation, and training hooks so others can reproduce results and iterate quickly.

## Quick Start

Run these three commands to install, validate, and smoke-test the environment:

```bash
uv sync
uv run openenv validate --verbose
uv run pytest tests/ -v
```

Run the server locally:

```bash
uv run uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
```

Or run with Docker:

```bash
docker build -t sql-env:latest -f server/Dockerfile .
docker run -p 8000:8000 sql-env:latest
```

## Why SQLEnv

Static text-to-SQL benchmarks reward final outputs, not reasoning quality. SQLEnv turns SQL generation into an interactive decision process with feedback at each step, making it suitable for RL training and behavior analysis.

## Architecture

```text
+-------------+      WebSocket       +----------------------+      SQLite
| RL Agent    | <------------------> | SQLEnvClient         | <---------------+
| (GRPO/TRL)  |                      | (client.py)          |                 |
+-------------+                      +----------+-----------+                 |
                                                | HTTP/WebSocket              |
                                                v                             |
                                     +--------------------------+             |
                                     | FastAPI Server           |             |
                                     | (server.app:app)         |             |
                                     +------------+-------------+             |
                                                  |                           |
                                                  v                           |
                                     +--------------------------+             |
                                     | SQLEnvironment           |-------------+
                                     | step/reset/reward/verify |
                                     +--------------------------+
```

## How It Works

Each episode begins with a natural language question mapped to a hidden Spider database. The agent acts through four environment actions:

| Action | Purpose | Typical Output |
| --- | --- | --- |
| `DESCRIBE table_name` | Inspect schema and column metadata | Column names, types, row count |
| `SAMPLE table_name` | Inspect representative rows | Small row sample |
| `QUERY sql_string` | Execute read-only SQL in a sandbox | Query result rows or SQL error |
| `ANSWER value` | Submit the final answer | Terminal reward and completion |
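The actual sandbox lives in `server/sql_environment.py`; as a hedged illustration of how the `QUERY` action can be kept read-only, the sketch below uses the standard library's `sqlite3` authorizer hook to veto anything that writes. The function name, return shape, and row cap are illustrative assumptions, not the project's real interface.

```python
import sqlite3

def run_readonly_query(db_path: str, sql: str, max_rows: int = 50):
    """Execute one statement against a SQLite DB, rejecting anything that writes.

    Illustrative sketch only: the authorizer denies every operation that is
    not a plain read (SELECT, column access, or a built-in function call).
    """
    conn = sqlite3.connect(db_path)
    allowed = {sqlite3.SQLITE_SELECT, sqlite3.SQLITE_READ, sqlite3.SQLITE_FUNCTION}
    conn.set_authorizer(
        lambda action, *args: sqlite3.SQLITE_OK if action in allowed else sqlite3.SQLITE_DENY
    )
    try:
        # Writes (INSERT/UPDATE/CREATE/...) fail at prepare time with "not authorized".
        rows = conn.execute(sql).fetchmany(max_rows)
        return {"ok": True, "rows": rows}
    except sqlite3.Error as exc:
        # Surfacing the SQL error lets the agent learn from failed queries.
        return {"ok": False, "error": str(exc)}
    finally:
        conn.close()
```

Returning the error text instead of raising matches the table above, where a failing `QUERY` yields an observable SQL error rather than crashing the episode.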

Episode flow:

1. `reset()` returns the question context and available tables.
2. `step()` executes one exploration action at a time.
3. `ANSWER` ends the episode with a correctness-based terminal reward.
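The real client is `SQLEnvClient` in `client.py`, which talks to the server over WebSocket. To make the four-action protocol concrete without a running server, here is a self-contained toy stand-in over an in-memory SQLite database; the class, the hardcoded question, and the reward values are illustrative assumptions, not the project's actual API.

```python
import sqlite3

class LocalSQLEnv:
    """Toy, in-process stand-in for the SQLEnv server (illustration only)."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.executescript(
            "CREATE TABLE singer(id INTEGER, name TEXT, age INTEGER);"
            "INSERT INTO singer VALUES (1,'Ada',30),(2,'Bo',45),(3,'Cy',29);"
        )
        self.question = "How many singers are older than 29?"
        self.gold = 2  # hidden gold answer

    def reset(self):
        return {"question": self.question, "tables": ["singer"]}

    def step(self, action: str, arg):
        # Returns (observation, reward, done), mirroring the episode flow above.
        if action == "DESCRIBE":
            cols = self.conn.execute(f"PRAGMA table_info({arg})").fetchall()
            return {"columns": [(c[1], c[2]) for c in cols]}, 0.0, False
        if action == "SAMPLE":
            rows = self.conn.execute(f"SELECT * FROM {arg} LIMIT 3").fetchall()
            return {"rows": rows}, 0.0, False
        if action == "QUERY":
            try:
                return {"rows": self.conn.execute(arg).fetchall()}, 0.0, False
            except sqlite3.Error as exc:
                return {"error": str(exc)}, -0.1, False
        if action == "ANSWER":
            return {"done": True}, (1.0 if arg == self.gold else 0.0), True

# One full episode: inspect schema, explore, then answer.
env = LocalSQLEnv()
obs = env.reset()
obs, r, done = env.step("DESCRIBE", "singer")
obs, r, done = env.step("QUERY", "SELECT COUNT(*) FROM singer WHERE age > 29")
count = obs["rows"][0][0]
obs, r, done = env.step("ANSWER", count)
```

The loop above is the behavior the environment rewards: schema inspection and exploratory queries before committing to a final answer.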

## Train an Agent

Use the GRPO training pipeline artifacts from F006 and run the notebook workflow:

- Notebook: `notebooks/train_grpo.ipynb`
- Training support modules: `training/`
- Evaluation utilities: `evaluation/`

This setup is designed for Colab and local CPU/GPU environments.
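The project's actual reward logic lives in `server/reward.py`. As a hedged sketch of the kind of dense shaping the environment provides, the function below combines a small per-step cost, an error penalty, and a terminal correctness bonus; all names and weights here are illustrative assumptions, not the project's real values.

```python
def shaped_reward(step_type: str, *, sql_error: bool = False,
                  correct: bool = False) -> float:
    """Illustrative dense reward: exploration is cheap but not free,
    failing SQL is penalized, and correctness dominates at episode end."""
    STEP_COST = -0.01      # discourage aimless exploration
    ERROR_PENALTY = -0.1   # malformed or failing SQL
    TERMINAL_BONUS = 1.0   # correct final answer

    if step_type == "ANSWER":
        return TERMINAL_BONUS if correct else 0.0
    reward = STEP_COST
    if sql_error:
        reward += ERROR_PENALTY
    return reward
```

A shaping scheme like this gives the policy gradient a signal at every step, rather than only at the terminal `ANSWER`, which is what makes the environment usable for GRPO-style training.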

## HuggingFace Space

- Live Space: https://huggingface.co/spaces/<your-org-or-user>/sql-env (update after push)
- Health check: `curl https://<space-url>/health`
- Deploy command: `uv run openenv push`

## Project Structure

```text
sql-env/
|- __init__.py
|- client.py
|- models.py
|- openenv.yaml
|- server/
|  |- app.py
|  |- sql_environment.py
|  |- reward.py
|  |- verifier.py
|  `- Dockerfile
|- data/
|  |- databases/
|  `- questions/
|- training/
|- evaluation/
|- notebooks/
|  `- train_grpo.ipynb
|- specs/
|- docs/
`- tests/
```

## Deployment Checklist

1. `uv run openenv validate --verbose`
2. `uv run openenv build`
3. `uv run openenv push`
4. Verify `/health` and run one full episode through the client.

## Links