---
title: API Debug Env
emoji: πŸ”§
colorFrom: red
colorTo: yellow
sdk: docker
app_port: 8000
tags:
  - openenv
---

# πŸ”§ API Integration Debugging Environment

> A real-world OpenEnv environment where an AI agent diagnoses and fixes broken API integrations across multi-service systems with **cascading failures**, **dynamic state**, and **multi-dimensional rubric grading**.

[![OpenEnv](https://img.shields.io/badge/OpenEnv-v0.2.2-blue)](https://github.com/meta-pytorch/OpenEnv)
[![Python](https://img.shields.io/badge/Python-3.10%2B-green)](https://python.org)
[![Tests](https://img.shields.io/badge/Tests-70%20passed-brightgreen)]()
[![HF Space](https://img.shields.io/badge/HF%20Space-Live-orange)](https://huggingface.co/spaces/yadnyeshkolte/api-debug-env)

---

## Table of Contents

- [Motivation β€” Why API Debugging?](#motivation--why-api-debugging)
- [Environment Overview](#environment-overview)
- [Key Design Features](#key-design-features)
- [Tasks (Easy / Medium / Hard)](#tasks)
- [Multi-Dimensional Grading Rubric](#multi-dimensional-grading-rubric)
- [Reward Shaping](#reward-shaping)
- [Action & Observation Spaces](#action--observation-spaces)
- [Example Transcript](#example-transcript)
- [Setup & Usage](#setup--usage)
- [API Endpoints](#api-endpoints)
- [Running Inference](#running-inference)
- [Running Tests](#running-tests)
- [Project Structure](#project-structure)
- [Design Philosophy](#design-philosophy)

---

## Motivation β€” Why API Debugging?

API integration failures are among the **most common and expensive problems** in production software engineering. When microservices communicate β€” Service A calls Service B, which calls Service C β€” a single misconfiguration can cascade through the entire system, producing confusing error chains that take hours to diagnose.

Real-world API debugging requires:

- **Structured diagnosis** β€” reading error logs and configs across multiple services
- **Dependency awareness** β€” understanding which upstream failure is causing downstream errors
- **Strategic reasoning** β€” fixing root causes first to unmask hidden downstream bugs
- **Precision** β€” submitting exact configuration corrections, not approximate guesses

This environment simulates **real-world cascading API failures** with dynamic state that changes as the agent acts β€” not a static lookup puzzle.

---

## Environment Overview

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚                          Agent Debugging Loop                          β”‚
β”‚                                                                        β”‚
β”‚  1. reset(task_id)         β†’ Initial observation with broken state     β”‚
β”‚  2. step(inspect_logs)     β†’ Error logs with diagnostic clues          β”‚
β”‚  3. step(inspect_config)   β†’ Current (broken) service configuration    β”‚
β”‚  4. step(inspect_endpoint) β†’ Simulated API response (401, 504, ...)    β”‚
β”‚  5. step(submit_fix)       β†’ Strict fix validation + cascade update    β”‚
β”‚  6. grade()                β†’ Multi-dimensional rubric score in [0, 1]  β”‚
β”‚                                                                        β”‚
β”‚  State updates dynamically: service health changes, new logs appear,   β”‚
β”‚  error cascades resolve as the agent fixes issues.                     β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
```

The agent interacts through the standard OpenEnv API:

- **`reset()`** β†’ returns initial observation with broken service state
- **`step(action)`** β†’ executes one debugging action, returns observation + reward
- **`state()`** β†’ returns current environment state (episode_id, step_count)
- **`grade()`** β†’ returns final score using the multi-dimensional rubric
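In pseudo-Python, a full episode of this loop looks roughly like the sketch below (schematic only: `env` stands in for an OpenEnv client and `agent` for your policy; neither name is part of this repo's API):

```python
# Schematic episode loop: illustrative pseudo-code, not this repo's client API.
obs = env.reset(task_id="hard", seed=7)   # broken multi-service state
while not obs.done:
    action = agent.choose(obs)            # e.g., inspect_logs before submit_fix
    obs = env.step(action)                # shaped reward + updated cascade state
score = env.grade()                       # multi-dimensional rubric score in [0, 1]
```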
---

## Key Design Features

### 1. Cascading Failures with Service Dependency Graphs

Each task models a real multi-service ecosystem. Services depend on each other, and a bug in an upstream service **cascades** to all downstream services:

```
Hard Task Dependency Graph:

order_service ──┬──→ inventory_service ──┬──→ shipping_service
                β”‚                        └──→ auth_service
                └──→ api_gateway

    [ERROR]           [DEGRADED]              [HEALTHY]
```

- Fixing `order_service`'s wrong URL unmasks `inventory_service`'s timeout issue
- Fixing `inventory_service`'s expired token allows `shipping_service` to respond
- **Some issues are intentionally masked by upstream failures** β€” the agent must fix them in the right order

### 2. Dynamic State

Unlike static environments, the state **changes as the agent acts**:

| What changes | How |
|---|---|
| **Service health** | Fixing issues updates service status: `error` β†’ `degraded` β†’ `healthy` |
| **Logs** | After a fix, re-inspecting logs shows **new entries** (e.g., "Authorization header set. Retrying...") |
| **Error traces** | The cascade chain shrinks as upstream issues are resolved |
| **Endpoint responses** | `inspect_endpoint` returns different HTTP errors based on the current fix state |

### 3. Seed-Based Scenario Randomization

Each difficulty level has an **expanded issue pool** (more issues than are selected per episode):

| Difficulty | Pool Size | Selected Per Episode |
|---|---|---|
| Easy | 4 issues | 2 |
| Medium | 5 issues | 3 |
| Hard | 7 issues | 5 |

Passing a `seed` to `reset()` produces a **deterministic but varied** scenario β€” different seeds select different subsets from the pool and randomize log order. This prevents agents from memorizing fixed patterns.

### 4. Strict Fix Validation with Partial Credit

The grader validates both **keys and values** of submitted fixes (sketched in code below):

- **Exact match** β†’ Full credit (+0.25 reward)
- **Right key, close value** (e.g., timeout=7 when expected=10) β†’ Partial credit (+0.03)
- **Right key, wrong value** (e.g., timeout=100 when expected=10) β†’ Rejected
- **Wrong key entirely** β†’ Penalized (-0.1)
- **Bearer token pattern matching** β€” any value of the form `Bearer <token>` is accepted
- **Numeric tolerance** β€” numeric comparisons use a strict 10% tolerance
- **Boolean coercion** β€” `"true"`, `"1"`, `"yes"` all match `True`
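To make these rules concrete, here is a minimal sketch of the matching logic. It is not the environment's actual grader code (that lives in `server/api_debug_env_environment.py`), and the 50% partial-credit band is an assumed placeholder; only the 10% tolerance is documented above:

```python
# Illustrative sketch of the fix-validation rules above. Hypothetical helper:
# the real grader lives in server/api_debug_env_environment.py. The 50% band
# for partial credit is an assumption; only the 10% tolerance is documented.
def classify_fix(key: str, value, expected: dict) -> str:
    """Return 'exact', 'partial', 'rejected', or 'wrong_key'."""
    if key not in expected:
        return "wrong_key"                  # penalized (-0.1)
    want = expected[key]
    if isinstance(want, bool):              # boolean coercion: "true", "1", "yes"
        coerced = str(value).strip().lower() in ("true", "1", "yes")
        return "exact" if coerced == want else "rejected"
    if isinstance(want, str) and want.startswith("Bearer "):
        ok = isinstance(value, str) and value.startswith("Bearer ")
        return "exact" if ok else "rejected"    # Bearer token pattern matching
    if isinstance(want, (int, float)) and isinstance(value, (int, float)):
        if abs(value - want) <= 0.10 * abs(want):
            return "exact"                  # within the strict 10% tolerance
        if abs(value - want) <= 0.50 * abs(want):
            return "partial"                # close value, partial credit (+0.03)
        return "rejected"                   # right key, wrong value
    return "exact" if value == want else "rejected"

classify_fix("timeout", 7,   {"timeout": 10})  # 'partial'
classify_fix("timeout", 100, {"timeout": 10})  # 'rejected'
```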
---

## Tasks

### Easy: Payment API Integration (2 issues, 15 max steps)

**Scenario**: A payment processing client is failing to connect to the payment gateway. The agent must diagnose authentication and protocol errors.

- **Services**: `payment_client`, `payment_gateway`
- **Issue pool** (4 possible, 2 selected):
  - Missing `Authorization` header (HTTP 401)
  - Wrong `Content-Type` header β€” `text/plain` instead of `application/json` (HTTP 415)
  - Timeout too low for payment processing (HTTP 504)
  - Base URL pointing to the deprecated v1 endpoint (HTTP 301)
- **Dependencies**: None β€” straightforward diagnosis

### Medium: Webhook Event Chain (3 issues, 25 max steps)

**Scenario**: A webhook notification system is dropping events across a 3-service chain. Events flow from sender β†’ receiver β†’ notification service, but multiple configuration issues are causing failures.

- **Services**: `webhook_sender`, `webhook_receiver`, `notification_service`
- **Issue pool** (5 possible, 3 selected):
  - Rate limit mismatch (sender at 100/s, receiver accepts 10/s) β†’ 429 errors
  - Insufficient retry config (only 1 retry, no backoff, 429 not in the retry list)
  - Empty webhook signature header β†’ receiver drops all events as unsigned
  - Wrong target URL (`/webhook` vs `/hooks/incoming`) β†’ 404 errors
  - Payload compression enabled but receiver doesn't support gzip β†’ 415 errors
- **Dependencies**: The retry issue is **masked** by the rate limit β€” the agent must fix the rate limit first to see the retry problem

### Hard: E-Commerce Order Pipeline (5 issues, 40 max steps)

**Scenario**: A complex e-commerce order processing pipeline is failing with cascading errors across 5 services. Multiple dependency chains make this genuinely challenging for frontier models.

- **Services**: `order_service`, `inventory_service`, `shipping_service`, `api_gateway`, `auth_service`
- **Issue pool** (7 possible, 5 selected):
  - Deprecated URL (`/v1/check` β†’ should be `/v2/reserve`) β†’ 301 redirect
  - Timeout too short (2s vs 4s processing time) β€” masked by the wrong URL
  - Synchronous mode causing race conditions between concurrent orders
  - Expired auth token on inventoryβ†’shipping calls β†’ 401
  - No auto token refresh configured β€” masked by the expired token
  - No circuit breaker β†’ failed requests hammer the inventory service
  - Missing idempotency key β†’ retries create duplicate orders
- **Dependencies**: the `timeout` fix depends on the `wrong_url` fix; `token_refresh` depends on the `expired_token` fix; `idempotency` depends on the `async` fix

---

## Multi-Dimensional Grading Rubric

The grader uses a **4-dimension weighted rubric**, not a simple `issues_fixed / total` ratio:

| Dimension | Weight | What It Measures |
|---|---|---|
| **Fix Score** | 40% | `issues_fixed / total_issues` β€” how many bugs were actually resolved |
| **Strategy Score** | 25% | Did the agent follow a logical approach? Inspect before fixing, avoid repeats, follow dependency order, use all action types |
| **Diagnosis Score** | 20% | Did the agent inspect the service (logs/config) **before** submitting a fix for it? |
| **Efficiency Score** | 15% | `remaining_steps / max_steps` β€” faster solutions score higher |

```
Final Score = fix Γ— 0.40 + strategy Γ— 0.25 + diagnosis Γ— 0.20 + efficiency Γ— 0.15
Clamped to (0.001, 0.999) β€” never exactly 0.0 or 1.0
```

**Strategy scoring details:**

- Did the agent inspect logs/config before submitting any fix? (+1)
- Ratio of unique inspections to total inspections (no wasteful repeats) (+1)
- Did fixes follow the optimal dependency order? (+1)
- Did the agent use a variety of action types? (+1)
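As a runnable restatement (the actual grader lives in `server/api_debug_env_environment.py`; this sketch only mirrors the weights and clamping shown above):

```python
# The weighted rubric above as a one-function sketch. Mirrors the documented
# formula and clamping; not the environment's own implementation.
def final_score(fix: float, strategy: float, diagnosis: float, efficiency: float) -> float:
    raw = fix * 0.40 + strategy * 0.25 + diagnosis * 0.20 + efficiency * 0.15
    return min(max(raw, 0.001), 0.999)   # clamped: never exactly 0.0 or 1.0

# e.g., all issues fixed, decent strategy, half the step budget remaining:
print(final_score(fix=1.0, strategy=0.75, diagnosis=1.0, efficiency=0.5))  # 0.8625
```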
### Baseline Scores (Rule-Based Heuristic Agent)

| Task | Score | Steps Used | Issues Fixed |
|---|---|---|---|
| Easy | ~0.75 | 7 | 2/2 |
| Medium | ~0.55 | 10 | 3/3 |
| Hard | ~0.45 | 15 | 5/5 |

*The baseline uses a deterministic heuristic (inspect all logs β†’ inspect all configs β†’ submit known fixes). An LLM-based agent following a good debugging strategy can score higher.*

---

## Reward Shaping

Every action produces a meaningful reward signal β€” not just sparse end-of-episode feedback:

| Action | Reward | Condition |
|---|---|---|
| `inspect_logs` (first time, finds error patterns) | **+0.15** | New issue-related log patterns found |
| `inspect_logs` (first time, no issues here) | +0.05 | Valid inspection, no errors in this service |
| `inspect_logs` (repeat, no new info) | 0.00 | Already inspected, nothing changed |
| `inspect_logs` (repeat, after a fix) | +0.05 | Dynamic logs appeared after a recent fix |
| `inspect_config` (service has issues) | +0.05 | Relevant config retrieved |
| `inspect_config` (service is clean) | +0.01 | Config retrieved but no issues here |
| `inspect_config` (repeat) | 0.00 | Already inspected |
| `inspect_endpoint` | +0.02 to +0.05 | Simulated endpoint test |
| `submit_fix` (correct fix) | **+0.25** | Issue resolved, service health updated |
| `submit_fix` (correct + inspected first) | **+0.30** | Fix + strategy bonus for diagnosis |
| `submit_fix` (partial β€” close but not exact) | +0.03 | Right key, approximately right value |
| `submit_fix` (wrong fix) | **-0.10** | Incorrect fix payload |
| `submit_fix` (empty payload) | -0.10 | Empty `fix_payload` submitted |
| All issues fixed | **+0.20** | Episode completion bonus |
| Invalid target / invalid action | -0.05 | Bad input |
| Every step | **-0.01** | Step cost β€” encourages efficiency |

---

## Action & Observation Spaces

### Action Schema (Pydantic model: `ApiDebugAction`)

```json
{
  "action_type": "inspect_logs | inspect_config | inspect_endpoint | submit_fix",
  "target": "<service_name>",
  "fix_payload": { "config_key": "corrected_value" }
}
```

- `action_type` (required): One of the 4 debugging actions
- `target` (required): The service to act on (from `available_targets` in the observation)
- `fix_payload` (optional): Required only for `submit_fix` β€” the configuration correction

**Fix payload formats** (dot-notation resolution is sketched after this section):

```json
// Simple key-value fix
{"timeout": 10}

// Nested key fix (dot notation)
{"headers.Authorization": "Bearer my_api_key"}

// Complex nested object fix
{"retry": {"max_retries": 3, "backoff_factor": 2, "retry_on_status": [429, 500]}}
```

### Observation Schema (Pydantic model: `ApiDebugObservation`)

```json
{
  "task_id": "easy",
  "task_description": "A payment processing API integration is failing...",
  "logs": ["[ERROR] 2026-03-25T10:15:23Z POST /process -> 401 Unauthorized", "..."],
  "config_snapshot": {"headers": {"Content-Type": "text/plain"}, "timeout": 30},
  "api_response": {"status": "error", "status_code": 401, "error": "Missing Authorization"},
  "service_status": {"payment_client": "error", "payment_gateway": "healthy"},
  "dependency_graph": {"payment_client": ["payment_gateway"], "payment_gateway": []},
  "error_trace": [
    "[CRITICAL] payment_client: Missing Authorization header",
    "  └─> payment_gateway: All requests rejected with 401"
  ],
  "hints": ["Check headers.Authorization"],
  "remaining_steps": 14,
  "issues_found": 1,
  "issues_fixed": 0,
  "issues_total": 2,
  "action_result": "Inspected logs for 'payment_client'. Found relevant error patterns!",
  "available_targets": ["payment_client", "payment_gateway"],
  "done": false,
  "reward": 0.15
}
```

**Key observation fields for agent reasoning:**

- `service_status` β€” shows which services are healthy/degraded/error (updates dynamically)
- `dependency_graph` β€” shows service relationships (the agent should fix upstream first)
- `error_trace` β€” shows active error cascades (shrinks as issues are fixed)
- `hints` β€” progressive hints that get more specific as steps are used
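The dot-notation payloads above address paths inside nested config structures. Here is an illustrative sketch of how such a key can be resolved against a config dict (a hypothetical helper, not the environment's own resolution code):

```python
# How a dot-notation fix like {"headers.Authorization": "..."} maps onto a
# nested config dict. Illustrative sketch only; apply_fix is a hypothetical name.
def apply_fix(config: dict, fix_payload: dict) -> dict:
    for dotted_key, value in fix_payload.items():
        node = config
        *parents, leaf = dotted_key.split(".")
        for part in parents:
            node = node.setdefault(part, {})   # walk (or create) nested levels
        node[leaf] = value                     # set the corrected value
    return config

config = {"headers": {"Content-Type": "text/plain"}, "timeout": 30}
apply_fix(config, {"headers.Authorization": "Bearer sk_live_my_key"})
apply_fix(config, {"timeout": 10})
# config == {"headers": {"Content-Type": "text/plain",
#                        "Authorization": "Bearer sk_live_my_key"},
#            "timeout": 10}
```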
---

## Example Transcript

```
>>> reset(task_id="easy")
task_description: "A payment processing API integration is failing..."
service_status: {payment_client: "error", payment_gateway: "healthy"}
error_trace:
  [CRITICAL] payment_client: Missing Authorization header
    └─> payment_gateway: All requests rejected with 401
  [ERROR] payment_client: Wrong Content-Type (text/plain instead of application/json)
    └─> payment_gateway: Request body parsing fails
issues_total: 2, remaining_steps: 15

>>> step(action_type="inspect_logs", target="payment_client")
logs: [
  "[INFO] Payment client initialized...",
  "[ERROR] POST /process -> 401 Unauthorized",
  "[ERROR] Response: {'error': 'Missing or invalid Authorization header'}",
  "[WARN] Request headers: Content-Type=text/plain",
  "[ERROR] POST /process -> 415 Unsupported Media Type",
]
issues_found: 2, reward: +0.15

>>> step(action_type="inspect_config", target="payment_client")
config_snapshot: {
  "base_url": "https://api.paymentgateway.com/v2",
  "headers": {"Content-Type": "text/plain", "Accept": "application/json"},
  "timeout": 30
}
reward: +0.05   // Service has issues, first inspection

>>> step(action_type="submit_fix", target="payment_client", fix_payload={"headers.Authorization": "Bearer sk_live_my_key"})
action_result: "Fix accepted! Fixed 1 issue(s). Total: 1/2"
service_status: {payment_client: "degraded", payment_gateway: "healthy"}
reward: +0.30   // Fix (+0.25) + strategy bonus (+0.05) for inspecting first

>>> step(action_type="inspect_logs", target="payment_client")
logs: [
  ...original logs...,
  "[INFO] Authorization header set. Retrying request..."   // NEW dynamic log!
]
reward: +0.05   // Re-inspection has new dynamic logs

>>> step(action_type="submit_fix", target="payment_client", fix_payload={"headers.Content-Type": "application/json"})
action_result: "Fix accepted! All issues fixed! Episode complete."
service_status: {payment_client: "healthy", payment_gateway: "healthy"}
error_trace: ["All issues resolved. No error cascades active."]
reward: +0.50   // Fix (+0.25) + strategy (+0.05) + completion bonus (+0.20)
done: true

>>> grade()
score: 0.90
  fix_score: 1.00        (2/2 fixed)
  diagnosis_score: 1.00  (inspected before every fix)
  efficiency_score: 0.67 (5/15 steps used)
  strategy_score: 0.80   (inspected first, used multiple action types)
```

---

## Setup & Usage

### Prerequisites

- Python 3.10+
- [uv](https://docs.astral.sh/uv/) (recommended) or pip
- Docker (for containerized deployment)

### Install Dependencies

```bash
# Clone the repository
git clone https://github.com/yadnyeshkolte/openenv-task.git
cd openenv-task

# Install dependencies with uv
uv sync

# Or with pip
pip install -e .
```

### Run the Server Locally

```bash
# From the project root (openenv-task/)
uvicorn server.app:app --reload --host 0.0.0.0 --port 8000
```

The server will be available at `http://localhost:8000`. Visit `http://localhost:8000/docs` for interactive API documentation.
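With the server running, you can also drive it from Python. A sketch using `requests` (not a dependency of this project, so install it separately; the request and response shapes are assumed to follow the schemas documented above, with `task_id` and `seed` passed in the JSON body):

```python
# Minimal programmatic episode over HTTP. A sketch: payload and response shapes
# are assumed from the documented action/observation schemas.
import requests

BASE = "http://localhost:8000"

# Reset with an explicit task and seed (the /reset endpoint accepts both)
obs = requests.post(f"{BASE}/reset", json={"task_id": "easy", "seed": 42}).json()
print(obs["service_status"])

# One debugging step: inspect logs for the failing service
obs = requests.post(f"{BASE}/step", json={
    "action_type": "inspect_logs",
    "target": "payment_client",
}).json()
print(obs["reward"], obs["action_result"])
```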
### Quick Test

```bash
# Reset environment
curl -X POST http://localhost:8000/reset

# Inspect logs
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type": "inspect_logs", "target": "payment_client"}'

# Submit a fix
curl -X POST http://localhost:8000/step \
  -H "Content-Type: application/json" \
  -d '{"action_type": "submit_fix", "target": "payment_client", "fix_payload": {"headers.Authorization": "Bearer my_key"}}'
```

### Docker Build & Run

```bash
# From the project root (openenv-task/)
docker build -t api_debug_env -f Dockerfile .
docker run -p 8000:8000 api_debug_env
```

---

## API Endpoints

| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Environment info, version, and feature list |
| `/reset` | POST | Reset environment (accepts `task_id` and `seed` params) |
| `/step` | POST | Execute a debugging action |
| `/state` | GET | Get current state (episode_id, step_count) |
| `/schema` | GET | Get action/observation Pydantic schemas |
| `/tasks` | GET | List all 3 tasks with action schema and service dependencies |
| `/grader` | POST | Get the multi-dimensional grader score for the current episode |
| `/baseline` | POST | Run the rule-based baseline agent on all 3 tasks |
| `/health` | GET | Health check endpoint |
| `/docs` | GET | Interactive Swagger UI documentation |

---

## Running Inference

The `inference.py` script at the project root uses the OpenAI API client to run an LLM agent against all 3 tasks:

```bash
# Set your API credentials
export HF_TOKEN=your_huggingface_token

# Optional: override model and API base
export MODEL_NAME=Qwen/Qwen2.5-72B-Instruct
export API_BASE_URL=https://router.huggingface.co/v1

# Run inference from the project root
python inference.py
```

**Output format** (stdout):

```
[START] task=easy env=api_debug_env model=Qwen/Qwen2.5-72B-Instruct
[STEP] step=1 action=inspect_logs(target=payment_client) reward=0.15 done=false error=null
[STEP] step=2 action=submit_fix(target=payment_client, fix={...}) reward=0.30 done=false error=null
...
[END] success=true steps=5 score=0.820 rewards=0.15,0.30,...
```

The inference script:

- Uses the `openai.OpenAI` client for all LLM calls
- Reads `HF_TOKEN` (or `API_KEY`) from environment variables
- Includes retry logic with exponential backoff (see the sketch below)
- Emits `[START]`, `[STEP]`, `[END]` lines to stdout
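The retry behavior follows the standard exponential-backoff pattern. A generic sketch (not `inference.py`'s actual code; the retry count and delays here are placeholders):

```python
# Generic retry-with-exponential-backoff wrapper. Illustrative only:
# max_retries and base_delay are placeholder values, not inference.py's.
import time

def with_retries(call, max_retries=5, base_delay=1.0):
    for attempt in range(max_retries):
        try:
            return call()
        except Exception:
            if attempt == max_retries - 1:
                raise                               # out of retries: propagate
            time.sleep(base_delay * 2 ** attempt)   # 1s, 2s, 4s, 8s, ...
```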
---

## Running Tests

```bash
# From the project root (openenv-task/)
python -m pytest tests/ -v --tb=short
```

**70 tests** across 12 test classes covering:

- Scenario loading, seed randomization, and issue pool selection
- Environment reset and initialization
- All 4 action types: `inspect_logs`, `inspect_config`, `inspect_endpoint`, `submit_fix`
- Dynamic state: service health updates, dynamic log injection, error trace changes
- Multi-dimensional grading rubric (fix, diagnosis, efficiency, strategy)
- Strict fix validation with partial credit
- Value matching (strings, numbers, booleans, lists, Bearer tokens)
- Full episode integration tests (easy, medium, hard)
- Cascading failure mechanics and dependency chains
- Episode termination conditions

### Validate OpenEnv Compliance

```bash
openenv validate
```

---

## Project Structure

```
openenv-task/                        # Project root
β”œβ”€β”€ __init__.py                      # Package init (exports ApiDebugEnv, Action, Observation)
β”œβ”€β”€ client.py                        # OpenEnv client (WebSocket connection to server)
β”œβ”€β”€ models.py                        # Pydantic Action & Observation type definitions
β”œβ”€β”€ scenarios.py                     # Task scenarios with dependency graphs & issue pools
β”œβ”€β”€ inference.py                     # MANDATORY inference script (LLM agent, OpenAI client)
β”œβ”€β”€ openenv.yaml                     # OpenEnv metadata (spec v1)
β”œβ”€β”€ pyproject.toml                   # Python project config & dependencies
β”œβ”€β”€ Dockerfile                       # Docker build for HF Spaces deployment
β”œβ”€β”€ LICENSE                          # BSD license
β”œβ”€β”€ README.md                        # This file
β”œβ”€β”€ PROGRESS.md                      # Development session log
β”œβ”€β”€ AGENTS.md                        # Instructions for AI coding agents
β”œβ”€β”€ server/
β”‚   β”œβ”€β”€ __init__.py                  # Server package init
β”‚   β”œβ”€β”€ api_debug_env_environment.py # Core environment (reset/step/grade logic)
β”‚   β”œβ”€β”€ app.py                       # FastAPI endpoints (/reset, /step, /tasks, etc.)
β”‚   β”œβ”€β”€ Dockerfile                   # Alternate Dockerfile (same as root)
β”‚   └── requirements.txt             # Server-specific requirements
β”œβ”€β”€ scripts/
β”‚   └── baseline_inference.py        # Alternate baseline script
└── tests/
    └── test_environment.py          # 70 unit & integration tests
```

### Key Files

| File | Purpose |
|---|---|
| `server/api_debug_env_environment.py` | **Core logic** β€” `reset()`, `step()`, `grade()`, dynamic state, cascading failures |
| `scenarios.py` | **Task definitions** β€” issue pools, dependency graphs, dynamic logs, service configs |
| `models.py` | **Type definitions** β€” `ApiDebugAction` and `ApiDebugObservation` Pydantic models |
| `inference.py` | **Mandatory** β€” LLM-based agent using the OpenAI client with `[START]/[STEP]/[END]` output |
| `openenv.yaml` | **Mandatory** β€” OpenEnv spec v1 metadata with task definitions |
| `server/app.py` | **FastAPI server** β€” all HTTP endpoints including `/baseline` and `/grader` |
---

## Design Philosophy

This environment is designed to be useful for **RL/agent training and evaluation**, not just a one-off benchmark:

1. **Dense Reward Signal** β€” every action type produces positive or negative reward, enabling gradient-based training (GRPO, DPO, PPO), not just a sparse binary score at the end.
2. **Progressive Difficulty** β€” Easy (2 services, 2 issues) β†’ Medium (3 services, 3 issues with 1 dependency) β†’ Hard (5 services, 5 issues with multiple dependency chains). Difficulty comes from complexity, not ambiguity.
3. **Partial Credit** β€” close-but-wrong fixes get constructive feedback instead of outright rejection. This provides learning signal for agents that are on the right track.
4. **Strategy Incentives** β€” the multi-dimensional rubric rewards **how** the agent solves (inspect before fixing, follow dependencies, avoid waste), not just **what** it solves. This encourages emergent debugging strategies.
5. **Stochastic Scenarios** β€” seed-based randomization from expanded issue pools prevents policy overfitting to memorized scenarios while maintaining reproducibility.
6. **Cascading Dynamics** β€” upstream fixes change downstream state, requiring **multi-step causal reasoning**. The agent can't just pattern-match each issue independently β€” it must understand the system architecture.
7. **Real-World Relevance** β€” API integration debugging is a genuine, high-value task that software engineers spend significant time on. The scenarios model actual failure patterns (expired tokens, rate limiting, missing headers, deprecated endpoints, race conditions).

---

## OpenEnv Spec Compliance

| Requirement | Status |
|---|---|
| OpenEnv spec v1 (`openenv.yaml`) | βœ… |
| Typed Pydantic models (Action, Observation) | βœ… |
| `reset()` / `step()` / `state()` API | βœ… |
| 3+ tasks with difficulty range | βœ… (easy, medium, hard) |
| Programmatic graders (0.0–1.0) | βœ… (multi-dimensional rubric) |
| Meaningful reward function | βœ… (dense, not sparse) |
| Baseline inference script | βœ… (`inference.py` at root) |
| OpenAI client for LLM calls | βœ… |
| `[START]/[STEP]/[END]` stdout format | βœ… |
| Dockerfile builds and runs | βœ… |
| HF Space deploys and responds | βœ… |
| `openenv validate` passes | βœ… |

---

## Hackathon Submission

- **HF Space**: [yadnyeshkolte/api-debug-env](https://huggingface.co/spaces/yadnyeshkolte/api-debug-env)
- **GitHub**: [yadnyeshkolte/openenv-task](https://github.com/yadnyeshkolte/openenv-task)
- **Hackathon**: Meta PyTorch OpenEnv Hackathon Γ— Scaler School of Technology