---
title: CorridorKey
emoji: 🎬
colorFrom: yellow
colorTo: yellow
sdk: gradio
sdk_version: 6.9.0
app_file: app.py
python_version: "3.10"
pinned: false
tags:
  - green-screen
  - background-removal
  - video-matting
  - alpha-matting
  - vfx
  - corridor-digital
  - transparency
  - onnx
  - pytorch
  - zerogpu
  - mcp-server
short_description: Remove green/blue screen from video, even glass & hair
---

# CorridorKey Green/Blue Screen Matting

Remove green or blue screen backgrounds from video. Handles transparent objects (glass, water, cloth) that traditional chroma key cannot. Based on [CorridorKey](https://github.com/nikopueringer/CorridorKey) by Corridor Digital.

## Inference Paths

- **GPU (ZeroGPU H200)**: PyTorch GreenFormer with batched inference (batch 32 at 1024, batch 16 at 2048)
- **CPU (fallback)**: ONNX Runtime sequential inference (batch 1)

## Pipeline

1. **BiRefNet**: generates a coarse foreground mask (ONNX, or a fast classical HSV pass for green/blue screens)
2. **CorridorKey GreenFormer**: refines the alpha matte and extracts a clean foreground (PyTorch on GPU, ONNX on CPU)
3. **GPU postprocessing**: despill, despeckle (connected components), and resize, all on GPU via torchvision, with a single CPU transfer at the end

## GPU Optimizations

- **Full GPU pipeline**: preprocessing (resize + normalize) and postprocessing (despill, clean_matte, resize) stay on device, avoiding per-batch CPU↔GPU round-trips
- **TF32 tensor cores**: `torch.set_float32_matmul_precision('high')` for FP32 postprocessing ops
- **AOTI compilation** (torch.inductor + Triton CUDA graphs: native CUDA kernels, fused ops, replaying the entire kernel sequence without CPU-GPU sync overhead) does not benefit GreenFormer. Tested max-autotune (118s, 0 Triton kernels) and reduce-overhead (36s compile + 48s graph recording = 84s, for a 5% speedup). The small feature maps (112-896 channels) are cuBLAS-optimal, not Triton-friendly. Compilation is therefore disabled on ZeroGPU, where eager mode at 0.32s/frame beats the 84s+ overhead.
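The despill step above can be sketched as a pure on-device PyTorch op. This is a minimal illustration, assuming the common clamp-green-to-max(R, B) despill heuristic; the function name and the exact rule are not taken from the CorridorKey source:

```python
import torch

def despill_green(rgb: torch.Tensor) -> torch.Tensor:
    """Suppress green spill by clamping G to max(R, B).

    rgb: float tensor of shape (B, 3, H, W) in [0, 1].
    All ops stay on the input tensor's device, so the pass can run
    inside the batched GPU pipeline without a CPU round-trip.
    """
    r, g, b = rgb[:, 0], rgb[:, 1], rgb[:, 2]
    g_clamped = torch.minimum(g, torch.maximum(r, b))
    return torch.stack([r, g_clamped, b], dim=1)

# A pure-green frame loses its spill entirely; neutral pixels pass through.
frame = torch.zeros(1, 3, 2, 2)
frame[:, 1] = 1.0  # all-green input
out = despill_green(frame)
```

Because the heuristic only lowers G toward max(R, B), gray and skin tones (where G is already at or below that bound) are left untouched.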
torch.compile remains available for local GPU runs.

**Pipeline timing** (89 frames, batch 32 @ 1024px model resolution): CPU mask 22s → GPU load 5s → inference 29s → write 15s → stitch 9s ≈ 80s total, 49s on GPU. The model always processes at 1024x1024 or 2048x2048, regardless of input resolution.

## API

### REST API

**Step 1: Submit request**

```bash
curl -X POST "https://luminia-corridorkey.hf.space/gradio_api/call/process_video" \
  -H "Content-Type: application/json" \
  -d '{"data": ["video.mp4", "1024", 5, "Hybrid (auto)", true, 400]}'
```

**Step 2: Get result**

```bash
curl "https://luminia-corridorkey.hf.space/gradio_api/call/process_video/{event_id}"
```

### MCP (Model Context Protocol)

**MCP Config:**

```json
{
  "mcpServers": {
    "corridorkey": {
      "url": "https://luminia-corridorkey.hf.space/gradio_api/mcp/"
    }
  }
}
```

## Credits

- [CorridorKey](https://github.com/nikopueringer/CorridorKey) by Niko Pueringer / Corridor Digital (synced to `8a4e4b4`, 2026-05-01)
- [EZ-CorridorKey](https://github.com/edenaion/EZ-CorridorKey) UI reference by edenaion (synced to `888e032`, 2026-04-25)
- [BiRefNet](https://github.com/ZhengPeng7/BiRefNet) by ZhengPeng7
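The two-step REST flow can also be scripted. A minimal sketch using the `requests` library; the input list mirrors the curl example verbatim (the meaning of each positional value is not documented here, so none are named), and `submit`/`fetch` are illustrative helper names:

```python
import requests

SPACE = "https://luminia-corridorkey.hf.space"
ENDPOINT = f"{SPACE}/gradio_api/call/process_video"

def submit(data: list) -> str:
    """Step 1: POST the inputs; Gradio returns an event_id for polling."""
    resp = requests.post(ENDPOINT, json={"data": data}, timeout=30)
    resp.raise_for_status()
    return resp.json()["event_id"]

def fetch(event_id: str) -> str:
    """Step 2: GET the result stream for the given event_id."""
    resp = requests.get(f"{ENDPOINT}/{event_id}", timeout=600)
    resp.raise_for_status()
    return resp.text

# Usage (requires network access to the Space):
#   event_id = submit(["video.mp4", "1024", 5, "Hybrid (auto)", True, 400])
#   result = fetch(event_id)
```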