metadata
title: DeepSeek OCR PDF
emoji: 🏃
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: OCR interface for your PDF files
DeepSeek-OCR PDF & Image Interface
This Space wraps deepseek-ai/DeepSeek-OCR with a polished Gradio UI that can transcribe both individual images and multi-page PDFs into clean Markdown. It targets the free T4 GPU tier for fast startup while enabling flash-attention and optional vLLM acceleration for multi-page batching.
Features
- Support for .png, .jpg, .jpeg, .webp, .tiff, and .pdf
- Automatic PDF page conversion with PyMuPDF at 192 DPI
- Gundam mode defaults (base_size=1024, image_size=640, crop_mode=True) for balanced speed and accuracy
- Markdown-formatted output with per-page sections
- Optional custom prompt to tailor extraction instructions
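The PDF conversion and per-page output described in the list above can be sketched roughly as follows. This is a minimal illustration, not the code in app.py: the helper names (dpi_to_zoom, render_pdf_pages, pages_to_markdown) and the "## Page N" heading format are assumptions made for the example. Only the 192 DPI setting and the use of PyMuPDF come from the feature list.

```python
from typing import Iterable, List

PDF_UNITS_PER_INCH = 72  # PDF user space is defined at 72 units per inch
RENDER_DPI = 192         # rendering resolution named in the feature list


def dpi_to_zoom(dpi: int) -> float:
    """Return the scale factor relative to the 72-DPI PDF base resolution."""
    return dpi / PDF_UNITS_PER_INCH


def render_pdf_pages(pdf_path: str, dpi: int = RENDER_DPI) -> List[bytes]:
    """Render every page of a PDF to PNG bytes (hypothetical helper).

    PyMuPDF is imported lazily so the pure helpers in this sketch
    remain usable when the library is not installed.
    """
    import fitz  # PyMuPDF

    zoom = dpi_to_zoom(dpi)
    matrix = fitz.Matrix(zoom, zoom)
    with fitz.open(pdf_path) as doc:
        return [page.get_pixmap(matrix=matrix).tobytes("png") for page in doc]


def pages_to_markdown(page_texts: Iterable[str]) -> str:
    """Join per-page OCR text into one Markdown document.

    The "## Page N" heading is an assumed layout, not necessarily
    the one this Space emits.
    """
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    ]
    return "\n\n".join(sections)
```

Each rendered page would then be passed to the OCR model, and the resulting strings handed to pages_to_markdown to build the final output.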
Running Locally
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py
The interface launches on http://127.0.0.1:7860 by default. Set the environment variable USE_VLLM=0 to disable the vLLM backend; leave it enabled to get faster multi-page batching when the vLLM dependency is installed.
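One way such a toggle could be read is sketched below. It assumes that the backend counts as enabled when USE_VLLM is unset and that only the value 0 disables it; the function names are hypothetical and the actual logic in app.py may differ.

```python
import importlib.util
import os
from typing import Mapping


def vllm_requested(env: Mapping[str, str] = os.environ) -> bool:
    """True unless USE_VLLM is explicitly set to "0" (assumed default: enabled)."""
    return env.get("USE_VLLM", "1").strip() != "0"


def vllm_available() -> bool:
    """True when the vllm package can be found, without importing it."""
    return importlib.util.find_spec("vllm") is not None


def use_vllm(env: Mapping[str, str] = os.environ) -> bool:
    """Use vLLM only when it is both requested and installed."""
    return vllm_requested(env) and vllm_available()
```

Checking availability separately lets the app fall back to the standard backend when the dependency is missing, instead of failing at import time.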
Space Configuration
- Hardware: t4-small
- Python: 3.10
- SDK: Gradio 5.49.1
- Model: deepseek-ai/DeepSeek-OCR
Refer to the Spaces configuration reference for additional customization options.