Spaces: Illia56 / DeepSeek-OCR-PDF

metadata

title: DeepSeek OCR PDF
emoji: 🏃
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: OCR interface for your PDF files

DeepSeek-OCR PDF & Image Interface

This Space wraps deepseek-ai/DeepSeek-OCR with a polished Gradio UI that can transcribe both individual images and multi-page PDFs into clean Markdown. It targets the free T4 GPU tier for fast startup while enabling flash-attention and optional vLLM acceleration for multi-page batching.

Features

Support for .png, .jpg, .jpeg, .webp, .tiff, and .pdf
Automatic PDF page conversion with PyMuPDF at 192 DPI
Gundam mode defaults (base_size=1024, image_size=640, crop_mode=True) for balanced speed and accuracy
Markdown-formatted output with per-page sections
Optional custom prompt to tailor extraction instructions
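The PDF handling described above can be sketched roughly as follows. This is a minimal illustration, not the Space's actual app.py: the helper names (rendered_size, pdf_to_png_pages, pages_to_markdown) and the "## Page N" heading format are hypothetical, and only the 192 DPI value and the Gundam defaults come from the feature list. PDF page sizes are measured in points (72 per inch), so rendering at 192 DPI scales each page by 192/72.

```python
from typing import Dict, List, Tuple

# Values quoted from the feature list above ("Gundam mode" defaults).
GUNDAM_DEFAULTS: Dict[str, object] = {
    "base_size": 1024,
    "image_size": 640,
    "crop_mode": True,
}
PDF_DPI = 192          # rendering resolution quoted in the feature list
POINTS_PER_INCH = 72   # PDF page sizes are expressed in points


def rendered_size(width_pt: float, height_pt: float, dpi: int = PDF_DPI) -> Tuple[int, int]:
    """Approximate pixel size of a page (given in points) rendered at `dpi`."""
    scale = dpi / POINTS_PER_INCH
    return round(width_pt * scale), round(height_pt * scale)


def pdf_to_png_pages(pdf_path: str, dpi: int = PDF_DPI) -> List[bytes]:
    """Render every page of a PDF to PNG bytes with PyMuPDF."""
    import fitz  # PyMuPDF; imported lazily so the other helpers work without it

    zoom = dpi / POINTS_PER_INCH
    matrix = fitz.Matrix(zoom, zoom)  # same zoom on both axes
    with fitz.open(pdf_path) as doc:
        return [page.get_pixmap(matrix=matrix).tobytes("png") for page in doc]


def pages_to_markdown(page_texts: List[str]) -> str:
    """Join per-page OCR text into one Markdown document.

    The "## Page N" heading is an assumed format, chosen for illustration.
    """
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    ]
    return "\n\n".join(sections)
```

For example, a US Letter page (612 x 792 points) comes out at about 1632 x 2112 pixels at 192 DPI. Each rendered page would then be passed to the model with the Gundam defaults, and the resulting texts joined with pages_to_markdown.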

Running Locally

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py

The interface launches on http://127.0.0.1:7860 by default. Set the environment variable USE_VLLM=0 to disable the vLLM backend, or leave it enabled to use faster batching when the vLLM dependency is installed.
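One way such a switch could be read is sketched below. This is an assumption about the behaviour, not the code in app.py: the function names and the "transformers" fallback label are hypothetical, and treating an unset USE_VLLM as enabled is inferred from the wording above.

```python
import importlib.util
import os
from typing import Mapping, Optional


def vllm_requested(env: Optional[Mapping[str, str]] = None) -> bool:
    """True unless USE_VLLM is explicitly set to "0" (assumed default: enabled)."""
    env = os.environ if env is None else env
    return env.get("USE_VLLM", "1").strip() != "0"


def vllm_installed() -> bool:
    """True if the vllm package can be located, without importing it."""
    return importlib.util.find_spec("vllm") is not None


def choose_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Pick "vllm" only when it is both requested and installed."""
    if vllm_requested(env) and vllm_installed():
        return "vllm"
    return "transformers"  # hypothetical name for the non-vLLM path
```

With this sketch, running `USE_VLLM=0 python app.py` would always take the fallback path, while an unset variable would use vLLM only if the package is present.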

Space Configuration

Hardware: t4-small
Python: 3.10
SDK: Gradio 5.49.1
Model: deepseek-ai/DeepSeek-OCR

Refer to the Spaces configuration reference for additional customization options.