title: DeepSeek OCR PDF
emoji: 🏃
colorFrom: indigo
colorTo: pink
sdk: gradio
sdk_version: 5.49.1
app_file: app.py
pinned: false
short_description: OCR interface for your PDF files

DeepSeek-OCR PDF & Image Interface

This Space wraps deepseek-ai/DeepSeek-OCR in a Gradio UI that transcribes individual images and multi-page PDFs into Markdown. It targets the free T4 GPU tier for fast startup, enables flash-attention, and can optionally use vLLM to batch multi-page documents.

Features

  • Support for .png, .jpg, .jpeg, .webp, .tiff, and .pdf
  • Automatic PDF page conversion with PyMuPDF at 192 DPI
  • Gundam mode defaults (base_size=1024, image_size=640, crop_mode=True) for balanced speed and accuracy
  • Markdown-formatted output with per-page sections
  • Optional custom prompt to tailor extraction instructions
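The features above can be sketched as a few small helpers. This is a minimal illustration, not the code in app.py: the function names, the constant names, and the "## Page N" heading format are assumptions made for the example. Only the extension list, the 192 DPI value, and the Gundam mode parameters come from this README. The DPI conversion relies on PDF user space being 72 points per inch, so 192 DPI corresponds to a zoom factor of 192/72.

```python
from pathlib import Path

# Values taken from the feature list above.
SUPPORTED_SUFFIXES = {".png", ".jpg", ".jpeg", ".webp", ".tiff", ".pdf"}
PDF_DPI = 192
GUNDAM_MODE = {"base_size": 1024, "image_size": 640, "crop_mode": True}


def is_supported(filename: str) -> bool:
    """Return True if the file extension is one of the listed input types."""
    return Path(filename).suffix.lower() in SUPPORTED_SUFFIXES


def zoom_for_dpi(dpi: int = PDF_DPI) -> float:
    """Convert a target DPI to a render zoom factor (PDF space is 72 pt/inch)."""
    return dpi / 72


def render_pdf_pages(pdf_path: str, dpi: int = PDF_DPI) -> list[bytes]:
    """Render every page of a PDF to PNG bytes with PyMuPDF (hypothetical helper)."""
    import fitz  # PyMuPDF; imported lazily so the other helpers work without it

    zoom = zoom_for_dpi(dpi)
    matrix = fitz.Matrix(zoom, zoom)
    with fitz.open(pdf_path) as doc:
        return [page.get_pixmap(matrix=matrix).tobytes("png") for page in doc]


def pages_to_markdown(page_texts: list[str]) -> str:
    """Join per-page OCR text into one Markdown document.

    The "## Page N" heading is an assumed format, not necessarily the app's.
    """
    sections = [
        f"## Page {number}\n\n{text.strip()}"
        for number, text in enumerate(page_texts, start=1)
    ]
    return "\n\n".join(sections)
```

In a pipeline like this, each rendered page would be passed to the model with the GUNDAM_MODE settings, and the resulting strings handed to pages_to_markdown.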

Running Locally

python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python app.py

The interface launches at http://127.0.0.1:7860 by default. Set the environment variable USE_VLLM=0 to disable the vLLM backend; otherwise the app uses vLLM for faster batching when the dependency is available.
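One way the USE_VLLM switch could be interpreted is sketched below. The function names and the "transformers" fallback label are hypothetical, and the sketch assumes that an unset variable means vLLM is enabled and that only the value 0 disables it. The actual logic in app.py may differ.

```python
import importlib.util
import os


def vllm_requested(env=None) -> bool:
    """Return False only when USE_VLLM is explicitly set to 0."""
    env = os.environ if env is None else env
    # Assumption: the variable defaults to enabled when it is not set.
    return env.get("USE_VLLM", "1").strip() != "0"


def vllm_available() -> bool:
    """Check whether the vllm package can be imported, without importing it."""
    return importlib.util.find_spec("vllm") is not None


def choose_backend(env=None, available=None) -> str:
    """Pick "vllm" when requested and installed, otherwise fall back."""
    if available is None:
        available = vllm_available()
    return "vllm" if vllm_requested(env) and available else "transformers"
```

With this sketch, running `USE_VLLM=0 python app.py` would select the fallback backend even on a machine where vLLM is installed.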

Space Configuration

  • Hardware: t4-small
  • Python: 3.10
  • SDK: Gradio 5.49.1
  • Model: deepseek-ai/DeepSeek-OCR

Refer to the Spaces configuration reference for additional customization options.