# CodeFormer Face Restoration - Project Documentation
## 1. Introduction
**CodeFormer** is a robust blind face restoration algorithm designed to restore old, degraded, or AI-generated face images. It utilizes a **Codebook Lookup Transformer** (VQGAN-based) to predict high-quality facial features even from severe degradation, ensuring that the restored faces look natural and faithful to the original identity.
This project wraps the core CodeFormer research code into a deployable, user-friendly **Flask Web Application**, containerized with **Docker** for easy deployment on platforms like Hugging Face Spaces.
### Key Features
* **Blind Face Restoration:** Restores faces from low-quality inputs without knowing the specific degradation details.
* **Background Enhancement:** Uses **Real-ESRGAN** to upscale and enhance the non-face background regions of the image.
* **Face Alignment & Paste-back:** Automatically detects faces, aligns them for processing, and seamlessly blends them back into the original image.
* **Adjustable Fidelity:** Users can balance between restoration quality (hallucinating details) and identity fidelity (keeping the original look).
---
## 2. System Architecture
The application is built on a Python/PyTorch backend served via Flask.
### 2.1 Technology Stack
* **Framework:** Flask (Python Web Server)
* **Deep Learning:** PyTorch, TorchVision
* **Image Processing:** OpenCV, NumPy, Pillow
* **Core Libraries:** `basicsr` (BasicSR image/video restoration toolbox), `facelib` (face detection and alignment utilities)
* **Frontend:** HTML5, Bootstrap 5, Jinja2 Templates
* **Containerization:** Docker (CUDA-enabled)
### 2.2 Directory Structure
```
CodeFormer/
├── app.py               # Main Flask application entry point
├── Dockerfile           # Container configuration
├── requirements.txt     # Python dependencies
├── basicsr/             # Core AI framework (Super-Resolution tools)
├── facelib/             # Face detection and alignment utilities
├── templates/           # HTML Frontend
│   ├── index.html       # Upload interface
│   └── result.html      # Results display
├── static/              # Static assets (css, js, uploads)
│   ├── uploads/         # Temporary storage for input images
│   └── results/         # Temporary storage for processed output
└── weights/             # Pre-trained model weights (downloaded on startup)
    ├── CodeFormer/      # CodeFormer model (.pth)
    ├── facelib/         # Detection (RetinaFace) and Parsing models
    └── realesrgan/      # Background upscaler (Real-ESRGAN)
```
### 2.3 Logic Flow
1. **Input:** User uploads an image via the Web UI.
2. **Pre-processing (`app.py`):**
* Image is saved to `static/uploads`.
* Parameters (fidelity, upscale factor) are parsed.
3. **Inference Pipeline:**
* **Detection:** `facelib` detects faces in the image using RetinaFace.
* **Alignment:** Faces are cropped and aligned to a standard 512x512 resolution.
* **Restoration:** The **CodeFormer** model processes the aligned faces.
* **Upscaling (Optional):** The background is upscaled using **Real-ESRGAN**.
* **Paste-back:** Restored faces are warped back to their original positions and blended.
4. **Output:** The final image is saved to `static/results` and displayed to the user.
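The parameter handling in step 2 can be sketched as follows. This is an illustrative sketch only: the form field names, defaults, and value sets below are assumptions, not the app's actual form contract.

```python
# Hypothetical sketch of the parameter parsing in app.py (step 2 above).
# Field names ("fidelity", "upscale", "background_enhance") are assumptions.

def parse_params(form):
    """Validate and normalize the restoration parameters from the upload form."""
    try:
        fidelity = float(form.get("fidelity", 0.5))
    except (TypeError, ValueError):
        fidelity = 0.5
    fidelity = min(max(fidelity, 0.0), 1.0)   # clamp to the documented 0.0-1.0 range

    try:
        upscale = int(form.get("upscale", 2))
    except (TypeError, ValueError):
        upscale = 2
    if upscale not in (1, 2, 4):              # only 1x, 2x, and 4x are offered
        upscale = 2

    enhance_bg = form.get("background_enhance", "on") == "on"
    return fidelity, upscale, enhance_bg
```

Clamping and whitelisting here matter because these values are forwarded straight into the inference pipeline, where an out-of-range upscale factor would waste VRAM or fail outright.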
---
## 3. Installation & Deployment
### 3.1 Docker Deployment (Recommended)
The project ships with a Dockerfile and is intended to run as a container.
**Prerequisites:** Docker, NVIDIA GPU (optional, but recommended).
1. **Build the Image:**
```bash
docker build -t codeformer-app .
```
2. **Run the Container:**
```bash
# Run on port 7860 (Standard for HF Spaces)
docker run -it -p 7860:7860 codeformer-app
```
*Note: To use GPU, add the `--gpus all` flag to the run command.*
### 3.2 Hugging Face Spaces Deployment
This repository is configured for direct deployment to Hugging Face.
1. Create a **Docker** Space on Hugging Face.
2. Push this entire repository to the Space's Git remote.
```bash
git remote add hf git@hf.co:spaces/USERNAME/SPACE_NAME
git push hf main
```
3. The Space will build (approx. 5-10 mins) and launch automatically.
### 3.3 Local Development
1. **Install Environment:**
```bash
conda create -n codeformer python=3.8
conda activate codeformer
pip install -r requirements.txt
```
2. **Install BasicSR:**
```bash
python basicsr/setup.py install
```
3. **Run App:**
```bash
python app.py
```
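Once the app is running, a quick stdlib smoke check confirms the server is reachable. The URL below assumes Flask's default port 5000; adjust it if `app.py` binds elsewhere (e.g. 7860, as in the Docker setup).

```python
# Minimal availability check for a locally running instance.
# The port is an assumption; match it to whatever app.py actually binds.
import urllib.request

def is_up(url="http://127.0.0.1:5000/", timeout=5):
    """Return True if the given URL answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except OSError:            # connection refused, DNS failure, timeout, HTTP error
        return False
```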
---
## 4. User Guide (Web Interface)
### 4.1 Interface Controls
* **Input Image:** Supports standard formats (JPG, PNG, WEBP). Drag and drop supported.
* **Fidelity Weight (w):**
* **Range:** 0.0 to 1.0.
* **0.0 (Better Quality):** The model "hallucinates" more details. Results look very sharp and high-quality but may slightly alter the person's identity (look less like the original).
* **1.0 (Better Identity):** The model sticks strictly to the original features. Results are faithful to the original photo but might be blurrier or contain more artifacts.
* **Recommended:** 0.5 is a balanced default.
* **Upscale Factor:**
* Scales the final output resolution (1x, 2x, or 4x).
* *Note: Higher scaling requires more VRAM.*
* **Enhance Background:**
* If checked, runs Real-ESRGAN on the non-face areas.
* *Recommendation:* Keep checked for full-photo restoration. Uncheck if you only care about the face or are running on limited hardware.
* **Upsample Face:**
* If checked, the restored face is also upsampled to match the background resolution.
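Conceptually, the fidelity weight `w` acts as an interpolation knob between generated detail and the input's own features. The toy function below illustrates the trade-off on plain numbers; it is not the model's actual fusion math, which operates on deep feature maps inside the network.

```python
# Toy illustration of the fidelity trade-off (not CodeFormer's real internals):
# w = 0.0 leans entirely on generated (codebook) features -> sharper, less faithful;
# w = 1.0 leans entirely on the input's own features -> faithful, possibly blurrier.

def blend(generated, original, w):
    """Linearly blend two feature vectors by fidelity weight w in [0, 1]."""
    return [(1 - w) * g + w * o for g, o in zip(generated, original)]
```

At `w = 0.5` the result sits halfway between the two, which is why 0.5 is the recommended balanced default.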
### 4.2 Viewing Results
The result page features an interactive **Before/After Slider**. Drag the handle left and right to compare the pixels of the original versus the restored image directly.
---
## 5. Technical Details
### 5.1 Model Weights
The application automatically checks for and downloads the following weights to the `weights/` directory on startup:
| Model | Path | Description |
| :--- | :--- | :--- |
| **CodeFormer** | `weights/CodeFormer/codeformer.pth` | Main restoration model. |
| **RetinaFace** | `weights/facelib/detection_Resnet50_Final.pth` | Face detection. |
| **ParseNet** | `weights/facelib/parsing_parsenet.pth` | Face parsing (segmentation). |
| **Real-ESRGAN** | `weights/realesrgan/RealESRGAN_x2plus.pth` | Background upscaler (x2). |
### 5.2 Performance Notes
* **Memory:** The full pipeline (CodeFormer + Real-ESRGAN) requires significant RAM/VRAM. On CPU-only environments (like basic HF Spaces), processing a single image may take 30-60 seconds.
* **Git LFS:** Image assets in this repository are tracked with Git LFS to keep the repo size manageable.
---
## 6. Credits & References
* **Original Paper:** [Towards Robust Blind Face Restoration with Codebook Lookup Transformer (NeurIPS 2022)](https://arxiv.org/abs/2206.11253)
* **Authors:** Shangchen Zhou, Kelvin C.K. Chan, Chongyi Li, Chen Change Loy (S-Lab, Nanyang Technological University).
* **Original Repository:** [sczhou/CodeFormer](https://github.com/sczhou/CodeFormer)