# CodeFormer Face Restoration - Project Documentation

## 1. Introduction

**CodeFormer** is a robust blind face restoration algorithm designed to restore old, degraded, or AI-generated face images. It utilizes a **Codebook Lookup Transformer** (VQGAN-based) to predict high-quality facial features even from severe degradation, ensuring that the restored faces look natural and faithful to the original identity.

This project wraps the core CodeFormer research code into a deployable, user-friendly **Flask Web Application**, containerized with **Docker** for easy deployment on platforms like Hugging Face Spaces.

### Key Features

* **Blind Face Restoration:** Restores faces from low-quality inputs without knowing the specific degradation details.
* **Background Enhancement:** Uses **Real-ESRGAN** to upscale and enhance the non-face background regions of the image.
* **Face Alignment & Paste-back:** Automatically detects faces, aligns them for processing, and seamlessly blends them back into the original image.
* **Adjustable Fidelity:** Users can balance restoration quality (hallucinated detail) against identity fidelity (preserving the original look).

---

## 2. System Architecture

The application is built on a Python/PyTorch backend served via Flask.
### 2.1 Technology Stack

* **Framework:** Flask (Python web server)
* **Deep Learning:** PyTorch, TorchVision
* **Image Processing:** OpenCV, NumPy, Pillow
* **Core Libraries:** `basicsr` (Basic Super Restoration), `facelib` (face detection/utils)
* **Frontend:** HTML5, Bootstrap 5, Jinja2 templates
* **Containerization:** Docker (CUDA-enabled)

### 2.2 Directory Structure

```
CodeFormer/
├── app.py              # Main Flask application entry point
├── Dockerfile          # Container configuration
├── requirements.txt    # Python dependencies
├── basicsr/            # Core AI framework (super-resolution tools)
├── facelib/            # Face detection and alignment utilities
├── templates/          # HTML frontend
│   ├── index.html      # Upload interface
│   └── result.html     # Results display
├── static/             # Static assets (css, js, uploads)
│   ├── uploads/        # Temporary storage for input images
│   └── results/        # Temporary storage for processed output
└── weights/            # Pre-trained model weights (downloaded on startup)
    ├── CodeFormer/     # CodeFormer model (.pth)
    ├── facelib/        # Detection (RetinaFace) and parsing models
    └── realesrgan/     # Background upscaler (Real-ESRGAN)
```

### 2.3 Logic Flow

1. **Input:** The user uploads an image via the web UI.
2. **Pre-processing (`app.py`):**
   * The image is saved to `static/uploads`.
   * Parameters (fidelity, upscale factor) are parsed.
3. **Inference Pipeline:**
   * **Detection:** `facelib` detects faces in the image using RetinaFace.
   * **Alignment:** Faces are cropped and aligned to a standard 512x512 resolution.
   * **Restoration:** The **CodeFormer** model processes the aligned faces.
   * **Upscaling (optional):** The background is upscaled using **Real-ESRGAN**.
   * **Paste-back:** Restored faces are warped back to their original positions and blended.
4. **Output:** The final image is saved to `static/results` and displayed to the user.

---

## 3. Installation & Deployment

### 3.1 Docker Deployment (Recommended)

The project is optimized for Docker.

**Prerequisites:** Docker; NVIDIA GPU (optional, but recommended).
1. **Build the image:**

   ```bash
   docker build -t codeformer-app .
   ```

2. **Run the container:**

   ```bash
   # Run on port 7860 (standard for HF Spaces)
   docker run -it -p 7860:7860 codeformer-app
   ```

   *Note: To use a GPU, add the `--gpus all` flag to the run command.*

### 3.2 Hugging Face Spaces Deployment

This repository is configured for direct deployment to Hugging Face.

1. Create a **Docker** Space on Hugging Face.
2. Push this entire repository to the Space's Git remote:

   ```bash
   git remote add hf git@hf.co:spaces/USERNAME/SPACE_NAME
   git push hf main
   ```

3. The Space will build (approx. 5-10 minutes) and launch automatically.

### 3.3 Local Development

1. **Install the environment:**

   ```bash
   conda create -n codeformer python=3.8
   conda activate codeformer
   pip install -r requirements.txt
   ```

2. **Install basicsr:**

   ```bash
   python basicsr/setup.py install
   ```

3. **Run the app:**

   ```bash
   python app.py
   ```

---

## 4. User Guide (Web Interface)

### 4.1 Interface Controls

* **Input Image:** Supports standard formats (JPG, PNG, WEBP). Drag and drop is supported.
* **Fidelity Weight (w):**
  * **Range:** 0.0 to 1.0.
  * **0.0 (better quality):** The model "hallucinates" more detail. Results look very sharp and high-quality but may slightly alter the person's identity (look less like the original).
  * **1.0 (better identity):** The model sticks strictly to the original features. Results are faithful to the original photo but may be blurrier or contain more artifacts.
  * **Recommended:** 0.5 is a balanced default.
* **Upscale Factor:**
  * Scales the final output resolution (1x, 2x, or 4x).
  * *Note: Higher scaling requires more VRAM.*
* **Enhance Background:**
  * If checked, runs Real-ESRGAN on the non-face areas.
  * *Recommendation:* Keep checked for full-photo restoration. Uncheck if you only care about the face or are running on limited hardware.
* **Upsample Face:**
  * If checked, the restored face is also upsampled to match the background resolution.
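The effect of the fidelity weight `w` can be illustrated conceptually as an interpolation between two feature sources: the codebook prediction (sharp, "hallucinated" detail) and the encoder features of the input (faithful to identity). This is only a toy sketch — the actual CodeFormer model applies `w` through learned controllable feature transformation modules inside the network, and `blend_features` is an illustrative name, not part of the codebase:

```python
# Toy illustration of the fidelity weight w, NOT the real CodeFormer internals.
# w = 0.0 -> rely fully on the codebook prediction ("better quality");
# w = 1.0 -> rely fully on the input's own features ("better identity").

def blend_features(codebook_feat: list[float],
                   encoder_feat: list[float],
                   w: float) -> list[float]:
    """Linearly interpolate element-wise between the two feature vectors."""
    if not 0.0 <= w <= 1.0:
        raise ValueError("fidelity weight w must be in [0.0, 1.0]")
    return [(1.0 - w) * c + w * e
            for c, e in zip(codebook_feat, encoder_feat)]
```

At `w = 0.5` (the recommended default) the output sits halfway between the two extremes, which is why it trades a little sharpness for a result that still resembles the input face.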
### 4.2 Viewing Results

The result page features an interactive **Before/After slider**. Drag the handle left and right to compare the original and restored images pixel by pixel.

---

## 5. Technical Details

### 5.1 Model Weights

The application automatically checks for and downloads the following weights to the `weights/` directory on startup:

| Model | Path | Description |
| :--- | :--- | :--- |
| **CodeFormer** | `weights/CodeFormer/codeformer.pth` | Main restoration model. |
| **RetinaFace** | `weights/facelib/detection_Resnet50_Final.pth` | Face detection. |
| **ParseNet** | `weights/facelib/parsing_parsenet.pth` | Face parsing (segmentation). |
| **Real-ESRGAN** | `weights/realesrgan/RealESRGAN_x2plus.pth` | Background upscaler (x2). |

### 5.2 Performance Notes

* **Memory:** The full pipeline (CodeFormer + Real-ESRGAN) requires significant RAM/VRAM. On CPU-only environments (such as basic HF Spaces), processing a single image may take 30-60 seconds.
* **Git LFS:** Image assets in this repository are tracked with Git LFS to keep the repository size manageable.

---

## 6. Credits & References

* **Original Paper:** [Towards Robust Blind Face Restoration with Codebook Lookup Transformer (NeurIPS 2022)](https://arxiv.org/abs/2206.11253)
* **Authors:** Shangchen Zhou, Kelvin C.K. Chan, Chongyi Li, Chen Change Loy (S-Lab, Nanyang Technological University).
* **Original Repository:** [sczhou/CodeFormer](https://github.com/sczhou/CodeFormer)
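As an addendum, the startup weight check described in §5.1 can be sketched with the standard library alone. This is an illustrative sketch, not the app's actual code: `EXPECTED_WEIGHTS` and `missing_weights` are assumed names, and in practice a download helper (such as one from `basicsr`) would fetch whatever this check reports as absent:

```python
# Sketch of a startup check for the pre-trained weights listed in §5.1.
# Names are illustrative; only the paths come from the documentation above.
from pathlib import Path

EXPECTED_WEIGHTS = {
    "CodeFormer": "weights/CodeFormer/codeformer.pth",
    "RetinaFace": "weights/facelib/detection_Resnet50_Final.pth",
    "ParseNet": "weights/facelib/parsing_parsenet.pth",
    "Real-ESRGAN": "weights/realesrgan/RealESRGAN_x2plus.pth",
}

def missing_weights(root: str = ".") -> list[str]:
    """Return the names of models whose weight files are absent under root."""
    base = Path(root)
    return [name for name, rel in EXPECTED_WEIGHTS.items()
            if not (base / rel).is_file()]
```

Running this against a fresh checkout would report all four models as missing, which is the signal to trigger the downloads before the Flask server starts accepting uploads.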