Add comprehensive model card for SynLecSlideGen pipeline

#1, opened by nielsr (HF Staff)
Files changed (1): README.md +65 -3
@@ -1,3 +1,65 @@
- ---
- license: mit
- ---
+ ---
+ license: mit
+ pipeline_tag: object-detection
+ ---
+
+ # AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval
+
+ This repository contains the `SynLecSlideGen` pipeline and resources from the paper [AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval](https://huggingface.co/papers/2506.23605).
+
+ `SynLecSlideGen` is a large language model (LLM)-guided pipeline that generates high-quality, coherent, and realistic synthetic lecture slides. The slides come with automatically generated annotations for **Slide Element Detection** and **Text Query-based Slide Retrieval**, which helps compensate for limited labeled real-world data.
+
+ Project page: [https://synslidegen.github.io/](https://synslidegen.github.io/)
+ Code: [https://github.com/synslidegen/synslidegen_pipeline](https://github.com/synslidegen/synslidegen_pipeline)
+
+ ## Abstract
+ Lecture slide element detection and retrieval are key problems in slide understanding. Training effective models for these tasks often depends on extensive manual annotation. However, annotating large volumes of lecture slides for supervised training is labor-intensive and requires domain expertise. To address this, we propose a large language model (LLM)-guided synthetic lecture slide generation pipeline, SynLecSlideGen, which produces high-quality, coherent, and realistic slides. We also create an evaluation benchmark, RealSlide, by manually annotating 1,050 real lecture slides. To assess the utility of our synthetic slides, we perform few-shot transfer learning on real data using models pre-trained on them. Experimental results show that few-shot transfer learning with pretraining on synthetic slides significantly improves performance compared to training only on real data. This demonstrates that synthetic data can effectively compensate for limited labeled lecture slides. The code and resources of our work are publicly available on our project website: https://synslidegen.github.io/.
+
+ ## Sample Generated Slides for Object Detection (`SynDet`)
+ <table border="1">
+ <tr>
+ <td><img src="https://github.com/synslidegen/synslidegen_pipeline/blob/main/code/assets/syndet1.png?raw=true" alt="SynDet1" width="100%"></td>
+ <td><img src="https://github.com/synslidegen/synslidegen_pipeline/blob/main/code/assets/syndet2.png?raw=true" alt="SynDet2" width="100%"></td>
+ <td><img src="https://github.com/synslidegen/synslidegen_pipeline/blob/main/code/assets/syndet3.png?raw=true" alt="SynDet3" width="100%"></td>
+ </tr>
+ </table>
+
+ ## Usage
+ To use the `SynLecSlideGen` pipeline, first clone the repository and install its dependencies:
+ ```bash
+ git clone https://github.com/synslidegen/synslidegen_pipeline
+ cd synslidegen_pipeline
+ # Replace <virtual-environment-name> with a name of your choice
+ python3 -m venv <virtual-environment-name>
+ # Activate the virtual environment
+ # On Linux/macOS: source <virtual-environment-name>/bin/activate
+ # On Windows: <virtual-environment-name>\Scripts\activate.bat
+ pip install -r requirements.txt
+ ```
+
+ Then edit `code/config.py` to adjust parameters such as `SUBJECT`, `BOOK`, `MAX_TOPIC`, `LLM_MODEL`, and `TEMPERATURE`.
+
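As an illustration only, the sketch below shows what such settings might look like. The parameter names come from the list above; every value, the assumed types, and the `validate_config` helper are hypothetical placeholders, not the repository's actual defaults.

```python
# Hypothetical example values; the real defaults live in code/config.py.
SUBJECT = "Example Subject"   # placeholder subject name
BOOK = "Example Textbook"     # placeholder source book title
MAX_TOPIC = 5                 # assumed: integer cap on the number of topics
LLM_MODEL = "example-llm"     # placeholder model identifier
TEMPERATURE = 0.7             # assumed: LLM sampling temperature


def validate_config(subject, book, max_topic, llm_model, temperature):
    """Return a list of problems found in the settings (empty if none)."""
    problems = []
    for name, value in (("SUBJECT", subject), ("BOOK", book), ("LLM_MODEL", llm_model)):
        if not isinstance(value, str) or not value.strip():
            problems.append(f"{name} must be a non-empty string")
    if not isinstance(max_topic, int) or isinstance(max_topic, bool) or max_topic < 1:
        problems.append("MAX_TOPIC must be a positive integer")
    if (
        not isinstance(temperature, (int, float))
        or isinstance(temperature, bool)
        or temperature < 0
    ):
        problems.append("TEMPERATURE must be a non-negative number")
    return problems


if __name__ == "__main__":
    issues = validate_config(SUBJECT, BOOK, MAX_TOPIC, LLM_MODEL, TEMPERATURE)
    print("config OK" if not issues else "\n".join(issues))
```

Checking the values before a long generation run is optional, but it can catch a mistyped setting before any LLM calls are made.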
+ To generate synthetic presentation slides, run the PowerShell script, replacing the placeholder paths with the locations of your Python installation and virtual environment:
+ ```powershell
+ .\run_synslidegen.ps1 -py_path "\Custom\Python\Path" -env_path "\Custom\VirtualEnv\Path"
+ ```
+ Generated slides will be saved in the `ppts/` folder.
+
+ To generate automated annotations for **Slide Element Detection**, uncomment the relevant lines in `run_synslidegen.ps1`. These lines run `code/post_processing/pptx_to_png.py` and `code/post_processing/bbox_annotations.py`:
+ ```powershell
+ # python 'D:\Research_work\Experimentation_Results\py_pptx_code_gen_using_LLMs\pptGEN_CLEANED\code\post_processing\pptx_to_png.py'
+ # python 'D:\Research_work\Experimentation_Results\py_pptx_code_gen_using_LLMs\pptGEN_CLEANED\code\post_processing\bbox_annotations.py'
+ ```
+ *(Note: The paths shown are absolute and specific to the authors' development environment. Replace them with the locations of the two scripts under `code/post_processing` in your own clone.)*
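One way to avoid the hard-coded absolute paths is to resolve the two scripts relative to your clone and call them from Python. This is a minimal sketch under assumptions: the helper names `build_commands` and `run_post_processing` are hypothetical, the scripts are assumed to take no command-line arguments (as in the commented lines above), and running them from the repository root is a guess about what they expect.

```python
import subprocess
import sys
from pathlib import Path

# Script names and order follow the commented lines in run_synslidegen.ps1.
POST_PROCESSING_SCRIPTS = ("pptx_to_png.py", "bbox_annotations.py")


def build_commands(repo_root, python_exe=sys.executable):
    """Build one command per post-processing script, relative to the clone."""
    post_dir = Path(repo_root) / "code" / "post_processing"
    return [[str(python_exe), str(post_dir / name)] for name in POST_PROCESSING_SCRIPTS]


def run_post_processing(repo_root):
    """Run the scripts in order, stopping at the first failure."""
    for cmd in build_commands(repo_root):
        script = Path(cmd[1])
        if not script.is_file():
            raise FileNotFoundError(f"Expected script not found: {script}")
        # Assumption: the scripts can be launched from the repository root.
        subprocess.run(cmd, check=True, cwd=str(repo_root))


if __name__ == "__main__":
    # Pass the path to your clone, e.g. python run_post.py ./synslidegen_pipeline
    run_post_processing(sys.argv[1] if len(sys.argv) > 1 else ".")
```

Using `sys.executable` keeps the scripts in the same virtual environment that launched the helper.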
+
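For context on what a bounding-box annotation step involves, the sketch below converts a shape's position and size into pixel coordinates on the rendered slide image. It is not taken from `bbox_annotations.py`. It assumes shape geometry is given in English Metric Units (EMU), as libraries such as `python-pptx` report it, and that the rendered PNG covers the whole slide. The function name `emu_box_to_pixels` is hypothetical.

```python
EMU_PER_INCH = 914400  # English Metric Units per inch, the length unit in .pptx files


def emu_box_to_pixels(left, top, width, height, slide_w, slide_h, img_w, img_h):
    """Map a shape box in EMU to [x_min, y_min, x_max, y_max] in image pixels."""
    x_min = round(left * img_w / slide_w)
    y_min = round(top * img_h / slide_h)
    x_max = round((left + width) * img_w / slide_w)
    y_max = round((top + height) * img_h / slide_h)
    # Clamp to the image so shapes that overflow the slide stay in bounds.
    x_min, x_max = (min(max(v, 0), img_w) for v in (x_min, x_max))
    y_min, y_max = (min(max(v, 0), img_h) for v in (y_min, y_max))
    return [x_min, y_min, x_max, y_max]


if __name__ == "__main__":
    # A 10in x 7.5in slide rendered at 1000x750 pixels.
    slide_w, slide_h = 10 * EMU_PER_INCH, 6858000
    box = emu_box_to_pixels(
        EMU_PER_INCH, EMU_PER_INCH, 4 * EMU_PER_INCH, 2 * EMU_PER_INCH,
        slide_w, slide_h, 1000, 750,
    )
    print(box)  # [100, 100, 500, 300]
```

Scaling each axis independently keeps the boxes aligned even if the image is exported at a different aspect ratio than the slide.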
+ For more detailed usage instructions on generating annotations for slide image retrieval or fine-tuning, please refer to the [official GitHub repository](https://github.com/synslidegen/synslidegen_pipeline).
+
+ ## Citation
+ If you find our work helpful, please cite our paper:
+ ```bibtex
+ @inproceedings{synslidegen,
+ title={SynSlideGen: AI-Generated Lecture Slides for Improving Slide Element Detection and Retrieval},
+ author={Suyash Maniyar and Vishvesh Trivedi and Ajoy Mondal and Anand Mishra and C.V. Jawahar},
+ booktitle={ICDAR},
+ year={2025}
+ }
+ ```