---
license: apache-2.0
---
# 🚨 SyntheticGen

**Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation**

Addressing class imbalance in remote sensing datasets through controlled synthetic generation.
## 🌍 Overview
SyntheticGen tackles the long-tail distribution problem in LoveDA by generating synthetic imagery with explicit control over class ratios. You can specify exactly what proportion of each land cover class should appear in the output.
## 🔥🔥 Updates

- 🚀 Try SyntheticGen in 2 minutes (no setup required) at the ✨✨Live Demo✨✨
- 🤗 Weights released on Hugging Face
- Our paper was accepted to the IEEE International Geoscience and Remote Sensing Symposium (IGARSS) 2026.
## ✨ Highlights
- Two-stage pipeline: ratio-conditioned layout D3PM + ControlNet image synthesis.
- Full or sparse ratio control (e.g., `building:0.4`).
- Config-first workflow for reproducible experiments.
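The ratio string format above (e.g. `building:0.4`) maps class names to target area fractions, where unspecified classes are left unconstrained (sparse control). A minimal sketch of such a parser, assuming comma-separated `name:fraction` pairs; the repository's actual parsing code may differ:

```python
def parse_ratios(spec: str) -> dict[str, float]:
    """Parse a ratio string such as "building:0.4,forest:0.3" into a
    class -> target-fraction mapping. Classes not listed are treated as
    unconstrained; listed fractions must sum to at most 1.0."""
    ratios = {}
    for item in spec.split(","):
        name, value = item.split(":")
        ratios[name.strip()] = float(value)
    total = sum(ratios.values())
    if total > 1.0 + 1e-6:
        raise ValueError(f"ratios sum to {total:.2f}, which exceeds 1.0")
    return ratios

print(parse_ratios("building:0.4,forest:0.3"))
# {'building': 0.4, 'forest': 0.3}
```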
## ❓ What we try to answer
### 🛰️ Why is remote-sensing segmentation still difficult, even with strong modern models?
Because the problem is not only in the model; it is also in the data. Some land-cover classes appear again and again, while others are so rare that the model barely gets a chance to learn them. In LoveDA, this becomes even more challenging because the dataset is split into Urban and Rural domains, each with different scene characteristics and different class distributions.
### ⚙️ So what if we could control the data instead of just accepting it as it is?
That is exactly the idea behind SyntheticGen. Instead of using augmentation as a random process, SyntheticGen makes it controllable. Users can explicitly specify target class ratios and domain conditions during generation, making it possible to create synthetic samples that are not just more numerous, but more useful. This means rare classes can be strengthened deliberately, while still preserving realistic layouts and domain-consistent appearance.
### 🧠 What makes SyntheticGen stand out?
Its strength lies in a carefully designed two-stage pipeline. First, a ratio-conditioned discrete diffusion model generates semantically meaningful layouts. Then, a ControlNet-guided image synthesis stage converts those layouts into realistic remote-sensing imagery. By separating semantic control from visual rendering, the framework achieves something highly valuable: it is both principled and practical.
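The two-stage split can be illustrated with a deliberately tiny stand-in: stage A fills a label grid so each class occupies roughly its requested fraction of cells, and stage B maps labels to pixel values. This toy sketch only mirrors the *structure* of the pipeline; the real stages are a ratio-conditioned D3PM and a ControlNet, and the class names and palette here are illustrative:

```python
def sample_layout(ratios: dict[str, float], size: int = 8) -> list[list[str]]:
    """Stage A stand-in: produce a size x size semantic layout whose
    class proportions approximately match the target ratios."""
    cells = size * size
    flat = []
    for name, frac in ratios.items():
        flat.extend([name] * round(frac * cells))
    flat.extend(["background"] * (cells - len(flat)))  # fill the remainder
    return [flat[r * size:(r + 1) * size] for r in range(size)]

# Illustrative label -> RGB palette (not LoveDA's actual colour map).
PALETTE = {"building": (255, 0, 0), "forest": (0, 128, 0), "background": (0, 0, 0)}

def render(layout: list[list[str]]) -> list[list[tuple]]:
    """Stage B stand-in: turn the semantic layout into an 'image'."""
    return [[PALETTE[label] for label in row] for row in layout]

layout = sample_layout({"building": 0.4, "forest": 0.3})
image = render(layout)
```

Separating the two stages means the semantic constraint (class ratios) is enforced where it is cheap to enforce, on discrete labels, while the hard visual problem is delegated to a model that only has to respect a given layout.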
### 📈 Why does that matter beyond this single benchmark?
Because this is not just another generative model for remote sensing. SyntheticGen introduces a targeted augmentation strategy for improving segmentation under class imbalance and domain shift, and shows that synthetic data can be used not just to add more images, but to add the right images.
### 🌍 The bigger message
SyntheticGen is a step toward data-centric remote-sensing segmentation β a setting where the training distribution is no longer passively accepted, but actively designed. Our paper shows that better segmentation is not only about building better models, but also about building better data.
## 🚀 Quick Start
### Installation

```bash
git clone https://github.com/Buddhi19/SyntheticGen.git
cd SyntheticGen
```

### Install Dependencies

```bash
conda create -n diffusors python=3.10 -y
conda activate diffusors
# official PyTorch Linux + CUDA 12.8 install for v2.10.0
python -m pip install torch==2.10.0 torchvision==0.25.0 torchaudio==2.10.0 --index-url https://download.pytorch.org/whl/cu128
python -m pip install -r requirements.txt
```
### Generate Your First Synthetic Image

```bash
python src/scripts/sample_pair.py \
    --config configs/sample_pair_ckpt40000_building0.4.yaml
```
## 📖 Usage

### Training Pipeline (Configs)
#### Stage A: Train Layout Generator (D3PM)

```bash
python src/scripts/train_layout_d3pm.py \
    --config configs/train_layout_d3pm_masked_sparse_80k.yaml
```
#### (Optional) Ratio Prior for Sparse Conditioning

```bash
python src/scripts/compute_ratio_prior.py \
    --config configs/compute_ratio_prior_loveda_train.yaml
```
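Conceptually, a ratio prior is the empirical class-pixel distribution of the training masks, which sparse conditioning can fall back on for unspecified classes. A hypothetical sketch of that computation on small in-memory label maps (`ratio_prior` is illustrative; the real script reads LoveDA mask PNGs and is driven by its config):

```python
from collections import Counter

def ratio_prior(masks: list[list[list[int]]]) -> dict[int, float]:
    """Compute the empirical per-class pixel frequency over a set of
    integer label maps (each a list of rows of class IDs)."""
    counts = Counter()
    for mask in masks:
        for row in mask:
            counts.update(row)
    total = sum(counts.values())
    return {cls: n / total for cls, n in sorted(counts.items())}

# Two tiny 2x2 label maps with classes 0, 1, 2:
prior = ratio_prior([[[0, 0], [1, 2]], [[1, 1], [1, 0]]])
print(prior)
# {0: 0.375, 1: 0.5, 2: 0.125}
```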
#### Stage B: Train Image Generator (ControlNet)

```bash
python src/scripts/train_controlnet_ratio.py \
    --config configs/train_controlnet_ratio_loveda_1024.yaml
```
### Inference / Sampling (Configs)

End-to-end sampling (layout -> image):

```bash
python src/scripts/sample_pair.py \
    --config configs/sample_pair_ckpt40000_building0.4.yaml
```
Override config parameters via the CLI if needed:

```bash
python src/scripts/sample_pair.py \
    --config configs/sample_pair_ckpt40000_building0.4.yaml \
    --ratios "building:0.4,forest:0.3" \
    --save_dir outputs/custom_generation
```
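The usual way to implement "CLI flags override config values" is a merge in which only explicitly passed flags take precedence over the YAML. A hypothetical sketch of that precedence rule (`apply_overrides` and the keys shown are illustrative, not repo code):

```python
def apply_overrides(config: dict, overrides: dict) -> dict:
    """Return a copy of `config` where CLI values that were actually
    provided (i.e. are not None) replace the config's values."""
    merged = dict(config)
    merged.update({k: v for k, v in overrides.items() if v is not None})
    return merged

# YAML config values vs. values parsed from the command line:
cfg = {"ratios": "building:0.4", "save_dir": "outputs/default", "steps": 50}
cli = {"ratios": "building:0.4,forest:0.3", "save_dir": None}  # --save_dir not passed
print(apply_overrides(cfg, cli))
# {'ratios': 'building:0.4,forest:0.3', 'save_dir': 'outputs/default', 'steps': 50}
```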
## ⚙️ Configuration

All experiments are driven by YAML/JSON config files in `configs/`.
| Task | Script | Example Config |
|---|---|---|
| Layout Training | `src/scripts/train_layout_d3pm.py` | `configs/train_layout_d3pm_masked_sparse_80k.yaml` |
| Ratio Prior | `src/scripts/compute_ratio_prior.py` | `configs/compute_ratio_prior_loveda_train.yaml` |
| ControlNet Training | `src/scripts/train_controlnet_ratio.py` | `configs/train_controlnet_ratio_loveda_1024.yaml` |
| Sampling / Inference | `src/scripts/sample_pair.py` | `configs/sample_pair_ckpt40000_building0.4.yaml` |
**Config tips**

- Examples live in `configs/`.
- To resume training, set `resume_from_checkpoint: "checkpoint-XXXXX"` in your config.
- Dataset roots and domains are centralized in configs; edit once, reuse everywhere.
- CLI flags override config values for quick experiments.
## 📊 Data Format

### LoveDA Dataset Structure
```
LoveDA/
  Train/
    Train/              # some releases include this extra nesting
      Urban/
        images_png/
        masks_png/
      Rural/
        images_png/
        masks_png/
    Urban/
      images_png/
      masks_png/
    Rural/
      images_png/
      masks_png/
  Val/
    ...
```
### Generic Dataset Structure

```
your_dataset/
  images/
    image_001.png
  masks/
    image_001.png       # label map with matching stem
```
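A loader for this layout only needs to pair files by filename stem. A hypothetical helper illustrating the convention (`pair_by_stem` is not part of the repository):

```python
from pathlib import Path

def pair_by_stem(image_dir: Path, mask_dir: Path) -> list[tuple[Path, Path]]:
    """Match each image to the mask sharing its stem, per the layout
    above (images/image_001.png <-> masks/image_001.png)."""
    masks = {p.stem: p for p in mask_dir.glob("*.png")}
    pairs = []
    for img in sorted(image_dir.glob("*.png")):
        if img.stem not in masks:
            raise FileNotFoundError(f"no mask found for {img.name}")
        pairs.append((img, masks[img.stem]))
    return pairs
```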
## 📦 Pre-Generated Datasets
We provide synthetic datasets used in the paper: https://drive.google.com/drive/folders/14cMpLTgvcLdXhRY0kGhFKpDRMvpok90h?usp=sharing
## 🧾 Outputs

- Checkpoints include `training_config.json` and `class_names.json`.
- Sampling writes `image.png`, `layout.png`, and `metadata.json`.
## 📝 Citation

```bibtex
@misc{wijenayake2026mitigating,
      title={Mitigating Long-Tail Bias via Prompt-Controlled Diffusion Augmentation},
      author={Buddhi Wijenayake and Nichula Wasalathilake and Roshan Godaliyadda and Vijitha Herath and Parakrama Ekanayake and Vishal M. Patel},
      year={2026},
      eprint={2602.04749},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2602.04749},
}
```
## 📄 License

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
## 🙏 Acknowledgments
- LoveDA dataset creators for high-quality annotated remote sensing data
- Hugging Face Diffusers for diffusion model infrastructure
- ControlNet authors for controllable generation