---
license: apache-2.0
library_name: diffusers
pipeline_tag: text-to-image
datasets:
- opendiffusionai/laion2b-squareish-1536px
thumbnail: https://huggingface.co/neuralvfx/Z-Image-SAM-ControlNet/resolve/main/examples/side_by_side_b.png
base_model: jimmycarter/LibreFLUX
---

# LibreFLUX-ControlNet

![Example: Control image vs result](examples/side_by_side_b.png)

# Update - 4/10/2026

- Retrained this model on [laion2b-squareish-1536px](https://huggingface.co/datasets/opendiffusionai/laion2b-squareish-1536px)
- I tripled the control layers to get better guidance

# Fun Facts

- Trained exclusively on images generated by [Segment Anything (SAM)](https://aidemos.meta.com/segment-anything/)
- Takes SAM-style segmentation images as input and outputs photorealistic images ( a sketch for producing one follows below )
- Trained at 1024x1024 resolution; inference works best at 1.5k and up
- Trained on 320K segmented images from [laion2b-squareish-1536px](https://huggingface.co/datasets/opendiffusionai/laion2b-squareish-1536px)
- Base model is [LibreFLUX](https://huggingface.co/jimmycarter/LibreFLUX) ( de-distilled FLUX )
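# Making a Control Image

The control inputs are the colorful mask overlays produced by SAM's automatic mask generator. Below is a minimal sketch using Meta's `segment-anything` package; the ViT-H checkpoint path and the random-color painting are illustrative assumptions, not the exact preprocessing used for training.

```py
# Hedged sketch: turn a photo into a SAM-style segmentation image.
# Assumes `pip install segment-anything` and a downloaded ViT-H checkpoint;
# the coloring scheme here is an assumption for illustration.
import numpy as np
from PIL import Image
from segment_anything import SamAutomaticMaskGenerator, sam_model_registry

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
mask_generator = SamAutomaticMaskGenerator(sam)

image = np.array(Image.open("photo.png").convert("RGB"))
masks = mask_generator.generate(image)  # list of dicts with a boolean "segmentation" map

# Paint each region a random color, largest first so small regions stay visible
canvas = np.zeros_like(image)
for m in sorted(masks, key=lambda m: m["area"], reverse=True):
    canvas[m["segmentation"]] = np.random.randint(0, 256, size=3)

Image.fromarray(canvas).save("control_image.png")
```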
# Showcases

# Extra Details

- I built this repo to train the model: [https://github.com/NeuralVFX/LibreFLUX-ControlNet](https://github.com/NeuralVFX/LibreFLUX-ControlNet)
- Trained in the same non-distilled fashion as [LibreFLUX](https://huggingface.co/jimmycarter/LibreFLUX)
- Uses attention masking
- Uses CFG during inference ( allows negative prompting )
- Inference code roughly adapted from: [https://github.com/bghira/SimpleTuner](https://github.com/bghira/SimpleTuner)

# ComfyUI

- I've made some custom nodes for this: [https://github.com/NeuralVFX/LibreFLUX-ComfyUI](https://github.com/NeuralVFX/LibreFLUX-ComfyUI)

# Compatibility

```sh
pip install -U diffusers==0.32.0
pip install -U "transformers @ git+https://github.com/huggingface/transformers@e15687fffe5c9d20598a19aeab721ae0a7580f8a"
```

Low VRAM:

```sh
pip install optimum-quanto
```

# Load Pipeline

```py
import torch
from diffusers import DiffusionPipeline

model_id = "neuralvfx/LibreFlux-ControlNet"

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    model_id,
    custom_pipeline=model_id,
    trust_remote_code=True,
    torch_dtype=dtype,
    safety_checker=None,
).to(device)
```

# Inference

```py
import torch
from PIL import Image
from torchvision.transforms import ToTensor

# Load the SAM-style control image
cond = Image.open("examples/libre_flux_control_image.png")
cond = cond.resize((1024, 1024))

# Convert the PIL image to a (1, 3, H, W) tensor on the pipeline's device and dtype
cond_tensor = ToTensor()(cond)[:3, :, :].to(pipe.device, dtype=pipe.dtype).unsqueeze(0)

out = pipe(
    prompt="many pieces of drift wood spelling libre flux sitting casting shadow on the lumpy sandy beach with foot prints all over it",
    negative_prompt="blurry",
    control_image=cond_tensor,
    num_inference_steps=75,
    guidance_scale=4.0,
    height=1024,
    width=1024,
    controlnet_conditioning_scale=1.0,
    num_images_per_prompt=1,
    control_mode=None,
    generator=torch.Generator().manual_seed(32),
    return_dict=True,
)

out.images[0]
```

# Load Pipeline ( Low VRAM )

```py
import torch
from diffusers import DiffusionPipeline
from optimum.quanto import freeze, quantize, qint8

model_id = "neuralvfx/LibreFlux-ControlNet"

device = "cuda" if torch.cuda.is_available() else "cpu"
dtype = torch.bfloat16 if device == "cuda" else torch.float32

pipe = DiffusionPipeline.from_pretrained(
    model_id,
    custom_pipeline=model_id,
    trust_remote_code=True,
    torch_dtype=dtype,
    safety_checker=None,
)

# Quantize the transformer and controlnet weights to int8, keeping
# norm, embedding, and projection layers in full precision
exclude = [
    "*.norm",
    "*.norm1",
    "*.norm2",
    "*.norm2_context",
    "proj_out",
    "x_embedder",
    "norm_out",
    "context_embedder",
]
quantize(pipe.transformer, weights=qint8, exclude=exclude)
quantize(pipe.controlnet, weights=qint8, exclude=exclude)
freeze(pipe.transformer)
freeze(pipe.controlnet)

pipe.enable_model_cpu_offload()
```
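# Inference ( Low VRAM )

Generation with the quantized, CPU-offloaded pipeline works the same as in the Inference section above; `enable_model_cpu_offload()` handles device placement, so skip the `.to(device)` call. A minimal sketch, reusing `cond_tensor` from that section:

```py
# Minimal sketch: run the quantized, CPU-offloaded pipeline and save the result
out = pipe(
    prompt="many pieces of drift wood spelling libre flux sitting casting shadow on the lumpy sandy beach with foot prints all over it",
    negative_prompt="blurry",
    control_image=cond_tensor,  # same (1, 3, H, W) control tensor as above
    num_inference_steps=75,
    guidance_scale=4.0,
    height=1024,
    width=1024,
    generator=torch.Generator().manual_seed(32),
)
out.images[0].save("libre_flux_result.png")
```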