ValueError: state dict does not contain bitsandbytes__* / quantized_stats when loading sharded bnb-4bit FLUX.2 pipeline via DiffusionPipeline.from_pretrained

#2
by pmlakner - opened

Loading the Hugging Face repo diffusers/FLUX.2-dev-bnb-4bit fails during from_pretrained() with a ValueError stating that the supplied state dict for a layer weight does not contain `bitsandbytes__*` and possibly other `quantized_stats` components. The error occurs while the pipeline components are being loaded (after the checkpoint shards start loading).

This prevents using the official bnb-4bit FLUX.2 checkpoint via diffusers.

Environment

  • Python: 3.11
  • GPU runtime: CUDA 12.8 (PyTorch build +cu128)

Relevant package versions

  • diffusers==0.37.0.dev0
  • transformers==4.51.3
  • torch==2.8.0+cu128
  • bitsandbytes==0.49.2
  • accelerate==0.33.0
  • safetensors==0.7.0
  • huggingface-hub==0.36.1
  • tokenizers==0.21.4

Reproduction steps

 
import torch
from diffusers import DiffusionPipeline
from diffusers.utils import load_image

pipe = DiffusionPipeline.from_pretrained(
    "diffusers/FLUX.2-dev-bnb-4bit",
    dtype=torch.bfloat16,
    device_map="cuda",
)

prompt = "Turn this cat into a dog"
input_image = load_image(
    "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/cat.png"
)

image = pipe(image=input_image, prompt=prompt).images[0]

Observed behavior

Two things happen:

Warning:

Keyword arguments {'dtype': torch.bfloat16} are not expected by Flux2Pipeline and will be ignored.

Crash during model/component load:

Traceback (most recent call last):
  File "/root/desigen/flux2.py", line 5, in 
    pipe = DiffusionPipeline.from_pretrained("diffusers/FLUX.2-dev-bnb-4bit", dtype=torch.bfloat16, device_map="cuda")
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
    return fn(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/diffusers/pipelines/pipeline_utils.py", line 1043, in from_pretrained
    loaded_sub_model = load_sub_model(
                       ^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/diffusers/pipelines/pipeline_loading_utils.py", line 885, in load_sub_model
    loaded_sub_model = load_method(os.path.join(cached_folder, name), **loading_kwargs)
                       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 279, in _wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4399, in from_pretrained
    ) = cls._load_pretrained_model(
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 4833, in _load_pretrained_model
    disk_offload_index, cpu_offload_index = _load_state_dict_into_meta_model(
                                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/torch/utils/_contextlib.py", line 120, in decorate_context
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/root/desigen/.venv/lib/python3.11/site-packages/transformers/modeling_utils.py", line 827, in _load_state_dict_into_meta_model
    hf_quantizer.create_quantized_param(
  File "/root/desigen/.venv/lib/python3.11/site-packages/transformers/quantizers/quantizer_bnb_4bit.py", line 212, in create_quantized_param
    raise ValueError(
ValueError: Supplied state dict for language_model.model.layers.26.mlp.up_proj.weight does not contain `bitsandbytes__*` and possibly other `quantized_stats` components.

🧨Diffusers org

Where did you get the code? It should be:

pipe = DiffusionPipeline.from_pretrained(
    "diffusers/FLUX.2-dev-bnb-4bit",
    torch_dtype=torch.bfloat16,
    device_map="cuda",
)

The error you're getting comes from the transformers library. I tested it with transformers==5.1.0 and it works without a problem. Also, I'm not sure what you're loading, but language_model.model.layers.26 does have the quant data, so maybe you have a corrupted download?

[screenshot: shard contents of the repo showing the quantization stats present for language_model.model.layers.26]
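For anyone who wants to verify this locally, here is a minimal sketch (not from the thread) of how one could check whether the downloaded shards actually contain the bitsandbytes quant-state keys for that layer. The "text_encoder" subfolder and the index filename are assumptions about the repo layout; adjust them if the failing component lives elsewhere.

import json
from huggingface_hub import hf_hub_download
from safetensors import safe_open

repo = "diffusers/FLUX.2-dev-bnb-4bit"
param = "language_model.model.layers.26.mlp.up_proj.weight"

# Assumed layout: a sharded safetensors index under text_encoder/ (hypothetical path).
index_path = hf_hub_download(repo, "text_encoder/model.safetensors.index.json")
with open(index_path) as f:
    weight_map = json.load(f)["weight_map"]

# Collect every shard that stores keys belonging to this parameter.
shard_files = sorted({f for k, f in weight_map.items() if k.startswith(param)})

quant_keys = []
for shard_file in shard_files:
    shard_path = hf_hub_download(repo, f"text_encoder/{shard_file}")
    with safe_open(shard_path, framework="pt") as sf:
        quant_keys += [k for k in sf.keys() if k.startswith(param)]

# A healthy bnb-4bit checkpoint should list extra keys alongside the weight,
# e.g. ending in ".quant_state.bitsandbytes__*"; seeing only the bare weight
# key suggests a corrupted or incomplete download.
print(quant_keys)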

I got the code from the "Use this model" button on this page. I updated transformers to 5.1.0, but now I am getting a different issue. What version of diffusers did you run it with?

I fixed it. I needed to update peft and accelerate, and now it works. Thanks!
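For reference, a quick sketch (not from the thread) for confirming which versions ended up installed after upgrading; the package list simply mirrors the ones mentioned in this discussion.

# Print the installed versions of the packages discussed above, to confirm
# the upgrade actually took effect in the active environment.
from importlib.metadata import PackageNotFoundError, version

for pkg in ("diffusers", "transformers", "bitsandbytes", "accelerate", "peft"):
    try:
        print(f"{pkg}=={version(pkg)}")
    except PackageNotFoundError:
        print(f"{pkg}: not installed")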

pmlakner changed discussion status to closed
