
Error in sample code

#7
by martineden - opened

I am trying to run the model using the sample code on the model description page, and I get this error:

RuntimeError: Expected attn_mask dtype to be bool or float or to match query dtype, but got attn_mask.dtype: struct c10::BFloat16 and query.dtype: float instead.

  • The error is raised by this part of the code, after the model has loaded successfully:

    # Forward pass
    with torch.no_grad():
        image_embeddings = model(**batch_images)  # <- the error is raised on this line
        query_embeddings = model(**batch_queries)

  • I am loading the model as shown on the model page:

    model_name = "vidore/colpali-v1.3-hf"

    model = ColPaliForRetrieval.from_pretrained(
        model_name,
        torch_dtype=torch.bfloat16,
        device_map="cuda:0",  # "cuda:0" for an NVIDIA GPU, or "mps" on Apple Silicon
    ).eval()

    processor = ColPaliProcessor.from_pretrained(model_name)

  • I have an RTX 5090 and PyTorch built with CUDA 12.8:

    nvcc --version
    nvcc: NVIDIA (R) Cuda compiler driver
    Copyright (c) 2005-2025 NVIDIA Corporation
    Built on Fri_Feb_21_20:42:46_Pacific_Standard_Time_2025
    Cuda compilation tools, release 12.8, V12.8.93
    Build cuda_12.8.r12.8/compiler.35583870_0

    pip show torch
    Name: torch
    Version: 2.7.0+cu128
    Summary: Tensors and Dynamic neural networks in Python with strong GPU acceleration
    Home-page: https://pytorch.org/
    Author: PyTorch Team
    Author-email: packages@pytorch.org
    License: BSD-3-Clause
    Location: E:\Python312\Lib\site-packages
    Requires: filelock, fsspec, jinja2, networkx, setuptools, sympy, typing-extensions
    Required-by: accelerate, compressed-tensors, outlines, sentence-transformers, torchaudio, torchvision
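
Since the error message is about a dtype mismatch (a bfloat16 attention mask against a float query), it can help to print the dtypes going into the forward pass. The helper below is only a rough sketch: `summarize_dtypes` is a hypothetical name, not part of the sample code or of transformers, and it assumes the batch behaves like a dict of tensors.

```python
def summarize_dtypes(batch):
    """Map each key of a dict-like batch to the string form of its dtype.

    Hypothetical diagnostic helper: values without a `dtype` attribute map to None.
    """
    return {
        name: str(value.dtype) if hasattr(value, "dtype") else None
        for name, value in batch.items()
    }
```

It could be called as `print(summarize_dtypes(batch_images))` just before the forward pass, alongside `print(next(model.parameters()).dtype)` to see the dtype the model weights were loaded in.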

Vidore org

Hey @martineden, this seems to be an issue with the latest versions of transformers: the script runs fine with transformers==4.53.3, but not with 4.54 or later.

I'll investigate, but until a fix is available, the simplest option is to downgrade the transformers version in your environment.
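
To check whether an environment is in the range reported above, something like the following sketch could be run before loading the model. The helper names (`major_minor`, `is_affected`, `installed_transformers_version`) are made up for illustration, and the 4.54 boundary is simply the one mentioned in this thread, not a confirmed root cause.

```python
import re
from importlib.metadata import PackageNotFoundError, version

# Assumption taken from the reply above: 4.53.3 works, 4.54 and later do not.
FIRST_AFFECTED = (4, 54)


def major_minor(version_string):
    """Return (major, minor) from a version string such as '4.54.0.dev0'."""
    match = re.match(r"(\d+)\.(\d+)", version_string)
    if match is None:
        raise ValueError(f"unrecognised version string: {version_string!r}")
    return int(match.group(1)), int(match.group(2))


def is_affected(version_string):
    """True if the version falls in the range reported as failing in this thread."""
    return major_minor(version_string) >= FIRST_AFFECTED


def installed_transformers_version():
    """Return the installed transformers version, or None if it is not installed."""
    try:
        return version("transformers")
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_transformers_version()
    if found is None:
        print("transformers is not installed")
    elif is_affected(found):
        print(f"transformers {found} is in the range reported as failing; try 4.53.3")
    else:
        print(f"transformers {found} is older than 4.54")
```

If the check reports an affected version, pinning the package with `pip install "transformers==4.53.3"` matches the downgrade suggested above.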
