
LatentLens: Qwen2-VL Contextual Text Embeddings

Pre-computed contextual text embeddings from the Qwen2-VL-7B-Instruct LLM backbone, extracted at 8 transformer layers. Used by the LatentLens quickstart for interpreting visual token representations.

What is this?

LatentLens interprets continuous token representations (e.g., visual tokens in a VLM) by finding their nearest neighbors in contextual text embedding space, the same space the LLM uses internally. The embeddings in this repository are that text embedding bank.

Each layer directory contains an embeddings_cache.pt file with:

  • embeddings: [300836, 3584] float16 tensor of contextual embeddings covering ~26K unique text tokens, each with up to 20 contextual variants drawn from Visual Genome captions
  • token_to_indices: dict mapping token string → list of embedding row indices
  • metadata: list of dicts with token string, token ID, source caption, and position
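The nearest-neighbor lookup LatentLens performs can be sketched against this structure. The snippet below is a minimal, self-contained sketch: the toy cache and its field names (e.g., the "token" key inside metadata) mirror the description above but are illustrative, not the exact on-disk schema.

```python
import torch

torch.manual_seed(0)

# Toy stand-in for one layer's embeddings_cache.pt (the real bank is [300836, 3584]).
cache = {
    "embeddings": torch.randn(6, 8, dtype=torch.float16),
    "token_to_indices": {"cat": [0, 1], "dog": [2, 3], "sky": [4, 5]},
    "metadata": [{"token": t} for t in ["cat", "cat", "dog", "dog", "sky", "sky"]],
}

def nearest_tokens(query: torch.Tensor, cache: dict, k: int = 3):
    """Return the k text tokens whose contextual embeddings are most
    cosine-similar to a continuous query vector (e.g., a visual token)."""
    bank = cache["embeddings"].float()
    bank = bank / bank.norm(dim=-1, keepdim=True)   # row-normalize the bank
    q = query / query.norm()                        # normalize the query
    sims = bank @ q                                 # cosine similarity per row
    scores, rows = sims.topk(k)
    return [(cache["metadata"][r]["token"], s.item())
            for r, s in zip(rows.tolist(), scores)]

# A query pointing exactly along row 0 should retrieve "cat" first.
query = cache["embeddings"][0].float()
print(nearest_tokens(query, cache))
```

Because each token has multiple contextual variants, several of the top-k rows can belong to the same token string; aggregating hits per token (e.g., max or mean similarity over its rows) is a natural next step.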

Layers

Layer  Stage       Size
1      Very early  ~2.1 GB
2      Early       ~2.1 GB
4      Early-mid   ~2.1 GB
8      Middle      ~2.1 GB
16     Mid-late    ~2.1 GB
24     Late        ~2.1 GB
26     Near-final  ~2.1 GB
27     Final       ~2.1 GB

Usage

from huggingface_hub import hf_hub_download
import torch

path = hf_hub_download(
    repo_id="McGill-NLP/latentlens-qwen2vl-embeddings",
    filename="layer_27/embeddings_cache.pt",
)
cache = torch.load(path, map_location="cpu", weights_only=False)  # cache holds dicts/lists alongside tensors
embeddings = cache["embeddings"].float()  # [300836, 3584]
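The index structures in the cache let you pull every contextual variant of a given token along with its provenance. A self-contained sketch with toy data mirroring the documented keys (the specific token IDs, captions, and field names here are illustrative assumptions, not the real cache contents):

```python
import torch

# Toy cache mirroring the documented structure of embeddings_cache.pt.
cache = {
    "embeddings": torch.randn(4, 8, dtype=torch.float16),
    "token_to_indices": {"tree": [0, 2], "road": [1, 3]},
    "metadata": [
        {"token": "tree", "token_id": 5440, "caption": "a tree by the road", "position": 1},
        {"token": "road", "token_id": 5917, "caption": "a tree by the road", "position": 4},
        {"token": "tree", "token_id": 5440, "caption": "tall tree in a park", "position": 1},
        {"token": "road", "token_id": 5917, "caption": "empty road at dawn", "position": 1},
    ],
}

# Gather all contextual variants of "tree" and the caption each came from.
rows = cache["token_to_indices"]["tree"]
variants = cache["embeddings"][rows].float()   # [2, 8] here; [n_variants, 3584] in the real bank
captions = [cache["metadata"][r]["caption"] for r in rows]
print(variants.shape, captions)
```

Averaging a token's variants gives a single context-free direction, while keeping them separate preserves the sense distinctions the contextual bank exists to capture.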

Or use the full quickstart script:

pip install latentlens
python examples/quickstart.py --image your_image.jpg

Citation

@article{krojer2026latentlens,
  title={LatentLens: Revealing Highly Interpretable Visual Tokens in LLMs},
  author={Krojer, Benno and Nayak, Shravan and Ma{\~n}as, Oscar and Adlakha, Vaibhav and Elliott, Desmond and Reddy, Siva and Mosbach, Marius},
  journal={arXiv preprint arXiv:2506.XXXXX},
  year={2026}
}

License

Apache License 2.0
