Qwen3-Embedding-8B T5 Inverter

A T5-base model trained to invert Qwen3-Embedding-8B embeddings back into text. Given a 4096-dimensional embedding vector, the model autoregressively decodes an approximation of the text that produced it.

Built on the vec2text framework.

Architecture

The model consists of three components:

  1. Embedding transform — A learned MLP that projects the 4096-dim Qwen3 embedding into a sequence of 16 pseudo-tokens in T5's hidden space (768-dim), which are fed as encoder input.
  2. T5-base encoder — Processes the projected embedding sequence.
  3. T5-base decoder — Autoregressively generates the reconstructed text.

The Qwen3-Embedding-8B embedder itself is not included in this checkpoint — only the T5 encoder-decoder and the embedding transform are saved. At inference time, you need to produce Qwen3-Embedding-8B embeddings separately and pass them as input.
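The shape flow of the embedding transform can be sketched as follows. This is an illustrative NumPy stand-in with random weights, not the learned checkpoint; the real vec2text transform is a small MLP rather than a single linear layer:

```python
import numpy as np

EMB_DIM, NUM_TOKENS, D_MODEL = 4096, 16, 768

rng = np.random.default_rng(0)
# Illustrative random weights; the checkpoint learns these during training.
W = rng.standard_normal((EMB_DIM, NUM_TOKENS * D_MODEL)) * 0.01

def to_pseudo_tokens(embedding: np.ndarray) -> np.ndarray:
    """Project a [batch, 4096] embedding to [batch, 16, 768] encoder inputs."""
    flat = embedding @ W                      # [batch, 16 * 768]
    return flat.reshape(-1, NUM_TOKENS, D_MODEL)

emb = rng.standard_normal((1, EMB_DIM))
print(to_pseudo_tokens(emb).shape)  # (1, 16, 768)
```

The resulting 16 pseudo-token vectors play the role of an input sequence for the T5 encoder.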

Training Details

Parameter Value
Base model t5-base (220M params)
Embedder Qwen/Qwen3-Embedding-8B (frozen)
Embedding dim 4096
Num repeat tokens 16
Max sequence length 128 tokens
Optimizer AdamW (fused)
Learning rate 1e-4
LR schedule Constant with warmup (2500 steps)
Batch size 128
Training steps 230,500
Epochs ~29.5
Precision FP32
Freeze strategy None (all params trainable)
Training data ~1M English sentences with precomputed Qwen3-Embedding-8B embeddings

Final training loss: ~1.61
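The epoch count in the table follows directly from the other rows: 230,500 steps at batch size 128 over roughly 1M examples:

```python
steps, batch_size, dataset_size = 230_500, 128, 1_000_000

examples_seen = steps * batch_size       # 29,504,000 examples processed
epochs = examples_seen / dataset_size    # passes over the training set
print(round(epochs, 1))  # 29.5
```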

Usage

This model uses a custom InversionModel architecture from vec2text. To load and run inference:

# 1. Produce embeddings with Qwen3-Embedding-8B
from transformers import AutoModel, AutoTokenizer
import torch

qwen_tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-Embedding-8B")
qwen_model = AutoModel.from_pretrained("Qwen/Qwen3-Embedding-8B", torch_dtype=torch.float16).cuda()

text = "The quick brown fox jumps over the lazy dog."
inputs = qwen_tokenizer(text, return_tensors="pt", padding=True, truncation=True).to("cuda")
with torch.no_grad():
    # Mean-pool token states into one vector. Whatever pooling was used when
    # the training embeddings were precomputed must be reproduced here
    # (Qwen3-Embedding's own documented usage pools the final token instead).
    embedding = qwen_model(**inputs).last_hidden_state.mean(dim=1)  # [1, 4096]

# 2. Load the inverter and decode
# Requires the vec2text library: https://github.com/jxmorris12/vec2text
from vec2text.models.inversion import InversionModel

inverter = InversionModel.from_pretrained("kennethge123/qwen3-8b-t5-inverter").cuda().eval()

# Decode by passing the precomputed ("frozen") embedding to generate
with torch.no_grad():
    gen_ids = inverter.generate(
        inputs={"frozen_embeddings": embedding.float()},
        generation_kwargs={"max_length": 128, "do_sample": False},
    )
print(inverter.tokenizer.batch_decode(gen_ids, skip_special_tokens=True))

Intended Use

  • Embedding interpretability: Understanding what information is captured in Qwen3 embeddings.
  • Research: Studying the invertibility of text embedding models.
  • Debugging: Inspecting what a retrieval system "sees" for a given document embedding.

Limitations

  • Trained on English text only.
  • Max output length is 128 tokens.
  • This is the first-stage inverter (hypothesis generator). For higher-quality reconstruction, it can be paired with a corrector model in an iterative refinement loop (see vec2text).
  • Reconstruction quality degrades on out-of-distribution text (e.g., code, heavily formatted text).
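The hypothesis-then-correct loop mentioned above works roughly as follows. This is a schematic sketch with toy numeric stand-ins for the real inverter and corrector models, not the actual vec2text API:

```python
import numpy as np

rng = np.random.default_rng(1)
target = rng.standard_normal(8)  # embedding of the unknown text (toy dimension)

def hypothesize(emb):
    # Stand-in for the first-stage inverter: produces a noisy initial guess.
    return emb + rng.standard_normal(emb.shape) * 0.5

def correct(hypothesis, target_emb):
    # Stand-in for the corrector: nudges the guess toward the target embedding.
    return hypothesis + 0.5 * (target_emb - hypothesis)

guess = hypothesize(target)
err0 = np.linalg.norm(guess - target)   # residual after the first stage
for _ in range(5):                      # iterative refinement
    guess = correct(guess, target)
err = np.linalg.norm(guess - target)

print(err < err0)  # True: each correction step shrinks the residual
```

In the real system the corrector conditions on the hypothesis text, its re-embedding, and the target embedding, but the shape of the loop is the same: re-embed, compare, refine.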

Citation

This model builds on the vec2text framework:

@article{morris2023text,
  title={Text Embeddings Reveal (Almost) As Much As Text},
  author={Morris, John X and Kuleshov, Volodymyr and Shmatikov, Vitaly and Rush, Alexander M},
  journal={arXiv preprint arXiv:2310.06816},
  year={2023}
}