Steerable Visual Representations
Same Content, Different Answers: Cross-Modal Inconsistency in MLLMs