arxiv:2506.02161
Jinrui Zhang
zjr2000
AI & ML interests
None yet
Recent Activity
upvoted a paper about 13 hours ago
Caption Anything: Interactive Image Description with Diverse Multimodal Controls
upvoted a paper about 13 hours ago
Transferable Decoding with Visual Entities for Zero-Shot Image Captioning
upvoted a paper about 13 hours ago
LongVALE: Vision-Audio-Language-Event Benchmark Towards Time-Aware Omni-Modal Perception of Long Videos
Organizations
None yet