video-SALMONN 2 is a powerful audio-visual large language model (LLM) that generates high-quality audio-visual video captions.
AI & ML interests
https://www.ee.tsinghua.edu.cn/en/
Organization Card
Department of Electronic Engineering, Tsinghua University
Models (16)
tsinghua-ee/video_SALMONN2plus_3B_audioAlign • 5B • 7
tsinghua-ee/D-ORCA-8B-0210 • 10B • 21 • 1
tsinghua-ee/WAVE-7B • 22 • 1
tsinghua-ee/video_SALMONN2_7B_audioAlign • 21
tsinghua-ee/video_SALMONN2plus_72B_audioAlign • 3
tsinghua-ee/video_SALMONN2plus_7B_audioAlign • 9B • 545
tsinghua-ee/SALMONN • Automatic Speech Recognition • 50
tsinghua-ee/video-SALMONN-2_plus_72B • 6 • 2
tsinghua-ee/video-SALMONN-2_plus_3B • 1.56k • 3
tsinghua-ee/video-SALMONN-2_plus_7B • 961 • 6
Datasets (8)
tsinghua-ee/ELViM • Viewer • 211 • 14
tsinghua-ee/SACRED-Bench • Viewer • 2.48k • 46
tsinghua-ee/F-16-NBA • Preview • 38
tsinghua-ee/AVUTBenchmark • Viewer • 3.28k • 4.83k • 1
tsinghua-ee/video-SALMONN_2_testset • Preview • 89
tsinghua-ee/QualiSpeech • Viewer • 14.6k • 562 • 21
tsinghua-ee/RivaBench • Viewer • 542 • 394 • 2
tsinghua-ee/SAVEBench • Preview • 68 • 3