Hugging Face
nguyenvulebinh/AVSRCocktail
Tags: Automatic Speech Recognition, Transformers, Safetensors, PyTorch, nguyenvulebinh/AVYT, English, avhubert_avsr, audio-visual-speech-recognition, multimodal, speech-recognition, lip-reading, cocktail-party, noise-robust, av-hubert, transformer, audio, video, english, lrs2, voxceleb2, ctc, attention, beam-search, multi-speaker, noisy-speech, arxiv:2506.02178
Files and versions
main · AVSRCocktail · 1.72 GB
1 contributor
History: 3 commits
Latest commit: "Update README.md" by nguyenvulebinh (ae29b16, verified), 10 months ago
File               Scan  Size     Storage  Last commit          Updated
.gitattributes     Safe  1.52 kB  -        initial commit       10 months ago
README.md          Safe  9.91 kB  -        Update README.md     10 months ago
config.json        Safe  4.44 kB  -        Upload AVHubertAVSR  10 months ago
model.safetensors  Safe  1.72 GB  xet      Upload AVHubertAVSR  10 months ago