Music Flamingo
🎵
Answer music questions from uploaded audio or YouTube tracks
The secrets to building world-class LLMs
Multimodal OCR model for complex document understanding
Generate natural-sounding speech in European languages with voice cloning
Create 3D models from images using depth estimation
Decompose a 3D model into its individual parts
Generate realistic dialogue from a script, using Dia!