Very unstable inference
#22
by andypotato - opened
The model tends to spiral out of control and produce only garbage output. This often happens in sections where a speaker repeats a single word several times, like "yes yes yes". From that point on, inference just keeps repeating this single word.
Another issue is that it does not stop once the file has been fully transcribed; instead it keeps emitting nonsense characters and end tokens.
This behavior can be reproduced in the Gradio demo and also in the single-file demos from the GitHub repo.
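
To make the first failure mode concrete, here is a rough sketch of how the runaway repetition could be detected and cut off in post-processing. It assumes the transcript is available as a plain string; the function name `truncate_repetition_loop` and the `max_repeats` threshold are hypothetical and not part of the repo. It only hides the symptom and does not fix the decoding itself.

```python
def _norm(word: str) -> str:
    # Compare words case-insensitively and without trailing punctuation.
    return word.lower().strip(".,!?;:")


def truncate_repetition_loop(text: str, max_repeats: int = 5) -> str:
    """Cut the transcript where one word repeats more than max_repeats times in a row.

    Hypothetical helper: short natural repeats such as "yes yes yes" are kept,
    while a long run is treated as a decoding loop and everything after the
    first max_repeats copies is dropped.
    """
    words = text.split()
    run_start = 0  # index of the first word in the current run
    for i in range(1, len(words) + 1):
        # Still inside the same run of identical words: keep scanning.
        if i < len(words) and _norm(words[i]) == _norm(words[run_start]):
            continue
        # The run words[run_start:i] just ended; check its length.
        if i - run_start > max_repeats:
            return " ".join(words[: run_start + max_repeats])
        run_start = i
    return text


if __name__ == "__main__":
    looping = "so I said " + "yes " * 40
    print(truncate_repetition_loop(looping))
```

A run that stays at or below the threshold is left untouched, so a speaker who really says "yes yes yes" is not affected.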