Hello, I've worked with CMU PocketSphinx for a while and now I'm transitioning to Vosk.
I'm really happy with the models and the recognition speed, thank you and kudos!
(In particular, vosk-model-en-us-daanzu-20200905-lgraph gives >90% accuracy for my voice.)
However, I would like to offer my users the option to adapt the model to their environment/voice with limited audio data.
With PocketSphinx models you could create an MLLR matrix, and that improved accuracy by 5-10% for the speaker.
From what I understand, Vosk models are based on (or similar to) Kaldi models,
so I peeked at the Kaldi transform documentation,
and I'm wondering: is there a way to create and apply MLLR matrices to Vosk models?
Or are MLLR matrices considered outdated now, and is fine-tuning (with ~1 hour of data) our only choice?
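For context, the PocketSphinx adaptation workflow I'm referring to looked roughly like this (from memory, following the CMU Sphinx adaptation tutorial; model paths, the `adapt.*` file names, and the exact `bw` flags are placeholders from my setup and may differ for other acoustic models):

```shell
# Extract MFCC features for the adaptation recordings
# (feat.params must match the acoustic model's feature settings)
sphinx_fe -argfile en-us/feat.params -samprate 16000 \
    -c adapt.fileids -di . -do . -ei wav -eo mfc -mswav yes

# Accumulate observation statistics against the existing model
bw -hmmdir en-us -moddeffn en-us/mdef.txt \
    -feat 1s_c_d_dd -cmn current -agc none \
    -dictfn cmudict-en-us.dict -ctlfn adapt.fileids \
    -lsnfn adapt.transcription -accumdir .

# Solve for a single MLLR transform from the accumulated stats
mllr_solve -meanfn en-us/means -varfn en-us/variances \
    -outmllrfn mllr_matrix -accumdir .

# Apply the transform at decode time
pocketsphinx_continuous -hmm en-us -mllr mllr_matrix -infile test.wav
```

A handful of short recordings per speaker was enough to get the 5-10% improvement I mentioned, which is why I'm hoping something comparable exists on the Vosk/Kaldi side.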