Speaker Adaptation with Limited Data #212

ThThoma · 2020-09-11T09:09:04Z

Hello, I've worked with CMU pocketsphinx for a while and now I'm transitioning to Vosk.
I'm really happy with the models and the recognition speed, thank you and kudos!
(vosk-model-en-us-daanzu-20200905-lgraph in particular reaches >90% accuracy for my voice)

However, I would like to give my users the option to adapt the model to their environment/voice with limited audio data.
With pocketsphinx models you could create an MLLR matrix, and that improved accuracy by 5-10% for the speaker.

From what I understood, Vosk models are based on (or similar to) Kaldi models.
So I peeked at the Kaldi transform documentation,
and I am wondering if there is a way to create and apply MLLR matrices to Vosk models?

Or are MLLR matrices considered outdated now, and is fine-tuning (with ~1 hour of data) our only choice?
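
For readers unfamiliar with the term: MLLR mean adaptation estimates an affine transform (a matrix A and a bias b) from the speaker's adaptation data and applies it to the Gaussian means of the acoustic model, mu' = A * mu + b. The following is a minimal sketch of the application step only, with made-up numbers; `apply_mllr` is a hypothetical helper for illustration and is not part of pocketsphinx, Kaldi, or Vosk. Estimating A and b from audio is not shown.

```python
# Illustrative only: apply an MLLR-style affine transform to Gaussian means.
# apply_mllr is a hypothetical helper, not a pocketsphinx/Kaldi/Vosk function.

def apply_mllr(mean, A, b):
    """Return the adapted mean A * mean + b for a single Gaussian."""
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mean)) + b_i
        for row, b_i in zip(A, b)
    ]

# Toy 2-dimensional transform and means (made-up values).
A = [[1.0, 0.5],
     [0.0, 2.0]]
b = [0.5, -1.0]
means = [[1.0, 2.0],
         [0.0, 4.0]]

# The same transform is shared by all Gaussians it is tied to.
adapted = [apply_mllr(mu, A, b) for mu in means]
print(adapted)
```

Because one transform is shared across many Gaussians, it has few parameters to estimate, which is why MLLR could work with only a small amount of speaker data.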

nshmyrev · 2020-09-11T09:15:43Z

Hi! Thanks for your feedback!

MLLR is not compatible anymore, so you cannot use it.

Our models use i-vectors internally, which more or less supersede MLLR and work automatically, so you should not worry about that.

Fine-tuning is the way to go, but unfortunately it is not very straightforward. Daanzu does it sometimes with helpful results, though: daanzu/kaldi-active-grammar#33

See also here: https://www.quora.com/Does-adaptation-help-with-speech-recognition-accuracy

ThThoma · 2020-09-11T10:45:55Z

Thank you for the quick and clear response!
I will eventually look into fine-tuning.

Your post on adaptation with modern speech recognition toolkits is enough :)

ThThoma closed this as completed Sep 11, 2020
