Other languages #21
Comments
Please let me know if there is a specific language you are looking for, and I can see about using it for testing. |
I want to convert any of the following Chinese Mandarin models to be compatible with KAG. Thanks for any help or documentation. I have no experience with Kaldi. Currently the only environment I can run is from And I know something about |
@SwimmingTiger It looks like the M11 model should be the easiest, because it doesn't use pitch features, which I haven't implemented in this project. I will try to find time to convert it to work with this, which should also give me a good opportunity to document the process clearly. However, the process is mostly a matter of getting the various model files into the right place, as shown by their layout in the currently published models. The difficulty depends on exactly which files have been published for the model you are trying to import. |
So I only need to move some files to the right locations, without converting their contents? |
@SwimmingTiger Correct, I think that may work, depending on exactly what files are included in your model package and what their structure is. However, it has been a while since I converted a fresh external model, and I haven't tried another language yet, so I am not sure. Please give it a try, and let me know how it goes. |
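Since the suggestion here is to get the new model's files laid out like those of a published model, a first step could be to list which files the published model has that the new one lacks. The following is only a minimal sketch: `missing_files` is a hypothetical helper, not part of kaldi_active_grammar, and it compares file names only, saying nothing about whether the contents are compatible.

```python
from pathlib import Path


def missing_files(reference_dir, candidate_dir):
    """Return files present under reference_dir but absent under candidate_dir.

    Paths are relative to each directory root, use forward slashes, and are
    returned in sorted order. Only names are compared, never contents.
    """
    reference = Path(reference_dir)
    candidate = Path(candidate_dir)
    ref_files = {p.relative_to(reference).as_posix()
                 for p in reference.rglob("*") if p.is_file()}
    cand_files = {p.relative_to(candidate).as_posix()
                  for p in candidate.rglob("*") if p.is_file()}
    return sorted(ref_files - cand_files)
```

Calling it with the published model directory as `reference_dir` and the newly downloaded model directory as `candidate_dir` would give a checklist of files to locate or generate; a file that shows up in the list may still need content specific to the new model rather than a plain copy.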
Today I finally tried to port http://kaldi-asr.org/models/m11 to KAG, and I immediately ran into problems I could not solve. My actions, recorded as Linux shell commands:
# rename the default model
mv kaldi_model kaldi_model.default
# rename http://kaldi-asr.org/models/m11
mv multi_cn_chain_sp_online kaldi_model
# KAG needs this file
cp kaldi_model.default/KAG_VERSION kaldi_model/
Then I got two None errors here when I ran the demo: kaldi-active-grammar/kaldi_active_grammar/model.py Lines 326 to 327 in 869cd46
Try to bypass it:
# copy from the default model
cp kaldi_model.default/align_lexicon.int kaldi_model.default/align_lexicon.base.int kaldi_model.default/lexiconp_disambig.txt kaldi_model.default/lexiconp_disambig.base.txt kaldi_model/
Then I got this with the demo:
Try to bypass it:
cp kaldi_model.default/phones.nonterm.txt kaldi_model
cp kaldi_model/phones.txt kaldi_model/phones.base.txt
cat kaldi_model/phones.nonterm.txt >> kaldi_model/phones.txt
Got
Bypass:
cp kaldi_model/words.txt kaldi_model/words.base.txt
Got
Bypass:
cp kaldi_model.default/words.nonterm.txt kaldi_model
cat kaldi_model/words.nonterm.txt >> kaldi_model/words.base.txt
Got:
Bypass:
cp kaldi_model.default/nonterminals.txt kaldi_model.default/left_context_phones.txt kaldi_model/
Got
Unable to resolve. It seems that some files copied from the default model do not match the model from http://kaldi-asr.org/models/m11, and I don't know how to generate those files:
And I don't know what I need some help. @daanzu |
I tried to compile https://github.com/kaldi-asr/kaldi, but I couldn't find a suitable tool to generate these files from it.
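One possible source of the mismatch is the `cat ... >> phones.txt` step: Kaldi symbol tables such as phones.txt and words.txt hold one `symbol integer-id` pair per line, so lines copied from another model keep that model's ids, which can collide with ids already used in the new model. The sketch below assumes that giving the extra symbols fresh consecutive ids after the highest existing id is what is wanted; this thread does not confirm KAG's exact requirements, and `read_symbols` and `append_symbols` are hypothetical helpers, not Kaldi or KAG functions.

```python
def read_symbols(path):
    """Parse a symbol table file into a list of (symbol, id) pairs."""
    symbols = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.split()
            if len(parts) == 2:  # skip blank or malformed lines
                symbols.append((parts[0], int(parts[1])))
    return symbols


def append_symbols(base, extra_names):
    """Return base plus extra_names, numbered after the highest id in base.

    Names already present in base are skipped so no symbol appears twice.
    """
    existing = {name for name, _ in base}
    next_id = max((sym_id for _, sym_id in base), default=-1) + 1
    merged = list(base)
    for name in extra_names:
        if name in existing:
            continue
        merged.append((name, next_id))
        existing.add(name)
        next_id += 1
    return merged
```

For example, the names in the first column of the default model's phones.nonterm.txt could be passed as `extra_names` with the new model's phones.txt as `base`. Files that refer to phones or words by integer id (the `.int` lexicon files, for instance) would then have to be regenerated against the merged table rather than copied.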
|
@SwimmingTiger You will definitely need to use the phone set from the new model rather than the one from the English model. You also need to use the lexicon from the new model. However, both will need to be modified slightly to match what was added to the English model. You should compare how my English model differs from a standard English model. |
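The comparison suggested here can be partly automated. As a rough sketch, assuming the files being compared are whitespace-separated text whose first field is the symbol (true of phones.txt and words.txt, and of the first column of a lexicon), the helper below lists the symbols that appear in the modified file but not in the standard one. `added_symbols` is a hypothetical name, and the symbols in the usage example are illustrative only.

```python
def symbol_names(lines):
    """Return the first whitespace-separated field of each non-blank line."""
    names = []
    for line in lines:
        parts = line.split()
        if parts:
            names.append(parts[0])
    return names


def added_symbols(standard_lines, modified_lines):
    """List symbols found in modified_lines but not in standard_lines.

    The order of first appearance in modified_lines is preserved.
    """
    standard = set(symbol_names(standard_lines))
    seen = set()
    added = []
    for name in symbol_names(modified_lines):
        if name not in standard and name not in seen:
            added.append(name)
            seen.add(name)
    return added


# Illustrative input only; real tables would be read from the model files.
standard = ["<eps> 0", "a 1", "#0 2"]
modified = ["<eps> 0", "a 1", "#0 2", "#nonterm_bos 3", "#nonterm_begin 4"]
print(added_symbols(standard, modified))
```

Running the same comparison between a standard English model and the published KAG English model would show which entries were added, and those are the candidates to add to the new language's phone set and lexicon.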
In the future, it should be able to support a lot more, but the work is in finding decent models for other languages, and then some minor modifications to enable their use in KaldiAG.
dictation-toolbox/dragonfly#241