This code is used only by myself and is not well organized yet. :-p
- prepare transcriptions: `utils/xxx_trans.py`
- prepare `allwords.txt` in dataset dir
- generate files used in lang: `utils/gen_lang_files.py`
- set path in `config.py`
- run `pipeline.py` to generate data
- run `kaldi/run.sh`
  - !Remember to set `hmm_root` and `dataset`
- run `utils/visaligns.sh`
  - show alignments
  - crop chars
- generate sample list for caffe with `utils/gen_list.sh`
- run `caffe/makeds.sh`
- train digit model: `cd caffe`, then `./train.sh`
- run `rnnlib/create_nc.py` in the folder with `config.py`
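
The scripted steps above could be chained with a small driver like the one below. This is only a rough sketch, not part of the repo: `STEPS` and `run_steps` are hypothetical names, and it assumes that every script runs from the repo root without arguments, that `.py` files are started with `python` and `.sh` files with `bash`, and that `config.py` lives in the repo root. The manual steps (transcriptions, `allwords.txt`, editing `config.py`, setting `hmm_root` and `dataset`) are left out on purpose.

```python
import subprocess
from pathlib import Path

# (label, command, working dir relative to the repo root), in README order.
# ASSUMPTION: interpreters, missing arguments and working dirs are guesses;
# the README only gives the script paths.
STEPS = [
    ("generate files used in lang", ["python", "utils/gen_lang_files.py"], "."),
    ("generate data", ["python", "pipeline.py"], "."),
    ("run kaldi/run.sh", ["bash", "kaldi/run.sh"], "."),
    ("run utils/visaligns.sh", ["bash", "utils/visaligns.sh"], "."),
    ("generate sample list for caffe", ["bash", "utils/gen_list.sh"], "."),
    ("run caffe/makeds.sh", ["bash", "caffe/makeds.sh"], "."),
    ("train digit model", ["./train.sh"], "caffe"),
    # ASSUMPTION: "the folder with config.py" is the repo root.
    ("run rnnlib/create_nc.py", ["python", "rnnlib/create_nc.py"], "."),
]


def run_steps(root, steps=STEPS, dry_run=True):
    """Walk the steps in order and return (label, command, workdir) tuples.

    With dry_run=True nothing is executed; the plan is only returned.
    """
    planned = []
    for label, cmd, cwd in steps:
        workdir = Path(root) / cwd
        planned.append((label, " ".join(cmd), str(workdir)))
        if not dry_run:
            # check=True raises CalledProcessError and stops at the first failing step.
            subprocess.run(cmd, cwd=workdir, check=True)
    return planned


if __name__ == "__main__":
    # Print the plan only; call run_steps(".", dry_run=False) to really run it.
    for label, cmd, workdir in run_steps("."):
        print(f"[{label}] cd {workdir} && {cmd}")
```

The dry run is the default so the order and working directories can be checked first, before anything long-running such as `kaldi/run.sh` or `./train.sh` gets started.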