Sample code and data to follow up on the TIAL group meeting on 2/25. WIP.
- Official docs
- Slides from Group Meeting
- Eleanor Chodroff's tutorial
- JHU Summer Workshop
- Random blogs I found helpful:

Setup and feature extraction:
- Install Kaldi
- Create a directory for data and experiments
- Copy `cmd.sh` and `path.sh` to this directory; these should be the same as in any `kaldi/egs` recipe
- Example audio files are in `sample_data`; the audio has to be in .wav format and mono-channel. For an example of how to do this, see `convert_trim.sh`
- Make sure the configurations are correct in `/conf`. For example:
  - I changed `fbank.conf` to also extract total energy by setting `--use-energy=true`
  - I used `mfcc_hires.conf` instead of the default `mfcc.conf`; according to the ASpIRE-related models, this gave better results in speech recognition
- `comp_mfcc.sh`, `comp_fbank_energy.sh`, and `comp_pitch.sh` are sample scripts to extract MFCC, fbank, and pitch features. Note that the channel-splitting part is repeated in these scripts; comment it out if you've already done it.
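
The configuration and audio-format points above can be sketched roughly as follows. This is an illustration rather than the contents of the scripts in this repo: the 8 kHz sample rate, the `CONF_DIR` variable, and the helper name `to_mono_wav` are assumptions, and `convert_trim.sh` may do the conversion differently. The helper needs `sox` to be installed.

```shell
# Sketch: write an fbank config with energy enabled, plus a mono-conversion helper.
# Assumptions (not from this repo): 8 kHz audio, config directory ./conf unless
# CONF_DIR is set, and the hypothetical helper name to_mono_wav.
conf_dir="${CONF_DIR:-conf}"
mkdir -p "$conf_dir"

# Kaldi config files hold one --option=value per line.
cat > "$conf_dir/fbank.conf" <<'EOF'
--use-energy=true
--sample-frequency=8000
EOF

# Keep only the first channel and resample to 16-bit mono.
# Usage: to_mono_wav input.wav output.wav [sample_rate]
to_mono_wav() {
  local in_file="$1" out_file="$2" rate="${3:-8000}"
  sox "$in_file" -r "$rate" -b 16 "$out_file" remix 1
}
```

Calling `to_mono_wav in.wav out.wav` keeps the first channel only; use `remix 2` instead to keep the second channel when splitting a two-channel conversation.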
For my purposes (conversational speech, Switchboard), I'm using the chain model trained on Fisher data. The paper that this model is based on is this one, and this blog post has some nice and detailed derivations.
Steps:
- Download a model from Kaldi models. For this example, I'm using the ASpIRE chain model (the version with the precompiled HCLG)
- Untar it: `tar xvf 0001_aspire_chain_model_with_hclg.tar.bz2`
- To this recipe, copy `cmd.sh` and `path.sh` if you haven't done so
- Link common modules we'll be using (or copy all of these here if you'll be editing the scripts in these directories):

      export KALDI_ROOT=/homes/ttmt001/kaldi
      ln -s $KALDI_ROOT/egs/aspire/s5/steps .
      ln -s $KALDI_ROOT/egs/aspire/s5/utils .
      ln -s $KALDI_ROOT/egs/aspire/s5/conf .
      ln -s $KALDI_ROOT/egs/aspire/s5/local .

- Most instructions and comments are in `decode_audio.sh`
  - Steps 1 and 2 can be done first and then reused (model preparation, graph compilation)
  - Steps 3 and 4 can be run after creating `wav.scp` and `utt2spk` files in the `my_data` folder. An example of creating those is in `my_data`
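
For reference, here is a minimal sketch of how `wav.scp` and `utt2spk` could be generated from a folder of .wav files. It is not the example shipped in `my_data`; the function name `make_data_dir` is hypothetical, and it assumes one utterance per file, the file basename as the utterance ID, each utterance treated as its own speaker, and no spaces in file names.

```shell
# Sketch: build Kaldi-style wav.scp and utt2spk from a directory of .wav files.
# Assumptions (not from this repo): one utterance per file, utterance ID equals
# the file basename, and each utterance is its own speaker.
# Usage: make_data_dir <wav_dir> <data_dir>
make_data_dir() {
  local wav_dir="$1" data_dir="$2"
  local wav utt abs
  mkdir -p "$data_dir"
  : > "$data_dir/wav.scp"
  : > "$data_dir/utt2spk"
  for wav in "$wav_dir"/*.wav; do
    [ -e "$wav" ] || continue   # the glob matched nothing
    utt="$(basename "$wav" .wav)"
    # Resolve an absolute path so the scp works from any directory.
    abs="$(cd "$(dirname "$wav")" && pwd)/$(basename "$wav")"
    echo "$utt $abs" >> "$data_dir/wav.scp"   # <utterance-id> <path-to-wav>
    echo "$utt $utt" >> "$data_dir/utt2spk"   # <utterance-id> <speaker-id>
  done
  # Kaldi expects these files sorted in C-locale order.
  LC_ALL=C sort -o "$data_dir/wav.scp" "$data_dir/wav.scp"
  LC_ALL=C sort -o "$data_dir/utt2spk" "$data_dir/utt2spk"
}
```

For example, `make_data_dir sample_data my_data` would write both files into `my_data`. If real speaker labels are available, replacing the second column of `utt2spk` with them is usually preferable, since speaker-level normalization relies on that mapping.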