kaldi_examples

Sample code and data to follow up on the TIAL group meeting on 2/25. Work in progress.

References

Common Setup/Quick Start

Install Kaldi
Create a directory for data and experiments
Copy cmd.sh and path.sh to this directory; these should be the same as in any kaldi/egs recipe
Example audio files are in sample_data; the audio has to be in .wav format and mono-channel. For an example of how to do this, see convert_trim.sh
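As a rough illustration of the mono .wav requirement, here is a minimal sketch that uses sox to keep one channel of a recording. It is not taken from convert_trim.sh; the helper name to_mono_wav, the file names, and the 8 kHz default rate are assumptions (8 kHz is typical for telephone speech, so check it against the sample rate your model expects).

```shell
# Hypothetical helper (not part of this repo): keep one channel of an
# audio file and write it as a 16-bit WAV at the given sample rate.
to_mono_wav() {
  local in_file="$1" out_file="$2" channel="${3:-1}" rate="${4:-8000}"
  # -r and -b apply to the output file; "remix N" keeps only channel N.
  sox "$in_file" -r "$rate" -b 16 "$out_file" remix "$channel"
}

# Example: one output file per side of a two-channel recording.
# to_mono_wav call.wav sample_data/call_A.wav 1
# to_mono_wav call.wav sample_data/call_B.wav 2
```

Writing each channel to its own file keeps the two sides of a conversation separate, which is usually what you want for conversational data.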

Feature Extraction

Make sure the configurations in the conf directory are correct. For example:
- I changed fbank.conf to also extract total energy by setting --use-energy=true
- I used mfcc_hires.conf instead of the default mfcc.conf; according to the ASpIRE-related models, this gave better results in speech recognition
comp_mfcc.sh, comp_fbank_energy.sh, and comp_pitch.sh are sample scripts to extract mfcc, fbank, and pitch features. Note that the channel-splitting part is repeated in each script; comment it out if you've already split the channels.
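The fbank.conf change described above could look roughly like the sketch below. Only --use-energy=true comes from this README; the helper name write_fbank_conf, the target directory, and the 8 kHz sample frequency are assumptions, so compare against the files in your own conf directory before using it.

```shell
# Hypothetical helper: write a small fbank.conf into the given directory.
# Each line of a Kaldi config file is one command-line option.
write_fbank_conf() {
  local conf_dir="$1"
  mkdir -p "$conf_dir"
  # --use-energy=true adds the energy dimension (from this README);
  # --sample-frequency=8000 is an assumption for telephone speech.
  cat > "$conf_dir/fbank.conf" <<'EOF'
--use-energy=true
--sample-frequency=8000
EOF
}

# Example (writes my_conf/fbank.conf, leaving the linked conf untouched):
# write_fbank_conf my_conf
```

Writing to a separate directory avoids overwriting the recipe's own configuration if conf is a symlink into the Kaldi tree.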

Decoding Using a Pretrained Model

For my purposes (conversational speech, Switchboard), I'm using the chain model trained on Fisher data. The paper that this model is based on is this one, and this blog post has some nice and detailed derivations.

Steps:

Download a model from kaldi models. For this example, I'm using the ASpIRE chain model, the version with the precompiled HCLG
Untar it: tar xvf 0001_aspire_chain_model_with_hclg.tar.bz2
Copy cmd.sh and path.sh into this recipe directory if you haven't already done so

Link the common modules we'll be using (or copy them here instead if you'll be editing the scripts in these directories):

export KALDI_ROOT=/homes/ttmt001/kaldi
ln -s $KALDI_ROOT/egs/aspire/s5/steps .
ln -s $KALDI_ROOT/egs/aspire/s5/utils .
ln -s $KALDI_ROOT/egs/aspire/s5/conf .
ln -s $KALDI_ROOT/egs/aspire/s5/local .

Most instructions and comments are in decode_audio.sh
- Steps 1 and 2 (model preparation, graph compilation) can be done once and then reused
- Steps 3 and 4 can be run after creating the wav.scp and utt2spk files in the my_data folder. An example of creating those is in my_data
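For reference, creating wav.scp and utt2spk can be sketched as follows. This is not the script from my_data; the helper name make_data_dir is hypothetical, and it assumes one utterance per .wav file with no speaker information, so each utterance ID doubles as its speaker ID.

```shell
# Hypothetical helper: build wav.scp and utt2spk for a folder of .wav files.
#   wav.scp : <utterance-id> <absolute path to wav>
#   utt2spk : <utterance-id> <speaker-id>
make_data_dir() {
  local audio_dir="$1" data_dir="$2" abs_dir wav utt_id
  abs_dir="$(cd "$audio_dir" && pwd)"
  mkdir -p "$data_dir"
  : > "$data_dir/wav.scp"
  : > "$data_dir/utt2spk"
  for wav in "$abs_dir"/*.wav; do
    [ -e "$wav" ] || continue          # no .wav files found
    utt_id="$(basename "$wav" .wav)"   # file name without extension
    echo "$utt_id $wav" >> "$data_dir/wav.scp"
    # Assumption: speaker unknown, so reuse the utterance ID.
    echo "$utt_id $utt_id" >> "$data_dir/utt2spk"
  done
  # Kaldi expects these files to be sorted in the C locale.
  LC_ALL=C sort -o "$data_dir/wav.scp" "$data_dir/wav.scp"
  LC_ALL=C sort -o "$data_dir/utt2spk" "$data_dir/utt2spk"
}

# Example:
# make_data_dir sample_data my_data
```

If you do know the speakers, replace the second column of utt2spk with real speaker IDs and keep the file sorted.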

