Sequence_Labeling

Bidirectional LSTM and CNN layers are used to train a model that predicts the phone sequence corresponding to a given acoustic signal.

Project Link: https://www.csie.ntu.edu.tw/~yvchen/f106-adl/A1

Dataset

This task uses the TIMIT dataset, which provides MFCC features (39 dimensions) and FBank features (69 dimensions) for each frame. The labels cover 48 distinct phones.

The TIMIT dataset can be downloaded from Kaggle: https://www.kaggle.com/c/hw1-timit/data (created by the MLDS2017 class, NTU).
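Since the features are given per frame while the model predicts a phone sequence per utterance, frames have to be grouped back into utterances before training. The following is a minimal sketch of that grouping, not the code in data_utils.py. It assumes each feature line has the form `<speaker>_<sentence>_<frame number>` followed by the 39 MFCC values; that line format and the name `group_frames` are assumptions made for illustration.

```python
from collections import defaultdict

FEATURE_DIM = 39  # MFCC dimensionality stated in the Dataset section


def group_frames(lines, feature_dim=FEATURE_DIM):
    """Group per-frame feature lines into per-utterance sequences.

    Assumed (hypothetical) line format:
        <speaker>_<sentence>_<frame number> v1 v2 ... vN
    Returns a dict mapping utterance id -> list of feature vectors,
    ordered by frame number.
    """
    utterances = defaultdict(list)
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        frame_id = parts[0]
        values = [float(v) for v in parts[1:]]
        if len(values) != feature_dim:
            raise ValueError(
                f"{frame_id}: expected {feature_dim} values, got {len(values)}"
            )
        # Everything before the last underscore identifies the utterance.
        utt_id, frame_no = frame_id.rsplit("_", 1)
        utterances[utt_id].append((int(frame_no), values))

    # Sort each utterance's frames by frame number and drop the index.
    return {
        utt_id: [vec for _, vec in sorted(frames, key=lambda f: f[0])]
        for utt_id, frames in utterances.items()
    }
```

Passing `feature_dim=69` would apply the same grouping to the FBank features, under the same assumption about the line format.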

Quick start

Run the shell script

./hw1_best.sh [input directory] [output filename]

or use either of the other scripts in its place: hw1_cnn.sh, hw1_rnn.sh

[input directory] should be the TIMIT dataset directory downloaded from the link above.

[output filename] is a .csv file that will contain the prediction results.
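To illustrate how frame-level predictions might become the rows of an output .csv file, here is a hedged sketch. It assumes that consecutive identical frame labels are merged into a single phone and that each row holds an utterance id and its phone sequence. The column names `id` and `phone_sequence`, the merging rule, and the function names are hypothetical; the actual format written by the scripts is not described here.

```python
import csv
from itertools import groupby


def collapse_frames(frame_labels):
    """Merge runs of identical consecutive frame labels into one phone."""
    return [label for label, _ in groupby(frame_labels)]


def format_rows(predictions):
    """Turn {utterance id: [frame labels]} into CSV rows (header first)."""
    rows = [["id", "phone_sequence"]]  # hypothetical column names
    for utt_id, frame_labels in predictions.items():
        rows.append([utt_id, " ".join(collapse_frames(frame_labels))])
    return rows


def write_predictions(path, predictions):
    """Write the formatted rows to a .csv file at the given path."""
    with open(path, "w", newline="") as f:
        csv.writer(f).writerows(format_rows(predictions))
```

For example, `collapse_frames(["sil", "sil", "aa", "aa", "b"])` yields `["sil", "aa", "b"]`.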

