SeanvonB/speech-recognizer

Speech Recognizer

LIVE DEVELOPMENT NOTEBOOK

This was my final project for Udacity's Natural Language Processing Nanodegree, which I completed in late 2020. It's really a summative project for all of my School of AI Nanodegrees, AI Programming and Computer Vision included, because it relies heavily on knowledge from all three courses.

In data science terms, I should actually call this an Automatic Speech Recognition (ASR) pipeline. An ASR pipeline receives spoken audio as input and returns a text transcript of that speech as output, so you'll often find one at the heart of speech recognition or dictation software.

Features

  • Convert recorded speech into written text
  • Transform raw audio data into spectrogram or MFCC features
  • Test the performance of simple, deep, and bidirectional RNNs on this task
  • Test additional architecture options, like CNNs for feature extraction
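
As a rough illustration of the feature extraction step listed above, here is a minimal sketch of turning raw audio into a log-magnitude spectrogram using only NumPy. It is not the notebook's actual code: the function name `spectrogram`, the 20 ms window, the 10 ms step, and the synthetic 440 Hz tone standing in for recorded speech are all assumptions chosen for the example.

```python
import numpy as np

def spectrogram(signal, sample_rate, window_ms=20.0, step_ms=10.0, eps=1e-10):
    """Return a log-magnitude spectrogram with shape (frames, freq_bins).

    Hypothetical helper for illustration; window and step sizes are assumed.
    """
    window = int(sample_rate * window_ms / 1000)  # samples per frame
    step = int(sample_rate * step_ms / 1000)      # hop between frames
    if len(signal) < window:
        raise ValueError("signal is shorter than one analysis window")
    n_frames = 1 + (len(signal) - window) // step
    # Slice the signal into overlapping frames.
    frames = np.stack(
        [signal[i * step : i * step + window] for i in range(n_frames)]
    )
    # Taper each frame with a Hann window to reduce spectral leakage.
    frames = frames * np.hanning(window)
    # The one-sided FFT yields window // 2 + 1 frequency bins per frame.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    # Log compression keeps quiet and loud components on a comparable scale.
    return np.log(magnitude + eps)

# One second of a 440 Hz tone at 16 kHz, standing in for recorded speech.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440 * t)

features = spectrogram(tone, sample_rate)
print(features.shape)  # (99, 161): 99 time steps, 161 frequency bins
```

Each row of `features` is one time step, which is the kind of sequence an RNN would consume frame by frame. MFCC features would add a mel filterbank and a discrete cosine transform on top of a spectrum like this one.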

Credits

License

Copyright © 2020-2022 Sean von Bayern
Licensed under the MIT License

About

Transcribe spoken audio to text using machine learning
