Audio super-resolution with an ECA-enhanced encoder-decoder CNN. Upsamples low-sample-rate speech (2, 4, or 8 kHz → 16 kHz) to improve ASR performance. Trained on LibriSpeech with a Wav2Vec2-based perceptual loss.
deep-learning tensorflow cnn speech-to-text speech-processing asr librispeech encoder-decoder-architecture wav2vec2 speech-super-resolution eca-attention
Updated Sep 29, 2025 - Jupyter Notebook