bekarys2003/SpeechConvert

🎙️ SpeechConvert

Real-time Speech-to-Text with Translation and Emotion Detection


🧩 Overview

SpeechConvert is a full-stack web application that:

  • Converts speech to text in real time using Whisper (via Faster-Whisper).
  • Optionally translates the result into another language.
  • Optionally detects emotion in the text.
  • Allows users to record audio or upload files.
  • Includes full user authentication (email + Google OAuth) and password reset.

🚀 Features

| Feature | Description |
| --- | --- |
| 🎤 Speech Recognition | Converts recorded or uploaded audio into text (via Faster-Whisper). |
| 🌐 Translation | Uses Google Cloud Translate to convert text into multiple languages. |
| 😊 Emotion Detection | Analyzes text with a HuggingFace emotion model. |
| 🔐 Auth System | Register, login, logout, and password reset. |
| 🔄 Token Refresh | Short-lived access tokens auto-refresh with cookies. |
| 📁 Upload Audio | Supports MP3, WAV, and WEBM formats. |
| 💾 Download Transcript | Exports transcript as a .txt file. |
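
As a rough illustration of the upload and export features listed above, the following Python sketch checks a file name against the three listed formats and writes a transcript to a .txt file. The helper names `is_supported_audio` and `save_transcript` are hypothetical and are not taken from the repository.

```python
from pathlib import Path

# Formats named in the feature list; the constant name is an assumption.
SUPPORTED_EXTENSIONS = {".mp3", ".wav", ".webm"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


def save_transcript(text: str, audio_name: str, out_dir: str = ".") -> Path:
    """Write the transcript to <out_dir>/<audio stem>.txt and return the path."""
    target = Path(out_dir) / (Path(audio_name).stem + ".txt")
    target.write_text(text, encoding="utf-8")
    return target
```

The extension check is case-insensitive, so `clip.MP3` and `clip.mp3` are treated the same way. A real upload handler would likely also inspect the file contents rather than trusting the name alone.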

🧱 Tech Stack

Frontend

  • React (TypeScript)
  • Redux Toolkit
  • React Router
  • Axios (with token interceptor)
  • Google OAuth

Backend

  • Django 5 + Django REST Framework
  • Django Channels (WebSockets)
  • PostgreSQL
  • Faster-Whisper
  • HuggingFace Transformers
  • Google Cloud Translate API
  • FFmpeg for audio conversion
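
To make the audio path concrete, here is a minimal sketch of how an uploaded file might be normalized with FFmpeg before transcription. The 16 kHz mono WAV target, the function names, and the default binary path are assumptions for illustration; the repository's actual conversion settings may differ.

```python
import subprocess
from typing import List


def build_ffmpeg_command(src: str, dst: str, ffmpeg: str = "backend/bin/ffmpeg") -> List[str]:
    """Build an FFmpeg command that converts src to 16 kHz mono audio at dst."""
    return [
        ffmpeg,
        "-y",            # overwrite dst if it already exists
        "-i", src,       # input file (MP3, WAV, WEBM, ...)
        "-ac", "1",      # downmix to a single channel
        "-ar", "16000",  # resample to 16 kHz
        dst,
    ]


def convert_audio(src: str, dst: str) -> None:
    """Run the conversion; raises CalledProcessError if FFmpeg fails."""
    subprocess.run(build_ffmpeg_command(src, dst), check=True, capture_output=True)
```

The converted file could then be handed to Faster-Whisper, for example with `WhisperModel("base").transcribe("out.wav")`, where the model size is likewise an assumption.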

🖼️ Screenshots

[Three application screenshots, captured 2025-06-18]

Getting Started

Follow these steps to run SpeechConvert locally.


1️⃣ Prerequisites

  • Python 3.10+
  • Node.js 18+ and npm
  • PostgreSQL 13+
  • A Google Cloud Service Account JSON key with Translate API enabled
  • ffmpeg binary available at backend/bin/ffmpeg (make it executable)

💡 On macOS/Linux you may need:
chmod +x backend/bin/ffmpeg
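
If you would rather perform the same check from Python, for instance in a setup script, a minimal sketch might look like the following. The function name `ensure_executable` is hypothetical, and unlike `chmod +x` this version only adds the execute bit for the file's owner.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Add the owner execute bit if it is missing; return True if the mode changed."""
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False  # already executable by the owner
    os.chmod(path, mode | stat.S_IXUSR)
    return True
```

Calling `ensure_executable("backend/bin/ffmpeg")` raises `FileNotFoundError` when the binary is absent, which surfaces a missing prerequisite early.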


2️⃣ Backend — Environment Setup

Create a .env file inside the backend/ directory:

SECRET_KEY=your_secret_key
DEBUG=true
RENDER_EXTERNAL_HOSTNAME=localhost

DB_NAME=your_db
DB_USER=your_user
DB_PASSWORD=your_pass
DB_HOST=127.0.0.1
DB_PORT=5432

EMAIL_HOST=smtp.gmail.com
EMAIL_PORT=587
EMAIL_HOST_USER=your_email@gmail.com
EMAIL_HOST_PASSWORD=your_app_password
GOOGLE_APPLICATION_CREDENTIALS=/abs/path/to/service-account.json
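
For orientation, the sketch below shows one way values like these could be parsed and turned into Django settings. It is an assumption about how such a file might be consumed, not a copy of the project's settings.py, which may rely on a library such as python-dotenv instead. The helper names are hypothetical.

```python
from typing import Dict, Mapping


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def database_settings(env: Mapping[str, str]) -> Dict[str, Dict[str, str]]:
    """Build a Django DATABASES dict for PostgreSQL from the DB_* variables."""
    return {
        "default": {
            "ENGINE": "django.db.backends.postgresql",
            "NAME": env["DB_NAME"],
            "USER": env["DB_USER"],
            "PASSWORD": env["DB_PASSWORD"],
            "HOST": env.get("DB_HOST", "127.0.0.1"),
            "PORT": env.get("DB_PORT", "5432"),
        }
    }


def debug_enabled(env: Mapping[str, str]) -> bool:
    """Treat DEBUG=true, 1, or yes (case-insensitive) as enabled."""
    return env.get("DEBUG", "false").lower() in {"true", "1", "yes"}
```

Required keys such as `DB_NAME` are read with `env[...]` so that a missing value fails loudly with a `KeyError` instead of silently falling back to a default.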

⚙️ Backend — Install & Run (ASGI required)

Django Channels serves the WebSocket connections, which need an ASGI server; the last command below starts Daphne for that purpose.

cd backend
pip install -r requirements.txt
python manage.py migrate
python manage.py createsuperuser
daphne -b 0.0.0.0 -p 8000 backend.asgi:application

🚀 Running the Frontend

cd frontend
npm install
npm start

About

Full-stack Django + React app for real-time speech transcription, translation, and emotion analysis.
