An AI-driven application that analyzes mock interview responses and provides quantitative feedback on confidence, fluency, and communication quality using speech recognition, audio signal processing, NLP, and machine learning.
This project is designed as a local ML system with a thin UI layer, focusing on correctness of methodology, interpretability, and clean software architecture.
- Speech-to-Text using OpenAI Whisper
- Audio Signal Processing
  - Speaking pace (words per minute)
  - Pitch variation
  - Pause ratio
- NLP-Based Text Analysis
  - Filler word detection
  - Grammar & fluency proxy scoring
- Machine Learning Confidence Model
  - Supervised regression model
  - Uses acoustic, linguistic, and transformer-based emotion features
- Visual Analytics
  - Radar chart for overall performance
  - Bar chart for metric breakdown
- Automated PDF Report
  - Transcript
  - Metrics
  - Confidence score
  - Actionable feedback
- Clean, user-friendly Streamlit UI
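
As an illustration of the speaking-pace and filler-word metrics listed above, here is a minimal sketch that derives both from a transcript and the clip duration. The function name `speaking_metrics` and the `FILLER_PHRASES` list are assumptions made for this example; the project's actual implementation in `processing/` may differ.

```python
import re

# Hypothetical filler list; the project's actual list is not given in this README.
FILLER_PHRASES = ("um", "uh", "like", "you know", "basically", "actually")


def speaking_metrics(transcript: str, duration_seconds: float) -> dict:
    """Return words per minute and filler words per minute for a transcript."""
    if duration_seconds <= 0:
        raise ValueError("duration_seconds must be positive")

    # Lowercase and keep only word-like tokens so punctuation does not matter.
    words = re.findall(r"[a-z']+", transcript.lower())
    normalized = " ".join(words)

    # Match whole words or phrases only, so "like" does not match "likely".
    filler_count = sum(
        len(re.findall(rf"\b{re.escape(phrase)}\b", normalized))
        for phrase in FILLER_PHRASES
    )

    minutes = duration_seconds / 60.0
    return {
        "wpm": len(words) / minutes,
        "fillers_per_minute": filler_count / minutes,
    }
```

Normalizing both counts by duration keeps the two metrics comparable across answers of different lengths.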
ai_interview_analyzer/
│
├── app.py # Streamlit UI
├── requirements.txt
├── README.md
│
├── audio/
│ └── sample.wav
│
├── processing/ # Feature extraction
│ ├── speech_to_text.py
│ ├── audio_features.py
│ ├── text_analysis.py
│ └── confidence_score.py # Rule-based baseline
│
├── models/ # ML model & weights
│ ├── confidence_model.py
│ └── confidence_model.pkl
│
├── training/ # Offline training & evaluation
│ ├── training_data.py
│ ├── train_confidence_model.py
│ └── evaluate_confidence_model.py
│
├── utils/
│ ├── feedback.py
│ └── report_pdf.py
- Regression
- Output: Confidence score ∈ [0, 100]
- Speaking pace (WPM)
- Filler words per minute
- Grammar & fluency score (NLP-based proxy)
- Pitch variation
- Pause ratio
- Emotion probabilities from a pretrained Hugging Face transformer
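
Of the features above, pause ratio can be approximated directly from the waveform. The following is a minimal sketch, assuming mono samples normalized to [-1, 1] and a fixed RMS silence threshold. The function name, the 25 ms frame size, and the threshold value are hypothetical and are not taken from `audio_features.py`.

```python
import math


def pause_ratio(samples, sample_rate, frame_ms=25, silence_threshold=0.01):
    """Fraction of fixed-length frames whose RMS energy is below a threshold."""
    frame_len = max(1, int(sample_rate * frame_ms / 1000))
    # Split the signal into consecutive, non-overlapping frames.
    frames = [samples[i:i + frame_len] for i in range(0, len(samples), frame_len)]
    if not frames:
        return 0.0

    silent = 0
    for frame in frames:
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        if rms < silence_threshold:
            silent += 1
    return silent / len(frames)
```

A fixed threshold is the simplest choice; a real recording would likely need the threshold tuned to the noise floor of the microphone.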
- RandomForestRegressor
- Trained on heuristic-labeled samples
- Designed to learn relative importance of features
- Optimized for interpretability and stability
No off-the-shelf pretrained model directly outputs an interview confidence score. Instead, this project follows a common hybrid approach: pretrained models handle feature extraction, and a lightweight supervised regressor maps those features to the score.
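
To make "heuristic-labeled samples" concrete, here is a sketch of what a rule-based label in [0, 100] could look like. Everything in it is an assumption for illustration: the function name, the 140 WPM target, the 10 fillers-per-minute cap, and the weights are not taken from `processing/confidence_score.py` or `training/training_data.py`.

```python
def clamp(value, low, high):
    """Restrict value to the closed interval [low, high]."""
    return max(low, min(high, value))


def heuristic_confidence_label(wpm, fillers_per_minute, pause_ratio, fluency_score):
    """Hypothetical rule-based label in [0, 100]; weights are illustrative only."""
    # Pace score peaks at an assumed target of 140 WPM and falls off linearly.
    pace = clamp(1.0 - abs(wpm - 140.0) / 140.0, 0.0, 1.0)
    # Assumed cap: 10 or more fillers per minute scores zero.
    filler = clamp(1.0 - fillers_per_minute / 10.0, 0.0, 1.0)
    # Less silence scores higher.
    pause = clamp(1.0 - pause_ratio, 0.0, 1.0)
    # fluency_score is assumed to be on a 0-100 scale.
    fluency = clamp(fluency_score / 100.0, 0.0, 1.0)

    score = 100.0 * (0.3 * pace + 0.3 * filler + 0.2 * pause + 0.2 * fluency)
    return clamp(score, 0.0, 100.0)
```

Labels produced this way are a deterministic function of the input features, which is worth keeping in mind when reading evaluation metrics for a model trained on them.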
| Metric | Value |
|---|---|
| MAE | 1.16 |
| RMSE | 1.54 |
| R² Score | 0.992 |
Interpretation:
- Low error is expected: the dataset is small and the labels are heuristic, so the model can largely learn to reproduce the labeling rule.
- Results validate the ML pipeline and feature relevance, not real-world generalization.
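
For reference, the three metrics in the table can be computed as follows. This is a dependency-free sketch using the standard definitions; the function name `regression_metrics` is an assumption, and `training/evaluate_confidence_model.py` may compute them differently (for example, via a library).

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE, and R² for two equal-length sequences of numbers."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)  # residual sum of squares
    rmse = math.sqrt(ss_res / n)

    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    # R² is undefined when all targets are identical.
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"mae": mae, "rmse": rmse, "r2": r2}
```

MAE and RMSE are in the same units as the confidence score (points on the 0 to 100 scale), while R² is unitless.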
Built using Streamlit with a simple, intuitive flow:
- Upload interview audio
- View transcript
- Inspect performance metrics
- Understand confidence score
- Receive actionable feedback
- Download a PDF performance report
python3.10 -m venv venv
source venv/bin/activate
pip install -r requirements.txt
python -m spacy download en_core_web_sm
python -m training.train_confidence_model
python -m training.evaluate_confidence_model
streamlit run app.py
The application generates a downloadable PDF containing:
- Interview transcript
- Extracted metrics
- Confidence score
- Personalized feedback
- Confidence is subjective.
- Labels are heuristic-based, not psychological ground truth.
- Predictions represent perceived confidence from observable signals.
- Larger labeled dataset
- Cross-validation & robustness testing
- Video-based cues (eye contact, posture)
- Backend API and cloud deployment