AudioScholar is an intelligent, multi-user platform that records lecture audio and uses AI-driven summarization to produce structured insights for learners. As a dual-platform solution comprising an Android mobile application and a comprehensive web interface, it lets users capture lectures, summarize them, and receive personalized learning material recommendations based on the recordings. By transforming lengthy lectures into digestible key points, AudioScholar enhances note-taking efficiency and content comprehension.
For the most detailed product overview, see docs/README-AudioScholar.md. This root README is the contributor setup and validation entry point.
- Smart Recording: Record lectures in real-time (online or offline) using the Android mobile app.
- Audio Uploads: Upload pre-recorded audio files from both mobile and web interfaces.
- Playback & Management: Centralized library with advanced playback controls and organizational tools.
- Intelligent Summarization: Leverages Google Gemini AI API to generate structured summaries, key topics, and glossaries.
- Contextual Enhancement: Upload PowerPoint presentations to augment AI processing for higher accuracy.
- Smart Recommendations: Automatically suggests relevant YouTube learning materials based on lecture content.
- Cross-Platform Access: Seamless experience across Android Mobile App and Web Interface.
- Cloud Synchronization: Securely sync recordings and summaries to Nhost Storage for access on any device.
- Offline Capabilities: Record and access local data on mobile even without an internet connection.
- Flexible Authentication: Secure login via Google, GitHub, or Email/Password using Firebase Authentication.
- User Notes: Integrated note-taking system to add personal insights alongside AI summaries.
- Freemium Architecture: Tiered feature access distinguishing between Free and Premium user experiences.
- Admin Dashboard: Powerful tools for user management, analytics, and system monitoring.
Ensure the following tools are installed on your system:
- Java Development Kit (JDK) 24
- Node.js (v18+)
- npm or yarn
- Git
- Maven (for backend)
- Android Studio (Latest version)
- Nhost Account (for cloud file storage - Nhost Cloud)
- Firebase Account (for authentication and Firestore database - Firebase Console)
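As a rough aid, the command-line tools in the list above can be checked from a shell. The sketch below is illustrative only: the function names (`have_tool`, `node_major`, `check_prereqs`) are hypothetical and not part of the repository, and it assumes npm rather than yarn. The Node.js threshold of 18 comes from the list above.

```shell
#!/bin/sh
# Illustrative prerequisite check; function names are hypothetical, not part of the repo.

# have_tool NAME: succeed if NAME resolves to a command on PATH.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# node_major VERSION: print the major number from a string such as "v18.19.0".
node_major() {
  printf '%s\n' "$1" | sed -e 's/^v//' -e 's/\..*$//'
}

# check_prereqs: report each missing tool; return non-zero if anything is missing
# or if the installed Node.js is older than v18.
check_prereqs() {
  missing=0
  # Assumption: npm is used instead of yarn; mvn is listed although the
  # backend commands in this README go through the ./mvnw wrapper.
  for tool in java node npm git mvn; do
    if ! have_tool "$tool"; then
      echo "missing: $tool"
      missing=1
    fi
  done
  if have_tool node; then
    major=$(node_major "$(node --version)")
    if [ "$major" -lt 18 ]; then
      echo "node $major is older than the required v18"
      missing=1
    fi
  fi
  return "$missing"
}
```

Calling `check_prereqs` after sourcing the file prints one line per problem. It does not verify the JDK version, Android Studio, or the Nhost and Firebase accounts, which still need a manual check.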
git clone https://github.com/MasuRii/AudioScholar.git
cd AudioScholar
- Navigate to the backend:
cd backend/audioscholar
- Required Configuration Files: The backend requires two specific files to function correctly. You must create or place them in the specified locations:
  - .env file: Required at backend/audioscholar/.env
  - Firebase Service Account: Required at backend/audioscholar/src/main/resources/firebase-service-account.json
- Set up .env content: Create the .env file in backend/audioscholar/ with the following variables:
# Copy from backend/audioscholar/.env.example and fill deployment-specific values.
SPRING_PROFILES_ACTIVE=local
APP_CORS_ALLOWED_ORIGINS=http://localhost:5173,http://localhost:8080,capacitor://localhost
FIREBASE_WEB_API_KEY=your-firebase-web-api-key
FIREBASE_DATABASE_URL=https://your-project.firebaseio.com
GOOGLE_CLIENT_ID=your-google-oauth-client-id
GOOGLE_CLIENT_SECRET=your-google-oauth-client-secret
GOOGLE_ANDROID_CLIENT_ID=your-google-android-client-id
GOOGLE_AI_API_KEY=your-gemini-api-key
GEMINI_API_KEYS=your-gemini-api-key-or-comma-separated-keys
YOUTUBE_API_KEY=your-youtube-api-key
NHOST_STORAGE_URL=https://your-nhost-project.storage.region.nhost.run/v1/files
NHOST_ADMIN_SECRET=your-nhost-admin-secret
GITHUB_CLIENT_ID=your-github-oauth-client-id
GITHUB_CLIENT_SECRET=your-github-oauth-client-secret
JWT_SECRET=your-strong-jwt-secret
CONVERTAPI_SECRET=your-convertapi-secret
CONVERTAPI_SECRETS=your-convertapi-secret-or-comma-separated-secrets
UPTIME_ROBOT_API=your-uptime-robot-api-key
NVD_API_KEY=your-nvd-api-key
- Configure Application Properties: Keep secrets and deployment identifiers in .env or environment variables. application.properties reads Firebase, Nhost, CORS, OAuth, JWT, Gemini, YouTube, ConvertAPI, and UptimeRobot values from those variables.
- Run the backend:
./mvnw spring-boot:run -Dspring-boot.run.profiles=local
Or run AudioscholarApplication.java from your IDE. (Spring Boot version 3.5.8)
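Before starting the backend, it can help to confirm that the .env file no longer holds template placeholders. The following is a minimal sketch, not a script shipped with the project: `env_value` and `env_problems` are hypothetical helper names, and the placeholder test assumes the `your-...` naming used in the variable list above.

```shell
#!/bin/sh
# Illustrative .env sanity check; helper names are hypothetical, not part of the repo.

# env_value FILE KEY: print the value assigned to KEY in FILE (empty if absent).
# If KEY is assigned more than once, the last assignment wins.
env_value() {
  sed -n "s/^$2=//p" "$1" | tail -n 1
}

# env_problems FILE KEY...: print every KEY that is missing, empty, or still
# holds a "your-..." placeholder copied from the template.
env_problems() {
  file=$1
  shift
  for key in "$@"; do
    value=$(env_value "$file" "$key")
    case "$value" in
      ''|your-*|*//your-*) echo "$key" ;;
    esac
  done
}
```

For example, `env_problems backend/audioscholar/.env JWT_SECRET NHOST_ADMIN_SECRET GOOGLE_AI_API_KEY` prints the names that still need real values and prints nothing when all three are filled in. It checks only the form of each value, not whether a key is actually valid.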
- Navigate to the web app:
cd frontend_web/audioscholar-app
- Install dependencies:
npm ci
- Create a .env file in frontend_web/audioscholar-app:
# Backend API URL
VITE_API_URL=http://localhost:8080
# Firebase Frontend Configuration
VITE_FIREBASE_API_KEY=your-firebase-api-key
VITE_FIREBASE_AUTH_DOMAIN=your-firebase-auth-domain
VITE_FIREBASE_DATABASE_URL=your-firebase-database-url
VITE_FIREBASE_PROJECT_ID=your-firebase-project-id
VITE_FIREBASE_STORAGE_BUCKET=your-firebase-storage-bucket
VITE_FIREBASE_MESSAGING_SENDER_ID=your-firebase-messaging-sender-id
VITE_FIREBASE_APP_ID=your-firebase-app-id
VITE_FIREBASE_MEASUREMENT_ID=your-firebase-measurement-id
VITE_GITHUB_CLIENT_ID=your-github-oauth-client-id
- Run the development server:
npm run dev
(Uses Vite 6.4.1, React 19.) Open at: http://localhost:5173
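The dev server origin (http://localhost:5173) has to appear in the backend's APP_CORS_ALLOWED_ORIGINS for browser requests to succeed. The sketch below shows one way to check that from a shell. `origin_allowed` is a hypothetical helper, and exact string matching against each comma-separated entry is an assumption about how the backend compares origins.

```shell
#!/bin/sh
# Illustrative check; origin_allowed is a hypothetical helper, not part of the repo.

# origin_allowed ORIGIN LIST: succeed if ORIGIN appears as an exact entry in
# the comma-separated LIST (the format used by APP_CORS_ALLOWED_ORIGINS).
origin_allowed() {
  # Wrapping both sides in commas prevents partial matches such as
  # "http://localhost" matching "http://localhost:5173".
  case ",$2," in
    *",$1,"*) return 0 ;;
    *) return 1 ;;
  esac
}
```

With the sample value from the backend .env, `origin_allowed http://localhost:5173 "$APP_CORS_ALLOWED_ORIGINS"` succeeds, while an origin such as http://localhost:3000 would need to be added to the list first.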
- Open Android Studio → "Open an Existing Project"
- Navigate to:
frontend_mobile/AudioScholar
- Sync Gradle files.
- Configure Firebase:
  - Place google-services.json in frontend_mobile/AudioScholar/app/.
- Configure API Base URL:
In frontend_mobile/AudioScholar/local.properties:
# Debug builds may use the Android emulator loopback URL.
BASE_URL=http://10.0.2.2:8080/
Release builds default to the HTTPS Render backend and should be overridden only with an HTTPS BASE_URL. Cleartext HTTP is disabled for release and permitted only by the debug network security config for local emulator hosts.
- Run on an emulator or physical device.
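The BASE_URL rule above (HTTPS everywhere, cleartext HTTP only for debug builds that target local emulator hosts) can be expressed as a small check. This is a sketch under stated assumptions: `base_url_ok` is a hypothetical helper, and the set of local hosts (10.0.2.2, localhost, 127.0.0.1) is a guess at what the debug network security config permits, not taken from the project.

```shell
#!/bin/sh
# Illustrative rule check; base_url_ok is a hypothetical helper, and the list of
# local emulator hosts below is an assumption, not read from the project config.

# base_url_ok BUILD_TYPE URL: succeed if URL is acceptable for BUILD_TYPE
# ("debug" or "release").
base_url_ok() {
  case "$2" in
    https://*) return 0 ;;          # HTTPS is acceptable for every build type
  esac
  [ "$1" = "debug" ] || return 1    # cleartext is never acceptable in release
  case "$2" in
    http://10.0.2.2[:/]*|http://localhost[:/]*|http://127.0.0.1[:/]*) return 0 ;;
    *) return 1 ;;
  esac
}
```

For instance, `base_url_ok debug http://10.0.2.2:8080/` succeeds, while the same URL with `release` fails, which mirrors the restriction described above.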
Run the smallest target-native checks for the component you changed:
# Backend
cd backend/audioscholar
./mvnw -B test
./mvnw -B spotless:check
# Optional security gate when NVD_API_KEY is configured: ./mvnw -B dependency-check:check
# Web frontend
cd frontend_web/audioscholar-app
npm ci
npm run lint
npm run test
npm run build
npm audit --audit-level=high
# Mobile frontend
cd frontend_mobile/AudioScholar
./gradlew test
./gradlew lint
./gradlew assembleDebug
# connectedAndroidTest requires an emulator or physical device
- Do not commit .env, Firebase service account JSON, google-services.json, API keys, OAuth secrets, or JWT secrets.
- Backend bearer/JWT values must never be logged; logs should contain user IDs and event metadata only.
- Dependency/security gates are defined in GitHub Actions and can be run locally with the commands above.
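One way to enforce the "do not commit" rule locally is a Git pre-commit hook that rejects the secret files named above. The sketch below is not part of the repository: `is_secret_path` and `check_staged` are hypothetical names, and it matches file names only, so keys pasted into other files would still need a content scanner.

```shell
#!/bin/sh
# Illustrative pre-commit guard; function names are hypothetical, not part of the repo.

# is_secret_path PATH: succeed if PATH names a file that must not be committed.
# Templates such as .env.example do not match and stay committable.
is_secret_path() {
  case "$1" in
    .env|*/.env) return 0 ;;
    google-services.json|*/google-services.json) return 0 ;;
    firebase-service-account.json|*/firebase-service-account.json) return 0 ;;
    *) return 1 ;;
  esac
}

# check_staged: inspect staged (added, copied, modified, renamed) paths and
# return non-zero if any of them is a secret file.
check_staged() {
  git diff --cached --name-only --diff-filter=ACMR | (
    status=0
    while IFS= read -r path; do
      if is_secret_path "$path"; then
        echo "refusing to commit secret file: $path" >&2
        status=1
      fi
    done
    exit "$status"
  )
}
```

To try it, save the functions in .git/hooks/pre-commit, add a final `check_staged` line, and make the file executable; Git aborts the commit when the hook exits with a non-zero status.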
- Register or log in using Firebase Authentication.
- Record a lecture using the mobile app. Audio is uploaded to Nhost Storage.
- AI processing generates summaries and metadata in Firebase Firestore.
- View the summary on web or mobile under My Lectures.
- Access recommended YouTube videos for deeper learning.
- Spring Boot 3.5.8
- Nhost Interaction
- Firebase Admin SDK
- Google Gemini AI API
- YouTube Data API v3
- React 19
- Vite 6.4.1
- Firebase SDK
- Kotlin + Jetpack Compose
- AndroidX
- Firebase SDKs
- Ktor Client
- Media3 ExoPlayer
The following features are explicitly noted as outside the scope of the initial AudioScholar release (v1.0):
| Feature | Status |
|---|---|
| Real-time Transcription | 🚫 Future enhancement |
| iOS Mobile Platform Support | 🚫 Android only for v1.0 |
| Web Interface Audio Recording | 🚫 Upload only for v1.0 |
| Multi-language Support | 🚫 English only for v1.0 |
| Background Recording (Free Users) | 🚫 Restricted feature |
| Recommendation Engine beyond YouTube | 🚫 Future enhancement |
- Use Case & Activity Diagrams: View on Figma
- Mobile Wireframes: View on Figma
- Web Wireframes: View on Figma
- Database Schema & ER Diagrams: View on Figma
Adviser/Lead:
- Ralph P. Laviste
Group Adviser:
- Jasmine A. Tulin
Proponents:
- Biacolo, Math Lee L.
- Terence, John Duterte
- Orlanes, John Nathan
- Barrientos, Claive Justin
- Alpez, Christian Brent
Distributed under the MIT License. See LICENSE for more information.
For issues, suggestions, or collaboration inquiries, feel free to open an issue or contact the development team.
AudioScholar: Empowering learners through intelligent audio insights.

