SARAL AI is a full-stack application that converts research papers (LaTeX or arXiv) into engaging educational videos. The system uses AI for script generation, slide creation, audio narration, and video synthesis, covering the workflow from paper upload to downloadable media.

This guide covers the full local setup for the SARAL monorepo, which contains both the frontend (React + Vite) and the backend (Python + FastAPI) in a single repository:
saral/
├── frontend/ # React + Vite app
└── backend/ # FastAPI + worker services
- Prerequisites
- Repository Setup
- Backend Setup
- Poster Generation Go Service
- Frontend Setup
- Running the Full Stack
- Troubleshooting
- Notes for Contributors
Before anything else, make sure you have these installed:
- Git
- Node.js (active LTS, version 18 or later), which comes bundled with npm
- Python 3.11.x, required by backend/pyproject.toml
- Go (1.22+ recommended), required for the poster generation service in poster-service/
- A modern browser (Google Chrome recommended)
Quick version checks:
node --version
npm --version
python3.11 --version
go version
git --version

git clone <your-repository-url>
cd saral

Windows users: The backend uses Linux shell scripts and Linux-oriented worker tooling. WSL2 is strongly recommended for full parity. Native PowerShell is possible for API-only scenarios but is not supported for the full worker pipeline.
brew update
brew install ffmpeg poppler libreoffice redis

LaTeX + Beamer (choose one):
Option A (full distribution, easiest):
brew install --cask mactex-no-gui
sudo tlmgr update --self
sudo tlmgr install beamer latexmk

Option B (smaller install):
brew install --cask basictex
export PATH="/Library/TeX/texbin:$PATH"
sudo tlmgr update --self
sudo tlmgr install beamer collection-latexrecommended collection-fontsrecommended xetex latexmk

Persist the TeX path if using BasicTeX:
echo 'export PATH="/Library/TeX/texbin:$PATH"' >> ~/.zshrc
source ~/.zshrc

Start Redis:
brew services start redis

On Linux, the backend ships convenience scripts. From the repo root:
cd backend
chmod +x install_dependencies_linux.sh check_dependencies.sh
./install_dependencies_linux.sh
./check_dependencies.sh

Or install manually:
sudo apt update
sudo apt install -y \
ffmpeg \
poppler-utils \
libreoffice \
redis-server \
texlive-base \
texlive-latex-base \
texlive-latex-recommended \
texlive-latex-extra \
texlive-fonts-recommended \
texlive-fonts-extra \
texlive-xetex \
latexmk
sudo systemctl enable redis-server
sudo systemctl start redis-server

On Windows, open PowerShell as Administrator and install WSL2:
wsl --install -d Ubuntu

Reboot if prompted, then open Ubuntu and follow the Linux steps above.
From the repo root:
cd backend
python3.11 -m venv .venv
source .venv/bin/activate # Windows WSL: same command
pip install --upgrade pip
pip install uv
uv sync

Optional: install the Playwright browser runtime (needed for scraping paths):
uv run playwright install chromium

Copy the example env template:
# From repo root
cp .env.example backend/.env
# Or if you're already inside backend/
cp ../.env.example .env

Edit backend/.env with at minimum:
# Required for auth
GOOGLE_CLIENT_ID=your_google_oauth_client_id
# Required for generation flows
GEMINI_API_KEY=your_gemini_api_key
SARVAM_API_KEY=your_sarvam_api_key

All variable names must stay as-is. Do not commit .env.
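Before starting the API, it can help to confirm that the required keys are actually present. The following is a minimal sketch, not part of the repository: missing_env_keys is a hypothetical helper name, and it only checks that each key has a non-empty value, so an unedited placeholder such as your_gemini_api_key still counts as set.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SARAL): print each required key
# that is absent or empty in the given env file.
# Usage: missing_env_keys ENV_FILE KEY...
missing_env_keys() {
    env_file=$1
    shift
    for key in "$@"; do
        # A key counts as set only if a line starts with KEY= followed by
        # at least one character. A missing file reports every key.
        if ! grep -Eq "^${key}=.+" "$env_file" 2>/dev/null; then
            echo "$key"
        fi
    done
}

# Example (run from the repo root):
# missing_env_keys backend/.env GOOGLE_CLIENT_ID GEMINI_API_KEY SARVAM_API_KEY
```

No output means all listed keys have a value.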
The backend expects a Firebase service account JSON at backend/firebase_service_account.json.
- Go to Firebase Console and create (or select) a project.
- Open Build → Firestore Database and create a Firestore database.
- Open Project Settings → Service Accounts.
- Click Generate new private key and download the JSON.
- Rename the file to firebase_service_account.json and move it to backend/.
- Open Google Cloud Console for the same project.
- Go to APIs & Services → OAuth consent screen and configure it.
- Go to APIs & Services → Credentials → Create Credentials → OAuth client ID.
- Select Web application, create it, and copy the Client ID into backend/.env as GOOGLE_CLIENT_ID.
cd backend
./check_dependencies.sh

Expected: ffmpeg, pdflatex, xelatex, pdftoppm, pdfinfo, and soffice/libreoffice all detected.
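If the script is unavailable or you want to see which tool is missing, a similar check can be done by hand. This is a rough sketch of that kind of PATH lookup, not the contents of check_dependencies.sh; missing_tools and have_any are hypothetical names introduced here.

```shell
#!/bin/sh
# Hypothetical helpers (not the actual check_dependencies.sh).

# Print the name of every listed tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Succeed if at least one of the given alternatives is on PATH.
have_any() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Example:
# missing_tools ffmpeg pdflatex xelatex pdftoppm pdfinfo
# have_any soffice libreoffice || echo "soffice/libreoffice"
```

Anything printed is a tool that still needs to be installed or added to PATH.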
From backend/ with the venv active:
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000

With auto-reload for development:
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Swagger UI will be available at: http://127.0.0.1:8000/docs
The background workers handle PDF processing, video generation, audio generation, poster generation, and more. For full feature parity, run them alongside the API.
cd backend
chmod +x start_workers.sh stop_workers.sh
./start_workers.sh

View logs:
tail -f logs/pdf_processor.log
tail -f logs/pdf_worker.log
tail -f logs/arxiv_worker.log
tail -f logs/latex_worker.log
tail -f logs/video_worker.log
tail -f logs/poster_worker.log
tail -f logs/audio_worker.log

Stop workers:
./stop_workers.sh

Poster generation uses a dedicated Go service, with instances expected on ports 8080 and 8081.
Run both instances in separate terminals before testing poster generation.
macOS:
brew install go

Ubuntu/Debian:
sudo apt update
sudo apt install -y golang-go

Windows:
# Option A: winget
winget install GoLang.Go
# Option B: chocolatey
choco install golang

Verify:
go version

cd poster-service
go mod tidy

Open two more terminals (in addition to the backend and frontend terminals):
Terminal A (Go poster server on port 8080):
cd poster-service
go run . --server --port=:8080

Terminal B (Go poster server on port 8081):
cd poster-service
go run . --server --port=:8081

These two processes are used by the poster worker load balancer.
If go run . fails with go.mod file not found or no Go files in ..., ensure the runnable server entrypoint exists in poster-service/ and that go.mod is present.
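If two extra terminals are inconvenient, both instances can be launched as background jobs from one shell. This is only a sketch under assumptions: start_poster_services and the DRY_RUN switch are hypothetical and not part of the repository; the go run flags are the ones shown above. Output from both instances will interleave in the same terminal.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with SARAL): launch one poster-service
# instance per port from a single terminal.
# With DRY_RUN=1 the commands are printed instead of executed.
start_poster_services() {
    for port in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "go run . --server --port=:$port"
        else
            # Each instance runs in a background subshell inside poster-service/.
            (cd poster-service && go run . --server --port=":$port") &
        fi
    done
}

# Example (run from the repo root):
# DRY_RUN=1 start_poster_services 8080 8081   # preview the commands
# start_poster_services 8080 8081             # start both instances
```

Separate terminals remain the simpler option when you need to read each instance's log on its own.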
Skip this section if you already have Node 18+.
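To test that condition from a script rather than by eye, the major version can be parsed out of the string that node --version prints. A minimal sketch, with node_version_ok as a hypothetical helper name:

```shell
#!/bin/sh
# Hypothetical helper: succeed if a version string such as "v18.17.0"
# has a major version of 18 or higher.
node_version_ok() {
    major=${1#v}        # drop the leading "v" that node prints
    major=${major%%.*}  # keep only the digits before the first dot
    # A non-numeric or empty value makes the comparison fail.
    [ "$major" -ge 18 ] 2>/dev/null
}

# Example:
# node_version_ok "$(node --version)" && echo "Node is new enough"
```

If Node is not installed at all, the check fails and the message is not printed.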
macOS:
# Option A: Homebrew
brew install node
# Option B: nvm
nvm install --lts && nvm use --lts

Linux:
# Option A: nvm (recommended)
nvm install --lts && nvm use --lts
# Option B: apt (distro packages can lag behind Node 18; check with node --version)
sudo apt update && sudo apt install -y nodejs npm

Windows:
# Option A: winget
winget install OpenJS.NodeJS.LTS
# Option B: nvm-windows
nvm install lts
nvm use lts

After installing, reopen your terminal and verify:
node --version
npm --version

cd frontend
npm install

Create a .env file inside frontend/:
VITE_APP_API_URL=http://localhost:8000
VITE_FIREBASE_API_KEY=your_firebase_api_key
VITE_FIREBASE_AUTH_DOMAIN=your_project.firebaseapp.com
VITE_FIREBASE_PROJECT_ID=your_project_id
VITE_FIREBASE_STORAGE_BUCKET=your_project.appspot.com
VITE_FIREBASE_MESSAGING_SENDER_ID=your_sender_id
VITE_FIREBASE_APP_ID=your_app_id
VITE_FIREBASE_MEASUREMENT_ID=your_measurement_id
VITE_REACT_APP_GOOGLE_CLIENT_ID=your_google_oauth_client_id
# Optional
VITE_MIXPANEL_TOKEN=your_mixpanel_token

All frontend env variables must be prefixed with VITE_ to be accessible in-app. Restart the dev server after any .env change. Do not commit .env.
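A variable without the prefix is silently unavailable in the app, so it is easy to miss. The sketch below lists such names in an env file; unprefixed_keys is a hypothetical helper, and it assumes one KEY=value pair per line with no leading whitespace.

```shell
#!/bin/sh
# Hypothetical helper: print every variable name in an env file that
# lacks the VITE_ prefix. Blank lines and # comments are ignored.
unprefixed_keys() {
    grep -Ev '^[[:space:]]*(#|$)' "$1" | cut -d= -f1 | grep -v '^VITE_' || true
}

# Example (run from the repo root):
# unprefixed_keys frontend/.env
```

No output means every variable in the file carries the prefix.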
- In your Firebase project, add a Web App.
- Enable Google sign-in under Authentication → Sign-in method.
- Copy the Firebase config values into frontend/.env.
The codebase currently contains a hardcoded redirect URI for the YouTube OAuth flow:
https://summarizesaral.democratiseresearch.in/oauth2callback
For this flow to work locally, ensure this URI is listed as an Authorized redirect URI in your Google Cloud OAuth client configuration.
cd frontend
npm run dev

The app runs at http://localhost:3000. Vite opens the browser automatically.

Available scripts:
npm run dev # start local development server
npm run build # create production build (output: build/)
npm run preview # preview production build locally
npm run lint # run eslint

Once the backend and frontend are both configured, run these in separate terminals:
Terminal 1 (Backend API):
cd backend
source .venv/bin/activate
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

Terminal 2 (Background Workers; optional, for full features):
cd backend
./start_workers.sh

Terminal 3 (Go Poster Service 1; required for poster generation):
cd poster-service
go run . --server --port=:8080

Terminal 4 (Go Poster Service 2; required for poster generation):
cd poster-service
go run . --server --port=:8081

Terminal 5 (Frontend):
cd frontend
npm run dev

Then open http://localhost:3000.
Verification checklist:
- App opens at http://localhost:3000
- http://127.0.0.1:8000/docs loads Swagger UI
- redis-cli ping returns PONG
- Home page loads without build errors
- Login page renders
- After valid login, protected routes are accessible
- API setup page appears and accepts a Gemini API key
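The command-line items in this checklist can be scripted; the login and protected-route items still need a browser. A minimal sketch, assuming curl and redis-cli are installed, with check as a hypothetical helper that reports whether a command succeeded:

```shell
#!/bin/sh
# Hypothetical helper: run a command quietly and print PASS or FAIL
# next to a label, based on the command's exit status.
check() {
    label=$1
    shift
    if "$@" >/dev/null 2>&1; then
        echo "PASS $label"
    else
        echo "FAIL $label"
    fi
}

# Example, with the API, Redis, and the frontend already running:
# check "Swagger UI" curl -fsS http://127.0.0.1:8000/docs
# check "Redis"      redis-cli ping
# check "Frontend"   curl -fsS http://localhost:3000
```

A FAIL line only says the command exited with an error; rerun that command directly to see its message.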
Backend issues:

| Problem | Fix |
|---|---|
| python3.11: command not found | Install Python 3.11: brew install python@3.11 (macOS) or sudo apt install python3.11 python3.11-venv (Ubuntu) |
| Redis not running | brew services start redis (macOS) or sudo systemctl start redis-server (Linux) |
| Missing pdflatex / xelatex / Beamer | Install a TeX distribution; on Linux ensure texlive-latex-extra and texlive-xetex are installed |
| Missing soffice / libreoffice | Install LibreOffice and verify it is in PATH |
| firebase_service_account.json errors | Ensure a valid Firebase service account JSON is at backend/firebase_service_account.json |
| Playwright/Patchright browser issues | Run uv run playwright install chromium |
| API key errors on generation endpoints | Set GEMINI_API_KEY and SARVAM_API_KEY in backend/.env |
| go: command not found when starting poster service | Install Go, reopen terminal, and verify with go version |
| go mod tidy fails in poster-service/ | Ensure poster-service/go.mod exists and the service code is complete |
| Poster generation request fails with worker errors | Confirm both Go servers are running on ports 8080 and 8081 |

Frontend issues:

| Problem | Fix |
|---|---|
| npm install fails | Check Node/npm versions; delete node_modules and retry (rm -rf node_modules && npm install) |
| Port 3000 already in use | Stop the conflicting process, then rerun npm run dev |
| Login fails immediately | Verify all VITE_FIREBASE_* values; confirm Firebase Google sign-in is enabled |
| API calls fail / CORS errors | Confirm backend is running at VITE_APP_API_URL; confirm backend allows requests from http://localhost:3000 |
| OAuth callback issues | Confirm VITE_REACT_APP_GOOGLE_CLIENT_ID is correct; confirm authorized redirect URI is configured in Google Cloud |
| .env changes not reflected | Stop and restart npm run dev |
- Never commit secrets. Keep .env files and firebase_service_account.json out of version control; they are already in .gitignore.
- Document new environment variables. Add any new VITE_ or backend env var to the relevant .env.example and to this README immediately.
- Update this guide if you add a new local dependency or setup step.
- If you only need API smoke tests, you can run the backend API server alone without workers.
- For full media pipelines (PDF → video, poster generation, audio), run the API server + all workers + all system dependencies.