AI Study Assistant Quiz Generator

An AI-enabled Streamlit study assistant that turns course materials into source-grounded quizzes, flashcards, explanations, and review snippets.

This project is built for the SDEV378 Applied AI final project standard: a functional proof of concept with at least three ML-based components working together in a meaningful way.

ML Components

Document extraction and OCR
- PyMuPDF extracts selectable PDF text locally.
- pytesseract can OCR image uploads and PDF pages that do not contain extractable text.
- Output: normalized study text plus extraction diagnostics.

Semantic retrieval
- sentence-transformers/all-MiniLM-L6-v2 creates local embeddings.
- ChromaDB stores and searches chunks from the user's uploaded or pasted materials.
- Output: source chunks ranked by semantic relevance.

LLM study generation
- GroqCloud runs llama-3.1-8b-instant by default.
- The generator receives retrieved snippets and produces quiz questions, flashcards, or explanations grounded in those snippets.
- Output: structured study content plus source references.
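
To illustrate how retrieval feeds generation, here is a minimal sketch of turning ranked snippets into a source-grounded prompt. It is not the project's actual code: the Snippet class, the build_grounded_prompt function, and the prompt wording are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class Snippet:
    """One retrieved chunk plus where it came from (hypothetical structure)."""
    source: str   # e.g. a file name or "pasted text"
    text: str
    score: float  # retrieval similarity, higher means more relevant


def build_grounded_prompt(snippets, mode="quiz", max_snippets=4):
    """Assemble a prompt that restricts the model to numbered source snippets."""
    if mode not in {"quiz", "flashcards", "explanation"}:
        raise ValueError(f"unsupported mode: {mode}")
    # Keep only the most relevant snippets so the prompt stays small.
    top = sorted(snippets, key=lambda s: s.score, reverse=True)[:max_snippets]
    numbered = [
        f"[{i}] ({s.source}) {s.text.strip()}" for i, s in enumerate(top, start=1)
    ]
    instructions = (
        f"Create {mode} content using ONLY the numbered snippets below. "
        "Cite the snippet number for every item. "
        "If the snippets do not cover something, say so instead of guessing."
    )
    return instructions + "\n\n" + "\n".join(numbered)
```

Numbering the snippets gives the model something concrete to cite, which is one simple way to produce the source references listed as the generator's output.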

Happy Path

1. Upload or paste course material.
2. Preview extracted text and fix it if needed.
3. Build a local semantic index.
4. Choose quiz, flashcards, or explanation mode.
5. Generate study output using retrieved snippets.
6. Answer quiz questions and review explanations tied back to source text.
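
The index-building step needs the extracted text split into chunks before anything can be embedded. The sketch below shows one common approach, overlapping word windows. The chunk_text function and its default sizes are assumptions for illustration, not the project's actual chunking logic.

```python
def chunk_text(text, max_words=120, overlap=20):
    """Split text into overlapping word windows ready for embedding."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.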

Setup From GitHub

Clone the repository and enter the project folder:

git clone https://github.com/mmaslov007/SDEV378-Final-AI.git
cd SDEV378-Final-AI

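Create and activate a virtual environment, then install the dependencies. The commands below are for Windows PowerShell; on macOS or Linux, activate with source .venv/bin/activate instead: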

python -m venv .venv
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip
python -m pip install -r requirements.txt

Create your local environment file:

copy .env.example .env

Add your Groq API key to .env (never commit this file):

GROQ_API_KEY=your_key_here

Install Tesseract OCR for image uploads and scanned PDFs. On Windows, use either winget or Chocolatey:

winget install --id tesseract-ocr.tesseract --exact --accept-source-agreements --accept-package-agreements

choco install tesseract -y

Close and reopen PowerShell after installing Tesseract so PATH refreshes. If OCR still shows as unavailable, set the executable path in .env:

TESSERACT_CMD=C:\Program Files\Tesseract-OCR\tesseract.exe

On macOS with Homebrew:

brew install tesseract

On Debian or Ubuntu Linux:

sudo apt-get install tesseract-ocr
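
The lookup order described above (an explicit TESSERACT_CMD setting first, otherwise whatever is on PATH) can be sketched in Python as follows. The resolve_tesseract_cmd function is a hypothetical helper for illustration; the app's own detection logic may differ.

```python
import os
import shutil


def resolve_tesseract_cmd(env=None):
    """Return a path to the tesseract executable, or None if it cannot be found."""
    env = os.environ if env is None else env
    configured = (env.get("TESSERACT_CMD") or "").strip().strip('"')
    if configured:
        # An explicit setting wins, but only if the file actually exists.
        # Returning None here surfaces a typo instead of silently ignoring it.
        return configured if os.path.isfile(configured) else None
    # Otherwise rely on PATH, which is why a new shell is needed after installing.
    return shutil.which("tesseract")
```

With pytesseract, a resolved path can be applied by assigning it to pytesseract.pytesseract.tesseract_cmd before running OCR.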

Run the app from the project virtual environment:

.\.venv\Scripts\python.exe -m streamlit run app.py

The first semantic retrieval run may download the sentence-transformers/all-MiniLM-L6-v2 model. Tesseract OCR is optional but recommended for images and scanned PDFs. If it is not installed, the app still supports pasted text and selectable PDF text.

Validation

Run the dependency-light checks:

.\.venv\Scripts\python.exe -m unittest discover -s tests
.\.venv\Scripts\python.exe -m compileall study_assistant tests app.py
.\.venv\Scripts\python.exe scripts\check_setup.py

Run the Streamlit app:

.\.venv\Scripts\python.exe -m streamlit run app.py

The app displays component status in the sidebar:

- OCR is available only when local Tesseract is installed.
- Retrieval uses ChromaDB and MiniLM when the full requirements are installed; otherwise it falls back to an in-memory hashing index.
- Generation uses Groq when GROQ_API_KEY is configured; otherwise it shows source-grounded fallback study prompts.
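
To give a sense of what an in-memory hashing index can look like, here is a minimal sketch that needs only the standard library. It is an assumption about the general technique (feature hashing plus cosine similarity), not the project's actual fallback code; hash_vector and HashingIndex are hypothetical names.

```python
import hashlib
import math
import re


def hash_vector(text, dims=256):
    """Map text to a fixed-size, L2-normalized bag-of-words vector."""
    vec = [0.0] * dims
    for token in re.findall(r"[a-z0-9]+", text.lower()):
        # hashlib gives the same bucket on every run, unlike the built-in hash().
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


class HashingIndex:
    """Tiny in-memory index ranked by cosine similarity of hashed vectors."""

    def __init__(self, dims=256):
        self.dims = dims
        self.items = []  # list of (chunk, vector) pairs

    def add(self, chunk):
        self.items.append((chunk, hash_vector(chunk, self.dims)))

    def search(self, query, k=3):
        q = hash_vector(query, self.dims)
        # Vectors are unit length, so the dot product is the cosine similarity.
        scored = [
            (sum(a * b for a, b in zip(q, vec)), chunk) for chunk, vec in self.items
        ]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [(chunk, round(score, 3)) for score, chunk in scored[:k]]
```

An index like this matches on shared words rather than meaning, so it is a weaker stand-in for MiniLM embeddings, but it keeps the rest of the workflow usable without the heavier dependencies.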

Troubleshooting

If the app says PyMuPDF is not installed, it is running from the wrong Python environment. Stop any old Streamlit servers, then launch with the virtual environment command:

Get-CimInstance Win32_Process | Where-Object { $_.CommandLine -like '*streamlit*app.py*' -and $_.ProcessId -ne $PID } | ForEach-Object { Stop-Process -Id $_.ProcessId -Force }
.\.venv\Scripts\python.exe -m streamlit run app.py

If OCR still says unavailable after installing Tesseract, run:

.\.venv\Scripts\python.exe scripts\check_setup.py

Then set TESSERACT_CMD in .env if needed.
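
Both problems above usually come down to which interpreter is running and what it can see. The following is a hypothetical diagnostic sketch in that spirit; it is not the contents of scripts\check_setup.py, and the environment_report function and its keys are assumptions made for illustration.

```python
import importlib.util
import os
import shutil
import sys


def environment_report(env=None):
    """Collect facts that commonly explain a 'not installed' or 'unavailable' status."""
    env = os.environ if env is None else env
    return {
        # True when the interpreter is running inside a virtual environment.
        "in_virtualenv": sys.prefix != sys.base_prefix,
        "python": sys.executable,
        # PyMuPDF is importable as 'fitz'; newer releases also expose 'pymupdf'.
        "pymupdf_installed": any(
            importlib.util.find_spec(name) is not None for name in ("pymupdf", "fitz")
        ),
        "tesseract_on_path": shutil.which("tesseract") is not None,
        # Only report whether values are set, never the values themselves.
        "tesseract_cmd_set": bool(env.get("TESSERACT_CMD")),
        "groq_key_set": bool(env.get("GROQ_API_KEY")),
    }


if __name__ == "__main__":
    for name, value in environment_report().items():
        print(f"{name}: {value}")
```

If the reported python path does not point inside .venv, the app is being launched from a different interpreter than the one the requirements were installed into.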

Demo Material

Use sample_materials/sdev378_ai_study_notes.txt for a reliable local demo without needing external files.

Project Review Fit

- The interface is minimal but demo-ready.
- The app completes the intended study workflow end to end.
- The three AI components each do something distinct: OCR/extraction reads the material, embeddings retrieve the right parts, and the LLM creates study output from the retrieved evidence.
