Uniarch OCR & Answer Assessor 📝

A Streamlit application designed to process scanned handwritten answer sheets (PDF format), extract answers using the Qwen-VL model, merge multi-page answers, and provide AI-powered assessment and chat functionalities using Google Gemini.

(Add a screenshot of the running application here)

Features ✨

PDF Upload: Upload multi-page PDF answer sheets.
Image Conversion: Converts PDF pages to images using PyMuPDF.
Advanced OCR: Utilizes the Qwen/Qwen2.5-VL-7B-Instruct model for OCR, specifically tailored to extract structured answer data (number + text) based on predefined layout rules (delimiters, number boxes).
JSON Output: OCR process generates structured JSON output per page.
Answer Merging: Intelligently merges answer text that spans multiple pages based on "Continuation" markers identified during OCR.
Verification Tab: Allows users to view the original image and the raw/parsed OCR output for each page.
AI Assessment: Uses Google Gemini (gemini-1.5-flash) to assess the quality, clarity, and coherence of the extracted answer text.
AI Chat Assistant: Provides a chat interface powered by Google Gemini (gemini-1.5-pro) for asking questions about the extracted content or assessments.
GPU Accelerated: Leverages GPU for faster Qwen-VL model inference (torch, accelerate).
Memory Optimization: Uses float16 precision for the Qwen model to reduce memory footprint.
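The answer-merging step can be pictured with a small sketch. The README does not document the JSON schema the OCR step produces, so the field names below (`number`, `text`, `continuation`) and the function name `merge_answers` are assumptions for illustration, not the structure main.py actually uses.

```python
from typing import Dict, List, Optional


def merge_answers(pages: List[List[dict]]) -> Dict[str, str]:
    """Merge per-page OCR entries into one text per answer number.

    Assumed (hypothetical) entry shape:
        {"number": "3", "text": "...", "continuation": False}
    An entry flagged as a continuation is appended to the most recent answer.
    """
    merged: Dict[str, str] = {}
    last_number: Optional[str] = None
    for page in pages:
        for entry in page:
            text = entry.get("text", "").strip()
            if entry.get("continuation") and last_number is not None:
                # Continuation of an answer that started earlier.
                merged[last_number] = f"{merged[last_number]} {text}".strip()
                continue
            number = str(entry.get("number", "")).strip()
            if not number:
                # No number and nothing to continue: skip the fragment.
                continue
            if number in merged:
                # Same number seen again: treat it as more text for that answer.
                merged[number] = f"{merged[number]} {text}".strip()
            else:
                merged[number] = text
            last_number = number
    return merged


pages = [
    [{"number": "1", "text": "Photosynthesis converts light"}],
    [
        {"continuation": True, "text": "into chemical energy."},
        {"number": "2", "text": "Mitochondria produce ATP."},
    ],
]
result = merge_answers(pages)
```

Here the first entry on the second page carries the continuation flag, so its text is appended to answer 1 rather than starting a new answer.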

Setup and Installation ⚙️

Prerequisites

Python: 3.9+
pip: Package installer for Python.
Git: (Optional) For cloning the repository.
NVIDIA GPU: Required for running the Qwen-VL model efficiently.
CUDA Toolkit & cuDNN: Compatible versions installed for your NVIDIA driver and PyTorch.
Google Gemini API Key: You need an API key from Google AI Studio.
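Before launching the app, it can help to confirm that PyTorch actually sees the GPU. This is a minimal sketch; the helper name `gpu_available` is hypothetical and not part of main.py.

```python
def gpu_available() -> bool:
    """Return True if PyTorch is installed and reports a usable CUDA device."""
    try:
        import torch  # imported lazily so the check also runs when PyTorch is missing
    except ImportError:
        return False
    return bool(torch.cuda.is_available())


if __name__ == "__main__":
    print("CUDA GPU detected" if gpu_available() else "No usable CUDA GPU found")
```

If this reports no GPU even though one is installed, the usual suspects are a CPU-only PyTorch build or a driver and CUDA version mismatch.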

Installation Steps

Clone the Repository (Optional):
```
git clone <your-repo-url>
cd <your-repo-directory>
```
Alternatively, just place main.py and requirements.txt in a directory.

Create a Virtual Environment (Recommended):

```
python -m venv venv
source venv/bin/activate  # On Windows use `venv\Scripts\activate`
```

Install Dependencies:
```
pip install -r requirements.txt
```
Note: Installing PyTorch can take a while, and the correct build depends on your CUDA setup. Ensure the installed PyTorch build matches your CUDA version.

Configure API Key:

⚠️ Security Warning: The current code hardcodes the Google Gemini API key. This is highly insecure for shared or deployed applications.

Recommended Method (Streamlit Secrets):

Create a directory .streamlit in your project folder.
Inside .streamlit, create a file named secrets.toml.

Add your API key to secrets.toml:

```
# .streamlit/secrets.toml
GOOGLE_API_KEY="AIzaSy..."
```

Modify main.py to load the key using st.secrets:

```
# Replace the hardcoded key section in main.py
# Imports this snippet relies on
import streamlit as st
import google.generativeai as genai

try:
    # Attempt to load from secrets first
    api_key = st.secrets["GOOGLE_API_KEY"]
except Exception:
    # Fallback or error (remove hardcoded fallback for production)
    st.error("Google API Key not found in Streamlit secrets (.streamlit/secrets.toml)")
    api_key = None  # Or use a hardcoded key for local testing ONLY if necessary

if api_key:
    try:
        genai.configure(api_key=api_key)
        # list_models() returns a lazy iterator; fetch one item so a request is actually sent
        next(iter(genai.list_models()), None)
        st.session_state.api_key_configured = True
    except Exception as e:
        st.error(f"Gemini API configuration failed: {e}", icon="❌")
        st.session_state.api_key_configured = False
else:
    st.session_state.api_key_configured = False

# Remove the global HARDCODED_API_KEY variable and its usage
```

Alternative (Environment Variables): Set an environment variable GOOGLE_API_KEY and load it in Python using os.getenv("GOOGLE_API_KEY").
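A minimal sketch of the environment-variable approach follows. The helper name `load_api_key` is hypothetical; only the variable name `GOOGLE_API_KEY` comes from this README.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the Gemini API key from the environment, or None if unset or blank."""
    key = env.get("GOOGLE_API_KEY", "").strip()
    return key or None


api_key = load_api_key()
if api_key is None:
    print("GOOGLE_API_KEY is not set; Gemini features will be unavailable.")
```

Passing the mapping as a parameter keeps the function easy to exercise with a plain dict instead of the real process environment.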

Running the Application 🚀

Ensure your virtual environment is activated.
Make sure the API key is configured (preferably using secrets).
Run the Streamlit app:
```
streamlit run main.py
```
Open your web browser and navigate to the local URL provided by Streamlit (usually http://localhost:8501).

Docker Setup (GPU Required) 🐳

You can run this application inside a Docker container, leveraging GPU acceleration via the NVIDIA Container Toolkit.

Prerequisites

Docker: Install Docker Desktop or Docker Engine.
NVIDIA Container Toolkit: Install this to enable GPU access within Docker containers. See NVIDIA's installation guide for the toolkit.

Build the Docker Image

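```
docker build -f DOCKERFILE -t uniarch-ocr-assessor .
```
The repository's build file is named DOCKERFILE (all caps), so it is passed explicitly with -f. On case-sensitive file systems Docker otherwise looks for a file named Dockerfile.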

Run the Docker Container

Using Streamlit Secrets: Mount your .streamlit directory into the container.

```
docker run --gpus all -p 8501:8501 \
  -v "$(pwd)/.streamlit:/app/.streamlit" \
  uniarch-ocr-assessor
```

Using Environment Variables: Pass the API key as an environment variable.
```
docker run --gpus all -p 8501:8501 \
  -e GOOGLE_API_KEY="AIzaSy..." \
  uniarch-ocr-assessor
```
(Remember to modify main.py to read the key from os.getenv("GOOGLE_API_KEY") if using this method).

Access the application at http://localhost:8501 in your browser.

Key Technologies 🛠️

Streamlit: Web application framework.
Qwen-VL (Transformers): Vision-Language Model for OCR.
Google Gemini (google-generativeai): AI model for assessment and chat.
PyTorch: Deep learning framework (backend for Transformers).
PyMuPDF (fitz): PDF parsing and image conversion.
Pillow (PIL): Image manipulation.
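To show how PyMuPDF and Pillow fit together for the image-conversion step, here is a hedged sketch. The function names and the 200 DPI default are assumptions for illustration; the README does not state the resolution or the code main.py uses.

```python
import io


def dpi_to_zoom(dpi: int) -> float:
    """PDF page units are 1/72 inch, so the render zoom factor is dpi / 72."""
    return dpi / 72.0


def pdf_to_images(pdf_path: str, dpi: int = 200) -> list:
    """Render every page of a PDF to an RGB PIL image at the requested DPI."""
    import fitz  # PyMuPDF
    from PIL import Image

    zoom = dpi_to_zoom(dpi)
    images = []
    with fitz.open(pdf_path) as doc:
        for page in doc:
            # Scale the page uniformly in both directions before rasterizing.
            pix = page.get_pixmap(matrix=fitz.Matrix(zoom, zoom))
            png_bytes = pix.tobytes("png")
            images.append(Image.open(io.BytesIO(png_bytes)).convert("RGB"))
    return images
```

A higher DPI generally helps with small handwriting but increases memory use and inference time for the vision-language model.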

Configuration ⚙️

API Keys: Google Gemini API key (handle securely!).
Models:
- OCR: Qwen/Qwen2.5-VL-7B-Instruct
- Assessment: gemini-1.5-flash
- Chat: gemini-1.5-pro
- These are hardcoded in main.py but could be made configurable.
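One way the model names could be made configurable is sketched below. The environment variable names (`OCR_MODEL`, `ASSESSMENT_MODEL`, `CHAT_MODEL`) and the `ModelConfig` class are hypothetical; only the default model identifiers come from this README.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ModelConfig:
    # Defaults mirror the values listed in this README.
    ocr_model: str = "Qwen/Qwen2.5-VL-7B-Instruct"
    assessment_model: str = "gemini-1.5-flash"
    chat_model: str = "gemini-1.5-pro"


def load_model_config(env: Mapping[str, str] = os.environ) -> ModelConfig:
    """Build a ModelConfig, letting (hypothetical) env vars override the defaults."""
    defaults = ModelConfig()
    return ModelConfig(
        ocr_model=env.get("OCR_MODEL", defaults.ocr_model),
        assessment_model=env.get("ASSESSMENT_MODEL", defaults.assessment_model),
        chat_model=env.get("CHAT_MODEL", defaults.chat_model),
    )


config = load_model_config()
```

With this in place, running the container with something like `-e CHAT_MODEL=gemini-1.5-flash` would switch the chat model without editing main.py, provided main.py reads its model names from `config`.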
