VinodLouis/receipt-extractor

Receipt Extractor System

Lightweight receipt extraction service using NestJS, BullMQ, Redis, Ollama (LLM), Postgres and S3. This repo contains backend and frontend components and orchestration for local development.

Contents

  • backend: NestJS backend (API, queue consumers, WebSocket gateway)
  • frontend: React frontend (upload/list/details, realtime updates)
  • docker-compose.yml: local Redis & Ollama services

Architecture Overview

Layers:

  • Frontend (React): REST + WebSocket client
  • Backend (NestJS): Controllers, Services, WebSocket Gateway
  • Processing Layer: BullMQ queues, Redis cache, Ollama LLM
  • Storage Layer: Postgres (main), Redis (temp), S3 (images)

[Figure: Architecture Diagram]
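As a rough illustration of the Processing Layer's LLM step, the sketch below builds a request for Ollama's `/api/generate` endpoint and parses the reply defensively. The `ReceiptData` shape, the prompt text, and all function names are assumptions made for illustration; the backend's actual prompt and schema may differ.

```typescript
// Hypothetical receipt shape; the real schema lives in the backend and may differ.
interface ReceiptData {
  vendor: string | null;
  date: string | null;
  total: number | null;
}

// Body for POST {OLLAMA_HOST}/api/generate. Vision models accept base64 images
// in `images`; `format: "json"` asks Ollama to return valid JSON text.
function buildGenerateRequest(model: string, imageBase64: string) {
  return {
    model,
    prompt:
      'Extract vendor, date and total from this receipt. ' +
      'Reply as JSON with keys "vendor", "date", "total".',
    images: [imageBase64],
    format: "json",
    stream: false,
  };
}

// Parse the model's `response` string defensively: LLM output is untrusted.
function parseReceipt(responseText: string): ReceiptData | null {
  let raw: unknown;
  try {
    raw = JSON.parse(responseText);
  } catch {
    return null;
  }
  if (typeof raw !== "object" || raw === null || Array.isArray(raw)) return null;
  const r = raw as Record<string, unknown>;
  return {
    vendor: typeof r.vendor === "string" ? r.vendor : null,
    date: typeof r.date === "string" ? r.date : null,
    total: typeof r.total === "number" && Number.isFinite(r.total) ? r.total : null,
  };
}

// Sends one image to Ollama (Node 18+ provides a global fetch).
async function extractReceipt(
  host: string,
  model: string,
  imageBase64: string,
): Promise<ReceiptData | null> {
  const res = await fetch(`${host}/api/generate`, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildGenerateRequest(model, imageBase64)),
  });
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  const body = (await res.json()) as { response?: string };
  return parseReceipt(body.response ?? "");
}
```

Returning `null` for unparseable output, rather than throwing, would let a queue consumer mark the job as a failed extraction without crashing the worker; whether the backend does this is not stated here.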

Flow Diagram

[Figure: Flow Diagram]

Step-by-Step Installation

Prerequisites

  • Node.js 18+ and npm
  • Docker & docker-compose
  • PostgreSQL (local or remote)
  • AWS credentials (for S3) or local S3-compatible service
  • Ollama model (qwen2.5vl:7b) for image → structured data

1. Start Infrastructure Services

# Start the Redis and Ollama services
docker-compose up -d

# Verify services are running
docker-compose ps

2. Set Up Ollama and Pull the Vision Model

# Pull the qwen2.5vl:7b model inside the ollama-vision container
docker exec ollama-vision ollama pull qwen2.5vl:7b
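To confirm the pull succeeded, one option is to query Ollama's `GET /api/tags` endpoint, which lists locally available models. The following is a minimal sketch; the function names are illustrative, and the host assumes the container's port 11434 is published on localhost.

```typescript
// Relevant part of the response from Ollama's GET /api/tags.
interface TagsResponse {
  models?: { name: string }[];
}

// True when `model` appears in the list of locally available models.
function hasModel(tags: TagsResponse, model: string): boolean {
  return (tags.models ?? []).some((m) => m.name === model);
}

// Queries a running Ollama instance (Node 18+ provides a global fetch).
async function isModelPulled(host: string, model: string): Promise<boolean> {
  const res = await fetch(`${host}/api/tags`);
  if (!res.ok) throw new Error(`Ollama returned HTTP ${res.status}`);
  return hasModel((await res.json()) as TagsResponse, model);
}
```

For example, `isModelPulled("http://localhost:11434", "qwen2.5vl:7b")` should resolve to `true` once the pull has finished.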

3. Set Up Backend

3.1 Install Dependencies

cd backend
npm install

3.2 Configure Environment

nano .env

Required environment variables:

# Server Configuration
PORT=3000
NODE_ENV=development

# Database
DATABASE_URL=<POSTGRES_URL>

# Redis
REDIS_HOST=localhost
REDIS_PORT=6379

# AWS S3
AWS_REGION=eu-north-1
AWS_ACCESS_KEY_ID=<AWS_KEY>
AWS_SECRET_ACCESS_KEY=<ACCESS_SECRET>

# Ollama
OLLAMA_HOST=http://localhost:11434
OLLAMA_MODEL=qwen2.5vl:7b

# CORS
CORS_ORIGIN=http://localhost:5173

# Frontend variables (these belong in the frontend's .env; see frontend/.env.example)
VITE_API_URL=http://localhost:3000
VITE_WS_URL=ws://localhost:3000
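A missing variable tends to surface late, for example as a Redis or S3 error on the first request. A minimal sketch of a startup check follows; the function names and the choice of which variables count as required are assumptions, not part of the backend.

```typescript
// Variable names from the backend list above. Treating exactly these as
// required is an assumption; PORT, NODE_ENV and CORS_ORIGIN are left optional.
const REQUIRED_VARS = [
  "DATABASE_URL",
  "REDIS_HOST",
  "REDIS_PORT",
  "AWS_REGION",
  "AWS_ACCESS_KEY_ID",
  "AWS_SECRET_ACCESS_KEY",
  "OLLAMA_HOST",
  "OLLAMA_MODEL",
] as const;

type Env = Record<string, string | undefined>;

// Returns the names of required variables that are unset or blank.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => (env[name] ?? "").trim() === "");
}

// Throws one readable error instead of failing later at first use.
function assertEnv(env: Env): void {
  const missing = missingEnvVars(env);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(", ")}`);
  }
}
```

Calling `assertEnv(process.env)` before the application is created would stop the server from starting with an incomplete configuration.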

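3.3 Run Database Migrations

# Generate the Prisma Client from the schema
npx prisma generate
# Apply migrations to the development database (creates one if the schema changed)
npx prisma migrate dev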

3.4 Start Backend Server

# Development mode (with hot reload)
npm run start:dev

Backend should now be running at http://localhost:3000

It also exposes a queue monitoring dashboard at http://localhost:3000/queues

4. Set Up Frontend

4.1 Install Dependencies

cd frontend
npm install

4.2 Start Frontend Server

# Development mode
npm run dev

Frontend should now be running at http://localhost:5173

Running Tests

Backend Tests

cd backend

# Unit tests
npm run test

# E2E tests
npm run test:e2e

Frontend Tests

cd frontend

# Run tests
npm run test

Troubleshooting

  • Redis connection errors: verify that the redis Docker service is running (docker-compose ps) and that REDIS_HOST and REDIS_PORT point to it.
  • Bull worker not processing: confirm that the processor class is decorated with @Processor('extraction'), that its process method is decorated with @Process('process-receipt'), and that both names match the queue and job names used when enqueuing.
  • Queues missing in Bull Board: check the import order in AppModule (queues must be registered before Bull Board).
  • Ollama timeouts: ensure the Ollama service is healthy and the model has been pulled.
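For the Redis case, a quick way to separate a network problem from an application problem is a plain TCP probe against the configured host and port. The sketch below uses only Node's built-in `net` module; the helper names are illustrative, and the defaults mirror the sample values in the environment section.

```typescript
import * as net from "node:net";

type Env = Record<string, string | undefined>;

// Reads the Redis target from the environment; defaults mirror the sample .env.
function redisTarget(env: Env): { host: string; port: number } {
  const port = Number(env.REDIS_PORT ?? "6379");
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid REDIS_PORT: ${env.REDIS_PORT}`);
  }
  return { host: env.REDIS_HOST ?? "localhost", port };
}

// Resolves true if a TCP connection opens within the timeout, false otherwise.
// This only proves the port is reachable, not that Redis itself is healthy.
function canConnect(host: string, port: number, timeoutMs = 2000): Promise<boolean> {
  return new Promise((resolve) => {
    const socket = net.createConnection({ host, port });
    const finish = (ok: boolean) => {
      socket.destroy();
      resolve(ok);
    };
    socket.setTimeout(timeoutMs);
    socket.once("connect", () => finish(true));
    socket.once("timeout", () => finish(false));
    socket.once("error", () => finish(false));
  });
}
```

A typical call would be `const { host, port } = redisTarget(process.env)` followed by `await canConnect(host, port)`. The same probe works for the Ollama port.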

Project TODO / Improvements

  • Persist raw LLM responses for traceability
  • Improve model configuration (prompt/params) and graceful shutdown of LLM connections
  • Add Playwright/Cypress E2E tests for the full flow
  • Optional: Support pluggable LLM backends and model choice
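One possible shape for the pluggable-backend item is a small interface plus a registry keyed by backend name. This is purely a design sketch: `LlmBackend`, `BackendRegistry`, and their methods are hypothetical names and do not exist in the repo.

```typescript
// Hypothetical contract that every LLM backend would implement.
interface LlmBackend {
  readonly name: string;
  // Returns the raw model output so callers can persist it before parsing.
  extract(imageBase64: string): Promise<string>;
}

// Keeps backends addressable by name, e.g. selected via a config value.
class BackendRegistry {
  private readonly backends = new Map<string, LlmBackend>();

  register(backend: LlmBackend): void {
    if (this.backends.has(backend.name)) {
      throw new Error(`Backend already registered: ${backend.name}`);
    }
    this.backends.set(backend.name, backend);
  }

  get(name: string): LlmBackend {
    const backend = this.backends.get(name);
    if (!backend) throw new Error(`Unknown LLM backend: ${name}`);
    return backend;
  }

  names(): string[] {
    return Array.from(this.backends.keys());
  }
}
```

Having `extract` return the raw string would also fit the traceability item above, since the unparsed response is available to store before any validation runs.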

Sample Flow

  1. Initial email form (mimics authentication) [Screenshot: Email Form]

  2. Form with a valid email [Screenshot: Email Form Valid]

  3. Initial landing screen [Screenshot: Landing Page]

  4. Add new extraction [Screenshot: Add New Extraction]

  5. Invalid image uploaded [Screenshot: Invalid Image Upload]

  6. Valid image uploaded [Screenshot: Valid Image Upload]

  7. List while extraction is in progress [Screenshot: List While Extraction Running]

  8. Modal while extraction is in progress [Screenshot: Modal While Extraction Running]

  9. Real-time updates [Screenshot: Real-Time Update]

  10. Successful extraction response [Screenshot: Successful Response]

  11. Delete option [Screenshot: Delete Option]

  12. Delete confirmation [Screenshot: Delete Confirm]

  13. Invalid image [Screenshot: Invalid Image]


License

  • MIT

About

Receipt Extraction App that can take any receipt as input, intelligently extract its contents, and display them in a structured format.
