Tech stack: Node.js, Express, TypeScript, Redis, BullMQ, Zod, Docker.
- Upload a document (PDF, JPEG, PNG) via HTTP.
- Enqueue async processing with BullMQ (OCR -> Extract -> Validate -> Persist).
- Maintain per-document status in Redis: uploaded, processing, processed, validated, validation_failed, done, failed.
- Persist raw file on disk and metadata in Redis.
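
The status values above imply a small state machine. A minimal sketch in TypeScript follows, assuming transitions inferred from the worker steps described later in this README. The `allowedTransitions` table and `canTransition` helper are hypothetical names, not part of the project.

```typescript
// Status values as listed in this README.
type DocumentStatus =
  | "uploaded"
  | "processing"
  | "processed"
  | "validated"
  | "validation_failed"
  | "done"
  | "failed";

// Assumed transitions, inferred from the worker steps; not confirmed by the source.
const allowedTransitions: Record<DocumentStatus, readonly DocumentStatus[]> = {
  uploaded: ["processing"],
  processing: ["processed", "failed"],
  processed: ["validated", "validation_failed"],
  validated: ["done"],
  validation_failed: ["processing", "failed"], // retry, or give up after the last attempt
  done: [],
  failed: [],
};

// Returns true when moving from `from` to `to` is allowed by the table above.
function canTransition(from: DocumentStatus, to: DocumentStatus): boolean {
  return allowedTransitions[from].includes(to);
}
```

A guard like `canTransition(current, next)` could be checked before writing a new status to Redis, so that a late or duplicate job cannot move a finished document backwards.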
Docker (recommended):
- Build and start services
  docker compose build
  docker compose up
- API available at http://localhost:3000
- Upload a file
  curl -F "document=@assets/untitled.pdf" http://localhost:3000/upload
- List docs (IDs)
  curl http://localhost:3000/documents
- List summaries
  curl "http://localhost:3000/documents?summary=1"
- Get a document by id
  curl http://localhost:3000/documents/<id>
Env vars:
- PORT: API port (default 3000)
- REDIS_URL: Redis connection URL, e.g. redis://localhost:6379 (Docker Compose sets it to redis://redis:6379)
- STORAGE_DIR: local file storage dir (default ./storage)
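
The variables above could be read into a typed config object at startup. This is a sketch only: `AppConfig` and `loadConfig` are hypothetical names, and the fallback for REDIS_URL is an assumption (the README gives that URL as an example, not as a documented default).

```typescript
interface AppConfig {
  port: number;
  redisUrl: string;
  storageDir: string;
}

// Builds the config from an environment-like map, applying the documented defaults.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  const port = Number(env.PORT ?? "3000"); // documented default: 3000
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid PORT: ${env.PORT}`);
  }
  return {
    port,
    // Assumption: fall back to the example URL when REDIS_URL is unset.
    redisUrl: env.REDIS_URL ?? "redis://localhost:6379",
    storageDir: env.STORAGE_DIR ?? "./storage", // documented default: ./storage
  };
}
```

In a Node.js entry point this would typically be called once as `loadConfig(process.env)`, so a bad PORT fails fast instead of surfacing later.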
- GET /health
  - returns: { success, redis }
- POST /upload
  - form-data field: document (file)
  - returns: { documentId, status, createdAt }
- GET /documents
  - returns: { ids: string[] }
- GET /documents?summary=1
  - returns: { documents: { id, status, originalFilename, size }[] }
- GET /documents/:id
  - returns: all metadata for a document (status, filenames, metadata when available)
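
For callers who prefer TypeScript over curl, the endpoints above can be wrapped in a small client. This is a hedged sketch that assumes Node.js 18+ (global `fetch`, `FormData`, `Blob`). The helper names are hypothetical, and the field types (for example `createdAt` as a string) are guesses, since the README lists only field names.

```typescript
// Response shapes taken from the endpoint list; field types are assumptions.
interface UploadResponse {
  documentId: string;
  status: string;
  createdAt: string;
}

interface DocumentSummary {
  id: string;
  status: string;
  originalFilename: string;
  size: number;
}

// Builds the URL for GET /documents, GET /documents?summary=1, or GET /documents/:id.
function documentsUrl(baseUrl: string, opts: { id?: string; summary?: boolean } = {}): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  if (opts.id !== undefined) {
    return `${root}/documents/${encodeURIComponent(opts.id)}`;
  }
  return opts.summary ? `${root}/documents?summary=1` : `${root}/documents`;
}

// POST /upload with the file in the "document" form field.
async function uploadDocument(baseUrl: string, file: Blob, filename: string): Promise<UploadResponse> {
  const form = new FormData();
  form.append("document", file, filename);
  const res = await fetch(`${baseUrl.replace(/\/+$/, "")}/upload`, { method: "POST", body: form });
  if (!res.ok) {
    throw new Error(`Upload failed with HTTP ${res.status}`);
  }
  return (await res.json()) as UploadResponse;
}

// GET /documents?summary=1 and unwrap the "documents" array.
async function listSummaries(baseUrl: string): Promise<DocumentSummary[]> {
  const res = await fetch(documentsUrl(baseUrl, { summary: true }));
  if (!res.ok) {
    throw new Error(`Listing failed with HTTP ${res.status}`);
  }
  const body = (await res.json()) as { documents: DocumentSummary[] };
  return body.documents;
}
```

Because processing is asynchronous, a caller would usually upload, keep the returned `documentId`, and then poll `documentsUrl(base, { id })` until the status reaches done or failed.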
- Upload: saves the file to STORAGE_DIR and creates a Redis hash (key prefixed with doc:) with status "uploaded".
- Queue: adds a job with attempts=3 and exponential backoff.
- Worker steps:
  - processing -> simulateOCR -> processed
  - extractInvoiceData -> validate (Zod)
  - if invalid: validation_failed + save metadata + job fails (will retry)
  - if valid: validated -> save metadata -> done
  - on ultimate failure: status set to failed and job moved to dead-letter queue
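
The flow above can be traced without Redis or BullMQ using a small simulation. This is a sketch under stated assumptions, not the project's worker: `PipelineDeps`, `processOnce`, `backoffDelay`, `runWithRetries`, and the `Invoice` shape are hypothetical. The `validate` function stands in for a Zod schema check, and the 1000 ms base delay is invented for illustration. The delay formula mirrors BullMQ's documented exponential backoff (base delay times 2 to the power of attempts made minus one), which BullMQ applies when a job is added with the `attempts` and `backoff` options.

```typescript
type DocumentStatus =
  | "uploaded" | "processing" | "processed"
  | "validated" | "validation_failed" | "done" | "failed";

// Hypothetical invoice shape; the real extracted fields are not documented here.
interface Invoice { invoiceNumber: string; total: number }

interface PipelineDeps {
  simulateOCR: (documentId: string) => string;
  extractInvoiceData: (text: string) => unknown;
  validate: (data: unknown) => data is Invoice; // stand-in for a Zod schema check
  setStatus: (status: DocumentStatus) => void;  // stand-in for the Redis status write
  saveMetadata: (data: unknown) => void;        // stand-in for the Redis metadata write
}

// One job attempt. Throws on invalid data so the queue can schedule a retry.
function processOnce(documentId: string, deps: PipelineDeps): void {
  deps.setStatus("processing");
  const text = deps.simulateOCR(documentId);
  deps.setStatus("processed");
  const data = deps.extractInvoiceData(text);
  if (!deps.validate(data)) {
    deps.setStatus("validation_failed");
    deps.saveMetadata(data);
    throw new Error("validation failed");
  }
  deps.setStatus("validated");
  deps.saveMetadata(data);
  deps.setStatus("done");
}

// Delay before the retry that follows failed attempt number `attemptsMade` (1-based).
function backoffDelay(attemptsMade: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attemptsMade - 1);
}

// Simulates the retry policy and returns the delays that would be waited between attempts.
function runWithRetries(
  documentId: string,
  deps: PipelineDeps,
  attempts = 3,       // from the queue description above
  baseDelayMs = 1000, // assumption: the real base delay is not documented
): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt <= attempts; attempt++) {
    try {
      processOnce(documentId, deps);
      return delays;
    } catch {
      if (attempt === attempts) {
        // Ultimate failure: a real worker would also move the job to the dead-letter queue.
        deps.setStatus("failed");
        return delays;
      }
      delays.push(backoffDelay(attempt, baseDelayMs));
    }
  }
  return delays;
}
```

With a validator that always rejects, the recorded statuses cycle through processing, processed, validation_failed three times and end in failed. Note that validation failures on unchanged input will fail identically on every retry, so retries mainly help with transient errors in the OCR or persistence steps.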