📘 CreditExplain RAG

Evidence-backed, auditable explanations for credit decisions — with citations, PII-safety, and regulator-ready audit trails.

Built for the NSK.AI RAG Hackathon 2025, CreditExplain RAG helps compliance officers and analysts quickly answer:

"Why was this loan declined?"
"Which clause justifies KYC step X?"

The system integrates advanced RAG techniques (SELF-RAG critic loop, reranking, provenance logging, PII redaction) to provide trustworthy, auditable explanations.

Try the system here: https://credit-explain.vercel.app

📋 Table of Contents

Project Overview
Tech Stack
Installation & Setup
- Pre-requisites
- Setup Instructions
Basic Usage
Repository Structure
Known Issues
Future Development
Acknowledgement
Contact Information
License

✨ Project Overview

CreditExplain RAG addresses the critical need for transparent, evidence-based explanations in financial compliance and credit decisioning. Traditional AI systems often provide "black box" responses without verifiable sources, making them unsuitable for regulated environments.

Our solution provides:

Regulatory Compliance: Every explanation cites specific clauses from authoritative documents
Audit Trail: Complete provenance tracking with reflection token scoring
PII Protection: Automatic redaction of sensitive personal information
Multi-jurisdictional Support: Handling of Nigerian, Kenyan, and global financial regulations

🛠️ Tech Stack

Backend

FastAPI - Modern Python web framework
LangChain - RAG orchestration and tooling
ChromaDB - Vector database for document storage
HuggingFace Transformers - Embeddings and reranking models
Groq API - LLM inference for critic and generator components
Pydantic - Data validation and serialization

Frontend

React 18 - User interface library
TypeScript - Type-safe development
Vite - Build tool and dev server
Tailwind CSS - Utility-first CSS framework
TanStack Query - Server state management
Axios - HTTP client for API communication

Machine Learning

SELF-RAG Architecture - Adaptive retrieval with reflection tokens
sentence-transformers/all-MiniLM-L6-v2 - Embedding model
cross-encoder/ms-marco-MiniLM-L-6-v2 - Reranker model
Llama 3 - LLMs for generation and critique

🚀 Installation & Setup

Pre-requisites

Python 3.9+ installed on your system
Node.js 18+ and npm for frontend development
Groq API account and API key for LLM access
Git for version control

Setup Instructions

1. Clone & Install Backend Dependencies

git clone https://github.com/Maina314159/CreditExplain
cd CreditExplain

# Create virtual environment
python -m venv .venv
source .venv/bin/activate  # On Windows: venv\Scripts\activate

# Install Python dependencies
pip install -r requirements.txt

2. Configure Environment Variables

# Copy and update the environment template
cp .env.example .env

Edit .env with your configuration:

GROQ_API_KEY=your_groq_api_key_here

For frontend, also create an .env with the backend server added to it

# Go to frontend and create .env file
cd frontend
cp .env.example .env

# Then add backend server
VITE_BACKEND_URL=http://localhost:8000

3. Setup Frontend

cd frontend
npm install

4. Ingest Sample Documents

cd ..
python -m ingest.index

5. Run the Application

Start Backend API:

python -m api.app
# API server starts at http://localhost:8000

Start Frontend (in new terminal):

cd frontend
npm run dev
# Frontend starts at http://localhost:5173

CLI Application

You can also choose to interact with the terminal app:

python -m core.self_rag

💡 Basic Usage

Access the Web Interface: Open http://localhost:5173 in your browser
Upload Documents: Navigate to the Upload page to add regulatory PDFs
- Supported formats: PDF, text documents
- Documents are automatically chunked and indexed
Ask Questions: Use the Query interface to ask compliance questions like:
- "What are the capital requirements for banks in Nigeria?"
- "What are the financial regulations in Kenya"
- "What documents are required for KYC verification?"
Review Results: Each response includes:
- Evidence-backed explanations with citations
- Confidence scores (HIGH, MEDIUM, LOW)
- Source document references with exact excerpts
- Suggested follow-up questions
Monitor Performance: Check the Metrics dashboard for system performance and audit logs

📁 Repository Structure

.
├── .github/                 # GitHub templates and workflows
├── api/                     # FastAPI backend application
│   ├── app.py               # Main FastAPI application with CORS
│   ├── models.py            # Pydantic models for request/response
│   └── __init__.py
├── core/                    # RAG pipeline core components
│   ├── self_rag.py          # Main SELF-RAG orchestration logic
│   ├── critic.py            # Critic model for retrieval decisions
│   ├── generator.py         # Response generation component
│   ├── retrieval.py         # Vector retrieval functionality
│   ├── reranker.py          # Cross-encoder reranking
│   ├── prompts.py           # LLM prompt templates
│   ├── provenance.py        # Audit logging and provenance tracking
│   └── __init__.py
├── data/                    # Document storage
│   ├── raw/                 # Original PDF documents
│   └── interim/             # Processed data files
├── frontend/                # React TypeScript frontend
│   ├── src/
│   │   ├── api/             # API client and hooks
│   │   ├── components/      # Reusable UI components
│   │   ├── pages/           # Main application pages
│   │   ├── hooks/           # Custom React hooks
│   │   ├── types/           # TypeScript type definitions
│   │   └── utils/           # Utility functions
│   ├── package.json
│   ├── vite.config.ts
│   └── tsconfig.json
├── ingest/                  # Document ingestion pipeline
│   ├── loader.py            # PDF loading and parsing
│   ├── chunker.py           # Text chunking strategies
│   ├── index.py             # Vector indexing process
│   ├── normalize.py         # Text normalization
│   └── __init__.py
├── eval/                    # Evaluation and metrics
│   ├── metrics.py           # Performance metrics calculation
│   └── __init__.py
├── tests/                   # Test suites
│   ├── unit/                # Unit tests
│   ├── integration/         # Integration tests
│   ├── demo_data/           # Test data
│   └── scripts/             # Test scripts
├── requirements.txt         # Python dependencies
└── README.md               # This file

⚠️ Known Issues

Current Limitations

PDF Parsing Accuracy: Complex PDF layouts with tables and multi-column formats may not parse perfectly
Rate Limiting: Groq API has rate limits that may affect performance during high usage
Context Length: Currently limited to ~1000 token chunks due to model constraints
Metadata Extraction: Some document metadata (section headers, page numbers) may not be fully preserved

Performance Considerations

Initial document ingestion can be slow for large PDF collections
Real-time query processing typically takes 5-15 seconds depending on complexity
Vector search performance degrades with very large document collections (>10,000 chunks)

Browser Compatibility

Best experienced in modern browsers (Chrome, Firefox, Safari, Edge)
Mobile experience is functional but optimized for desktop use

🔮 Future Development

Planned Features

Real-time Collaboration: Multi-user support with shared workspaces
Advanced Document Types: Support for Word documents, HTML, and scanned PDFs
Custom Model Support: Integration with local LLMs and embedding models
Enhanced Analytics: Advanced dashboard with trend analysis and compliance reporting
API Extensions: Webhook support and third-party integrations

Research Directions

Improved chunking strategies for legal and regulatory documents
Multi-hop reasoning across multiple documents
Automated regulatory change detection and alerting
Cross-jurisdictional compliance mapping

🙏 Acknowledgement

This project was developed for the NSK.AI RAG Hackathon 2025 and builds upon several open-source technologies and research:

SELF-RAG Paper (Asai et al., 2023) for the adaptive retrieval framework
LangChain and LangSmith for RAG orchestration tools
HuggingFace for transformer models and embeddings
FastAPI for the high-performance backend framework
React Query for efficient server state management

Special thanks to the regulatory bodies whose documents made this system possible:

Central Bank of Nigeria (CBN)
Central Bank of Kenya (CBK)
Financial Action Task Force (FATF)

📞 Contact Information

For questions, issues, or contributions:

Project Maintainer: CreditExplain Team

GitHub Issues: Create an issue

We welcome bug reports, feature requests, and contributions from the community!

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

Built with ❤️ for the NSK.AI RAG Hackathon 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

📘 CreditExplain RAG

📋 Table of Contents

✨ Project Overview

🛠️ Tech Stack

Backend

Frontend

Machine Learning

🚀 Installation & Setup

Pre-requisites

Setup Instructions

1. Clone & Install Backend Dependencies

2. Configure Environment Variables

3. Setup Frontend

4. Ingest Sample Documents

5. Run the Application

💡 Basic Usage

📁 Repository Structure

⚠️ Known Issues

Current Limitations

Performance Considerations

Browser Compatibility

🔮 Future Development

Planned Features

Research Directions

🙏 Acknowledgement

📞 Contact Information

📜 License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 66 Commits
.github		.github
api		api
core		core
data		data
eval		eval
frontend		frontend
ingest		ingest
tests		tests
.env.example		.env.example
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

📘 CreditExplain RAG

📋 Table of Contents

✨ Project Overview

🛠️ Tech Stack

Backend

Frontend

Machine Learning

🚀 Installation & Setup

Pre-requisites

Setup Instructions

1. Clone & Install Backend Dependencies

2. Configure Environment Variables

3. Setup Frontend

4. Ingest Sample Documents

5. Run the Application

💡 Basic Usage

📁 Repository Structure

⚠️ Known Issues

Current Limitations

Performance Considerations

Browser Compatibility

🔮 Future Development

Planned Features

Research Directions

🙏 Acknowledgement

📞 Contact Information

📜 License

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages