Evidence-backed, auditable explanations for credit decisions, with citations, PII safety, and regulator-ready audit trails.
CreditExplain RAG helps compliance officers and analysts quickly answer:
- "Why was this loan declined?"
- "Which clause justifies KYC step X?"
The system integrates advanced RAG techniques (SELF-RAG critic loop, reranking, provenance logging, PII redaction) to provide trustworthy, auditable explanations.
Try the system here: https://credit-explain.vercel.app
- Project Overview
- Tech Stack
- Installation & Setup
- Basic Usage
- Repository Structure
- Known Issues
- Future Development
- Acknowledgement
- Contact Information
- License
CreditExplain RAG addresses the critical need for transparent, evidence-based explanations in financial compliance and credit decisioning. Traditional AI systems often provide "black box" responses without verifiable sources, making them unsuitable for regulated environments.
Our solution provides:
- Regulatory Compliance: Every explanation cites specific clauses from authoritative documents
- Audit Trail: Complete provenance tracking with reflection token scoring
- PII Protection: Automatic redaction of sensitive personal information
- Multi-jurisdictional Support: Handling of Nigerian, Kenyan, and global financial regulations
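As a rough illustration of the PII protection point above, the following is a minimal sketch of pattern-based redaction. It is not the project's actual redaction code: the `PII_PATTERNS` table and the `redact` function are hypothetical names, and the two regular expressions only cover simple email and phone formats.

```python
import re

# Hypothetical patterns for illustration only; production redaction would need
# broader coverage (names, account numbers, national ID formats, and so on).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "PHONE": re.compile(r"\+?\d[\d\s-]{8,}\d"),
}


def redact(text: str) -> str:
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

Running redaction before text is logged or sent to an LLM keeps raw identifiers out of both the prompt and the audit trail.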
1. Description and Purpose: Banks and fintechs use AI models to assess creditworthiness by analyzing transaction history, repayment behavior, mobile data, and alternative data sources. The goal is faster, more accurate lending decisions.
2. Ethical Risks:
   - Bias and Financial Exclusion: Marginalized groups may be unfairly denied credit due to biased data.
   - Lack of Explainability: Applicants often don't understand why loans are rejected.
   - Privacy Concerns: Use of alternative data (e.g., phone usage) can be intrusive.
3. Responsible AI Mitigations: This is where CreditExplain comes in!
   - Fairness testing across demographics
   - Explainable credit decisions (reason codes)
   - Strict limits on data sources and informed consent
- FastAPI - Modern Python web framework
- LangChain - RAG orchestration and tooling
- ChromaDB - Vector database for document storage
- HuggingFace Transformers - Embeddings and reranking models
- Groq API - LLM inference for critic and generator components
- Pydantic - Data validation and serialization
- React 18 - User interface library
- TypeScript - Type-safe development
- Vite - Build tool and dev server
- Tailwind CSS - Utility-first CSS framework
- TanStack Query - Server state management
- Axios - HTTP client for API communication
- SELF-RAG Architecture - Adaptive retrieval with reflection tokens
- sentence-transformers/all-MiniLM-L6-v2 - Embedding model
- cross-encoder/ms-marco-MiniLM-L-6-v2 - Reranker model
- Llama 3 - LLMs for generation and critique
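To make the interplay of retrieval, reranking, generation, and critique more concrete, here is a simplified sketch of a critic-gated loop in the spirit of SELF-RAG. It is an assumption-laden illustration, not the code in `core/`: the function name, the `Draft` type, the thresholds, and all four callables (`retrieve`, `rerank_score`, `generate`, `critique`) are hypothetical stand-ins for the vector store, cross-encoder, and LLM calls.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Draft:
    answer: str
    evidence: List[str]
    support: float  # critic score in [0, 1]: how well the evidence backs the answer


def answer_with_critic(
    query: str,
    retrieve: Callable[[str, int], List[str]],        # vector search stand-in
    rerank_score: Callable[[str, str], float],        # cross-encoder stand-in
    generate: Callable[[str, List[str]], str],        # generator LLM stand-in
    critique: Callable[[str, str, List[str]], float], # critic LLM stand-in
    top_k: int = 3,
    min_support: float = 0.7,
    max_rounds: int = 2,
) -> Draft:
    """Retrieve, rerank, generate, then let the critic accept or trigger a retry."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    best: Optional[Draft] = None
    fetch = top_k * 2  # over-fetch so the reranker has candidates to choose from
    for _ in range(max_rounds):
        candidates = retrieve(query, fetch)
        ranked = sorted(candidates, key=lambda c: rerank_score(query, c), reverse=True)
        evidence = ranked[:top_k]
        answer = generate(query, evidence)
        support = critique(query, answer, evidence)
        if best is None or support > best.support:
            best = Draft(answer, evidence, support)
        if support >= min_support:
            break  # critic is satisfied; stop early
        fetch *= 2  # otherwise widen retrieval and try again
    return best
```

Passing the components in as callables keeps the control flow testable without network access or model downloads.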
- Python 3.9+ installed on your system
- Node.js 18+ and npm for frontend development
- Groq API account and API key for LLM access
- Git for version control
git clone https://github.com/Maina314159/CreditExplain
cd CreditExplain
# Create virtual environment
python -m venv .venv
source .venv/bin/activate # On Windows: .venv\Scripts\activate
# Install Python dependencies
pip install -r requirements.txt
# Copy and update the environment template
cp .env.example .env
Edit .env with your configuration:
GROQ_API_KEY=your_groq_api_key_here
- For the frontend, also create a .env file that points to the backend server:
# Go to frontend and create .env file
cd frontend
cp .env.example .env
# Then add the backend server URL to frontend/.env
VITE_BACKEND_URL=http://localhost:8000
# Install frontend dependencies (still inside frontend/)
npm install
# Return to the repository root and build the document index
cd ..
python -m ingest.index
Start Backend API:
python -m api.app
# API server starts at http://localhost:8000
Start Frontend (in new terminal):
cd frontend
npm run dev
# Frontend starts at http://localhost:5173
CLI Application
- You can also choose to interact with the terminal app:
python -m core.self_rag
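Since the setup depends on two `.env` files, a small pre-flight check can catch a missing or placeholder value before the servers start. The sketch below is a hypothetical helper, not part of the repository: `parse_env`, `missing_keys`, and `PLACEHOLDERS` are illustrative names, and the parser only handles simple `KEY=value` lines.

```python
from typing import Dict, Iterable, List

# Template values that should not be treated as real configuration.
PLACEHOLDERS = {"your_groq_api_key_here"}


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_keys(env_text: str, required: Iterable[str]) -> List[str]:
    """Return the required keys that are absent, empty, or still placeholders."""
    values = parse_env(env_text)
    return [
        key for key in required
        if not values.get(key) or values[key] in PLACEHOLDERS
    ]
```

For example, `missing_keys(open(".env").read(), ["GROQ_API_KEY"])` returning an empty list suggests the backend configuration is filled in; the same check with `["VITE_BACKEND_URL"]` applies to `frontend/.env`.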
- Access the Web Interface: Open http://localhost:5173 in your browser
- Upload Documents: Navigate to the Upload page to add regulatory PDFs
  - Supported formats: PDF, text documents
  - Documents are automatically chunked and indexed
- Ask Questions: Use the Query interface to ask compliance questions like:
  - "What are the capital requirements for banks in Nigeria?"
  - "What are the financial regulations in Kenya?"
  - "What documents are required for KYC verification?"
- Review Results: Each response includes:
  - Evidence-backed explanations with citations
  - Confidence scores (HIGH, MEDIUM, LOW)
  - Source document references with exact excerpts
  - Suggested follow-up questions
- Monitor Performance: Check the Metrics dashboard for system performance and audit logs
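The response contents listed above could be modeled roughly as follows. This is a sketch under assumptions, not the schema in `api/models.py`: the class and field names are hypothetical, plain dataclasses are used instead of Pydantic to keep the example dependency-free, and the thresholds that map a numeric score to HIGH, MEDIUM, or LOW are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Citation:
    document: str  # name of the source document
    excerpt: str   # exact text quoted from that document


@dataclass
class Explanation:
    answer: str
    confidence: str  # "HIGH", "MEDIUM", or "LOW"
    citations: List[Citation] = field(default_factory=list)
    follow_ups: List[str] = field(default_factory=list)


def confidence_label(score: float, high: float = 0.75, medium: float = 0.5) -> str:
    """Map a score in [0, 1] to a label; the cut-offs are assumed, not the project's."""
    if score >= high:
        return "HIGH"
    if score >= medium:
        return "MEDIUM"
    return "LOW"
```

Keeping citations as structured objects rather than inline text makes it straightforward for a UI or auditor to check each excerpt against its source.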
.
├── .github/                # GitHub templates and workflows
├── api/                    # FastAPI backend application
│   ├── app.py              # Main FastAPI application with CORS
│   ├── models.py           # Pydantic models for request/response
│   └── __init__.py
├── core/                   # RAG pipeline core components
│   ├── self_rag.py         # Main SELF-RAG orchestration logic
│   ├── critic.py           # Critic model for retrieval decisions
│   ├── generator.py        # Response generation component
│   ├── retrieval.py        # Vector retrieval functionality
│   ├── reranker.py         # Cross-encoder reranking
│   ├── prompts.py          # LLM prompt templates
│   ├── provenance.py       # Audit logging and provenance tracking
│   └── __init__.py
├── data/                   # Document storage
│   ├── raw/                # Original PDF documents
│   └── interim/            # Processed data files
├── frontend/               # React TypeScript frontend
│   ├── src/
│   │   ├── api/            # API client and hooks
│   │   ├── components/     # Reusable UI components
│   │   ├── pages/          # Main application pages
│   │   ├── hooks/          # Custom React hooks
│   │   ├── types/          # TypeScript type definitions
│   │   └── utils/          # Utility functions
│   ├── package.json
│   ├── vite.config.ts
│   └── tsconfig.json
├── ingest/                 # Document ingestion pipeline
│   ├── loader.py           # PDF loading and parsing
│   ├── chunker.py          # Text chunking strategies
│   ├── index.py            # Vector indexing process
│   ├── normalize.py        # Text normalization
│   └── __init__.py
├── eval/                   # Evaluation and metrics
│   ├── metrics.py          # Performance metrics calculation
│   └── __init__.py
├── tests/                  # Test suites
│   ├── unit/               # Unit tests
│   ├── integration/        # Integration tests
│   ├── demo_data/          # Test data
│   └── scripts/            # Test scripts
├── requirements.txt        # Python dependencies
└── README.md               # This file
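The audit logging role of `core/provenance.py` in the tree above can be pictured with a small append-only log. The sketch below is an assumption about what such a record might hold, not the module's real interface: `provenance_record`, `append_record`, and every field name are hypothetical. Hashing the query and answer instead of storing them verbatim is one possible way to keep PII out of the log.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict, List


def _sha256(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def provenance_record(
    query: str,
    answer: str,
    chunk_ids: List[str],
    reflection_scores: Dict[str, float],
) -> dict:
    """Build one audit entry linking an answer to the evidence chunks it used."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "query_sha256": _sha256(query),
        "answer_sha256": _sha256(answer),
        "chunk_ids": list(chunk_ids),
        "reflection_scores": dict(reflection_scores),
    }


def append_record(path: str, record: dict) -> None:
    """Append the record as one JSON line so earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

A JSON Lines file is easy to tail, diff, and ship to a log store; a real deployment would likely add access controls and tamper detection on top.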
- PDF Parsing Accuracy: Complex PDF layouts with tables and multi-column formats may not parse perfectly
- Rate Limiting: Groq API has rate limits that may affect performance during high usage
- Context Length: Chunks are currently limited to ~1000 tokens due to model constraints
- Metadata Extraction: Some document metadata (section headers, page numbers) may not be fully preserved
- Initial document ingestion can be slow for large PDF collections
- Real-time query processing typically takes 5-15 seconds depending on complexity
- Vector search performance degrades with very large document collections (>10,000 chunks)
- Best experienced in modern browsers (Chrome, Firefox, Safari, Edge)
- The mobile experience is functional, but the interface is optimized for desktop use
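The chunk-size limit noted in the list above can be illustrated with a simple overlapping splitter. This is a sketch, not the logic in `ingest/chunker.py`: the function name and the overlap default are hypothetical, and whitespace-separated words are used as a rough stand-in for model tokens, which a real tokenizer would count differently.

```python
from typing import List


def chunk_words(text: str, max_tokens: int = 1000, overlap: int = 100) -> List[str]:
    """Split text into chunks of at most max_tokens words, sharing `overlap` words."""
    if max_tokens <= 0:
        raise ValueError("max_tokens must be positive")
    if not 0 <= overlap < max_tokens:
        raise ValueError("overlap must be in [0, max_tokens)")
    words = text.split()
    step = max_tokens - overlap  # how far the window advances each time
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

The overlap means a clause that straddles a chunk boundary still appears intact in at least one chunk, at the cost of indexing some text twice.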
- Real-time Collaboration: Multi-user support with shared workspaces
- Advanced Document Types: Support for Word documents, HTML, and scanned PDFs
- Custom Model Support: Integration with local LLMs and embedding models
- Enhanced Analytics: Advanced dashboard with trend analysis and compliance reporting
- API Extensions: Webhook support and third-party integrations
- Improved chunking strategies for legal and regulatory documents
- Multi-hop reasoning across multiple documents
- Automated regulatory change detection and alerting
- Cross-jurisdictional compliance mapping
This project was developed for the NSK.AI RAG Hackathon 2025 and builds upon several open-source technologies and research:
- SELF-RAG Paper (Asai et al., 2023) for the adaptive retrieval framework
- LangChain and LangSmith for RAG orchestration tools
- HuggingFace for transformer models and embeddings
- FastAPI for the high-performance backend framework
- TanStack Query (formerly React Query) for efficient server state management
Special thanks to the regulatory bodies whose documents made this system possible:
- Central Bank of Nigeria (CBN)
- Central Bank of Kenya (CBK)
- Financial Action Task Force (FATF)
For questions, issues, or contributions:
- Project Maintainer: CreditExplain Team
- GitHub Issues: Create an issue
We welcome bug reports, feature requests, and contributions from the community!
This project is licensed under the MIT License - see the LICENSE file for details.