🤝 HRMate

An AI-powered, agentic HR assistant that automatically answers employee email queries using company policy documents — powered by RAG (Retrieval-Augmented Generation), Google Gemini, and a multi-layered guardrail system.

📖 How It Works

HRMate watches an email inbox for new messages from employees. When one arrives, it:

Validates the email through input guardrails (prompt injection detection, content moderation).
Evaluates whether the email is a genuine HR query using an LLM fitness check.
Retrieves the most relevant policy sections using Hybrid Search (BM25 + FAISS) with LLM-based reranking.
Generates a professional, policy-grounded email reply using Gemini via a ReAct agent.
Validates the response through output guardrails (grounding check, PII redaction, leak detection).
Sends the reply automatically back to the employee via SMTP, or escalates to human HR if needed.

Employee Email
      │
      ▼
  Input Guardrails ──► LLM Fitness Check ──► ReAct Agent
  (injection, abuse)                              │
                                    ┌─────────────┼─────────────┐
                                    ▼             ▼             ▼
                              Policy_Retriever  Check_PTO   Submit_Leave
                              (BM25 + FAISS     Balance     Request
                               + LLM Rerank)
                                    │
                                    ▼
                            Output Guardrails
                         (grounding, PII, leaks)
                                    │
                              ┌─────┴─────┐
                              ▼           ▼
                        Auto-Reply    Escalate to
                        via SMTP      Human HR

🗂️ Project Structure

HRMate/
├── main.py              # Core email polling loop — reads inbox & sends AI replies
├── llm_runner.py         # Agentic RAG pipeline: ReAct agent with tools & guardrails
├── guardrails.py         # Input/output guardrail classes (injection, PII, moderation, leaks)
├── db_utils.py           # SQLite helpers: analytics logging, PTO balance, leave requests
├── rag_runner.py         # One-time script to chunk & index the policy document
├── requirements.txt      # Python dependencies
├── rag/
│   ├── retriever.py      # Hybrid retriever: BM25 + FAISS ensemble → LLM relevance filter
│   └── doc/
│       ├── policy.txt           # Company HR policy document
│       └── system_prompt.md     # Prompt template that governs the AI's behaviour
└── tests/
    ├── conftest.py        # Shared pytest fixtures (temp DB, monkeypatching)
    ├── test_db_utils.py   # Unit tests for database utilities
    ├── test_guardrails.py # Unit tests for all guardrail classes
    └── test_llm_runner.py # Unit tests for LLM pipeline (mocked, no API calls)

🛡️ Guardrails

HRMate implements a multi-layered guardrail system to ensure safe, accurate, and professional responses:

Input Guardrails

Guardrail	Description
InputSanitizer	Detects prompt injection attempts (jailbreaks, "ignore instructions", XSS, DAN)
ContentModerationGuard	Flags abusive or threatening language
Email Truncation	Caps input at 3,000 characters to prevent prompt stuffing

Output Guardrails

Guardrail	Description
Grounding Validation	LLM check to ensure the response is factually grounded in retrieved policy
ResponseValidator	Detects leaked API keys, system prompts, or raw agent traces
PIIDetector	Detects and redacts SSNs, credit cards, phone numbers from responses
Length Limit	Caps response at 2,000 characters

Agent Guardrails

Guardrail	Description
Max Iterations	Agent limited to 5 reasoning steps to prevent infinite loops
Execution Timeout	60-second hard timeout on agent execution
Escalation	Automatic escalation to human HR when confidence is low

⚙️ Setup

1. Clone the repository

git clone https://github.com/your-username/HRMate.git
cd HRMate

2. Create & activate virtual environment

python -m venv myenv

# Windows
myenv\Scripts\activate

# macOS/Linux
source myenv/bin/activate

3. Install dependencies

pip install -r requirements.txt

4. Configure environment variables

Create a .env file in the project root:

# Email credentials
EMAIL_USER=your-email@example.com
EMAIL_PASS=your-app-password

# IMAP & SMTP servers (example: Gmail)
IMAP_SERVER=imap.gmail.com
SMTP_SERVER=smtp.gmail.com

# Google Gemini
GOOGLE_API_KEY=your-google-api-key

Gmail users: Use an App Password instead of your regular password, and enable IMAP in Gmail settings.

5. Add your HR policy document

Place your company's policy as plain text at rag/doc/policy.txt.

6. Index the policy document

Run this once (or whenever the policy changes):

python rag_runner.py

This uses a two-stage section-aware chunking strategy to split the policy into semantically meaningful chunks, generates embeddings via Google's gemini-embedding-001 model, and creates both a BM25 index and a FAISS vector store locally.

The indexer prints a chunk summary so you can verify each section was captured correctly.

7. Start the auto-reply bot

python main.py

The bot will poll the inbox every 2 seconds and automatically reply to any unread emails. Press Ctrl+C to stop.

🧪 Testing

HRMate includes a comprehensive test suite with 63 tests covering all modules. Tests use mocks — no real API calls or email connections needed.

# Run all tests
python -m pytest tests/ -v

# Run specific test files
python -m pytest tests/test_guardrails.py -v
python -m pytest tests/test_db_utils.py -v
python -m pytest tests/test_llm_runner.py -v

Test Coverage

Module	Tests	What's Covered
`test_db_utils.py`	11	Table creation, seed idempotency, logging, PTO balance, leave requests
`test_guardrails.py`	35	Prompt injection (9 patterns), PII detection/redaction, content moderation, response validation, composite runners
`test_llm_runner.py`	17	Email fitness eval, policy retrieval, grounding validation, tool wrappers, end-to-end agent pipeline

🧠 AI Behaviour

The assistant is guided by the system prompt in rag/doc/system_prompt.md:

Strictly policy-based: All answers are drawn exclusively from the retrieved policy chunks — no hallucination.
Anti-hallucination rules: Never fabricates numbers, dates, amounts, or policy details. References section numbers when quoting policy.
Professional & friendly: Replies are warm, concise, and ready to send as-is.
Graceful fallback: If the policy doesn't cover a query, the bot escalates to human HR rather than guessing.
Agentic tools: The agent can check PTO balances and submit leave requests via database tools.

📦 Chunking Strategy

HRMate uses a two-stage section-aware chunking pipeline optimised for structured HR policy documents:

Stage	What It Does	Config
Stage 1	Splits at `Section`, triple/double newline boundaries	2000 chars, no overlap
Stage 2	Sub-splits any chunk still > 2000 chars	1500 chars, 200 char overlap

Why not basic RecursiveCharacterTextSplitter?

The policy has clear Section X.Y structure — blind character splits break tables and subsections in half
Section-aware separators (\nSection , \n\n\n, \n\n) keep full policy subsections intact
Each chunk is enriched with section metadata (section_number, section_title, subsection) for citation
The "Works cited" footer is stripped before chunking since it's not policy content

Result: 36 section-aligned chunks (vs ~57 blind chunks), each mapping to a complete policy subsection.

🛠️ Tech Stack

Component	Technology
LLM	Google Gemini 2.0 Flash
Embeddings	Google `gemini-embedding-001`
Vector Database	FAISS (local)
Keyword Search	BM25 (via `rank_bm25`)
Chunking	Two-stage section-aware (2000 → 1500 chars)
Retrieval Strategy	Hybrid Ensemble (BM25 + FAISS) → LLM Reranking
Agent Framework	LangChain (ReAct agent)
Database	SQLite (analytics, PTO, leave requests)
Email (Read)	Python `imaplib`
Email (Send)	Python `smtplib`
Guardrails	Custom regex-based + LLM-based validation
Testing	pytest (63 tests, fully mocked)
Config	`python-dotenv`

📝 Notes

The polling interval is set to 2 seconds (POLL_INTERVAL_SECONDS in main.py). Adjust as needed.
The chunker uses a two-stage section-aware strategy — Stage 1 splits at section boundaries (2000 chars), Stage 2 sub-splits oversized chunks (1500 chars, 200 overlap).
Each chunk is tagged with section metadata (number + title) for precise citation in responses.
The hybrid retriever returns the top 5 results per retriever, combined via Reciprocal Rank Fusion.
The bot only processes UNSEEN emails, so already-read messages are skipped.
All guardrails are fail-open on validation errors (to avoid blocking legitimate queries) but fail-closed on detected threats (prompt injection, abuse).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤝 HRMate

📖 How It Works

🗂️ Project Structure

🛡️ Guardrails

Input Guardrails

Output Guardrails

Agent Guardrails

⚙️ Setup

1. Clone the repository

2. Create & activate virtual environment

3. Install dependencies

4. Configure environment variables

5. Add your HR policy document

6. Index the policy document

7. Start the auto-reply bot

🧪 Testing

Test Coverage

🧠 AI Behaviour

📦 Chunking Strategy

🛠️ Tech Stack

📝 Notes

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 7 Commits
rag		rag
tests		tests
.gitignore		.gitignore
README.md		README.md
db_utils.py		db_utils.py
guardrails.py		guardrails.py
llm_runner.py		llm_runner.py
main.py		main.py
rag_runner.py		rag_runner.py
requirements.txt		requirements.txt

Folders and files

Latest commit

History

Repository files navigation

🤝 HRMate

📖 How It Works

🗂️ Project Structure

🛡️ Guardrails

Input Guardrails

Output Guardrails

Agent Guardrails

⚙️ Setup

1. Clone the repository

2. Create & activate virtual environment

3. Install dependencies

4. Configure environment variables

5. Add your HR policy document

6. Index the policy document

7. Start the auto-reply bot

🧪 Testing

Test Coverage

🧠 AI Behaviour

📦 Chunking Strategy

🛠️ Tech Stack

📝 Notes

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages