MIRA (Memory-Integrated Retrieval Agent) is a modular, multi-agent system for ingesting, retrieving, and querying documents at scale. It combines vector memory stored in Qdrant, HuggingFace sentence embeddings, an LLM, and an agent orchestration layer to produce context-aware responses.
MIRA is built for systems that demand precision, scalability, and adaptability in document intelligence.
- 🧠 Integrated Memory: Persistent, context-aware memory enables efficient and intelligent document retrieval.
- 🤖 Multi-Agent Architecture: Each document is handled by a dedicated agent, ensuring granular and scalable processing.
- ⚡ High Performance Retrieval: Vector similarity search minimizes latency even across large datasets.
- 🔌 Modular & Extensible: Easily swap embedding models, vector stores, or LLMs.
- 🎯 Context-Aware Precision: Combines retrieval, planning, and reranking for high-quality responses.

Key features:
- Vectorized Document Storage: Uses HuggingFace embeddings stored in Qdrant for fast similarity search.
- Dynamic Agent Selection: Query Planner selects the most relevant agents based on context and performance.
- Custom Reranking: Enhances retrieval accuracy using contextual and statistical scoring.
- Scalable Pipeline: Designed to handle large-scale document collections efficiently.
- End-to-End NLP Pipeline: Integrates LLMs like GPT for advanced reasoning and response generation.
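
The reranking feature is described only as "contextual and statistical scoring", so the exact formula is not known here. As a minimal sketch, assuming the reranker blends the raw vector similarity with a simple term-overlap statistic, it might look like the following. The `Hit` class, `term_overlap`, and the `alpha` weight are hypothetical names introduced for illustration, not part of MIRA's code.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    text: str
    similarity: float  # raw vector similarity from the store, assumed in [0, 1]


def term_overlap(query: str, text: str) -> float:
    """Fraction of distinct query terms that also appear in the text."""
    query_terms = set(query.lower().split())
    if not query_terms:
        return 0.0
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms)


def rerank(query: str, hits: list[Hit], alpha: float = 0.7) -> list[Hit]:
    """Sort hits by a weighted blend of similarity and term overlap.

    alpha is an assumed tuning knob: 1.0 keeps the raw similarity order,
    lower values give more weight to exact term matches.
    """
    def score(hit: Hit) -> float:
        return alpha * hit.similarity + (1 - alpha) * term_overlap(query, hit.text)

    return sorted(hits, key=score, reverse=True)
```

With `alpha=1.0` the function reduces to sorting by similarity alone, which makes it easy to compare reranked and raw orderings on the same query.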
- Architecture
- Getting Started
- Configuration
- Usage
- Example Workflow
- File Structure
- Contributing
- License
MIRA follows a modular, multi-agent architecture:

Master Agent: central orchestrator that:
- Routes queries to relevant agents
- Aggregates responses
- Works with Query Planner & Reranker

Document Agents: each handles an individual document:
- Converts documents into embeddings
- Evaluates query relevance
- Interfaces with Qdrant

Query Planner:
- Selects relevant agents dynamically
- Uses query context + historical performance

Reranker:
- Reorders retrieved results
- Improves accuracy beyond raw similarity scores

Vector Store:
- Powered by Qdrant
- Stores and retrieves embeddings efficiently
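
To make the Query Planner's role more concrete, here is a rough sketch of agent selection that weighs query context against historical performance. The README does not say how the two signals are combined, so the linear blend, the `AgentStats` fields, and the `history_weight` parameter are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class AgentStats:
    name: str
    relevance: float     # match between the agent's document and the query, in [0, 1]
    success_rate: float  # share of past answers judged useful, in [0, 1]


def select_agents(
    stats: list[AgentStats],
    top_k: int = 2,
    history_weight: float = 0.3,
) -> list[str]:
    """Return the names of the top_k agents by blended score (assumed formula)."""
    def score(s: AgentStats) -> float:
        return (1 - history_weight) * s.relevance + history_weight * s.success_rate

    ranked = sorted(stats, key=score, reverse=True)
    return [s.name for s in ranked[:top_k]]
```

Setting `history_weight=0.0` ignores past performance entirely, which is a reasonable default for a fresh deployment that has no history yet.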

Query flow:
1. Documents are ingested and embedded
2. Embeddings are stored in the Qdrant vector database
3. Query is processed by Master Agent
4. Query Planner selects relevant Document Agents
5. Results are retrieved and reranked
6. Final response is generated using LLM
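
The query-time part of this flow (plan, retrieve, rerank, generate) can be sketched as a single orchestration function. This is only an outline under stated assumptions: `run_query` and its callable parameters are hypothetical, and the real Master Agent in `agents/master_agent.py` may be structured quite differently. Passing each stage in as a callable keeps the sketch runnable without Qdrant or an LLM.

```python
from typing import Callable

Hit = tuple[float, str]               # (score, passage)
Retriever = Callable[[str], list[Hit]]  # query -> hits from one document agent


def run_query(
    query: str,
    agents: dict[str, Retriever],
    plan: Callable[[str, list[str]], list[str]],
    rerank: Callable[[str, list[Hit]], list[Hit]],
    generate: Callable[[str, list[str]], str],
) -> str:
    """Master-agent flow: plan -> retrieve -> rerank -> generate."""
    selected = plan(query, list(agents))   # Query Planner picks agent names
    hits = [hit for name in selected for hit in agents[name](query)]
    ranked = rerank(query, hits)           # reorder the pooled results
    passages = [text for _, text in ranked]
    return generate(query, passages)       # an LLM call in a real system


# Usage with stand-in stages (all stubs, no external services):
agents = {
    "doc_a": lambda q: [(0.2, "low-scoring passage")],
    "doc_b": lambda q: [(0.9, "high-scoring passage")],
}
reply = run_query(
    "What is neural attention?",
    agents,
    plan=lambda q, names: names,
    rerank=lambda q, hits: sorted(hits, reverse=True),
    generate=lambda q, passages: f"{q} -> {passages[0]}",
)
print(reply)
```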

Prerequisites:
- Python 3.9+
- Docker
- Qdrant instance (local or cloud)
- OpenAI API Key
git clone https://github.com/your-org/mira.git
cd mira
pip install -r requirements.txt
Create a .env file in the root directory:
OPENAI_API_KEY=your-openai-key
QDRANT_HOST=localhost
QDRANT_PORT=6333
COLLECTION_NAME=mira_documents
VECTOR_SIZE=384
VECTOR_DISTANCE=COSINE
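
One way these variables might be read into typed settings is sketched below. It is not the contents of `config/config.py`, which are not shown here; the `Settings` class and `load_settings` function are hypothetical, and the defaults simply mirror the values listed above. The sketch reads from the process environment, so the `.env` file would first need to be loaded by a tool such as python-dotenv or exported by the shell.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: str
    qdrant_host: str
    qdrant_port: int
    collection_name: str
    vector_size: int
    vector_distance: str


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from environment variables (or any mapping, for tests)."""
    env = os.environ if env is None else env
    return Settings(
        openai_api_key=env["OPENAI_API_KEY"],  # required: raises KeyError if unset
        qdrant_host=env.get("QDRANT_HOST", "localhost"),
        qdrant_port=int(env.get("QDRANT_PORT", "6333")),
        collection_name=env.get("COLLECTION_NAME", "mira_documents"),
        vector_size=int(env.get("VECTOR_SIZE", "384")),
        vector_distance=env.get("VECTOR_DISTANCE", "COSINE").upper(),
    )
```

Converting the port and vector size to `int` at load time surfaces a malformed value as a `ValueError` on startup rather than later, when the collection is created.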
Run Qdrant locally using Docker:
docker run -p 6333:6333 -p 6334:6334 \
-v $(pwd)/qdrant_storage:/qdrant/storage:z \
qdrant/qdrant
| Setting | Value |
|---|---|
| Embedding Model | sentence-transformers/all-MiniLM-L6-v2 |
| Language Model | gpt-4 |
| Vector Size | 384 |
| Distance Metric | COSINE |
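
For readers unfamiliar with the COSINE metric in the table, the snippet below shows the underlying calculation in plain Python. It is only an illustration of the formula: in MIRA the comparison is performed inside Qdrant on 384-dimensional vectors, and the function name here is made up for the example.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))  # identical direction: 1.0
print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal: 0.0
```

The dimension check matters in practice: the vector size configured for the collection has to match the output size of the embedding model, which is 384 for all-MiniLM-L6-v2.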
bash comman.sh
python main.py --query "What is neural attention?" --docs_path "docs/sample.pdf"
tail -f logs/app.log
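
The `python main.py` invocation above takes `--query` and `--docs_path`. The actual `main.py` is not shown in this README, so the following is only a guess at how such a command line could be parsed with the standard library's `argparse`; the `build_parser` helper and the help strings are hypothetical.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Argument parser mirroring the two flags used in the command above."""
    parser = argparse.ArgumentParser(description="Query documents with MIRA.")
    parser.add_argument("--query", required=True, help="Question to ask.")
    parser.add_argument("--docs_path", required=True, help="Path to a document to ingest.")
    return parser


# Parse an explicit argument list so the example runs without a shell.
args = build_parser().parse_args(
    ["--query", "What is neural attention?", "--docs_path", "docs/sample.pdf"]
)
print(args.query, args.docs_path)
```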
.
├── agents/
│ ├── document_agents.py
│ └── master_agent.py
├── modules/
│ ├── query_planner.py
│ └── reranker.py
├── index/
│ └── vector_store.py
├── utils/
│ └── logger.py
├── config/
│ └── config.py
├── docs/
├── logs/
├── Dockerfile
├── comman.sh
└── README.md
Contributions are welcome!
- Fork the repository
- Create a new branch (feature/your-feature)
- Commit your changes
- Push to your branch
- Open a Pull Request

For major changes, please open an issue first to discuss your ideas.
This project is licensed under the MIT License.
MIRA is more than a document retrieval system: it is a step toward memory-aware AI systems that can reason over large-scale knowledge bases with precision and efficiency.