RAG-App is an application built on the Retrieval-Augmented Generation (RAG) architecture. It combines large language models with information retrieval to answer questions using context drawn from your own data sources.
RAG-App addresses the problem of managing and accessing information spread across multiple platforms and formats. It serves as a unified knowledge management system that:
- Automatically processes and indexes documentation from various sources (Confluence, YouTrack, Git repositories)
- Maintains an up-to-date knowledge base through intelligent incremental updates
- Provides semantic search capabilities across all integrated sources
- Delivers context-aware responses based on the most relevant information
- Handles multiple document formats and specialized content types (like Ansible playbooks)
RAG-App is particularly useful for organizations with distributed documentation, technical teams that need quick access to information, and projects that must keep knowledge synchronized across different platforms.
Confluence Integration
- Automatic indexing of Confluence documentation
- PDF attachment processing and indexing
- Space-specific content management
- Real-time documentation updates
Document Processing
- Multi-format support (PDF, DOCX, XLSX)
- Intelligent text extraction
- Metadata preservation
- Automatic chunking and processing
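The chunking step can be pictured as splitting extracted text into overlapping windows before the text is embedded. The following is a minimal sketch and not RAG-App's actual implementation: the function name `chunk_text` and the `chunk_size` and `overlap` parameters are hypothetical, and it splits on characters for simplicity, whereas a real splitter would usually respect sentence or paragraph boundaries.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into character windows that overlap by `overlap` characters.

    Hypothetical helper for illustration only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in the range [0, chunk_size)")

    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

The overlap means that a sentence cut at a chunk boundary still appears whole in at least one neighboring chunk, which tends to help retrieval quality at the cost of some duplicated storage.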
YouTrack Integration
- Multi-project issue indexing
- Cross-project knowledge search
- Issue tracking and documentation sync
- Project-specific filtering
Git Repository Management
- Codebase documentation indexing
- Ansible playbook processing
- Infrastructure code documentation
- Repository change tracking
Advanced Search Capabilities
- Semantic understanding of queries
- Context-aware response generation
- Multi-source information integration
- Hybrid search (semantic + keyword)
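To illustrate what "hybrid search (semantic + keyword)" can mean in practice, here is a small sketch that blends a cosine similarity over embedding vectors with a simple keyword-overlap score. It is an assumption about the general technique, not a description of how RAG-App ranks results: the names `cosine`, `keyword_score`, `hybrid_rank` and the `alpha` weight are all hypothetical, and the vectors are toy values rather than real embeddings.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_score(query: str, text: str) -> float:
    """Fraction of distinct query terms that also occur in the text."""
    query_terms = set(query.lower().split())
    text_terms = set(text.lower().split())
    return len(query_terms & text_terms) / len(query_terms) if query_terms else 0.0


def hybrid_rank(query, query_vec, docs, alpha=0.5):
    """Rank (text, vector) pairs by a weighted mix of both scores.

    alpha=1.0 uses only the semantic score, alpha=0.0 only the keyword score.
    Returns a list of (score, text) tuples, best match first.
    """
    scored = []
    for text, vec in docs:
        score = alpha * cosine(query_vec, vec) + (1 - alpha) * keyword_score(query, text)
        scored.append((score, text))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Toy data: two documents with made-up 2-dimensional "embeddings".
docs = [
    ("deploy ansible playbook", [1.0, 0.0]),
    ("confluence page export", [0.0, 1.0]),
]
ranked = hybrid_rank("ansible playbook", [1.0, 0.0], docs)
```

A weighted sum is only one way to fuse the two signals; rank-based fusion is a common alternative when the two scores are not on comparable scales.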
Automated Maintenance
- Incremental knowledge base updates
- Change detection and processing
- Cache management
- Source synchronization
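One common way to implement incremental updates with change detection is to store a content hash per document and re-index only the documents whose hash has changed. The sketch below shows that idea under stated assumptions: `content_hash` and `detect_changes` are hypothetical helpers, the state is a plain dictionary, and nothing here reflects how RAG-App's maintenance scripts actually track changes.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document's text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(previous: dict[str, str], current_docs: dict[str, str]):
    """Compare the last run's hashes with the current documents.

    previous:     doc_id -> hash recorded during the last run
    current_docs: doc_id -> current document text
    Returns (to_index, to_delete, new_state), where new_state is the
    doc_id -> hash mapping to persist for the next run.
    """
    new_state = {doc_id: content_hash(text) for doc_id, text in current_docs.items()}
    # New or modified documents need (re-)indexing.
    to_index = sorted(d for d, h in new_state.items() if previous.get(d) != h)
    # Documents that disappeared from the source should be removed from the index.
    to_delete = sorted(d for d in previous if d not in new_state)
    return to_index, to_delete, new_state
```

For example, if document `b` was edited, `c` was deleted, and `d` is new since the last run, `to_index` contains `b` and `d` while `to_delete` contains `c`; unchanged documents are skipped entirely.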
Streamlit Web Interface
- Interactive query interface
- Real-time response generation
- Document upload and processing
- Search history management
API and Integration
- REST API endpoints
- Webhook support
- Custom data source integration
- Export capabilities
Technologies used:
- LangChain for text processing and generation
- ChromaDB for vector storage
- Streamlit for web interface
- OpenAI for language models
- Various parsers for working with documents
- Clone the repository:
git clone [repository URL]
cd RAG-App
- Install dependencies:
pip install -r requirements.txt
- Set up the configuration in the config.py file
- Create a .env file based on .env.template:
cp .env.template .env
- Configure the following environment variables:
OPENAI_API_KEY=your_openai_api_key
YOUTRACK_URL=your_youtrack_instance_url
YOUTRACK_TOKEN=your_youtrack_permanent_token
To obtain a YouTrack permanent token:
- Log in to your YouTrack instance
- Go to Settings → Personal → Tokens
- Click "Generate new token"
- Copy the generated token
CONFLUENCE_URL=your_confluence_instance_url
CONFLUENCE_EMAIL=your_confluence_email
CONFLUENCE_API_TOKEN=your_confluence_api_token
To obtain a Confluence API token:
- Log in to your Atlassian account
- Go to Security → Create and manage API tokens
- Click "Create API token"
- Copy the generated token
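Before starting the app it can help to check which of the variables above are actually set. The following sketch groups the variable names listed in this section and reports the missing ones per integration. The `missing_settings` function and the grouping are hypothetical, and the sketch assumes the values from `.env` have already been loaded into the process environment by whatever mechanism the project uses.

```python
import os

# Variable names taken from the configuration steps above; the grouping is illustrative.
SETTING_GROUPS = {
    "openai": ["OPENAI_API_KEY"],
    "youtrack": ["YOUTRACK_URL", "YOUTRACK_TOKEN"],
    "confluence": ["CONFLUENCE_URL", "CONFLUENCE_EMAIL", "CONFLUENCE_API_TOKEN"],
}


def missing_settings(env=None) -> dict[str, list[str]]:
    """Return, per integration, the variables that are unset or empty.

    `env` defaults to os.environ; passing a plain dict makes testing easy.
    """
    env = os.environ if env is None else env
    return {
        group: [name for name in names if not env.get(name)]
        for group, names in SETTING_GROUPS.items()
    }


if __name__ == "__main__":
    for group, missing in missing_settings().items():
        status = "ok" if not missing else "missing " + ", ".join(missing)
        print(f"{group}: {status}")
```

An integration with an empty list is fully configured; since the Confluence and YouTrack integrations are optional, a non-empty list for those groups is not necessarily an error.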
- ChromaDB is used as the vector database for storing and retrieving embeddings
- Configure database settings in config.py:
PERSIST_DIRECTORY = "path/to/chroma/db"  # Default: '../db/chroma/'
RAG-App/
├── src/ # Source code
│ ├── loaders/ # Data loaders
│ ├── parsers/ # Document parsers
│ ├── maintenance_scripts/ # Maintenance scripts
│ ├── vector_db.py # Working with vector DB
│ ├── streamlit_app.py # Web interface
│ ├── rag_bot.py # Main RAG logic
│ └── main.py # Entry point
├── data/ # Data
├── cache/ # Cache
├── db/ # Database data
├── config.py # Configuration
└── requirements.txt # Dependencies
- Run the web interface:
streamlit run src/streamlit_app.py
- Or use via Python API:
from src.rag_bot import RAGBot
bot = RAGBot()
response = bot.query("Your question")
- Python 3.10+
- Dependencies from requirements.txt
- Access to OpenAI API or local LLM
- (Optional) Configured integrations with Confluence and YouTrack