xeonvs/RAG-App

RAG-App

RAG-App is an application built on the RAG (Retrieval-Augmented Generation) architecture. It retrieves relevant passages from your data sources and passes them to a large language model, so that answers are grounded in that contextual information rather than in the model's general knowledge alone.

Overview

RAG-App is designed to solve the complex challenge of managing and accessing information across multiple platforms and formats. It serves as a unified knowledge management system that:

  • Automatically processes and indexes documentation from various sources (Confluence, YouTrack, Git repositories)
  • Maintains an up-to-date knowledge base through intelligent incremental updates
  • Provides semantic search capabilities across all integrated sources
  • Delivers context-aware responses based on the most relevant information
  • Handles multiple document formats and specialized content types (like Ansible playbooks)

The bot is particularly valuable for organizations dealing with distributed documentation, technical teams requiring quick access to information, and projects that need to maintain synchronized knowledge across different platforms.
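To make the retrieve-then-generate flow described above concrete, here is a minimal, self-contained sketch. It is not RAG-App's implementation: the bag-of-words cosine similarity is a toy stand-in for real embeddings, and the function names (`retrieve`, `build_prompt`) are hypothetical.

```python
import math
from collections import Counter


def bow(text: str) -> Counter:
    """Toy 'embedding': lowercase bag-of-words counts (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query."""
    q = bow(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, bow(c)), reverse=True)
    return ranked[:k]


def build_prompt(query: str, context: list[str]) -> str:
    """Assemble the text that would be sent to the language model."""
    joined = "\n".join(f"- {c}" for c in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {query}"


chunks = [
    "ansible playbook deploys nginx",
    "youtrack issue about login bug",
    "confluence page on vacation policy",
]
question = "how is nginx deployed with ansible"
prompt = build_prompt(question, retrieve(question, chunks, k=1))
```

In a real deployment the retrieval step would query the vector store and the prompt would be sent to the configured language model; only the overall shape carries over from this sketch.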

Key Capabilities

Knowledge Management and Documentation

  • Confluence Integration

    • Automatic indexing of Confluence documentation
    • PDF attachment processing and indexing
    • Space-specific content management
    • Real-time documentation updates
  • Document Processing

    • Multi-format support (PDF, DOCX, XLSX)
    • Intelligent text extraction
    • Metadata preservation
    • Automatic chunking and processing
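The automatic chunking step listed above can be illustrated with a small sketch. This is an assumption about the general technique, not the project's code: it splits text into fixed-size character windows with overlap, and the name `chunk_text` and the default sizes are hypothetical. The project itself may rely on a LangChain text splitter instead.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows.

    Overlap keeps sentences that straddle a boundary visible in both chunks.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current window already reaches the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


pieces = chunk_text("abcdefghij", chunk_size=4, overlap=2)
```

Character windows are the simplest choice; splitting on paragraph or sentence boundaries usually gives cleaner chunks at the cost of more logic.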

Project Management and Issue Tracking

  • YouTrack Integration

    • Multi-project issue indexing
    • Cross-project knowledge search
    • Issue tracking and documentation sync
    • Project-specific filtering
  • Git Repository Management

    • Codebase documentation indexing
    • Ansible playbook processing
    • Infrastructure code documentation
    • Repository change tracking

Intelligent Information Processing

  • Advanced Search Capabilities

    • Semantic understanding of queries
    • Context-aware response generation
    • Multi-source information integration
    • Hybrid search (semantic + keyword)
  • Automated Maintenance

    • Incremental knowledge base updates
    • Change detection and processing
    • Cache management
    • Source synchronization
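One common way to implement the change detection behind incremental updates is to store a content hash per document and re-index only documents whose hash differs. The sketch below assumes that approach; the README does not say how RAG-App detects changes, and `content_hash` and `detect_changes` are hypothetical names.

```python
import hashlib


def content_hash(text: str) -> str:
    """Stable fingerprint of a document body."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def detect_changes(
    documents: dict[str, str], known_hashes: dict[str, str]
) -> tuple[list[str], list[str], dict[str, str]]:
    """Compare current documents against hashes saved from the previous run.

    Returns (changed_or_new_ids, removed_ids, new_hashes).
    """
    new_hashes = {doc_id: content_hash(body) for doc_id, body in documents.items()}
    changed = sorted(
        doc_id for doc_id, h in new_hashes.items() if known_hashes.get(doc_id) != h
    )
    removed = sorted(doc_id for doc_id in known_hashes if doc_id not in new_hashes)
    return changed, removed, new_hashes


# First run: everything is new. Second run: only the edited page is reported.
changed_1, removed_1, state = detect_changes({"page-1": "v1", "page-2": "v1"}, {})
changed_2, removed_2, state = detect_changes({"page-1": "v1", "page-2": "v2"}, state)
```

Only the IDs in `changed` need to be re-chunked and re-embedded, and the IDs in `removed` can be deleted from the vector store.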

User Interface and Integration

  • Streamlit Web Interface

    • Interactive query interface
    • Real-time response generation
    • Document upload and processing
    • Search history management
  • API and Integration

    • REST API endpoints
    • Webhook support
    • Custom data source integration
    • Export capabilities

Technologies

  • LangChain for text processing and generation
  • ChromaDB for vector storage
  • Streamlit for web interface
  • OpenAI for language models
  • Various parsers for working with documents

Installation

  1. Clone the repository:
git clone [repository URL]
cd RAG-App
  2. Install dependencies:
pip install -r requirements.txt
  3. Set up the configuration in the config.py file

Project Configuration

Environment Setup

  1. Create a .env file based on .env.template:
cp .env.template .env
  2. Configure the following environment variables:

API Keys and Authentication

OPENAI_API_KEY=your_openai_api_key

YouTrack Configuration

YOUTRACK_URL=your_youtrack_instance_url
YOUTRACK_TOKEN=your_youtrack_permanent_token

To obtain a YouTrack permanent token:

  1. Log in to your YouTrack instance
  2. Go to Settings → Personal → Tokens
  3. Click "Generate new token"
  4. Copy the generated token

Confluence Configuration

CONFLUENCE_URL=your_confluence_instance_url
CONFLUENCE_EMAIL=your_confluence_email
CONFLUENCE_API_TOKEN=your_confluence_api_token

To obtain a Confluence API token:

  1. Log in to your Atlassian account
  2. Go to Security → Create and manage API tokens
  3. Click "Create API token"
  4. Copy the generated token
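The sketch below shows one way to check which integrations are configured and to build the HTTP authorization headers from the variables above. It rests on assumptions: the README does not show how RAG-App authenticates, so the header formats follow the usual conventions (a Bearer header for YouTrack permanent tokens, Basic auth with email and API token for Confluence Cloud). The function names are hypothetical. The code reads the process environment and does not parse the .env file itself.

```python
import base64
import os
from collections.abc import Mapping
from typing import Optional

# Variable names taken from the configuration sections above.
INTEGRATION_VARS = {
    "youtrack": ("YOUTRACK_URL", "YOUTRACK_TOKEN"),
    "confluence": ("CONFLUENCE_URL", "CONFLUENCE_EMAIL", "CONFLUENCE_API_TOKEN"),
}


def configured_integrations(env: Optional[Mapping[str, str]] = None) -> list[str]:
    """Return integrations whose variables are all set; reject partial setups."""
    env = os.environ if env is None else env
    enabled = []
    for name, variables in INTEGRATION_VARS.items():
        missing = [v for v in variables if not env.get(v)]
        if not missing:
            enabled.append(name)
        elif len(missing) < len(variables):
            raise ValueError(
                f"{name} is partially configured; missing: {', '.join(missing)}"
            )
    return enabled


def youtrack_headers(token: str) -> dict[str, str]:
    """Bearer header for a YouTrack permanent token (assumed usage)."""
    return {"Authorization": f"Bearer {token}", "Accept": "application/json"}


def confluence_headers(email: str, api_token: str) -> dict[str, str]:
    """Basic-auth header from email and API token (assumed usage)."""
    raw = f"{email}:{api_token}".encode("utf-8")
    encoded = base64.b64encode(raw).decode("ascii")
    return {"Authorization": f"Basic {encoded}", "Accept": "application/json"}
```

Failing early on a half-filled integration tends to be easier to debug than a 401 response in the middle of an indexing run.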

Vector Database Configuration

  1. ChromaDB is used as the vector database for storing and retrieving embeddings
  2. Configure database settings in config.py:
PERSIST_DIRECTORY = "path/to/chroma/db"  # Default: '../db/chroma/'

Project Structure

RAG-App/
├── src/                   # Source code
│ ├── loaders/             # Data loaders
│ ├── parsers/             # Document parsers
│ ├── maintenance_scripts/ # Maintenance scripts
│ ├── vector_db.py         # Working with vector DB
│ ├── streamlit_app.py     # Web interface
│ ├── rag_bot.py           # Main RAG logic
│ └── main.py              # Entry point
├── data/                  # Data
├── cache/                 # Cache
├── db/                    # Database data
├── config.py              # Configuration
└── requirements.txt       # Dependencies

Usage

  1. Run the web interface:
streamlit run src/streamlit_app.py
  2. Or use it via the Python API:
# The imported class name is assumed to match the RAGBot() call below.
from src.rag_bot import RAGBot

bot = RAGBot()
response = bot.query("Your question")

Requirements

  • Python 3.10+
  • Dependencies from requirements.txt
  • Access to OpenAI API or local LLM
  • (Optional) Configured integrations with Confluence and YouTrack
