Skip to content

directtt/rag-with-knowledge-base-management

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

20 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

rag-with-knowledge-base-management

RAG (Retrieval-Augmented Generation) app integrated with a voice assistant and knowledge base management system.

Python Docker

Table of Contents

Introduction

This application integrates a RAG (Retrieval-Augmented Generation) model with a voice assistant, allowing users to interact with the system via voice or text input. Additionally, it includes a knowledge base management system, enabling users to add, view, and delete documents used by the RAG model via URLs.

Preview

rag-with-knowledge-base-management-720.mp4

Deployment

The application is deployed on Streamlit Share and can be accessed at the following URL:

Technologies

LangChain

LangChain is a framework designed for building applications that leverage language models. It provides tools for connecting language models to external data sources, enabling more complex and contextual interactions.

OpenAI Models

The application uses several OpenAI models to provide conversational capabilities and document retrieval:

  • Chat Model (default: gpt-3.5-turbo) to generate responses based on user queries and previous conversation context.
  • Whisper API (default: whisper-1) for automatic speech recognition to transcribe audio inputs from users.

Additionally, Cohere Re-ranker (default: rerank-english-v2.0) to improve the relevance of retrieved documents by re-ranking them based on their relevance to the query.

DeepLake Vector Store

DeepLake is used as a vector store to store and retrieve document embeddings. It facilitates efficient similarity search and retrieval of relevant documents from the knowledge base.

Apify

Apify is a web scraping and automation platform that allows for the extraction of data from websites. It is used to scrape documents from URLs provided by users and store them in the knowledge base.

Streamlit

Streamlit is an open-source app framework that allows for the creation of custom web applications for machine learning and data science projects with minimal effort. It is used here to build the user interface of the application.

Local installation & usage

To install the application locally, you need to have Docker installed on your machine. Then, run following commands:

  1. Build the Docker image:
docker build -t rag-with-knowledge-base-management .
  1. Run the Docker container:
docker run -p 8501:8501 rag-with-knowledge-base-management

The application should now be accessible at http://localhost:8501.

API keys

Please make sure to add your API keys to the .env file before running the application. The following keys inside .env.example need to be filled in:

  • OPENAI_API_KEY - OpenAI API key
  • COHERE_API_KEY - Cohere API key
  • APIFY_API_TOKEN - Apify API token
  • ACTIVELOOP_TOKEN - ActiveLoop API token
  • ACTIVELOOP_ORG_ID - ActiveLoop organization ID

License

Distributed under the open-source Apache 2.0 License. See LICENSE for more information.

References

Following repositories were useful in building this project:

About

RAG (Retrieval-Augmented Generation) app integrated with a voice assistant and knowledge base management system.

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors