AI Research Writers Tool

This project is an AI-powered technical article generator that uses ArXiv papers as its knowledge base. The system consists of two main components:

arxivdatabase.py: A script to fetch and store ArXiv papers in a vector database
arxivapp.py: A Streamlit web application that generates technical articles based on user queries

Features

Fetches thousands of ArXiv papers across multiple computer science categories
Stores papers in a Chroma vector database with embeddings
Provides a user-friendly web interface to generate technical articles
Retrieves relevant research papers based on the user's topic
Generates structured academic papers with proper citations

Requirements

Python 3.8+
OpenAI API key
Internet connection (for fetching ArXiv data)

Installation

Clone the repository:

git clone https://github.com/HappyHackingSpace/AI-Research-Writers-Tool.git
cd AI-Research-Writers-Tool

Install dependencies:

pip install -r requirements.txt

Create a .env file in the project root and add your OpenAI API key:

OPENAI_API_KEY=your_api_key_here
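
To check that the key is picked up before starting a long run, something like the following can help. It is a minimal standard-library sketch, not code from this repository: the helper names load_env_file and require_api_key are hypothetical, and the scripts themselves may load the file differently (for example through a dotenv helper).

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Parse simple KEY=VALUE lines from a .env-style file into a dict."""
    values = {}
    env_path = Path(path)
    if not env_path.exists():
        return values
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and lines without an assignment.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def require_api_key(path=".env"):
    """Return the key from the environment or the .env file, else raise."""
    key = os.environ.get("OPENAI_API_KEY") or load_env_file(path).get("OPENAI_API_KEY")
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or the environment")
    return key
```

Calling require_api_key() from the project root raises a clear error if the key is missing, instead of failing later during the first API call.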

Usage

Step 1: Build the Vector Database

Run the database builder script to fetch ArXiv papers and create the vector database:

python arxivdatabase.py

This process may take several hours, depending on how many papers you fetch (the default is 5000 per category).

Note: You can adjust the num_papers variable in the script to change the number of papers fetched per category.

Step 2: Launch the Web Application

After building the database, run the Streamlit application:

streamlit run arxivapp.py

The web interface will open in your browser. Enter a topic in the text field and click "Generate Article" to create a technical article based on relevant ArXiv papers.

How It Works

arxivdatabase.py

Fetches papers from ArXiv across multiple computer science categories
Extracts title, authors, summary, and other metadata
Splits the text into chunks suitable for embedding
Creates embeddings using OpenAI's text-embedding-3-small model
Stores the embedded documents in a Chroma vector database
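
The first three steps above can be sketched with the standard library alone. This is an illustration under assumptions, not the code in arxivdatabase.py: the function names are hypothetical, the category string is an example, and the character-based splitter is a simplified stand-in for whatever text splitter the script uses. The embedding and Chroma steps are omitted because they need an API key and third-party packages.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM = "{http://www.w3.org/2005/Atom}"  # namespace of the arXiv Atom feed
ARXIV_API = "http://export.arxiv.org/api/query"


def build_query_url(category, start=0, max_results=200):
    """Build an arXiv API URL for one page of results in a category."""
    params = {
        "search_query": f"cat:{category}",
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urlencode(params)}"


def parse_entries(feed_xml):
    """Extract title, authors, summary, and id from an Atom feed string."""
    root = ET.fromstring(feed_xml)
    papers = []
    for entry in root.iter(f"{ATOM}entry"):
        papers.append({
            # Collapse the line breaks and runs of spaces found in feed text.
            "title": " ".join(entry.findtext(f"{ATOM}title", default="").split()),
            "summary": " ".join(entry.findtext(f"{ATOM}summary", default="").split()),
            "authors": [a.findtext(f"{ATOM}name", default="")
                        for a in entry.findall(f"{ATOM}author")],
            "id": entry.findtext(f"{ATOM}id", default=""),
        })
    return papers


def split_text(text, chunk_size=2000, chunk_overlap=100):
    """Split text into character chunks that overlap by chunk_overlap."""
    if chunk_overlap >= chunk_size:
        raise ValueError("chunk_overlap must be smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
        start += step
    return chunks
```

With the default chunk_size of 2000 and chunk_overlap of 100, each new chunk starts 1900 characters after the previous one, so neighbouring chunks share 100 characters of context.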

arxivapp.py

Accepts a topic from the user via the Streamlit interface
Searches the vector database for relevant research papers
Allows the user to select the number of references to use
Generates a structured academic paper using OpenAI's GPT-4o model
Displays the generated article with proper formatting
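
The retrieval and prompt-building steps can be pictured as follows. This is a toy sketch under assumptions: in the application the similarity search is handled by Chroma and the embeddings come from OpenAI, whereas here the vectors are made-up two-dimensional examples, and the function names and prompt wording are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(query_vec, docs, k=3):
    """Return the k documents whose embeddings are closest to the query."""
    ranked = sorted(
        docs,
        key=lambda d: cosine_similarity(query_vec, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(topic, references):
    """Assemble a prompt that lists the retrieved papers as numbered references."""
    lines = [
        f"Write a structured technical article about: {topic}",
        "",
        "Use only these references and cite them by number:",
    ]
    for i, ref in enumerate(references, start=1):
        lines.append(f"[{i}] {ref['title']}: {ref['summary']}")
    return "\n".join(lines)


# Made-up documents with toy embeddings, for illustration only.
docs = [
    {"title": "Paper A", "summary": "about topic x", "embedding": [1.0, 0.0]},
    {"title": "Paper B", "summary": "about topic y", "embedding": [0.0, 1.0]},
    {"title": "Paper C", "summary": "mostly topic x", "embedding": [0.9, 0.1]},
]
selected = top_k([1.0, 0.0], docs, k=2)
prompt = build_prompt("topic x", selected)
```

The value of k plays the role of the number of references the user selects in the interface; the numbered list in the prompt is what lets the generated article cite its sources.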

Configuration

You can modify the following parameters in the scripts:

num_papers: Number of papers to fetch per category (default: 5000)
results_article: Maximum number of results per ArXiv API request (default: 200)
batch_size: Number of documents to add to the vector database in each batch (default: 5000)
chunk_size: Size of text chunks for embedding (default: 2000)
chunk_overlap: Overlap between consecutive chunks (default: 100)
model: Language model used for generation (default: "gpt-4o")
temperature: Sampling temperature for the language model; lower values give more deterministic output (default: 0.4)
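
To see how these values interact, the defaults can be collected in one place. This is a sketch only: the scripts define these as plain variables, and the PipelineConfig class and batches helper below are hypothetical names introduced for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineConfig:
    """Defaults as listed above; the class itself is illustrative."""
    num_papers: int = 5000       # papers to fetch per category
    results_article: int = 200   # results per arXiv API request
    batch_size: int = 5000       # documents added to the database per batch
    chunk_size: int = 2000       # size of each text chunk
    chunk_overlap: int = 100     # overlap between consecutive chunks
    model: str = "gpt-4o"
    temperature: float = 0.4

    def requests_per_category(self):
        """Number of API requests needed to fetch num_papers for one category."""
        return math.ceil(self.num_papers / self.results_article)


def batches(items, batch_size):
    """Yield consecutive slices of items with at most batch_size elements."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


config = PipelineConfig()
```

With the defaults, each category needs 5000 / 200 = 25 API requests, which is one reason the build step is slow. Lowering num_papers shortens the run roughly in proportion.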

Limitations

The ArXiv API has rate limits, so the fetching process includes sleep intervals
Generating articles for very niche topics may result in less relevant content
The quality of generated articles depends on the availability of relevant papers in the database
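
The sleep intervals mentioned in the first limitation follow a common pattern: fetch one page, pause, fetch the next. Here is a minimal sketch of that pattern, assuming a caller-supplied fetch_page function and a 3-second delay. Both are assumptions for illustration; the actual interval and structure in arxivdatabase.py may differ.

```python
import time


def fetch_all(fetch_page, total, page_size=200, delay_seconds=3.0, sleep=time.sleep):
    """Fetch up to total items page by page, pausing between requests.

    fetch_page(start, count) is a hypothetical callable that returns a list
    of items; sleep is injectable so the loop can be tested without waiting.
    """
    results = []
    for start in range(0, total, page_size):
        page = fetch_page(start, min(page_size, total - start))
        if not page:
            break  # the source has no more results
        results.extend(page)
        if start + page_size < total:
            sleep(delay_seconds)  # be polite to the API between requests
    return results
```

Passing the sleep function as a parameter keeps the loop easy to test: a test can supply a no-op and a fake fetch_page instead of calling the real API.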

License

This project is licensed under the MIT License - see the LICENSE file for details.
