Skip to content

MadtorXD/ResearchMate-AI

Repository files navigation

📊 ResearchMate AI - Automated Literature Discovery & Research Agent

ResearchMate AI is a tool-augmented, agentic LLM-powered system built to automate research workflows such as academic paper search, PDF ingestion, summarization, and structured insight extraction. It integrates multiple tools—Arxiv search, PDF reading, and intelligent reasoning—within a REAct-style agent loop to produce high-quality research outputs.

What exactly the project does:

  • Searches the latest academic papers from arXiv.org based on any research topic entered by the user.

  • Reads and extracts content from selected PDF research papers dynamically.

  • Analyzes the paper, identifies research gaps, and proposes new research directions.

  • Writes a complete LaTeX formatted research paper and exports it as a downloadable PDF.

The project demonstrates how modern LLMs can coordinate external tools, stream structured reasoning paths, and produce actionable research output suitable for academic and scientific applications.

Important

You can tell the Agent to in very detail about what type of research paper you want exactly.

🖧 Technical Architecture

Logo

📁 Project Structure

ResearchMate-AI/
│
├── ai_researcher.py         # Simple ReAct Agent workflow
├── ai_researcher_2.py       # Enhanced LangGraph agent with streaming & memory
├── arxiv_tool.py            # Arxiv search integration
├── frontend.py              # Streamlit interface for chat interaction
├── pyproject.toml           # Python dependencies
├── read_pdf.py              # PDF parsing tool using PyPDF2
├── write_pdf.py             # LaTeX → PDF rendering using Tectonic
└── README.md                # Project documentation

🧩 Key Components

1️⃣ ai_researcher.py — Base ReAct Agent Implementation

  • This file contains the initial foundational version of the AI Research Agent using a ReAct (Reasoning + Acting) paradigm. It demonstrates the traditional LangChain workflow where the model reasons step-by-step and invokes tools when needed.
    • Initializes the model using Google Gemini 2.5 Pro.
    • Defines and registers critical tools: arxiv_search, read_pdf, and render_latex_pdf
    • Creates a ReAct-based agent graph using LangGraph prebuilt utilities.
    • Streams model responses sequentially in the console using a generator.
    • Handles continuous conversation through while True loop.

2️⃣ ai_researcher_2.py — Advanced LangGraph Agent with Stateful Workflows

  • This is the core intelligence engine of the project, built using LangGraph, enabling multi-step autonomous decision making and memory persistence.
    • Defines the State object using TypedDict to manage message context
    • Uses conditional routing logic to decide dynamically whether tool invocation is required or whether response is final.
    • Implements bidirectional loop between:
      • Agent node → LLM reasoning
      • Tools node → external tool execution
    • Adds memory checkpointing using MemorySaver which enables persistent conversations, reversible workflows, and reproducibility.
Capability Description
Dynamic Tool Binding LLM determines when to call tools automatically
Streaming Thought Process Real-time incremental response construction
Stateful Graph Maintains chat history across interactions
Re-entrant Architecture Prevents model hallucinations by checking tool call necessity

Note: This file transforms the project from a simple agent into a production-grade autonomous research workflow controller.

3️⃣ arxiv_tool.py — Research Paper Retrieval Utility

  • This module integrates the system with arXiv.org, the world’s largest open-access scientific research repository.
    • Accepts query topics and formulates structured search queries.
    • Retrieves metadata such as:
      • Title, abstract, authors
      • Publication date
      • PDF download link
    • Formats the results in a structured payload for the agent to evaluate.

Important

Enables automated academic discovery and exploration of latest published works. And Powers the agent's ability to recommend relevant papers intelligently.

4️⃣ frontend.py — Streamlit Chat UI Interface

  • The interactive user-facing interface that controls user prompts and visual output.
  • Functional Capabilities:
    • Provides chat input and streaming assistant responses with progressive updates.
    • Displays model messages and maintains internal session history.
    • Logs tool usage and backend operations during execution.
    • Acts as the gateway client for interacting with LangGraph agent.
  • Why it is critical:
    • Converts backend agent workflow into a usable web application.
    • Offers production-ready UI suitable for deployments and demos.

5️⃣ read_pdf.py — LaTeX to PDF Rendering Tool

  • Responsible for exporting the final structured research paper.
  • Core Capabilities:
    • Generates .tex file dynamically using received LaTeX content.
    • Invokes Tectonic engine to compile the LaTeX into a PDF.
    • Saves time-stamped output into /output directory.
  • Why it matters:
    • Enables automatic generation of polished, academic-style research papers.
    • Supports mathematical typesetting, proof formatting, and equations.

🛠️ Tech Stack

  • Language: Python (>= 3.11)
  • Environment & packaging: uv (for virtualenv + dependency management via pyproject.toml)
  • AI Framework: LangChain, LangGraph
  • UI Framework: Streamlit
  • LLM / Agent: Google Gemini 2.5 Pro
  • PDF Tools: PyPDF2, Tectonic

Dependencies (from pyproject.toml):

  • langchain>=0.3.27
  • langchain-core>=0.3.72
  • langchain-google-genai>=2.1.9
  • langgraph>=0.6.3
  • pypdf2>=3.0.1
  • python-dotenv>=1.1.1
  • requests>=2.32.4
  • streamlit>=1.48.0

📦 Prerequisites

  • Python 3.11+ installed on your system.
  • uv installed (for virtual environment + dependency management).
  • Google Gemini API Key.
  • Tectonic PDF Processor installed locally.

⚙️ Setup with uv

All commands below assume you are in the project root: ResearchMate-AI/.

1. Clone and install

git clone https://github.com/MadtorXD/ResearchMate-AI.git
cd ResearchMate-AI

# Install dependencies and create .venv using uv
uv sync

2. Activate the virtual environment

# windowsOS / powerShell
.venv\Scripts\Activate.ps1

If you prefer not to activate the venv manually, you can also run commands through uv directly (see examples below).

3. Configure environment variables

Create a .env file

GOOGLE_API_KEY= "YOUR_API_KEY_HERE"

4. Running the Backend

`streamlit run frontend.py`

Important

Security note: Don’t commit real keys; use .env or environment variables in production.

Note

If you prefer environment variables, adapt config.py to read from os.environ.

📚 What is arXiv and How to Use It?

arXiv.org is the largest open-access academic repository for scientific research across domains such as:

  • Computer Science
  • Machine Learning & AI
  • Physics
  • Mathematics
  • Economics
  • Quantitative Biology

It returns:

  • Paper title, author, publication date
  • PDF download link

📡 API Reference

Below are the required API keys and tokens that are required for the proper functioning of the project:

Parameter Description
GOOGLE_API_KEY Required. Your Google Gemini 3.5 Pro key

📝 Development Notes

  • Streaming response enables real-time UI updates
  • LangGraph manages decisions when to call tools
  • PDF generation supports mathematical LaTeX equations
  • Checkpointing ensures persistent conversation state

🧪 Suggested Next Steps

To further strengthen the reliability, safety, and production readiness of ResearchMate AI, consider implementing the following enhancements:

  • Add citation formatting (IEEE / APA).
  • Add local PDF upload option.
  • Build user authentication & workspace saving.
  • Add RAG embeddings for better research contextualization.

⚠️ Disclaimer

The ResearchMate AI project is intended for educational and research purposes only. Generated papers should not be used for unethical publication, plagiarism, or academic misconduct.

⚖️ License

MIT License

About

ResearchMate AI is an AI powered research paper writer, that can browse papers on a given topic, read papers in depth, perform research and export a ready-to-publish research paper.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages