Neo4j Documentation MCP Server & Evaluation Pipeline

This project provides:

  1. MCP Server: A Model Context Protocol server for Neo4j documentation
  2. Evaluation Pipeline: Tools to evaluate and compare LLM retrieval approaches

Project Structure

.
├── mcp-neo4j-docs/          # MCP Server for Neo4j Documentation
│   ├── main.py              # MCP server implementation
│   ├── test_server.py       # Server testing utilities
│   ├── pyproject.toml       # Project dependencies
│   └── README.md            # MCP server documentation
│
├── evals/                   # Evaluation Pipeline
│   ├── eval_pipeline.py     # Main evaluation script
│   ├── populate_neo4j_vectors.py  # Vector store population
│   ├── test_questions.json  # Test question dataset
│   ├── .env.template        # Environment variables template
│   └── README.md            # Evaluation documentation
│
└── README.md                # This file

Quick Start

1. MCP Server Setup

The MCP server provides tools for browsing and reading Neo4j documentation with intelligent caching.

# Navigate to MCP directory
cd mcp-neo4j-docs

# Install dependencies
uv sync

# Run the server
python main.py

# Or test the server
python test_server.py

See mcp-neo4j-docs/README.md for detailed documentation.

2. Evaluation Pipeline Setup

The evaluation pipeline compares MCP server retrieval vs Neo4j vector search.

# Navigate to evals directory
cd evals

# Configure environment
cp .env.template .env
# Edit .env with your credentials

# (Optional) Populate Neo4j vector store
python populate_neo4j_vectors.py

# Run evaluation
python eval_pipeline.py

See evals/README.md for detailed documentation.

Components

MCP Server (mcp-neo4j-docs/)

An MCP server that provides:

  • Resources: Lists of Neo4j manuals and GraphAcademy courses
  • Tools: Browse manuals, read pages, access courses, manage cache
  • Caching: Automatic caching of fetched content

Key Features:

  • Browse all Neo4j documentation manuals
  • Read specific documentation pages
  • Access GraphAcademy courses
  • Smart caching for performance (a minimal sketch follows this list)
  • Compatible with Claude Desktop, Cline, and other MCP clients
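
The smart caching mentioned above can be pictured as a thin wrapper around the HTTP fetch. The snippet below is a minimal sketch of that pattern, not the server's actual implementation; the fetch_page helper and the in-memory dict cache are illustrative assumptions, using only the requests and beautifulsoup4 dependencies listed later.

import requests
from bs4 import BeautifulSoup

# Illustrative in-memory cache; the real server's caching may differ.
_page_cache: dict[str, str] = {}

def fetch_page(url: str) -> str:
    """Fetch a documentation page, returning cached text when available."""
    if url in _page_cache:
        return _page_cache[url]
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    text = BeautifulSoup(response.text, "html.parser").get_text(" ", strip=True)
    _page_cache[url] = text
    return text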

Evaluation Pipeline (evals/)

A comprehensive evaluation framework that:

  • Compares MCP server vs vector search retrieval
  • Measures accuracy and efficiency
  • Generates detailed reports

Key Features:

  • Dual Retrieval: Tests both MCP and vector search approaches
  • LLM Evaluation: Uses GPT-4o-mini to evaluate answer quality (sketched after this list)
  • Multiple Metrics: Accuracy, completeness, relevance, clarity, speed
  • Comprehensive Reporting: JSON, CSV, and console reports
  • Extensible: Easy to add custom test questions
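
The LLM-as-judge step can be sketched as a single chat completion that asks GPT-4o-mini for a score. The prompt wording, helper name, and score parsing below are assumptions for illustration only; the actual logic lives in evals/eval_pipeline.py.

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def judge_answer(question: str, answer: str, expected: str) -> float:
    """Ask GPT-4o-mini for an overall 0-10 quality score (illustrative prompt)."""
    prompt = (
        f"Question: {question}\n"
        f"Expected answer: {expected}\n"
        f"Candidate answer: {answer}\n"
        "Rate the candidate's accuracy, completeness, relevance, and clarity. "
        "Reply with a single overall score from 0 to 10."
    )
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    # A production version would parse the reply more defensively.
    return float(response.choices[0].message.content.strip())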

Use Cases

For Developers

  • Test and optimize documentation retrieval systems
  • Compare different RAG (Retrieval-Augmented Generation) approaches
  • Benchmark MCP server performance
  • Evaluate LLM response quality

For Researchers

  • Study efficiency of different retrieval methods
  • Analyze trade-offs between speed and accuracy
  • Generate datasets for RAG evaluation
  • Compare semantic search vs direct fetching

For Documentation Teams

  • Assess documentation searchability
  • Identify gaps in documentation
  • Optimize content for better retrieval
  • Monitor documentation quality over time

Requirements

MCP Server

  • Python 3.12+
  • Dependencies: mcp, fastmcp, requests, beautifulsoup4

Evaluation Pipeline

  • Python 3.12+
  • Neo4j instance (for vector search)
  • OpenAI API key (for embeddings and LLM)
  • Additional dependencies: langchain, langchain-neo4j, langchain-openai, pandas, numpy

Environment Variables

For the evaluation pipeline, create evals/.env with:

# Neo4j Docs Chatbot Connection
DOCS_CHATBOT_URI=neo4j+s://your-instance.databases.neo4j.io
DOCS_CHATBOT_USERNAME=neo4j
DOCS_CHATBOT_PASSWORD=your-password
DOCS_CHATBOT_INDEX=documentation_embeddings
DOCS_CHATBOT_EMBEDDING_MODEL=text-embedding-3-small

# OpenAI API
OPENAI_API_KEY=your-openai-key

# Optional: Anthropic API Key
ANTHROPIC_API_KEY=your-anthropic-key
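
These variables drive the vector-search side of the pipeline. One plausible way they might be consumed is sketched below with langchain-neo4j and langchain-openai, assuming the values are exported into the environment (for example via python-dotenv); the real wiring in evals/eval_pipeline.py may differ.

import os
from langchain_neo4j import Neo4jVector
from langchain_openai import OpenAIEmbeddings

# Connect to an existing vector index using the variables from evals/.env
store = Neo4jVector.from_existing_index(
    OpenAIEmbeddings(model=os.environ["DOCS_CHATBOT_EMBEDDING_MODEL"]),
    url=os.environ["DOCS_CHATBOT_URI"],
    username=os.environ["DOCS_CHATBOT_USERNAME"],
    password=os.environ["DOCS_CHATBOT_PASSWORD"],
    index_name=os.environ["DOCS_CHATBOT_INDEX"],
)

docs = store.similarity_search("How do I create a node in Cypher?", k=4)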

Example Workflow

# 1. Start the MCP server
cd mcp-neo4j-docs
python main.py &

# 2. Populate vector store (first time only)
cd ../evals
python populate_neo4j_vectors.py

# 3. Run evaluation
python eval_pipeline.py

# 4. Review results
cat evaluation_results.json
# or
open evaluation_results.csv
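
To work with the results programmatically, the CSV can be loaded with pandas (already a pipeline dependency). Column names depend on the report format, so this snippet only inspects the file rather than assuming a schema.

import pandas as pd

results = pd.read_csv("evaluation_results.csv")
print(results.head())        # peek at the first rows
print(results.describe())    # summary statistics for numeric columns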

Sample Output

================================================================================
EVALUATION RESULTS SUMMARY
================================================================================

## Overall Performance
Total Questions Evaluated: 15

## Accuracy Metrics (Average Scores)
MCP - Overall:       8.10/10
Vector - Overall:    7.60/10

## Efficiency Metrics (Average Times)
MCP - Total:         5.65s
Vector - Total:      3.95s

## Winner Analysis
MCP Wins:     9 (60.0%)
Vector Wins:  6 (40.0%)

## Recommendations
✓ MCP Server provides better answer quality on average
✓ Vector Search is faster on average

Configuration

MCP Server

Configure the server in your MCP client (e.g., Claude Desktop):

{
  "mcpServers": {
    "neo4j-docs": {
      "command": "python",
      "args": ["/path/to/mcp-neo4j-docs/main.py"]
    }
  }
}
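
Besides GUI clients, the server can also be exercised programmatically with the official MCP Python SDK over stdio. The snippet below is a generic SDK sketch rather than project-specific code; the tools it prints are whatever main.py registers.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

# Launch the server as a subprocess and list its registered tools.
params = StdioServerParameters(command="python", args=["/path/to/mcp-neo4j-docs/main.py"])

async def main() -> None:
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())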

Evaluation Pipeline

Customize test questions in evals/test_questions.json:

{
  "question": "Your question here",
  "expected_answer": "Expected answer",
  "category": "category-name",
  "difficulty": "easy|medium|hard"
}
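
A quick way to sanity-check additions is to load the file yourself. The snippet assumes test_questions.json is a JSON array of objects with the fields shown above.

import json

with open("test_questions.json") as f:
    questions = json.load(f)

hard = [q for q in questions if q.get("difficulty") == "hard"]
print(f"{len(questions)} questions total, {len(hard)} marked hard")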

Development

Running Tests

# Test MCP server
cd mcp-neo4j-docs
python test_server.py

# Test with specific manual
python test_server.py --manual cypher-manual

Adding New Evaluation Metrics

Extend LLMEvaluator class in evals/eval_pipeline.py:

def evaluate_custom_metric(self, question: str, answer: str) -> float:
    # Your evaluation logic here; compute and return a score (e.g., 0-10)
    score = 0.0
    return score
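
As an illustrative (not official) example, a simple keyword-coverage metric might look like the following; the method name, extra keywords parameter, and 0-10 scale are assumptions.

def evaluate_keyword_coverage(self, question: str, answer: str, keywords: list[str]) -> float:
    """Fraction of expected keywords found in the answer, scaled to 0-10."""
    if not keywords:
        return 0.0
    answer_lower = answer.lower()
    hits = sum(1 for kw in keywords if kw.lower() in answer_lower)
    return 10.0 * hits / len(keywords)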

Troubleshooting

MCP Server Issues

  • Ensure all dependencies are installed: uv sync
  • Check network connectivity to neo4j.com
  • Clear the cache if data looks stale: use the clear_cache() tool

Evaluation Pipeline Issues

  • Import errors: Run from evals/ directory
  • Neo4j connection: Verify credentials in .env
  • OpenAI rate limits: Reduce question count or add delays
  • Vector search fails: Run populate_neo4j_vectors.py first

Performance Tips

MCP Server

  • Caching is automatic, so frequently accessed pages load instantly
  • Limit page fetches per manual for faster browsing
  • Use specific manuals instead of browsing all

Evaluation Pipeline

  • Start with a small question set (5-10) for testing
  • Populate vector store once, reuse for multiple evaluations
  • Use gpt-4o-mini for cost-effective evaluation
  • Run evaluations during off-peak hours to avoid rate limits

Contributing

Contributions welcome! Areas for improvement:

  • Additional evaluation metrics
  • Support for more LLM providers
  • Enhanced caching strategies
  • Better documentation coverage
  • UI for evaluation results

License

See LICENSE file for details.

Support

For issues or questions:

  1. Check the README files in each directory
  2. Review the troubleshooting sections
  3. Open an issue on GitHub (if applicable)

Built with ❤️ for better documentation retrieval and evaluation.
