This project sets up a dual-mode Model Context Protocol (MCP) server that supports both:

- Tool-based responses (e.g., current time, jokes) via FastMCP
- Prompt-response LLM completions via FastAPI
| File | Purpose |
|---|---|
| server.py | Handles LLM loading, generation, config |
## Features

- Compatible with Claude Desktop, MCP Inspector, LangGraph, etc.
- FastMCP standard for tool registration and stdio communication
- Run local LLM completions from app.py via the /generate API (see the sketch below)
- Optional HTTP server mode for broader integrations
- MCP tools return structured JSON responses
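Calling the completion endpoint might look like the following minimal sketch. The port and the `prompt` field name are assumptions, not confirmed by this README; check app.py for the actual request schema.

```python
import requests

# Assumes app.py is serving on localhost:8000 and accepts a "prompt" field.
resp = requests.post(
    "http://localhost:8000/generate",
    json={"prompt": "Write a haiku about MCP servers."},
    timeout=60,
)
resp.raise_for_status()
print(resp.json())
```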
## Installation

```bash
pip install fastapi uvicorn transformers torch
pip install requests
pip install bitsandbytes accelerate  # Only needed if using quantized models
```
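Once installed, the two modes could be started along these lines. This is a sketch that assumes server.py runs the FastMCP stdio server when executed directly and that app.py defines a FastAPI instance named `app`; adjust the module and variable names to match the actual code.

```bash
# MCP mode: speaks stdio with clients like Claude Desktop or MCP Inspector
python server.py

# HTTP mode: serve the /generate completion API (assumes app.py defines `app`)
uvicorn app:app --host 0.0.0.0 --port 8000
```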
## Registered MCP Tools

| Tool Name | What It Does |
|---|---|
| time_now | Returns the current UTC timestamp |
| dad_joke | Returns a random dad joke from a public API |
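For reference, the two tools above could be registered roughly as follows. This is a minimal sketch using the FastMCP class from the official MCP Python SDK; the server name and the use of icanhazdadjoke.com as the joke API are assumptions for illustration.

```python
from datetime import datetime, timezone

import requests
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("dual-mode-demo")  # server name is a placeholder

@mcp.tool()
def time_now() -> dict:
    """Return the current UTC timestamp as structured JSON."""
    return {"utc": datetime.now(timezone.utc).isoformat()}

@mcp.tool()
def dad_joke() -> dict:
    """Fetch a random dad joke (icanhazdadjoke.com is an assumed source)."""
    resp = requests.get(
        "https://icanhazdadjoke.com/",
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    return {"joke": resp.json()["joke"]}

if __name__ == "__main__":
    mcp.run()  # stdio transport by default, as MCP clients expect
```

Returning dicts rather than bare strings is what gives clients the structured JSON responses mentioned above.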
## What is MCP?

The Model Context Protocol (MCP) allows custom tools and local models to be integrated into AI assistants like Claude Desktop or LangGraph. You can register Python functions as tools, and they become callable by the assistant when relevant.
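To wire this server into Claude Desktop, an entry in claude_desktop_config.json might look like the sketch below; the server name and path are placeholders, and the exact config file location depends on your OS.

```json
{
  "mcpServers": {
    "dual-mode-demo": {
      "command": "python",
      "args": ["/absolute/path/to/server.py"]
    }
  }
}
```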