
# 🧠 Deep Learner

A production-grade autonomous learning agent built with pure LangGraph and LangChain. No CrewAI. No abstractions hiding the mechanics. Every node, edge, and decision explicitly defined and fully controllable.

Give it any topic — it researches the web, extracts core concepts, writes a beginner study guide, and generates a quiz. Quality gates with automatic retry loops ensure the output meets defined standards before moving forward.


## ✨ What Makes This Different

Most AI agent tutorials use high-level frameworks that hide how agents actually work. This project is built from first principles:

- ❌ No CrewAI
- ❌ No magic abstractions
- ✅ Pure LangGraph StateGraph
- ✅ Direct LLM calls via LangChain
- ✅ Every routing decision explicitly coded
- ✅ Full visibility into every step
- ✅ Model agnostic — swap LLMs via `.env`

## 🏗️ Architecture

```text
START
  ↓
[researcher_node]         → searches web, reads articles, synthesises notes
  ↓
[research_quality_node]   → checks length, URLs, bullet points
  ↓
  ├── FAIL + attempts < 3 → retry researcher (different search query)
  ├── FAIL + attempts >= 3 → error_node
  └── PASS
        ↓
    [summarizer_node]     → extracts 5 core concepts with analogies
        ↓
    [teacher_node]        → writes full study guide, streams live to terminal
        ↓
    [guide_quality_node]  → checks word count, sections, completeness
        ↓
        ├── FAIL + attempts < 2 → retry teacher (stronger instructions)
        └── PASS
              ↓
          [quiz_node]     → creates 5-question multiple choice quiz
              ↓
             END
```

## 📁 Project Structure

```text
deep-learner/
│
├── run.py             # Entry point — run from terminal
├── graph.py           # Assembles nodes, edges, conditions
├── nodes.py           # All node functions (the actual work)
├── edges.py           # All routing logic (the decisions)
├── state.py           # AgentState TypedDict definition
├── config.py          # Model agnostic LLM getter
│
├── .env               # API keys + model config
├── requirements.txt   # Dependencies
│
└── outputs/
    ├── study_guide.md     # Generated study guide
    └── quiz_questions.md  # Generated quiz
```

Each file has exactly one responsibility. No file knows more than it needs to.
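For a sense of what `state.py` holds, here is a plausible shape for the `AgentState` TypedDict, inferred from the fields this README references (`topic`, `research_notes`, `research_passed`, `research_attempts`, and so on). The exact field names in the repository may differ:

```python
from typing import TypedDict

# Illustrative sketch of state.py — field names beyond those mentioned
# in this README (e.g. concepts, study_guide) are guesses.
class AgentState(TypedDict, total=False):
    topic: str               # user-supplied subject
    research_notes: str      # researcher output
    research_passed: bool    # research quality gate result
    research_attempts: int   # retry counter (max 3)
    concepts: str            # summarizer output
    study_guide: str         # teacher output
    guide_passed: bool       # guide quality gate result
    guide_attempts: int      # retry counter (max 2)
    quiz: str                # quiz node output
    error: str               # set by error_node on failure
```

With `total=False`, nodes can return partial updates without declaring every key.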


## ⚙️ Setup

### Prerequisites

  • Python 3.11 or 3.12
  • pyenv recommended

### 1. Clone the repository

```bash
git clone https://github.com/yourusername/deep-learner.git
cd deep-learner
```

### 2. Set Python version

```bash
pyenv local 3.12
python3 --version  # should show 3.12.x
```

### 3. Create virtual environment

```bash
python3 -m venv venv
source venv/bin/activate  # Mac/Linux
venv\Scripts\activate     # Windows
```

### 4. Install dependencies

```bash
pip install --upgrade pip
pip install -r requirements.txt
```

### 5. Configure environment variables

Create a `.env` file in the project root:

```env
# API Keys
ANTHROPIC_API_KEY=your_anthropic_key_here
GROQ_API_KEY=your_groq_key_here
SERPER_API_KEY=your_serper_key_here

# Researcher
LLM_RESEARCHER=claude-haiku-4-5-20251001
LLM_RESEARCHER_PROVIDER=anthropic

# Summarizer
LLM_SUMMARIZER=llama-3.3-70b-versatile
LLM_SUMMARIZER_PROVIDER=groq

# Teacher
LLM_TEACHER=claude-sonnet-4-6
LLM_TEACHER_PROVIDER=anthropic

# Quiz
LLM_QUIZ=claude-sonnet-4-6
LLM_QUIZ_PROVIDER=anthropic
```

## 🚀 Running

```bash
python3 run.py
```

Enter any topic when prompted. Watch every agent stream its output live to the terminal with full visibility into quality gate decisions.

```text
##################################################
  🤖 DEEP LEARNER
  📚 Topic: Machine Learning
##################################################

==================================================
🔍 RESEARCHER — Attempt 1/3
==================================================
   [researcher] using: anthropic/claude-haiku-4-5-20251001
   Searching: Machine Learning beginner guide explained simply
   Found 3 results
   Reading: https://...
   Synthesising research notes...
   ........................ done (1847 chars)

==================================================
🔎 RESEARCH QUALITY CHECK
==================================================
   ✅ length > 500 chars
   ✅ contains URLs
   ✅ has bullet points
   ✅ mentions topic
   ✅ Research PASSED quality gate!

==================================================
🧠 SUMMARIZER — Extracting concepts
==================================================
   [summarizer] using: groq/llama-3.3-70b-versatile
   Extracting concepts................. done (312 words)

==================================================
✍️  TEACHER — Writing guide (attempt 1/2)
==================================================
   [teacher] using: anthropic/claude-sonnet-4-6

# Machine Learning — Beginner's Guide

## Why This Matters
Machine learning is quietly reshaping...
(streams word by word live)
```

## 🤖 The Nodes

| Node | Model | Job |
|------|-------|-----|
| `researcher_node` | Claude Haiku | Searches web, reads articles, synthesises notes |
| `research_quality_node` | None (deterministic) | Checks research meets quality standards |
| `summarizer_node` | Groq Llama 3.3 | Extracts 5 core concepts with analogies |
| `teacher_node` | Claude Sonnet | Writes complete study guide, streams live |
| `guide_quality_node` | None (deterministic) | Checks guide meets quality standards |
| `quiz_node` | Claude Sonnet | Creates 5-question multiple choice quiz |
| `error_node` | None | Graceful failure handler |
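The graceful failure handler at the end of that table is the simplest node of all — it just writes an explanation into state instead of raising. A sketch of what such a node might look like (the real `error_node` in `nodes.py` may also log or save partial output):

```python
# Illustrative graceful-failure handler: no LLM call, no exception —
# it records why the run stopped and lets the graph reach END cleanly.
def error_node(state: dict) -> dict:
    attempts = state.get("research_attempts", 0)
    return {
        "error": (
            f"Research failed quality checks after {attempts} attempts "
            f"for topic '{state.get('topic', '?')}'. No guide was generated."
        )
    }
```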

## 🔀 The Edges

### Simple edges (always go to next node)

```text
researcher    → research_quality
summarizer    → teacher
teacher       → guide_quality
quiz          → END
error         → END
```

### Conditional edges (route based on state)

```text
research_quality → route_research()
  passed=True              → summarizer
  passed=False, attempts<3 → researcher  (retry)
  passed=False, attempts≥3 → error

guide_quality → route_guide()
  passed=True              → quiz
  passed=False, attempts<2 → teacher    (retry)
  passed=False, attempts≥2 → quiz       (accept what we have)
```
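The guide routing table above translates directly into a plain function, in the same style as the `route_research` example later in this README. A sketch, assuming the state keys are named `guide_passed` and `guide_attempts`:

```python
# route_guide as a plain function: returns the name of the next node.
def route_guide(state: dict) -> str:
    if state["guide_passed"]:
        return "quiz"
    if state["guide_attempts"] >= 2:
        return "quiz"   # accept what we have
    return "teacher"    # retry with stronger instructions
```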

## 📊 Quality Gates

### Research Quality Gate

Checks all of the following before continuing:

  • Content length > 500 characters
  • Contains at least 1 URL
  • Contains at least 5 bullet points
  • Topic name mentioned at least once

On fail: Retries with a different search query (max 3 attempts)

### Guide Quality Gate

Checks all of the following before continuing:

  • Word count > 500 words
  • At least 4 section headers (##)
  • At least 10 bullet points
  • Contains "next steps" or "takeaway"

On fail: Retries with stronger length instructions (max 2 attempts)
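Because these gates are deterministic, each is just a handful of string checks. A sketch of the guide gate, implementing exactly the four criteria listed above (the real `guide_quality_node` may differ in detail):

```python
import re

# Deterministic guide gate: all four checks from the list above must pass.
def check_guide(guide: str) -> bool:
    checks = [
        len(guide.split()) > 500,                           # word count > 500
        len(re.findall(r"^## ", guide, re.M)) >= 4,         # >= 4 section headers
        len(re.findall(r"^\s*[-•*] ", guide, re.M)) >= 10,  # >= 10 bullet points
        "next steps" in guide.lower() or "takeaway" in guide.lower(),
    ]
    return all(checks)
```

Keeping the gates LLM-free makes them cheap, fast, and perfectly reproducible.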


## 🔧 Model Agnostic Config

Switch any model by editing `.env` only. Zero code changes required.

```python
# config.py
import os

from langchain.chat_models import init_chat_model

def get_llm(role: str):
    model = os.getenv(
        f"LLM_{role.upper()}",
        DEFAULTS.get(role)  # per-role fallback models defined in config.py
    )
    return init_chat_model(model)
```

Example `.env` configurations:

```env
# Development — use free Groq for everything
LLM_RESEARCHER=groq/llama-3.3-70b-versatile
LLM_SUMMARIZER=groq/llama-3.3-70b-versatile
LLM_TEACHER=groq/llama-3.3-70b-versatile
LLM_QUIZ=groq/llama-3.3-70b-versatile

# Production — best quality models
LLM_RESEARCHER=anthropic/claude-haiku-4-5-20251001
LLM_SUMMARIZER=groq/llama-3.3-70b-versatile
LLM_TEACHER=anthropic/claude-sonnet-4-6
LLM_QUIZ=anthropic/claude-sonnet-4-6
```

Any provider supported by LangChain's `init_chat_model` works — Anthropic, OpenAI, Groq, Google, Mistral, and more.


## 💰 Cost Per Run

| Node | Model | Approx. Cost |
|------|-------|--------------|
| Researcher | Claude Haiku | ~$0.001 |
| Summarizer | Groq Llama 3.3 | $0.000 |
| Teacher | Claude Sonnet | ~$0.015 |
| Quiz | Claude Sonnet | ~$0.010 |
| **Total** | | **~$0.026** |

## 🔑 Getting API Keys

| Service | URL | Free Tier |
|---------|-----|-----------|
| Anthropic (Claude) | console.anthropic.com | Pay as you go |
| Groq | console.groq.com | 30,000 tokens/min free |
| Serper (Google Search) | serper.dev | 2,500 searches/month free |

## 🧠 How It Works — Key Concepts

### State travels through every node

```python
# Every node receives the full state dict
def researcher_node(state: AgentState) -> dict:
    topic = state["topic"]            # read
    notes = do_research(topic)
    return {"research_notes": notes}  # return only changes
```

### LangGraph merges updates

```python
# You return partial updates;
# LangGraph merges them into the full state:
current_state = {**current_state, **node_return_value}
```

### Conditional edges are just functions

```python
# Returns a string → LangGraph maps it to the next node
def route_research(state: AgentState) -> str:
    if state["research_passed"]:
        return "summarizer"
    if state["research_attempts"] >= 3:
        return "error"
    return "researcher"  # retry loop
```

### Streaming outputs every token live

```python
# .stream() yields chunks as they generate
for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```

## 🗺️ Roadmap

  • LangSmith integration for tracing and debugging
  • Persistent checkpointing (resume interrupted runs)
  • Human-in-the-loop approval before study guide saves
  • RAG mode — learn from your own documents
  • Web interface with live streaming
  • REST API with FastAPI
  • Cloud deployment

## 📚 What This Project Teaches

Building this from scratch gives you deep understanding of:

  • LangGraph StateGraph internals
  • How nodes read and write immutable state
  • Conditional routing and retry loop patterns
  • Direct LLM calls via LangChain messages
  • Model agnostic architecture with init_chat_model
  • Live streaming with .stream() and flush=True
  • Quality gate design for production agents
  • Graceful error handling with max retry limits

## 🔗 Related Project

This project is the pure-LangGraph evolution of `learning-agent` — the same domain rebuilt without framework abstractions to show what's happening under the hood.


## 📄 License

MIT License — free to use, modify and distribute.


## 🙏 Acknowledgements