Lead Intelligence Copilot

An event-driven, explainable Lead Intelligence system for wealth management advisors. Detects high-value prospects from financial events, scores them with a transparent weighted model, discovers warm-intro paths through a relationship graph, and generates advisor-ready outreach briefs — all accessible via a Claude-powered MCP chatbot and a Streamlit dashboard.

Core flow: Raw Event → Signal → Feature → Lead Score → Outreach


Screenshots

Claude Desktop (MCP Chatbot)

- Top 5 Leads · Score Breakdown
- Warm Intro Path · Outreach Brief
- Lead Comparison · IPO Event Search

Streamlit Dashboard

- Lead Scoring & Drill-down
- Relationship Graph & Outreach

1. Why This Design

Most lead scoring systems are black-box classifiers trained on click-through data. That approach fails in wealth management because:

  1. Conversion cycles are long (months to years) — supervised labels are sparse and noisy.
  2. Advisors need explanations, not just rankings. A $50M AUM pitch requires a human-readable "why now."
  3. The signal is the event, not the behavior. An IPO or a liquidity event is worth 10,000 website visits.

So this system is built as a decision system, not a model:

  • Deterministic, explainable scoring — weighted components, no opaque ML.
  • Signal layer — normalizes heterogeneous events into a common schema.
  • Relationship graph — turns cold leads into warm intros.
  • Claude as a copilot, not a classifier — it orchestrates tools and explains reasoning in natural language.

2. Architecture

```
┌─────────────────────┐
│  Data Ingestion     │  SEC filings, press releases, LinkedIn,
│  (src/ingestion.py) │  internal CRM, referrals
└──────────┬──────────┘
           │ raw events
           ▼
┌─────────────────────┐
│  Signal Processing  │  normalize → Signal schema
│  (src/signals.py)   │  (event_type, recency, est_liquidity, geo, confidence)
└──────────┬──────────┘
           │ signals
           ▼
┌─────────────────────┐
│  Feature Engineering│  per-person feature vector (all 0–1 normalized)
│  (src/features.py)  │  (recency, liquidity, net_worth, relationship, …)
└──────────┬──────────┘
           │ features
           ▼
┌─────────────────────┐
│  Scoring Engine     │  explainable weighted score
│  (src/scoring.py)   │  → lead_score, priority_rank, reason[]
└──────────┬──────────┘
           │ scored leads
           ▼
┌─────────────────────┐
│  Relationship Graph │  warm-intro path discovery (Dijkstra)
│  (src/graph.py)     │  (same company / school / LinkedIn / referral)
└──────────┬──────────┘
           │ enriched leads
           ▼
┌─────────────────────┐
│  Recommendation     │  advisor assignment + outreach brief
│  (src/recommendation.py)
└──────────┬──────────┘
           │
     ┌─────┴─────┐
     ▼           ▼
┌──────────┐ ┌──────────────┐
│ Streamlit│ │ MCP Server   │
│ Dashboard│ │ + Claude     │
│ (app/)   │ │ Desktop      │
└──────────┘ └──────────────┘
```

Both interfaces share the exact same backend (src/pipeline.py). The numbers you see in Streamlit are identical to what Claude returns through MCP — single source of truth.


3. Tech Stack

| Layer | Technology |
|---|---|
| Data schemas | Pydantic v2 |
| Scoring engine | Pure Python (weighted sum + reason generation) |
| Relationship graph | NetworkX (Dijkstra shortest path) |
| Feature engineering | NumPy, math (log-scale, exponential decay) |
| Dashboard | Streamlit + Plotly |
| Chatbot | MCP (FastMCP) + Claude Desktop |
| Data | Pandas, JSON (synthetic mock dataset) |
| Tests | pytest |

4. Quick Start

Prerequisites

  • Python 3.10+
  • (Optional) Claude Desktop app for the chatbot interface

Installation

```bash
git clone https://github.com/xialGuri/Lead-Intelligence-System.git
cd Lead-Intelligence-System

pip install -r requirements.txt
```

Generate mock data

```bash
PYTHONPATH=. python -m data.generate_mock_data
```

This creates data/persons.json, data/events.json, and data/relationships.json — 11 persons, 9 events, 7 relationships.

Run the Streamlit dashboard

```bash
streamlit run app/streamlit_app.py
```

Opens at http://localhost:8501.

Run the console demo

```bash
PYTHONPATH=. python -m demo.demo_scenario
```

Prints the full pipeline walkthrough: Top 5 leads, score breakdown, warm-intro path, and outreach brief.

Run tests

```bash
PYTHONPATH=. python -m pytest tests/ -v
```
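Invariant tests for a scorer like this might look as follows (hypothetical examples using the weights from the scoring model, not the actual contents of `tests/test_scoring.py`):

```python
import math

# Weights from the scoring model (section 8)
WEIGHTS = {
    "liquidity_score": 0.30,
    "recency_score": 0.25,
    "relationship_score": 0.20,
    "signal_confidence": 0.15,
    "seniority_score": 0.10,
}


def test_weights_sum_to_one():
    # Keeps lead_score in [0, 1] given 0-1 normalized features
    assert math.isclose(sum(WEIGHTS.values()), 1.0)


def test_score_is_bounded():
    # Even with every feature maxed out, the score cannot exceed 1.0
    max_score = sum(w * 1.0 for w in WEIGHTS.values())
    assert 0.0 <= max_score <= 1.0
```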

5. Streamlit Dashboard

The dashboard is a transparency layer — it lets stakeholders (compliance, management, data scientists) visually verify the pipeline's outputs.

Features

| Section | What it shows |
|---|---|
| Top-K leads table | Ranked leads with score progress bars, event type, warm-intro status |
| Score breakdown bar chart | Per-component contribution (liquidity, recency, relationship, confidence, seniority) |
| Feature radar chart | Normalized 0–1 feature vector shape per lead |
| Relationship graph | NetworkX spring layout with warm-intro path highlighted in red |
| Outreach brief panel | Headline, why-now, talking points, suggested channel, draft message |
| Pipeline inspector | Expandable tabs showing raw events → signals → features at each stage |

Interactive controls (sidebar)

  • As-of date — change the reference date to see how recency decay affects scores over time
  • Top-K slider — display 3 to 10 leads

6. MCP + Claude Desktop (Chatbot)

The MCP server exposes the same pipeline as a set of tools that Claude Desktop can call. This turns Claude into an advisor copilot with zero additional UI development.

Setup

  1. Make sure mock data is generated (see Quick Start).

  2. Copy the config to Claude Desktop:

     ```bash
     cp mcp_server/claude_desktop_config.json \
        "$HOME/Library/Application Support/Claude/claude_desktop_config.json"
     ```

  3. Restart Claude Desktop (⌘Q → reopen).

  4. Verify: click the 🔌 icon in the input bar → lead-intelligence should show as running with 7 tools.

Example conversation

```
You:    Show me this week's top 5 leads.
Claude: [calls top_leads_this_week] → ranked list with scores and reasons

You:    Why is #1 ranked so high?
Claude: [calls explain_lead_score] → component breakdown

You:    Find a warm intro path to p_001.
Claude: [calls find_warm_intro] → Sarah Chen → Michael Torres (strength 1.0)

You:    Draft an outreach brief for p_001.
Claude: [calls generate_outreach_brief] → full brief with talking points

You:    Compare Michael Torres and Jennifer Walsh.
Claude: [calls compare_leads] → side-by-side with recommendation
```

7. MCP Tools Reference

Information retrieval

| Tool | Input | What it does |
|---|---|---|
| `top_leads_this_week(k)` | `k: int = 5` | Returns top-k scored leads |
| `search_events(event_type, days)` | `event_type: str`, `days: int = 30` | Filters raw events by type and recency |
| `score_lead(person_id)` | `person_id: str` | Returns full ScoredLead record for one person |

Decision support

| Tool | Input | What it does |
|---|---|---|
| `explain_lead_score(person_id)` | `person_id: str` | Per-component score breakdown + human-readable reasons |
| `find_warm_intro(person_id)` | `person_id: str` | Best warm-intro path from any advisor (Dijkstra, max 3 hops) |
| `generate_outreach_brief(person_id)` | `person_id: str` | Full brief: headline, why-now, talking points, draft message |
| `compare_leads(person_ids)` | `person_ids: list[str]` | Side-by-side comparison with score contributions |

Resources (read-only, URI-based)

| Resource URI | Returns |
|---|---|
| `lead://profile/{person_id}` | Person profile (name, company, schools, net worth) |
| `lead://timeline/{person_id}` | All events for this person, most recent first |
| `lead://graph/{person_id}` | Immediate neighbors in the relationship graph |

8. Scoring Model (Explainable)

```
lead_score = 0.30 × liquidity_score       # Is there money in motion?
           + 0.25 × recency_score         # How fresh is the signal?
           + 0.20 × relationship_score    # Can we get a warm intro?
           + 0.15 × signal_confidence     # How much do we trust the source?
           + 0.10 × seniority_score       # Tiebreaker — already captured upstream
```
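In plain Python this weighted sum reduces to a few lines (a minimal sketch — the dict-based interface is illustrative, not the actual `src/scoring.py` API):

```python
WEIGHTS = {
    "liquidity_score": 0.30,
    "recency_score": 0.25,
    "relationship_score": 0.20,
    "signal_confidence": 0.15,
    "seniority_score": 0.10,
}


def score_lead(features: dict[str, float]) -> tuple[float, dict[str, float]]:
    """Weighted sum over 0-1 features; returns (lead_score, per-component contributions)."""
    contributions = {name: round(w * features[name], 3) for name, w in WEIGHTS.items()}
    lead_score = round(sum(w * features[name] for name, w in WEIGHTS.items()), 3)
    return lead_score, contributions
```

Because every feature is normalized to [0, 1] and the weights sum to 1.0, the lead score is itself bounded to [0, 1], and each contribution can be reported verbatim in the score breakdown.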

Why these weights?

| Weight | Component | Rationale |
|---|---|---|
| 30% | Liquidity | No money in motion → no need for an advisor. This is the primary action trigger. |
| 25% | Recency | Advisors who engage within 2 weeks of a liquidity event win the relationship ~3× more often. Exponential decay with 30-day half-life. |
| 20% | Relationship | Warm intros convert ~3× better than cold outreach. Directly encodes conversion economics. |
| 15% | Confidence | SEC filings (0.98) are more trustworthy than LinkedIn (0.65). Acts as a discount factor on noisy signals. |
| 10% | Seniority | Lowest weight because seniority is already correlated with liquidity and relationship — higher weight would double-count. |

Key design decisions

  • No ML by design. Labels are sparse (conversions take months). Advisors and compliance need to audit every ranking. A weighted sum they can recompute on a whiteboard beats a black-box with higher AUC.
  • Weights are business-owned hyperparameters, not learned parameters. Advisor teams can tune them directly.
  • Future evolution: collect advisor feedback (accept/dismiss) → retrain weights via logistic regression per segment, while keeping the weighted-sum structure for explainability.

Feature normalization

| Feature | Method | Why |
|---|---|---|
| Recency | `exp(-ln(2) × days / 30)` | Exponential decay captures the "cold window" — first 2 weeks matter most |
| Liquidity | `log10(USD) / 8`, capped at 1.0 | Log-scale because liquidity spans $100K–$1B (5+ orders of magnitude) |
| Seniority | Ordinal mapping (C-suite=1.0, VP=0.65, …) | Simple, auditable, no learned embeddings |
| Relationship | Max edge strength to any advisor | Promotes the single strongest connection |
| Confidence | Source-based lookup (SEC=0.98, LinkedIn=0.65) | Calibrated from historical source precision |
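The recency and liquidity formulas can be sketched directly from the table (a hedged sketch, not the actual `src/features.py` code; the seniority tiers below C-suite and VP are assumed values for illustration):

```python
import math


def recency_score(days_since_event: float, half_life_days: float = 30.0) -> float:
    """Exponential decay: 1.0 on day 0, halving every half_life_days (0.5 on day 30)."""
    return math.exp(-math.log(2) * days_since_event / half_life_days)


def liquidity_score(est_liquidity_usd: float) -> float:
    """log10 scale capped at 1.0 — saturates at $100M (10^8 USD)."""
    if est_liquidity_usd <= 0:
        return 0.0
    return min(math.log10(est_liquidity_usd) / 8.0, 1.0)


# C-suite and VP values come from the table; lower tiers are assumptions.
SENIORITY = {"c_suite": 1.0, "vp": 0.65, "director": 0.45, "other": 0.2}
```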

Example output

```json
{
  "lead_score": 0.965,
  "priority_rank": 1,
  "reasons": [
    "Company IPO 12 days ago → est. $45M liquidity event",
    "Fresh signal: referral only 1 day ago — inside the 2-week engagement window",
    "Strong warm-intro path available (connection strength 1.00)",
    "C-suite at Nimbus Robotics"
  ],
  "contributions": {
    "liquidity_score": 0.287,
    "recency_score": 0.244,
    "relationship_score": 0.200,
    "signal_confidence": 0.134,
    "seniority_score": 0.100
  }
}
```

9. Relationship Graph

Built with NetworkX. Edges represent real-world connections between advisors and prospects.

Edge types

| Type | Example | Typical strength |
|---|---|---|
| `referral` | Direct referral from advisor | 0.9–1.0 |
| `same_company` | Worked together at Goldman Sachs | 0.8–0.95 |
| `same_school` | Stanford alumni network | 0.6–0.7 |
| `linkedin` | Connected on LinkedIn | 0.3–0.5 |

Warm-intro discovery algorithm

  • Convert edge strength → distance via -log(strength)
  • Run Dijkstra from every advisor to the target lead
  • Select the path that maximizes the product of edge strengths
  • Limit to 3 hops (beyond that, "warm" is no longer warm)
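The steps above can be sketched with NetworkX (function name and graph attributes are assumptions, not the actual `src/graph.py` API). The `-log(strength)` transform makes Dijkstra's additive shortest path equivalent to maximizing the product of edge strengths:

```python
import math

import networkx as nx


def find_warm_intro(g: nx.Graph, advisors: list[str], target: str, max_hops: int = 3):
    """Best warm-intro path: Dijkstra on -log(strength) maximizes the strength product."""
    best_path, best_strength = None, 0.0
    for advisor in advisors:
        try:
            path = nx.dijkstra_path(
                g, advisor, target,
                # Edge strengths are in (0, 1], so -log(strength) is a valid
                # non-negative Dijkstra weight.
                weight=lambda u, v, d: -math.log(d["strength"]),
            )
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            continue
        if len(path) - 1 > max_hops:  # beyond 3 hops, "warm" is no longer warm
            continue
        strength = math.prod(g[u][v]["strength"] for u, v in zip(path, path[1:]))
        if strength > best_strength:
            best_path, best_strength = path, strength
    return best_path, best_strength
```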

Why warm leads are promoted

Warm intros convert ~3× better than cold outreach. Rather than using a separate heuristic, relationship_score is directly in the weighted sum (20%), so warm leads rise in the ranking automatically. This is more honest than post-hoc re-ranking.


10. Repository Layout

```
lead-intelligence-copilot/
├── README.md
├── requirements.txt
├── run_mcp_server.sh              # Claude Desktop launcher script
│
├── app/
│   └── streamlit_app.py           # Streamlit dashboard
│
├── data/
│   ├── generate_mock_data.py      # Synthetic dataset generator
│   ├── persons.json               # 11 persons (3 advisors + 8 prospects)
│   ├── events.json                # 9 financial events
│   └── relationships.json         # 7 relationship edges
│
├── src/
│   ├── __init__.py
│   ├── schemas.py                 # Pydantic models (Event, Signal, Feature, Lead)
│   ├── ingestion.py               # Data loaders
│   ├── signals.py                 # Event → Signal (liquidity imputation, confidence)
│   ├── features.py                # Signal → Feature (decay, log-scale, normalization)
│   ├── scoring.py                 # Explainable weighted scoring + reason generation
│   ├── graph.py                   # Relationship graph + warm-intro Dijkstra
│   ├── recommendation.py          # Outreach brief generation
│   └── pipeline.py                # End-to-end orchestration
│
├── mcp_server/
│   ├── __init__.py
│   ├── server.py                  # FastMCP server (7 tools + 3 resources)
│   └── claude_desktop_config.json # Config for Claude Desktop
│
├── demo/
│   └── demo_scenario.py           # Console walkthrough for interviews
│
└── tests/
    └── test_scoring.py            # 5 invariant tests
```

11. Demo Scenario (Interview)

demo/demo_scenario.py walks through the full pipeline in 6 steps:

| Step | What happens | What it proves |
|---|---|---|
| 0. Pipeline run | Load 11 persons, 9 events, 7 relationships | Data ingestion works |
| 1. Scoring weights | Print the 5-component weight table | Weights are transparent and sum to 1.0 |
| 2. Top 5 leads | Ranked list with scores and reasons | End-to-end scoring works |
| 3. Explain #1 | Per-component breakdown for Michael Torres | Explainability is real, not just a label |
| 4. Warm intro | Path: Sarah Chen → Michael Torres (strength 1.0) | Graph algorithm finds the strongest path |
| 5. Outreach brief | Headline, why-now, talking points, draft message | System produces actionable output, not just scores |
| 6. Interview talking points | Why no ML, why signal layer, where Claude fits | Design rationale |

12. Design Principles

| Principle | Implementation |
|---|---|
| Explainability over accuracy | Weighted sum with named components. Every score decomposes into 5 auditable numbers. |
| Signal ≠ Event | Raw events are noisy and source-specific. Signals are normalized, decayed, and confidence-weighted. |
| Warm > Cold | Relationship score is in the weighted sum (20%), so warm leads rise automatically. |
| Claude is an orchestrator, not an oracle | All numerics are deterministic Python. Claude selects tools and explains results. |
| Single source of truth | Both Streamlit and MCP call the same pipeline.py. Numbers never diverge. |
| Business-owned weights | Scoring weights are configurable hyperparameters, not learned. The business tunes them. |

13. Future Improvements

  • Feedback loop: Collect advisor accept/dismiss signals → retrain segment-specific weights via logistic regression / contextual bandits.
  • More signal sources: Form 4 insider sales, probate filings, 13F institutional holdings.
  • Graph scale: Replace NetworkX with Neo4j / Neptune when the graph exceeds ~100K nodes.
  • Real-time ingestion: Replace JSON file loads with a streaming pipeline (Kafka / Pub/Sub) for live event detection.
  • Multi-tenant: Per-firm advisor coverage maps, custom weight profiles.
