An event-driven, explainable Lead Intelligence system for wealth management advisors. Detects high-value prospects from financial events, scores them with a transparent weighted model, discovers warm-intro paths through a relationship graph, and generates advisor-ready outreach briefs — all accessible via a Claude-powered MCP chatbot and a Streamlit dashboard.
Core flow:
`Raw Event → Signal → Feature → Lead Score → Outreach`
| Top 5 Leads | Score Explanation |
|---|---|
| ![]() | ![]() |

| Warm Intro Path | Outreach Brief |
|---|---|
| ![]() | ![]() |

| Lead Comparison | IPO Event Search |
|---|---|
| ![]() | ![]() |

| Lead Scoring & Drill-down | Relationship Graph & Outreach |
|---|---|
| ![]() | ![]() |
- Why This Design
- Architecture
- Tech Stack
- Quick Start
- Streamlit Dashboard
- MCP + Claude Desktop (Chatbot)
- MCP Tools Reference
- Scoring Model
- Relationship Graph
- Repository Layout
- Demo Scenario (Interview)
- Design Principles
- Future Improvements
## Why This Design

Most lead scoring systems are black-box classifiers trained on click-through data. That approach fails in wealth management because:
- Conversion cycles are long (months to years) — supervised labels are sparse and noisy.
- Advisors need explanations, not just rankings. A $50M AUM pitch requires a human-readable "why now."
- The signal is the event, not the behavior. An IPO or a liquidity event is worth 10,000 website visits.
So this system is built as a decision system, not a model:
- Deterministic, explainable scoring — weighted components, no opaque ML.
- Signal layer — normalizes heterogeneous events into a common schema.
- Relationship graph — turns cold leads into warm intros.
- Claude as a copilot, not a classifier — it orchestrates tools and explains reasoning in natural language.
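The common schema the signal layer targets can be sketched as follows. This is a hypothetical stand-in: the repo's actual models live in `src/schemas.py` and use Pydantic v2; the field names here are taken from the architecture diagram, and the `normalize` mapping is illustrative.

```python
from dataclasses import dataclass
from datetime import date

@dataclass(frozen=True)
class Signal:
    """Illustrative Signal schema (actual definitions: src/schemas.py)."""
    person_id: str
    event_type: str            # e.g. "ipo", "acquisition", "referral"
    event_date: date           # drives recency decay downstream
    est_liquidity_usd: float   # imputed when the source omits it
    geo: str
    confidence: float          # source trust, e.g. SEC filing ~0.98

def normalize(raw: dict) -> Signal:
    """Map one heterogeneous raw event into the common Signal schema."""
    return Signal(
        person_id=raw["person_id"],
        event_type=raw["type"].lower(),
        event_date=date.fromisoformat(raw["date"]),
        est_liquidity_usd=float(raw.get("liquidity_usd", 0.0)),
        geo=raw.get("geo", "unknown"),
        # source-based confidence lookup; values mirror the scoring section
        confidence={"sec_filing": 0.98, "linkedin": 0.65}.get(raw["source"], 0.5),
    )
```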
## Architecture

```
┌─────────────────────┐
│ Data Ingestion      │ SEC filings, press releases, LinkedIn,
│ (src/ingestion.py)  │ internal CRM, referrals
└──────────┬──────────┘
           │ raw events
           ▼
┌─────────────────────┐
│ Signal Processing   │ normalize → Signal schema
│ (src/signals.py)    │ (event_type, recency, est_liquidity, geo, confidence)
└──────────┬──────────┘
           │ signals
           ▼
┌─────────────────────┐
│ Feature Engineering │ per-person feature vector (all 0–1 normalized)
│ (src/features.py)   │ (recency, liquidity, net_worth, relationship, …)
└──────────┬──────────┘
           │ features
           ▼
┌─────────────────────┐
│ Scoring Engine      │ explainable weighted score
│ (src/scoring.py)    │ → lead_score, priority_rank, reason[]
└──────────┬──────────┘
           │ scored leads
           ▼
┌─────────────────────┐
│ Relationship Graph  │ warm-intro path discovery (Dijkstra)
│ (src/graph.py)      │ (same company / school / LinkedIn / referral)
└──────────┬──────────┘
           │ enriched leads
           ▼
┌─────────────────────────┐
│ Recommendation          │ advisor assignment + outreach brief
│ (src/recommendation.py) │
└──────────┬──────────────┘
           │
     ┌─────┴─────┐
     ▼           ▼
┌──────────┐  ┌──────────────┐
│ Streamlit│  │ MCP Server   │
│ Dashboard│  │ + Claude     │
│ (app/)   │  │ Desktop      │
└──────────┘  └──────────────┘
```

Both interfaces share the exact same backend (`src/pipeline.py`). The numbers you see in Streamlit are identical to what Claude returns through MCP — single source of truth.
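The stage chaining above can be sketched end to end. Everything below (field names, the two toy weights, the function names) is illustrative, not the actual API of the modules in `src/`:

```python
import math

def to_signal(event: dict) -> dict:
    """signals.py stage: normalize a raw event (toy schema)."""
    return {"person_id": event["person_id"],
            "days_ago": event["days_ago"],
            "liquidity": event["liquidity"]}

def to_features(signal: dict) -> dict:
    """features.py stage: 0-1 normalized features (30-day half-life decay,
    log10/8 liquidity scaling, per the scoring section of this README)."""
    return {"person_id": signal["person_id"],
            "recency": math.exp(-math.log(2) * signal["days_ago"] / 30),
            "liquidity": min(math.log10(max(signal["liquidity"], 1.0)) / 8, 1.0)}

def score(features: dict) -> dict:
    """scoring.py stage: weighted sum (weights here are toy values)."""
    return {"person_id": features["person_id"],
            "lead_score": 0.55 * features["liquidity"] + 0.45 * features["recency"]}

def run_pipeline(raw_events: list[dict]) -> list[dict]:
    """Raw Event → Signal → Feature → Lead Score, one shared path."""
    scored = [score(to_features(to_signal(e))) for e in raw_events]
    return sorted(scored, key=lambda s: s["lead_score"], reverse=True)
```

A fresh, large liquidity event outranks a stale, small one, which is exactly the behavior both frontends observe because they call the same pipeline.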
## Tech Stack

| Layer | Technology |
|---|---|
| Data schemas | Pydantic v2 |
| Scoring engine | Pure Python (weighted sum + reason generation) |
| Relationship graph | NetworkX (Dijkstra shortest path) |
| Feature engineering | NumPy, math (log-scale, exponential decay) |
| Dashboard | Streamlit + Plotly |
| Chatbot | MCP (FastMCP) + Claude Desktop |
| Data | Pandas, JSON (synthetic mock dataset) |
| Tests | pytest |
## Quick Start

Prerequisites:

- Python 3.10+
- (Optional) Claude Desktop app for the chatbot interface

```bash
git clone https://github.com/xialGuri/Lead-Intelligence-System.git
cd Lead-Intelligence-System
pip install -r requirements.txt
```

Generate the mock dataset:

```bash
PYTHONPATH=. python -m data.generate_mock_data
```

This creates `data/persons.json`, `data/events.json`, and `data/relationships.json` — 11 persons, 9 events, 7 relationships.

Run the dashboard:

```bash
streamlit run app/streamlit_app.py
```

Opens at http://localhost:8501.

Run the demo scenario:

```bash
PYTHONPATH=. python -m demo.demo_scenario
```

Prints the full pipeline walkthrough: Top 5 leads, score breakdown, warm-intro path, and outreach brief.

Run the tests:

```bash
PYTHONPATH=. python -m pytest tests/ -v
```

## Streamlit Dashboard

The dashboard is a transparency layer — it lets stakeholders (compliance, management, data scientists) visually verify the pipeline's outputs.
| Section | What it shows |
|---|---|
| Top-K leads table | Ranked leads with score progress bars, event type, warm-intro status |
| Score breakdown bar chart | Per-component contribution (liquidity, recency, relationship, confidence, seniority) |
| Feature radar chart | Normalized 0–1 feature vector shape per lead |
| Relationship graph | NetworkX spring layout with warm-intro path highlighted in red |
| Outreach brief panel | Headline, why-now, talking points, suggested channel, draft message |
| Pipeline inspector | Expandable tabs showing raw events → signals → features at each stage |
Interactive controls:

- As-of date — change the reference date to see how recency decay affects scores over time
- Top-K slider — display 3 to 10 leads
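The as-of date control exercises the recency decay directly. A sketch of that decay, assuming the 30-day half-life stated in the scoring model (the actual implementation lives in `src/features.py`):

```python
import math
from datetime import date

def recency_score(event_date: date, as_of: date, half_life_days: int = 30) -> float:
    """Exponential decay with a 30-day half-life, as described in this README."""
    days = (as_of - event_date).days
    return math.exp(-math.log(2) * max(days, 0) / half_life_days)

event = date(2024, 6, 1)
# Moving the as-of date forward halves the recency component every 30 days:
recency_score(event, date(2024, 6, 1))   # ~1.0
recency_score(event, date(2024, 7, 1))   # ~0.5
recency_score(event, date(2024, 8, 30))  # ~0.125
```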
## MCP + Claude Desktop (Chatbot)

The MCP server exposes the same pipeline as a set of tools that Claude Desktop can call. This turns Claude into an advisor copilot with zero additional UI development.
1. Make sure mock data is generated (see Quick Start).

2. Copy the config to Claude Desktop:

   ```bash
   cp mcp_server/claude_desktop_config.json \
      "$HOME/Library/Application Support/Claude/claude_desktop_config.json"
   ```

3. Restart Claude Desktop (⌘Q → reopen).

4. Verify: click the 🔌 icon in the input bar → `lead-intelligence` should show as running with 7 tools.
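For reference, Claude Desktop reads MCP servers from an `mcpServers` map; the copied file contains an entry roughly of this shape (the command path below is illustrative — the repo's actual `mcp_server/claude_desktop_config.json` is authoritative):

```json
{
  "mcpServers": {
    "lead-intelligence": {
      "command": "/absolute/path/to/run_mcp_server.sh"
    }
  }
}
```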
Example conversation:

```
You:    Show me this week's top 5 leads.
Claude: [calls top_leads_this_week] → ranked list with scores and reasons

You:    Why is #1 ranked so high?
Claude: [calls explain_lead_score] → component breakdown

You:    Find a warm intro path to p_001.
Claude: [calls find_warm_intro] → Sarah Chen → Michael Torres (strength 1.0)

You:    Draft an outreach brief for p_001.
Claude: [calls generate_outreach_brief] → full brief with talking points

You:    Compare Michael Torres and Jennifer Walsh.
Claude: [calls compare_leads] → side-by-side with recommendation
```
## MCP Tools Reference

| Tool | Input | What it does |
|---|---|---|
| `top_leads_this_week(k)` | `k: int = 5` | Returns top-k scored leads |
| `search_events(event_type, days)` | `event_type: str, days: int = 30` | Filters raw events by type and recency |
| `score_lead(person_id)` | `person_id: str` | Returns the full `ScoredLead` record for one person |
| Tool | Input | What it does |
|---|---|---|
| `explain_lead_score(person_id)` | `person_id: str` | Per-component score breakdown + human-readable reasons |
| `find_warm_intro(person_id)` | `person_id: str` | Best warm-intro path from any advisor (Dijkstra, max 3 hops) |
| `generate_outreach_brief(person_id)` | `person_id: str` | Full brief: headline, why-now, talking points, draft message |
| `compare_leads(person_ids)` | `person_ids: list[str]` | Side-by-side comparison with score contributions |
| Resource URI | Returns |
|---|---|
| `lead://profile/{person_id}` | Person profile (name, company, schools, net worth) |
| `lead://timeline/{person_id}` | All events for this person, most recent first |
| `lead://graph/{person_id}` | Immediate neighbors in the relationship graph |
## Scoring Model

```
lead_score = 0.30 × liquidity_score     # Is there money in motion?
           + 0.25 × recency_score       # How fresh is the signal?
           + 0.20 × relationship_score  # Can we get a warm intro?
           + 0.15 × signal_confidence   # How much do we trust the source?
           + 0.10 × seniority_score     # Tiebreaker — already captured upstream
```
| Weight | Component | Rationale |
|---|---|---|
| 30% | Liquidity | No money in motion → no need for an advisor. This is the primary action trigger. |
| 25% | Recency | Advisors who engage within 2 weeks of a liquidity event win the relationship ~3× more often. Exponential decay with 30-day half-life. |
| 20% | Relationship | Warm intros convert ~3× better than cold outreach. Directly encodes conversion economics. |
| 15% | Confidence | SEC filings (0.98) are more trustworthy than LinkedIn (0.65). Acts as a discount factor on noisy signals. |
| 10% | Seniority | Lowest weight because seniority is already correlated with liquidity and relationship — higher weight would double-count. |
- No ML by design. Labels are sparse (conversions take months). Advisors and compliance need to audit every ranking. A weighted sum they can recompute on a whiteboard beats a black-box with higher AUC.
- Weights are business-owned hyperparameters, not learned parameters. Advisor teams can tune them directly.
- Future evolution: collect advisor feedback (accept/dismiss) → retrain weights via logistic regression per segment, while keeping the weighted-sum structure for explainability.
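The "recompute on a whiteboard" claim can be made concrete. A minimal sketch of the weighted scorer with reason generation follows; the weights come from the table above, but the function shape and reason wording are illustrative, not the actual `src/scoring.py` implementation:

```python
# Business-owned weights from the table above (must sum to 1.0).
WEIGHTS = {
    "liquidity_score": 0.30,
    "recency_score": 0.25,
    "relationship_score": 0.20,
    "signal_confidence": 0.15,
    "seniority_score": 0.10,
}

def score_lead(features: dict) -> dict:
    """Weighted sum plus per-component contributions an advisor can audit."""
    contributions = {k: round(w * features[k], 3) for k, w in WEIGHTS.items()}
    # Reasons ordered by contribution, largest first, so the top driver leads.
    reasons = [
        f"{k} contributes {v:.3f} ({WEIGHTS[k]:.0%} weight × {features[k]:.2f})"
        for k, v in sorted(contributions.items(), key=lambda kv: -kv[1])
    ]
    return {"lead_score": round(sum(contributions.values()), 3),
            "contributions": contributions,
            "reasons": reasons}
```

A feature vector like (liquidity 0.956, recency 0.977, relationship 1.00, confidence 0.893, seniority 1.00) lands near 0.965, the same shape as the example output shown later in this README.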
| Feature | Method | Why |
|---|---|---|
| Recency | `exp(-ln(2) × days / 30)` | Exponential decay captures the "cold window" — first 2 weeks matter most |
| Liquidity | `log10(USD) / 8`, capped at 1.0 | Log-scale because liquidity spans $100K–$1B (4 orders of magnitude) |
| Seniority | Ordinal mapping (C-suite = 1.0, VP = 0.65, …) | Simple, auditable, no learned embeddings |
| Relationship | Max edge strength to any advisor | Promotes the single strongest connection |
| Confidence | Source-based lookup (SEC = 0.98, LinkedIn = 0.65) | Calibrated from historical source precision |
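The three numeric transforms in the table can be sketched directly; the decay constant and log scaling follow the documented choices, while the seniority tiers beyond C-suite and VP are illustrative assumptions:

```python
import math

def recency(days_since_event: float) -> float:
    """Exponential decay with a 30-day half-life."""
    return math.exp(-math.log(2) * days_since_event / 30)

def liquidity(usd: float) -> float:
    """Log-scale normalization: log10(USD)/8, so $100M and above maps to 1.0."""
    return min(math.log10(max(usd, 1.0)) / 8, 1.0)

def seniority(title: str) -> float:
    """Ordinal mapping; tiers below VP are hypothetical examples."""
    return {"c_suite": 1.0, "vp": 0.65, "director": 0.5}.get(title, 0.3)
```

For example, a 30-day-old event scores exactly 0.5 on recency, and a $100K event scores 5/8 = 0.625 on liquidity.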
Example scored-lead output:

```json
{
  "lead_score": 0.965,
  "priority_rank": 1,
  "reasons": [
    "Company IPO 12 days ago → est. $45M liquidity event",
    "Fresh signal: referral only 1 day ago — inside the 2-week engagement window",
    "Strong warm-intro path available (connection strength 1.00)",
    "C-suite at Nimbus Robotics"
  ],
  "contributions": {
    "liquidity_score": 0.287,
    "recency_score": 0.244,
    "relationship_score": 0.200,
    "signal_confidence": 0.134,
    "seniority_score": 0.100
  }
}
```

The five contributions sum exactly to the lead score: 0.287 + 0.244 + 0.200 + 0.134 + 0.100 = 0.965.

## Relationship Graph

Built with NetworkX. Edges represent real-world connections between advisors and prospects.
| Type | Example | Typical strength |
|---|---|---|
| `referral` | Direct referral from advisor | 0.9–1.0 |
| `same_company` | Worked together at Goldman Sachs | 0.8–0.95 |
| `same_school` | Stanford alumni network | 0.6–0.7 |
| `linkedin` | Connected on LinkedIn | 0.3–0.5 |
Warm-intro path discovery:

1. Convert edge strength → distance via `-log(strength)`
2. Run Dijkstra from every advisor to the target lead
3. Select the path that maximizes the product of edge strengths
4. Limit to 3 hops (beyond that, "warm" is no longer warm)
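The four steps above map naturally onto NetworkX. The key identity: maximizing the product of strengths is the same as minimizing the sum of `-log(strength)`, and since strengths are ≤ 1, those distances are non-negative, which Dijkstra requires. A sketch with an illustrative graph (node names and edges are made up; `src/graph.py` is the actual implementation):

```python
import math
import networkx as nx

# Toy graph: one advisor, two prospects, one intermediary.
G = nx.Graph()
G.add_edge("advisor_sarah", "p_001", strength=1.0)   # referral
G.add_edge("advisor_sarah", "p_007", strength=0.85)  # same_company
G.add_edge("p_007", "p_002", strength=0.6)           # same_school

def warm_intro(G, advisors, target, max_hops=3):
    """Best warm-intro path: highest product of edge strengths, ≤ max_hops."""
    neg_log = lambda u, v, d: -math.log(d["strength"])   # step 1
    best_path, best_strength = None, 0.0
    for advisor in advisors:
        try:
            path = nx.dijkstra_path(G, advisor, target, weight=neg_log)  # step 2
        except nx.NetworkXNoPath:
            continue
        if len(path) - 1 > max_hops:                     # step 4
            continue
        strength = math.prod(                            # step 3
            G[u][v]["strength"] for u, v in zip(path, path[1:]))
        if strength > best_strength:
            best_path, best_strength = path, strength
    return best_path, best_strength
```

For the toy graph, the best path to `p_002` runs through `p_007` with a combined strength of 0.85 × 0.6 = 0.51.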
Warm intros convert ~3× better than cold outreach. Rather than using a
separate heuristic, relationship_score is directly in the weighted
sum (20%), so warm leads rise in the ranking automatically. This is
more honest than post-hoc re-ranking.
## Repository Layout

```
lead-intelligence-copilot/
├── README.md
├── requirements.txt
├── run_mcp_server.sh              # Claude Desktop launcher script
│
├── app/
│   └── streamlit_app.py           # Streamlit dashboard
│
├── data/
│   ├── generate_mock_data.py      # Synthetic dataset generator
│   ├── persons.json               # 11 persons (3 advisors + 8 prospects)
│   ├── events.json                # 9 financial events
│   └── relationships.json         # 7 relationship edges
│
├── src/
│   ├── __init__.py
│   ├── schemas.py                 # Pydantic models (Event, Signal, Feature, Lead)
│   ├── ingestion.py               # Data loaders
│   ├── signals.py                 # Event → Signal (liquidity imputation, confidence)
│   ├── features.py                # Signal → Feature (decay, log-scale, normalization)
│   ├── scoring.py                 # Explainable weighted scoring + reason generation
│   ├── graph.py                   # Relationship graph + warm-intro Dijkstra
│   ├── recommendation.py          # Outreach brief generation
│   └── pipeline.py                # End-to-end orchestration
│
├── mcp_server/
│   ├── __init__.py
│   ├── server.py                  # FastMCP server (7 tools + 3 resources)
│   └── claude_desktop_config.json # Config for Claude Desktop
│
├── demo/
│   └── demo_scenario.py           # Console walkthrough for interviews
│
└── tests/
    └── test_scoring.py            # 5 invariant tests
```
## Demo Scenario (Interview)

`demo/demo_scenario.py` walks through the full pipeline in seven steps, numbered 0–6:
| Step | What happens | What it proves |
|---|---|---|
| 0. Pipeline run | Load 11 persons, 9 events, 7 relationships | Data ingestion works |
| 1. Scoring weights | Print the 5-component weight table | Weights are transparent and sum to 1.0 |
| 2. Top 5 leads | Ranked list with scores and reasons | End-to-end scoring works |
| 3. Explain #1 | Per-component breakdown for Michael Torres | Explainability is real, not just a label |
| 4. Warm intro | Path: Sarah Chen → Michael Torres (strength 1.0) | Graph algorithm finds the strongest path |
| 5. Outreach brief | Headline, why-now, talking points, draft message | System produces actionable output, not just scores |
| 6. Interview talking points | Why no ML, why signal layer, where Claude fits | Design rationale |
## Design Principles

| Principle | Implementation |
|---|---|
| Explainability over accuracy | Weighted sum with named components. Every score decomposes into 5 auditable numbers. |
| Signal ≠ Event | Raw events are noisy and source-specific. Signals are normalized, decayed, and confidence-weighted. |
| Warm > Cold | Relationship score is in the weighted sum (20%), so warm leads rise automatically. |
| Claude is an orchestrator, not an oracle | All numerics are deterministic Python. Claude selects tools and explains results. |
| Single source of truth | Both Streamlit and MCP call the same pipeline.py. Numbers never diverge. |
| Business-owned weights | Scoring weights are configurable hyperparameters, not learned. The business tunes them. |
## Future Improvements

- Feedback loop: Collect advisor accept/dismiss signals → retrain segment-specific weights via logistic regression / contextual bandits.
- More signal sources: Form 4 insider sales, probate filings, 13F institutional holdings.
- Graph scale: Replace NetworkX with Neo4j / Neptune when the graph exceeds ~100K nodes.
- Real-time ingestion: Replace JSON file loads with a streaming pipeline (Kafka / Pub/Sub) for live event detection.
- Multi-tenant: Per-firm advisor coverage maps, custom weight profiles.







