MaxDeale/stock_bulb

Stock Research Agent

A production-grade agentic API service that accepts a stock ticker and autonomously orchestrates multi-step tool calls — fetching live price data, financial statements, and news — then reasons over all of it to emit a fully typed, structured InvestmentBrief with bull/bear cases, conviction ratings, and risk flags.

No frontend. No manual JSON parsing. Pure agentic reasoning, validated output.


Architecture

POST /research  {"ticker": "AAPL"}
        |
        v
  +-----------+
  |  FastAPI  |   uvicorn, CORS, request logging middleware
  +-----------+
        |
        v
  +-----------+
  |   Agent   |   pydantic-ai  Agent(result_type=InvestmentBrief)
  |  agent.py |   GPT-4o via OpenAI API
  +-----------+
        |
        |-- tool: get_price_data(ticker)    --> yfinance .info
        |-- tool: get_financials(ticker)    --> yfinance .financials + .info
        |-- tool: get_news(ticker, name)    --> Google News RSS (feedparser)
        |
        v
  +------------------+
  |  InvestmentBrief |   Pydantic v2 model — validated, typed, no hallucinated schema
  +------------------+
        |
        v
  ResearchResponse  {"success": true, "brief": {...}}

The agent calls tools autonomously in its reasoning loop. The framework (pydantic-ai) handles structured extraction via OpenAI's function-calling API — no regex, no prompt-engineered JSON templates.
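
As a conceptual illustration only (this is not pydantic-ai's implementation), the loop described above can be sketched with the standard library. The tool bodies, the `run_loop` helper, the message shape, and the scripted model are all hypothetical stand-ins; in the real service the framework and GPT-4o play these roles.

```python
import json
from typing import Any, Callable

# Hypothetical stand-ins for the real tools in app/tools.py.
def get_price_data(ticker: str) -> dict[str, Any]:
    return {"ticker": ticker, "current_price": 189.25}

def get_news(ticker: str, name: str) -> list[str]:
    return [f"{name} ({ticker}) headline"]

TOOLS: dict[str, Callable[..., Any]] = {
    "get_price_data": get_price_data,
    "get_news": get_news,
}

def run_loop(model_step: Callable[[list[dict]], dict], max_steps: int = 8) -> dict:
    """Feed tool results back to the model until it returns a final answer."""
    messages: list[dict] = []
    for _ in range(max_steps):
        reply = model_step(messages)
        if reply["type"] == "final":
            return reply["output"]
        # The model asked for a tool: run it and record the result.
        result = TOOLS[reply["tool"]](**reply["args"])
        messages.append({"tool": reply["tool"], "result": json.dumps(result)})
    raise RuntimeError("model did not finish within max_steps")

def scripted_model(messages: list[dict]) -> dict:
    """Fake model: request price data once, then finish."""
    if not messages:
        return {"type": "tool", "tool": "get_price_data", "args": {"ticker": "AAPL"}}
    return {"type": "final", "output": {"ticker": "AAPL", "tool_calls": len(messages)}}

brief = run_loop(scripted_model)
```

The `max_steps` cap is an assumption added here so a misbehaving model cannot loop forever.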


Features

  • Autonomous tool orchestration — the agent decides when and in what order to call price, financials, and news tools based on its reasoning
  • Typed structured output: result_type=InvestmentBrief forces GPT-4o to emit output that validates against the Pydantic v2 model; the framework retries on schema violations
  • Zero manual JSON parsing — pydantic-ai + OpenAI function-calling handles extraction end-to-end
  • Graceful data degradation — all yfinance field access uses .get() with None defaults; missing data never crashes the agent
  • Request logging middleware — every HTTP request logged with method, path, status, and latency
  • Docker + hot-reload: docker-compose up for development; production image baked separately
  • Swagger UI at /docs — interactive testing with full schema documentation
  • Structured error responses — agent failures return success=false + error message, not raw HTTP 500
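
The graceful-degradation point can be illustrated with a small sketch. The helper name `extract_price_fields` is hypothetical, and the camelCase keys are assumptions about what yfinance's `.info` dictionary typically contains; the actual mapping in tools.py may differ.

```python
from typing import Any, Optional

def extract_price_fields(ticker: str, info: dict[str, Any]) -> dict[str, Optional[Any]]:
    """Map a yfinance-style info dict to price fields, tolerating gaps.

    Every lookup uses .get(), so a missing key yields None instead of KeyError.
    """
    return {
        "ticker": ticker,
        "current_price": info.get("currentPrice"),
        "currency": info.get("currency"),
        "week_52_high": info.get("fiftyTwoWeekHigh"),
        "week_52_low": info.get("fiftyTwoWeekLow"),
        "market_cap": info.get("marketCap"),
        "pe_ratio": info.get("trailingPE"),
        "forward_pe": info.get("forwardPE"),
        "dividend_yield": info.get("dividendYield"),
    }

# Only two fields are available; the rest degrade to None.
partial = extract_price_fields("AAPL", {"currentPrice": 189.25, "currency": "USD"})
```

For this to work end to end, the corresponding model fields need to be optional so that None values pass validation.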

Tech Stack

| Layer | Library / Tool | Why |
| --- | --- | --- |
| Agent framework | pydantic-ai 0.0.14 | Native structured output, tool registration, retries |
| LLM | OpenAI gpt-4o | Strong multi-step reasoning with function-calling support |
| API layer | fastapi 0.115 + uvicorn | Async-native, automatic OpenAPI docs, production-proven |
| Data validation | pydantic v2 | Rust-based validation core, markedly faster than v1; strict mode, JSON schema export |
| Stock data | yfinance 0.2.44 | Free, comprehensive, no API key required |
| News | feedparser 6.0.11 | Parses Google News RSS; no scraping, no auth |
| Settings | pydantic-settings | Type-safe env var loading, .env support |
| Containerisation | Docker + docker-compose | Reproducible environments, production-ready |

Quick Start

Prerequisites

  • Python 3.11+
  • An OpenAI API key with GPT-4o access
  • Docker + Docker Compose (optional, for containerised dev)

1. Clone and configure

git clone https://github.com/maxericdeale/stock-research-agent.git
cd stock-research-agent

cp .env.example .env
# Edit .env and set your OPENAI_API_KEY

2a. Run with Docker (recommended)

docker-compose up --build

2b. Run locally

pip install -r requirements.txt
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

3. Hit the API

# Interactive Swagger UI
open http://localhost:8000/docs

# Or via curl
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"ticker": "AAPL", "exchange": "US"}'
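
The same request can be made from Python with only the standard library. This is a minimal sketch: the helper names `build_research_request` and `fetch_brief` and the 120-second timeout are assumptions, while the URL and payload mirror the curl call above.

```python
import json
import urllib.request

def build_research_request(
    ticker: str,
    exchange: str = "US",
    base_url: str = "http://localhost:8000",
) -> urllib.request.Request:
    """Build the POST /research request without sending it."""
    payload = json.dumps({"ticker": ticker, "exchange": exchange}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/research",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def fetch_brief(ticker: str, exchange: str = "US", timeout: float = 120.0) -> dict:
    """Send the request and decode the JSON body (needs the API running)."""
    request = build_research_request(ticker, exchange)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))

req = build_research_request("AAPL")
```

Calling `fetch_brief("AAPL")` only works while the server from step 2 is running. The generous timeout reflects that a research run involves several tool calls and an LLM round trip.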

Example Response

An example InvestmentBrief for AAPL (all values are illustrative):

{
  "success": true,
  "brief": {
    "ticker": "AAPL",
    "company_name": "Apple Inc.",
    "sector": "Technology",
    "summary": "Apple Inc. is the world's largest company by market capitalisation, generating over $385B in annual revenue. Its integrated ecosystem of hardware, software, and services creates durable competitive advantages and industry-leading switching costs.",
    "price_data": {
      "ticker": "AAPL",
      "current_price": 189.25,
      "currency": "USD",
      "day_change_pct": -0.83,
      "week_52_high": 199.62,
      "week_52_low": 164.08,
      "market_cap": 2920000000000,
      "pe_ratio": 29.4,
      "forward_pe": 26.1,
      "dividend_yield": 0.0055
    },
    "financial_highlights": [
      "Revenue TTM $385.6B with gross margin of 44.1% — expanding 80 bps YoY",
      "Free cash flow of $107B enables $90B+ in annual buybacks, reducing diluted share count 3% per year",
      "Services segment growing ~14% YoY and now represents ~22% of total revenue at ~70% gross margins",
      "EPS $6.13 on trailing basis; forward P/E of 26.1x prices in mid-teens earnings growth"
    ],
    "bull_case": {
      "points": [
        "Services gross margin (~70%) is meaningfully higher than hardware (~37%), and its revenue mix is expanding — structurally improving blended margins",
        "Apple Intelligence / on-device AI integration is the strongest iPhone upgrade catalyst since Face ID",
        "India manufacturing expansion reduces single-country supply chain concentration risk from China",
        "$107B FCF supports continued buybacks: AAPL has retired ~40% of its float since 2012"
      ],
      "conviction": "high"
    },
    "bear_case": {
      "points": [
        "China revenue (~18% of total) faces regulatory pressure and growing competition from Huawei in the premium segment",
        "iPhone unit growth is roughly flat — the entire growth thesis rests on Services ARPU expansion",
        "29.4x trailing P/E leaves almost no margin of safety if macro weakens and consumers defer hardware upgrades"
      ],
      "conviction": "medium"
    },
    "overall_sentiment": "bullish",
    "key_risks": [
      "Escalating US-China trade tensions impacting both supply chain costs and China consumer demand",
      "EU Digital Markets Act and US DOJ antitrust scrutiny threatening App Store economics",
      "Macro-driven consumer spending downturn compressing hardware replacement cycles"
    ],
    "news_summary": "Recent news is constructive: Q1 2024 services revenue hit an all-time high of $23.1B, and early iPhone 16 channel checks show sell-through tracking above expectations. No significant negative catalysts in current news flow.",
    "data_sources": ["yfinance", "Google News RSS"]
  }
}

Project Structure

stock_bulb/
├── app/
│   ├── __init__.py
│   ├── config.py         # pydantic-settings Settings; configures root logger
│   ├── models.py         # All Pydantic v2 models (request, tool outputs, InvestmentBrief)
│   ├── tools.py          # get_price_data, get_financials, get_news — plain Python, testable in isolation
│   ├── agent.py          # Pydantic AI agent, @research_agent.tool registrations, run_research()
│   └── main.py           # FastAPI app, CORS middleware, request logger, /health + /research routes
├── tests/
│   ├── __init__.py
│   └── test_agent.py     # Unit tests (model validation) + integration tests (mocked + live)
├── Dockerfile
├── docker-compose.yml
├── requirements.txt
├── pytest.ini
├── .env.example
├── .gitignore
└── README.md

Commands

# Install dependencies
pip install -r requirements.txt

# Run the API (development with hot-reload)
uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload

# Run via Docker
docker-compose up --build

# Run tests (mocked — no API key needed)
pytest tests/ -v -k "not live"

# Run all tests including live LLM call (requires OPENAI_API_KEY)
pytest tests/ -v

# Test a tool standalone
python -c "from app.tools import get_price_data; print(get_price_data('AAPL'))"
python -c "from app.tools import get_financials; print(get_financials('AAPL'))"
python -c "from app.tools import get_news; print(get_news('AAPL', 'Apple Inc'))"

# Hit the API manually
curl -X POST http://localhost:8000/research \
  -H "Content-Type: application/json" \
  -d '{"ticker": "AAPL", "exchange": "US"}'

# Health check
curl http://localhost:8000/health

Swagger UI: http://localhost:8000/docs


Technical Decisions

Why pydantic-ai instead of LangChain?

LangChain is a large, opinionated framework with significant abstraction overhead. pydantic-ai is purpose-built for structured output: result_type=InvestmentBrief is passed to the model through OpenAI's function-calling API, and the framework validates the reply against the Pydantic v2 schema, retrying when validation fails. This catches a whole class of bugs (hallucinated schema fields, wrong types, missing required keys) that tends to affect prompt-engineered JSON extraction.

Why synchronous tool implementations in tools.py?

The tool functions (get_price_data, get_financials, get_news) are synchronous because both yfinance and feedparser are synchronous. They are wrapped in tool handlers inside agent.py. pydantic-ai runs tools registered as plain synchronous functions in a thread pool executor, so the event loop serving FastAPI requests is not blocked; a handler declared with async def that calls blocking code directly has to offload that call itself. This separation also keeps tools.py trivially testable without an event loop.
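
A minimal sketch of the offloading idea, using `asyncio.to_thread` from the standard library. The `get_price_data` body is a hypothetical stand-in that sleeps instead of calling yfinance, and `price_tool` is an assumed handler name, not necessarily the one in agent.py.

```python
import asyncio
import time

def get_price_data(ticker: str) -> dict:
    """Hypothetical stand-in for a blocking yfinance call."""
    time.sleep(0.2)
    return {"ticker": ticker}

async def price_tool(ticker: str) -> dict:
    # Run the blocking function in a worker thread so the event loop stays free.
    return await asyncio.to_thread(get_price_data, ticker)

async def main() -> tuple[list[dict], float]:
    start = time.perf_counter()
    # Three blocking calls overlap instead of running back to back.
    results = await asyncio.gather(*(price_tool(t) for t in ("AAPL", "MSFT", "NVDA")))
    return list(results), time.perf_counter() - start

results, elapsed = asyncio.run(main())
```

Because the three calls run in separate threads, the total time is close to one call rather than the sum of all three.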

Why return success=False instead of HTTP 500 on agent errors?

Returning a structured error response (ResearchResponse(success=False, error="...")) is more useful to API consumers than a raw 500, especially for partial failures (e.g. news fetch fails but price data succeeded). It also avoids exposing internal stack traces to callers. The 500 is still possible for truly unhandled exceptions, caught by the global exception handler middleware.
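
The pattern can be sketched as follows. The dataclass is a standard-library stand-in for the Pydantic ResearchResponse model, and `safe_research` plus the two fake runners are hypothetical names used only for illustration.

```python
import asyncio
import logging
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, Optional

logger = logging.getLogger("research")

@dataclass
class ResearchResponse:  # stand-in for the Pydantic model in app/models.py
    success: bool
    brief: Optional[dict[str, Any]] = None
    error: Optional[str] = None

async def safe_research(
    run: Callable[[str], Awaitable[dict[str, Any]]], ticker: str
) -> ResearchResponse:
    """Wrap an agent run so failures become a structured error response."""
    try:
        return ResearchResponse(success=True, brief=await run(ticker))
    except Exception as exc:
        # Full traceback goes to the server log; the caller sees a short message.
        logger.exception("research failed for %s", ticker)
        return ResearchResponse(success=False, error=f"{type(exc).__name__}: {exc}")

async def failing_run(ticker: str) -> dict[str, Any]:
    raise ValueError(f"no data for {ticker}")

async def ok_run(ticker: str) -> dict[str, Any]:
    return {"ticker": ticker}

failed = asyncio.run(safe_research(failing_run, "ZZZZ"))
ok = asyncio.run(safe_research(ok_run, "AAPL"))
```

Clients can then branch on `success` instead of parsing HTTP status codes.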


Extending the Project

Add Redis caching

Cache InvestmentBrief results by ticker with a 15-minute TTL to avoid redundant LLM calls:

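import redis.asyncio as redis

# Module paths follow the Project Structure section above.
from app.agent import run_research
from app.models import InvestmentBrief, ResearchRequest

CACHE_TTL_SECONDS = 900  # 15 minutes

cache = redis.from_url("redis://localhost:6379")

async def run_research_cached(request: ResearchRequest) -> InvestmentBrief:
    # Normalise the key so "aapl" and "AAPL" share one cache entry.
    key = f"brief:{request.ticker.upper()}"
    cached = await cache.get(key)
    if cached:
        # Cache hit: rebuild the model from the stored JSON.
        return InvestmentBrief.model_validate_json(cached)
    # Cache miss: run the agent, then store the result with a TTL.
    brief = await run_research(request)
    await cache.setex(key, CACHE_TTL_SECONDS, brief.model_dump_json())
    return brief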

Add streaming responses

pydantic-ai supports streaming with agent.run_stream(). Replace the /research endpoint with a StreamingResponse to push partial output to clients as the agent reasons.
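
One possible shape for the streaming body is newline-delimited JSON. In this sketch, `fake_partial_briefs` is a hypothetical stand-in for whatever partial results the agent's stream yields, and the NDJSON encoding is an assumption rather than something the project prescribes.

```python
import asyncio
import json
from typing import Any, AsyncIterator

async def fake_partial_briefs() -> AsyncIterator[dict[str, Any]]:
    """Hypothetical stand-in for partial results from the agent's stream."""
    yield {"ticker": "AAPL"}
    yield {"ticker": "AAPL", "overall_sentiment": "bullish"}

async def ndjson_stream(partials: AsyncIterator[dict[str, Any]]) -> AsyncIterator[bytes]:
    """Encode each partial result as one newline-delimited JSON line."""
    async for partial in partials:
        yield (json.dumps(partial) + "\n").encode("utf-8")

async def collect() -> list[bytes]:
    # Drain the stream the way an HTTP response writer would.
    return [chunk async for chunk in ndjson_stream(fake_partial_briefs())]

chunks = asyncio.run(collect())
```

FastAPI's StreamingResponse accepts an async generator such as `ndjson_stream(...)` as its content, so the endpoint could return it with a media type like application/x-ndjson.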

Deploy to Azure Container Apps

az containerapp up \
  --name stock-research-agent \
  --resource-group rg-agents \
  --image ghcr.io/youruser/stock-research-agent:latest \
  --env-vars OPENAI_API_KEY=secretref:openai-key \
  --ingress external --target-port 8000

Add a second LLM provider

pydantic-ai supports Anthropic Claude, Gemini, and Mistral with a one-line swap:

from pydantic_ai.models.anthropic import AnthropicModel

# Assumes an anthropic_api_key field has been added to Settings in app/config.py
model = AnthropicModel("claude-3-5-sonnet-latest", api_key=settings.anthropic_api_key)

Add background task queue

For long-running research jobs, push requests onto a Celery/ARQ queue and return a job ID. Poll GET /research/{job_id} for results. This decouples API latency from LLM latency.
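
The submit-then-poll flow can be sketched with an in-process registry. This is a stand-in for a real Celery or ARQ queue, not a replacement: jobs live in memory and are lost on restart. The names `JOBS`, `submit`, `poll`, and `fake_research` are hypothetical.

```python
import asyncio
import uuid
from typing import Any, Awaitable, Callable

Work = Callable[[str], Awaitable[dict[str, Any]]]

JOBS: dict[str, dict[str, Any]] = {}
_TASKS: set[asyncio.Task] = set()  # hold references so tasks are not garbage-collected

async def _run_job(job_id: str, work: Work, ticker: str) -> None:
    try:
        JOBS[job_id] = {"status": "done", "brief": await work(ticker)}
    except Exception as exc:
        JOBS[job_id] = {"status": "failed", "error": str(exc)}

def submit(work: Work, ticker: str) -> str:
    """Start the job in the background and return its ID immediately."""
    job_id = uuid.uuid4().hex
    JOBS[job_id] = {"status": "pending"}
    task = asyncio.create_task(_run_job(job_id, work, ticker))
    _TASKS.add(task)
    task.add_done_callback(_TASKS.discard)
    return job_id

def poll(job_id: str) -> dict[str, Any]:
    """What a GET /research/{job_id} handler could return."""
    return JOBS.get(job_id, {"status": "unknown"})

async def fake_research(ticker: str) -> dict[str, Any]:
    await asyncio.sleep(0.05)  # stands in for the slow agent run
    return {"ticker": ticker}

async def demo() -> tuple[str, dict[str, Any]]:
    job_id = submit(fake_research, "AAPL")
    first = poll(job_id)["status"]          # still pending right after submit
    await asyncio.gather(*list(_TASKS))     # wait for background work to finish
    return first, poll(job_id)

first_status, final = asyncio.run(demo())
```

A real queue adds persistence, retries, and worker processes, which matter once jobs must survive restarts or scale beyond one API process.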


Roadmap

  • Redis caching layer with configurable TTL per ticker
  • Streaming endpoint (POST /research/stream) using pydantic-ai's run_stream()
  • Rate limiting per IP using slowapi
  • Support for non-US exchanges (JSE, LSE, ASX) with ticker normalisation
  • GitHub Actions CI pipeline with linting, type checking, and pytest
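
For the non-US exchange item, ticker normalisation could look roughly like the sketch below. The function name and the exchange codes accepted are assumptions; the suffixes are the ones Yahoo Finance uses for Johannesburg (.JO), London (.L), and Australian (.AX) listings.

```python
# Yahoo Finance symbol suffix per exchange code; US listings carry no suffix.
YAHOO_SUFFIXES = {"US": "", "JSE": ".JO", "LSE": ".L", "ASX": ".AX"}

def normalise_ticker(ticker: str, exchange: str = "US") -> str:
    """Return an upper-cased symbol with the exchange suffix applied once."""
    symbol = ticker.strip().upper()
    key = exchange.strip().upper()
    if key not in YAHOO_SUFFIXES:
        raise ValueError(f"unsupported exchange: {exchange!r}")
    suffix = YAHOO_SUFFIXES[key]
    # Avoid doubling the suffix when the caller already supplied it.
    if suffix and symbol.endswith(suffix):
        return symbol
    return symbol + suffix
```

Rejecting unknown exchanges early keeps a typo from turning into a confusing empty result from the data tools.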

License

MIT. See LICENSE.
