An agentic AI-optimized full-stack RAG platform that turns SEC filings into fast, cited, analyst-grade intelligence with deterministic routing, single-call synthesis, and aggressive latency optimization.
Provisioned with Terraform and deployed on AWS with cost guardrails:
- EC2 (t3.micro) runs the FastAPI backend container
- ECR stores backend Docker images
- S3 Static Website hosts the React frontend build
- SSM Parameter Store stores runtime secrets/config
- AWS Budgets + CloudWatch Billing Alarm + SNS send cost alerts
```mermaid
flowchart TB
    U[Users / Browser]
    S3[S3 Static Website\nReact Build]
    EC2[EC2 t3.micro\nFastAPI Docker Container]
    ECR[ECR\nBackend Image Repository]
    SSM[SSM Parameter Store\nSecrets + Runtime Config]
    EXT[OpenAI / Anthropic / Ollama]
    PC[Pinecone Vector DB]
    TF[Terraform]
    BUD[AWS Budgets]
    CW[CloudWatch Billing Alarm]
    SNS[SNS Email Alerts]

    U --> S3
    S3 -->|REST API calls| EC2
    EC2 --> EXT
    EC2 --> PC
    EC2 --> SSM
    EC2 -->|docker pull| ECR
    TF --> EC2
    TF --> ECR
    TF --> S3
    TF --> SSM
    TF --> BUD
    TF --> CW
    TF --> SNS
    BUD --> SNS
    CW --> SNS
```
Diagram source: docs/infra-architecture.mmd
| Single Company Q&A | Multi-Company Compare |
|---|---|
| Ask any question about a MAG7 stock's SEC filings and receive a cited, LLM-generated answer with source references. | Compare financial metrics, risks, and strategies across multiple companies side-by-side. |
This project is deliberately engineered to showcase optimized agentic AI systems design:
- Deterministic Router Agent minimizes unnecessary LLM calls and reduces cost/latency.
- Fast RAG Agent (single-call synthesis) compresses retrieval + reasoning + reporting into one high-efficiency pass.
- Retrieval + Answer Caching delivers ultra-fast repeated queries and benchmark-level responsiveness.
- Request Deduplication prevents duplicate concurrent work under load and improves throughput.
- Provider-Agnostic LLM Layer enables rapid model switching (OpenAI / Anthropic / Ollama) without architecture changes.
In short: this app is not just “using AI” — it is optimizing agentic AI execution paths for real-world performance.
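As a rough illustration, deterministic routing can be as simple as string rules evaluated before any model is touched. This is a minimal sketch with hypothetical rules and names, not the project's actual `router_agent.py`:

```python
import re

# Illustrative deterministic (LLM-free) intent routing. The hint
# pattern and ticker set are assumptions for this sketch.
COMPARE_HINTS = re.compile(r"\b(compare|versus|vs\.?|side[- ]by[- ]side)\b", re.I)
TICKERS = {"AAPL", "MSFT", "GOOGL", "AMZN", "NVDA", "META", "TSLA"}

def route(question: str) -> str:
    """Classify a query with string rules only -- zero LLM calls."""
    mentioned = {t for t in TICKERS if t.lower() in question.lower()}
    if COMPARE_HINTS.search(question) or len(mentioned) > 1:
        return "compare"      # fan out one RAG pass per ticker
    return "single_qa"        # one retrieval + one synthesis call
```

Because routing is pure string matching, it costs microseconds and zero tokens, and its behavior is fully testable.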
- FastAPI + Async Python — blazing-fast APIs, clean architecture, and excellent developer velocity.
- LangChain Multi-Agent RAG — optimization-first routing + retrieval + synthesis that demonstrates true agentic orchestration.
- Pinecone Vector Database — lightning-fast semantic search over large SEC filing corpora.
- React 18 + Vite — ultra-snappy UI feedback and modern frontend productivity.
- Terraform on AWS — repeatable, production-style infrastructure with real cost guardrails.
- FastAPI: automatic docs, strong typing, and async performance that scales elegantly.
- LangChain: flexible orchestration primitives for multi-step reasoning and retrieval workflows.
- Pinecone: purpose-built vector infrastructure optimized for low-latency relevance.
- React + Vite: excellent DX, fast HMR, and smooth interactive UX for data-heavy applications.
- Terraform: infrastructure as code that is predictable, reviewable, and easy to evolve.
- AWS (EC2/ECR/S3/SSM): practical cloud primitives that balance control, speed, and cost.
| Layer | Tech Stack |
|---|---|
| LLM Providers | OpenAI GPT-4o-mini · Anthropic Claude 3.5 Haiku · Ollama (local) |
| RAG Pipeline | LangChain 0.3 · Custom multi-agent architecture · Deterministic routing |
| Vector Database | Pinecone (serverless) · Sentence-Transformers embeddings |
| Backend | FastAPI · Pydantic v2 · Async Python · Uvicorn |
| Frontend | React 18 · Vite · Custom hooks · CSS modules |
| Data Source | SEC EDGAR API · 10-K & 10-Q filings |
| DevOps / Infra | Docker · Terraform · AWS EC2/ECR/S3/SSM · AWS Budgets · CloudWatch |
```
┌─────────────────────────────────────────────────────────────┐
│                       React Frontend                        │
│   TickerSelector → ChatWindow → ComparePanel → SECPreview   │
└────────────────────────┬────────────────────────────────────┘
                         │ REST API
┌────────────────────────▼────────────────────────────────────┐
│                      FastAPI Backend                        │
│                                                             │
│  ┌───────────────┐   ┌───────────────┐   ┌───────────────┐  │
│  │ Router Agent  │──▶│ Fast RAG Agent│──▶│ LLM Provider  │  │
│  │ (deterministic│   │ (single call) │   │ (OpenAI /     │  │
│  │  routing)     │   │               │   │  Anthropic /  │  │
│  └───────────────┘   └───────┬───────┘   │  Ollama)      │  │
│                              │           └───────────────┘  │
│                   ┌──────────▼──────────┐                   │
│                   │ Pinecone Vector DB  │                   │
│                   │ (semantic retrieval)│                   │
│                   └─────────────────────┘                   │
└─────────────────────────────────────────────────────────────┘
```
```mermaid
flowchart LR
    Q[User Question]
    R[Router Agent\nDeterministic Intent Routing]
    RET[Retriever\nPinecone Semantic Search]
    CONTEXT[Top-k Filing Chunks\n+ Metadata]
    FAST[Fast RAG Agent\nSingle-call synthesis]
    LLM[OpenAI / Anthropic / Ollama]
    A[Final Answer\nwith Citations]

    Q --> R
    R --> RET
    RET --> CONTEXT
    CONTEXT --> FAST
    FAST --> LLM
    LLM --> FAST
    FAST --> A
```
Diagram source: docs/rag-agent-flow.mmd
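The single-call idea in the flow above can be sketched as one fused prompt: retrieved context, analysis instructions, and report format go into a single model invocation. The prompt wording and the `llm` callable are illustrative assumptions, not the project's `fast_rag.py`:

```python
from typing import Callable

# One prompt carries retrieval context, analyst instructions, and the
# citation format, so the model is called exactly once per question.
PROMPT = """You are a financial analyst. Using ONLY the excerpts below,
answer the question and cite sources as [doc_id].

Excerpts:
{context}

Question: {question}
Answer with citations:"""

def answer(question: str, chunks: list[dict], llm: Callable[[str], str]) -> str:
    # Each retrieved chunk is tagged with its id so the model can cite it.
    context = "\n\n".join(f"[{c['id']}] {c['text']}" for c in chunks)
    return llm(PROMPT.format(context=context, question=question))
```

Fusing the steps trades some modularity for roughly one third the LLM calls of a retrieve → analyze → report chain.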
Ask natural language questions about any MAG7 company's SEC filings. The system retrieves relevant filing excerpts, synthesizes an answer, and returns source citations — all in a single optimized LLM call.
- Router Agent — Deterministic classification (no LLM call) routes queries with optimization-first control.
- Fast RAG Agent — Retriever + analyst + reporter fused into a single LLM call (~3x fewer calls than naive chains).
- LLM Cache — Reusable LLM instances with provider-aware pooling to reduce warmup overhead.
- Request Deduplication Layer — Identical in-flight requests share execution for better concurrency behavior.
Switch between OpenAI GPT-4o-mini, Anthropic Claude 3.5 Haiku, or Ollama (fully local, offline) with a single click in the UI. No code changes required.
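One way such switching stays a pure config change is to hide every backend behind a single `prompt -> completion` callable, pooled per `(provider, model)` pair. This sketch uses stub constructors and invented names, not the app's actual LLM layer:

```python
from functools import lru_cache
from typing import Callable

LLMFn = Callable[[str], str]          # the only interface callers see
_FACTORIES: dict[str, Callable[[str], LLMFn]] = {}

def register_provider(name: str, factory: Callable[[str], LLMFn]) -> None:
    _FACTORIES[name] = factory

@lru_cache(maxsize=8)
def get_llm(provider: str, model: str) -> LLMFn:
    """Provider-aware pooling: one reusable instance per (provider, model)."""
    try:
        return _FACTORIES[provider](model)
    except KeyError:
        raise ValueError(f"unknown provider: {provider}") from None

# Stub standing in for a local Ollama client; the real app would wire
# in actual OpenAI / Anthropic / Ollama clients here.
register_provider("ollama", lambda model: (lambda prompt: f"[{model}] {prompt}"))
```

The `lru_cache` doubles as the instance pool, so switching models in the UI reuses warm clients instead of reconstructing them per request.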
Compare financial metrics, risk factors, or business strategies across multiple MAG7 stocks side-by-side. Powered by concurrent API calls for fast results.
| Metric | Before | After | Improvement / Technique |
|---|---|---|---|
| Repeated query | 9.69s | 20ms | 485x faster |
| Compare 2 stocks (cached) | 12.21s | 16ms | 763x faster |
| Frontend re-renders | Excessive | Memoized | React.memo + useCallback |
| Health check polling | Every 30s | Every 2min | 4x reduction |
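The repeated-query numbers above come from serving warm paths straight from cache. A minimal sketch of a TTL answer cache keyed on normalized question + ticker + model (keying scheme and TTL are illustrative, not the app's tuned values):

```python
import hashlib
import time

class AnswerCache:
    """TTL cache so repeated queries skip retrieval and the LLM entirely."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, str]] = {}

    @staticmethod
    def key(question: str, ticker: str, model: str) -> str:
        # Normalize so trivially different phrasings of the same query hit.
        raw = f"{ticker}|{model}|{question.strip().lower()}"
        return hashlib.sha256(raw.encode()).hexdigest()

    def get(self, k: str):
        hit = self._store.get(k)
        if hit is None:
            return None
        ts, answer = hit
        if time.monotonic() - ts > self.ttl:
            del self._store[k]       # expired: evict and miss
            return None
        return answer

    def put(self, k: str, answer: str) -> None:
        self._store[k] = (time.monotonic(), answer)
```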
Toggle reranking, query rewriting, retrieval caching, section boosting, and hybrid search from the UI control panel — empowering users to experiment with different retrieval strategies.
Fetch the latest 10-K and 10-Q filings directly from the SEC EDGAR API, chunk and embed them, and store in Pinecone — all from inside the app.
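The chunking step can be sketched as overlapping word windows that carry ticker/form metadata for citation and filtering. Window and overlap sizes here are illustrative defaults, not the app's tuned values:

```python
def chunk_filing(text: str, ticker: str, form: str,
                 size: int = 300, overlap: int = 50) -> list[dict]:
    """Split filing text into overlapping word-window chunks with metadata."""
    words = text.split()
    chunks, start, i = [], 0, 0
    while start < len(words):
        window = words[start:start + size]
        chunks.append({
            "id": f"{ticker}-{form}-{i}",          # stable id used in citations
            "text": " ".join(window),
            "metadata": {"ticker": ticker, "form": form, "chunk": i},
        })
        if start + size >= len(words):
            break
        start += size - overlap                    # overlap preserves context
        i += 1
    return chunks
```

Each chunk's text is then embedded and upserted to Pinecone with its metadata, so retrieval can filter by ticker and form type.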
```
├── backend/
│   ├── app/
│   │   ├── agents/              # Multi-agent RAG system
│   │   │   ├── router_agent.py  # Deterministic query classifier
│   │   │   ├── fast_rag.py      # Single-call RAG pipeline
│   │   │   ├── llm_cache.py     # Provider-aware LLM caching
│   │   │   ├── retriever_agent.py
│   │   │   ├── analyst_agent.py
│   │   │   └── reporter_agent.py
│   │   ├── services/            # SEC EDGAR API, text processing
│   │   ├── utils/               # HTTP client, request deduplication
│   │   ├── main.py              # FastAPI app with lifespan management
│   │   ├── models.py            # Pydantic v2 request/response schemas
│   │   ├── config.py            # Environment-based settings
│   │   └── pinecone_client.py   # Vector DB client
│   ├── tests/                   # pytest suite
│   ├── requirements.txt
│   └── Dockerfile
├── frontend/
│   ├── src/
│   │   ├── components/          # React 18 components (memoized)
│   │   │   ├── ChatWindow.jsx       # Message display + auto-scroll
│   │   │   ├── ChatInput.jsx        # User input with model selector
│   │   │   ├── ComparePanel.jsx     # Multi-stock comparison
│   │   │   ├── ControlPanel.jsx     # RAG parameter controls
│   │   │   ├── TickerSelector.jsx
│   │   │   └── SECPreviewModal.jsx
│   │   ├── services/api.js      # API client with timeout/retry
│   │   └── App.jsx
│   ├── vitest.config.js         # Frontend test config
│   ├── package.json
│   └── Dockerfile
├── docker-compose.yml           # One-command full stack launch
├── start-all.sh                 # Dev startup script
└── README.md
```
- Python 3.9+ — Backend runtime
- Node.js 18+ — Frontend tooling
- API Keys — Pinecone + at least one LLM provider (OpenAI, Anthropic, or Ollama)
```bash
cd backend
cp .env.example .env
# Edit .env with your API keys
```

```bash
# Backend
cd backend
python -m venv venv
source venv/bin/activate   # Windows: venv\Scripts\activate
pip install -r requirements.txt

# Frontend
cd ../frontend
npm install
```

```bash
# Option A — One command
bash start-all.sh

# Option B — Docker
docker-compose up -d
```

| Service | URL |
|---|---|
| Frontend | http://localhost:5173 |
| Backend API | http://localhost:8000 |
| API Docs (Swagger) | http://localhost:8000/docs |
```bash
# Backend
cd backend
pytest tests/ -v
pytest tests/ --cov=app      # with coverage

# Frontend
cd frontend
npm test                     # run all tests
npm test -- --coverage       # with coverage
```

Financial Performance
- "What was Apple's total revenue and operating income in 2023?"
- "How did NVIDIA's data center revenue grow compared to last year?"
- "What are Tesla's gross margins and how have they changed?"
Risk & Strategy
- "What are the key risk factors for Microsoft?"
- "What is Google's AI strategy according to their latest filings?"
- "What cybersecurity risks does Amazon face?"
Company Comparisons
- "Compare NVIDIA and AMD's GPU market performance and revenue"
- "How do Apple and Microsoft's R&D investments compare?"
- "Compare Amazon and Google's cloud infrastructure spending"
- Agentic path optimization — Explicitly engineered execution paths that minimize token, latency, and call overhead.
- Single-call RAG synthesis — Retrieval + reasoning + reporting in one pass for materially faster responses.
- Deterministic routing control — Zero-cost query routing before model invocation.
- Retrieval + answer caching — Sub-second repeat behavior and dramatic latency collapse on warm paths.
- Request deduplication under concurrency — Identical parallel requests are collapsed into one pipeline run.
- Provider-agnostic model orchestration — OpenAI ↔ Anthropic ↔ Ollama switching without architectural rewrites.
- Async-first throughput design — End-to-end async processing from API edge to model call.
- Preloaded embedding runtime — Startup-time model readiness avoids first-query cold penalties.
MIT