🪽THuntCloud🪽

AWS CloudTrail Log Threat Hunting Tool

SIEM-equivalent AWS CloudTrail threat hunting on a single ordinary laptop, with no cloud infrastructure required.

Drop in your CloudTrail logs, run one command, and start hunting threats immediately.

  • No-query hunting: select a built-in hunt from the Streamlit dropdown and get results immediately, with no SQL knowledge required
  • AI-assisted analysis: the OpenAI API (gpt-5.4) automatically analyses query result DataFrames and surfaces key findings in plain language
  • GeoIP enrichment: country, city, and ASN for every source IP via MaxMind GeoLite2
  • Built-in BI dashboard: Apache Superset with pre-built CloudTrail charts
  • Single-command launch: docker compose up -d

Screenshots

AI Chat (Streamlit UI) and Built-in Queries

[Screenshot: AI Chat UI]

Dashboard (Apache Superset)

[Screenshot: Superset Dashboard]

Architecture

Three Docker containers share one DuckDB file via a bind mount (docker/data/db/).

┌─────────────────────────────────────────────────────────┐
│                    Docker Compose                       │
│                                                         │
│  ┌──────────────┐   ┌──────────────┐  ┌─────────────┐   │
│  │   ingester   │   │    agent     │  │  dashboard  │   │
│  │  (Rust)      │   │  (Streamlit) │  │  (Superset) │   │
│  │              │   │              │  │             │   │
│  │ CloudTrail   │   │   AI Chat    │  │  Visualize  │   │
│  │ gz ingest    │   │ SQL gen/exec │  │             │   │
│  │ READ_WRITE   │   │ READ_ONLY    │  │ READ_ONLY   │   │
│  └──────┬───────┘   └──────┬───────┘  └──────┬──────┘   │
│         └──────────────────┴─────────────────┘          │
│                            │                            │
│                    ┌───────▼──────┐                     │
│                    │   DuckDB     │                     │
│                    │ (Bind Mount) │                     │
│                    │  (SSD)       │                     │
│                    └──────────────┘                     │
└─────────────────────────────────────────────────────────┘

End-to-End Sequence Diagram

The diagram below shows the full lifecycle from log ingestion through to a completed AI-assisted threat hunting session.

sequenceDiagram
    participant OPS  as Operator
    participant ING  as ingester (Rust)
    participant DB   as DuckDB (bind mount)
    participant APP  as chat / Streamlit
    participant OAI  as OpenAI API
    participant SS   as dashboard / Superset
    participant U    as Analyst (Browser)

    Note over OPS,ING: Phase 1 - Ingest
    OPS->>ING: docker compose run ingester ingest --path /data/logs
    ING->>ING: walk & filter files (date, path glob)
    ING->>ING: parallel parse (rayon) + SHA-256 dedup
    ING->>DB: batch insert via DuckDB Appender (READ_WRITE)
    ING->>DB: GeoIP enrich (optional)
    ING-->>OPS: IngestStats printed

    Note over OPS,SS: Phase 2 - Start services
    OPS->>APP: docker compose up -d
    OPS->>SS: docker compose up -d
    APP->>DB: open READ_ONLY connection
    SS->>DB: open READ_ONLY connection

    Note over U,OAI: Phase 3 - AI-assisted hunting (chat)
    U->>APP: natural language question
    APP->>OAI: generate_sql(question, schema, history)
    OAI-->>APP: SQL string
    APP->>APP: apply_date_filter + apply_row_limit
    APP->>APP: validate_query (blocklist + EXPLAIN)
    APP->>DB: execute SQL (READ_ONLY)
    DB-->>APP: result rows (DataFrame)
    APP->>OAI: generate_analysis(sql, results)
    OAI-->>APP: fact-based Markdown summary
    APP-->>U: table + analysis + chat history

    Note over U,SS: Phase 4 - BI dashboard (Superset)
    U->>SS: open http://localhost:8088
    SS->>DB: execute chart queries (READ_ONLY)
    DB-->>SS: aggregated result sets
    SS-->>U: interactive charts + filters
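
Phase 3 of the diagram names two guard steps, apply_row_limit and validate_query, that run before any generated SQL reaches DuckDB. The Python sketch below illustrates what such guards might look like. It is not taken from the agent's source: the keyword blocklist, the default limit of 1000 rows, and the function bodies are all assumptions, and the EXPLAIN check from the diagram is left out because it needs a live DuckDB connection.

```python
import re

# Hypothetical blocklist: keywords a read-only hunting UI would likely refuse.
BLOCKED = re.compile(
    r"\b(insert|update|delete|drop|alter|create|attach|detach|copy|install|load|pragma)\b",
    re.IGNORECASE,
)


def apply_row_limit(sql: str, max_rows: int = 1000) -> str:
    """Wrap the query in an outer SELECT so it returns at most max_rows rows."""
    inner = sql.strip().rstrip(";").strip()
    return f"SELECT * FROM ({inner}) AS q LIMIT {max_rows}"


def validate_query(sql: str) -> None:
    """Raise ValueError unless sql is one SELECT/WITH statement with no blocked keyword.

    The checks are deliberately conservative: a semicolon or a blocked word
    inside a string literal is rejected as well.
    """
    body = sql.strip().rstrip(";")
    if ";" in body:
        raise ValueError("multiple statements are not allowed")
    if not re.match(r"(?i)\s*(select|with)\b", body):
        raise ValueError("only SELECT queries are allowed")
    hit = BLOCKED.search(body)
    if hit:
        raise ValueError(f"blocked keyword: {hit.group(0)}")
```

In this sketch a caller would run validate_query on the model's SQL first and then pass it through apply_row_limit before execution. The READ_ONLY connection shown in the diagram remains the real safety net; a text-level blocklist only catches obvious mistakes early.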

Prerequisites

| Requirement | Details |
|---|---|
| Docker | Docker Desktop or Docker Engine + Compose v2 |
| Resources | 16 GB RAM minimum, SSD recommended |
| CloudTrail logs | .json or .json.gz files exported from AWS |
| (Optional) OpenAI API key | Required for AI query generation |
| (Optional) MaxMind GeoLite2 | .mmdb files for GeoIP enrichment |

Quick Start

# 1. Clone
git clone https://github.com/fukusuket/THuntCloud.git
cd THuntCloud/docker

# 2. Place CloudTrail logs
cp /path/to/cloudtrail/logs/*.json.gz logs/

# 3. Ingest logs
docker compose --profile ingest run --rm ingester ingest --path /data/logs

# 4. Start all services
docker compose up -d --build

Open http://localhost:8501 for the AI Chat UI or http://localhost:8088 for the dashboard (login: admin/admin).
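
To make the ingest step more concrete, here is a minimal Python sketch of what "parse and SHA-256 dedup" can mean for CloudTrail files. It is illustrative only: the real ingester is written in Rust, parses in parallel, and writes to DuckDB, none of which is shown here. The sketch relies on the fact that a CloudTrail log file is a JSON object with a top-level Records array. That each record is hashed (rather than each file) is an assumption, and the function names are hypothetical.

```python
import gzip
import hashlib
import json
from pathlib import Path


def read_records(path: Path) -> list[dict]:
    """Load one CloudTrail file (.json or .json.gz) and return its Records array."""
    opener = gzip.open if path.suffix == ".gz" else open
    with opener(path, "rt", encoding="utf-8") as fh:
        return json.load(fh).get("Records", [])


def record_digest(record: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def collect_unique(log_dir: Path) -> list[dict]:
    """Walk log_dir recursively and keep the first copy of each distinct record."""
    seen: set[str] = set()
    unique: list[dict] = []
    for path in sorted(log_dir.rglob("*")):
        if not (path.is_file() and path.name.endswith((".json", ".json.gz"))):
            continue
        for record in read_records(path):
            digest = record_digest(record)
            if digest not in seen:  # assumption: dedup is per record
                seen.add(digest)
                unique.append(record)
    return unique
```

Calling collect_unique(Path("logs")) from the docker/ directory would return the deduplicated records from the files copied in step 2. Hashing a canonical encoding means the same event delivered twice, or with its keys in a different order, is kept only once.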

With GeoIP enrichment (optional)

Place GeoLite2 .mmdb files in docker/data/geoip/, then:

docker compose --profile ingest run --rm ingester ingest \
  --path /data/logs \
  --geoip-city /data/geoip/GeoLite2-City.mmdb \
  --geoip-asn  /data/geoip/GeoLite2-ASN.mmdb

Common Commands

All commands are run from the docker/ directory.

docker compose down && docker compose up -d --build      # Rebuild & restart
docker compose logs -f                                   # View logs
docker compose --profile resync run --rm superset-resync # Fix blank dashboard after re-ingest

Modules

| Module | Language | Role | README |
|---|---|---|---|
| ingester | Rust 1.85+ | CloudTrail log ingestion (READ_WRITE) | ingester/README.md |
| agent | Python 3.12+ / Streamlit | AI-assisted interactive chat for threat hunting (READ_ONLY) | agent/README.md |
| dashboard | Apache Superset | BI visualization (READ_ONLY) | dashboard/README.md |

License

Apache License 2.0; see LICENSE for details. See NOTICE for third-party license attributions.

Acknowledgements

This project exists thanks to these wonderful projects and datasets :)

About

🪽Docker Compose-based AWS CloudTrail threat hunting tool. Ingests logs into DuckDB with Rust, and lets you query them in natural language via an AI-powered Streamlit UI, with no SIEM and no cloud dependency.🪽
