diff --git a/.claude/skills/response-playbooks/SKILL.md b/.claude/skills/response-playbooks/SKILL.md
new file mode 100644
index 0000000..aaecb4c
--- /dev/null
+++ b/.claude/skills/response-playbooks/SKILL.md
@@ -0,0 +1,277 @@
---
name: response-playbooks
description: >
  Detection-to-response mapping and SOAR playbook design. Analyzes detections,
  recommends tiered response actions (observe, investigate, contain, remediate),
  and produces handoff docs for fusion-workflows to generate workflow YAML.
  Use when planning response automation for detections, designing SOAR playbooks,
  or mapping detections to Falcon Fusion workflow actions.
allowed-tools: Read, Write, Grep, Glob, Bash
---

# Response Playbooks

Turn detections into automated response. This skill analyzes what a detection looks for, recommends tiered response actions appropriate to the threat, and hands off to `fusion-workflows` for workflow YAML generation. It bridges the gap between "alert fires" and "something happens."

> **Response architect, not workflow builder.** This skill decides *what response* fits a detection. The `fusion-workflows` skill builds the actual workflow YAML.

## When to Use This Skill

- You have detections deployed (or proposed) and need response automation
- You want to design SOAR playbooks for a set of detections
- You need to map detection severity/type to appropriate response actions
- You're deciding which detections warrant automated containment vs. notification-only

## Handoff Input

This skill accepts:

- **Direct invocation** — user points at detections by name, file path, or description
- **Handoff doc from source-threat-modeling** — read the doc and use its context for Phase 1

If consuming a handoff doc, skip Phase 1 questions already answered in the doc.

---

## Phase 1: Detection Intake

**Goal:** Understand what each detection looks for and what's at stake.

For each detection the user wants response automation for:

### Step 1: Read the detection

- If it's a deployed detection: read from `resources/detections/`
- If it's from a handoff doc: read the context provided
- If described verbally: ask enough questions to understand the threat

Gather:

- Detection name and CQL logic
- Severity level
- MITRE ATT&CK mapping (technique + tactic)

### Step 2: Identify the center entity

What is the primary subject of this detection?

| Entity Type | Examples |
|---|---|
| User account | Identity-based threats — credential attacks, privilege escalation |
| Host/endpoint | Malware, LOLBins, persistence, lateral movement |
| IP address | Network-based threats — C2, scanning, exfiltration |
| Cloud resource | Infrastructure threats — config changes, IAM abuse |
| Application | SaaS/app-level threats — data theft, API abuse |

### Step 3: Assess blast radius

If this detection fires as a true positive, how bad is it?

| Blast Radius | Meaning | Examples |
|---|---|---|
| **Critical** | Active compromise, data loss in progress | Ransomware, active exfiltration, admin account takeover |
| **High** | Escalation or movement underway | Privilege escalation, lateral movement, credential theft |
| **Medium** | Suspicious but contained | Policy violation, anomalous login, config change |
| **Low** | Informational, worth tracking | New device, unusual time of day, minor anomaly |

**Present the intake summary** (detection, entity, blast radius) and confirm with user before proceeding.

**STOP** — Get user confirmation on the intake assessment.

---

## Phase 2: Response Recommendation

**Goal:** Propose a tiered response plan for each detection.
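
The intake summary from Phase 1 travels into this phase as a small record. A minimal sketch in Python — the field names and example values are illustrative, not a schema this skill defines:

```python
from dataclasses import dataclass

VALID_BLAST_RADIUS = {"critical", "high", "medium", "low"}

@dataclass(frozen=True)
class DetectionIntake:
    """One detection's Phase 1 intake summary (illustrative fields)."""
    name: str
    severity: str           # the detection's own severity level
    mitre_technique: str    # e.g. "T1621" (MFA Request Generation)
    center_entity: str      # user / host / ip / cloud_resource / application
    blast_radius: str       # see the blast radius table above

    def __post_init__(self):
        # Reject values outside the four-level blast radius scale
        if self.blast_radius not in VALID_BLAST_RADIUS:
            raise ValueError(f"unknown blast radius: {self.blast_radius!r}")

intake = DetectionIntake(
    name="okta-mfa-fatigue-success",
    severity="critical",
    mitre_technique="T1621",
    center_entity="user",
    blast_radius="high",
)
```

Everything in the tier recommendation below keys off `center_entity`, `severity`, and `blast_radius`, so validating them at intake keeps the later steps simple.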

### Response Tiers

| Tier | Name | Automation Level | Description |
|---|---|---|---|
| **Tier 1** | Observe | Always auto-fire | Create case, log event, notify Slack/email |
| **Tier 2** | Investigate | Always auto-fire | Enrich alert with host details, user history, related alerts |
| **Tier 3** | Contain | Requires human approval | Disable user, isolate host, revoke session, block IP |
| **Tier 4** | Remediate | Manual only | Reset credentials, remove persistence, restore from backup |

### Recommendation Process

For each detection:

**1. Tier 1 — Observe (every detection gets this)**

- Create case with appropriate severity
- Notify the relevant channel (Slack, email, PagerDuty based on severity)
- Log to SIEM for correlation

**2. Tier 2 — Investigate (based on center entity)**

| Entity Type | Enrichment Actions |
|---|---|
| User account | Recent auth history, group memberships, risk score, MFA status |
| Host/endpoint | Running processes, network connections, login history, installed software |
| IP address | Geo lookup, reputation check, historical connections, associated users |
| Cloud resource | Config change history, access logs, associated IAM roles |
| Application | Recent API calls, data access patterns, admin actions |

**3. Tier 3 — Contain (based on severity + threat type)**

| Severity | Threat Type | Containment Recommendation |
|---|---|---|
| Critical | Active attack (exfil, ransomware, admin takeover) | Immediate containment with approval gate |
| Critical | Credential compromise (MFA fatigue, brute force success) | Disable user + revoke sessions with approval gate |
| High | Lateral movement | Isolate host with approval gate |
| High | Privilege escalation | Suspend elevated access with approval gate |
| Medium or below | Any | Skip containment — observe and investigate only |

**Every Tier 3 action MUST have a human approval gate.** Never auto-execute containment.
False positive containment is worse than delayed response.

**4. Tier 4 — Remediate (always manual, but document it)**

Document what the SOC should do after containment:

- Credential reset procedures
- Host reimaging steps
- Config rollback process
- Evidence preservation requirements
- Scope assessment (who/what else was affected?)

### Validate available actions

Before recommending, confirm what's actually available in the tenant:

```bash
python .claude/skills/fusion-workflows/scripts/action_search.py --vendor <vendor>
python .claude/skills/fusion-workflows/scripts/action_search.py --use-case "<use case>"
```

Only recommend actions that are available. If a recommended action isn't available, note it as a gap and suggest alternatives.

### Present the response plan

For each detection, present:

```
Detection: <name>
Severity: <severity> | Entity: <center entity> | Threat: <threat summary>

Tier 1 (auto):
  - Create case (severity: <severity>)
  - Notify <channel> with: detection name, affected entity, key indicators

Tier 2 (auto):
  - <enrichment actions for the center entity>

Tier 3 (approval required):
  - <containment action> — requires SOC analyst approval

Tier 4 (manual):
  - <documented remediation steps>
```

**STOP** — Get user approval on the response plan before generating handoff docs.

---

## Phase 3: Workflow Generation & Handoff

**Goal:** Produce handoff documents for `fusion-workflows` to build the actual workflow YAML.

For each approved response plan, write a handoff doc to `docs/handoffs/`.

**Filename:** `YYYY-MM-DD-response-playbooks-to-fusion-workflows-<detection-slug>.md`

**Template:**

```markdown
# Handoff: Response Playbooks → Fusion Workflows

## Objective

Create a Fusion workflow to automate response for: <detection name>

## Source

- **Produced by:** response-playbooks skill
- **Date:** <date>
- **Target skill:** fusion-workflows

## Context

| Field | Value |
|---|---|
| Detection name | <name> |
| Detection resource_id | <resource_id> |
| Severity | <severity> |
| Center entity | <entity type> |
| MITRE technique | <technique ID> |
| MITRE tactic | <tactic> |

## Approved Response Plan

### Tier 1 — Observe (auto-fire)

- **Create case** — severity: <severity>, title template: "<title>"
- **Notify** — channel: <channel>, include: detection name, affected entity, source IP, event count

### Tier 2 — Investigate (auto-fire)

- **Enrich** — <enrichment actions for the center entity>

### Tier 3 — Contain (approval required)

- **<containment action>** — <details>
- **Approval gate:** SOC analyst must approve before execution via <approval mechanism>

### Tier 4 — Remediate (manual)

- <remediation steps for the SOC>

## Decisions Made

These have been reviewed and approved by the user. The receiving skill should NOT re-ask:

- <decision 1>
- <decision 2>

## Constraints

- Tier 3 actions MUST have a human approval gate — never auto-execute containment
- Notification must include: <required fields>
- <other constraints>

## Workflow Structure

- **Trigger:** Detection alert (detection name: <name>)
- **Flow:** Tier 1 actions → Tier 2 enrichment → conditional Tier 3 containment (with approval)
- **Approval mechanism:** <mechanism>

## Artifacts

- <related files or references>
```

**Tell the user:** "Handoff doc written to `<path>`. Invoke the `fusion-workflows` skill and point it at this file to generate the workflow YAML."

---

## Response Pattern Library

Common detection-to-response mappings. These inform recommendations but don't override human judgment — use them as starting points.

| Detection Type | Entity | Severity | Tier 1 | Tier 2 | Tier 3 | Tier 4 |
|---|---|---|---|---|---|---|
| Credential attack (brute force, stuffing, MFA fatigue) | User | Critical | Case + Slack | Auth history, risk score | Disable user, revoke sessions | Credential reset |
| Privilege escalation (new admin, role change) | User | High | Case + Slack | Access audit, change history | — | Review and revert access |
| Data exfiltration (bulk download, unusual export) | User/App | Critical | Case + Slack | Download history, data classification | Revoke sessions, block IP | Audit data exposure |
| Suspicious network (C2 beacon, unusual dest) | Host | High | Case + Slack | Process list, network connections | Isolate host | Reimage, hunt for lateral |
| Cloud config change (SG, IAM policy) | Cloud Resource | Medium | Case + Slack | Config diff, who-changed-what | — | Revert change |
| Anomalous login (impossible travel, new device) | User | Medium | Case + Slack | Login history, device inventory | — | — |
| Audit log tampering | Source | Critical | Case + Slack + page | Log gap analysis | Isolate source, freeze state | Forensic investigation |
| Service account abuse | User | High | Case + Slack | Service account scope, recent API calls | Rotate credentials | Audit all service account access |

---

## Key Principles

1. **Never auto-execute containment.** Tier 3 actions always require human approval. False positive containment is worse than delayed response.
2. **Every detection gets Tier 1.** At minimum: create a case and notify someone.
3. **Match response to blast radius.** A medium-severity anomalous login doesn't need host isolation.
4. **Validate before recommending.** Use action discovery to confirm what's available in the tenant.
5. **Document Tier 4 even though it's manual.** The SOC needs to know what comes after containment.

diff --git a/.claude/skills/source-threat-modeling/SKILL.md b/.claude/skills/source-threat-modeling/SKILL.md
new file mode 100644
index 0000000..a957a47
--- /dev/null
+++ b/.claude/skills/source-threat-modeling/SKILL.md
@@ -0,0 +1,284 @@
---
name: source-threat-modeling
description: >
  Threat-model-first detection planning for data sources without OOTB coverage.
  Analyzes what threats apply to a source type, validates against live log data,
  and produces a prioritized detection backlog with handoff docs for authoring skills.
  Use when onboarding a new data source, planning detection coverage for a source
  without OOTB templates, or assessing what threats a source can detect.
allowed-tools: Read, Write, Grep, Glob, Bash
---

# Source Threat Modeling

Turn a data source into detection coverage. This skill reasons about what threats are relevant to a source type, validates which are detectable in your actual log data, and produces a prioritized detection backlog. It does NOT write detections — it hands off to authoring skills (`behavioral-detections`, `cql-patterns`, `logscale-security-queries`) via handoff documents.

> **Orchestrator, not author.** This skill decides *what* to detect. Existing skills decide *how* to write it.

## When to Use This Skill

- A new data source is connected to NGSIEM and has no OOTB detection templates
- You want to assess detection coverage gaps for an existing source
- You need a structured threat model before building bespoke detections
- You're onboarding a source type you haven't worked with before

## Handoff Input

This skill can be invoked directly or via a handoff document. If a handoff doc is provided, read it first and skip questions already answered.

---

## Phase 1: Source Identification

**Goal:** Establish what source we're working with and how to query it.

Ask the user:

1. **What product/vendor is the data source?** (e.g., Okta, GitHub audit logs, Cisco ASA, Zscaler)
2. **What log types are being ingested?** (authentication, admin activity, network flow, API audit, etc.)
3. **What is the NGSIEM scope filter?**
   - e.g., `#Vendor="okta"`, `#repo="some_repo"`, `#event.module="some_module"`
   - If the user doesn't know, help discover it:
     ```
     * | groupBy([@repo, #Vendor, #event.module], limit=20)
     ```
4. **What role does this source play in the environment?** (identity provider, network perimeter, cloud infrastructure, application-level, endpoint, email/collaboration)

**Check for existing coverage:**

- Scan `resources/detections/` for any rules already targeting this source
- Note what's covered so we don't duplicate

**Output:** A CQL scope filter and source profile that constrains all subsequent work.

**STOP** — Confirm the scope filter and source profile with the user before proceeding to threat modeling.

---

## Phase 2: Threat Modeling

**Goal:** Enumerate threats this source can observe, mapped to MITRE ATT&CK.

Work through three threat categories systematically:

### Category A: Abuse of the monitored system

What can an attacker do *through* the system this source monitors?

| Source Role | Example Threats |
|---|---|
| Identity provider | Credential stuffing, MFA bypass, session hijacking, account takeover |
| Network perimeter | C2 communication, lateral movement, data exfiltration, port scanning |
| Cloud infrastructure | Resource abuse, privilege escalation, config tampering, data access |
| Application | Injection, unauthorized access, data theft, API abuse |
| Endpoint | Malware execution, LOLBins, persistence mechanisms, credential dumping |
| Email/collaboration | Phishing, BEC, forwarding rules, delegation abuse |

### Category B: Compromise of the source itself

What does it look like when the source system is the target?

- Admin account takeover on the source platform
- Audit log tampering or deletion
- Security configuration changes (weakened settings)
- API key or token compromise
- Integration or connector manipulation

### Category C: Lateral movement and escalation

What cross-system activity is visible through this source?

- Privilege escalation (new admin grants, role changes)
- Cross-tenant or cross-account activity
- Service account abuse
- Access to new resources or scopes not previously seen

**For each threat scenario, document:**

| Field | Description |
|---|---|
| Threat scenario | Plain-language description of the attack |
| MITRE technique | ATT&CK technique ID (e.g., T1621) |
| MITRE tactic | ATT&CK tactic (e.g., Credential Access) |
| Expected event types | What log events would reveal this activity |
| Expected severity | Critical / High / Medium / Low |

**Present the threat model to the user as a ranked table.** Rank by severity and likelihood. Discuss and refine before proceeding to log validation.

**STOP** — Get user approval on the threat model before querying live data.
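
When building the ranked table, severity first and likelihood second gives a natural sort key. A sketch of that ordering — the example scenarios and the rank weights are illustrative, not output of this skill:

```python
# Lower rank value = presented earlier in the threat model table
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}
LIKELIHOOD_RANK = {"high": 0, "medium": 1, "low": 2}

scenarios = [
    {"threat": "Anomalous login location", "technique": "T1078",
     "severity": "medium", "likelihood": "high"},
    {"threat": "Audit log tampering", "technique": "T1562.008",
     "severity": "critical", "likelihood": "low"},
    {"threat": "MFA fatigue push spam", "technique": "T1621",
     "severity": "critical", "likelihood": "medium"},
]

# Severity dominates; likelihood breaks ties within a severity band
ranked = sorted(
    scenarios,
    key=lambda s: (SEVERITY_RANK[s["severity"]], LIKELIHOOD_RANK[s["likelihood"]]),
)
# ranked[0] is the critical/medium-likelihood MFA fatigue scenario
```

The ordering is only a starting point for the discussion with the user — refinement in this phase can and should reorder it.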

---

## Phase 3: Log Validation

**Goal:** Confirm which threats from Phase 2 are actually detectable in the live data.

For each threat scenario, run exploratory CQL queries using `mcp__crowdstrike__ngsiem_query`:

### Step 1: Event type discovery

Map what the source actually emits:

```
<scope filter>
| groupBy([event.type, event.action, event.category], limit=50)
```

### Step 2: Per-scenario validation

For each threat scenario from Phase 2:

**a) Do the required event types exist?**

```
<scope filter> event.type="<expected type>"
| count()
```

**b) What's the volume?** (informs threshold decisions)

```
<scope filter> event.type="<expected type>"
| bucket(span=1d, function=count())
```

**c) What fields are available?** (informs detection logic)

```
<scope filter> event.type="<expected type>"
| head(10)
```

**d) What does "normal" look like?** (informs baselines)

```
<scope filter> event.type="<expected type>"
| groupBy([<key field>], function=count())
| sort(_count, order=desc)
| head(20)
```

### Step 3: Feasibility classification

For each threat scenario, classify:

| Classification | Meaning | Action |
|---|---|---|
| **Detectable** | Required events exist, fields available, reasonable volume | Keep — proceed to backlog |
| **Partially detectable** | Some events present but missing key fields or context | Discuss with user — worth pursuing with limitations? |
| **Not detectable** | Required events don't exist in the data | Prune from backlog |
| **Surprising find** | Unexpected patterns worth investigating | Flag for user — potential quick win or live incident |

**Present results.** For each scenario, show: classification, event types found, key fields available, daily volume estimate, and any surprising observations.

**STOP** — Discuss findings with user. Prune not-detectable scenarios. Decide on partial detections.

---

## Phase 4: Detection Backlog & Handoff

**Goal:** Produce a prioritized detection backlog and hand off selected detections to authoring skills.
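
The Phase 3 feasibility call is ultimately a judgment made with the user, but the mechanical part reduces to a small decision rule. A sketch — the three boolean inputs are assumptions about what the validation queries showed, not fields the skill defines:

```python
def classify_feasibility(events_exist: bool,
                         key_fields_present: bool,
                         surprising_pattern: bool) -> str:
    """Map Phase 3 validation results onto the classification table (sketch)."""
    if surprising_pattern:
        return "surprising find"       # flag for the user regardless of the rest
    if not events_exist:
        return "not detectable"        # prune from backlog
    if not key_fields_present:
        return "partially detectable"  # discuss limitations with the user
    return "detectable"                # keep — proceed to backlog
```

Only "detectable" and user-approved "partially detectable" scenarios carry forward into the backlog built below.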

### Build the backlog

For each validated threat scenario, create a backlog entry:

| Field | Description |
|---|---|
| **Priority** | 1 (highest) through N — based on severity + feasibility |
| **Threat scenario** | Plain-language description |
| **MITRE mapping** | Technique ID + tactic |
| **Detection approach** | `simple` (single event match), `threshold` (aggregation), `behavioral` (multi-event correlation) |
| **Complexity** | Low (single event) / Medium (aggregation/threshold) / High (multi-event correlation) |
| **Key event types** | From log validation |
| **Key fields** | From log validation |
| **Volume estimate** | Events/day from log validation |
| **Recommended skill** | Which authoring skill should build this |

### Skill routing

| Detection Approach | Route To | Why |
|---|---|---|
| Multi-event attack chains (deny-then-success, create-then-escalate) | `behavioral-detections` | Needs `correlate()` function |
| Threshold/aggregation rules (N events in T time) | `cql-patterns` | Pattern-based aggregation |
| Simple event matching or complex field logic | `logscale-security-queries` | General CQL development |

### Present the backlog

Show the full backlog as a prioritized table. User selects which detections to pursue.

### Generate handoff documents

For each selected detection, write a handoff doc to `docs/handoffs/`.

**Filename:** `YYYY-MM-DD-threat-model-to-<target-skill>-<detection-slug>.md`

**Template:**

```markdown
# Handoff: Source Threat Modeling → <Target Skill>

## Objective

Author a detection for: <threat scenario>

## Source

- **Produced by:** source-threat-modeling skill
- **Date:** <date>
- **Target skill:** <target skill>

## Context

| Field | Value |
|---|---|
| Data source | <product/vendor> |
| CQL scope filter | `<scope filter>` |
| Source role | <role> |
| Threat scenario | <description> |
| MITRE technique | <technique ID> |
| MITRE tactic | <tactic> |
| Detection approach | <simple/threshold/behavioral> |
| Estimated severity | <severity> |

### Key Event Types

- `<event type>` — <what it represents>

### Key Fields

- `<field>` — <why it matters>

### Volume Notes

<daily volume estimate and threshold considerations>

## Decisions Made

These have been reviewed and approved by the user. The receiving skill should NOT re-ask:

- <decision 1>
- <decision 2>

## Constraints

- 120s NGSIEM query timeout — keep correlation windows reasonable
- <constraint>
- <constraint>

## Artifacts

- <related files or references>
```

**Tell the user:** "Handoff doc written to `<path>`. Invoke the `<target skill>` skill and point it at this file to begin authoring."

---

## Reference: Common Source Types

| Source Category | Examples | Typical Threat Focus |
|---|---|---|
| Identity Provider | Okta, EntraID, Ping, Auth0 | Credential attacks, MFA bypass, admin takeover, privilege escalation |
| Cloud Infrastructure | AWS CloudTrail, GCP Audit, Azure Activity | Resource abuse, IAM escalation, config tampering, data access |
| Network Security | Cisco ASA, Palo Alto, Zscaler, Akamai | C2 comms, lateral movement, exfiltration, scanning |
| Source Code / DevOps | GitHub, GitLab, Bitbucket | Code theft, secret exposure, pipeline compromise, access changes |
| SaaS Applications | Salesforce, Workday, ServiceNow | Data exfiltration, privilege abuse, config changes |
| Endpoint | CrowdStrike EDR, Carbon Black, SentinelOne | Malware, LOLBins, persistence, credential dumping |
| Email / Collaboration | M365, Google Workspace | Phishing, BEC, forwarding rules, delegation abuse |

This table is a starting point for Phase 2 — reason about threats specific to the actual product, not just the category.
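
The Phase 4 handoff filename convention can be generated mechanically from the date, target skill, and scenario name. A sketch — the slug normalization rule here is an assumption, not part of the convention:

```python
import re
from datetime import date

def handoff_filename(target_skill: str, detection: str, on: date) -> str:
    """Build YYYY-MM-DD-threat-model-to-<target-skill>-<detection-slug>.md (sketch)."""
    # Collapse anything that isn't lowercase alphanumeric into single hyphens
    slug = re.sub(r"[^a-z0-9]+", "-", detection.lower()).strip("-")
    return f"{on.isoformat()}-threat-model-to-{target_skill}-{slug}.md"

handoff_filename("behavioral-detections", "Okta MFA Fatigue", date(2025, 1, 15))
# → "2025-01-15-threat-model-to-behavioral-detections-okta-mfa-fatigue.md"
```

Deriving the name from the same fields that appear in the handoff doc keeps filenames greppable by date, skill, and scenario.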
diff --git a/.gitignore b/.gitignore
index ab84673..a4d0d02 100644
--- a/.gitignore
+++ b/.gitignore
@@ -34,3 +34,9 @@ docs/superpowers/specs/
 # OS
 .DS_Store
 Thumbs.db
+
+# docs/superpowers (specs, plans — working documents, not committed)
+docs/superpowers/
+# Handoff docs (ephemeral working artifacts between skills)
+docs/handoffs/
+!docs/handoffs/.gitkeep
diff --git a/CLAUDE.md b/CLAUDE.md
index 46589fc..a38454e 100644
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -65,8 +65,11 @@ Skills live in `.claude/skills/` and are invoked via Claude Code commands.
 | `logscale-security-queries` | LogScale/NGSIEM query reference and investigation playbooks | Stable |
 | `fusion-workflows` | Falcon Fusion workflow templates and YAML schema | Stable |
 | `detection-tuning` | FP tuning patterns with enrichment function catalog | Stable |
+| `source-threat-modeling` | Threat-model-first detection planning for new data sources | New |
+| `response-playbooks` | Detection-to-response mapping and SOAR playbook design | New |
 | `threat-hunting` | Autonomous PEAK-based threat hunting — hypothesis, intel, baseline hunts | Experimental |
+
 
 ### Commands
 
 | Command | Description |
diff --git a/docs/handoffs/.gitkeep b/docs/handoffs/.gitkeep
new file mode 100644
index 0000000..e69de29