Merged
41 changes: 41 additions & 0 deletions CLAUDE.md
@@ -3,6 +3,17 @@
Conventions for authoring this skill. This governs how skill content is **written** and
**validated**.

# General rules

Never open responses with filler phrases like "Great question!", "Of course!", "Certainly!", or similar warmups. Start every response with the actual answer. No preamble, no acknowledgment of the question.

Match response length to task complexity. Simple questions get direct, short answers. Complex tasks get full, detailed responses. Never pad responses with restatements of the question or closing sentences that repeat what you just said.

Before any significant task, show me 2-3 ways you could approach this work. Wait for me to choose before proceeding.

If you are uncertain about any fact, statistic, date, or piece of technical information: say so explicitly before including it. Never fill gaps in your knowledge with plausible-sounding information. When in doubt, say so.


## Writing style

- **Be concise.** Technical documentation, not an essay. Favor tables, command recipes, and short
@@ -20,6 +31,36 @@ Conventions for authoring this skill. This governs how skill content is **writte
- **Anchor to canonical docs.** Each reference doc cites the upstream CrowdSec docs URL it derives
from. Claims trace to canonical documentation, not to memory.

## Content structure

`SKILL.md` is the router — a symptom/intent-indexed table that points into `references/`.
All depth lives in `references/<area>/`, organized by the axis that fits the area:

| Dir | Organized by | Notes |
|---|---|---|
| `install/` | **platform** (one file each) | `bare-metal.md` (apt/dnf + systemd), `docker.md`, `kubernetes.md`, `console.md` (enrollment) — install mechanics genuinely diverge per platform. |
| `configure/` | **config domain** | `acquisition`, `hub`, `profiles`, `notifications`, `allowlists`; platforms merged inline. `configure/bouncers/` nests one level by **service type** (`firewall`, `web-servers`). |
| `operate/` | **task** | `health-check`, `upgrades`, `multi-server`. |
| `appsec/` | **lifecycle** | `overview` → `deploy` → `configure` → `troubleshoot` (the WAF/AppSec feature silo). |
| `debug/` | **kind** | `common/` (`triage`, `errors`, `platform-gotchas`) + `symptoms/` (`parsing`, `no-alerts`, `not-blocked`). Feature troubleshooting is *routed to* the feature's own dir (e.g. AppSec → `appsec/troubleshoot.md`), not duplicated under debug/. |
| `migrate/` | **source product** | `from-fail2ban`. |
| `scripts/` | — | helper scripts (`diagnose.sh`, `check-verification.py`); stdlib/bash only, runnable in static checks. |

**Separate files vs. inline prefix notes.** When deciding whether a platform variant gets its own file:

- **Split into separate files** only when the *content itself* diverges — package managers, file
paths, install/upgrade mechanics. `install/` is the canonical case.
- **Keep one file with inline command-prefix notes** when the task is identical and only the
invocation differs (`sudo cscli …` → `docker exec <name> …` → `kubectl exec -n <ns> <pod> -- …`).
This is the default across `configure/`, `operate/`, `appsec/`, and `debug/`.
- **Genuinely platform-specific *failure modes*** (not just prefixes — e.g. container mounts,
SELinux/AppArmor, k8s RBAC) collect in one place (`debug/common/platform-gotchas.md`) rather than
fragmenting a single symptom across per-platform files.
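
The "same task, different invocation" rule can be illustrated with a small helper that wraps one `cscli` call in the right prefix for each platform. This is a minimal sketch and not part of the skill: the function name `cscli_argv` and its parameters are hypothetical, and the three prefixes are taken directly from the bullet above.

```python
from typing import List, Optional


def cscli_argv(
    platform: str,
    args: List[str],
    *,
    container: Optional[str] = None,
    namespace: Optional[str] = None,
    pod: Optional[str] = None,
) -> List[str]:
    """Build the argv for `cscli <args>` on one platform (hypothetical helper)."""
    base = ["cscli", *args]
    if platform == "bare-metal":
        # Host install: cscli runs directly, with sudo.
        return ["sudo", *base]
    if platform == "docker":
        if not container:
            raise ValueError("docker requires a container name")
        return ["docker", "exec", container, *base]
    if platform == "kubernetes":
        if not (namespace and pod):
            raise ValueError("kubernetes requires a namespace and a pod")
        return ["kubectl", "exec", "-n", namespace, pod, "--", *base]
    raise ValueError(f"unknown platform: {platform!r}")
```

Only the prefix changes between platforms while the `cscli` arguments stay identical, which is why one file with inline prefix notes is enough for these areas.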

**Keep this current.** When you add, move, or remove a `references/` directory — or change an
area's organizing axis — update the table above in the *same* change. This section is the
authoritative map of the layout; let it drift and it stops being trustworthy.
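
One way to notice drift early is a static check that compares the `Dir` column of the table with the directories actually present. The sketch below is an illustration only: `documented_dirs`, `layout_drift`, and the `ignore` parameter are hypothetical names, and it assumes every table row starts with a backticked directory name such as `` `install/` ``.

```python
import re
from pathlib import Path
from typing import AbstractSet, Set, Tuple

# First cell of a layout-table row, e.g.: | `install/` | **platform** | ...
ROW_DIR = re.compile(r"^\|\s*`([\w-]+)/`\s*\|")


def documented_dirs(markdown: str) -> Set[str]:
    """Collect the directory names listed in the first column of the table."""
    dirs = set()
    for line in markdown.splitlines():
        match = ROW_DIR.match(line)
        if match:
            dirs.add(match.group(1))
    return dirs


def layout_drift(
    markdown: str,
    references: Path,
    ignore: AbstractSet[str] = frozenset(),
) -> Tuple[Set[str], Set[str]]:
    """Return (on disk but undocumented, documented but missing on disk)."""
    documented = documented_dirs(markdown) - set(ignore)
    on_disk = {p.name for p in references.iterdir() if p.is_dir()} - set(ignore)
    return on_disk - documented, documented - on_disk
```

Both returned sets being empty means the table and the tree agree. A row for a directory that lives outside `references/` would need to be passed in `ignore`.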

## Testing

- **Nothing ships unverified.** Every command and every expected outcome must have been
14 changes: 8 additions & 6 deletions skills/crowdsec/SKILL.md
@@ -75,11 +75,13 @@ Docker/k8s commands run inside the container/pod and do not need this.
| "upgrade", "back up", "roll back", "new version", "tainted items after upgrade" | [references/operate/upgrades.md](./references/operate/upgrades.md) |
| "multiple agents", "remote LAPI", "mTLS", "postgres backend" | [references/operate/multi-server.md](./references/operate/multi-server.md) *(TODO — stub)* |
| "is it working?", "smoke test", "validate install", "verify setup", "did detection / WAF / blocking actually wire up?" | [references/operate/health-check.md](./references/operate/health-check.md) |
| "it's broken" / "not working" / general diagnosis | [references/debug/triage.md](./references/debug/triage.md) → run `bash ${CLAUDE_SKILL_DIR}/scripts/diagnose.sh` |
| "logs not parsed", "0 parsed" | [references/debug/parsing.md](./references/debug/parsing.md) |
| "no alerts firing" | [references/debug/no-alerts.md](./references/debug/no-alerts.md) |
| "decision exists but not blocked" | [references/debug/bouncer-not-blocking.md](./references/debug/bouncer-not-blocking.md) |
| Specific error message | [references/debug/common-errors.md](./references/debug/common-errors.md) |
| **Debug — common** · "it's broken" / "not working" / general diagnosis | [references/debug/common/triage.md](./references/debug/common/triage.md) → run `bash ${CLAUDE_SKILL_DIR}/scripts/diagnose.sh` |
| **Debug — common** · specific error string | [references/debug/common/errors.md](./references/debug/common/errors.md) |
| **Debug — common** · "container can't see logs", "mount", "SELinux/AppArmor denied", "k8s RBAC / DaemonSet" | [references/debug/common/platform-gotchas.md](./references/debug/common/platform-gotchas.md) |
| **Debug — by symptom** · "logs not parsed", "0 parsed" | [references/debug/symptoms/parsing.md](./references/debug/symptoms/parsing.md) |
| **Debug — by symptom** · "no alerts firing" | [references/debug/symptoms/no-alerts.md](./references/debug/symptoms/no-alerts.md) |
| **Debug — by symptom** · "decision exists but not blocked" | [references/debug/symptoms/not-blocked.md](./references/debug/symptoms/not-blocked.md) |
| **Debug — by feature** · AppSec/WAF not blocking, false positives, captcha | [references/appsec/troubleshoot.md](./references/appsec/troubleshoot.md) |
| "switch from fail2ban" | [references/migrate/from-fail2ban.md](./references/migrate/from-fail2ban.md) *(TODO — stub)* |
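
Because the router rows point at files by relative path, a move like the `debug/` split can leave stale targets behind. The following is a minimal sketch of a static link check, assuming plain inline markdown links of the form `[text](target)`; the function name `broken_relative_links` is hypothetical and not one of the skill's scripts.

```python
import re
from pathlib import Path
from typing import List

# Inline markdown links: [text](target). Only the target is captured.
LINK = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def broken_relative_links(doc: Path) -> List[str]:
    """Return relative link targets in `doc` that do not resolve to an existing path."""
    broken = []
    for target in LINK.findall(doc.read_text(encoding="utf-8")):
        # Skip external URLs, mail links, and in-page anchors.
        if "://" in target or target.startswith(("#", "mailto:")):
            continue
        path = target.split("#", 1)[0]  # drop any trailing anchor
        if not (doc.parent / path).exists():
            broken.append(target)
    return broken
```

Run against `SKILL.md` and each reference doc, an empty result suggests that every relative link still resolves after the move.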

For anything debug-shaped, the first move is almost always:
@@ -134,7 +136,7 @@ Where things live on a default bare-metal install:
Confirm with the user before any of these:

- `cscli decisions delete --all` — wipes every active ban including CAPI-pulled blocklists. Use targeted `delete -i`, `delete -r`, `delete --id`, `delete --origin lists --scenario <name>`.
- Editing hub-managed files under `/etc/crowdsec/{parsers,scenarios,collections,postoverflows,contexts}/` instead of the sibling `_custom/` directory — see [references/debug/triage.md](./references/debug/triage.md) § Hard don'ts.
- Editing hub-managed files under `/etc/crowdsec/{parsers,scenarios,collections,postoverflows,contexts}/` instead of the sibling `_custom/` directory — see [references/debug/common/triage.md](./references/debug/common/triage.md) § Hard don'ts.
- Disabling a signature collection wholesale to silence a false positive — pick the right suppression layer (allowlist / whitelist parser / postoverflow) per [references/configure/allowlists.md](./references/configure/allowlists.md) § Suppression mechanisms.
- Mutating host firewall state (firewall bouncer install, `ipset` flush, iptables↔nftables switch) without confirming — the firewall bouncer can wipe rule chains other tools depend on.
- Skipping `--reset-then-reuse-values` on `helm upgrade crowdsec` — silently drops values.
2 changes: 1 addition & 1 deletion skills/crowdsec/references/configure/acquisition.md
@@ -14,7 +14,7 @@ Acquisition tells the engine **what logs to read and how to label them**. Each s
declares a `source:` (the datasource type) and a `labels.type:` (the parser hint). If the
engine reads lines but they show up as **`Lines unparsed`**, acquisition is usually fine
and the problem is the `type:` or the parser — debug that with
[../debug/parsing.md](../debug/parsing.md). If a source shows **0 `Lines read`**, the
[../debug/symptoms/parsing.md](../debug/symptoms/parsing.md). If a source shows **0 `Lines read`**, the
problem is here.

## Where acquisition lives
4 changes: 2 additions & 2 deletions skills/crowdsec/references/configure/bouncers/firewall.md
@@ -64,7 +64,7 @@ Only register manually when the bouncer runs on a **different host** than LAPI
> `/var/log/crowdsec-firewall-bouncer.log` (and the dpkg `--configure` step errors).
> Re-register: `cscli bouncers delete <name>`, `KEY=$(cscli bouncers add fw-local -o raw)`,
> write it into the yaml's `api_key:`, `systemctl restart crowdsec-firewall-bouncer`.
> See [../../debug/bouncer-not-blocking.md](../../debug/bouncer-not-blocking.md) § 3.
> See [../../debug/symptoms/not-blocked.md](../../debug/symptoms/not-blocked.md) § 3.

## 3 — What it creates in nftables

@@ -140,7 +140,7 @@ sudo cscli decisions delete -i 192.0.2.66
container-to-container blocking matters.
- **"Banned but still reachable"** → almost always `update_frequency` not
elapsed, `disable_ipv6` masking a v6 client, or the bouncer service stopped.
Full decision tree: [../../debug/bouncer-not-blocking.md](../../debug/bouncer-not-blocking.md).
Full decision tree: [../../debug/symptoms/not-blocked.md](../../debug/symptoms/not-blocked.md).

## Teardown

@@ -340,7 +340,7 @@ docker exec crowdsec cscli metrics show appsec # Processed/Blocked increment
- **WAF off silently:** `crowdsecAppsecEnabled` defaults to `false`, and AppSec must listen on
`0.0.0.0:7422` (not loopback) for a containerized Traefik to reach it.
- **`stream` lag:** a fresh ban lands within `updateIntervalSeconds`; immediate ban-then-curl
looks like a failure. (See [../../debug/bouncer-not-blocking.md](../../debug/bouncer-not-blocking.md).)
looks like a failure. (See [../../debug/symptoms/not-blocked.md](../../debug/symptoms/not-blocked.md).)

### Kubernetes (Helm) — extra gotchas

2 changes: 1 addition & 1 deletion skills/crowdsec/references/configure/hub.md
@@ -101,7 +101,7 @@ editing them taints the item and your change is lost on the next `--force` upgra
Instead, drop an override file in the sibling `_custom/` directory for that type
(`scenarios/.../_custom/`, `parsers/.../_custom/`, etc.). Overrides are merged on top of the
hub item by `name`, survive upgrades, and keep the hub item pristine. See
[../debug/triage.md](../debug/triage.md) § Hard don'ts and the SKILL.md Hard don'ts list.
[../debug/common/triage.md](../debug/common/triage.md) § Hard don'ts and the SKILL.md Hard don'ts list.

To remove a collection and its pulled items:

@@ -30,32 +30,32 @@ Match the error string the engine/bouncer printed to the row below.

| Error string | Cause | Fix |
|---|---|---|
| `datasource of type appsec: … cannot parse appsec configuration: [2:3] cannot unmarshal []interface {} into Go struct field Configuration.AppsecConfig of type string` | `appsec_config:` (singular) given a **list** | Use the **plural** key `appsec_configs:` for a list; singular takes one string. See [../appsec/configure.md](../appsec/configure.md). |
| `unable to initialize inband engine : invalid WAF config from string: failed to compile the directive "secrule": duplicated rule id 100` | Two appsec-configs on one listener pull the **same** underlying rule (e.g. both include `base-config`/`vpatch-*`) | Use non-overlapping configs, or just `crowdsecurity/appsec-default` alone. See [../appsec/configure.md](../appsec/configure.md). |
| `no appsec-rules found for pattern <name>` | A bare appsec-config was installed without its rules; engine expands globs at load, `cscli` does not | Install via the **collection** (`cscli collections install crowdsecurity/appsec-virtual-patching`), which pulls the rule graph. See [../appsec/deploy.md](../appsec/deploy.md). |
| `datasource of type appsec: … cannot parse appsec configuration: [2:3] cannot unmarshal []interface {} into Go struct field Configuration.AppsecConfig of type string` | `appsec_config:` (singular) given a **list** | Use the **plural** key `appsec_configs:` for a list; singular takes one string. See [../../appsec/configure.md](../../appsec/configure.md). |
| `unable to initialize inband engine : invalid WAF config from string: failed to compile the directive "secrule": duplicated rule id 100` | Two appsec-configs on one listener pull the **same** underlying rule (e.g. both include `base-config`/`vpatch-*`) | Use non-overlapping configs, or just `crowdsecurity/appsec-default` alone. See [../../appsec/configure.md](../../appsec/configure.md). |
| `no appsec-rules found for pattern <name>` | A bare appsec-config was installed without its rules; engine expands globs at load, `cscli` does not | Install via the **collection** (`cscli collections install crowdsecurity/appsec-virtual-patching`), which pulls the rule graph. See [../../appsec/deploy.md](../../appsec/deploy.md). |
| `no such datasource` / source type unknown | `source:`/`labels.type:` typo or a datasource the build doesn't support | Fix the key in the `acquis.d/*.yaml`; `crowdsec -t` points at the file:line. |
| Source reads lines but **0 parsed** | `type:` label doesn't match any installed parser | [parsing.md](./parsing.md). |
| Source reads lines but **0 parsed** | `type:` label doesn't match any installed parser | [parsing.md](../symptoms/parsing.md). |

## Permissions / OS

| Symptom | Cause | Fix |
|---|---|---|
| `permission denied` opening a log file; or source present but 0 lines read | `crowdsec` user can't read the file | `sudo -u crowdsec head <path>`; fix ownership/ACL. If that user *can* read it but the engine still can't, it's **SELinux/AppArmor** — `ausearch -m avc -ts recent` / `dmesg | grep DENIED`, then relabel/add policy (don't disable enforcement). |
| `permission denied` opening a log file; or source present but 0 lines read | `crowdsec` user can't read the file | `sudo -u crowdsec head <path>`; fix ownership/ACL. If that user *can* read it but the engine still can't, it's **SELinux/AppArmor** → [platform-gotchas.md](./platform-gotchas.md). |
| apt install of a bouncer hangs: `Failed to open terminal … debconf: whiptail output the above errors, giving up!` | A debconf dialog (e.g. pending-kernel notice) on a non-interactive shell | Re-run with `sudo DEBIAN_FRONTEND=noninteractive apt install -y …`. |

## LAPI / CAPI / auth

| Error | Cause | Fix |
|---|---|---|
| Agent: `unable to authenticate … machine not validated` | Agent machine not registered/validated with LAPI | `cscli machines list`; validate with `cscli machines validate <name>` (or re-`cscli machines add` on the agent). |
| Bouncer log: **HTTP 401** on decision pull | Bouncer key ≠ LAPI key (rotated, stale config, re-added) | `cscli bouncers list`; re-issue and paste the key into the bouncer config. [bouncer-not-blocking.md](./bouncer-not-blocking.md) §3. |
| Bouncer log: **HTTP 401** on decision pull | Bouncer key ≠ LAPI key (rotated, stale config, re-added) | `cscli bouncers list`; re-issue and paste the key into the bouncer config. [not-blocked.md](../symptoms/not-blocked.md) §3. |
| `cscli capi status` fails / CAPI register errors | Missing `online_api_credentials.yaml`, **clock skew**, or egress blocked to `api.crowdsec.net` | `cscli capi register` then reload; check `timedatectl` (TLS fails on skew); allow egress / set proxy. |

## Database

| Error | Cause | Fix |
|---|---|---|
| `database is locked` (sqlite) | Concurrent writers / slow disk; sqlite single-writer | Reduce write pressure; move `crowdsec.db` to faster storage; for multi-agent or high volume switch the backend to PostgreSQL — see [../operate/multi-server.md](../operate/multi-server.md). |
| `database is locked` (sqlite) | Concurrent writers / slow disk; sqlite single-writer | Reduce write pressure; move `crowdsec.db` to faster storage; for multi-agent or high volume switch the backend to PostgreSQL — see [../../operate/multi-server.md](../../operate/multi-server.md). |
| sqlite errors + `df` shows full `/var/lib/crowdsec` | Disk full → silent alert-write failure | Free space / rotate; alerts resume. |

## Hub
@@ -69,10 +69,10 @@ Match the error string the engine/bouncer printed to the row below.

| Symptom | Likely cause | Confirm |
|---|---|---|
| Expected ban "not happening" for an IP | The IP matches an **allowlist** | `cscli allowlists check <ip>` → [../configure/allowlists.md](../configure/allowlists.md). |
| Decision exists, traffic still passes | Bouncer latency / scope / key / IP family | Full ladder: [bouncer-not-blocking.md](./bouncer-not-blocking.md). |
| Expected ban "not happening" for an IP | The IP matches an **allowlist** | `cscli allowlists check <ip>` → [../../configure/allowlists.md](../../configure/allowlists.md). |
| Decision exists, traffic still passes | Bouncer latency / scope / key / IP family | Full ladder: [not-blocked.md](../symptoms/not-blocked.md). |

When the string isn't here, capture the full forensic bundle with
[`scripts/diagnose.sh`](../../scripts/diagnose.sh) and read the agent log around
[`scripts/diagnose.sh`](../../../scripts/diagnose.sh) and read the agent log around
the first `level=error`/`FATAL` — the *first* error is usually the root cause;
later ones are fallout.
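
The "first error is usually the root cause" heuristic can be applied mechanically to a captured log. This is a rough sketch under stated assumptions: it treats `level=error`, `level=fatal`, and a bare `FATAL` token as error markers, and the helper name `first_error` is hypothetical. The log lines in the usage example are made up for illustration and are not real engine output.

```python
import re
from typing import Iterable, Optional, Tuple

# Assumed markers: `level=error` / `level=fatal` fields, plus a bare `FATAL` token.
ERROR_MARK = re.compile(r"level=(?:error|fatal)\b|\bFATAL\b")


def first_error(lines: Iterable[str]) -> Optional[Tuple[int, str]]:
    """Return (1-based line number, line) for the first error-level line, or None."""
    for number, line in enumerate(lines, start=1):
        if ERROR_MARK.search(line):
            return number, line.rstrip("\n")
    return None


if __name__ == "__main__":
    # Hypothetical log excerpt, for illustration only.
    sample = [
        'level=info msg="starting"',
        'level=error msg="first failure"',
        'level=fatal msg="follow-on failure"',
    ]
    print(first_error(sample))
```

Reading a few lines on either side of the reported line number usually gives enough context to match the error against the tables above.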