From 605c69ac54181ab2e6a0ad839f6abd16193f36b8 Mon Sep 17 00:00:00 2001 From: Thibault Koechlin Date: Thu, 21 May 2026 14:39:53 +0200 Subject: [PATCH 1/4] first round of fixes from discord threads: acquis common pitfalls, traefik and caddy config --- CHANGELOG.md | 19 ++ PUBLISHING.md | 81 --------- crowdsec/SKILL.md | 9 +- crowdsec/references/appsec/deploy.md | 2 +- crowdsec/references/configure/acquisition.md | 143 ++++++++++++++- .../configure/bouncers/web-servers.md | 170 +++++++++++++++++- crowdsec/references/configure/hub.md | 127 ++++++++++++- crowdsec/references/configure/profiles.md | 147 ++++++++++++++- 8 files changed, 587 insertions(+), 111 deletions(-) delete mode 100644 PUBLISHING.md diff --git a/CHANGELOG.md b/CHANGELOG.md index 4f55a68..705f5fb 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -7,6 +7,25 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0 ## [Unreleased] +### Added +- `references/configure/acquisition.md` — file/journald/docker datasources, the + `labels.type` model, verification with `crowdsec -t` / `cscli metrics show acquisition` + / `cscli explain`, and common pitfalls. +- `references/configure/profiles.md` — alert→decision flow, why alerts don't always ban, + `profiles.yaml` structure, ban/captcha/throttle, `duration_expr` escalation, simulation + mode, and allowlist interaction. +- `references/configure/hub.md` — collections vs items, `update` vs `upgrade`, tainted-item + detection and repair, `_custom/` overrides, and the `sed -i` symlink-break pitfall. +- `references/configure/bouncers/web-servers.md` — full Traefik + (`maxlerebourg/crowdsec-bouncer-traefik-plugin`) and Caddy + (`hslatman/caddy-crowdsec-bouncer`) setup, AppSec wiring, and real-client-IP handling, + replacing the previous canonical-pointer stubs. 
+ +### Changed +- `crowdsec/SKILL.md` — dropped the stub markers on acquisition/profiles/hub, added a + real-client-IP / reverse-proxy routing cue, and corrected the cheat sheet (`cscli profiles + list` does not exist; read `/etc/crowdsec/profiles.yaml`). + ## [0.1.0] - 2026-05-20 ## [0.1.0] - 2026-05-19 diff --git a/PUBLISHING.md b/PUBLISHING.md deleted file mode 100644 index 0914fc4..0000000 --- a/PUBLISHING.md +++ /dev/null @@ -1,81 +0,0 @@ -# Publishing & releasing the CrowdSec skill - -This is the runbook for cutting releases and distributing the skill across -marketplaces. The source of truth is this repo (`crowdsecurity/crowdsec-skill`); -the same plugin can be listed in several marketplaces with no conflict because -each uses a distinct namespace (`@crowdsecurity`, `@claude-community`). - -## Releasing a new version - -Versioning is **semver at the plugin level** (`SKILL.md` has no version field). -The version lives in `.claude-plugin/plugin.json` and is mirrored into -`.claude-plugin/marketplace.json` and `CHANGELOG.md`. - -You don't edit those JSON files by hand. The release flow is: - -1. Make sure everything you want to ship is on `main` and the `[Unreleased]` - section of `CHANGELOG.md` lists the changes. -2. **Publish a GitHub Release** with a tag like `v0.2.0` (semver, leading `v`). -3. The [`release.yml`](.github/workflows/release.yml) workflow then: - - writes `0.2.0` into `plugin.json` and the `marketplace.json` entry, - - moves `[Unreleased]` notes under a dated `[0.2.0]` heading and refreshes the - compare links, - - runs `claude plugin validate .` as a gate, - - commits the bumped files back to `main`. - -**Bump rules:** patch = doc/reference fixes · minor = new coverage area or -script · major = breaking change to scope or structure. - -Every PR is also checked by [`validate.yml`](.github/workflows/validate.yml), -which runs `claude plugin validate .` and fails if `plugin.json` and -`marketplace.json` versions have drifted apart. 
- -## Pre-flight checklist (before any publication) - -- `claude plugin validate .` passes locally. -- Repo is **public**; `README.md` and `LICENSE` are present. -- `crowdsec/SKILL.md` frontmatter `name`/`description` are accurate. -- `CHANGELOG.md` references only files that exist. - -## A. CrowdSec marketplace (live now) - -This repo is the official CrowdSec marketplace: it contains -`.claude-plugin/marketplace.json` (`name: crowdsecurity`) with the plugin served -from `"source": "./"`, so users can install today: - -```text -/plugin marketplace add crowdsecurity/crowdsec-skill -/plugin install crowdsec@crowdsecurity -``` - -## B. Anthropic community marketplace - -The public, Anthropic-curated catalog (`anthropics/claude-plugins-community`). - -1. Submit via (form). -2. Submission goes through automated validation + safety screening. -3. On approval the plugin is **pinned to a commit SHA** in the community catalog; - CI bumps the pin as you push new commits, and the public catalog syncs nightly. -4. Users install as `crowdsec@claude-community`. - -Requirements: public repo, README + LICENSE present, `claude plugin validate` -clean (it runs the same checks Anthropic's pipeline does). - -## C. skills.sh (exploratory — verify first) - -`skills.sh` is a **third-party** community directory, not part of Anthropic's -official tooling. Before advertising it in the README, confirm: - -- whether it indexes plugins or raw `SKILL.md` skills, -- its submission flow (PR vs web form vs auto-crawl of public repos), -- any manifest/format it expects. - -Do **not** add a skills.sh install line to the README until the flow is confirmed. - -## Keeping one source of truth - -- Code stays in one repo; bump semver there once per release. -- The community catalog pins SHAs (CI-managed); self-hosted marketplaces track - the same source repo. -- Never set the version in both `plugin.json` and `marketplace.json` to different - values — `plugin.json` wins silently. 
The release workflow keeps them equal. diff --git a/crowdsec/SKILL.md b/crowdsec/SKILL.md index 14d04b5..c365c70 100644 --- a/crowdsec/SKILL.md +++ b/crowdsec/SKILL.md @@ -57,14 +57,15 @@ Docker/k8s commands run inside the container/pod and do not need this. | Cue from user | Go to | |---|---| | "install", "set up", "fresh box", "how do I start" | [references/install/](./references/install/) (pick file by env) | -| "configure logs / acquisition", "read journald / syslog / docker logs" | [references/configure/acquisition.md](./references/configure/acquisition.md) *(TODO — stub)* | -| "install a collection / parser / scenario", "hub", "tainted" | [references/configure/hub.md](./references/configure/hub.md) *(TODO — stub)* | -| "ban duration", "captcha", "decisions", "simulation" | [references/configure/profiles.md](./references/configure/profiles.md) *(TODO — stub)* | +| "configure logs / acquisition", "read journald / syslog / docker logs" | [references/configure/acquisition.md](./references/configure/acquisition.md) | +| "install a collection / parser / scenario", "hub", "tainted" | [references/configure/hub.md](./references/configure/hub.md) | +| "ban duration", "captcha", "decisions", "simulation", "alerts but no bans" | [references/configure/profiles.md](./references/configure/profiles.md) | | "allowlist my office / CDN / monitoring IP", "I'm getting blocked by CAPI", "exclude IP from any ban" | [references/configure/allowlists.md](./references/configure/allowlists.md) | | "whitelist vs allowlist vs postoverflow", "which suppression layer should I use" | [references/configure/allowlists.md](./references/configure/allowlists.md) § Suppression mechanisms | | "alert me on slack/email/webhook" | [references/configure/notifications.md](./references/configure/notifications.md) *(TODO — stub)* | | "block at the firewall", "iptables", "nftables", "ipset" | [references/configure/bouncers/firewall.md](./references/configure/bouncers/firewall.md) | | "nginx / traefik / 
caddy bouncer" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) | +| "wrong source IP", "real client IP", "behind Cloudflare / reverse proxy / NPM", "X-Forwarded-For", "everyone shows as the proxy IP" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) — per-bouncer real-IP/trusted-proxy sections | | "AppSec", "WAF", "virtual patching", "block by request shape" | [references/appsec/](./references/appsec/) — overview, deploy, configure, troubleshoot | | "Console", "enroll", "share signals" | [references/install/console.md](./references/install/console.md) | | "upgrade", "back up", "roll back" | [references/operate/upgrades.md](./references/operate/upgrades.md) *(TODO — stub)* | @@ -110,7 +111,7 @@ These work in every environment. On bare-metal/systemd, prefix with `sudo` (unle | Replay a single log line | `cscli explain --log '' --type ` | | Validate config after editing any yaml (acquisition/profiles/config) | `crowdsec -t` (bare-metal; also auto-runs on `systemctl reload`) — then confirm the source reads with `cscli metrics show acquisition` | | See simulation state (alerts but no decisions) | `cscli simulation status` | -| List decision profiles (filters / ban duration) | `cscli profiles list` — full content in `/etc/crowdsec/profiles.yaml` | +| Inspect decision profiles (filters / ban duration) | `cat /etc/crowdsec/profiles.yaml` — there is **no** `cscli profiles` command (through v1.7.8); see [references/configure/profiles.md](./references/configure/profiles.md) | Where things live on a default bare-metal install: diff --git a/crowdsec/references/appsec/deploy.md b/crowdsec/references/appsec/deploy.md index 9ba2cf6..01c9e17 100644 --- a/crowdsec/references/appsec/deploy.md +++ b/crowdsec/references/appsec/deploy.md @@ -112,7 +112,7 @@ The smoke test above proves the WAF works. 
For production you point a real bounc | Bouncer | Where to set the AppSec endpoint | |---|---| | `crowdsec-nginx-bouncer` (lua module) | `APPSEC_URL=http://127.0.0.1:7422` in `/etc/crowdsec/bouncers/crowdsec-nginx-bouncer.conf` (shell-style `KEY=VALUE`, empty by default = WAF off). The self-registered `API_KEY` already serves AppSec — reuse it. | -| `crowdsec-traefik-bouncer` (middleware plugin) | `crowdsec.appsec.enabled: true`, `crowdsec.appsec.url`, and the AppSec-aware API key in `crowdsec.crowdsecLapiKey`. | +| Traefik (`maxlerebourg/crowdsec-bouncer-traefik-plugin`) | Flat plugin options: `crowdsecAppsecEnabled: true` (default false), `crowdsecAppsecHost: crowdsec:7422` (host:port, no scheme), and the bouncer key in `crowdsecLapiKey`. Full recipe in [../configure/bouncers/web-servers.md](../configure/bouncers/web-servers.md) § Traefik. | | `crowdsec-caddy-bouncer` (Caddy module) | Equivalent `appsec_url` directive on the bouncer block. | | Any other AppSec-aware bouncer | Look for an `appsec_url` / `appsec.url` field; auth is always the bouncer's existing API key. | diff --git a/crowdsec/references/configure/acquisition.md b/crowdsec/references/configure/acquisition.md index ee83140..8112013 100644 --- a/crowdsec/references/configure/acquisition.md +++ b/crowdsec/references/configure/acquisition.md @@ -2,10 +2,139 @@ Canonical docs: · datasources index -> STUB. To cover: -> - `acquis.yaml` vs. `acquis.d/*.yaml` -> - File datasource (paths, type/labels, multi-file globs) -> - journald datasource (filters, units) -> - syslog, kinesis, k8s_audit, docker, AppSec — when to pick each -> - Verify a source after editing: `crowdsec -t` (validate config), `cscli metrics show acquisition` (confirm it's read), `cscli explain` (confirm a line parses) -> - Common pitfalls: missing `type:` label (parser won't match), permission denied on log files, journald unit filter typos +Acquisition tells the engine **what logs to read and how to label them**. 
Each source +declares a `source:` (the datasource type) and a `labels.type:` (the parser hint). If the +engine reads lines but they show up as **`Lines unparsed`**, acquisition is usually fine +and the problem is the `type:` or the parser — debug that with +[../debug/parsing.md](../debug/parsing.md). If a source shows **0 `Lines read`**, the +problem is here. + +## Where acquisition lives + +| | Path / mechanism | +|---|---| +| Single legacy file | `/etc/crowdsec/acquis.yaml` (`acquisition_path` in `config.yaml`) | +| Drop-in dir (preferred) | `/etc/crowdsec/acquis.d/*.yaml` (`acquisition_dir` in `config.yaml`) — one file per source set | +| Docker | Bind-mount or env (`COLLECTIONS`, plus a mounted `acquis.d`); see Per-environment notes | +| Kubernetes | The chart's `config.acquisition` values render into the same `acquis.d` files | + +Both `acquisition_path` and `acquisition_dir` load if set — check `config.yaml`: + +```bash +sudo grep -E 'acquisition_(path|dir)' /etc/crowdsec/config.yaml +# acquisition_path: /etc/crowdsec/acquis.yaml +# acquisition_dir: /etc/crowdsec/acquis.d +``` + +Each YAML doc is **one source**. Multiple sources per file are allowed if separated by +`---`. Put unrelated sources in their own files under `acquis.d/`. + +## The label model — every source needs `labels.type` + +`labels.type` is the parser router. A source with no `type` (or the wrong one) is read but +never parsed — every line lands in `Lines unparsed`. Set it to the family the lines belong +to: `syslog`, `nginx`, `haproxy`, `appsec`, etc. (the value the relevant parser matches on). + +## File datasource + +```yaml +source: file +filenames: + - /var/log/nginx/*.log # globs allowed + - /var/log/auth.log # list as many paths as you need +labels: + type: nginx +``` + +Glob expansion is evaluated at startup; files created later that match are **not** picked +up until reload. For high-rotation logs prefer the directory plus a glob over naming each +file. 
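A quick pre-flight for a file source, as a minimal sketch. The helper name `check_sources` is hypothetical (it is not a `cscli` or CrowdSec command); it only reports what the paths in `filenames:` currently expand to and whether the current user can read them. Run it as the same user the engine runs as, otherwise the result says nothing about the engine.

```shell
# check_sources: hypothetical helper, not part of CrowdSec.
# Prints one line per argument:
#   "ok <path>"          the path exists and is readable by the current user
#   "unreadable <path>"  the path exists but cannot be read
#   "missing <path>"     nothing at that path (e.g. a glob that matched no file)
check_sources() {
  for f in "$@"; do
    if [ -r "$f" ]; then
      echo "ok $f"
    elif [ -e "$f" ]; then
      echo "unreadable $f"
    else
      echo "missing $f"
    fi
  done
}

# Usage: pass the same paths as in `filenames:`. Leave the glob unquoted so the
# shell expands it before the function sees it:
#   check_sources /var/log/nginx/*.log /var/log/auth.log
```

A `missing` or `unreadable` line here points at a path or permission problem rather than a parser problem.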
+ +## journald datasource + +```yaml +source: journalctl +journalctl_filter: + - "_SYSTEMD_UNIT=ssh.service" # journalctl-style match; one filter per list entry +labels: + type: syslog +``` + +The filter strings are passed straight to `journalctl`. After reload the source appears in +metrics as `journalctl:journalctl-_SYSTEMD_UNIT=ssh.service`. A typo in the unit name is +silent — the source reads **0 lines** rather than erroring. + +## docker datasource + +For a CrowdSec **container** reading other containers' stdout/stderr via the Docker socket: + +```yaml +source: docker +container_name: + - acq-nginx # exact names; container_name_regexp / labels also supported +labels: + type: nginx +``` + +Requires `/var/run/docker.sock` mounted into the CrowdSec container. The source shows up as +`docker:`. Use this instead of a file source when apps log to stdout (the +12-factor norm in Docker/compose) — there is no log file to bind-mount. + +## When to pick which source + +| Logs come from… | `source:` | +|---|---| +| A file or files on disk | `file` | +| systemd journal (no file written, e.g. modern sshd) | `journalctl` | +| Other containers' stdout (CrowdSec runs in Docker) | `docker` | +| A remote host shipping over syslog | `syslog` (listener) | +| Kubernetes audit webhook | `k8s_audit` | +| AWS Kinesis / CloudWatch | `kinesis` / `cloudwatch` | +| The WAF listener (not a log — request inspection) | `appsec` (see [../appsec/deploy.md](../appsec/deploy.md)) | + +## Verify after editing + +```bash +# 1. Validate config — silent + exit 0 means OK. A bad source prints FATAL. +sudo crowdsec -t +# e.g. FATAL crowdsec init: while loading acquisition config: +# /etc/crowdsec/acquis.d/foo.yaml: unknown data source nonexistent_ds + +# 2. Apply (reload picks up acquisition changes without dropping the API) +sudo systemctl reload crowdsec + +# 3. 
Confirm the source is actually read — find your source, check 'Lines read' climbs +sudo cscli metrics show acquisition +# | file:/var/log/nginx/access.log | 19 | 19 | - ... +# | journalctl:journalctl-_SYSTEMD_UNIT=ssh.service | 2 | 2 | - ... +# | docker:acq-nginx | 5 | 5 | - ... + +# 4. Confirm a representative line parses with the chosen type +sudo cscli explain --log 'May 21 09:00:00 host sshd[123]: Failed password for invalid user admin from 1.2.3.4 port 22 ssh2' --type syslog +# s01-parse → 🟢 crowdsecurity/sshd-logs ... parser success 🟢 +sudo cscli explain --file /var/log/nginx/access.log --type nginx # replay a whole file +``` + +## Pitfalls + +- **Missing/wrong `labels.type`:** lines read but all `unparsed`. The single most common + acquisition mistake. Match `type` to the parser family. +- **Permission denied on log files:** on bare-metal the engine runs as root and reads most + logs, but tightly-permissioned files (e.g. some `/var/log` set to `0640 root:adm`) can + still block it under a non-root setup — check ownership/ACLs if a file source reads 0. +- **journald unit typo:** wrong `_SYSTEMD_UNIT` → 0 lines, no error. Verify with + `journalctl _SYSTEMD_UNIT=ssh.service` first. +- **Docker bind-mount path mismatch:** for a *file* source inside a CrowdSec container, the + `filenames:` must be the **container** path, not the host path. Mismatch → 0 lines. (Use + the `docker` source to avoid the problem entirely.) +- **Globs are startup-only:** new files matching a glob need a reload to be acquired. +- **Edited but not applied:** `crowdsec -t` validates the file but does not load it — you + still need `systemctl reload crowdsec` (or recreate the container / `helm upgrade`). + +## Per-environment notes + +| Env | What changes | +|---|---| +| **systemd / bare-metal** | Recipes above as-is. Edit `acquis.d/*.yaml`, `crowdsec -t`, `systemctl reload crowdsec`. 
| +| **Docker / compose** | Mount `./acquis.d:/etc/crowdsec/acquis.d` (and `/var/run/docker.sock` for the docker source). `COLLECTIONS=`/`PARSERS=` env install hub items at start. Run cscli with `docker exec cscli metrics show acquisition`. Recreate the container to apply (a reload signal also works). | +| **Kubernetes / Helm** | Define sources under `config.acquisition` in values; `helm upgrade --reset-then-reuse-values`. Inspect with `kubectl exec -n -- cscli metrics show acquisition`. The `k8s_audit` source needs the API server's audit webhook pointed at the agent. | diff --git a/crowdsec/references/configure/bouncers/web-servers.md b/crowdsec/references/configure/bouncers/web-servers.md index a7712d6..f7afd32 100644 --- a/crowdsec/references/configure/bouncers/web-servers.md +++ b/crowdsec/references/configure/bouncers/web-servers.md @@ -217,10 +217,174 @@ sudo cscli decisions delete --ip 127.0.0.1 - **No WAF:** there is no AppSec path for the apache bouncer. To run the CrowdSec WAF in front of apache, terminate with nginx/haproxy SPOA (which forward to `:7422`) ahead of apache, or use a firewall bouncer for IP-level blocking. - **Early version:** apache bouncer is v0.1 — expect rough edges like the unsubstituted key above. -## Traefik — `crowdsec-traefik-bouncer` +## Traefik — `crowdsec-bouncer-traefik-plugin` -Middleware plugin (Yaegi) or a standalone bouncer container. Per the canonical page, AppSec is wired via `crowdsec.appsec.enabled` + `crowdsec.appsec.url`, with the AppSec-aware key in `crowdsec.crowdsecLapiKey`. Follow the canonical [Traefik bouncer page](https://docs.crowdsec.net/u/bouncers/intro). +WAF-capable. The canonical Traefik integration is the community middleware plugin +**[`maxlerebourg/crowdsec-bouncer-traefik-plugin`](https://github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin)** +(loaded via Traefik's Yaegi engine — no separate binary). It checks LAPI decisions and, +optionally, forwards each request to the AppSec listener. 
+ +### Load the plugin (static config) + +```yaml +# traefik.yml (static) +experimental: + plugins: + bouncer: + moduleName: github.com/maxlerebourg/crowdsec-bouncer-traefik-plugin + version: v1.6.0 +``` + +### Configure the middleware (dynamic config) + +The bouncer needs a LAPI key. With the official CrowdSec container, set a fixed key via +`BOUNCER_KEY_traefik: ` in its env (auto-registers on start); on bare-metal LAPI mint +one with `cscli bouncers add traefik -o raw`. + +```yaml +# dynamic config (file provider) — same key serves LAPI decisions AND AppSec +http: + middlewares: + crowdsec: + plugin: + bouncer: + enabled: true + crowdsecMode: stream # poll the full decision list (recommended) + updateIntervalSeconds: 10 # stream poll cadence; a new ban lands within this + crowdsecLapiKey: + crowdsecLapiScheme: http + crowdsecLapiHost: crowdsec:8080 # service:port on the Docker network + crowdsecAppsecEnabled: true # turn on inline WAF + crowdsecAppsecHost: crowdsec:7422 # AppSec listener (must listen 0.0.0.0:7422) + forwardedHeadersTrustedIPs: + - "172.16.0.0/12" # the proxy/LB hop(s) in front, if any + routers: + whoami: + rule: "PathPrefix(`/`)" + service: whoami + middlewares: [crowdsec] +``` + +| Key | Set to | Notes | +|---|---|---| +| `crowdsecMode` | `stream` | `live` = query LAPI per request; `stream` = poll list (lower latency, prod default); `appsec` = WAF only; `none`/`alone`. | +| `crowdsecLapiKey` | (bouncer key) | Serves both decisions and AppSec. | +| `crowdsecLapiHost` | `crowdsec:8080` | Host:port, no scheme (scheme is `crowdsecLapiScheme`). | +| `crowdsecAppsecEnabled` | `false` | **WAF is off by default.** `true` to forward requests to AppSec. | +| `crowdsecAppsecHost` | `crowdsec:7422` | AppSec must `listen_addr: 0.0.0.0:7422` so the Traefik container can reach it. | +| `forwardedHeadersTrustedIPs` | `[]` | Plugin-side trust for `X-Forwarded-For`. 
**Not sufficient alone — see real-IP pitfall.** | +| `clientTrustedIPs` | `[]` | IPs that **bypass the bouncer entirely**. Do **not** put your proxy/Docker range here or every request is allowed. | + +### Real client IP — the #1 Traefik gotcha + +Traefik **rewrites `X-Forwarded-For` to the immediate peer** unless the *entrypoint* trusts +that hop. Without it, the plugin only ever sees the proxy/Docker-gateway IP, so bans on the +real client never match. Set it on the entrypoint **in addition to** the plugin option: + +```yaml +# traefik.yml (static) +entryPoints: + web: + address: ":80" + forwardedHeaders: + trustedIPs: + - "172.16.0.0/12" # the upstream proxy/LB (or Docker network) in front of Traefik +``` + +### Verify end-to-end (through Traefik, not directly to LAPI/:7422) + +```bash +curl -sS -o /dev/null -w 'normal: %{http_code}\n' http://127.0.0.1:8081/ # 200 +curl -sS -o /dev/null -w 'appsec block: %{http_code}\n' 'http://127.0.0.1:8081/vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php' # 403 +# ban an IP, wait one stream interval, present it as the forwarded client: +docker exec crowdsec cscli decisions add --ip 198.51.100.123 --duration 5m --reason test +sleep 12 # updateIntervalSeconds=10 +curl -sS -o /dev/null -w 'banned XFF: %{http_code}\n' -H 'X-Forwarded-For: 198.51.100.123' http://127.0.0.1:8081/ # 403 +curl -sS -o /dev/null -w 'clean XFF: %{http_code}\n' -H 'X-Forwarded-For: 203.0.113.5' http://127.0.0.1:8081/ # 200 +docker exec crowdsec cscli decisions delete --ip 198.51.100.123 +docker exec crowdsec cscli metrics show appsec # Processed/Blocked increment +``` + +### Pitfalls + +- **Real IP rewritten:** if bans never match, you almost certainly skipped the *entrypoint* + `forwardedHeaders.trustedIPs` above. The plugin's `forwardedHeadersTrustedIPs` is a second, + separate layer — you usually need both. +- **`clientTrustedIPs` bypass:** anything in this list skips the bouncer. 
Putting your Docker + range here makes every request return 200 (no AppSec, no ban). Use `forwardedHeadersTrustedIPs` + for proxy trust, not this. +- **WAF off silently:** `crowdsecAppsecEnabled` defaults to `false`, and AppSec must listen on + `0.0.0.0:7422` (not loopback) for a containerized Traefik to reach it. +- **`stream` lag:** a fresh ban lands within `updateIntervalSeconds`; immediate ban-then-curl + looks like a failure. (See [../../debug/bouncer-not-blocking.md](../../debug/bouncer-not-blocking.md).) ## Caddy — `caddy-crowdsec-bouncer` -Caddy module; per the canonical page, set the `appsec_url` directive on the bouncer block, auth via the bouncer's API key. Follow the canonical [Caddy bouncer page](https://docs.crowdsec.net/u/bouncers/intro). +WAF-capable Caddy module ([`hslatman/caddy-crowdsec-bouncer`](https://github.com/hslatman/caddy-crowdsec-bouncer)). +Caddy has no plugin loader, so the module must be **compiled in** — build a custom binary/image +with `xcaddy`. + +### Build with the module + +```dockerfile +FROM caddy:2.10-builder AS builder +RUN xcaddy build \ + --with github.com/hslatman/caddy-crowdsec-bouncer/http \ + --with github.com/hslatman/caddy-crowdsec-bouncer/appsec +FROM caddy:2.10 +COPY --from=builder /usr/bin/caddy /usr/bin/caddy +``` + +(`/http` enforces decisions; `/appsec` adds the WAF handler. Add `/layer4` only for L4 +proxying.) Mint a bouncer key with `cscli bouncers add caddy -o raw`. 
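Before writing the Caddyfile, it can help to confirm the module really ended up in the binary. The sketch below assumes the Dockerfile above was built and tagged `caddy-crowdsec` (that tag is an assumption, pick your own); `has_crowdsec_module` is a hypothetical helper that just filters the output of `caddy list-modules`.

```shell
# has_crowdsec_module: hypothetical helper, not part of Caddy or CrowdSec.
# Reads `caddy list-modules` output on stdin and succeeds only if at least one
# listed module name mentions "crowdsec".
has_crowdsec_module() {
  grep -qi 'crowdsec'
}

# Usage (image tag `caddy-crowdsec` is an assumption):
#   docker build -t caddy-crowdsec .
#   docker run --rm caddy-crowdsec caddy list-modules | has_crowdsec_module \
#     && echo "module present" \
#     || echo "module missing: rebuild with xcaddy"
```

Running the same pipeline against the stock `caddy` image should report the module as missing.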
+ +### Caddyfile + +```caddyfile +{ + crowdsec { + api_url http://crowdsec:8080 + api_key + appsec_url http://crowdsec:7422 # omit to run decisions-only (no WAF) + ticker_interval 10s # stream poll cadence + #disable_streaming # switch to live (per-request) lookups + #enable_hard_fails # fail-closed if LAPI is unreachable (default fails open) + } + servers { + trusted_proxies static 172.16.0.0/12 # real-IP: trust the upstream hop + client_ip_headers X-Forwarded-For + } +} + +:80 { + route { + appsec # WAF inspection first + crowdsec # then LAPI decision enforcement + reverse_proxy whoami:80 + } +} +``` + +### Verify end-to-end (through Caddy) + +```bash +curl -sS -o /dev/null -w 'normal: %{http_code}\n' http://127.0.0.1:8082/ # 200 +curl -sS -o /dev/null -w 'appsec block: %{http_code}\n' 'http://127.0.0.1:8082/vendor/phpunit/phpunit/src/Util/PHP/eval-stdin.php' # 403 +docker exec crowdsec cscli decisions add --ip 198.51.100.200 --duration 5m --reason test +sleep 12 # ticker_interval=10s +curl -sS -o /dev/null -w 'banned XFF: %{http_code}\n' -H 'X-Forwarded-For: 198.51.100.200' http://127.0.0.1:8082/ # 403 +curl -sS -o /dev/null -w 'clean XFF: %{http_code}\n' -H 'X-Forwarded-For: 203.0.113.9' http://127.0.0.1:8082/ # 200 +docker exec crowdsec cscli decisions delete --ip 198.51.100.200 +``` + +### Pitfalls + +- **Module not compiled in:** the stock `caddy` image has no `crowdsec` directive — Caddy + errors on the Caddyfile. You must `xcaddy build` (above) or use a prebuilt image that bundles + the module. +- **Real IP:** without `trusted_proxies` + `client_ip_headers`, Caddy treats the proxy/Docker + hop as the client and bans never match. Set both in the global `servers` block. +- **Handler order:** put `appsec` before `crowdsec` in the route so WAF inspection runs ahead of + decision enforcement. +- **WAF off:** omit `appsec_url` and the module enforces decisions only. AppSec must listen on + `0.0.0.0:7422` for a containerized Caddy to reach it. 
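The curl checks in the verify blocks above all follow one pattern: request a URL, discard the body, compare the status code. A minimal sketch of that pattern as a reusable function follows; `expect_status` is a hypothetical helper name, and the port and addresses in the usage lines are simply the ones from the verify block above.

```shell
# expect_status: hypothetical helper wrapping the curl pattern used above.
# Usage: expect_status <expected_code> <url> [extra curl args...]
# Prints PASS/FAIL and returns non-zero on a mismatch.
expect_status() {
  want=$1
  url=$2
  shift 2
  # -o /dev/null drops the body; -w prints only the HTTP status code.
  got=$(curl -sS -o /dev/null -w '%{http_code}' "$@" "$url")
  if [ "$got" = "$want" ]; then
    echo "PASS $url -> $got"
  else
    echo "FAIL $url -> $got (wanted $want)"
    return 1
  fi
}

# Usage, mirroring the manual checks:
#   expect_status 200 http://127.0.0.1:8082/
#   expect_status 403 http://127.0.0.1:8082/ -H 'X-Forwarded-For: 198.51.100.200'
```

Because it returns non-zero on a mismatch, the function can be chained with `&&` in a smoke-test script.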
diff --git a/crowdsec/references/configure/hub.md b/crowdsec/references/configure/hub.md index ea51045..facb9a4 100644 --- a/crowdsec/references/configure/hub.md +++ b/crowdsec/references/configure/hub.md @@ -2,10 +2,123 @@ Canonical docs: · `cscli hub` reference -> STUB. To cover: -> - `cscli hub update / upgrade / list` -> - Installing collections / parsers / scenarios / postoverflows -> - `_custom/` overrides — the right way to tweak hub items -> - Pinning to a hub branch / specific version -> - Dependency resolution (collection pulls parsers + scenarios) -> - Hard don't: editing hub-managed files in place; use overrides +The hub is the catalog of detection content. Items come in types — **parsers**, +**scenarios**, **postoverflows**, **contexts**, **appsec-configs**, **appsec-rules** — and +**collections**, which are curated bundles of the others. + +## Collections vs items + +Install a **collection** and it pulls every item it depends on. Installing +`crowdsecurity/wordpress` downloads and enables its scenarios: + +```bash +sudo cscli collections install crowdsecurity/wordpress +# scenarios: crowdsecurity/http-bf-wordpress_bf, crowdsecurity/http-wordpress_user-enum, +# crowdsecurity/http-wordpress_wpconfig +# collections: crowdsecurity/wordpress +# Run 'sudo systemctl reload crowdsec' for the new configuration to be effective. +``` + +Prefer collections over hand-picking items — they track the dependencies for you. Reach for +individual `cscli parsers/scenarios/postoverflows/appsec-rules install ` only when you +need one item a collection doesn't include. 
+ +## Inventory — `cscli hub list` + +```bash +sudo cscli hub list # everything, grouped by type +sudo cscli hub list -o raw # CSV: name,status,version,description,type +sudo cscli scenarios list # one type +``` + +The first line summarizes what's loaded; the **Status** column is what you read during +debugging: + +| Icon / status | Meaning | +|---|---| +| `✔️ enabled` | Pristine hub item, tracked, up to date. | +| `⚠️ enabled,tainted` | Hub item whose on-disk content no longer matches the hub version (someone edited it). Version shows `?`. | +| `🏠 enabled,local` | A local item not tracked by the hub (e.g. your own, **or** a hub item whose symlink was clobbered — see Pitfalls). | +| (missing / disabled) | Not installed or installed-but-disabled. | + +## Update vs upgrade + +Two different verbs — users conflate them: + +```bash +sudo cscli hub update # refresh the catalog INDEX (what versions exist). No item changes. +# e.g. "Nothing to do, the hub index is up to date." +sudo cscli hub upgrade # update INSTALLED items to the latest indexed version. +sudo systemctl reload crowdsec +``` + +Always `update` before `upgrade`, or `upgrade` won't see new versions. `upgrade` **skips +tainted and local items** rather than clobbering them: + +``` +level=warning msg="scenarios:crowdsecurity/http-wordpress_wpconfig is tainted, use '--force' to overwrite" +``` + +## Tainted items — detect and fix + +An item becomes **tainted** when its content diverges from the hub version. `cscli hub list` +flags it `⚠️ tainted`, a collection that contains it reports `… is tainted by scenarios:…`, +and inspect confirms: + +```bash +sudo cscli scenarios inspect crowdsecurity/http-wordpress_wpconfig | grep -E 'tainted|local_version' +# local_version: '?' 
+# tainted: true +sudo cscli scenarios inspect --diff crowdsecurity/http-wordpress_wpconfig # shows exactly what changed +``` + +**Fix — restore the pristine hub version** by reinstalling with `--force`: + +```bash +sudo cscli scenarios install crowdsecurity/http-wordpress_wpconfig --force +sudo systemctl reload crowdsec +``` + +This discards the local edits. If you needed those edits, move them to an override first +(below). + +## The right way to customize — `_custom/` overrides + +**Never edit a hub-managed file to change its behavior.** Hub items live in +`/etc/crowdsec/hub/...` and are symlinked into `/etc/crowdsec/{parsers,scenarios,...}/`; +editing them taints the item and your change is lost on the next `--force` upgrade. + +Instead, drop an override file in the sibling `_custom/` directory for that type +(`scenarios/.../_custom/`, `parsers/.../_custom/`, etc.). Overrides are merged on top of the +hub item by `name`, survive upgrades, and keep the hub item pristine. See +[../debug/triage.md](../debug/triage.md) § Hard don'ts and the SKILL.md Hard don'ts list. + +To remove a collection and its pulled items: + +```bash +sudo cscli collections remove crowdsecurity/wordpress --force +sudo systemctl reload crowdsec +``` + +## Pitfalls + +- **`update` ≠ `upgrade`.** `update` only refreshes the index; `upgrade` changes items. +- **`sudo sed -i` on a hub item breaks the symlink.** `sed -i` writes a *new* file, replacing + the symlink with a plain file — the item flips to `🏠 local` and detaches from the hub + entirely (no more upgrades). If you must inspect/edit, never edit in place; use a `_custom/` + override. To recover a clobbered item, delete the stray file and reinstall it. +- **Editing the symlink target taints, doesn't detach.** Appending to the hub target file + (e.g. `tee -a`) keeps the symlink but marks the item `⚠️ tainted`; `--force` reinstall + restores it. 
+- **Forgetting to reload.** Every hub change needs `systemctl reload crowdsec` (or container
+  recreate / `helm upgrade`) to take effect.
+- **`upgrade` silently skips your local/tainted items** — by design. Reconcile them
+  deliberately with `--force` (after saving any edits to an override).
+
+## Per-environment notes
+
+| Env | What changes |
+|---|---|
+| **systemd / bare-metal** | `cscli hub …` / `cscli …` as above, then `systemctl reload crowdsec`. |
+| **Docker / compose** | Install items declaratively at start with `COLLECTIONS=`, `PARSERS=`, `SCENARIOS=`, `POSTOVERFLOWS=` env vars. Items installed only via `docker exec … cscli` are lost on container recreate unless `/etc/crowdsec` is persisted — prefer the env vars for reproducibility. |
+| **Kubernetes / Helm** | Declare hub items in the chart values (e.g. agent `collections`); `helm upgrade --reset-then-reuse-values`. Avoid imperative `cscli install` inside pods — it won't survive a reschedule. |
diff --git a/crowdsec/references/configure/profiles.md b/crowdsec/references/configure/profiles.md
index e6613da..5827e0e 100644
--- a/crowdsec/references/configure/profiles.md
+++ b/crowdsec/references/configure/profiles.md
@@ -2,11 +2,142 @@
 
 Canonical docs: · post-install profiles
 
-> STUB. To cover:
-> - `profiles.yaml` structure (filters, decisions, on_success)
-> - Decision types: ban / captcha / throttle
-> - Duration syntax + escalation patterns
-> - Simulation mode for safe rollout (`cscli simulation enable`)
-> - Notification triggers from profiles
-> - Interaction with allowlists: even if a profile matches, an allowlisted target IP causes the decision to be silently dropped at LAPI write time. To exempt specific IPs/ranges, prefer an allowlist (see [allowlists.md](./allowlists.md)) over a profile filter expression.
-> - Pitfalls: filter expression typos silently no-op; profile order matters
+A scenario firing produces an **alert**. Whether that alert becomes a **decision** (a ban,
+captcha, etc.) is decided by `profiles.yaml`, evaluated at LAPI. This is the layer that
+answers the #2 support question: *"I see alerts but nothing gets banned."*
+
+## Alert → decision flow
+
+1. A scenario overflows → LAPI receives an **alert**.
+2. LAPI walks `profiles.yaml` **top to bottom**. For each profile whose `filters` match the
+   alert, it emits that profile's `decisions`.
+3. `on_success: break` stops after the first matching profile (the default). Without it,
+   later profiles can also match and stack decisions.
+4. The decision is written — **unless** the target is allowlisted (silently dropped) or
+   simulation is on (written but flagged, not enforced).
+
+### Why an alert produces no ban
+
+| Cause | How to confirm |
+|---|---|
+| Target is allowlisted (incl. loopback `127.0.0.1` via `crowdsecurity/whitelists`) | Alert exists, `decisions` column empty. `cscli allowlists check `. See [allowlists.md](./allowlists.md). |
+| Simulation mode on (global or per-scenario) | `cscli simulation status`; decision shows `(simul)` action |
+| No profile `filters` match the alert | The alert's scope/value doesn't satisfy any filter expression |
+| AppSec out-of-band rule | Alert `kind: waf` with empty `decisions` — it's asynchronous, not an inline block (see [../appsec/troubleshoot.md](../appsec/troubleshoot.md)) |
+| Filter expression typo | Silent no-op — the expr just never matches |
+
+## `profiles.yaml` structure
+
+Default `/etc/crowdsec/profiles.yaml` (two profiles, IP and Range), trimmed:
+
+```yaml
+name: default_ip_remediation
+filters:
+ - Alert.Remediation == true && Alert.GetScope() == "Ip"
+decisions:
+ - type: ban
+   duration: 4h
+# duration_expr: Sprintf('%dh', (GetDecisionsCount(Alert.GetValue()) + 1) * 4)
+# notifications:
+#   - slack_default
+on_success: break
+---
+name: default_range_remediation
+filters:
+ - Alert.Remediation == true && Alert.GetScope() == "Range"
+decisions:
+ - type: ban
+   duration: 4h
+on_success: break
+```
+
+| Key | Meaning |
+|---|---|
+| `name` | Identifier (appears in logs). |
+| `filters` | List of [expr](https://docs.crowdsec.net/docs/next/expr/intro) expressions against the `Alert` object. Any one matching makes the profile apply. |
+| `decisions` | What to emit: `type` + `duration` (and optional `scope`). |
+| `duration` | Static TTL — `4h`, `30m`, `168h`, etc. |
+| `duration_expr` | Dynamic TTL (expr). Overrides `duration`. Used for escalation. |
+| `on_success` | `break` (stop here) or omit (keep evaluating later profiles). |
+| `notifications` | Plugin names to fire (see [notifications.md](./notifications.md)). |
+
+### Decision types
+
+| `type` | Effect | Notes |
+|---|---|---|
+| `ban` | Block the IP/range | The default; every bouncer enforces it. |
+| `captcha` | Serve a challenge | Only **web-server / AppSec bouncers** can render captcha; a firewall bouncer can't and treats it as no-op. Needs captcha provider config. |
+| `throttle` | Rate-limit | Bouncer-dependent support. |
+
+### Escalation with `duration_expr`
+
+Longer bans for repeat offenders — uncomment in the default profile:
+
+```yaml
+duration_expr: Sprintf('%dh', (GetDecisionsCount(Alert.GetValue()) + 1) * 4)
+```
+
+First offense → 4h, second → 8h, and so on. `GetDecisionsCount` queries prior decisions for
+that value.
+
+## Simulation mode — safe rollout
+
+Simulation lets scenarios fire and decisions be recorded **without enforcing them** — ideal
+for tuning before going live.
+
+```bash
+sudo cscli simulation status                       # global simulation: disabled
+sudo cscli simulation enable --global              # all scenarios simulated (note: bare 'enable' just prints help)
+sudo cscli simulation enable crowdsecurity/ssh-bf  # one scenario only
+sudo cscli simulation disable --global
+sudo systemctl reload crowdsec                     # REQUIRED — the toggle is read at load
+```
+
+Under simulation, the decision still appears in the list but the action is prefixed
+`(simul)` and **no bouncer enforces it**:
+
+```
+| ID | ... | Reason | Action | ... |
+| 5 | ... | crowdsecurity/ssh-slow-bf | (simul)ban | ... |
+```
+
+## Verify a profile change
+
+```bash
+sudo cscli simulation status  # know your baseline first
+# edit /etc/crowdsec/profiles.yaml
+sudo crowdsec -t              # validate — silent + exit 0 = OK
+sudo systemctl reload crowdsec
+# trigger the scenario (or, to test plumbing only, add a manual decision):
+sudo cscli decisions add --ip 203.0.113.77 --duration 4h --reason test
+sudo cscli decisions list     # confirm Action + expiration match the profile
+sudo cscli decisions delete --ip 203.0.113.77
+```
+
+To confirm the *type/duration a real alert yields*, feed the scenario and read the decision
+row — `Action` (e.g. `ban`) and `expiration` (e.g. `3h59m54s` for a 4h ban) reflect the
+profile that matched.
+
+## Pitfalls
+
+- **`cscli profiles list` does not exist** (through at least v1.7.8). Read the file:
+  `sudo cat /etc/crowdsec/profiles.yaml`.
+- **Filter typos are silent.** A misspelled field or `==`/`=` slip just never matches — no
+  error, no decision. Test against a known-firing scenario.
+- **Profile order + `on_success: break`.** The first matching profile with `break` wins;
+  put narrower profiles above broader ones.
+- **Reload required.** Editing `profiles.yaml` or toggling simulation does nothing until
+  `systemctl reload crowdsec` (or container recreate / `helm upgrade`).
+- **Allowlist beats profile.** Even a perfect filter match is dropped at write time if the
+  target is allowlisted. To exempt IPs, use an [allowlist](./allowlists.md), not a profile
+  filter expression.
+- **`captcha` needs a capable bouncer.** A firewall bouncer can't render a challenge — use a
+  web-server/AppSec bouncer for captcha decisions.
+
+## Per-environment notes
+
+| Env | What changes |
+|---|---|
+| **systemd / bare-metal** | Edit `/etc/crowdsec/profiles.yaml`, `crowdsec -t`, `systemctl reload crowdsec`. |
+| **Docker / compose** | Bind-mount `profiles.yaml` from the host (`./profiles.yaml:/etc/crowdsec/profiles.yaml`). Recreate or send a reload to apply. cscli via `docker exec cscli ...`. |
+| **Kubernetes / Helm** | Provide `profiles.yaml` via the chart's config values / a mounted ConfigMap; `helm upgrade --reset-then-reuse-values`. cscli via `kubectl exec -n -- cscli ...`. |
From 276cc5b296bd2464f2a71c694bc628bcef3a48e5 Mon Sep 17 00:00:00 2001
From: Thibault Koechlin
Date: Thu, 21 May 2026 15:43:01 +0200
Subject: [PATCH 2/4] bouncer changes

---
 crowdsec/SKILL.md | 6 +++++-
 .../references/configure/bouncers/web-servers.md | 16 ++++++++++++++--
 2 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/crowdsec/SKILL.md b/crowdsec/SKILL.md
index c365c70..529f73a 100644
--- a/crowdsec/SKILL.md
+++ b/crowdsec/SKILL.md
@@ -64,7 +64,11 @@ Docker/k8s commands run inside the container/pod and do not need this.
 | "whitelist vs allowlist vs postoverflow", "which suppression layer should I use" | [references/configure/allowlists.md](./references/configure/allowlists.md) § Suppression mechanisms |
 | "alert me on slack/email/webhook" | [references/configure/notifications.md](./references/configure/notifications.md) *(TODO — stub)* |
 | "block at the firewall", "iptables", "nftables", "ipset" | [references/configure/bouncers/firewall.md](./references/configure/bouncers/firewall.md) |
-| "nginx / traefik / caddy bouncer" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) |
+| "nginx bouncer", "lua / openresty module" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § nginx |
+| "haproxy bouncer", "SPOA / SPOE" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § haproxy |
+| "apache bouncer", "mod_crowdsec" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § apache |
+| "traefik bouncer", "traefik plugin / middleware" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § Traefik |
+| "caddy bouncer", "caddy module / xcaddy" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § Caddy |
 | "wrong source IP", "real client IP", "behind Cloudflare / reverse proxy / NPM", "X-Forwarded-For", "everyone shows as the proxy IP" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) — per-bouncer real-IP/trusted-proxy sections |
 | "AppSec", "WAF", "virtual patching", "block by request shape" | [references/appsec/](./references/appsec/) — overview, deploy, configure, troubleshoot |
 | "Console", "enroll", "share signals" | [references/install/console.md](./references/install/console.md) |
diff --git a/crowdsec/references/configure/bouncers/web-servers.md b/crowdsec/references/configure/bouncers/web-servers.md
index f7afd32..64cbca0 100644
--- a/crowdsec/references/configure/bouncers/web-servers.md
+++ b/crowdsec/references/configure/bouncers/web-servers.md
@@ -1,6 +1,6 @@
-# Bouncers — Web servers (nginx, Traefik, Caddy)
+# Bouncers — Web servers (nginx, haproxy, apache, Traefik, Caddy)
 
-Canonical docs: (per-bouncer pages: nginx, traefik, caddy)
+Canonical docs: (per-bouncer pages: nginx, haproxy, apache, traefik, caddy)
 
 A web-server bouncer enforces two things at the edge:
 1. **LAPI decisions** — IPs banned by scenarios/CTI get a 403 (or captcha).
@@ -8,6 +8,18 @@ A web-server bouncer enforces two things at the edge:
 
 Both are served by the **same bouncer API key**. Wiring the WAF is just pointing the bouncer's AppSec URL at the `:7422` listener — see [../../appsec/deploy.md](../../appsec/deploy.md).
 
+## Pick your bouncer
+
+Jump to the section for your web server. The shared model above (decisions + optional WAF, one key) and the stream-lag / real-IP pitfalls recur across all of them.
+
+| Section | Package / module | WAF (AppSec)? |
+|---|---|---|
+| § nginx | `crowdsec-nginx-bouncer` (lua) | ✅ |
+| § haproxy | `crowdsec-haproxy-spoa-bouncer` (SPOA) | ✅ |
+| § apache | `crowdsec-apache2-bouncer` (`mod_crowdsec`) | ❌ decisions only |
+| § Traefik | `crowdsec-bouncer-traefik-plugin` (Yaegi middleware) | ✅ |
+| § Caddy | `caddy-crowdsec-bouncer` (compiled-in module) | ✅ |
+
 ## nginx — `crowdsec-nginx-bouncer`
 
 Targets Ubuntu 24.04 / nginx 1.24, engine v1.7.8.
From 1d923dced2282ce5670073ffc6bb5bbecdb2ff19 Mon Sep 17 00:00:00 2001
From: Thibault Koechlin
Date: Thu, 21 May 2026 16:23:20 +0200
Subject: [PATCH 3/4] upgrade docs

---
 CHANGELOG.md | 12 ++-
 crowdsec/SKILL.md | 2 +-
 crowdsec/references/operate/upgrades.md | 131 ++++++++++++++++++++++--
 3 files changed, 130 insertions(+), 15 deletions(-)

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 705f5fb..8d4293b 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -19,12 +19,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - `references/configure/bouncers/web-servers.md` — full Traefik
   (`maxlerebourg/crowdsec-bouncer-traefik-plugin`) and Caddy
   (`hslatman/caddy-crowdsec-bouncer`) setup, AppSec wiring, and real-client-IP handling,
-  replacing the previous canonical-pointer stubs.
+  replacing the previous canonical-pointer stubs; plus a "Pick your bouncer" section index.
+- `references/operate/upgrades.md` — lean per-environment upgrade runbook (backward-compatible
+  happy path, independent bouncer cadence), the hub-upgrade-skips-tainted consequence,
+  backup-when-it-matters, and a verified rollback note.
 
 ### Changed
-- `crowdsec/SKILL.md` — dropped the stub markers on acquisition/profiles/hub, added a
-  real-client-IP / reverse-proxy routing cue, and corrected the cheat sheet (`cscli profiles
-  list` does not exist; read `/etc/crowdsec/profiles.yaml`).
+- `crowdsec/SKILL.md` — dropped the stub markers on acquisition/profiles/hub/upgrades; split
+  the single web-servers bouncer row into per-bouncer `§`-section routing rows (also adding
+  haproxy/apache cues); added a real-client-IP / reverse-proxy routing cue; and corrected the
+  cheat sheet (`cscli profiles list` does not exist; read `/etc/crowdsec/profiles.yaml`).
 
 ## [0.1.0] - 2026-05-20
 
diff --git a/crowdsec/SKILL.md b/crowdsec/SKILL.md
index 529f73a..bde5972 100644
--- a/crowdsec/SKILL.md
+++ b/crowdsec/SKILL.md
@@ -72,7 +72,7 @@ Docker/k8s commands run inside the container/pod and do not need this.
 | "wrong source IP", "real client IP", "behind Cloudflare / reverse proxy / NPM", "X-Forwarded-For", "everyone shows as the proxy IP" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) — per-bouncer real-IP/trusted-proxy sections |
 | "AppSec", "WAF", "virtual patching", "block by request shape" | [references/appsec/](./references/appsec/) — overview, deploy, configure, troubleshoot |
 | "Console", "enroll", "share signals" | [references/install/console.md](./references/install/console.md) |
-| "upgrade", "back up", "roll back" | [references/operate/upgrades.md](./references/operate/upgrades.md) *(TODO — stub)* |
+| "upgrade", "back up", "roll back", "new version", "tainted items after upgrade" | [references/operate/upgrades.md](./references/operate/upgrades.md) |
 | "multiple agents", "remote LAPI", "mTLS", "postgres backend" | [references/operate/multi-server.md](./references/operate/multi-server.md) *(TODO — stub)* |
 | "is it working?", "smoke test", "validate install", "verify setup", "did detection / WAF / blocking actually wire up?" | [references/operate/health-check.md](./references/operate/health-check.md) |
 | "it's broken" / "not working" / general diagnosis | [references/debug/triage.md](./references/debug/triage.md) → run `~/.claude/skills/crowdsec/scripts/diagnose.sh` |
diff --git a/crowdsec/references/operate/upgrades.md b/crowdsec/references/operate/upgrades.md
index c32beb8..fa5993d 100644
--- a/crowdsec/references/operate/upgrades.md
+++ b/crowdsec/references/operate/upgrades.md
@@ -2,13 +2,124 @@
 
 Canonical docs: · `cscli` reference
 
-> STUB. To cover:
-> - Pre-upgrade: backup `/var/lib/crowdsec/data/` (LAPI sqlite/postgres) and `/etc/crowdsec/`
-> - Per-env upgrade flow:
->   - bare-metal: `apt upgrade crowdsec` + restart; check `cscli version`
->   - docker: pull new tag, recreate container with same volumes
->   - k8s: helm upgrade with `--reset-then-reuse-values`
-> - Hub upgrade: `cscli hub upgrade`
-> - Bouncer upgrades (separate package per bouncer)
-> - Rollback procedure (snapshot, package downgrade, restore DB)
-> - Breaking-change checklist between minor versions (link to release notes)
+Upgrading the engine and bouncers is **a no-brainer for most setups**: releases are
+backward-compatible, the database migrates forward automatically on first start, and engine
+↔ bouncer version skew is fine. The one part that needs attention is **locally modified
+(tainted) hub items** — see below.
+
+## Upgrade the engine — the happy path
+
+| Env | Upgrade |
+|---|---|
+| **bare-metal** | `sudo apt upgrade crowdsec` (or `sudo dnf upgrade crowdsec`) → `sudo systemctl restart crowdsec` |
+| **Docker** | Pull the new tag, recreate with the **same named volumes** (the DB migrates on first start): `docker compose pull && docker compose up -d` |
+| **Kubernetes** | `helm repo update` → `helm upgrade crowdsec crowdsec/crowdsec --reset-then-reuse-values` |
+
+`--reset-then-reuse-values` is mandatory on helm — omitting it silently drops your values
+(see [../install/kubernetes.md](../install/kubernetes.md)).
+
+Verify:
+
+```bash
+sudo cscli version      # the engine version bumped
+sudo cscli lapi status  # LAPI still reachable
+# then a quick smoke test — see ../operate/health-check.md
+```
+
+The DB migrating forward is automatic and transparent: an engine upgraded across a minor
+version (e.g. v1.6 → v1.7) on the same data volume keeps all existing decisions and machines.
+
+## Bouncers upgrade on their own cadence
+
+Each bouncer is its **own package**, versioned independently of the engine — they're LAPI
+clients and need no lockstep:
+
+```bash
+sudo apt upgrade crowdsec-firewall-bouncer-nftables  # or crowdsec-nginx-bouncer, etc.
+sudo systemctl restart crowdsec-firewall-bouncer
+```
+
+It's normal to see, say, engine `v1.7.8` alongside firewall-bouncer `0.0.34`. Upgrade
+bouncers when their changelog warrants it, not because the engine moved.
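The version check in the Verify step can be scripted. This is a minimal sketch, not part of `cscli`: the `version_bumped` helper and the sample version strings are hypothetical, it assumes you capture the engine version yourself before and after the upgrade, and it relies on GNU `sort -V` being available.

```bash
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if $2 is a strictly newer version than $1.
# Assumes GNU coreutils `sort -V` (version sort).
version_bumped() {
  local before="${1#v}" after="${2#v}"   # tolerate a leading "v"
  [ "$before" != "$after" ] &&
    [ "$(printf '%s\n%s\n' "$before" "$after" | sort -V | tail -n 1)" = "$after" ]
}

# Made-up values for illustration; in practice record them around the upgrade.
if version_bumped "v1.7.7" "v1.7.8"; then
  echo "engine version bumped"
fi
```

The same comparison works for a bouncer package, since bouncers are versioned independently of the engine.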
+
+## Hub items — the part that needs care
+
+Hub content (parsers, scenarios, collections, AppSec rules) upgrades **separately** from the
+engine binary:
+
+```bash
+sudo cscli hub update   # refresh the catalog index
+sudo cscli hub upgrade  # pull newer versions of installed items
+sudo systemctl reload crowdsec
+```
+
+**`cscli hub upgrade` skips any item you've locally modified (tainted).** Your edits are
+preserved — but that item then **stops receiving new versions and security fixes**, silently:
+
+```
+level=warning msg="scenarios:crowdsecurity/http-wordpress_wpconfig is tainted, use '--force' to overwrite"
+```
+
+To get the update, reconcile the item: move your change into a `_custom/` override (which
+survives upgrades) and `--force` the item back to pristine. The full detect → diff → fix flow
+is in [../configure/hub.md](../configure/hub.md) § Tainted items. After any upgrade, scan for
+items left behind:
+
+```bash
+sudo cscli hub list | grep -i tainted
+```
+
+## Backup — only when it actually matters
+
+Because upgrades are backward-compatible, a **routine minor bump does not need a backup
+ritual**. Take a snapshot deliberately before the genuinely risky changes:
+
+- a **major-version** jump,
+- changing the **DB backend** (sqlite → postgres/mysql) or running a backend migration,
+- before a large hand-edit to config you're unsure about.
+
+What to copy (bare-metal paths):
+
+```bash
+sudo systemctl stop crowdsec
+sudo cp -a /etc/crowdsec /etc/crowdsec.bak                    # config, hub symlinks, _custom/ overrides
+sudo cp -a /var/lib/crowdsec/data /var/lib/crowdsec/data.bak  # sqlite crowdsec.db + geoip/datafiles
+sudo systemctl start crowdsec
+```
+
+A postgres/mysql backend lives in that database, not the data dir — dump it with the DB's own
+tools (`pg_dump` / `mysqldump`). In Docker the equivalents are the `cs-config` and `cs-data`
+named volumes; in Kubernetes it's the LAPI PVC (or the external DB).
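The post-upgrade tainted scan from the Hub items section can be turned into a count for use in a script or cron check. This is a minimal sketch: the `count_tainted` helper is hypothetical (it is not a `cscli` feature), and the sample text below is made up to stand in for real `cscli hub list` output.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of cscli): reads hub-list-style text on stdin
# and prints how many lines mention "tainted", case-insensitively.
count_tainted() {
  # grep -c exits non-zero when nothing matches, but still prints 0.
  grep -ci 'tainted' || true
}

# Made-up sample input standing in for real `cscli hub list` output.
sample='crowdsecurity/sshd-logs  enabled
crowdsecurity/http-wordpress_wpconfig  tainted'

printf '%s\n' "$sample" | count_tainted
```

In practice the input would come from `sudo cscli hub list`; a non-zero count means some items still need reconciling before they receive hub updates again.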
+
+## Rollback (rare)
+
+Reinstall the prior version and restart:
+
+```bash
+sudo apt install crowdsec=  # e.g. crowdsec=1.7.7; see 'apt-cache policy crowdsec'
+sudo systemctl restart crowdsec
+```
+
+In Docker, repoint the image tag and `docker compose up -d`. A minor-version rollback against
+a forward-migrated sqlite DB generally works (a v1.6 ↔ v1.7 round-trip on the same volume
+keeps decisions intact). For a **major** jump, don't rely on that — restore the pre-upgrade DB
+snapshot you took above rather than just downgrading the package.
+
+## Pitfalls
+
+- **Tainted items silently miss fixes.** `hub upgrade` leaves them on the old version with no
+  error beyond one warning line. Audit with `cscli hub list | grep tainted` after upgrading;
+  reconcile via [../configure/hub.md](../configure/hub.md) § Tainted items.
+- **helm `--reset-then-reuse-values`.** Skipping it drops your chart values.
+- **Read the release notes on minor bumps.** Backward-compatible ≠ zero behavior changes;
+  scan the changelog for defaults that moved.
+- **Reload vs restart.** Config/acquisition/hub changes need `reload`; a new engine binary
+  needs `restart` (or container/pod recreate).
+
+## Per-environment notes
+
+| Env | Apply |
+|---|---|
+| **systemd / bare-metal** | `apt`/`dnf` upgrade → `systemctl restart crowdsec`. Repo at `packagecloud.io/crowdsec/crowdsec`. |
+| **Docker / compose** | `docker compose pull && docker compose up -d` — keep the same named volumes so the DB persists and migrates. Pin a minor tag (`:v1.7`) in prod rather than `:latest`. |
+| **Kubernetes / Helm** | `helm repo update` → `helm upgrade … --reset-then-reuse-values`. Engine version follows the chart's app version. |
From 1616c9102dd8a7c620adc518a924af31255cddad Mon Sep 17 00:00:00 2001
From: Thibault Koechlin
Date: Thu, 21 May 2026 16:38:55 +0200
Subject: [PATCH 4/4] fix non existing cscli commands

---
 crowdsec/references/debug/triage.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/crowdsec/references/debug/triage.md b/crowdsec/references/debug/triage.md
index 6b76fdd..8b90cf8 100644
--- a/crowdsec/references/debug/triage.md
+++ b/crowdsec/references/debug/triage.md
@@ -79,7 +79,7 @@ cscli decisions list
 ```
 
 - **No active alerts** → step 3 lied about overflows, or LAPI write failed. Check `tail -n 200 /var/log/crowdsec_api.log` for `database is locked` / disk-full / migration errors.
-- **Alerts exist, no decisions** → check `cscli profiles list` and `/etc/crowdsec/profiles.yaml` — the profile filter may not match, or the duration is `0s`. See [../configure/profiles.md](../configure/profiles.md).
+- **Alerts exist, no decisions** → inspect `/etc/crowdsec/profiles.yaml` (there is no `cscli profiles` command) — the profile filter may not match, or the duration is `0s`. See [../configure/profiles.md](../configure/profiles.md).
 - **Decisions exist** → continue to step 5.
 
 ### 4½. Is the IP allowlisted?