crowdsecurity · buixor · May 21, 2026
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -7,6 +7,29 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ## [Unreleased]
 
+### Added
+- `references/configure/acquisition.md` — file/journald/docker datasources, the
+  `labels.type` model, verification with `crowdsec -t` / `cscli metrics show acquisition`
+  / `cscli explain`, and common pitfalls.
+- `references/configure/profiles.md` — alert→decision flow, why alerts don't always ban,
+  `profiles.yaml` structure, ban/captcha/throttle, `duration_expr` escalation, simulation
+  mode, and allowlist interaction.
+- `references/configure/hub.md` — collections vs items, `update` vs `upgrade`, tainted-item
+  detection and repair, `_custom/` overrides, and the `sed -i` symlink-break pitfall.
+- `references/configure/bouncers/web-servers.md` — full Traefik
+  (`maxlerebourg/crowdsec-bouncer-traefik-plugin`) and Caddy
+  (`hslatman/caddy-crowdsec-bouncer`) setup, AppSec wiring, and real-client-IP handling,
+  replacing the previous canonical-pointer stubs; plus a "Pick your bouncer" section index.
+- `references/operate/upgrades.md` — lean per-environment upgrade runbook (backward-compatible
+  happy path, independent bouncer cadence), the hub-upgrade-skips-tainted consequence,
+  backup-when-it-matters, and a verified rollback note.
+
+### Changed
+- `crowdsec/SKILL.md` — dropped the stub markers on acquisition/profiles/hub/upgrades; split
+  the single web-servers bouncer row into per-bouncer `§`-section routing rows (also adding
+  haproxy/apache cues); added a real-client-IP / reverse-proxy routing cue; and corrected the
+  cheat sheet (`cscli profiles list` does not exist; read `/etc/crowdsec/profiles.yaml`).
+
 ## [0.1.0] - 2026-05-20
 
 ## [0.1.0] - 2026-05-19

diff --git a/PUBLISHING.md b/PUBLISHING.md
diff --git a/crowdsec/SKILL.md b/crowdsec/SKILL.md
@@ -57,17 +57,22 @@ Docker/k8s commands run inside the container/pod and do not need this.
 | Cue from user | Go to |
 |---|---|
 | "install", "set up", "fresh box", "how do I start" | [references/install/](./references/install/) (pick file by env) |
-| "configure logs / acquisition", "read journald / syslog / docker logs" | [references/configure/acquisition.md](./references/configure/acquisition.md) *(TODO — stub)* |
-| "install a collection / parser / scenario", "hub", "tainted" | [references/configure/hub.md](./references/configure/hub.md) *(TODO — stub)* |
-| "ban duration", "captcha", "decisions", "simulation" | [references/configure/profiles.md](./references/configure/profiles.md) *(TODO — stub)* |
+| "configure logs / acquisition", "read journald / syslog / docker logs" | [references/configure/acquisition.md](./references/configure/acquisition.md) |
+| "install a collection / parser / scenario", "hub", "tainted" | [references/configure/hub.md](./references/configure/hub.md) |
+| "ban duration", "captcha", "decisions", "simulation", "alerts but no bans" | [references/configure/profiles.md](./references/configure/profiles.md) |
 | "allowlist my office / CDN / monitoring IP", "I'm getting blocked by CAPI", "exclude IP from any ban" | [references/configure/allowlists.md](./references/configure/allowlists.md) |
 | "whitelist vs allowlist vs postoverflow", "which suppression layer should I use" | [references/configure/allowlists.md](./references/configure/allowlists.md) § Suppression mechanisms |
 | "alert me on slack/email/webhook" | [references/configure/notifications.md](./references/configure/notifications.md) *(TODO — stub)* |
 | "block at the firewall", "iptables", "nftables", "ipset" | [references/configure/bouncers/firewall.md](./references/configure/bouncers/firewall.md) |
-| "nginx / traefik / caddy bouncer" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) |
+| "nginx bouncer", "lua / openresty module" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § nginx |
+| "haproxy bouncer", "SPOA / SPOE" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § haproxy |
+| "apache bouncer", "mod_crowdsec" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § apache |
+| "traefik bouncer", "traefik plugin / middleware" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § Traefik |
+| "caddy bouncer", "caddy module / xcaddy" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) § Caddy |
+| "wrong source IP", "real client IP", "behind Cloudflare / reverse proxy / NPM", "X-Forwarded-For", "everyone shows as the proxy IP" | [references/configure/bouncers/web-servers.md](./references/configure/bouncers/web-servers.md) — per-bouncer real-IP/trusted-proxy sections |
 | "AppSec", "WAF", "virtual patching", "block by request shape" | [references/appsec/](./references/appsec/) — overview, deploy, configure, troubleshoot |
 | "Console", "enroll", "share signals" | [references/install/console.md](./references/install/console.md) |
-| "upgrade", "back up", "roll back" | [references/operate/upgrades.md](./references/operate/upgrades.md) *(TODO — stub)* |
+| "upgrade", "back up", "roll back", "new version", "tainted items after upgrade" | [references/operate/upgrades.md](./references/operate/upgrades.md) |
 | "multiple agents", "remote LAPI", "mTLS", "postgres backend" | [references/operate/multi-server.md](./references/operate/multi-server.md) *(TODO — stub)* |
 | "is it working?", "smoke test", "validate install", "verify setup", "did detection / WAF / blocking actually wire up?" | [references/operate/health-check.md](./references/operate/health-check.md) |
 | "it's broken" / "not working" / general diagnosis | [references/debug/triage.md](./references/debug/triage.md) → run `~/.claude/skills/crowdsec/scripts/diagnose.sh` |
@@ -110,7 +115,7 @@ These work in every environment. On bare-metal/systemd, prefix with `sudo` (unle
 | Replay a single log line | `cscli explain --log '<line>' --type <type>` |
 | Validate config after editing any yaml (acquisition/profiles/config) | `crowdsec -t` (bare-metal; also auto-runs on `systemctl reload`) — then confirm the source reads with `cscli metrics show acquisition` |
 | See simulation state (alerts but no decisions) | `cscli simulation status` |
-| List decision profiles (filters / ban duration) | `cscli profiles list` — full content in `/etc/crowdsec/profiles.yaml` |
+| Inspect decision profiles (filters / ban duration) | `cat /etc/crowdsec/profiles.yaml` — there is **no** `cscli profiles` command (through v1.7.8); see [references/configure/profiles.md](./references/configure/profiles.md) |
 
 Where things live on a default bare-metal install:
 

diff --git a/crowdsec/references/appsec/deploy.md b/crowdsec/references/appsec/deploy.md
@@ -112,7 +112,7 @@ The smoke test above proves the WAF works. For production you point a real bounc
 | Bouncer | Where to set the AppSec endpoint |
 |---|---|
 | `crowdsec-nginx-bouncer` (lua module) | `APPSEC_URL=http://127.0.0.1:7422` in `/etc/crowdsec/bouncers/crowdsec-nginx-bouncer.conf` (shell-style `KEY=VALUE`, empty by default = WAF off). The self-registered `API_KEY` already serves AppSec — reuse it. |
-| `crowdsec-traefik-bouncer` (middleware plugin) | `crowdsec.appsec.enabled: true`, `crowdsec.appsec.url`, and the AppSec-aware API key in `crowdsec.crowdsecLapiKey`. |
+| Traefik (`maxlerebourg/crowdsec-bouncer-traefik-plugin`) | Flat plugin options: `crowdsecAppsecEnabled: true` (default false), `crowdsecAppsecHost: crowdsec:7422` (host:port, no scheme), and the bouncer key in `crowdsecLapiKey`. Full recipe in [../configure/bouncers/web-servers.md](../configure/bouncers/web-servers.md) § Traefik. |
 | `crowdsec-caddy-bouncer` (Caddy module) | Equivalent `appsec_url` directive on the bouncer block. |
 | Any other AppSec-aware bouncer | Look for an `appsec_url` / `appsec.url` field; auth is always the bouncer's existing API key. |
 

diff --git a/crowdsec/references/configure/acquisition.md b/crowdsec/references/configure/acquisition.md
@@ -2,10 +2,139 @@
 
 Canonical docs: <https://docs.crowdsec.net/docs/next/getting_started/post_installation/acquisition> · datasources index <https://docs.crowdsec.net/docs/next/data_sources/intro>
 
-> STUB. To cover:
-> - `acquis.yaml` vs. `acquis.d/*.yaml`
-> - File datasource (paths, type/labels, multi-file globs)
-> - journald datasource (filters, units)
-> - syslog, kinesis, k8s_audit, docker, AppSec — when to pick each
-> - Verify a source after editing: `crowdsec -t` (validate config), `cscli metrics show acquisition` (confirm it's read), `cscli explain` (confirm a line parses)
-> - Common pitfalls: missing `type:` label (parser won't match), permission denied on log files, journald unit filter typos
+Acquisition tells the engine **what logs to read and how to label them**. Each source
+declares a `source:` (the datasource type) and a `labels.type:` (the parser hint). If the
+engine reads lines but they show up as **`Lines unparsed`**, the source itself is being
+read and the problem is the `labels.type` value or the parser; debug that with
+[../debug/parsing.md](../debug/parsing.md). If a source shows **0 `Lines read`**, the
+source definition (path, filter, permissions) is at fault and this page applies.
+
+## Where acquisition lives
+
+| | Path / mechanism |
+|---|---|
+| Single legacy file | `/etc/crowdsec/acquis.yaml` (`acquisition_path` in `config.yaml`) |
+| Drop-in dir (preferred) | `/etc/crowdsec/acquis.d/*.yaml` (`acquisition_dir` in `config.yaml`) — one file per source set |
+| Docker | Bind-mounted `acquis.d` directory (the `COLLECTIONS` env only installs hub items, not sources); see Per-environment notes |
+| Kubernetes | The chart's `config.acquisition` values render into the same `acquis.d` files |
+
+Both `acquisition_path` and `acquisition_dir` load if set — check `config.yaml`:
+
+```bash
+sudo grep -E 'acquisition_(path|dir)' /etc/crowdsec/config.yaml
+# acquisition_path: /etc/crowdsec/acquis.yaml
+# acquisition_dir: /etc/crowdsec/acquis.d
+```
+
+Each YAML doc is **one source**. Multiple sources per file are allowed if separated by
+`---`. Put unrelated sources in their own files under `acquis.d/`.
+
+## The label model — every source needs `labels.type`
+
+`labels.type` is the parser router. A source with no `type` (or the wrong one) is read but
+never parsed — every line lands in `Lines unparsed`. Set it to the family the lines belong
+to: `syslog`, `nginx`, `haproxy`, `appsec`, etc. (the value the relevant parser matches on).
+
+## File datasource
+
+```yaml
+source: file
+filenames:
+  - /var/log/nginx/*.log        # globs allowed
+  - /var/log/auth.log           # list as many paths as you need
+labels:
+  type: nginx
+```
+
+Glob expansion is evaluated at startup; files created later that match are **not** picked
+up until reload. For logs that rotate often, point at the directory with a glob rather than
+naming each file.
+
+## journald datasource
+
+```yaml
+source: journalctl
+journalctl_filter:
+  - "_SYSTEMD_UNIT=ssh.service"   # journalctl-style match; one filter per list entry
+labels:
+  type: syslog
+```
+
+The filter strings are passed straight to `journalctl`. After reload the source appears in
+metrics as `journalctl:journalctl-_SYSTEMD_UNIT=ssh.service`. A typo in the unit name is
+silent — the source reads **0 lines** rather than erroring.
+
+## docker datasource
+
+For a CrowdSec **container** reading other containers' stdout/stderr via the Docker socket:
+
+```yaml
+source: docker
+container_name:
+  - acq-nginx            # exact names; container_name_regexp / labels also supported
+labels:
+  type: nginx
+```
+
+Requires `/var/run/docker.sock` mounted into the CrowdSec container. The source shows up as
+`docker:<container-name>`. Use this instead of a file source when apps log to stdout (the
+12-factor norm in Docker/compose) — there is no log file to bind-mount.
+
+## When to pick which source
+
+| Logs come from… | `source:` |
+|---|---|
+| A file or files on disk | `file` |
+| systemd journal (no file written, e.g. modern sshd) | `journalctl` |
+| Other containers' stdout (CrowdSec runs in Docker) | `docker` |
+| A remote host shipping over syslog | `syslog` (listener) |
+| Kubernetes audit webhook | `k8s_audit` |
+| AWS Kinesis / CloudWatch | `kinesis` / `cloudwatch` |
+| The WAF listener (not a log — request inspection) | `appsec` (see [../appsec/deploy.md](../appsec/deploy.md)) |
+
+## Verify after editing
+
+```bash
+# 1. Validate config — silent + exit 0 means OK. A bad source prints FATAL.
+sudo crowdsec -t
+#   e.g. FATAL crowdsec init: while loading acquisition config:
+#        /etc/crowdsec/acquis.d/foo.yaml: unknown data source nonexistent_ds
+
+# 2. Apply (reload picks up acquisition changes without dropping the API)
+sudo systemctl reload crowdsec
+
+# 3. Confirm the source is actually read — find your source, check 'Lines read' climbs
+sudo cscli metrics show acquisition
+#   | file:/var/log/nginx/access.log | 19 | 19 | -   ...
+#   | journalctl:journalctl-_SYSTEMD_UNIT=ssh.service | 2 | 2 | - ...
+#   | docker:acq-nginx | 5 | 5 | - ...
+
+# 4. Confirm a representative line parses with the chosen type
+sudo cscli explain --log 'May 21 09:00:00 host sshd[123]: Failed password for invalid user admin from 1.2.3.4 port 22 ssh2' --type syslog
+#   s01-parse → 🟢 crowdsecurity/sshd-logs ... parser success 🟢
+sudo cscli explain --file /var/log/nginx/access.log --type nginx   # replay a whole file
+```
+
+## Pitfalls
+
+- **Missing/wrong `labels.type`:** lines read but all `unparsed`. The single most common
+  acquisition mistake. Match `type` to the parser family.
+- **Permission denied on log files:** on bare-metal the engine runs as root and can read most
+  logs, but under a non-root setup tightly permissioned files (e.g. `/var/log` files set to
+  `0640 root:adm`) can still block it. Check ownership/ACLs if a file source reads 0 lines.
+- **journald unit typo:** wrong `_SYSTEMD_UNIT` → 0 lines, no error. Verify with
+  `journalctl _SYSTEMD_UNIT=ssh.service` first.
+- **Docker bind-mount path mismatch:** for a *file* source inside a CrowdSec container, the
+  `filenames:` must be the **container** path, not the host path. Mismatch → 0 lines. (Use
+  the `docker` source to avoid the problem entirely.)
+- **Globs are startup-only:** new files matching a glob need a reload to be acquired.
+- **Edited but not applied:** `crowdsec -t` validates the file but does not load it — you
+  still need `systemctl reload crowdsec` (or recreate the container / `helm upgrade`).
+
+## Per-environment notes
+
+| Env | What changes |
+|---|---|
+| **systemd / bare-metal** | Recipes above as-is. Edit `acquis.d/*.yaml`, `crowdsec -t`, `systemctl reload crowdsec`. |
+| **Docker / compose** | Mount `./acquis.d:/etc/crowdsec/acquis.d` (and `/var/run/docker.sock` for the docker source). `COLLECTIONS=`/`PARSERS=` env install hub items at start. Run cscli with `docker exec <name> cscli metrics show acquisition`. Recreate the container to apply (a reload signal also works). |
+| **Kubernetes / Helm** | Define sources under `config.acquisition` in values; `helm upgrade --reset-then-reuse-values`. Inspect with `kubectl exec -n <ns> <agent-pod> -- cscli metrics show acquisition`. The `k8s_audit` source needs the API server's audit webhook pointed at the agent. |
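
The most common pitfall listed in `acquisition.md` above (a source that is read but has no `labels.type`) could be caught before a reload with a small lint. What follows is a minimal sketch, not part of CrowdSec: the function name `check_acquis_labels` is hypothetical, and it assumes the simple layout used in the recipes above (top-level `source:` and `labels:` keys, an indented `type:` under `labels:`, documents separated by `---`). It is a text scan, not a YAML parser, so unusual formatting may produce false results.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a CrowdSec command): report acquisition documents
# that declare `source:` but have no non-empty `type:` under `labels:`.
# Assumption: flat layout as in the recipes above; this is not a YAML parser.
check_acquis_labels() {
  local file status=0
  for file in "$@"; do
    awk -v file="$file" '
      # Report the document just finished, then reset per-document state.
      function flush_doc() {
        if (has_source && !has_type) {
          printf "%s: doc %d (source: %s) has no labels.type\n", file, doc, src
          bad = 1
        }
        has_source = 0; has_type = 0; in_labels = 0; src = ""
      }
      BEGIN { doc = 1; bad = 0 }
      /^---[ \t]*$/ { flush_doc(); doc++; next }     # document separator
      /^[ \t]*#/    { next }                          # comment line
      /^[ \t]*$/    { next }                          # blank line
      /^source:/    { has_source = 1; src = $2; in_labels = 0; next }
      /^labels:/    { in_labels = 1; next }
      /^[^ \t]/     { in_labels = 0; next }           # any other top-level key
      in_labels && /^[ \t]+type:[ \t]*[^ \t#]/ { has_type = 1 }
      END { flush_doc(); exit bad }
    ' "$file" || status=1
  done
  return "$status"
}
```

One possible use is to run it ahead of the engine's own validation, for example `check_acquis_labels /etc/crowdsec/acquis.d/*.yaml && sudo crowdsec -t`. The function prints one line per offending document and returns non-zero if any file has one, so it can gate a reload in a script. It cannot tell whether a `type` value is the *right* one; `cscli explain` remains the check for that.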