Merged
21 commits
16889c5
test: scaffold integration harness directory and config
May 2, 2026
5b33887
test: add exhook-init container that registers floodgate via EMQX REST
May 2, 2026
2836d7c
test: add docker-compose.test.yaml with emqx+floodgate+exhook-init
May 2, 2026
cd23232
fix: add curl connect/max timeouts to exhook-init register.sh
May 2, 2026
b5c6a80
test: add test-driver image with subscribe-only smoke run
May 2, 2026
16e81a4
test: add test-driver helpers (subscriber, publisher, envelope builder)
May 2, 2026
940bf59
test: add zerohop integration case
May 2, 2026
afa14a7
test: add drop integration case
May 2, 2026
56e9d6d
test: add passthru integration case
May 2, 2026
1a820f6
test: add noop integration case
May 2, 2026
9387ce7
test: add custom-key channel passthru integration case
May 2, 2026
8cf71ba
test: add meshtasticd (sim mode) + MQTT init sidecar to integration s…
May 2, 2026
1cf4cbd
test: add meshtasticd round-trip integration case
May 2, 2026
1a18f89
test: add scripts/run-integration.sh orchestrator
May 2, 2026
71d235c
ci: add integration job that runs the compose harness on PR
May 2, 2026
3fde442
docs: document integration harness and --keep workflow in CONTRIBUTING
May 2, 2026
fb16664
docs: reference integration harness in CLAUDE.md
May 2, 2026
efd472d
test: address pre-PR code review on integration harness
May 2, 2026
86fc613
fix: meshtasticd image is on Docker Hub, override sh-wrapped CMD
May 2, 2026
8a811e7
fix: meshtasticd-init CLI invocation and runner log dump on failure
May 2, 2026
0610d4b
test: drop meshtasticd from integration harness
May 2, 2026
26 changes: 26 additions & 0 deletions .github/workflows/ci.yml
@@ -57,6 +57,7 @@ jobs:
        run: |
          pytest tests/ -q \
            --ignore=tests/test_container_smoke.py \
            --ignore=tests/integration \
            --cov=src/floodgate \
            --cov-report=term-missing \
            --cov-report=xml
@@ -91,6 +92,31 @@ jobs:
      - name: Run container smoke tests
        run: pytest tests/test_container_smoke.py -m smoke -v

# ---------------------------------------------------------------------------
# Integration test: full Docker Compose stack (emqx + floodgate + exhook-init
# + test-driver). End-to-end validation of drop / zerohop / passthru / noop /
# custom-key passthru via subscriber capture and /health stats.
# ---------------------------------------------------------------------------
  integration:
    name: Integration test (compose stack)
    runs-on: ubuntu-latest
    needs: smoke
    timeout-minutes: 15

    steps:
      - uses: actions/checkout@v4

      - name: Run integration harness
        run: ./scripts/run-integration.sh

      - name: Dump compose logs on failure
        if: failure()
        run: docker compose -f docker-compose.test.yaml logs --no-color || true

      - name: Always clean up
        if: always()
        run: docker compose -f docker-compose.test.yaml down -v --remove-orphans || true

# ---------------------------------------------------------------------------
# Manifest validation — verify k8s YAML is valid against the k8s schema
# ---------------------------------------------------------------------------
4 changes: 4 additions & 0 deletions CLAUDE.md
@@ -24,6 +24,9 @@ Gateway → EMQX → [ExHook gRPC] → floodgate → drop / modify / passthru
| `src/floodgate/health.py` | HTTP health check server on `health_port`. |
| `src/floodgate/__main__.py` | CLI entry point. |
| `proto/emqx/exhook.proto` | EMQX ExHook interface definition. |
| `docker-compose.test.yaml` | Integration test stack (emqx, floodgate, exhook-init, test-driver) on an isolated bridge network. |
| `scripts/run-integration.sh` | Integration harness orchestrator — `--keep` leaves the stack up, `--teardown` removes it. |
| `tests/integration/` | Integration test assets: floodgate config, ExHook init container, test-driver image + cases. |

## Dev Setup

@@ -35,6 +38,7 @@ cd floodgate
pip install -e ".[dev]"
pytest tests/ --ignore=tests/test_container_smoke.py -q # no Docker required
pytest tests/ -q # full suite including container smoke test (requires Docker)
./scripts/run-integration.sh # full Docker Compose end-to-end test (requires Docker)
```

Routing-logic tests mock the low-level zerohop functions, so the suite runs
26 changes: 25 additions & 1 deletion CONTRIBUTING.md
@@ -15,13 +15,14 @@ All PRs are squash-merged. One PR per feature or fix.

## CI jobs

-Every PR and push to `main` runs four jobs in sequence:
+Every PR and push to `main` runs five jobs in sequence:

| Job | What it checks |
|-----|----------------|
| **lint** | `ruff` style and import checks |
| **unit tests** | Pure Python tests across Python 3.11/3.12/3.13 — no external services needed. CI generates the Meshtastic protobuf stubs before running so the unmocked protobuf payload tests in `tests/payloads/protobuf/` are exercised. Mocked tests in the rest of the suite still run without protobufs (handy for fast local iteration). |
| **container smoke** | Builds the Docker image, starts the container, and verifies `/health` returns `200 OK`. Catches Dockerfile bugs and runtime import errors that unit tests cannot. |
| **integration** | Brings up `docker-compose.test.yaml` (EMQX + floodgate + test-driver) and runs `drop` / `zerohop` / `passthru` / `noop` / `custom-key passthru` end-to-end. See "Integration testing" below. |
| **manifest validation** | Validates `k8s/*.yaml` against the Kubernetes schema with `kubeconform`. |

### Running locally
@@ -50,6 +51,29 @@ pytest tests/test_container_smoke.py -m smoke -v
ruff check src/ tests/
```

### Integration testing

The integration harness (`scripts/run-integration.sh`) brings up a full Docker Compose stack (EMQX, floodgate, an ExHook auto-registration container, and a Python test-driver) on an isolated bridge network and runs end-to-end checks for `drop`, `zerohop`, `passthru`, `noop`, and `custom-key channel passthru`. The test-driver crafts real Meshtastic `ServiceEnvelope` protobufs with the `meshtastic` Python library, whose message definitions come from the same protobuf schema the firmware uses, so each case exercises the wire format floodgate sees in production.

Each case verifies both what the subscriber received (the delivered MQTT bytes) and floodgate's `/health` stats. A behavior change with no stat increment fails the case, and so does a stat increment with no delivery effect. One PASS or FAIL line is printed per case.
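
The dual check described above can be sketched as a small pure function. This is only an illustration, not the test-driver's actual code: `CaseResult`, `judge_case`, `format_line`, and the `dropped` stat key are hypothetical names, and the real `/health` payload may use different fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CaseResult:
    name: str
    passed: bool
    reason: str


def judge_case(name, delivered, expect_delivered,
               stats_before, stats_after, stat_key, expect_delta=1):
    """Pass only when BOTH the delivery outcome and the stat delta match."""
    # Counter movement between the two /health snapshots (missing key counts as 0).
    delta = stats_after.get(stat_key, 0) - stats_before.get(stat_key, 0)
    if delivered != expect_delivered:
        return CaseResult(name, False, f"delivery mismatch (delivered={delivered})")
    if delta != expect_delta:
        return CaseResult(name, False,
                          f"{stat_key} changed by {delta}, expected {expect_delta}")
    return CaseResult(name, True, "ok")


def format_line(result):
    """Render the single PASS/FAIL line printed per case."""
    status = "PASS" if result.passed else "FAIL"
    return f"{status} {result.name}: {result.reason}"


# A drop case: nothing delivered, and the (hypothetical) "dropped" counter rose by one.
example = judge_case("drop", delivered=False, expect_delivered=False,
                     stats_before={"dropped": 3}, stats_after={"dropped": 4},
                     stat_key="dropped")
print(format_line(example))  # prints "PASS drop: ok"
```

Keeping the verdict separate from the MQTT and HTTP plumbing means the pass/fail rule can be unit-tested without the compose stack.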

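Requirements: `docker` (with the Compose plugin, since the script calls `docker compose`), `bash`, and `curl` for the host-side readiness polling. The `pytest` suite never runs the harness; it is opt-in via the script.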

```bash
# One-shot verification — brings the stack up, runs cases, tears down, exits 0/non-zero.
./scripts/run-integration.sh

# Ad-hoc poking — leave the stack running after the cases finish.
./scripts/run-integration.sh --keep
# floodgate /health: http://localhost:18089/health
# EMQX dashboard: http://localhost:18083 (admin / public)

# Tear the stack down (and volumes/network) without running cases.
./scripts/run-integration.sh --teardown
```
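
When the stack is left up with `--keep`, the bounded readiness polling that the script does with `curl` can be reproduced from Python for ad-hoc checks. A minimal sketch, assuming `/health` returns a JSON body (the text here only confirms that it serves stats and answers `200 OK`); `fetch_json` and `wait_for_health` are hypothetical helper names, not part of the harness.

```python
import json
import time
import urllib.error
import urllib.request


def fetch_json(url, timeout=5.0):
    """GET a URL and decode the body as JSON (assumed response format)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_health(url, attempts=60, delay_s=1.0,
                    fetch=fetch_json, sleep=time.sleep):
    """Poll until fetch(url) succeeds; return (attempt_number, payload).

    Raises TimeoutError once the attempts are exhausted, mirroring the
    script's "never came up" exit.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return attempt, fetch(url)
        except (urllib.error.URLError, OSError, ValueError) as exc:
            # Connection refused, timeouts, HTTP errors, or a non-JSON body.
            last_error = exc
            if attempt < attempts:
                sleep(delay_s)
    raise TimeoutError(f"{url} not ready after {attempts} attempts: {last_error}")


# Example, with a stack started via --keep:
#   attempt, stats = wait_for_health("http://localhost:18089/health")
#   print(attempt, stats)
```

The `fetch` and `sleep` parameters are injectable so the retry logic can be exercised without a running stack.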

CI runs the same script in the `integration` job after the `smoke` job passes. A failed case dumps service logs into the workflow output before tearing down.

## Commit style

Conventional commits: `feat:`, `fix:`, `docs:`, `test:`, `chore:`
78 changes: 78 additions & 0 deletions docker-compose.test.yaml
@@ -0,0 +1,78 @@
# Integration test stack. Brought up by scripts/run-integration.sh.
# All inter-service traffic stays on the floodgate-test-net bridge.
# Only floodgate /health (8080->18089) and EMQX REST (18083) are exposed
# to the host so the runner can poll readiness — MQTT 1883 and gRPC 9000
# are container-internal only.

services:
  emqx:
    image: emqx/emqx:6.1.1
    container_name: floodgate-test-emqx
    networks: [floodgate-test-net]
    ports:
      - "18083:18083"
    environment:
      EMQX_NAME: "emqx-test"
      EMQX_DASHBOARD__DEFAULT_PASSWORD: "public"
    healthcheck:
      test: ["CMD", "/opt/emqx/bin/emqx", "ctl", "status"]
      interval: 5s
      timeout: 10s
      retries: 12
      start_period: 10s

  floodgate:
    build:
      context: .
      dockerfile: Dockerfile
    image: floodgate-test:ci
    container_name: floodgate-test-floodgate
    networks: [floodgate-test-net]
    ports:
      - "18089:8080"
    environment:
      FLOODGATE_CONFIG: /app/config.yaml
    volumes:
      - ./tests/integration/config.yaml:/app/config.yaml:ro
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:8080/health')"]
      interval: 5s
      timeout: 5s
      retries: 12
      start_period: 5s

  exhook-init:
    build:
      context: tests/integration/exhook-init
    container_name: floodgate-test-exhook-init
    networks: [floodgate-test-net]
    depends_on:
      emqx:
        condition: service_healthy
      floodgate:
        condition: service_healthy
    restart: "no"

  test-driver:
    build:
      context: tests/integration/test-driver
    image: floodgate-test-driver:ci
    container_name: floodgate-test-driver
    networks: [floodgate-test-net]
    depends_on:
      emqx:
        condition: service_healthy
      floodgate:
        condition: service_healthy
      exhook-init:
        condition: service_completed_successfully
    profiles: ["driver"]
    environment:
      EMQX_HOST: "emqx"
      EMQX_PORT: "1883"
      FLOODGATE_HEALTH_URL: "http://floodgate:8080/health"

networks:
  floodgate-test-net:
    name: floodgate-test-net
    driver: bridge
105 changes: 105 additions & 0 deletions scripts/run-integration.sh
@@ -0,0 +1,105 @@
#!/usr/bin/env bash
# Local + CI driver for the integration harness.
#
# Modes:
# (default) bring stack up, run all cases, tear stack down, exit 0/non-zero
# --keep bring stack up, run all cases, leave stack running for poking
# --teardown tear the stack down (and volumes/networks); skip running cases

set -euo pipefail

COMPOSE_FILE="docker-compose.test.yaml"
HEALTH_URL="http://localhost:18089/health"
EMQX_URL="http://localhost:18083/api/v5/status"

usage() {
  cat <<EOF
Usage: $0 [--keep | --teardown]
  --keep       Run cases, then leave stack running.
  --teardown   Tear stack down (no cases run).
  (no args)    Run cases and tear stack down on exit.
EOF
}

mode="run-and-teardown"
case "${1:-}" in
  --keep)     mode="run-and-keep" ;;
  --teardown) mode="teardown-only" ;;
  -h|--help)  usage; exit 0 ;;
  "")         ;;
  *)          usage; exit 2 ;;
esac

repo_root="$(cd "$(dirname "$0")/.." && pwd)"
cd "$repo_root"

teardown() {
  echo "==> Tearing down integration stack"
  docker compose -f "$COMPOSE_FILE" down -v --remove-orphans
}

if [ "$mode" = "teardown-only" ]; then
  teardown
  exit 0
fi

cleanup_on_error() {
  rc=$?
  if [ "$rc" -ne 0 ] && [ "$mode" != "run-and-keep" ]; then
    echo "==> Run failed: collecting service logs before teardown"
    docker compose -f "$COMPOSE_FILE" logs --no-color --tail=200 || true
    teardown
  fi
  exit "$rc"
}
# Install the trap before `up` so a failed build or start is also logged
# and torn down, not only failures in the readiness waits below.
trap cleanup_on_error EXIT

echo "==> Bringing up integration stack"
docker compose -f "$COMPOSE_FILE" up -d --build

echo "==> Waiting for EMQX REST"
for i in $(seq 1 60); do
  if curl -sf --connect-timeout 2 --max-time 5 -o /dev/null "$EMQX_URL"; then
    echo " EMQX REST ready after ${i}s"; break
  fi
  sleep 1
  [ "$i" -eq 60 ] && { echo "EMQX REST never came up" >&2; exit 1; }
done

echo "==> Waiting for floodgate /health"
for i in $(seq 1 60); do
  if curl -sf --connect-timeout 2 --max-time 5 -o /dev/null "$HEALTH_URL"; then
    echo " floodgate /health ready after ${i}s"; break
  fi
  sleep 1
  [ "$i" -eq 60 ] && { echo "floodgate /health never came up" >&2; exit 1; }
done

echo "==> Running test-driver"
set +e
docker compose -f "$COMPOSE_FILE" run --rm test-driver
rc=$?
set -e

echo "==> Test-driver exit code: $rc"

if [ "$mode" = "run-and-keep" ]; then
  echo "==> --keep: leaving stack running."
  echo " floodgate /health: $HEALTH_URL"
  echo " EMQX dashboard: http://localhost:18083 (admin/public)"
  echo " Tear down with: $0 --teardown"
  trap - EXIT
  exit $rc
fi

trap - EXIT

if [ $rc -ne 0 ]; then
# Dump service logs BEFORE teardown on the explicit-failure path. Without
# this, the workflow's `if: failure()` log-dump step fires after teardown
# finishes — by then all containers are gone and nothing remains to log.
echo "==> Run failed — dumping service logs before teardown"
docker compose -f "$COMPOSE_FILE" logs --no-color --tail=400 || true
fi
teardown
exit $rc
3 changes: 3 additions & 0 deletions tests/integration/README.md
@@ -0,0 +1,3 @@
# Integration test harness

End-to-end harness using Docker Compose. See `CONTRIBUTING.md` ("Integration testing") for usage. Top-level entry point: `scripts/run-integration.sh`. Stack defined in `docker-compose.test.yaml`.
23 changes: 23 additions & 0 deletions tests/integration/config.yaml
@@ -0,0 +1,23 @@
zerohop_enabled: true
zerohop_channels:
- "LongTurbo"
- "LongFast"
- "LongModerate"
- "MediumFast"
- "MediumSlow"
- "ShortFast"
- "ShortSlow"
- "ShortTurbo"

drop_enabled: true
drop_channels: "zerohop_channels"
drop_portnums:
- "RANGE_TEST_APP"

grpc_port: 9000
health_port: 8080
topic_filter: "msh/#"
stats_interval_s: 10
log_level: "INFO"
log_format: "text"
stats_log: true
8 changes: 8 additions & 0 deletions tests/integration/exhook-init/Dockerfile
@@ -0,0 +1,8 @@
FROM alpine:3.20

RUN apk add --no-cache curl jq bash

COPY register.sh /register.sh
RUN chmod +x /register.sh

ENTRYPOINT ["/register.sh"]
73 changes: 73 additions & 0 deletions tests/integration/exhook-init/register.sh
@@ -0,0 +1,73 @@
#!/usr/bin/env bash
# Register floodgate as an ExHook in EMQX once the broker REST API is up.
# Idempotent: if a hook named "floodgate" already exists, PUT to update it
# instead of POSTing a new one.

set -euo pipefail

EMQX_URL="${EMQX_URL:-http://emqx:18083}"
EMQX_USER="${EMQX_USER:-admin}"
EMQX_PASS="${EMQX_PASS:-public}"
HOOK_URL="${HOOK_URL:-http://floodgate:9000}"
HOOK_NAME="floodgate"

echo "exhook-init: waiting for EMQX REST at ${EMQX_URL} ..."
for i in $(seq 1 60); do
  if curl -sf --connect-timeout 2 --max-time 5 -o /dev/null "${EMQX_URL}/api/v5/status"; then
    echo "exhook-init: EMQX REST is up after ${i}s"
    break
  fi
  sleep 1
  [ "$i" -eq 60 ] && { echo "exhook-init: EMQX REST never came up" >&2; exit 1; }
done

echo "exhook-init: logging in"
TOKEN=$(curl -sf --connect-timeout 2 --max-time 5 -X POST "${EMQX_URL}/api/v5/login" \
  -H 'Content-Type: application/json' \
  -d "{\"username\":\"${EMQX_USER}\",\"password\":\"${EMQX_PASS}\"}" \
  | jq -r .token)

if [ -z "${TOKEN}" ] || [ "${TOKEN}" = "null" ]; then
  echo "exhook-init: failed to obtain EMQX REST token" >&2
  exit 1
fi

BODY=$(cat <<EOF
{
  "name": "${HOOK_NAME}",
  "url": "${HOOK_URL}",
  "auto_reconnect": "5s",
  "failed_action": "deny",
  "request_timeout": "5s"
}
EOF
)

if curl -sf --connect-timeout 2 --max-time 5 -o /dev/null "${EMQX_URL}/api/v5/exhooks/${HOOK_NAME}" \
    -H "Authorization: Bearer ${TOKEN}"; then
  echo "exhook-init: hook '${HOOK_NAME}' exists: updating"
  curl -sf --connect-timeout 2 --max-time 5 -X PUT "${EMQX_URL}/api/v5/exhooks/${HOOK_NAME}" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "${BODY}" >/dev/null
else
  echo "exhook-init: hook '${HOOK_NAME}' does not exist: creating"
  curl -sf --connect-timeout 2 --max-time 5 -X POST "${EMQX_URL}/api/v5/exhooks" \
    -H "Authorization: Bearer ${TOKEN}" \
    -H 'Content-Type: application/json' \
    -d "${BODY}" >/dev/null
fi

echo "exhook-init: verifying registration"
STATUS=$(curl -sf --connect-timeout 2 --max-time 5 "${EMQX_URL}/api/v5/exhooks/${HOOK_NAME}" \
  -H "Authorization: Bearer ${TOKEN}" | jq -r '.status // "unknown"')
echo "exhook-init: hook '${HOOK_NAME}' status=${STATUS}"

if [ "${STATUS}" = "connected" ] || [ "${STATUS}" = "running" ]; then
  echo "exhook-init: success"
  exit 0
fi

echo "exhook-init: hook registered but status=${STATUS}; floodgate may still be starting."
echo "exhook-init: init container exits 0; floodgate->EMQX gRPC will recover on its own."
exit 0