[kafka pr-1 · 04/6] Slice 04 — Opportunistic Kafka probe (4 auth modes + ?force)#393

Open
zsculac wants to merge 31 commits into feat/kafka-walking-skeleton from feat/kafka-probe

Conversation


zsculac (Contributor) commented May 4, 2026

Stack position

Sub-PR stacking on the kafka foundation branch (PR #390).

```
main
└── feat/kafka-walking-skeleton (foundation rollup, PR #390 — draft)
└── feat/kafka-probe ← THIS PR
```

Independent siblings (no merge order requirement):

  • Slice 02 — Explicit local-vs-shared CG choice
  • Slice 03 — Default-private KAs

Expect a mechanical rebase conflict in `routes/kafka.ts` after either sibling lands.

What slice 04 adds

The `kafka-probe` deep module: one function, issuing a one-shot kafkajs admin
`fetchTopicMetadata` call against the broker. Supports all four auth modes
(PLAINTEXT, SASL_PLAINTEXT, SASL_SSL, SSL). PEMs inline (default) or
filesystem paths (escape hatch). Credentials discarded immediately —
never logged, never persisted in a KA.
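A minimal sketch of the auth-mode wiring described above. All names here are illustrative (they are not the real `kafka-probe` exports); the point is the mapping from the declared protocol onto the shape a kafkajs `Kafka` constructor accepts — `ssl` only for the TLS modes, `sasl` only for the SASL modes:

```typescript
type SecurityProtocol = "PLAINTEXT" | "SASL_PLAINTEXT" | "SASL_SSL" | "SSL";

interface ProbeAuth {
  securityProtocol: SecurityProtocol;
  sasl?: { mechanism: string; username: string; password: string };
  ssl?: { caPem?: string; certPem?: string; keyPem?: string };
}

// Hypothetical helper: derive a kafkajs-shaped client config from the
// declared protocol plus whatever auth material the caller supplied.
function buildKafkaConfig(brokers: string[], auth: ProbeAuth) {
  const wantsTls = auth.securityProtocol === "SSL" || auth.securityProtocol === "SASL_SSL";
  const wantsSasl = auth.securityProtocol === "SASL_PLAINTEXT" || auth.securityProtocol === "SASL_SSL";
  const m = auth.ssl ?? {};
  const ssl = wantsTls
    ? {
        rejectUnauthorized: true,
        ...(m.caPem !== undefined ? { ca: [m.caPem] } : {}),
        ...(m.certPem !== undefined ? { cert: m.certPem } : {}),
        ...(m.keyPem !== undefined ? { key: m.keyPem } : {}),
      }
    : undefined;
  const sasl = wantsSasl ? auth.sasl : undefined;
  // Retries pinned to 0 so unreachable brokers resolve quickly; the real
  // module also silences kafkajs' own logger (logLevel.NOTHING).
  return {
    brokers,
    retry: { retries: 0 },
    ...(ssl !== undefined ? { ssl } : {}),
    ...(sasl !== undefined ? { sasl } : {}),
  };
}
```

Because the credentials live only in the returned config (which is handed to a short-lived admin client), nothing credential-bearing needs to survive the call.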

`endpoint.register` becomes opportunistic per ADR-0002:

| caller's input | behavior |
| --- | --- |
| creds present, probe ok | KA written, `verificationStatus: "verified"`, `verifiedAt` set |
| creds present, probe fail | HTTP 4xx, no KA written |
| creds present, probe fail, `?force=true` | KA written, `verificationStatus: "failed"` |
| creds absent | no probe, KA written, `verificationStatus: "unattempted"` |

KA gains `dkg:verificationStatus`, `dkg:verifiedAt`, `dkg:securityProtocol`
(advertised hint — never raw creds).
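The decision table above reduces to a single pure function. This is a hypothetical condensation (the real `registerKafkaEndpoint` surface differs), but the four outcomes match the rows:

```typescript
type ProbeOutcome = { status: "verified" | "failed" | "unreachable"; probedAt: string };
type VerificationStatus = "verified" | "failed" | "unattempted";

// Illustrative decision function for the opportunistic-registration rows.
function decideRegistration(
  probe: ProbeOutcome | undefined,
  force: boolean,
): { write: boolean; verificationStatus: VerificationStatus; verifiedAt?: string } {
  // creds absent: no probe ran, KA still written
  if (probe === undefined) return { write: true, verificationStatus: "unattempted" };
  // probe ok: verified, timestamped (force is irrelevant here)
  if (probe.status === "verified")
    return { write: true, verificationStatus: "verified", verifiedAt: probe.probedAt };
  // probe fail + ?force=true: KA written but honestly marked failed
  if (force) return { write: true, verificationStatus: "failed", verifiedAt: probe.probedAt };
  // probe fail: nothing written; the route maps this to HTTP 4xx
  return { write: false, verificationStatus: "failed" };
}
```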

This is the heaviest slice in the foundation:

  • `packages/kafka/src/kafka-probe.ts` — new deep module
  • testcontainers test infra: `packages/kafka/test/helpers/` + plain compose YAML at `test/fixtures/`
  • Synthetic producer helper (1 topic, 1 message)
  • First runtime `kafkajs` dep on the package — pinned and documented

Test plan

  • Unit: kafka-probe with mocked kafkajs admin, all 4 auth-mode wiring branches
  • Integration: testcontainers Kafka (PLAINTEXT) — topic-exists / topic-absent / broker-unreachable / auth-fail
  • Integration: SASL_SSL if cheap, otherwise documented follow-up
  • Credential-discard: probe with creds → `JSON.stringify(result)` contains no credential substring
  • Logger discipline: capture-mode test, no creds in any log line
  • endpoint.register tests cover all 4 status outcomes (verified/failed/forced-failed/unattempted)
  • e2e: live testcontainer Kafka + register with creds → SPARQL confirms `verificationStatus = "verified"`
  • ADR-0001 invariant: no consumer code (`grep -rE 'consumer\.run|consumer\.subscribe|groupId' packages/kafka/src/` comes back empty)
  • Coverage ratchet updated

Risk notes

  • Most external integration in the foundation. testcontainers can be flaky in some CI environments; if so, integration tests fall back to
    env-gated execution.
  • `kafkajs` is the chosen primitive; deviations would need a separate ADR.

Related

  • ADR-0001 (metadata-only invariant — the probe is admin-only, not a consumer)
  • ADR-0002 (opportunistic verification)
  • Issue: `.scratch/kafka-registry/issues/04-kafka-probe.md`

Zvonimir and others added 20 commits May 4, 2026 17:14
Introduces a single-function deep module `probe(opts) → ProbeResult` that
opens a one-shot kafkajs Admin client, calls fetchTopicMetadata for the
target topic, and returns a structured outcome (verified | failed |
unreachable). Wires all four broker auth modes (PLAINTEXT, SASL_PLAINTEXT,
SASL_SSL, SSL/mTLS) and accepts PEM material inline (default) or via
filesystem paths (escape hatch). Credentials are scoped to a single call
and never appear on the returned ProbeResult.

kafkajs@2.2.4 is added as the first runtime dependency on the kafka
package (deliberate). kafkajs' built-in logger is silenced
(`logLevel.NOTHING`) to remove any chance of credentials surfacing
through its log payloads, and retries are pinned to 0 so unreachable
brokers resolve quickly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends `ka-builder` with three optional fields — `verificationStatus`,
`verifiedAt`, `securityProtocol` — that surface on the published KA as
`dkg:verificationStatus`, `dkg:verifiedAt`, and `dkg:securityProtocol`.
Without those inputs the KA shape is identical to slice-01.

Extends `registerKafkaEndpoint` to consume a probe outcome (run by the
caller — this package's pure layer never opens Kafka connections of
its own per ADR 0001/0002) and a `force` override. Decision tree:

  probe absent              → status: unattempted (slice-01 behavior)
  probe verified            → status: verified, verifiedAt = probedAt
  probe failed/unreachable  → throw KafkaEndpointProbeFailedError
  probe failed + force=true → status: failed, verifiedAt = probedAt

The bare-KA contract on `KafkaEndpointPublisher.publish` is untouched.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
- Daemon route at POST /api/kafka/endpoint now parses optional
  securityProtocol/sasl/ssl fields, runs the kafka-probe at the
  network boundary when creds are present, and feeds the outcome to
  registerKafkaEndpoint. Failed probes return 422 with a sanitized
  error payload (no credentials in the response). The `?force=true`
  query param overrides probe failure and lets the caller register
  the KA with `verificationStatus: "failed"`.
- CLI gains --security-protocol, --username, --password, --ca-pem-path,
  --cert-pem-path, --key-pem-path, and --force. The CLI resolves
  --ca-pem-path / --cert-pem-path / --key-pem-path to inline PEM
  strings before posting; the daemon's caPath/certPath/keyPath
  filesystem-path mode remains as an escape hatch for direct API
  callers.
- api-client.registerKafkaEndpoint() typed for the new fields and
  appends ?force=true to the URL when force is requested (rather than
  smuggling it into the body).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
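The CLI-side resolution of `--ca-pem-path` / `--cert-pem-path` / `--key-pem-path` into inline PEM strings can be sketched as below. The helper name and shapes are illustrative, and the file reader is injectable so the mapping is testable without fixture certs:

```typescript
import { readFileSync } from "node:fs";

// Hypothetical helper: turn *-pem-path flags into the inline PEM strings
// the daemon expects in the request body.
function resolveInlinePems(
  paths: { caPath?: string; certPath?: string; keyPath?: string },
  read: (p: string) => string = (p) => readFileSync(p, "utf8"),
) {
  const maybe = (p?: string) => (p === undefined ? undefined : read(p));
  // Absent flags stay absent; the daemon's filesystem-path mode remains a
  // separate escape hatch for direct API callers.
  return { ca: maybe(paths.caPath), cert: maybe(paths.certPath), key: maybe(paths.keyPath) };
}
```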
…probe

- `test/helpers/kafka-container.ts`: spins up a single-broker
  confluentinc/cp-kafka:7.5.0 KRaft node via @testcontainers/kafka and
  surfaces the bootstrap string (`host:mappedPort`).
- `test/helpers/synthetic-producer.ts`: creates a topic and produces a
  single message so the probe has something concrete to find via
  `fetchTopicMetadata`.
- `test/fixtures/docker-compose.yml`: plain compose YAML mirroring the
  same image and listener config the testcontainers helper uses, for
  manual debugging only. Tests do NOT consume this file directly.
- `test/integration/kafka-probe.test.ts`: covers verified (topic
  exists), failed (topic absent), unreachable (wrong port), and
  credential-discarding (creds against PLAINTEXT broker → no creds in
  ProbeResult).

PLAINTEXT mode is exercised end-to-end. SASL_SSL coverage is documented
as a follow-up — wiring up a TLS listener in testcontainers needs a
fixture cert generator that's beyond this slice's scope. The SASL_SSL
config-wiring branch stays covered in the unit tests.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Extends `walking-skeleton.test.ts` with a live-probe flow:
- spins up a testcontainer Kafka,
- creates a synthetic topic via the producer helper,
- runs the CLI registration with `--security-protocol PLAINTEXT`,
- queries the CG via SPARQL and asserts
  `dkg:verificationStatus = "verified"`, `dkg:securityProtocol =
  "PLAINTEXT"`, and `dkg:verifiedAt` is a valid ISO-8601 timestamp
  recorded within the last minute.

Also tightens the no-creds path: the original walking-skeleton test
now asserts `verificationStatus = "unattempted"` lands on the KA when
no probe runs.

`kosavaKafkaCoverage` is updated to the new measured floors after the
slice — lines/statements at 95, branches at 85, functions at 100. The
two uncovered lines are the best-effort disconnect catch handlers in
endpoint.ts and kafka-probe.ts, both `catch {}` paths the v8 tool
won't credit synthetically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ion guard

Three small follow-ups from spec review:

- `endpoint.register.test.ts`: add a test for the missing branch
  `probe.status === 'verified' + force=true`. Asserts the resulting KA
  is identical to the `force=false + verified` case — i.e. `force` is
  ignored on a successful probe (ADR 0002).
- `kafka-probe.test.ts`: add a regression guard that spies on
  `Logger.prototype.{info,warn,error,debug}` and `console.{log,warn,
  error,debug}`, runs the probe with credentials supplied, and asserts
  no credential substring (username, password, CA/cert/key PEMs) ever
  reaches a logging primitive. The probe deliberately logs nothing
  today; this guards against a future contributor hooking a logger in
  and accidentally leaking credentials.
- `routes/kafka.ts`: add a TL;DR line to `shouldProbe`'s docstring
  making the PLAINTEXT opt-in rule explicit. Comment-only change.

No production behavior changes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous vitest config measured v8 coverage across the whole package, so
test helpers (`test/helpers/synthetic-producer.ts`, `kafka-container.ts`)
were pulled into the report and dragged measured coverage below the floor
when `DKG_KAFKA_INTEGRATION=0` and Docker is absent. Restrict coverage to
`src/**` (excluding the re-export barrel `src/index.ts`) and re-baseline
`kosavaKafkaCoverage` to the new actuals. The follow-up commit drives the
remaining uncovered branches up so the floor can be raised.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…/100

Add unit tests that cover the previously-uncovered code paths in `src/**`:
- `endpoint.ts:88`: `registerKafkaEndpoint` without an explicit `issuedAt`
  falls back to `new Date().toISOString()`.
- `kafka-probe.ts` `buildKafkaConfig` default arm: an unrecognized
  `securityProtocol` (cast through the type system, since real callers can
  never reach it) throws the defensive guard.
- `classifyError` named arms (`KafkaJSBrokerNotFound`,
  `KafkaJSNumberOfRetriesExceeded`, `KafkaJSRequestTimeoutError`,
  `KafkaJSConnectionClosedError`) plus the unknown-name fallback — strips
  the inner credential-bearing message to a stable class name.
- `buildSsl` with no `ssl` block at all → falls back to `{}` and still
  fires the mTLS guard.

With these tests the package reaches 100% lines/statements/functions and
97.36% branches (only two micro-defensive paths remain). Raise
`kosavaKafkaCoverage` to `lines: 100, statements: 100, functions: 100,
branches: 96` and document the leftover branches.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…to 400

The `probe()` JSDoc claimed "Does not throw" but the function does throw on
ill-formed input options (missing SASL creds, missing mTLS material,
unreadable PEM paths, unsupported `securityProtocol`). Tests already assert
this behaviour. Update the JSDoc to describe the real contract: structured
results for network/auth failures, throws only for ill-formed options.

Wrap the route's `kafkaProbe()` call in a try/catch so an ill-formed payload
produces HTTP 400 with a sanitized message instead of an uncaught 500. The
error strings emitted by the kafka package are already credential-free.

Resolves I1 and I5.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…r output

`ApiClient.registerKafkaEndpoint()` already declares `verificationStatus`
and `verifiedAt` in its return type, so the four `(result as any)` casts
were redundant. Use the typed properties directly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…Error

`KafkaEndpointProbeOutcome` now exposes the optional `error` string. The
route propagates `probeResult.error` into the outcome so the typed error
class carries it across the throw boundary, and the 422 response reads it
back from `err.outcome.error` instead of reaching outside the typed error
to grab a side-channel local. Eliminates the brittle implicit dependency
between the probe call and the catch block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`parseSasl` now requires non-empty `username` AND `password`; `parseSsl`
requires non-empty PEM / path strings. Empty-string fields collapse to
`undefined`, which propagates through `shouldProbe` so the registration
records `verificationStatus: "unattempted"` instead of firing a probe with
empty credentials and reporting a confusing kafkajs auth failure.

Export `parseSasl`, `parseSsl`, `parseSecurityProtocol`, `shouldProbe`, and
`KafkaEndpointRequestBody` so unit tests can pin the gate's behaviour
without standing up the daemon HTTP surface. Add `kafka-route-parsers.test.ts`
covering the empty-creds collapse for SASL_PLAINTEXT, SASL_SSL, and SSL
plus the positive paths.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…aming

Rename `KafkaProbeSslMaterial` → `KafkaSslMaterial` and
`KafkaProbeSaslCredentials` → `KafkaSaslCredentials` in the kafka package.
The types describe Kafka auth material in general (probe today, future
operations tomorrow), not anything probe-specific.

Use the imported types directly in the CLI:
- `packages/cli/src/daemon/routes/kafka.ts` — `KafkaEndpointRequestBody`
  and the `parse*` helpers consume the renamed types.
- `packages/cli/src/api-client.ts` — `securityProtocol` on the inline
  request shape now uses the imported `SecurityProtocol` union instead
  of duplicating it. The `sasl` / `ssl` shapes stay inline because they
  describe the wire format (`ca`/`cert`/`key`), which differs from the
  parsed `KafkaSslMaterial` (`caPem`/`certPem`/`keyPem`).
- `packages/cli/src/cli.ts` — drops the duplicated `securityProtocol`
  string-union cast in favor of the imported type.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Move the parsing/validation surface of `/api/kafka/endpoint` out of
`packages/cli/src/daemon/routes/kafka.ts` into a new module at
`packages/cli/src/daemon/parsers/kafka-request.ts`. The route file now
only carries route-handler logic, error translation, and HTTP glue.

Moved verbatim (no rename, no signature change):
- `isNonEmptyString`
- `VALID_PROTOCOLS`, `VALID_SASL_MECHANISMS`
- `parseSecurityProtocol`, `parseSasl`, `parseSsl`, `shouldProbe`
- `KafkaEndpointRequestBody`

The route imports the helpers back from the new module; the parser
test (`packages/cli/test/kafka-route-parsers.test.ts`) updates its
import path to match. No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The kafka package now exports `toKafkaEndpointProbeOutcome(result)`,
which converts a `ProbeResult` from `kafka-probe.ts` into the narrower
`KafkaEndpointProbeOutcome` shape that `registerKafkaEndpoint` consumes.
The route handler in `packages/cli/src/daemon/routes/kafka.ts` now calls
the adapter instead of inlining the deconstruction; the conditional
`error` carry stays inside the adapter and the `securityProtocol` echo
stays dropped (the route already passes it via
`RegisterKafkaEndpointInput.securityProtocol`).

Three unit tests cover the adapter: verified pass-through (no error),
error pass-through, and absent-error omission.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
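The adapter's contract can be sketched with hypothetical shapes (the real `ProbeResult` / `KafkaEndpointProbeOutcome` types are richer): carry `error` only when present, drop the `securityProtocol` echo, and keep `'error' in outcome` false for clean probes.

```typescript
type ProbeStatus = "verified" | "failed" | "unreachable";
interface ProbeResult { status: ProbeStatus; securityProtocol: string; error?: string }
interface KafkaEndpointProbeOutcome { status: ProbeStatus; error?: string }

// Illustrative adapter: the conditional spread (not `error: result.error`)
// is what keeps the key entirely absent on verified outcomes.
function toKafkaEndpointProbeOutcome(result: ProbeResult): KafkaEndpointProbeOutcome {
  return {
    status: result.status,
    ...(result.error !== undefined ? { error: result.error } : {}),
  };
}
```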
Add one-line JSDoc summaries to the kafka package's exported types and
functions that lacked one (`SecurityProtocol`, `KafkaSslMaterial`,
`KafkaSaslCredentials`, `KafkaProbeOptions`, `ProbeStatus`, `ProbeResult`,
`KafkaEndpointKnowledgeAsset`, `RegisterKafkaEndpointInput`,
`RegisterKafkaEndpointResult`, `registerKafkaEndpoint`,
`BuildKafkaEndpointKnowledgeAssetInput`, `buildKafkaEndpointKnowledgeAsset`,
`KafkaEndpointIdentity`, `buildKafkaEndpointUri`).

Mark the two module-scoped private types in `kafka-probe.ts`
(`RawProbeOutcome`, `SslConnectionOptions`) with `@internal` so the
TypeScript-convention "module scope is ambient public" doesn't promote
them.

Move the existing dependency-inversion JSDoc in `endpoint.ts` to sit
above `KafkaEndpointPublisher` (the symbol it actually describes) and
add a fresh summary above the type alias `KafkaEndpointKnowledgeAsset`.
No content was deleted.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add a comment above `DEFAULT_TIMEOUT_MS` describing it as the wall-clock
ceiling for the entire probe round-trip, and rewrite the existing
comment on `connectionTimeout`/`requestTimeout` to explain how the
inner kafkajs bounds split (TCP/TLS reach vs slow broker response) and
why their sum matches the outer ceiling. Cross-references go both
directions so a future maintainer tuning either value sees the
relationship without having to grep.

No behavior change.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
With `exactOptionalPropertyTypes: false`, `field: undefined` satisfies
`field?: T`. Conditional spreads `...(x ? { f: x } : {})` are noise; direct
assignment is equivalent and shorter. The single remaining conditional
spread in `toKafkaEndpointProbeOutcome` is contract-locked: its caller
test asserts `'error' in out === false` for verified outcomes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The named alias was used in exactly two adjacent locations inside the same
file. Replacing both with the inline structural type
`{ status: ProbeStatus; error?: string }` removes one named-type indirection
without changing any public surface.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Three single-line JSDoc summaries on `SecurityProtocol`, `KafkaSaslCredentials`,
and `ProbeStatus` told the reader nothing the symbol name + the literal-union
or field list did not already say. The substantive comments (file header,
credential invariants, throws contracts, `KafkaEndpointPublisher` boundary)
are retained.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Zvonimir and others added 3 commits May 5, 2026 00:17
`buildSsl` previously threw when invoked for `securityProtocol: 'SSL'`
without a client cert+key, forcing mTLS for every TLS probe. Real-world
SSL deployments are commonly server-cert-only (CA bundle in trust store,
no client cert/key); the probe could not reach those brokers at all.

Drop the `requireMtls` parameter from `buildSsl`. Both auth-mode arms
(`SSL`, `SASL_SSL`) now call `buildSsl(opts.ssl)` and pass through whatever
PEMs the caller supplied. CA-only inputs produce a one-way-TLS config;
CA+cert+key inputs produce an mTLS config. Brokers that require mTLS
reject the handshake on their own and the failure surfaces as a structured
`ProbeResult` — not a thrown exception — keeping the contract uniform.

Tests:
- "SSL without cert+key throws — mTLS material is required" replaced with
  "SSL (one-way TLS): CA-only succeeds; the kafkajs config carries the CA
  bundle and rejectUnauthorized".
- New mTLS test ("SSL (mTLS): cert + key flow into kafkajs config alongside
  CA") preserves the explicit-mTLS coverage.
- "SSL with no `ssl` block at all" updated: previously asserted the mTLS
  guard fired; now asserts the resulting kafkajs config is a TLS-only
  block with `rejectUnauthorized: true`.

KafkaProbeOptions / KafkaSslMaterial JSDoc rewritten to reflect that
client cert/key are optional in both `SSL` and `SASL_SSL` modes.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`parseSasl` and `parseSsl` previously returned `undefined` for any
malformed payload — wrong outer type, unknown SASL mechanism, empty
username/password, non-string PEM, non-boolean `rejectUnauthorized`. The
route handler couldn't distinguish "field absent" from "field present
and broken", so a bogus block silently dropped through and the resulting
KA recorded `verificationStatus: "unattempted"` instead of producing a
400. Worst case: a request with an unknown SASL mechanism registered an
unverified endpoint with no warning to the caller.

Tighten both parsers:

* Genuinely-absent (`null` / `undefined` / missing) → still returns
  `undefined`. `ssl: {}` (empty object) also returns `undefined`.
* Outer type wrong (string, array, primitive) → throws.
* SASL: missing or empty username/password → throws. Unknown mechanism
  → throws with the valid alternatives listed. Non-string mechanism →
  throws.
* SSL: any of `ca`/`cert`/`key`/`caPath`/`certPath`/`keyPath` present but
  not a non-empty string → throws naming the field. Non-boolean
  `rejectUnauthorized` → throws.

A new typed error class `KafkaRequestParseError` carries a sanitized
`publicMessage`. The route handler catches it and emits HTTP 400 with
the message in the body. Error messages name fields and (for unknown
mechanisms) list valid alternatives — they never echo credential values.

Tests:
- All "returns undefined for malformed input" assertions become
  "throws KafkaRequestParseError with a specific message".
- New cases: unknown mechanism, empty username, empty password, SSL with
  non-string ca/cert/key/path, SSL with non-boolean rejectUnauthorized,
  ssl: {} → undefined, genuinely-absent fields → undefined.
- A defence-in-depth test asserts the parser's error messages never
  contain the supplied credential value.

Total parser-test count: 17 → 25.

The route's try/catch is mechanical (one catch arm); no daemon-HTTP
test harness exists for kafka routes today, so the parser-level
contract test is the load-bearing surface. If a daemon-HTTP harness is
added later, asserting 400 → publicMessage on the route is a one-line
addition.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
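A sketch of the tightened gate. The class and rule names follow the commit text, but the shapes are illustrative; the key distinction is absent → `undefined`, present-but-broken → throw, with error messages that name fields and never echo values:

```typescript
class KafkaRequestParseError extends Error {
  constructor(readonly publicMessage: string) { super(publicMessage); }
}

const VALID_MECHANISMS = ["plain", "scram-sha-256", "scram-sha-512"];

// Hypothetical parseSasl: genuinely-absent returns undefined; anything
// present and malformed throws a sanitized, credential-free error.
function parseSasl(raw: unknown): { mechanism: string; username: string; password: string } | undefined {
  if (raw === null || raw === undefined) return undefined;
  if (typeof raw !== "object" || Array.isArray(raw))
    throw new KafkaRequestParseError("sasl must be an object");
  const { mechanism, username, password } = raw as Record<string, unknown>;
  if (typeof mechanism !== "string" || !VALID_MECHANISMS.includes(mechanism))
    throw new KafkaRequestParseError(`sasl.mechanism must be one of: ${VALID_MECHANISMS.join(", ")}`);
  if (typeof username !== "string" || username.length === 0)
    throw new KafkaRequestParseError("sasl.username must be a non-empty string");
  if (typeof password !== "string" || password.length === 0)
    throw new KafkaRequestParseError("sasl.password must be a non-empty string");
  return { mechanism, username, password };
}
```

The route catches `KafkaRequestParseError` and emits `publicMessage` in an HTTP 400 body, so the same sanitization rule protects both the thrown error and the wire response.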
`dkg kafka endpoint register` previously hardcoded `mechanism: 'plain'`
in the request body, even though the daemon parser and the kafkajs
admin client both already support `scram-sha-256` and `scram-sha-512`.
Brokers that mandate SCRAM (a common production posture) could not be
registered through the CLI at all.

Add a `--sasl-mechanism <mechanism>` option to the kafka register
command, defaulting to `plain` to preserve the existing behaviour for
callers that never set the flag. The action handler validates the
value against the same `{plain, scram-sha-256, scram-sha-512}` set the
daemon parser uses, throws a clear error on a bogus value, and threads
the mechanism through into the request body's `sasl.mechanism` field.

Tests:
- New: `--sasl-mechanism scram-sha-256` produces
  `body.sasl.mechanism === 'scram-sha-256'`.
- New: `--sasl-mechanism gibberish` exits non-zero with stderr that
  names the flag and lists the valid alternatives.

The api-client `registerKafkaEndpoint` request type already accepts
`mechanism` on `sasl`; no additional plumbing changes were required.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Zvonimir and others added 3 commits May 5, 2026 09:14
…gv credential input

`--password <pass>` exposes the SASL secret to shell history (`~/.zsh_history`
/ `~/.bash_history`), to `ps -ef` listings, and to any process that scrapes
argv. Add two safer paths: `--password-stdin` reads the first line of piped
stdin (recommended for CI), and `DKG_KAFKA_PASSWORD` is read from the
environment when neither `--password` nor `--password-stdin` is set.

Resolution priority is `--password-stdin` → `--password` → environment →
unresolved. `--password` and `--password-stdin` are mutually exclusive and
fail fast. Interactive masked prompts on TTY-attached stdin are out of scope
for this commit; `--password-stdin` with a TTY fails with a clear pointer to
the alternatives.

Help text on `--password` now warns about the shell-history exposure and
recommends the two safer alternatives.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
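The resolution priority can be sketched as a pure function. Flag and env-var names come from the commit; the function itself is hypothetical, and stdin reading is injected so the precedence logic stays testable:

```typescript
interface PasswordInputs { password?: string; passwordStdin?: boolean }

// Illustrative resolver: --password-stdin → --password → environment → unresolved.
function resolvePassword(
  opts: PasswordInputs,
  env: Record<string, string | undefined>,
  readStdinLine: () => string,
): string | undefined {
  if (opts.passwordStdin && opts.password !== undefined)
    throw new Error("--password and --password-stdin are mutually exclusive");
  if (opts.passwordStdin) return readStdinLine();          // recommended for CI
  if (opts.password !== undefined) return opts.password;   // shell-history exposure
  return env["DKG_KAFKA_PASSWORD"];                        // environment fallback
}
```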
Previously, the request body construction silently dropped the SASL block
whenever exactly one of `--username` / `--password` was missing
(`opts.username && opts.password ? sasl : {}`). The caller intended to
authenticate, but the daemon registered the KA as
`verificationStatus: "unattempted"` — a quiet acceptance of a misconfig
that surfaced only at runtime.

Add three layered validations between password resolution and request
body composition:

  1. Partial-pair: exactly one of username / resolved-password supplied.
     Error names every input that could have set the credential, including
     `--password-stdin` and `DKG_KAFKA_PASSWORD`.
  2. SASL protocol without credentials: `--security-protocol`
     `SASL_PLAINTEXT` / `SASL_SSL` requires both.
  3. Non-SASL protocol with credentials: `PLAINTEXT` / `SSL` may not
     carry SASL inputs.

The `--password-stdin` empty-stdin case now fails through the partial-pair
check (username present, password resolved to undefined) instead of
silently dropping the SASL block.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…nt with buildSsl)

The route-level `shouldProbe` gate refused to probe SSL endpoints unless
the request supplied both a client cert AND key. That contract is
inconsistent with `buildSsl` in `@origintrail-official/dkg-kafka`, which
since 6f76df7 accepts SSL with mTLS material, with a CA-only bundle, or
with no SSL block at all (default trust store).

Treat `securityProtocol: 'SSL'` like `'PLAINTEXT'`: setting the protocol is
the explicit opt-in to verification — the probe runs in all three SSL
shapes. SASL_PLAINTEXT / SASL_SSL still gate on `sasl.username`+`password`
(those credentials are what the probe is verifying).

JSDoc on `shouldProbe` rewritten to spell out which inputs each protocol
needs, since the contract is the source of truth callers will read first.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
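The relaxed gate reads roughly as follows (illustrative shapes; the real `shouldProbe` lives in the daemon's kafka request parser): declaring the protocol is the opt-in for `PLAINTEXT` and all three `SSL` shapes, while the SASL modes still gate on the credentials the probe is verifying.

```typescript
type SecurityProtocol = "PLAINTEXT" | "SASL_PLAINTEXT" | "SASL_SSL" | "SSL";

// Hypothetical condensation of the shouldProbe contract after this commit.
function shouldProbe(
  securityProtocol: SecurityProtocol | undefined,
  sasl: { username: string; password: string } | undefined,
): boolean {
  if (securityProtocol === undefined) return false;  // slice-01 wire compat: no probe
  if (securityProtocol === "PLAINTEXT" || securityProtocol === "SSL")
    return true;                                     // protocol declaration is the opt-in
  return sasl !== undefined;                         // SASL modes need the creds under test
}
```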
Zvonimir added 4 commits May 5, 2026 09:52
…route

Round-1 closed `parseSasl`/`parseSsl` silent downgrade. Round-2 closed CLI
silent SASL drop. But the daemon route still parsed each field in isolation
— a direct HTTP POST `{ securityProtocol: 'SASL_SSL', ... }` with no `sasl`
block (or `PLAINTEXT` + `sasl`) slipped through and registered as
`verificationStatus: "unattempted"` instead of HTTP 400. CLI fail-fast does
not help non-CLI clients.

Add `validateKafkaAuthConsistency` to the request parser module and call it
from the route after the per-field parsers have populated `reqBody`. The
route's existing `KafkaRequestParseError` translation surfaces the mismatch
as 400. Slice-01 wire compat preserved: omitting `securityProtocol` entirely
still skips the probe and records `unattempted`.
…not "unreachable"

kafkajs throws `KafkaJSSASLAuthenticationError` (and the parent
`KafkaJSAuthenticationError`) from `Admin#connect()` when credentials are
wrong. The previous catch arm lumped every connect-time failure as
`unreachable`, lying about the failure mode and steering operators towards
network debugging instead of credential debugging.

Classify the connect-time error and return `failed` for auth-class names,
keeping `unreachable` for everything else (network/transport, DNS,
unidentifiable errors). Update the existing assertion that expected
`unreachable` for SASL auth and add explicit guards for the auth-parent
class and the unrelated-error default arm.
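A sketch of the classification. The error-class names are real kafkajs error names; the function shape is illustrative:

```typescript
// Auth-class failures mean the broker answered and rejected the creds,
// so they map to "failed"; everything else (network, DNS, unknown)
// stays "unreachable".
const AUTH_ERROR_NAMES = new Set([
  "KafkaJSSASLAuthenticationError",
  "KafkaJSAuthenticationError",
]);

function classifyConnectError(err: { name: string }): "failed" | "unreachable" {
  return AUTH_ERROR_NAMES.has(err.name) ? "failed" : "unreachable";
}
```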
Supplying half of an mTLS pair (cert without key, or key without cert) is a
LOCAL input error: the caller intended mTLS but only supplied one half. The
previous behaviour silently passed it through to kafkajs, which failed later
with a vague handshake error mapped by the route to HTTP 422 (probe
failure). The correct response is HTTP 400 (input validation).

Add an XOR check in `buildSsl` after the PEMs are loaded. The throw
propagates up to `probe()`, then up to the route's `kafkaProbe()` catch arm
which already maps thrown probe errors to 400. CA-only one-way TLS, cert+key
mTLS, and no-ssl-block default-trust-store paths all continue to pass.
Update the JSDoc on `KafkaSslMaterial` to document the rule.
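Combining this with the earlier one-way-TLS relaxation, `buildSsl` ends up with three accepted shapes and one rejected one. A sketch under illustrative shapes (not the real export):

```typescript
interface SslMaterial { caPem?: string; certPem?: string; keyPem?: string }

// Hypothetical buildSsl after both SSL commits: CA-only → one-way TLS,
// cert+key → mTLS, no block → default trust store, half an mTLS pair →
// local input error (thrown, so the route can map it to HTTP 400).
function buildSsl(material?: SslMaterial) {
  const m = material ?? {};
  const hasCert = m.certPem !== undefined;
  const hasKey = m.keyPem !== undefined;
  if (hasCert !== hasKey)
    throw new Error("mTLS requires both a client cert and a key; got only one");
  return {
    rejectUnauthorized: true,
    ...(m.caPem !== undefined ? { ca: [m.caPem] } : {}),
    ...(hasCert ? { cert: m.certPem, key: m.keyPem } : {}),
  };
}
```

Brokers that mandate mTLS against a CA-only config still reject the handshake themselves, which surfaces as a structured `ProbeResult` rather than a throw.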
…nses

Round-2 closed CLI silent SASL drop, and Fix 2 of round-3 stopped
mis-classifying SASL auth failures as `unreachable`. But the user-facing
422 failure still landed as a single line of generic "pass force=true to
register anyway" — the actual `probeStatus` (failed / unreachable) and
`probeError` (kafkajs error class) lived on the response body and were
never surfaced. Operators debugging an auth or topic miss had to read
daemon logs to learn which mode failed.

Two changes:

1. The route's 422 response now also emits `probeStatus` at the top level
   (alongside the pre-existing `probeError`). The nested `probe` block is
   retained for backwards compat.

2. The CLI's `kafka endpoint register` catch handler reads `probeStatus`
   and `probeError` off `responseBody` and prints them on stderr after
   the top-level error message. Sub-second change for the user; no
   credential leak (the route already strips creds).
…route

Tighten the no-protocol branch of validateKafkaAuthConsistency: requests
that supply a sasl or ssl block without declaring securityProtocol are
now rejected at the gate. Previously the route accepted them, then
shouldProbe returned false and the KA was registered as `unattempted` —
silently dropping the auth payload the caller sent. Same silent-downgrade
pattern this slice has been closing on the protocol-declared paths.

Slice-01 wire compat is preserved: requests with no protocol AND no
auth/TLS material still pass through cleanly.
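The cross-field gate after this final tightening can be sketched like so (illustrative; the real `validateKafkaAuthConsistency` lives in the daemon's kafka request parser and checks more fields):

```typescript
type SecurityProtocol = "PLAINTEXT" | "SASL_PLAINTEXT" | "SASL_SSL" | "SSL";

// Hypothetical consistency check run after the per-field parsers.
function validateKafkaAuthConsistency(body: {
  securityProtocol?: SecurityProtocol;
  sasl?: object;
  ssl?: object;
}): void {
  if (body.securityProtocol === undefined) {
    // Auth material without a declared protocol would be silently dropped
    // downstream, so reject it; bare slice-01 requests pass through.
    if (body.sasl !== undefined || body.ssl !== undefined)
      throw new Error("sasl/ssl supplied without securityProtocol");
    return;
  }
  const isSasl =
    body.securityProtocol === "SASL_PLAINTEXT" || body.securityProtocol === "SASL_SSL";
  if (isSasl && body.sasl === undefined)
    throw new Error(`${body.securityProtocol} requires a sasl block`);
  if (!isSasl && body.sasl !== undefined)
    throw new Error(`${body.securityProtocol} may not carry a sasl block`);
}
```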