Skip to content

[MCP] foundation: component scaffold + config schema + boot gating#649

Merged
kylebernhardy merged 5 commits into
mainfrom
feat/mcp-foundation
May 23, 2026
Merged

[MCP] foundation: component scaffold + config schema + boot gating#649
kylebernhardy merged 5 commits into
mainfrom
feat/mcp-foundation

Conversation

@kylebernhardy
Copy link
Copy Markdown
Member

@kylebernhardy kylebernhardy commented May 20, 2026

Summary

Lands the foundation for the native MCP server design (#465) — the empty MCP component, the mcp: config block, and a presence-gated registration hook in the operations server. Plumbing only. No JSON-RPC, no transport, no tools, no resources yet.

Closes #613
Tracking: #465 (sub-issue #1 of 11)

What this PR lands

  • New components/mcp/index.ts exporting registerMcpProfile({ profile, host, config, routeOptions }) — supports both operations and application profiles via one entry point.
  • Joi schema in validation/configValidator.ts covering both profiles and mcp.session. No enabled flag — profile presence in config gates registration (matches Harper's replication convention).
  • static/defaultConfig.yaml intentionally omits the mcp: block (opt-in via block presence; same as replication). Existing deployments see zero behavior change on upgrade.
  • Config keys in utility/hdbTerms.ts (MCP_* flat names; reserved for env-var overrides via HARPER_MCP_OPERATIONS_MOUNTPATH=… etc., not used by the runtime gate itself).
  • Presence-gated call site in server/operationsServer.ts (after registerContentHandlers, before /health). Uses getConfigObj() to read the merged config tree and gates on fullConfig.mcp?.operations. Auth wired via routeOptions: { preValidation: [authHandler] }.
  • Unit tests: 6 for registerMcpProfile + stub handler (node:assert/strict, no sinon/rewire); 5 in configValidator.test.js for parse-clean / defaults / full-block / bad-types.

What's still off / gated

This trade-off is called out explicitly on #465; the acceptance criterion "registration hook usable by both servers" is satisfied — the hook is in place, only the application boot invocation moves.

Where to put attention

  • server/operationsServer.ts — the gate uses getConfigObj() and if (fullConfig.mcp?.operations). Block-level presence is intentional because Joi .default(...) values under mcp.* are not propagated back into env.get (config/configUtils.js:483-490 only re-applies six specific defaults; flagged as a Harper-wide cleanup, not blocking).
  • components/mcp/index.ts:FastifyLike — the loose post: (path: string, ...rest: unknown[]) => unknown shape avoids pulling Fastify types into the component (per EVP review point Add some unit tests #9 on Expose Operations (API) through an MCP Wrapper #465 — no new Fastify API surface).
  • validation/configValidator.ts — the application profile inherits from the operations schema via .keys({ searchMaxResults: … }) to keep the shape declarative.

Verification

npm run build      # my files emit clean; pre-existing TS errors in resources/* are not from this PR
npm run lint:required
npx mocha unitTests/components/mcp/index.test.js unitTests/validation/configValidator.test.js
# Expected: 27 passing

# Defaults parse cleanly against the live YAML (mcp block absent → mcp undefined)
node -e "
const yaml = require('js-yaml'); const fs = require('fs');
const cfg = yaml.load(fs.readFileSync('static/defaultConfig.yaml','utf8'));
cfg.rootPath = '/tmp';
const { configValidator } = require('./dist/validation/configValidator.js');
const r = configValidator(cfg, true);
console.log(r.error ? r.error.message : 'OK');
"

# Manual smoke (with a local Harper running, mcp: { operations: {} } added to config):
curl -sS -X POST http://localhost:9925/mcp -d '{}' -i | head -5
# HTTP/1.1 503 Service Unavailable
# Retry-After: 0
# Content-Type: application/json
# {"error":"mcp_not_implemented","profile":"operations"}

Test plan

  • Targeted unit tests pass locally (27 passing).
  • Build clean for files in this PR.
  • Lint clean.
  • CI green across the full matrix (Node 20/22/24, Bun, Windows shards).
  • Manual smoke: mcp: absent → 404; mcp.operations: {} → 503 with the documented body.
  • Regression: existing operations API unchanged when mcp: is absent.

Notes

  • Cross-model review (Gemini 0.42.0): completed. Surfaced a critical truthiness bug (env-sourced 'false' string was truthy in the old enabled-based gate) plus a defense-in-depth gap (stub /mcp lacked authHandler preValidation). Both fixed before the convention switch — see the disposition comment thread.
  • Convention switch mid-review: following team feedback, dropped the enabled field entirely; profiles are presence-gated now. Design doc on Expose Operations (API) through an MCP Wrapper #465 updated to match.
  • Self-audit against Harper config patterns: caught a real bug where Joi defaults under mcp don't propagate back to env.get (configUtils.js:483-490 re-applies only six specific defaults). Switched to getConfigObj() + block-presence check; see the dedicated comment thread.
  • Generated by an AI agent (Claude Opus 4.7) following the Harper Engineering Process & Guidelines lifecycle.

Lands the empty MCP component, the mcp: config block, and a config-gated
registration hook in the operations server. Until #614 ships the
Streamable HTTP transport, flipping mcp.operations.enabled: true serves a
503 with `{error:"mcp_not_implemented", profile:"operations"}` from the
configured mountPath — enough to prove the gate end-to-end.

The application-profile call site is intentionally deferred to #614,
where the transport will register itself via server.http(...). The
shared registerMcpProfile() supports both profiles already (verified by
tests).

Defaults: mcp.{operations,application}.enabled: false. Existing
deployments see no change on upgrade.

Closes #613
Tracking: #465

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@claude
Copy link
Copy Markdown
Contributor

claude Bot commented May 20, 2026

Reviewed; no blockers found.

Gemini cross-model review surfaced two issues:

1. **Truthiness regression.** `env.get(MCP_OPERATIONS_ENABLED)` can return
   the literal string `'false'` when sourced from an environment variable,
   which is truthy in JS — `mcp.operations.enabled: false` would still
   register the route. Switched to the in-file pattern used for
   `studioOn` (commonUtils.isEmpty + toString().toLowerCase() === 'true').
2. **Auth missing on stub route.** The /mcp stub had no preValidation
   while the rest of the operations port uses authHandler. Pre-staging
   the design's auth-first stance from #465: registerMcpProfile now
   accepts optional `routeOptions` (forwarded as Fastify's 3-arg post),
   and the operations call site passes `{preValidation:[authHandler]}`.

Also removed the redundant `?? '/mcp'` fallback at the call site —
registerMcpProfile already defaults the mount path.

New tests cover routeOptions forwarding and the strict-boolean contract.

Refs #613.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@kylebernhardy
Copy link
Copy Markdown
Member Author

Cross-model review (Gemini 0.42.0) — applied

Gemini surfaced one critical bug + two improvements. Pushed as a fix commit on top of the foundation.

Critical: env-sourced 'false' string is truthy

env.get(MCP_OPERATIONS_ENABLED) can return the literal string 'false' from environment-variable overrides, which would have falsely enabled the route. Fixed by switching to the in-file pattern already used for studioOn:

const v = env.get(...MCP_OPERATIONS_ENABLED);
if (!commonUtils.isEmpty(v) && v.toString().toLowerCase() === 'true') { ... }

Defense in depth: stub /mcp had no auth

The 503 stub was registered without preValidation, while every other operations-port POST uses authHandler. Fixed by extending registerMcpProfile to accept an optional routeOptions (forwarded as Fastify's 3-arg post(path, opts, handler)), and the operations call site now passes {preValidation:[authHandler]}. Unauth'd POST /mcp will now return 401, matching the design's auth-first stance from #465.

Minor: dropped redundant ?? '/mcp' fallback

registerMcpProfile already defaults the mount path; the call-site fallback was duplication.

Other findings (acknowledged, not applied)

  • Gemini noted three places that hold /mcp (the component constant, the YAML, the env override). The constant is the runtime fallback; the YAML is human docs; the env override path is now removed. Two places left — both necessary.
  • Gemini suggested a "cleaner" config passing scheme that passes the merged config sub-tree to the component instead of synthesizing it from env.get. Deferred — Harper's flatten/inflate config pattern doesn't expose sub-trees from env cheaply, and the component's unit-test surface benefits from the current explicit config arg.

Tests now 28 passing (was 26 — two new regressions pinned: routeOptions forwarding and the strict-boolean enable contract).

Copy link
Copy Markdown
Member

@kriszyp kriszyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good foundation. I generally have a preference towards cleaner and simpler default config files that are easy for users to get started with, but if you feel like this is helpful, that's fine.

Comment thread server/operationsServer.ts Outdated
config: {
mcp: {
operations: {
enabled: true,
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern has always seemed a little verbose to me. Normally, we include the components/functionality in the config that we want enabled. Here we include operations to say that we maybe do or don't really want to include operations (depending on if it is enabled)? Rather than just including or omitting it? I guess maybe it is more explicit? IMHO, it seems like omitting operations from mcp is a pretty good way of omitting operations from MCP.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good feedback, will adjust.

Comment thread static/defaultConfig.yaml Outdated
hostname: null
mcp:
operations:
enabled: false
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we are adding a bunch of default stuff to the config that isn't enabled? (my concern is there is already a lot to read in there that can confuse people)

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it, the preference is if enabled: false we do not add the remaining config? I sere the challenge that when the feature is turned on the admin would need to read docs to know how to config. Not arguing, just calling that out.

@kylebernhardy kylebernhardy marked this pull request as ready for review May 20, 2026 19:05
@kylebernhardy kylebernhardy requested a review from a team as a code owner May 20, 2026 19:05
@kylebernhardy
Copy link
Copy Markdown
Member Author

I generally have a preference towards cleaner and simpler default config files.

Please let me know how you would like this tightened up outside of the:

operations:
    enabled: true

and

application:
    enabled: true

Instead there will be no enabled: true, if it exists we use it.

Per Harper convention (matches `replication`): a profile is enabled iff
its config sub-block is present, not via an `enabled: true/false` flag.

- Drop `mcp.<profile>.enabled` from the Joi schema.
- Drop the `mcp:` block from static/defaultConfig.yaml entirely — opt-in.
- Drop `MCP_OPERATIONS_ENABLED` / `MCP_APPLICATION_ENABLED` config keys.
- Component: skip registration when `config.mcp.<profile>` is absent.
- Operations call site: gate on presence of `MCP_OPERATIONS_MOUNTPATH`
  (Joi always defaults `mountPath` to `/mcp` when the block is set).

Backward compatibility: existing deployments have no `mcp:` block, so
this preserves the "MCP off by default" behavior — same outcome as the
prior `enabled: false` defaults, fewer keys to maintain.

Refs #613.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@kylebernhardy
Copy link
Copy Markdown
Member Author

Refactor: presence-based enablement (drop enabled field)

Per Harper convention, a config block's presence should determine enablement (matches replication — there's no replication.enabled, it's just on if the block is configured).

Changes (commit 54139bb):

  • Removed mcp.<profile>.enabled from the Joi schema.
  • Removed the entire mcp: block from static/defaultConfig.yaml (it's opt-in now — users add the block to enable).
  • Removed MCP_OPERATIONS_ENABLED / MCP_APPLICATION_ENABLED from hdbTerms.ts.
  • Component now registers when config.mcp.<profile> is present (any value), not when .enabled: true.
  • Operations call site now gates on the presence of MCP_OPERATIONS_MOUNTPATH (Joi always defaults mountPath to /mcp when the block is configured, so this is a reliable presence sentinel).

Behavior preserved: existing deployments have no mcp: block in their config, so MCP stays off — same outcome as the prior enabled: false defaults, with fewer keys to maintain.

Net diff: -54 lines across 7 files; 27 tests passing locally (replaced the enabled-based tests with presence-based equivalents).

Follow-up: the umbrella design on #465 still references enabled; I'll edit that issue body separately so the design doc stays consistent with the as-built convention.

Audit against Harper's config patterns surfaced a real bug: only six
hardcoded Joi defaults are re-applied to configDoc in
`config/configUtils.js:483-490`, so the Joi `.default('/mcp')` on
`mcp.operations.mountPath` never reaches `env.get(...)`. The previous
gate `env.get(MCP_OPERATIONS_MOUNTPATH)` would silently return undefined
when a user wrote `mcp: { operations: {} }`, leaving MCP off despite
configuration.

Switched the call site to:
- Import `getConfigObj` from `config/configUtils.js` (matches the
  existing pattern in `bin/run.ts:355`).
- Gate on block presence: `if (fullConfig.mcp?.operations)`.
- Pass the full merged config object through to `registerMcpProfile`
  instead of synthesizing a fake `{mcp:{...}}` from individual env keys.

Drops the boolean-string-coercion shim and the misleading "mountPath as
presence sentinel" comment — the component already falls back to
`DEFAULT_MOUNT_PATH = '/mcp'` when mountPath is absent on the nested
config.

Refs #613.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@kylebernhardy
Copy link
Copy Markdown
Member Author

Config-pattern audit + fix (commit d5c8ea3)

Audited the foundation against Harper's existing config conventions (read config/configUtils.js, validation/configValidator.ts, bin/run.ts:355, server/loadRootComponents.js, etc.).

Real bug found and fixed

config/configUtils.js:483-490 only re-applies six specific Joi defaults back into configDoc after validation (threads, componentsRoot, logging.root, storage.path, logging.rotation.path, operationsApi.network.domainSocket). Every other .default(...) lives only on validation.value and is discarded. env.get(...) reads from the flattened configDoc.toJSON() output — so:

  • User writes mcp: { operations: {} } in their config
  • Joi computes validation.value.mcp.operations.mountPath = '/mcp'
  • That default is never propagated to configDoc
  • env.get(MCP_OPERATIONS_MOUNTPATH) returns undefined
  • My prior gate short-circuits → MCP silently not registered despite explicit user opt-in

Fix: switched the gate to call getConfigObj() (existing accessor at config/configUtils.js:1010, already used at bin/run.ts:355) and check fullConfig.mcp?.operations for block presence. Drops the boolean-string-coercion shim and the synthesized-config construction. Passes the merged config through to registerMcpProfile directly.

Stylistic cleanups also addressed

  • No more synthesizing {mcp:{operations:{mountPath:...}}} from individual env keys (was the only such pattern in server code).
  • Removed the misleading "mountPath as presence sentinel" comment.

Alignments verified correct, no change

  • No enabled flag — only 2 of ~15 top-level config blocks use one (localStudio, logging.rotation). Aligned with replication.
  • mcp: omitted from static/defaultConfig.yaml — matches replication's Pattern-C convention.
  • Explicit config arg on registerMcpProfile — better unit-test surface than DI through scope.options.
  • Joi sub-schema shape, auth via routeOptions.preValidation — match existing patterns.

Out of scope (Harper-wide, flagged for follow-up)

The Joi-default propagation gap (config/configUtils.js:483-490 is an explicit allow-list of six keys) bites any feature that adds Joi defaults under a new top-level block. Could be a separate issue if it stings other future work — a generalized "copy all defaulted values from validation.value back to configDoc" pass would fix it Harper-wide. Not appropriate for this foundation PR.

Verification

  • 27 tests passing locally; no test changes needed (the component contract didn't change — tests already pass nested config objects directly).
  • Manual smoke would now correctly register MCP for mcp: { operations: {} } — the case that was silently broken before.

@kylebernhardy
Copy link
Copy Markdown
Member Author

CI flake on Node.js v24 unit shard — re-run triggered

One unit test failed on Node 24 only (Node 20 and 22 passed). The failure is in unitTests/resources/query.test.js:1183 (Querying through Resource API → Query optimizations → Uses both indices for two kinda similar conditions), with AssertionError: 9 == 10.

Not caused by this PR. This branch's diff is 6 files (components/mcp/, server/operationsServer.ts, config validator, hdbTerms.ts); none touch resource querying, indices, or the SchemaChanges table referenced in the log.

Likely cause: timing flake. The log shows [error]: Transaction was open too long and has been committed, from table: SchemaChanges/ warnings firing just before the failed assertion, and the off-by-one (9 == 10 records) is the kind of count drift that happens when a transaction commit narrowly misses a flush boundary. Node 24's slightly different scheduling likely tipped it over.

Re-ran the failed job (run 26185818115). If it goes green on retry, the flake is confirmed. If it fails again, I'll dig deeper.

heskew added a commit that referenced this pull request May 21, 2026
First real `ModelBackend` against the Ollama HTTP API: `embed` via
`/api/embed`, `generate` via `/api/generate` or `/api/chat`, and
`generateStream` via NDJSON over chunked HTTP.

Backend lives under `components/<name>/` matching the pattern from the
MCP foundation (PR #649) — core imports `registerOllamaBackend` and calls
it during boot; the file is not a `handleApplication(scope)` self-loader.

Capabilities advertise `tools: false` and `adapters: false`. Ollama tool
support exists on some models but is uneven across the catalog; the v1
portability guarantee keeps it off here.

Validates the Phase 1 (#628 / PR #638) `ModelBackend` interface against a
non-trivial real provider without external dependencies in CI.

Tracking: #629, #510

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@kriszyp kriszyp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thank you! Sorry for being nitpicky!

# Conflicts:
#	unitTests/validation/configValidator.test.js
#	validation/configValidator.ts
@kylebernhardy
Copy link
Copy Markdown
Member Author

CI flake on merge commit e5d7ee383 — re-running the failed shard now.

Job: Integration Tests 4/4 (Bun) failed on a single test in integrationTests/apiTests/terminology.test.mjs:490 (csv_data_load without database starts job) — the csv_data_load job stayed IN_PROGRESS past 30s instead of completing. 86 of 87 tests in that shard passed.

That test file is new on main (landed in the models-registry stack, commits f2bc4b2956769ff9b). This PR doesn't touch it and has no csv_data_load code path, so this is a timing flake on a recently-introduced integration test, not a regression from the merge.

Other 34 checks: ✅. Rerunning just the failed Bun shard.

Copy link
Copy Markdown
Member

@cb1kenobi cb1kenobi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is exciting!

@kylebernhardy kylebernhardy merged commit 3913c15 into main May 23, 2026
62 of 63 checks passed
@kylebernhardy kylebernhardy deleted the feat/mcp-foundation branch May 23, 2026 17:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[MCP] Component scaffold + config schema + boot gating

4 participants