[DSPX-3302] (3/5) otdf-local multi-instance refactor #452

Draft
dmihalcik-virtru wants to merge 18 commits into DSPX-3302-02-platform-installer from DSPX-3302-03-multi-instance

Conversation

@dmihalcik-virtru dmihalcik-virtru commented May 15, 2026

Summary

Third PR in the five-part stack. Refactors otdf-local from a single-instance CLI to a multi-instance harness. Each named instance under tests/instances/<name>/ owns its own opentdf.yaml, keys, KAS configs, and port range, and references platform binaries managed by otdf-sdk-mgr (PR 2).

Highlights

  • Settings gains instance_name, instance_dir, instances_root, platform_binary_for(dist). Per-instance paths kick in whenever instance.yaml exists; legacy single-instance behavior is preserved when it doesn't.
  • Ports parameterize on instance.ports.base via a new KAS_OFFSETS table. Two instances on different bases coexist.
  • PlatformService / KASService use the pinned xtest/platform/dist/<dist>/service binary when an instance is loaded; otherwise the legacy go run ./service path runs unchanged. KAS features (ec_tdf_enabled, etc.) come from instance.yaml's kas.<name>.features.
  • KASManager restricts the managed set to KAS names listed in the manifest (subset topologies work).
  • utils.keys.setup_golden_keys writes keys into the target dir and emits absolute paths so the binary finds them regardless of cwd.
  • New CLI surface:
    • Top-level --instance NAME
    • otdf-local instance init <name> [--from-scenario PATH] [--ports-base N] [--platform DIST]
    • otdf-local instance ls --json, otdf-local instance rm <name> -y
    • otdf-local scenario run <path> (translates suite block to pytest args)
  • uv workspace dep: otdf-local/pyproject.toml declares otdf-sdk-mgr via [tool.uv.sources].
  • .gitignore: /instances/, xtest/scenarios/*.installed.json, .claude/tmp/.
  • Tests: 5 new in test_multi_instance.py covering port arithmetic, settings round-trip with/without an instance, and binary resolution.
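
A minimal sketch of the port arithmetic described in the Ports bullet, not the PR's actual code. The three offsets are the ones named in this PR's commit message (alpha=+101, beta=+202, km2=+606); the entries between them are elided there and are left out here rather than guessed. The 8080 default follows the "legacy 8080-based constants" wording, and the second base of 9080 is an assumption chosen only for illustration.

```python
# Offsets named in the commit message; intermediate entries are elided
# in the source, so they are deliberately omitted here.
KAS_OFFSETS: dict[str, int] = {"alpha": 101, "beta": 202, "km2": 606}

DEFAULT_BASE = 8080  # legacy base, per the commit message


def get_kas_port(name: str, base: int = DEFAULT_BASE) -> int:
    """Return the port for a named KAS relative to an instance's base port."""
    try:
        return base + KAS_OFFSETS[name]
    except KeyError:
        raise ValueError(f"unknown KAS name: {name!r}") from None


# Two instances on different bases (9080 is a hypothetical second base).
legacy = {name: get_kas_port(name) for name in KAS_OFFSETS}
second = {name: get_kas_port(name, base=9080) for name in KAS_OFFSETS}

# No port is shared between the two instances.
assert not set(legacy.values()) & set(second.values())
```

Because every KAS port is derived from the base, two instances stay disjoint as long as their bases are far enough apart that the offset ranges do not overlap.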

Recent Cherry Picks from DSPX-3302-05-claude-plugin

  • Self-provision keys + opentdf.yaml at instance init: Instance init now auto-generates Keycloak TLS certificates, JKS truststore, and copies opentdf-dev.yaml with a fresh per-instance root key, eliminating manual setup steps.
  • Fix scenario suite pytest argv translation: Corrects _build_pytest_args to read from the actual Pydantic model (targets: list[str], containers: list[ContainerKind]) instead of hardcoded attribute names. Adds comprehensive unit tests.
  • Style: Apply ruff format.

Backward compatibility

uv run otdf-local up without --instance still works against a sibling platform/ checkout. Migration to multi-instance is opt-in via instance init.

Stack

  1. (base) Shared schema — chore(xtest): Shared Scenario/Instance Pydantic schema in otdf-sdk-mgr #450
  2. (base) Platform installer + install scenario — feat(xtest): Lets otdf-sdk-mgr manage platform too #451
  3. This PR — otdf-local multi-instance refactor
  4. xtest/conftest.py integration
  5. Claude plugin

Test plan

  • cd otdf-local && uv run pytest tests/ -m 'not integration' → 27 passing (20 existing + 5 new + 2 integration kept)
  • uv run otdf-local instance init demo --from-scenario <path> → directory layout correct
  • uv run otdf-local instance ls --json → enumerates instance
  • uv run otdf-local --instance demo instance ls → --instance flag threads through

Jira: https://virtru.atlassian.net/browse/DSPX-3302

🤖 Generated with Claude Code

Stack (a60d3302):

Generated by wgo stack. Edit text above or below this block, not inside it.

coderabbitai Bot commented May 15, 2026

Important

Review skipped

Draft detected.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: b5c06671-d23f-44d7-b7ca-b7f4f4de651a

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements a multi-instance refactor for the otdf-local CLI, enabling the management of isolated test environments. It introduces new instance and scenario subcommands, updates the configuration system to be instance-aware, and integrates with otdf-sdk-mgr for binary management. Service launchers for KAS and the platform now support per-instance port offsets and directory structures. Review feedback highlights a potential TypeError in KAS feature handling and suggests a more direct approach for updating Pydantic model metadata.

# Per-KAS features from instance.yaml override the legacy heuristic.
instance = self.settings.load_instance()
kas_pin = instance.kas.get(self._kas_name) if instance is not None else None
extra_features: dict[str, bool] = dict(kas_pin.features) if kas_pin is not None else {}

high

The dict(kas_pin.features) call will raise a TypeError if kas_pin.features is None. Since features are typically optional in the configuration schema, this should be handled defensively to avoid crashing when no features are specified for a KAS instance.

Suggested change
extra_features: dict[str, bool] = dict(kas_pin.features) if kas_pin is not None else {}
extra_features: dict[str, bool] = dict(kas_pin.features or {}) if kas_pin is not None else {}
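
The failure mode in this comment can be reproduced in isolation. The sketch below assumes, as the reviewer does, that `features` may be `None`; `KasPinSketch` is a hypothetical stand-in for the schema's `KasPin`, not the real model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class KasPinSketch:
    """Hypothetical stand-in for KasPin; only `features` is modeled."""

    features: Optional[dict[str, bool]] = None


def extra_features_for(kas_pin: Optional[KasPinSketch]) -> dict[str, bool]:
    # `or {}` covers a pin whose features is None; the outer conditional
    # covers a KAS that has no pin at all.
    return dict(kas_pin.features or {}) if kas_pin is not None else {}


# The unguarded form fails exactly as described in the review.
try:
    dict(None)  # type: ignore[call-overload]
    unguarded_raises = False
except TypeError:
    unguarded_raises = True
```

If the schema declares `features` with a default empty dict and rejects `None`, the guard is redundant but harmless.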

else:
raise typer.BadParameter(f"{scenario_path} has unknown kind {kind!r}")
# Ensure the metadata name matches the chosen directory name.
instance.metadata = Metadata(**{**instance.metadata.model_dump(exclude_none=True), "name": name})

medium

Updating the metadata name by dumping and re-creating the entire Metadata object is unnecessarily complex and inefficient. Since Pydantic models are mutable by default, you can update the field directly on the existing object.

Suggested change
instance.metadata = Metadata(**{**instance.metadata.model_dump(exclude_none=True), "name": name})
instance.metadata.name = name

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request implements a multi-instance architecture for the otdf-local CLI, allowing for the management and execution of isolated test environments. Key updates include new subcommands for instance and scenario handling, offset-based port allocation, and instance-specific directory structures for logs and configurations. Feedback from the review suggests several improvements: adding a null check for KAS features to avoid runtime errors, using Pydantic's model_copy for cleaner metadata updates, adopting shlex.join for safer command display, and adding missing type hints to enhance code maintainability.

# Per-KAS features from instance.yaml override the legacy heuristic.
instance = self.settings.load_instance()
kas_pin = instance.kas.get(self._kas_name) if instance is not None else None
extra_features: dict[str, bool] = dict(kas_pin.features) if kas_pin is not None else {}

high

If kas_pin.features is None in the instance configuration, calling dict() on it will raise a TypeError: 'NoneType' object is not iterable. You should provide a default empty dictionary or add a check.

Suggested change
extra_features: dict[str, bool] = dict(kas_pin.features) if kas_pin is not None else {}
extra_features: dict[str, bool] = dict(kas_pin.features or {}) if kas_pin is not None else {}

else:
raise typer.BadParameter(f"{scenario_path} has unknown kind {kind!r}")
# Ensure the metadata name matches the chosen directory name.
instance.metadata = Metadata(**{**instance.metadata.model_dump(exclude_none=True), "name": name})

medium

This line is quite verbose. Since Metadata is a Pydantic model, you can use model_copy with the update parameter to achieve the same result more cleanly.

Suggested change
instance.metadata = Metadata(**{**instance.metadata.model_dump(exclude_none=True), "name": name})
instance.metadata = instance.metadata.model_copy(update={"name": name})

pytest_args.extend(extra)

cmd = ["uv", "run", "pytest", *pytest_args]
typer.echo(f" Running: {' '.join(cmd)} (cwd={xtest_root})")

medium

Using ' '.join(cmd) for display can be misleading if any of the arguments contain spaces. It is safer to use shlex.join to format the command string for the console.

Suggested change
typer.echo(f" Running: {' '.join(cmd)} (cwd={xtest_root})")
import shlex
typer.echo(f" Running: {shlex.join(cmd)} (cwd={xtest_root})")
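
A short illustration of why the two forms differ, using a hypothetical `-k` expression that contains spaces. Only the standard library is used; the command list is made up for the example.

```python
import shlex

# Hypothetical argv: the -k expression is a single argument with spaces.
cmd = ["uv", "run", "pytest", "-k", "test_tdf and not legacy"]

naive = " ".join(cmd)     # argument boundaries are lost
quoted = shlex.join(cmd)  # each argument is shell-quoted where needed

# Only the shlex.join form splits back into the original argv.
round_trips = shlex.split(quoted) == cmd
naive_round_trips = shlex.split(naive) == cmd
```

The quoted string can be pasted into a POSIX shell and reproduces the same argument list, which is the property the display line is meant to convey.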

return Ports.get_kas_port(name, base=instance.ports.base)
return Ports.get_kas_port(name)

def load_instance(self):

medium

Adding a return type hint here would improve IDE support and maintainability. Since the Instance type is imported locally within the method, you can use a string forward reference.

Suggested change
def load_instance(self):
def load_instance(self) -> "Instance | None":

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a multi-instance test harness capability, allowing for the management and execution of isolated OpenTDF environments with distinct configurations, port ranges, and platform versions. Key additions include new CLI subcommands for instance management (init, ls, rm) and scenario execution, an instance-aware settings system, and integration with otdf-sdk-mgr to resolve versioned binaries. Feedback identifies a critical issue where the up command still relies on static port constants, which will break health checks for non-default instances. Additionally, improvements were suggested regarding safer dictionary handling for KAS features and more idiomatic use of Pydantic's model_copy.

Comment on lines +101 to +104
if instance is not None:
os.environ["OTDF_LOCAL_INSTANCE_NAME"] = instance
# Invalidate the cached Settings so subsequent commands see the new value
get_settings.cache_clear()

high

The --instance flag correctly updates the environment and clears the settings cache. However, the up command in this file (specifically at lines 192 and 225) still relies on static Ports constants for health checks and port resolution. This will cause health checks to fail when a non-default instance with a different ports_base is active. The up command should be updated to use the instance-aware settings.get_kas_port(name) or the port property of the service instances, and it should iterate over the instances managed by kas_manager instead of Ports.all_kas_names().
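
The env-var-plus-cache-clear mechanism that this comment says works correctly can be sketched as follows. `SettingsSketch` and `select_instance` are hypothetical names; only `OTDF_LOCAL_INSTANCE_NAME`, `get_settings`, and `cache_clear()` come from the snippet under review, and the use of `functools.lru_cache` is an assumption about how the caching is done.

```python
import os
from dataclasses import dataclass
from functools import lru_cache
from typing import Optional


@dataclass(frozen=True)
class SettingsSketch:
    """Hypothetical stand-in for the real Settings object."""

    instance_name: Optional[str]


@lru_cache(maxsize=1)
def get_settings() -> SettingsSketch:
    # Read once, then served from the cache until cache_clear() is called.
    return SettingsSketch(instance_name=os.environ.get("OTDF_LOCAL_INSTANCE_NAME"))


def select_instance(instance: Optional[str]) -> None:
    """Mirror of the --instance callback: set the env var, drop the cache."""
    if instance is not None:
        os.environ["OTDF_LOCAL_INSTANCE_NAME"] = instance
        get_settings.cache_clear()
```

Without the `cache_clear()` call, a Settings object created before the flag was parsed would keep reporting the old instance name. The same reasoning applies to ports: anything computed from static constants rather than from the refreshed settings will not follow the selected instance.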

else:
raise typer.BadParameter(f"{scenario_path} has unknown kind {kind!r}")
# Ensure the metadata name matches the chosen directory name.
instance.metadata = Metadata(**{**instance.metadata.model_dump(exclude_none=True), "name": name})

medium

Since Metadata is a Pydantic model, you can use model_copy with the update parameter to modify the name. This is more idiomatic and concise than dumping to a dict and reconstructing the model.

Suggested change
instance.metadata = Metadata(**{**instance.metadata.model_dump(exclude_none=True), "name": name})
instance.metadata = instance.metadata.model_copy(update={"name": name})

# Per-KAS features from instance.yaml override the legacy heuristic.
instance = self.settings.load_instance()
kas_pin = instance.kas.get(self._kas_name) if instance is not None else None
extra_features: dict[str, bool] = dict(kas_pin.features) if kas_pin is not None else {}

medium

If kas_pin.features is None, calling dict() on it will raise a TypeError. It's safer to provide an empty dictionary as a fallback.

Suggested change
extra_features: dict[str, bool] = dict(kas_pin.features) if kas_pin is not None else {}
extra_features: dict[str, bool] = dict(kas_pin.features or {}) if kas_pin is not None else {}

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from c6a7895 to ebc0c15 Compare May 15, 2026 16:35
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from c69afd6 to a8ef24a Compare May 15, 2026 16:36
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from ebc0c15 to 14e5c1e Compare May 15, 2026 16:57
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from a8ef24a to 78b2ca6 Compare May 15, 2026 16:58
dmihalcik-virtru added a commit that referenced this pull request May 21, 2026
#450)

## Summary

First PR in a five-part stack that introduces a multi-instance test
harness and a Claude plugin for OpenTDF bug reproduction. This PR adds
*only* the shared Pydantic schema in `otdf-sdk-mgr` — no consumers yet.

- Adds `otdf_sdk_mgr.schema` with v2 models: `Scenario`, `Instance`,
`PlatformPin`, `KasPin`, `SdkPin`, `ScenarioSdks`, `Suite`, etc.
- `ScenarioSdks.encrypt` / `.decrypt` mirror xtest's existing
`--sdks-encrypt` / `--sdks-decrypt` convention so a→b-only scenarios are
first-class.
- `python -m otdf_sdk_mgr.schema validate <path>` validates either a
Scenario or an Instance file based on its `kind:`.
- Adds `pydantic` + `ruamel.yaml` to `otdf-sdk-mgr/pyproject.toml`.
- 6 unit tests covering round-trips, pin invariants, and unknown-field
rejection.

## Stack

1. [**This PR**](#450) — Shared
schema
2. [Platform installer + `install
scenario`](#451) in `otdf-sdk-mgr`
(builds on this)
3. `otdf-local` [multi-instance
refactor](#452) + new CLI
subcommands
4. `xtest/conftest.py`
[integration](#453) (`--scenario`,
`--instance`)
5. [Claude plugin](#454)
(`.claude/skills/`, settings, plugin manifest)
6. #455

## Test plan

- [x] `cd otdf-sdk-mgr && uv run pytest tests/test_schema.py` — all 6
pass
- [x] `uv run python -m otdf_sdk_mgr.schema validate <path>` accepts a
valid scenarios.yaml and rejects unknown fields

Jira: https://virtru.atlassian.net/browse/DSPX-3302

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated comment: release notes by coderabbit.ai
-->

## Summary by CodeRabbit

* **New Features**
* Added schema validation for OpenTDF Scenario and Instance YAML
configurations with a new CLI command.
* Introduced strict validation with cross-field constraints for SDK and
platform configurations.

* **Documentation**
  * Updated supported container formats from `nano` to `ztdf-ecwrap`.

* **Dependencies**
* Updated core package dependencies to support enhanced validation
capabilities.

<!-- end of auto-generated comment: release notes by coderabbit.ai -->

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from 78b2ca6 to e196e43 Compare May 21, 2026 15:38
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from 14e5c1e to 9993b12 Compare May 21, 2026 15:38
@github-actions

X-Test Failure Report

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from ec1f655 to 13b5c96 Compare May 22, 2026 01:46
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch 2 times, most recently from 5b1c928 to a1bcecc Compare May 22, 2026 13:50
@github-actions

X-Test Failure Report

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from e7d13f5 to 6832d58 Compare May 28, 2026 12:46
@github-actions

X-Test Failure Report

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch 2 times, most recently from fa5fc7b to 35fd96a Compare June 2, 2026 17:47
@github-actions

github-actions Bot commented Jun 2, 2026

X-Test Failure Report

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from db9a288 to 614797f Compare June 2, 2026 17:48
@github-actions

github-actions Bot commented Jun 2, 2026

X-Test Failure Report

dmihalcik-virtru and others added 18 commits June 3, 2026 17:56
`install scenario` could not run as written: it iterated `ScenarioSdks.union()`
as a dict (it returns a list) and passed a `source=` kwarg `install_release`
does not accept. The emitted `installed.json` shape also did not match what
`scenario_to_pytest_sdks` reads (per-role lists, not sdk-name-keyed dict),
so even the platform-only path produced a manifest no downstream tool
could consume.

Source fixes:
- cli_scenario.py: iterate `union()` as the list it is, cache installs by
  (sdk, version, source), emit role-keyed lists matching the reader's
  expected shape; on failure write a partial manifest with `status=partial`
  so half-installed dist trees are diagnosable. Catch YAMLError in
  `_peek_kind` to surface a clean typer error.
- platform_installer.py: `_git_rev_parse` raises on failure instead of
  silently writing an empty `sha=` into `.version`. Missing `scripts/`
  raises instead of warning-and-continuing. SHA passthrough heuristic
  tightened from `>=7` chars to exactly 40 (SHA-1) or 64 (SHA-256), so
  ambiguous short tags like `abc1234` no longer skip the `service/`
  prefix. Dropped a docstring fragment pointing to a planning doc that
  won't exist post-merge.
- cli_install.py: dropped a docstring whose "deferred import" claim was
  false (the registration runs at module import). `lts platform` with no
  pinned version now exits 1 instead of warning-and-exit-0.
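
A sketch of the tightened SHA passthrough check described above, not the code in `platform_installer.py`. The function name and the case-insensitive match are assumptions.

```python
import re

# Exactly 40 hex chars (SHA-1) or exactly 64 (SHA-256); nothing in between.
_FULL_SHA = re.compile(r"[0-9a-f]{40}|[0-9a-f]{64}", re.IGNORECASE)


def is_full_sha(ref: str) -> bool:
    """Return True only for refs that are unambiguously full object IDs."""
    return _FULL_SHA.fullmatch(ref) is not None
```

Under this rule a short hex-looking tag such as `abc1234` is no longer treated as a SHA, so it still goes through the normal tag handling.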

Tests:
- test_platform_installer.py: parametrized cases for `_resolve_platform_ref`
  covering version normalization, branch passthrough, the tightened hex
  heuristic, and SHA-1/SHA-256 passthrough.
- test_cli_scenario.py: end-to-end smoke that mocks the installers and
  asserts the produced manifest is round-trip consumable by
  `scenario_to_pytest_sdks`. This is the gating test that would have
  caught the original bug.

79 passing (was 67).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- platform_installer: fix worktree update from bare repo (no `origin`
  remote exists), use `git reset --hard <branch>` instead of `git pull`
- platform_installer: stop swallowing subprocess output so long-running
  `go build`/`git clone` progress is visible to the user
- cli_install: extract `_install_platform_or_exit` to dedupe platform
  handling across `lts`, `tip`, and `release`
- cli_scenario: parse manifest YAML once and dispatch by `kind`, instead
  of peeking + re-parsing in each loader

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…typing

- AGENTS.md: add "Before Committing Python Changes" section requiring
  `uv run ruff check`, `uv run ruff format`, and `uv run pyright` on any
  touched Python package before commit. Explicitly call out that `uvx`
  must NOT be used for pyright (isolated env can't see project deps, so
  every project import becomes a spurious "could not be resolved" error).
- cli_scenario: split the single `dict[str, object]` install record into
  per-section typed containers (`installed_platform`, `installed_kas`,
  `installed_sdks`) assembled at write time via a `_snapshot()` helper.
  Fixes pre-existing pyright `__setitem__ ... not defined on object`
  errors at the nested writes; on-disk JSON shape is unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Root AGENTS.md: add a Repository Layout table near the top, correct
  the `platform/` description (it's installed by `otdf-sdk-mgr install`,
  not committed source), and trim the duplicated "Summary → Preferred
  Workflow" block that restated the body.
- otdf-local/AGENTS.md: lead with the dependency on `otdf-sdk-mgr`
  (otdf-local launches the binaries the installer produces). Mark the
  manual-keys YAML block as an emergency fallback that may drift.
- otdf-sdk-mgr/AGENTS.md (new): operational guide for the installer —
  subcommand layout, bare-clone-worktree gotchas (no `origin` remote,
  namespaced `service/vX.Y.Z` tags, unbuffered subprocess output),
  pattern for adding a new subcommand.
- xtest/AGENTS.md (new): test-suite layout, custom pytest options,
  audit-log fixture quick reference, authoring guidance.
- otdf-sdk-mgr/CLAUDE.md, xtest/CLAUDE.md: symlinks to AGENTS.md to
  match the repo convention.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a `--ref` option to `install tip` so platform and SDK source builds
can target any git ref (branches, tags, SHAs, raw `refs/...`, or the
`pr:N` shorthand that expands to `refs/pull/N/head`). Mutable refs
(branches, PR heads) re-fetch the bare repo and rebuild on each
invocation; immutable refs (tags, full SHAs) reuse the cached dist.

Also fetches `refs/...` refs explicitly into the bare repo before
`git worktree add` — the default bare-clone refspec doesn't include
`refs/pull/*`, so PR installs were dying with `invalid reference`.
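
The `pr:N` shorthand can be sketched as a small expansion step. `expand_ref` is a hypothetical helper name; only the `pr:N` to `refs/pull/N/head` mapping comes from the message above.

```python
import re

_PR_SHORTHAND = re.compile(r"pr:(\d+)")


def expand_ref(ref: str) -> str:
    """Expand pr:N to refs/pull/N/head; return any other ref unchanged."""
    match = _PR_SHORTHAND.fullmatch(ref)
    return f"refs/pull/{match.group(1)}/head" if match else ref
```

The expanded form starts with `refs/`, which is what signals that it has to be fetched explicitly before `git worktree add` can see it.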

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…refs

For immutable refs (tags, SHAs), derive dist_name by normalizing only the
semver tail after the last `/`. This ensures namespaced tags like
`service/v0.9.0` produce the same dist_name (`v0.9.0`) as plain tags
(`v0.9.0`, `0.9.0`), enabling immutable ref dist-dir reuse.

Before: `normalize_version(ref)` on `service/v0.9.0` → `vservice/v0.9.0`
After: `normalize_version(ref.rsplit("/", 1)[-1])` → `v0.9.0`
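
The before/after behavior can be reproduced with a stand-in for `normalize_version`. The body below is an assumption inferred from the two lines above (prepend `v` unless already present); the real function may do more. `dist_name_for` is a hypothetical wrapper name.

```python
def normalize_version(version: str) -> str:
    # Assumed behavior, inferred from the Before/After lines: add a leading
    # "v" when it is missing.
    return version if version.startswith("v") else f"v{version}"


def dist_name_for(ref: str) -> str:
    """Normalize only the part after the last '/', as in the fix."""
    return normalize_version(ref.rsplit("/", 1)[-1])


before = normalize_version("service/v0.9.0")  # the old, incorrect result
after = dist_name_for("service/v0.9.0")       # the fixed result
```

With the fix, the namespaced tag and the plain spellings all map to one dist directory.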

Also add `list_platform_versions()` to registry and expose platform versions
via `versions list platform`.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove stray <<<<<<< HEAD merge marker from xtest/AGENTS.md
- Disambiguate 7-39 char hex refs in platform_installer via
  `git rev-parse --verify`; ambiguous prefixes raise PlatformInstallError,
  unresolvable hex falls through as a branch/tag name
- Make `install_go_release` fail loudly on `go install` pre-warm errors;
  no more silent .version writes after a broken install
- Add `RegistryUnreachableError` and raise it from npm/Maven/GitHub
  URLError paths so network outages no longer look like "no versions
  available"; CLI wrappers translate to clean typer.Exit(1)
- Fix `versions {list,latest}` typo in AGENTS.md (subcommands are
  `list` and `resolve`)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Widen `install scenario` exception net to subprocess.CalledProcessError,
  ValidationError, and OSError so a failed build, bad YAML, or missing
  helper script still produces a typer.Exit(1) plus a partial manifest
  instead of an unhandled traceback
- Delete duplicate `list_platform_versions` from platform_installer.py
  (registry.py has the canonical version returning dict entries)
- Preserve KasPin.mode and KasPin.features in the installed.json manifest
  so downstream tooling can read them back without re-parsing YAML
- Add `.complete` marker to platform builds; reuse requires both the
  binary and the marker, surviving Ctrl-C mid-build
- checkout._run now captures stderr and includes cwd in the raised
  CalledProcessError; platform_installer._run wraps FileNotFoundError
  with the executable name
- Move scenario subcommand registration out of `_register_scenario_cmd`
  side-effect wrapper
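
The `.complete` marker rule from the list above, sketched with hypothetical helper names. The binary name `service` follows the `xtest/platform/dist/<dist>/service` path used elsewhere in this PR; everything else is an assumption.

```python
import tempfile
from pathlib import Path


def can_reuse_build(dist_dir: Path) -> bool:
    """Reuse requires both the binary and the completion marker."""
    return (dist_dir / "service").is_file() and (dist_dir / ".complete").is_file()


def mark_complete(dist_dir: Path) -> None:
    # Written as the last step, so an interrupted build never has it.
    (dist_dir / ".complete").touch()


with tempfile.TemporaryDirectory() as tmp:
    dist = Path(tmp)
    (dist / "service").write_text("")      # binary exists, build interrupted
    reusable_before = can_reuse_build(dist)
    mark_complete(dist)                    # build finished
    reusable_after = can_reuse_build(dist)
```

A binary left behind by a build that was interrupted with Ctrl-C is therefore rebuilt instead of being trusted.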

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Tests:
- Rewrite `test_dist_name_derivation` to call `install_platform_source`
  instead of re-implementing its dist-name logic in the test body
- Regression tests for mutable-vs-immutable rebuild, .complete marker
  semantics, PR ref fetch via explicit refspec, and short-SHA expansion
- New `test_registry.py` covering RegistryUnreachableError propagation,
  _github_headers with/without GITHUB_TOKEN, ls_remote tag parsing, and
  GitHub rate-limit warning
- Assert KasPin.mode and KasPin.features round-trip into installed.json

Polish:
- `install_java_release` switches the BaseException catch to try/finally
  so KeyboardInterrupt/SystemExit retain their normal semantics
- README documents the dist-naming convention as a table

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- Reject container-image refs in _resolve_platform_ref with a clear
  PlatformInstallError, instead of letting strings like
  ghcr.io/opentdf/platform:v0.9.0 fall through to git and fail with a
  generic "invalid reference" message.
- Use "install release platform:<version>" in registry.install_method
  so copy-paste from `versions list` lands on the actual subcommand
  signature.
- Drop unused boom() helper flagged by Sonar in test_registry.py.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Cherry-picks the documentation enhancement from b9f84e05 that adds:
- Top-level comment explaining KAS Preview Settings precedence
- Field descriptions for KasPin.features and Instance.features

This documentation clarifies how preview settings are configured and
applied, helping users understand the features dict without needing to
reference external docs.

Cherry-picked from: b9f84e05 (feat(scenario): enable KAS preview features configuration)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Refactors otdf-local from a single-instance CLI (one platform checkout,
fixed ports, hardcoded six KAS instances) into a multi-instance harness
where each named instance under tests/instances/<name>/ owns its own
opentdf.yaml, keys, KAS configs, and port range.

Why
---

A single bug report often describes a *combination* — platform v0.9.0
with Java SDK 0.7.8 and a KAS at a pre-release. Today a developer has
to hand-edit configs and re-checkout the platform to reproduce. After
this change:

  otdf-local instance init java-078 --from-scenario .../scenario.yaml
  otdf-local --instance java-078 up

brings up exactly the topology the scenario describes, using platform
binaries that otdf-sdk-mgr already provisioned (each instance, and each
KAS within an instance, can reference a different pinned version). Two
instances on disjoint ports.base can coexist on a developer laptop.

What changes
------------

otdf-local now depends on otdf-sdk-mgr via a uv path source so both
tools share the canonical Scenario/Instance schema.

Settings (otdf_local.config.settings):
  - New instance_name (env-overridable via OTDF_LOCAL_INSTANCE_NAME),
    instance_dir, instances_root, instance_yaml properties.
  - platform_dir becomes optional; legacy sibling-discovery only kicks
    in when no per-instance configuration is present.
  - platform_binary_for(dist) resolves to the otdf-sdk-mgr-managed
    xtest/platform/dist/<dist>/service binary.
  - keys_dir, logs_dir, config_dir, platform_config, and
    get_kas_config_path switch to per-instance paths whenever
    instance.yaml exists; legacy behavior is preserved otherwise.
  - load_instance() reads the per-instance manifest via the shared
    Pydantic model.

Ports (otdf_local.config.ports):
  - KAS_OFFSETS exposes the offset table (alpha=+101, beta=+202, ...,
    km2=+606) so multiple instances on different bases get disjoint
    port ranges. The legacy 8080-based constants are preserved as
    defaults.
  - get_kas_port(name, base=...) computes the port relative to base.

Services (otdf_local.services.platform / .kas):
  - PlatformService.start() and KASService.start() use the pinned dist
    binary at xtest/platform/dist/<dist>/service when an instance is
    loaded, with cwd set to the recorded worktree so the binary finds
    its embedded resources. Legacy `go run ./service` path runs
    unchanged when no instance is active.
  - KASService.is_key_management defers to the manifest's `mode` field
    instead of the legacy name-based heuristic; per-KAS features (e.g.
    ec_tdf_enabled) pass through to opentdf.yaml.
  - KASManager constructs only the KAS instances listed in
    instance.yaml's kas: map. start_standard / start_km filter on
    is_key_management so subset topologies still work.

utils.keys.setup_golden_keys:
  - Writes key files into the target directory (per-instance keys_dir
    or legacy platform_dir) and uses absolute paths in the generated
    keys_config so the binary finds them regardless of cwd.

CLI:
  - New top-level --instance option threads through every command via
    OTDF_LOCAL_INSTANCE_NAME.
  - New `instance` subcommand group: init [--from-scenario PATH],
    ls --json, rm.
  - New `scenario` subcommand: `run <path>` translates the scenario's
    suite block into `pytest --sdks-encrypt ... --sdks-decrypt ...
    --containers ...` under xtest/ with OTDF_LOCAL_INSTANCE_NAME set.

Tests (otdf-local/tests/test_multi_instance.py):
  - Port arithmetic at default and alternate bases.
  - Settings round-trip with and without an instance.yaml.
  - platform_binary_for resolves under the otdf-sdk-mgr-managed
    xtest/platform/ tree.

.gitignore additions:
  - tests/instances/ (per-instance config and logs)
  - xtest/scenarios/*.installed.json (provisioning records)
  - .claude/tmp/

Backward compatibility:
  - `otdf-local up` with no --instance flag keeps working against a
    sibling platform/ checkout.

Refs: https://virtru.atlassian.net/browse/DSPX-3302

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Before this change, `otdf-local instance init` only wrote `instance.yaml`
and empty subdirs. Anyone running a fresh instance had to manually copy
keys from another worktree, run `init-temp-keys.sh` by hand, and copy
`opentdf-dev.yaml` into the instance dir before `up` would succeed —
otherwise Keycloak crash-looped on a missing `truststore.jks`, and
pytest failed with `OT_ROOT_KEY environment variable is not set`.

Changes:
- utils/keys.py: add `generate_localhost_cert()` and `generate_ca_jks()`
  to produce the Keycloak TLS pair + JKS truststore (matches the
  platform's `init-temp-keys.sh`). `generate_ca_jks()` runs `keytool`
  inside the `keycloak/keycloak:25.0` image so a local JDK isn't
  required. `ensure_keys_exist()` now generates the full bootstrap
  bundle, idempotently.
- cli_instance.py: `_init_from_scenario` and `_init_minimal` call a new
  `_provision_instance_dir()` helper that runs `ensure_keys_exist()` and
  copies the platform's `opentdf-dev.yaml` (or `opentdf-example.yaml`)
  into the instance dir, overriding `services.kas.root_key` with a
  freshly generated value so every instance owns its own root key.
- services/platform.py: `_generate_config()` preserves an existing
  per-instance `opentdf.yaml`, only patching logger + golden-key fields
  in place, so the init-time `root_key` survives restarts.
- services/docker.py: docker-compose subprocesses are now run with
  `KEYS_DIR=<instance>/keys` so the compose file's `${KEYS_DIR:-./keys}`
  mounts resolve to the per-instance bundle.

Users can now run:

  otdf-local instance init <name> --from-scenario path/to/scenario.yaml
  otdf-local --instance <name> up
  eval $(otdf-local --instance <name> env)
  cd xtest && uv run pytest ...

with no manual key-copying, no editing of `opentdf.yaml`, and no
shell-script fallback. Verified end-to-end against `pure-mlkem.yaml`
(PR opentdf/platform#3537): all 9 services come up healthy on the first
try and `env` exports `OT_ROOT_KEY`.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…chema

`_build_pytest_args` read `suite.select` and treated `suite.containers`
as a string, but the Pydantic Suite model exposes `targets: list[str]`
and `containers: list[ContainerKind]`. Any user invoking
`otdf-local scenario run` hit AttributeError. Also wires `suite.kexpr`
through as `-k`; it was silently dropped.

Adds unit tests covering empty/multi targets, container join, kexpr,
markers + extra args, and SDK token forwarding.
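
A rough sketch of the translation this commit fixes, not the real `_build_pytest_args`. `SuiteSketch` is a hypothetical stand-in for the Pydantic `Suite` model with only the three fields named above; joining containers with a space is an assumption, and the SDK flags, markers, and extra args are omitted.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class SuiteSketch:
    """Hypothetical stand-in for Suite; field names follow the commit message."""

    targets: list[str] = field(default_factory=list)
    containers: list[str] = field(default_factory=list)
    kexpr: Optional[str] = None


def build_pytest_args(suite: SuiteSketch) -> list[str]:
    args = list(suite.targets)  # positional test targets come first
    if suite.containers:
        # Assumption: the list is passed as one space-joined value.
        args += ["--containers", " ".join(suite.containers)]
    if suite.kexpr:
        args += ["-k", suite.kexpr]  # previously dropped silently
    return args


# Illustrative values only.
example = build_pytest_args(
    SuiteSketch(targets=["test_tdfs.py"], containers=["ztdf", "ztdf-ecwrap"], kexpr="roundtrip")
)
```

Reading the fields by the names the model actually exposes is what removes the AttributeError; an empty suite simply yields an empty argument list.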

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-03-multi-instance branch from 35fd96a to 0b41c21 Compare June 3, 2026 22:01
@sonarqubecloud

sonarqubecloud Bot commented Jun 3, 2026

Quality Gate failed

Failed conditions
D Security Rating on New Code (required ≥ A)

See analysis details on SonarQube Cloud

@github-actions

github-actions Bot commented Jun 3, 2026

X-Test Failure Report

✅ js@main-main
✅ java@v0.15.0-main
✅ java@main-main

@dmihalcik-virtru dmihalcik-virtru force-pushed the DSPX-3302-02-platform-installer branch from 9da91f5 to 74ab0c1 Compare June 5, 2026 18:21