
feat: pluggable storage backends — file, Archivista, memory#139

Merged
mlieberman85 merged 15 commits into kusari-oss:main from Marc-cn:feature/storage-backends
Apr 4, 2026


Marc-cn commented Mar 28, 2026

Summary

Adds a pluggable StorageBackend interface that addresses the database requirements discussed: storing attestations, storing project metadata when repo write access is unavailable, and storing reproducibility research results.

Type of Change

  • New feature (non-breaking change adding functionality)

Framework Changes Checklist

  • Updated framework spec if behavior changed
  • Ran uv run python scripts/validate_sync.py --verbose and it passes
  • Ran uv run python scripts/generate_docs.py and committed any doc changes

Control/TOML Changes Checklist

Not applicable — no controls or TOML modified in this PR.

Testing

  • Tests pass locally
  • Added tests for new functionality
  • Linting passes

What was built

  • FileBackend (default) — stores JSON files in .darnit/attestations/, metadata/, research/. No breaking changes to existing behaviour.
  • ArchivistaBackend — uploads attestations to Archivista via HTTP (POST /upload). Falls back to FileBackend if Archivista is unreachable.
  • MemoryBackend — in-memory only, for testing.

Configuration:
[storage]
backend = "archivista"
archivista_url = "http://localhost:8082"

Covers all three requirements:

  1. Attestations — ArchivistaBackend uploads signed in-toto statements
  2. Project metadata without repo write access — external FileBackend or Archivista
  3. Reproducibility research results — all backends support store_research_result()

Additional Notes

  • 27 tests, all passing
  • Archivista tested via fallback — no real HTTP calls in tests
  • Storage backend not yet wired into the agent run or attestation generator — exists as a module, wiring comes in a follow-up PR

Marc-cn requested a review from mlieberman85 as a code owner March 28, 2026 15:11

mlieberman85 left a comment

PR Review: Pluggable Storage Backends

Code is well-structured, tests are included and all 27 pass, and the interface is straightforward. No new bugs found in the storage module itself.

Design

Three-backend architecture:

  • FileBackend — JSON files in .darnit/, good default
  • ArchivistaBackend — HTTP upload to Archivista for attestations, falls back to FileBackend for metadata/research and on connection failure
  • MemoryBackend — in-memory for testing

The StorageBackend base class defines a clear interface with 6 methods (store/retrieve for each of attestations, metadata, research results). Factory function handles config-based selection.
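The six-method interface described above might look roughly like the following. Only store_attestation, retrieve_attestation, and store_research_result are named in this PR; the other method names, all signatures, the boolean return values, and the get_backend factory are assumptions made for illustration.

```python
from abc import ABC, abstractmethod
from typing import Any, Optional

Record = dict[str, Any]


class StorageBackend(ABC):
    """Store/retrieve pairs for attestations, metadata, and research results."""

    @abstractmethod
    def store_attestation(self, repo: str, commit: str, data: Record) -> bool: ...

    @abstractmethod
    def retrieve_attestation(self, repo: str, commit: str) -> Optional[Record]: ...

    @abstractmethod
    def store_metadata(self, repo: str, data: Record) -> bool: ...

    @abstractmethod
    def retrieve_metadata(self, repo: str) -> Optional[Record]: ...

    @abstractmethod
    def store_research_result(self, repo: str, data: Record) -> bool: ...

    @abstractmethod
    def retrieve_research_result(self, repo: str) -> Optional[Record]: ...


class MemoryBackend(StorageBackend):
    """In-memory backend: plain dicts, nothing touches disk."""

    def __init__(self) -> None:
        self._attestations: dict[tuple[str, str], Record] = {}
        self._metadata: dict[str, Record] = {}
        self._research: dict[str, Record] = {}

    def store_attestation(self, repo: str, commit: str, data: Record) -> bool:
        self._attestations[(repo, commit)] = data
        return True

    def retrieve_attestation(self, repo: str, commit: str) -> Optional[Record]:
        return self._attestations.get((repo, commit))

    def store_metadata(self, repo: str, data: Record) -> bool:
        self._metadata[repo] = data
        return True

    def retrieve_metadata(self, repo: str) -> Optional[Record]:
        return self._metadata.get(repo)

    def store_research_result(self, repo: str, data: Record) -> bool:
        self._research[repo] = data
        return True

    def retrieve_research_result(self, repo: str) -> Optional[Record]:
        return self._research.get(repo)


def get_backend(name: str = "memory") -> StorageBackend:
    """Hypothetical factory: map a config name to a backend instance."""
    backends = {"memory": MemoryBackend}
    if name not in backends:
        raise ValueError(f"unknown storage backend: {name!r}")
    return backends[name]()
```

Keying attestations by (repo, commit) mirrors the lookup the review describes; a real FileBackend or ArchivistaBackend would register in the same factory mapping.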

Not yet wired in

Acknowledged in the PR description — the storage module exists as a standalone module but isn't connected to the audit pipeline, attestation generator, or agent run. Should have a tracking issue for the wiring.

Minor

  • ArchivistaBackend.retrieve_attestation always falls back to file — can never retrieve from Archivista since it stores by gitoid but looks up by repo+commit. The gitoid from store_attestation is returned but not persisted for later retrieval.
  • Same plugin.py indentation issue as the other PRs (lines 114, 136, 151, 165)
  • _repo_slug uses simple string replacement (/ → _), which could collide: org/repo and org_repo would produce the same slug
  • Tests don't have @pytest.mark.unit markers (minor inconsistency with the rest of the test suite)
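The slug collision noted above is easy to reproduce. The sketch below assumes _repo_slug is essentially a str.replace("/", "_") call (the real implementation is not shown in this thread) and compares it with one possible collision-free alternative, percent-encoding via urllib.parse.quote. Both function names are hypothetical.

```python
from urllib.parse import quote, unquote


def naive_slug(repo: str) -> str:
    """Assumed current behaviour: replace every '/' with '_'."""
    return repo.replace("/", "_")


def reversible_slug(repo: str) -> str:
    """Percent-encode '/' (and other unsafe characters) so the mapping is one-to-one."""
    return quote(repo, safe="")


# Two different repository names collapse to the same slug.
assert naive_slug("org/repo") == naive_slug("org_repo")

# Percent-encoding keeps them distinct and can be reversed.
assert reversible_slug("org/repo") != reversible_slug("org_repo")
assert unquote(reversible_slug("org/repo")) == "org/repo"
```

Percent-encoding is only one option; hashing the repo name or nesting org and repo as separate directories would also avoid the collision.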

What's good

  • Ships 27 tests that all pass
  • FileBackend handles errors gracefully (returns None/False instead of raising)
  • Archivista fallback is a solid pattern — upload failures don't lose data
  • MemoryBackend is a useful test utility for other code that needs storage
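A rough sketch of the fallback pattern praised above, combined with one way to close the retrieval gap listed under Minor: persisting the returned gitoid against the repo and commit. Everything here is hypothetical (the class name, the injected upload callable, the in-memory dicts standing in for FileBackend); it is not the PR's ArchivistaBackend.

```python
from typing import Any, Callable, Optional

Record = dict[str, Any]
Key = tuple[str, str]  # (repo, commit)


class FallbackAttestationStore:
    """Try a remote upload first; keep the record locally if the upload fails."""

    def __init__(self, upload: Callable[[Record], str]) -> None:
        # `upload` stands in for the HTTP call and returns a gitoid string.
        self._upload = upload
        self._local: dict[Key, Record] = {}   # stand-in for FileBackend
        self._gitoids: dict[Key, str] = {}    # (repo, commit) -> gitoid index

    def store_attestation(self, repo: str, commit: str, data: Record) -> Optional[str]:
        key = (repo, commit)
        try:
            gitoid = self._upload(data)
        except OSError:
            # ConnectionError and timeouts are OSError subclasses: fall back
            # to local storage so the attestation is not lost.
            self._local[key] = data
            return None
        # Remember the gitoid so a later lookup by repo+commit can find it.
        self._gitoids[key] = gitoid
        return gitoid

    def lookup_gitoid(self, repo: str, commit: str) -> Optional[str]:
        return self._gitoids.get((repo, commit))

    def retrieve_local(self, repo: str, commit: str) -> Optional[Record]:
        return self._local.get((repo, commit))


def unreachable(_: Record) -> str:
    raise ConnectionError("Archivista unreachable")


offline = FallbackAttestationStore(unreachable)
offline.store_attestation("org/repo", "abc123", {"predicate": "example"})

online = FallbackAttestationStore(lambda _: "example-gitoid")
online.store_attestation("org/repo", "abc123", {"predicate": "example"})
```

In a real backend the gitoid index would need to be written to disk alongside the fallback files; otherwise it is lost between runs.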


Marc-cn commented Mar 28, 2026

Fixes pushed:

@mlieberman85

Not sure what happened, but it looks like you broke the uv.lock file:

Using CPython 3.12.3 interpreter at: /usr/bin/python
Creating virtual environment at: .venv
error: Failed to parse `uv.lock`
  Caused by: TOML parse error at line 1390, column 1
     |
1390 | version = "1.2.0"
     | ^
invalid array
expected `]`

In addition, there are a couple of minor merge conflicts.


Marc-cn commented Mar 31, 2026

Sorry for the broken lock file; it got corrupted during a merge from main.

@mlieberman85

@Marc-cn just fix the merge conflicts. In the follow-up, it would be nice to see it hooked into an example module.

@mlieberman85 mlieberman85 merged commit d123d49 into kusari-oss:main Apr 4, 2026
8 checks passed