
[Hackathon] hiten: add configurable provenance_supply_chain_multi scenario #60

Open
Hiten0305l wants to merge 1 commit into projnanda:main from Hiten0305l:hackathon/hiten-provenance-supply-chain-multi


Summary

This is a focused follow-up to my previous hackathon PR (#39). Now that the DataFacts implementation has been merged in #31, this PR adds the independent scenario enhancement suggested during review: a configurable provenance supply-chain scenario.

The new provenance_supply_chain_multi scenario runs the existing cid_facts implementation over a parameterized multi-stage provenance DAG with configurable supplier and manufacturer stages. The existing provenance_supply_chain scenario (diamond topology) is left unchanged. Together, the two scenarios exercise the same provenance implementation across complementary graph structures without modifying plugin behaviour.

What

  • Add a new built-in provenance_supply_chain_multi scenario supporting configurable num_suppliers and num_manufacturers.
  • Add a dedicated provenance_supply_chain_multi.yaml for configuring parameterized graph sizes.
  • Register the scenario independently alongside the existing provenance scenarios.
  • Register a dedicated validator entry that reuses the existing provenance validator suite without modifying validator behaviour.
  • Add comprehensive test coverage for configurable topologies, deterministic execution, provenance validation, invalid configuration handling, and regression protection.
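As an illustration of the invalid-configuration handling mentioned above, here is a minimal sketch of how the two parameters could be validated. The class name `MultiChainConfig` and the rule that both counts must be integers of at least 1 are assumptions made for this example, not the code or constraints in this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MultiChainConfig:
    """Hypothetical config holder; only the two field names come from the PR text."""

    num_suppliers: int
    num_manufacturers: int

    def __post_init__(self) -> None:
        for name in ("num_suppliers", "num_manufacturers"):
            value = getattr(self, name)
            # bool is a subclass of int, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                raise TypeError(f"{name} must be an int, got {type(value).__name__}")
            # Assumed rule: every stage needs at least one node.
            if value < 1:
                raise ValueError(f"{name} must be >= 1, got {value}")
```

Validating at construction time means a bad value fails before any scenario nodes are created, which keeps the failure close to the config that caused it.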

Topology

                   supplier-0
                   supplier-1
                      ...
                supplier-(N-1)
                      │
        ┌─────────────┼─────────────┐
        ▼             ▼             ▼
  manufacturer-0 manufacturer-1 ... manufacturer-(M-1)
        └─────────────┬─────────────┘
                      ▼
                distributor-0
                      ▼
                  retailer-0

Each manufacturer aggregates provenance from all supplier datasets, producing a configurable multi-stage provenance DAG. The distributor aggregates all manufacturer outputs before forwarding the resulting lineage to the retailer for provenance verification and adversarial validation.
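The dependency structure described above can be sketched as a parent map. This is an illustrative reconstruction from the diagram, not the scenario's actual code: the function names `build_parents` and `edge_count` are hypothetical, and only the node naming (`supplier-i`, `manufacturer-j`, `distributor-0`, `retailer-0`) is taken from the topology.

```python
from typing import Dict, List


def build_parents(num_suppliers: int, num_manufacturers: int) -> Dict[str, List[str]]:
    """Map each node to the upstream nodes whose provenance it aggregates."""
    suppliers = [f"supplier-{i}" for i in range(num_suppliers)]
    manufacturers = [f"manufacturer-{j}" for j in range(num_manufacturers)]

    # Suppliers are roots: they have no parents.
    parents: Dict[str, List[str]] = {s: [] for s in suppliers}
    # Every manufacturer aggregates all supplier datasets.
    for m in manufacturers:
        parents[m] = list(suppliers)
    # The distributor aggregates all manufacturer outputs.
    parents["distributor-0"] = list(manufacturers)
    # The retailer receives the distributor's lineage.
    parents["retailer-0"] = ["distributor-0"]
    return parents


def edge_count(parents: Dict[str, List[str]]) -> int:
    """Total number of parent links in the DAG."""
    return sum(len(p) for p in parents.values())
```

With N suppliers and M manufacturers this gives N + M + 2 nodes and N*M + M + 1 edges, so the edge count grows with the product of the two parameters.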

Why

The existing provenance_supply_chain scenario validates provenance using a fixed diamond-shaped dependency graph. This PR introduces a complementary configurable graph topology that exercises the same cid_facts implementation under a different dependency structure without changing the underlying plugin.

Configurable supplier and manufacturer counts let the framework run the same provenance implementation across multiple graph sizes, expanding scenario coverage for:

  • Provenance lineage construction and traversal
  • Parent aggregation across multiple upstream datasets
  • Provenance verification across configurable graph topologies
  • Deterministic replay across scenario configurations
  • Adversarial validation using the existing validator suite
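To make the replay and tamper-detection ideas in this list concrete, here is a small self-contained sketch. It does not use cid_facts or the project's validators: the SHA-256 lineage digest, the helper names, and the payload bytes are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, List, Optional


def fan_in_parents(n: int, m: int) -> Dict[str, List[str]]:
    """Hypothetical parent map: n suppliers feed m manufacturers, then one distributor and one retailer."""
    suppliers = [f"supplier-{i}" for i in range(n)]
    manufacturers = [f"manufacturer-{j}" for j in range(m)]
    parents: Dict[str, List[str]] = {s: [] for s in suppliers}
    parents.update({mf: list(suppliers) for mf in manufacturers})
    parents["distributor-0"] = list(manufacturers)
    parents["retailer-0"] = ["distributor-0"]
    return parents


def lineage_digest(
    node: str,
    parents: Dict[str, List[str]],
    payloads: Dict[str, bytes],
    cache: Optional[Dict[str, str]] = None,
) -> str:
    """Hash a node's payload together with the digests of all its parents."""
    if cache is None:
        cache = {}
    if node in cache:
        return cache[node]
    h = hashlib.sha256()
    h.update(node.encode() + b"\x00" + payloads.get(node, b""))
    # Sort parent digests so the result does not depend on parent ordering.
    for digest in sorted(lineage_digest(p, parents, payloads, cache) for p in parents[node]):
        h.update(digest.encode())
    cache[node] = h.hexdigest()
    return cache[node]


parents = fan_in_parents(3, 2)
payloads = {name: f"data:{name}".encode() for name in parents}

first = lineage_digest("retailer-0", parents, payloads)
replay = lineage_digest("retailer-0", parents, payloads)

# Tamper with one upstream dataset and recompute at the retailer.
tampered_payloads = dict(payloads, **{"supplier-1": b"forged"})
tampered = lineage_digest("retailer-0", parents, tampered_payloads)
```

Here `first` and `replay` are equal because the digest depends only on the graph and the payloads, while `tampered` differs because every path from `supplier-1` to the retailer carries the changed digest forward.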

The change is backward compatible and leaves the existing provenance implementation unchanged.

Scope

This PR is intentionally limited to introducing the new provenance_supply_chain_multi scenario, its configuration, registry integration, and associated tests.

Verification

make ci-local

uv sync
uv run ruff check .
uv run ruff format --check .
uv run pyright
uv run pytest -v
