
📖 Scorecard v6: OSPS Baseline conformance proposal and 2026 roadmap#4952

Draft
justaugustus wants to merge 26 commits into ossf:main from justaugustus:roadmap-baseline

Conversation

@justaugustus
Member

@justaugustus justaugustus commented Feb 27, 2026

What kind of change does this PR introduce?

Documentation: Scorecard v6 proposal and 2026 roadmap.

Scorecard v6 evolves Scorecard from a scoring tool to an open source security
evidence engine. The primary initiative for 2026 is adding OSPS Baseline
conformance evaluation as the first use case that proves this architecture.

Mission: Scorecard produces trusted, structured security evidence for the
open source ecosystem.

Scorecard accepts diverse inputs about a project's security practices,
normalizes them through probe-based analysis, and packages the resulting
evidence in interoperable formats for downstream tools to act on. Check scores
(0-10) and conformance labels (PASS/FAIL/UNKNOWN) are parallel evaluation
layers over the same probe evidence, produced in a single run. v6 is additive —
existing checks, probes, scores, and output formats are preserved.
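The parallel-layer idea can be sketched in Go. Everything below is illustrative — `Finding`, `Outcome`, `scoreLayer`, and `conformanceLayer` are hypothetical names, not Scorecard's actual API — but it shows how a 0-10 score and a PASS/FAIL/UNKNOWN label can be derived from the same probe evidence in a single pass:

```go
package main

import "fmt"

// Outcome is a probe's tri-state result (hypothetical type).
type Outcome int

const (
	OutcomeFalse Outcome = iota
	OutcomeTrue
	OutcomeUnknown
)

// Finding is a single probe result — the shared evidence layer.
type Finding struct {
	Probe   string
	Outcome Outcome
}

// scoreLayer derives a 0-10 check score from probe findings.
func scoreLayer(findings []Finding) int {
	if len(findings) == 0 {
		return 0
	}
	pass := 0
	for _, f := range findings {
		if f.Outcome == OutcomeTrue {
			pass++
		}
	}
	return pass * 10 / len(findings)
}

// conformanceLayer derives a PASS/FAIL/UNKNOWN label from the SAME findings.
func conformanceLayer(findings []Finding) string {
	for _, f := range findings {
		switch f.Outcome {
		case OutcomeUnknown:
			return "UNKNOWN"
		case OutcomeFalse:
			return "FAIL"
		}
	}
	return "PASS"
}

func main() {
	evidence := []Finding{
		{Probe: "securityPolicyPresent", Outcome: OutcomeTrue},
		{Probe: "securityPolicyContainsLinks", Outcome: OutcomeTrue},
	}
	// One run, two parallel views over the same evidence.
	fmt.Println(scoreLayer(evidence), conformanceLayer(evidence)) // prints "10 PASS"
}
```

The point of the sketch is that neither layer recomputes evidence: both read the same findings, which is what makes the v6 change additive.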

The goal of this PR is to create a collaboration/decision-making nexus for
Scorecard and WG ORBIT tooling maintainers to ensure that we build interfaces
that easily interact with other tools and minimize duplication of work across
our maintainers and others in the OpenSSF ecosystem.

Key changes that warrant a major version:

  1. New evaluation layer — conformance labels alongside existing check scores
  2. Framework-agnostic architecture — probe evidence can be composed against
    OSPS Baseline or other frameworks via pluggable mapping definitions
  3. Interoperable output formats — in-toto, Gemara, OSCAL Assessment Results
    alongside existing JSON and SARIF
  4. Probe catalog as public interface — probe definitions become a consumable
    artifact for external tools

Key decisions resolved:

  • Architecture (CQ-19): Option C (hybrid) — Scorecard owns probe execution
    and conformance evaluation; interoperability at the output layer only
  • Output formats (CQ-18): Enriched JSON, in-toto (SVR + Baseline
    Predicate), Gemara (via security-baseline), OSCAL Assessment Results
  • Mapping model (CQ-17/CQ-23): Two layers — check-level relations upstream
    in security-baseline, probe-level mappings in Scorecard
  • Ecosystem positioning (CQ-13): All downstream consumers (Privateer,
    AMPEL, Minder, Darnit) are equal

Open questions requiring Steering Committee resolution:

  • OQ-1/CQ-22: Attestation identity model (blocking)

  • OQ-2: Enforcement detection scope


What is the current behavior?

Scorecard produces 0-10 check scores and structured probe findings. There is no
OSPS Baseline conformance evaluation capability and no public 2026 roadmap.

What is the new behavior (if this is a feature change)?

This PR adds documentation only (no code changes):

  • docs/ROADMAP.md — Public 2026 roadmap with phased delivery plan

  • openspec/changes/osps-baseline-conformance/proposal.md — Detailed proposal
    covering architecture, scope, phased delivery, and ecosystem positioning

  • openspec/changes/osps-baseline-conformance/decisions.md — Reviewer feedback,
    open questions, maintainer responses, and decision priority analysis

  • Tests for the changes have been added (for bug fixes/features):
    N/A — documentation only.

Which issue(s) this PR fixes

NONE

Special notes for your reviewer

This PR is structured as two companion documents:

  • proposal.md contains the proposal itself (motivation, architecture,
    scope, phased delivery, ecosystem positioning, success criteria)
  • decisions.md tracks all reviewer feedback, open questions, maintainer
    responses, and the decision priority analysis

The control-by-control coverage analysis is maintained separately.

Feedback from Eddie Knight (ORBIT WG TSC Chair), Adolfo García Veytia (AMPEL),
Mike Lieberman, and Felix Lange has been incorporated. See decisions.md for the
full feedback log and how it informed the proposal.

Does this PR introduce a user-facing change?

NONE

@dosubot dosubot bot added the size:XXL This PR changes 1000+ lines, ignoring generated files. label Feb 27, 2026
@codecov

codecov bot commented Feb 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 69.67%. Comparing base (353ed60) to head (45f4da6).
⚠️ Report is 321 commits behind head on main.

@@            Coverage Diff             @@
##             main    #4952      +/-   ##
==========================================
+ Coverage   66.80%   69.67%   +2.87%     
==========================================
  Files         230      251      +21     
  Lines       16602    15654     -948     
==========================================
- Hits        11091    10907     -184     
+ Misses       4808     3873     -935     
- Partials      703      874     +171     

Member Author

@justaugustus justaugustus left a comment

@ossf/scorecard-maintainers @ossf/scorecard-fe-maintainers @eddie-knight @puerco @evankanderson @mlieberman85 — based on conversations from this week with various WG ORBIT-adjacent maintainers, I'm tossing this early draft up for review.

Feel free to comment away while I work through this!

justaugustus and others added 12 commits February 27, 2026 07:07
- Add AGENTS.md with project overview, build/test commands, architecture
  guide, contribution conventions, and AI agent collaboration guidelines
  (co-authorship trailer, OpenSpec workflow, git hygiene rules)
- Bootstrap openspec/ directory structure with initial specs:
  - openspec/specs/platform-clients/spec.md: platform client abstraction
  - openspec/changes/pvtr-integration/specs/pvtr-baseline/spec.md:
    OSPS Baseline integration requirements and scenarios
- Incorporate guidance from the OSPO Engineering Playbook into AGENTS.md

Signed-off-by: Stephen Augustus <foo@auggie.dev>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
The scope of this work is OSPS Baseline conformance within the ORBIT
ecosystem — Privateer/PVTR interoperability is one aspect, not the
whole story.

Signed-off-by: Stephen Augustus <foo@auggie.dev>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Complete rewrite of the proposal and spec to cover the full scope of the
2026 roadmap, not just Privateer/PVTR interoperability:

- Conformance engine producing PASS/FAIL/UNKNOWN/NOT_APPLICABLE/ATTESTED
- OSPS output format (--format=osps)
- Versioned control-to-probe mapping files
- Applicability engine for precondition detection
- Security Insights ingestion for ORBIT ecosystem interop
- Attestation mechanism for non-automatable controls
- Gemara Layer 4 compatibility
- CI gating support
- Phased delivery aligned with quarterly milestones
- ORBIT ecosystem positioning (complement PVTR, don't duplicate)

Highlights Spencer's review notes as numbered open questions (OQ-1
through OQ-4):
- OQ-1: Attestation identity model (OIDC? tokens? workflows?)
- OQ-2: Enforcement detection vs. being an enforcement tool
- OQ-3: scan_scope field usefulness in output schema
- OQ-4: Evidence should be probe-based only, not check-based

Renames spec subdirectory from pvtr-baseline to osps-conformance.

Signed-off-by: Stephen Augustus <foo@auggie.dev>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
- Add openspec/specs/core-checks/spec.md and openspec/specs/probes/spec.md
  documenting existing Scorecard architecture for spec-driven development
- Update .gitignore to exclude roadmap drafting notes

Signed-off-by: Stephen Augustus <foo@auggie.dev>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Stephen's responses to clarifying questions (CQ-1 through CQ-8) and
feedback on the proposal draft:

- Both scoring and conformance modes coexist; no deprecation needed now
- Target OSPS Baseline v2026.02.19 (latest), align with maintenance cadence
- Provide degraded-but-useful evaluation without Security Insights
- Invest in Gemara SDK integration for multi-tool consumption
- Prioritize Level 1 conformance; consume external signals where possible
- Approval requires Stephen + Spencer + 1 non-Steering maintainer
- Q2 outcome should be OSPS Baseline Level 1 conformance
- Land capabilities across all surfaces (CLI, Action, API)

Key changes requested:
- Correct PVTR references (it's the Privateer plugin, not a separate tool)
- Add Darnit and AMPEL comparison
- Replace quarterly timelines with phase-based outcomes
- Plan to extract Scorecard's control catalog for other tools
- Use Mermaid for diagrams
- Create separate OSPS Baseline coverage analysis in docs/
- Create docs/ROADMAP.md for public consumption

Signed-off-by: Stephen Augustus <foo@auggie.dev>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Changes based on Stephen's review:

- Replace all "PVTR" references with "Privateer plugin for GitHub
  repositories" — it's the Privateer plugin, not a separate tool
- Add ecosystem tooling comparison section covering Darnit (compliance
  audit + remediation), AMPEL (attestation-based policy enforcement),
  Privateer plugin (Baseline evaluation), and Scorecard (measurement)
- Replace quarterly timeline (Q1-Q4) with phase-based delivery
  (Phase 1-3) focused on outcomes, not calendar dates
- Update OSPS Baseline version from v2025-10-10 to v2026.02.19
- Convert ASCII ecosystem diagram to Mermaid
- Add Scorecard control catalog extraction to scope
- Add Gemara SDK integration to scope
- Update coverage snapshot to reference docs/osps-baseline-coverage.md
  (to be created with fresh analysis)
- Add approval process section based on governance answers
- Update Security Insights requirement to degraded-but-useful mode
- Add integration pipeline diagram (Scorecard -> Darnit -> AMPEL)

Signed-off-by: Stephen Augustus <foo@auggie.dev>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Create docs/osps-baseline-coverage.md with a control-by-control analysis
of Scorecard's current probe coverage against OSPS Baseline v2026.02.19.
Coverage summary: 8 COVERED, 17 PARTIAL, 31 GAP, 3 NOT_OBSERVABLE across
59 controls.

Create docs/ROADMAP.md with a publicly-consumable 2026 roadmap organized
into three phases: conformance foundation + Level 1, release integrity +
Level 2, and enforcement detection + Level 3.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
…h CQ-12

Remove reference to docs/roadmap-ideas.md from the coverage analysis
document since it is not committed to the repo.

Add four new clarifying questions to the proposal: NOT_OBSERVABLE
controls in Phase 1 (CQ-9), mapping file ownership (CQ-10), OSPS
output schema stability guarantees (CQ-11), and Phase 1 probe gap
prioritization (CQ-12).

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Replace \n with <br/> in Mermaid node labels so line breaks render
correctly in GitHub's Mermaid renderer.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Replace remaining "Darn" references with "Darnit" throughout the
proposal. Add Minder to the ecosystem comparison table, integration
diagram, and "What Scorecard must not do" section. Minder is an
OpenSSF Sandbox project in the ORBIT WG that consumes Scorecard
findings for policy enforcement and auto-remediation.

Add CQ-13 (Minder integration surface) and CQ-14 (Darnit vs. Minder
delineation) as new clarifying questions.

Update docs/ROADMAP.md ecosystem alignment to include Minder.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Add a new section to docs/osps-baseline-coverage.md listing existing
Scorecard issues and PRs that are directly relevant to closing OSPS
Baseline coverage gaps, including:
- ossf#2305 / ossf#2479 (Security Insights)
- #30 (secrets scanning)
- ossf#1476 / ossf#2605 (SBOM)
- ossf#4824 (changelog)
- ossf#2465 (private vulnerability reporting)
- ossf#4080 / ossf#4823 / ossf#2684 / ossf#1417 (signed releases)
- ossf#2142 (threat model)
- ossf#4723 (Minder/Rego integration, closed)

Add CQ-15 asking whether existing issues should be adopted as Phase 1
work items or whether new issues should reference them.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Remove openspec system specs (core-checks, platform-clients, probes)
that were scaffolding for documenting existing Scorecard architecture.
These are not part of the OSPS conformance proposal and can be
recreated if needed.

Remove docs/roadmap-ideas.md from .gitignore.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
@justaugustus justaugustus changed the title [WIP] OSPS Baseline roadmap [WIP] 📖 OSPS Baseline roadmap Feb 27, 2026
justaugustus and others added 2 commits February 27, 2026 07:26
Add Allstar (Scorecard sub-project) to the ecosystem comparison table,
integration flow diagram, and ORBIT ecosystem diagram. Allstar
continuously monitors GitHub orgs and enforces Scorecard checks as
policies with auto-remediation, and already enforces controls aligned
with OSPS Baseline (branch protection, security policy, binary
artifacts, dangerous workflows).

Add Allstar to "Existing Scorecard surfaces that matter" section and
to docs/ROADMAP.md ecosystem alignment.

Add CQ-16 asking whether Allstar should be an explicit Phase 1
consumer of OSPS conformance output, and whether it is considered
part of the enforcement boundary Scorecard does not cross.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
Update the commit guidelines to make the -s flag requirement
unambiguous. Add a complete commit message format example showing
how to combine the HEREDOC pattern with -s for DCO sign-off and
the Co-Authored-By trailer.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
@eddie-knight
Contributor

eddie-knight commented Feb 27, 2026

Hey @justaugustus, thanks for leading this collaboration! Looking forward to hammering this out.

Some things to clarify:

  1. Regarding mappings between Baseline catalog<->Scorecard checks, it is possible to easily put that into a new file with Scorecard maintainers as codeowners, pending approval on that proposal from baseline maintainers.
  2. There is not an "OSPS output format," and even the relevant Gemara schemas (which are quite opinionated) are still designed to support output in multiple formats within the SDK, such as SARIF. I would expect that you'd keep your current output logic, and then maybe add basic Gemara JSON/YAML as another option.
  3. There is a stated goal of not duplicating the code from the plugin ossf/pvtr-github-repo-scanner, but the implementation plan as it's currently written does require duplication. In the current proposal, there would not be a technical relationship between the two codebases.
  4. There is cursory mention of a scorecard catalog extraction, which I'm hugely in favor of, but I don't see an implementation plan for that.

An alternative plan would be for us to spend a week consolidating checks/probes into the pvtr plugin (with relevant CODEOWNERS), then update Scorecard to selectively execute the plugin under the covers.

This would allow us to:

  • Extract the Scorecard control catalog for independent versioning and still easily connect it to the code
  • Instantly integrate Gemara to scorecard
  • Allow Scorecard to optionally run the existing Baseline checks from Scorecard
  • Allow LFX Insights and other pvtr users to optionally run Scorecard checks from Privateer
  • Simplify contribution overhead for each individual scorecard check
  • And also improve the quality of both codebases through shared logic

Member Author

@justaugustus justaugustus left a comment

Reviewed this in today's Steering meeting and I think this is moving in the right direction, based on the initial feedback.

@spencerschrock and @jeffmendoza intend to leave reviews.


@GeauxJD @SecurityCRob — for awareness.

@justaugustus justaugustus moved this to In Progress in OpenSSF Scorecard Mar 2, 2026
@justaugustus justaugustus moved this from In Progress to Review in progress in OpenSSF Scorecard Mar 2, 2026
@justaugustus justaugustus changed the title [WIP] 📖 OSPS Baseline roadmap 📖 OSPS Baseline conformance proposal and 2026 roadmap Mar 2, 2026
@justaugustus justaugustus changed the title 📖 OSPS Baseline conformance proposal and 2026 roadmap 📖 2026 roadmap and OSPS Baseline conformance proposal Mar 2, 2026
Member

@puerco puerco left a comment

Amazing, two nits and two comments but it LGTM.


- **Duplicate policy enforcement or remediation.** Downstream tools — [Privateer](https://github.com/ossf/pvtr-github-repo-scanner), [Minder](https://github.com/mindersec/minder), [AMPEL](https://github.com/carabiner-dev/ampel), [Darnit](https://github.com/kusari-oss/darnit), and others — consume Scorecard evidence through published output formats. Scorecard *produces* findings and attestations; downstream tools enforce, remediate, and audit.
- **Privilege any downstream consumer.** All tools consume Scorecard output on equal terms. No tool has a special integration relationship.
- **Turn OSPS controls into Scorecard checks.** OSPS conformance is a layer that consumes existing Scorecard signals, not 59 new checks.
Member

This point is key. Baseline looks for outcomes. Compliance can be supported by Scorecard probe data.

The baseline control can be a 1:1 map to a probe's data, other times it will be a composite set of probes. If you add new probes to look for something new that's useful to test a baseline control, we just need to add another composition definition to say OSPS-XX-XXX can be [probe X] or [probe set 1] or [probe set 2].

This is akin to the way checks work now, but by generalizing it, the probe data can inform other framework testing tools, beyond baseline.

Member Author

This is akin to the way checks work now, but by generalizing it, the probe data can inform other framework testing tools, beyond baseline.

Agreed. The current compositions of probes make up the "Scorecard checks":

```go
// SecurityPolicy is all the probes for the
// SecurityPolicy check.
SecurityPolicy = []ProbeImpl{
	securityPolicyPresent.Run,
	securityPolicyContainsLinks.Run,
	securityPolicyContainsVulnerabilityDisclosure.Run,
	securityPolicyContainsText.Run,
}
// DependencyToolUpdates is all the probes for the
// DependencyUpdateTool check.
DependencyToolUpdates = []ProbeImpl{
	dependencyUpdateToolConfigured.Run,
}
Fuzzing = []ProbeImpl{
	fuzzed.Run,
}
Packaging = []ProbeImpl{
	packagedWithAutomatedWorkflow.Run,
}
License = []ProbeImpl{
	hasLicenseFile.Run,
	hasFSFOrOSIApprovedLicense.Run,
}
Contributors = []ProbeImpl{
	contributorsFromOrgOrCompany.Run,
}
Vulnerabilities = []ProbeImpl{
	hasOSVVulnerabilities.Run,
}
CodeReview = []ProbeImpl{
	codeApproved.Run,
}
SAST = []ProbeImpl{
	sastToolConfigured.Run,
	sastToolRunsOnAllCommits.Run,
}
DangerousWorkflows = []ProbeImpl{
	hasDangerousWorkflowScriptInjection.Run,
	hasDangerousWorkflowUntrustedCheckout.Run,
}
Maintained = []ProbeImpl{
	archived.Run,
	hasRecentCommits.Run,
	issueActivityByProjectMember.Run,
	createdRecently.Run,
}
CIIBestPractices = []ProbeImpl{
	hasOpenSSFBadge.Run,
}
BinaryArtifacts = []ProbeImpl{
	hasUnverifiedBinaryArtifacts.Run,
}
Webhook = []ProbeImpl{
	webhooksUseSecrets.Run,
}
CITests = []ProbeImpl{
	testsRunInCI.Run,
}
SBOM = []ProbeImpl{
	hasSBOM.Run,
	hasReleaseSBOM.Run,
}
SignedReleases = []ProbeImpl{
	releasesAreSigned.Run,
	releasesHaveProvenance.Run,
}
BranchProtection = []ProbeImpl{
	blocksDeleteOnBranches.Run,
	blocksForcePushOnBranches.Run,
	branchesAreProtected.Run,
	branchProtectionAppliesToAdmins.Run,
	dismissesStaleReviews.Run,
	requiresApproversForPullRequests.Run,
	requiresCodeOwnersReview.Run,
	requiresLastPushApproval.Run,
	requiresUpToDateBranches.Run,
	runsStatusChecksBeforeMerging.Run,
	requiresPRsToChangeCode.Run,
}
PinnedDependencies = []ProbeImpl{
	pinsDependencies.Run,
}
TokenPermissions = []ProbeImpl{
	hasNoGitHubWorkflowPermissionUnknown.Run,
	jobLevelPermissions.Run,
	topLevelPermissions.Run,
}
```

"probe sets" or "compositions" seems like the right way to approach it without introducing too much additional vocabulary (or layers of complexity).

Comment on lines +165 to +167
4. Evaluation logic is self-contained — Scorecard can produce conformance
results using its own probes and mappings, independent of external
evaluation engines
Member

I'm assuming conformance here means "framework compliance".

This is cool, but also ensure that Scorecard's view of the world can be used at the check and probe level to enable projects and organizations to evaluate adherence to other frameworks. Especially useful for internal/unpublished variants (profiles) of frameworks that organizations define.

Member Author

Agreed — the conformance engine should support arbitrary frameworks and organizational profiles. The probe findings are framework-agnostic by design; OSPS Baseline is just the first (non-"checks") evaluation layer over them.

The same probe evidence can be composed differently for other frameworks, as you suggested above.
We'll make this explicit in the proposal.

- In-toto predicates (SVR first; track [Baseline Predicate PR #502](https://github.com/in-toto/attestation/pull/502))
- Gemara output (transitive dependency via security-baseline)
- OSCAL Assessment Results (using [go-oscal](https://github.com/defenseunicorns/go-oscal))
- Existing Scorecard predicate type (`scorecard.dev/result/v0.1`) preserved; new predicate types added as options
Member

The current predicate type is the full scorecard run evaluation. For completeness' sake, it would be nice to have one type for a list of check evaluations and one for probe evaluations.

These are only useful, though, if they have more data than what an SVR has to offer, so I would wait until there is an actual need for them.

Member Author

@justaugustus justaugustus Mar 3, 2026

Right. Probe-level findings are available via --format=probe but have no in-toto wrapper today. Agree that dedicated predicate types for check and probe evaluations are worth considering once there's a concrete need beyond what SVR provides.

Maybe we assume that you want a check-based or probe-based predicate type depending on the run and output options the user provides?

This might suggest the need for a --framework or --evaluation option?

5. **Applicability engine** — detects preconditions (e.g., "has made a release") and outputs NOT_APPLICABLE
6. **Metadata ingestion layer** — supports Security Insights as one source among several for metadata-dependent controls (OSPS-BR-03.01, BR-03.02, QA-04.01). Architecture invites contributions for alternative sources (SBOMs, VEX, platform APIs). No single metadata file is required for meaningful results.
7. **Attestation mechanism (v1)** — accepts repo-local metadata for non-automatable controls (pending OQ-1 resolution)
8. **Scorecard control catalog extraction** — plan and mechanism to make Scorecard's control definitions consumable by other tools
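The applicability idea in item 5 can be sketched as follows; `Project`, `evaluateReleaseControl`, and the precondition field are illustrative stand-ins, not proposed API:

```go
package main

import "fmt"

// Project carries a hypothetical precondition signal used by the
// applicability sketch ("has made a release").
type Project struct {
	HasReleases bool
}

// evaluateReleaseControl returns NOT_APPLICABLE when the precondition
// fails, so the project is not penalized for a control that cannot apply.
func evaluateReleaseControl(p Project, probePassed bool) string {
	if !p.HasReleases {
		return "NOT_APPLICABLE"
	}
	if probePassed {
		return "PASS"
	}
	return "FAIL"
}

func main() {
	fmt.Println(evaluateReleaseControl(Project{HasReleases: false}, false)) // prints "NOT_APPLICABLE"
	fmt.Println(evaluateReleaseControl(Project{HasReleases: true}, true))   // prints "PASS"
}
```

The design point is ordering: the applicability check runs before any probe evidence is interpreted, which is what distinguishes NOT_APPLICABLE from UNKNOWN.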
Member

From reading the proposal, wouldn't Scorecard rather become a consumer of control catalogs?

Member Author

Both directions — Scorecard would consume the OSPS Baseline catalog (via security-baseline) for [one type of] conformance evaluation, and Scorecard's own probe definitions (probes/*/def.yml) are already machine-readable YAML with structured metadata.

The "extraction plan" is about packaging those existing definitions for consumption, so that other parts of the Scorecard codebase, or external tools like AMPEL, can discover what Scorecard evaluates and compose mappings against it, if needed.

Add framework-agnostic conformance language, probe composition model
(1:1 and many-to-1 mappings), bidirectional catalog framing, and
future design concepts (framework CLI option, probe-level predicate
type). Log feedback and responses in decisions.md.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
@FelixOliverLange FelixOliverLange left a comment

Really cool to see in general, as it may also help make output more understandable for users (e.g. pass/fail statements). From my perspective, it would be important to preserve the valuable insights generated by Scorecard, which is also why I have a question on how less-mapped checks feed back into the ecosystem. But of course, I'm also only a user of Scorecard.

- Multi-repo project-level conformance aggregation
- Attestation integration GA

### Ecosystem alignment


When it comes to ecosystem alignment, will existent Scorecard checks like "Maintained" also feed back into other parts of the ecosystem (as right now, "Maintained" is mainly referenced as precondition)? Checks like "Maintained" can help users of open-source projects to e.g. identify abandoned projects, so would be nice to preserve in a prominent manner.

Member Author

But of course, I'm also only a user of Scorecard.

@FelixOliverLange — User feedback is important to let us know whether we're building the right thing. I appreciate you taking the time to comment! 🚀

Existing checks (like Maintained) would be fully preserved; check scores (0-10) and conformance labels (PASS/FAIL/UNKNOWN) are parallel evaluation layers over the same probe evidence. (See the three-tier evaluation model in the proposal.)

Checks that don't map to OSPS Baseline controls would continue to produce scores as they do today.
The Maintained check's probes also serve double duty: the conformance layer uses them as preconditions
(via the applicability engine) and as evidence toward maintenance-related controls.

No existing check would be deprioritized or removed.

Add Scorecard v6 framing with "Why v6" section, single-run
architectural constraint, confidence scoring future concept, and
Scorecard user feedback section (FL-1 through FL-4) from community
meeting.

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Stephen Augustus <foo@auggie.dev>
@justaugustus justaugustus changed the title 📖 2026 roadmap and OSPS Baseline conformance proposal 📖 Scorecard v6: OSPS Baseline conformance proposal and 2026 roadmap Mar 6, 2026
@SecurityCRob

I am VERY supportive of this effort. Thanks for taking the time to think this through, and thanks to all the contributors for the thoughtful and respectful conversation! scorecard++


### Ecosystem alignment

Scorecard operates within the ORBIT WG ecosystem as an evidence engine. All
Contributor

@AdamKorcz AdamKorcz Mar 6, 2026

Why does Scorecard need to operate within the ORBIT WG ecosystem? Perhaps add a bit about what the ORBIT WG ecosystem is - that may clear it up.

## Summary

**Mission:** Scorecard produces trusted, structured security evidence for the
open source ecosystem. _(Full MVVSR to be developed as a follow-up deliverable
Contributor

Suggestion: Expand MVVSR here.

proves this architecture, and the central initiative for Scorecard's 2026
roadmap.

This is fundamentally a **product-level shift** — the defining change for
Contributor

A bit of a general question: what are the reasons for adding framework conformance to Scorecard itself, instead of having a standalone tool that consumes Scorecard findings and gives a verdict about framework conformance?

### What Scorecard SHOULD NOT do

Scorecard SHOULD NOT (per [RFC 2119](https://www.rfc-editor.org/rfc/rfc2119))
duplicate evaluation that downstream tools handle. There may be scenarios where
Contributor

Would be good with a definition of downstream tools here.


## Architecture

### Processing model
Contributor

Is this ("Processing model") the current dataflow and the following section "Three-tier evaluation model" the intended?

Member

@spencerschrock spencerschrock left a comment

As a whole, I support the evidence-based focus. I like the 6 listed design principles.
I didn't quite get a chance to go through all of the decisions.md file, as it's quite a lot of feedback to parse through.

Comment on lines +205 to +207
Draft PR that attempted to run Minder Rego rules within Scorecard,
including OSPS-QA-05.01 and QA-03.01. Closed due to inactivity but
demonstrates interest in deeper Minder/Scorecard integration.
Member

Is this going to drift into policy/enforcement? Does this conflict with our goal of evidence only?

Comment on lines +35 to +36
| OSPS-AC-01.01 | MFA for sensitive resources | NOT_OBSERVABLE | None | Requires org admin API access; Scorecard tokens typically lack this. Must be UNKNOWN unless org-admin token is provided. |
| OSPS-AC-02.01 | Least-privilege defaults for new collaborators | NOT_OBSERVABLE | None | Requires org-level permission visibility. Must be UNKNOWN. |
**Member:** I'd say this could be observable if it just needs the right token. If this is run in the context of an OSPO self-observation, I think it's fine.
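The token-dependent observability discussed above can be sketched in Go. This is a minimal illustration, not Scorecard's actual API: the function name, parameters, and label strings are assumptions. The point is that org-level controls such as OSPS-AC-01.01 yield UNKNOWN unless an org-admin token grants the needed visibility:

```go
package main

import "fmt"

// EvaluateOrgControl is a hypothetical sketch of evaluating a control
// that requires org-admin API visibility (e.g. MFA enforcement).
// Without the required token scope, the signal is not observable,
// so the only honest label is UNKNOWN.
func EvaluateOrgControl(hasAdminScope, mfaEnforced bool) string {
	if !hasAdminScope {
		return "UNKNOWN" // signal not observable with this token
	}
	if mfaEnforced {
		return "PASS"
	}
	return "FAIL"
}

func main() {
	fmt.Println(EvaluateOrgControl(false, true)) // prints: UNKNOWN
	fmt.Println(EvaluateOrgControl(true, false)) // prints: FAIL
}
```

Under this reading, an OSPO self-assessment run with an org-admin token flips these controls from UNKNOWN to a real PASS/FAIL, without changing the evaluation logic.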

Comment on lines +18 to +19
Check scores (0-10) and conformance labels (PASS/FAIL/UNKNOWN) are parallel
evaluation layers over the same probe evidence, produced in a single run.
**Member:** To be clear, in this situation does the "evaluation" or "conformance" layer just mean an output format?
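One way to read "parallel evaluation layers" is that both outputs derive from the same findings in one run, rather than one being a reformat of the other. Below is a minimal Go sketch under that assumption; the types, function names, and probe names are illustrative, not Scorecard's actual API:

```go
package main

import "fmt"

// ProbeFinding is a hypothetical, simplified stand-in for a probe result.
type ProbeFinding struct {
	Probe  string
	Passed bool
	Known  bool // false when the signal could not be observed
}

// Score derives a 0-10 check score from probe findings (existing layer).
func Score(findings []ProbeFinding) int {
	if len(findings) == 0 {
		return 0
	}
	passed := 0
	for _, f := range findings {
		if f.Passed {
			passed++
		}
	}
	return passed * 10 / len(findings)
}

// Label derives a conformance label from the same findings (new layer).
func Label(findings []ProbeFinding) string {
	for _, f := range findings {
		if !f.Known {
			return "UNKNOWN"
		}
	}
	for _, f := range findings {
		if !f.Passed {
			return "FAIL"
		}
	}
	return "PASS"
}

func main() {
	findings := []ProbeFinding{
		{Probe: "probeA", Passed: true, Known: true},
		{Probe: "probeB", Passed: false, Known: true},
	}
	fmt.Println(Score(findings), Label(findings)) // prints: 5 FAIL
}
```

In this reading, conformance is a second aggregation function over the evidence, not just an output format: a project can score reasonably well while still failing a strict PASS/FAIL control.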

Comment on lines +49 to +51
- Two-layer mapping model for OSPS Baseline v2026.02.19:
- Check-level relations contributed upstream to security-baseline
- Probe-level mappings maintained in Scorecard
**Member:** What's the value in upstreaming check-level relations? I think mapping probes to Baseline controls is fine.

- Secrets detection — consuming platform signals where available
- Metadata ingestion layer — Security Insights as first supported source;
architecture supports additional metadata sources
- Scorecard control catalog extraction plan
**Member:** Extracting to where?


The current version of the OSPS Baseline is [v2026.02.19](https://baseline.openssf.org/versions/2026-02-19).

We should initially align with the latest version and define a process for adopting new versions on a set cadence. We should understand the [OSPS Baseline maintenance process](https://baseline.openssf.org/maintenance.html) and align with it.

**Member:** What sort of maintenance toil will this involve? Do existing controls get updated, or are just new ones added?

Comment on lines +202 to +203

We need to land these capabilities for as much surface area as possible.
**Member:** I would say the cron has additional barriers: the cost of writing/serving more data. I have no concerns with the action.

Comment on lines +217 to +218

The versioned mapping file (e.g., `pkg/osps/mappings/v2026-02-19.yaml`) is a critical artifact that defines which probes satisfy which OSPS controls. Who should own this file? Options:
**Member:** Which repo does this mapping file live in?
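To make the ownership question concrete, here is one possible shape for the versioned mapping file quoted above (`pkg/osps/mappings/v2026-02-19.yaml`). All field names, the `satisfies` vocabulary, and the probe names are assumptions for illustration; the proposal leaves the actual schema open:

```yaml
# Hypothetical shape for pkg/osps/mappings/v2026-02-19.yaml.
# Field names and values are illustrative, not the actual schema.
baselineVersion: v2026.02.19
controls:
  - id: OSPS-QA-05.01
    probes:
      - probe: testsRunInCI
        satisfies: full      # probe alone can justify PASS/FAIL
  - id: OSPS-AC-01.01
    probes: []               # no observable probe; label is always UNKNOWN
```

Whichever repo owns this file effectively owns the conformance semantics for a Baseline version, which is why the ownership question matters.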

Comment on lines +204 to +205
### Design principles

**Member:** I agree with these principles.

Comment on lines +1 to +3
# AGENTS.md

This file provides guidance for AI coding agents working on the OpenSSF Scorecard project.
**Member:** This seems unrelated to this change, other than that I assume you used it to generate some of these docs?

acceptable data dependency for control definitions (see Scope).

**Flexibility:** Under this structure, scaling back to a fully independent model
(Option A) remains straightforward — deprioritize or drop specific output
**Contributor:** Where is Option A?

applicability engine all live in Scorecard)
3. Interoperability is purely at the output layer — Gemara, in-toto, SARIF,
OSCAL are presentation formats, not architectural dependencies
4. Evaluation logic is self-contained — Scorecard can produce conformance
**Contributor:** I'm not sure this is entirely correct. Currently, I wouldn't say that Scorecard can produce conformance results. But perhaps I'm misunderstanding the context of "constraints" here: are these current constraints, or constraints that should exist once the conformance layer is in Scorecard?

AMPEL's role), perform compliance auditing and remediation (Darnit's role), or
guarantee compliance with any regulatory framework.

## Success criteria
**Contributor:** It would be nice to make this more explicit: are these the success criteria for the proposal or for the implementation?

**@justaugustus** (Member, author) left a comment:

@spencerschrock @AdamKorcz — thanks so much for the feedback!

I'm working on some changes locally and will push them up by tomorrow.

a22311072



**Labels:** size:XXL (This PR changes 1000+ lines, ignoring generated files.)

**Project status:** Review in progress

**Participants:** 9