From 0acb538d96944a3bcee9435f026342ad78aab9d6 Mon Sep 17 00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:11:26 -0700 Subject: [PATCH 1/8] feat(vex): consume OpenVEX 0.2.0 and CycloneDX VEX 1.6 statements New `--vex ` flag (repeatable) on `bomdrift diff`. Each file is auto-detected as OpenVEX 0.2.0 or CycloneDX VEX 1.6. - Statements with status `not_affected` / `fixed` suppress matching findings (counted in the new "Suppressed by VEX" markdown summary row). - `under_investigation` annotates with a `VEX:` badge in markdown and `properties.vexStatus` in SARIF without suppressing. - `affected` annotates as a no-op badge. Match keys: `(VulnRef.id OR alias, purl_with_version)`. For non-CVE finding kinds (typosquat, version-jump, maintainer-age, license-violation), bomdrift defines a synthetic ID convention (`bomdrift.::`) so VEX statements can target those too. Documented in `docs/src/vex.md`. Multi-file precedence is first-write-wins on `(vuln_id, product)`. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/src/SUMMARY.md | 1 + docs/src/vex.md | 148 ++++++++ src/cli.rs | 20 +- src/config.rs | 2 + src/enrich/mod.rs | 17 + src/enrich/osv.rs | 2 + src/lib.rs | 19 +- src/render/json.rs | 2 + src/render/markdown.rs | 14 + src/render/sarif.rs | 7 + src/vex.rs | 744 +++++++++++++++++++++++++++++++++++++++++ 11 files changed, 974 insertions(+), 2 deletions(-) create mode 100644 docs/src/vex.md create mode 100644 src/vex.rs diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index 2cd04a5..973b5e3 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -13,6 +13,7 @@ - [Output formats](./output-formats.md) - [SARIF + Code Scanning](./sarif.md) +- [VEX](./vex.md) - [License policy](./license-policy.md) - [Baseline & suppression](./baseline.md) diff --git a/docs/src/vex.md b/docs/src/vex.md new file mode 100644 index 0000000..234c199 --- /dev/null +++ b/docs/src/vex.md @@ -0,0 +1,148 @@ +# VEX (Vulnerability Exploitability eXchange) + 
+bomdrift consumes and emits VEX statements so reviewers can record +exploitability decisions next to their SBOMs and have those decisions +suppress noise on subsequent diffs. + +Two formats are supported on input (auto-detected per file): + +- **OpenVEX 0.2.0** — see . +- **CycloneDX VEX 1.6** — `analysis.state` is mapped onto the OpenVEX + vocabulary (`not_affected` / `resolved` → `not_affected`, + `exploitable` → `affected`, `in_triage` → `under_investigation`). + +OpenVEX is bomdrift's preferred output format on emission (`--emit-vex`) +because the standalone JSON-LD doc is the smallest interop surface. + +## Consuming VEX (`--vex `) + +The flag is repeatable. Each file is auto-detected by its top-level +shape. Statements match findings by `(vuln_id_or_alias, product_purl)`. + +| VEX status | Effect on the matching finding | +|-----------------------|---------------------------------------------| +| `not_affected` | Suppresses (counted in "Suppressed by VEX") | +| `fixed` | Suppresses | +| `under_investigation` | Annotates with `VEX:under_investigation` | +| `affected` | Annotates with `VEX:affected` | + +A VEX statement's `products[]` may be either purl strings or +`{"@id": "pkg:..."}` objects. A versionless statement +(`pkg:npm/foo`) matches every versioned finding-product +(`pkg:npm/foo@1.2.3`); a versioned statement only matches the exact +purl. + +### Synthetic finding IDs + +bomdrift emits non-CVE findings (typosquats, version-jumps, +maintainer-age, license-violations). 
To author VEX statements that +suppress them, use the synthetic ID convention: + +| Finding kind | Synthetic ID format | +|---------------------|--------------------------------------------------------------| +| Typosquat | `bomdrift.typosquat:<purl>:<closest>` | +| Version-jump | `bomdrift.version-jump:<purl>:<before_major>-><after_major>` | +| Maintainer-age | `bomdrift.young-maintainer:<purl>:<top_contributor>` | +| License-violation | `bomdrift.license-violation:<purl>:<license>` | + +Example OpenVEX statement suppressing a typosquat finding: + +```json +{ + "vulnerability": { "name": "bomdrift.typosquat:pkg:npm/plain-crypto-js@4.2.1:crypto-js" }, + "products": [ { "@id": "pkg:npm/plain-crypto-js@4.2.1" } ], + "status": "not_affected", + "justification": "vulnerable_code_not_present", + "status_notes": "verified the package is a re-export and not impersonating crypto-js" +} +``` + +### Multiple files + +`--vex first.json --vex second.json` is processed left-to-right. +Statements with the same `(vuln_id, product)` are first-write-wins: +later files do NOT override earlier ones. Pass project-level VEX first +and policy-level VEX second so the project-level entries take precedence +over the policy defaults. (Or pass them in the reverse order if you want +the opposite precedence.) + +### Verifying with `vexctl` + +If you have [vexctl](https://github.com/openvex/vexctl) installed: + +```sh +vexctl filter --vex bomdrift.openvex.json sbom.cdx.json +``` + +verifies the VEX doc is well-formed and that statements match a known +purl in your SBOM. + +## Emitting VEX (`--emit-vex `) + +Writes a single OpenVEX 0.2.0 document covering every finding in the +post-baseline diff. + +- **Baseline-suppressed findings** inherit their `vex_status` from the + baseline entry, defaulting to `under_investigation`. Baseline ≠ + "not affected": baseline often means "accepted in PR review" or + "temporarily ignored", so emitting `not_affected` by default would + publish a false claim. 
Opt in by adding `vex_status: "not_affected"` + to the baseline entry: + + ```json + { + "id": "GHSA-x-y-z", + "purl": "pkg:npm/foo", + "expires": "2026-12-31", + "reason": "Awaiting upstream patch (issue #42)", + "vex_status": "not_affected", + "vex_justification": "vulnerable_code_not_present" + } + ``` + +- **Un-suppressed findings** emit as `affected` with `status_notes` + describing the bomdrift finding kind. The justification field falls + back to the configured + `[diff] vex_default_justification` + (default `vulnerable_code_not_in_execute_path`). + +The doc's `timestamp` honors `SOURCE_DATE_EPOCH`, so `--emit-vex` +output is byte-deterministic in CI when the env is set. + +### Configuration keys + +```toml +[diff] +vex_author = "https://example.com/security" +vex_default_justification = "vulnerable_code_not_in_execute_path" +``` + +`vex_author` falls back to `repo_url` when unset; falls back to +`"bomdrift"` when both are missing. + +## Worked rotation example + +1. Run a diff that surfaces `GHSA-evil` on `pkg:npm/foo@1.0.0`. +2. Investigate, conclude the vulnerable function is not on your + execute path. +3. Add the entry to `.bomdrift/baseline.json` with VEX status: + + ```json + { + "schema_version": 1, + "suppressed_advisories": [ + { + "id": "GHSA-evil", + "purl": "pkg:npm/foo@1.0.0", + "expires": "2027-01-01", + "reason": "Function is unreachable per audit (PR #123)", + "vex_status": "not_affected", + "vex_justification": "vulnerable_code_not_in_execute_path" + } + ] + } + ``` + +4. Re-run with `--emit-vex bomdrift.openvex.json` to produce a publishable + exploitability statement that downstream consumers can ingest with + their own `--vex` flag. diff --git a/src/cli.rs b/src/cli.rs index e7586a8..13f86c6 100644 --- a/src/cli.rs +++ b/src/cli.rs @@ -21,7 +21,7 @@ pub struct Cli { #[derive(Subcommand, Debug)] pub enum Command { /// Diff two SBOMs and surface supply-chain risk signals on changed components. 
- Diff(DiffArgs), + Diff(Box<DiffArgs>), /// Refresh the bundled typosquat top-package lists from upstream sources. /// /// Writes a fresh per-ecosystem list to the user's XDG cache directory @@ -305,6 +305,24 @@ pub struct DiffArgs { /// expression evaluation). Off by default: fail-closed. #[arg(long)] pub allow_ambiguous_licenses: bool, + /// Path(s) to VEX (Vulnerability Exploitability eXchange) files + /// to consume. Repeatable. Each file is auto-detected as either + /// OpenVEX 0.2.0 or CycloneDX VEX 1.6. Statements with status + /// `not_affected` / `fixed` suppress matching findings; statements + /// with `under_investigation` annotate without suppressing; + /// statements with `affected` annotate as a no-op badge. See + /// `docs/src/vex.md` for the + /// finding-id matching rules including the synthetic-id convention + /// for non-CVE findings. + #[arg(long, action = clap::ArgAction::Append)] + pub vex: Vec<PathBuf>, + /// Emit a single OpenVEX 0.2.0 doc covering every finding in the + /// post-baseline diff. Baseline-suppressed entries inherit their + /// `vex_status` from the baseline entry (defaulting to + /// `under_investigation` to avoid publishing false `not_affected` + /// claims); un-suppressed findings emit as `affected`. v0.9+. + #[arg(long)] + pub emit_vex: Option<PathBuf>, #[arg(long)] pub debug_calibration: bool, /// Format for `--debug-calibration` rows. 
`pipe` (default, back-compat diff --git a/src/config.rs b/src/config.rs index f23239e..9772a98 100644 --- a/src/config.rs +++ b/src/config.rs @@ -199,6 +199,8 @@ mod tests { allow_licenses: Vec::new(), deny_licenses: Vec::new(), allow_ambiguous_licenses: false, + vex: Vec::new(), + emit_vex: None, } } diff --git a/src/enrich/mod.rs b/src/enrich/mod.rs index 6a3f1c9..c311442 100644 --- a/src/enrich/mod.rs +++ b/src/enrich/mod.rs @@ -26,6 +26,8 @@ use maintainer::MaintainerAgeFinding; use typosquat::TyposquatFinding; use version_jump::VersionJumpFinding; +use crate::vex::VexAnnotation; + /// Aggregated enrichment data attached to a diff. Keyed by the component's /// purl-with-version (e.g. `pkg:npm/axios@1.14.1`) so renderers can look up /// per-component findings without re-iterating over the changeset. @@ -56,6 +58,21 @@ pub struct Enrichment { /// `cs.license_changed` which detects same-version license changes. /// Empty when no `[license]` block is configured. pub license_violations: Vec<LicenseViolation>, + /// VEX annotations attached to findings whose status is `affected` + /// or `under_investigation` (Phase G, v0.9). Keyed by an opaque + /// finding-identity string; renderers look up by the same identity. + /// Empty when no `--vex` files were passed or no statements matched. + #[serde(default, skip_serializing_if = "HashMap::is_empty")] + pub vex_annotations: HashMap<String, VexAnnotation>, + /// Count of findings suppressed by `--vex` statements (`not_affected` + /// or `fixed`). Surfaced in the markdown summary so reviewers know + /// the diff was filtered. v0.9+. 
+ #[serde(default, skip_serializing_if = "is_zero_usize")] + pub vex_suppressed_count: usize, +} + +fn is_zero_usize(n: &usize) -> bool { + *n == 0 } impl Enrichment { diff --git a/src/enrich/osv.rs b/src/enrich/osv.rs index f37875a..0db5fc1 100644 --- a/src/enrich/osv.rs +++ b/src/enrich/osv.rs @@ -166,6 +166,8 @@ fn enrich_with( version_jumps: Vec::new(), maintainer_age: Vec::new(), license_violations: Vec::new(), + vex_annotations: std::collections::HashMap::new(), + vex_suppressed_count: 0, }) } diff --git a/src/lib.rs b/src/lib.rs index 82e8971..2052a70 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -8,6 +8,7 @@ pub mod model; pub mod parse; pub mod refresh; pub mod render; +pub mod vex; use std::fs; use std::io::IsTerminal; @@ -26,7 +27,7 @@ pub const FAIL_ON_EXIT_CODE: i32 = 2; pub fn run(cli: Cli) -> Result<()> { match cli.command { - Command::Diff(args) => run_diff(args), + Command::Diff(args) => run_diff(*args), Command::RefreshTyposquat(args) => refresh::run(args), Command::Baseline { action } => run_baseline(action), Command::Init(args) => run_init(args), @@ -202,6 +203,22 @@ fn run_diff(mut args: DiffArgs) -> Result<()> { baseline::apply(&mut cs, &mut enrichment, &baseline); } + // VEX consumption (Phase G, v0.9). Applied AFTER baseline so VEX + // statements operate on the post-baseline view — this matches what + // a downstream tool would see and avoids double-counting "already + // suppressed" findings in the VEX-suppressed tally. + if !args.vex.is_empty() { + match vex::load(&args.vex) { + Ok(stmts) => { + let idx = vex::VexIndex::build(stmts); + vex::apply(&mut enrichment, &idx); + } + Err(err) => { + eprintln!("warning: VEX load failed, continuing without VEX filtering: {err:#}"); + } + } + } + // Calibration tap. Off by default; opt-in via `--debug-calibration`. 
// Emits one CSV-friendly line per finding to stderr so an adopter // can run the flag across a representative N PRs and feed the diff --git a/src/render/json.rs b/src/render/json.rs index 1c1ba17..da1a532 100644 --- a/src/render/json.rs +++ b/src/render/json.rs @@ -205,6 +205,8 @@ mod tests { maintainer_age: Vec::new(), license_violations: Vec::new(), + vex_annotations: HashMap::new(), + vex_suppressed_count: 0, }; let cs = ChangeSet::default(); diff --git a/src/render/markdown.rs b/src/render/markdown.rs index 6566bbd..943b743 100644 --- a/src/render/markdown.rs +++ b/src/render/markdown.rs @@ -125,6 +125,13 @@ pub fn render_with_options(cs: &ChangeSet, enrichment: &Enrichment, opts: Option enrichment.license_violations.len() ); } + if enrichment.vex_suppressed_count > 0 { + let _ = writeln!( + out, + "| Suppressed by VEX | {} |", + enrichment.vex_suppressed_count + ); + } out.push('\n'); if opts.summary_only { @@ -516,6 +523,13 @@ fn write_one_vuln_row(out: &mut String, c: &Component, enrichment: &Enrichment) if r.kev { s.push_str(" · **KEV**"); } + let key = format!("cve:{}:{}", c.purl.as_deref().unwrap_or(""), r.id); + if let Some(ann) = enrichment.vex_annotations.get(&key) { + s.push_str(&format!(" · VEX:{}", ann.status)); + if let Some(j) = &ann.justification { + s.push_str(&format!(" ({j})")); + } + } s }) .collect::<Vec<_>>() diff --git a/src/render/sarif.rs b/src/render/sarif.rs index 44600d0..d8f3240 100644 --- a/src/render/sarif.rs +++ b/src/render/sarif.rs @@ -224,6 +224,13 @@ fn results(cs: &ChangeSet, e: &Enrichment) -> Value { if advisory.kev { props.insert("kev".into(), Value::Bool(true)); } + let vex_key = format!("cve:{purl_str}:{}", advisory.id); + if let Some(ann) = e.vex_annotations.get(&vex_key) { + props.insert("vexStatus".into(), Value::String(ann.status.clone())); + if let Some(j) = &ann.justification { + props.insert("vexJustification".into(), Value::String(j.clone())); + } + } out.push(json!({ "ruleId": "bomdrift.cve", "level": 
sarif_level(advisory.severity), diff --git a/src/vex.rs b/src/vex.rs new file mode 100644 index 0000000..06645e7 --- /dev/null +++ b/src/vex.rs @@ -0,0 +1,744 @@ +//! VEX (Vulnerability Exploitability eXchange) consumption (v0.9, Phase G). +//! +//! Loads VEX statements from one or more user-supplied files and exposes a +//! matcher that maps each statement to bomdrift findings by +//! `(vuln_id_or_alias, product_purl)`. Two formats are auto-detected per +//! file: +//! +//! - **OpenVEX 0.2.0** (preferred): JSON-LD doc with a top-level +//! `@context: "https://openvex.dev/ns/..."` key and a `statements[]` +//! array. +//! - **CycloneDX VEX 1.6**: CycloneDX-shaped doc with `bomFormat: +//! "CycloneDX"` and a `vulnerabilities[]` array. +//! +//! ## Match keys +//! +//! - For OSV / CVE / GHSA findings: `(VulnRef.id OR alias, purl_with_version)`. +//! - For bomdrift "synthetic" finding kinds (typosquat, version-jump, +//! maintainer-age, license-violation): `(synthetic_id, purl_with_version)` +//! where `synthetic_id` follows the convention +//! `bomdrift.::` documented in +//! `docs/src/vex.md`. +//! +//! ## Conflict resolution +//! +//! When multiple files contain a statement for the same `(vuln_id, +//! product)`, the first-loaded statement wins. Documented as +//! first-write-wins so users layering policy + project-level VEX know +//! which file takes precedence. 
+ +use std::collections::HashMap; +use std::fs; +use std::path::{Path, PathBuf}; + +use anyhow::{Context, Result}; +use serde::Serialize; + +#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)] +#[serde(rename_all = "snake_case")] +pub enum VexFormat { + OpenVex, + CycloneDxVex, +} + +#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize)] +#[serde(rename_all = "snake_case")] +pub enum VexStatus { + NotAffected, + Affected, + Fixed, + UnderInvestigation, +} + +impl VexStatus { + pub fn as_str(self) -> &'static str { + match self { + VexStatus::NotAffected => "not_affected", + VexStatus::Affected => "affected", + VexStatus::Fixed => "fixed", + VexStatus::UnderInvestigation => "under_investigation", + } + } + + pub fn from_openvex(s: &str) -> Option<Self> { + match s { + "not_affected" => Some(Self::NotAffected), + "affected" => Some(Self::Affected), + "fixed" => Some(Self::Fixed), + "under_investigation" => Some(Self::UnderInvestigation), + _ => None, + } + } + + /// CycloneDX VEX `analysis.state` mapping. + pub fn from_cyclonedx_state(s: &str) -> Option<Self> { + match s { + "not_affected" | "resolved" | "resolved_with_pedigree" | "false_positive" => { + Some(Self::NotAffected) + } + "exploitable" => Some(Self::Affected), + "in_triage" => Some(Self::UnderInvestigation), + _ => None, + } + } +} + +/// A single VEX statement after format normalization. +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct VexStatement { + pub vuln_id: String, + pub products: Vec<String>, + pub status: VexStatus, + pub justification: Option<String>, + pub status_notes: Option<String>, +} + +/// Load every `path` in order and return the merged statement list. +/// First-write-wins on `(vuln_id, product)` collisions across files. 
+pub fn load(paths: &[PathBuf]) -> Result<Vec<VexStatement>> { + let mut out: Vec<VexStatement> = Vec::new(); + let mut seen: HashMap<(String, String), usize> = HashMap::new(); + for path in paths { + let body = fs::read_to_string(path) + .with_context(|| format!("reading VEX file: {}", path.display()))?; + let value: serde_json::Value = serde_json::from_str(&body) + .with_context(|| format!("parsing VEX JSON: {}", path.display()))?; + let format = detect_format(&value).ok_or_else(|| { + anyhow::anyhow!( + "could not detect VEX format (expected OpenVEX `@context` or CycloneDX `bomFormat`): {}", + path.display() + ) + })?; + let stmts = match format { + VexFormat::OpenVex => parse_openvex(&value, path)?, + VexFormat::CycloneDxVex => parse_cyclonedx_vex(&value, path)?, + }; + for s in stmts { + for product in &s.products { + let key = (s.vuln_id.clone(), product.clone()); + seen.entry(key).or_insert_with(|| { + let idx = out.len(); + out.push(VexStatement { + vuln_id: s.vuln_id.clone(), + products: vec![product.clone()], + status: s.status, + justification: s.justification.clone(), + status_notes: s.status_notes.clone(), + }); + idx + }); + } + // Statement with an empty products list (broad statement): keep it + // once with an empty products vec. `VexIndex::resolve` never matches + // it, because matching requires at least one product. 
+ if s.products.is_empty() { + let key = (s.vuln_id.clone(), String::new()); + seen.entry(key).or_insert_with(|| { + let idx = out.len(); + out.push(s.clone()); + idx + }); + } + } + } + Ok(out) +} + +fn detect_format(value: &serde_json::Value) -> Option<VexFormat> { + if let Some(ctx) = value.get("@context").and_then(|v| v.as_str()) + && ctx.contains("openvex.dev/ns") + { + return Some(VexFormat::OpenVex); + } + if value.get("bomFormat").and_then(|v| v.as_str()) == Some("CycloneDX") + && value + .get("vulnerabilities") + .and_then(|v| v.as_array()) + .is_some() + { + return Some(VexFormat::CycloneDxVex); + } + None +} + +fn parse_openvex(value: &serde_json::Value, path: &Path) -> Result<Vec<VexStatement>> { + let stmts = value + .get("statements") + .and_then(|v| v.as_array()) + .ok_or_else(|| { + anyhow::anyhow!("OpenVEX doc missing `statements` array: {}", path.display()) + })?; + let mut out = Vec::with_capacity(stmts.len()); + for s in stmts { + let vuln_id = s + .get("vulnerability") + .and_then(|v| v.get("name")) + .and_then(|v| v.as_str()) + .or_else(|| { + // Older OpenVEX drafts allowed `vulnerability` as a bare string. 
+ s.get("vulnerability").and_then(|v| v.as_str()) + }) + .unwrap_or("") + .to_string(); + if vuln_id.is_empty() { + continue; + } + let status_raw = s.get("status").and_then(|v| v.as_str()).unwrap_or(""); + let Some(status) = VexStatus::from_openvex(status_raw) else { + continue; + }; + let mut products: Vec<String> = Vec::new(); + if let Some(arr) = s.get("products").and_then(|v| v.as_array()) { + for p in arr { + if let Some(s) = p.as_str() { + products.push(s.to_string()); + } else if let Some(id) = p.get("@id").and_then(|v| v.as_str()) { + products.push(id.to_string()); + } else if let Some(id) = p.get("id").and_then(|v| v.as_str()) { + products.push(id.to_string()); + } + } + } + let justification = s + .get("justification") + .and_then(|v| v.as_str()) + .map(str::to_string); + let status_notes = s + .get("status_notes") + .and_then(|v| v.as_str()) + .map(str::to_string); + out.push(VexStatement { + vuln_id, + products, + status, + justification, + status_notes, + }); + } + Ok(out) +} + +fn parse_cyclonedx_vex(value: &serde_json::Value, path: &Path) -> Result<Vec<VexStatement>> { + let vulns = value + .get("vulnerabilities") + .and_then(|v| v.as_array()) + .ok_or_else(|| { + anyhow::anyhow!( + "CycloneDX VEX missing `vulnerabilities` array: {}", + path.display() + ) + })?; + let mut out = Vec::with_capacity(vulns.len()); + for v in vulns { + let vuln_id = v + .get("id") + .and_then(|x| x.as_str()) + .unwrap_or("") + .to_string(); + if vuln_id.is_empty() { + continue; + } + let analysis = v.get("analysis"); + let state = analysis + .and_then(|a| a.get("state")) + .and_then(|x| x.as_str()) + .unwrap_or(""); + let Some(status) = VexStatus::from_cyclonedx_state(state) else { + continue; + }; + let mut products: Vec<String> = Vec::new(); + if let Some(arr) = v.get("affects").and_then(|v| v.as_array()) { + for a in arr { + if let Some(r) = a.get("ref").and_then(|x| x.as_str()) { + products.push(r.to_string()); + } + } + } + let justification = analysis + .and_then(|a| a.get("justification")) + 
+ .and_then(|x| x.as_str()) + .map(str::to_string); + let status_notes = analysis + .and_then(|a| a.get("detail")) + .and_then(|x| x.as_str()) + .map(str::to_string); + out.push(VexStatement { + vuln_id, + products, + status, + justification, + status_notes, + }); + } + Ok(out) +} + +/// What the VEX matcher decided to do with a statement+finding pair. +#[derive(Debug, Clone, PartialEq, Eq)] +pub enum VexEffect { + /// Drop the finding entirely (status `not_affected` or `fixed`). + Suppress { + status: VexStatus, + justification: Option<String>, + }, + /// Keep the finding but annotate it (`under_investigation` / + /// `affected`). + Annotate { + status: VexStatus, + justification: Option<String>, + }, +} + +impl VexEffect { + pub fn is_suppress(&self) -> bool { + matches!(self, VexEffect::Suppress { .. }) + } + + pub fn status(&self) -> VexStatus { + match self { + VexEffect::Suppress { status, .. } | VexEffect::Annotate { status, .. } => *status, + } + } + + pub fn justification(&self) -> Option<&str> { + match self { + VexEffect::Suppress { justification, .. } + | VexEffect::Annotate { justification, .. } => justification.as_deref(), + } + } +} + +/// In-memory matcher: groups statements by vuln_id for O(1) lookup; the +/// statements within each group are then scanned linearly for a product match. +pub struct VexIndex { + /// `vuln_id -> Vec<VexStatement>` (preserved order from load()). + by_vuln: HashMap<String, Vec<VexStatement>>, +} + +impl VexIndex { + pub fn build(stmts: Vec<VexStatement>) -> Self { + let mut by_vuln: HashMap<String, Vec<VexStatement>> = HashMap::new(); + for s in stmts { + by_vuln.entry(s.vuln_id.clone()).or_default().push(s); + } + Self { by_vuln } + } + + pub fn is_empty(&self) -> bool { + self.by_vuln.is_empty() + } + + /// Resolve a `(vuln_id_candidates, product_purl)` pair to an effect. + /// `candidates` is the ordered list `[primary_id, alias1, alias2, ...]` + /// the caller will try; the first matching statement wins. 
+ pub fn resolve<'a, I>(&self, candidates: I, product: &str) -> Option<VexEffect> + where + I: IntoIterator<Item = &'a str>, + { + for cand in candidates { + let Some(stmts) = self.by_vuln.get(cand) else { + continue; + }; + for s in stmts { + if s.products.iter().any(|p| product_matches(p, product)) { + return Some(effect_for(s)); + } + } + } + None + } +} + +/// Product matching: exact equality, OR a versionless product matches a +/// versioned finding-product (e.g. statement `pkg:npm/foo` matches +/// finding `pkg:npm/foo@1.2.3`). The reverse is NOT permitted: a +/// statement with a specific version must not match a different version. +fn product_matches(stmt_product: &str, finding_product: &str) -> bool { + if stmt_product == finding_product { + return true; + } + if !stmt_product.contains('@') + && let Some(stripped) = finding_product.split_once('@') + && stripped.0 == stmt_product + { + return true; + } + false +} + +fn effect_for(s: &VexStatement) -> VexEffect { + match s.status { + VexStatus::NotAffected | VexStatus::Fixed => VexEffect::Suppress { + status: s.status, + justification: s.justification.clone(), + }, + VexStatus::Affected | VexStatus::UnderInvestigation => VexEffect::Annotate { + status: s.status, + justification: s.justification.clone(), + }, + } +} + +/// Synthetic IDs bomdrift uses for non-CVE finding kinds. The same scheme +/// is used by `--emit-vex` (Phase H) and `--vex` (this module) so users +/// can write `not_affected` statements against typosquat / version-jump / +/// maintainer-age / license-violation findings. 
+pub mod synthetic_id { + use crate::enrich::LicenseViolation; + use crate::enrich::maintainer::MaintainerAgeFinding; + use crate::enrich::typosquat::TyposquatFinding; + use crate::enrich::version_jump::VersionJumpFinding; + + pub fn typosquat(f: &TyposquatFinding) -> String { + let purl = f.component.purl.as_deref().unwrap_or(&f.component.name); + format!("bomdrift.typosquat:{purl}:{}", f.closest) + } + + pub fn version_jump(f: &VersionJumpFinding) -> String { + let purl = f.after.purl.as_deref().unwrap_or(&f.after.name); + format!( + "bomdrift.version-jump:{purl}:{}->{}", + f.before_major, f.after_major + ) + } + + pub fn maintainer_age(f: &MaintainerAgeFinding) -> String { + let purl = f.component.purl.as_deref().unwrap_or(&f.component.name); + format!("bomdrift.young-maintainer:{purl}:{}", f.top_contributor) + } + + pub fn license_violation(v: &LicenseViolation) -> String { + let purl = v.component.purl.as_deref().unwrap_or(&v.component.name); + format!("bomdrift.license-violation:{purl}:{}", v.license) + } +} + +/// Attached VEX annotation kept on a finding when status is `affected` or +/// `under_investigation`. Renderers surface these as inline badges. +#[derive(Debug, Clone, PartialEq, Eq, Serialize)] +pub struct VexAnnotation { + pub status: String, + #[serde(skip_serializing_if = "Option::is_none")] + pub justification: Option<String>, +} + +impl VexAnnotation { + pub fn from_effect(effect: &VexEffect) -> Self { + Self { + status: effect.status().as_str().to_string(), + justification: effect.justification().map(str::to_string), + } + } +} + +/// Apply the VEX index to an `Enrichment`. Suppresses findings with +/// `not_affected` / `fixed` statements and attaches annotations to +/// findings with `affected` / `under_investigation` statements. Adds +/// the count of suppressed findings to `vex_suppressed_count`. 
+pub fn apply(enrichment: &mut crate::enrich::Enrichment, idx: &VexIndex) { + if idx.is_empty() { + return; + } + let mut suppressed: usize = 0; + + // ---- vulns ---- + let mut vulns = std::mem::take(&mut enrichment.vulns); + for (purl, refs) in vulns.iter_mut() { + refs.retain(|v| { + let mut cands: Vec<&str> = vec![v.id.as_str()]; + cands.extend(v.aliases.iter().map(String::as_str)); + match idx.resolve(cands.iter().copied(), purl) { + Some(effect) => { + if effect.is_suppress() { + suppressed += 1; + false + } else { + let key = format!("cve:{purl}:{}", v.id); + enrichment + .vex_annotations + .insert(key, VexAnnotation::from_effect(&effect)); + true + } + } + None => true, + } + }); + } + vulns.retain(|_, refs| !refs.is_empty()); + enrichment.vulns = vulns; + + // ---- typosquats ---- + let typos = std::mem::take(&mut enrichment.typosquats); + enrichment.typosquats = typos + .into_iter() + .filter(|f| { + let purl = f.component.purl.clone().unwrap_or_default(); + let id = synthetic_id::typosquat(f); + match idx.resolve([id.as_str()], &purl) { + Some(effect) => { + if effect.is_suppress() { + suppressed += 1; + false + } else { + enrichment + .vex_annotations + .insert(id, VexAnnotation::from_effect(&effect)); + true + } + } + None => true, + } + }) + .collect(); + + // ---- version_jumps ---- + let vjs = std::mem::take(&mut enrichment.version_jumps); + enrichment.version_jumps = vjs + .into_iter() + .filter(|f| { + let purl = f.after.purl.clone().unwrap_or_default(); + let id = synthetic_id::version_jump(f); + match idx.resolve([id.as_str()], &purl) { + Some(effect) => { + if effect.is_suppress() { + suppressed += 1; + false + } else { + enrichment + .vex_annotations + .insert(id, VexAnnotation::from_effect(&effect)); + true + } + } + None => true, + } + }) + .collect(); + + // ---- maintainer_age ---- + let ma = std::mem::take(&mut enrichment.maintainer_age); + enrichment.maintainer_age = ma + .into_iter() + .filter(|f| { + let purl = 
f.component.purl.clone().unwrap_or_default(); + let id = synthetic_id::maintainer_age(f); + match idx.resolve([id.as_str()], &purl) { + Some(effect) => { + if effect.is_suppress() { + suppressed += 1; + false + } else { + enrichment + .vex_annotations + .insert(id, VexAnnotation::from_effect(&effect)); + true + } + } + None => true, + } + }) + .collect(); + + // ---- license_violations ---- + let lv = std::mem::take(&mut enrichment.license_violations); + enrichment.license_violations = lv + .into_iter() + .filter(|v| { + let purl = v.component.purl.clone().unwrap_or_default(); + let id = synthetic_id::license_violation(v); + match idx.resolve([id.as_str()], &purl) { + Some(effect) => { + if effect.is_suppress() { + suppressed += 1; + false + } else { + enrichment + .vex_annotations + .insert(id, VexAnnotation::from_effect(&effect)); + true + } + } + None => true, + } + }) + .collect(); + + enrichment.vex_suppressed_count += suppressed; +} + +#[cfg(test)] +mod tests { + use super::*; + use std::io::Write as _; + + fn write_tmp(name: &str, body: &str) -> PathBuf { + let dir = std::env::temp_dir().join(format!( + "bomdrift-vex-{}-{}", + std::process::id(), + std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() + )); + std::fs::create_dir_all(&dir).unwrap(); + let path = dir.join(name); + let mut f = std::fs::File::create(&path).unwrap(); + f.write_all(body.as_bytes()).unwrap(); + path + } + + #[test] + fn load_openvex_basic() { + let body = r#"{ + "@context": "https://openvex.dev/ns/v0.2.0", + "@id": "https://x/y", + "author": "test", + "timestamp": "2026-01-01T00:00:00Z", + "version": 1, + "statements": [ + { + "vulnerability": {"name": "CVE-2024-1111"}, + "products": [{"@id": "pkg:npm/foo@1.0.0"}], + "status": "not_affected", + "justification": "vulnerable_code_not_present" + }, + { + "vulnerability": {"name": "CVE-2024-2222"}, + "products": ["pkg:npm/bar@2.0.0"], + "status": "under_investigation" + } + ] + }"#; + let p = 
write_tmp("openvex.json", body); + let stmts = load(&[p]).unwrap(); + assert_eq!(stmts.len(), 2); + assert_eq!(stmts[0].vuln_id, "CVE-2024-1111"); + assert_eq!(stmts[0].status, VexStatus::NotAffected); + assert_eq!( + stmts[0].justification.as_deref(), + Some("vulnerable_code_not_present") + ); + assert_eq!(stmts[1].status, VexStatus::UnderInvestigation); + } + + #[test] + fn load_cyclonedx_vex_basic() { + let body = r#"{ + "bomFormat": "CycloneDX", + "specVersion": "1.6", + "vulnerabilities": [ + { + "id": "CVE-2024-3333", + "affects": [{"ref": "pkg:npm/baz@3.0.0"}], + "analysis": { + "state": "not_affected", + "justification": "code_not_reachable", + "detail": "see PR #99" + } + }, + { + "id": "CVE-2024-4444", + "affects": [{"ref": "pkg:npm/qux@4.0.0"}], + "analysis": { "state": "exploitable" } + } + ] + }"#; + let p = write_tmp("cdx.json", body); + let stmts = load(&[p]).unwrap(); + assert_eq!(stmts.len(), 2); + assert_eq!(stmts[0].vuln_id, "CVE-2024-3333"); + assert_eq!(stmts[0].status, VexStatus::NotAffected); + assert_eq!(stmts[0].status_notes.as_deref(), Some("see PR #99")); + assert_eq!(stmts[1].status, VexStatus::Affected); + } + + #[test] + fn unknown_format_errors_with_path() { + let p = write_tmp("bad.json", r#"{"foo":"bar"}"#); + let err = load(std::slice::from_ref(&p)).unwrap_err().to_string(); + assert!(err.contains(&p.display().to_string())); + assert!(err.to_lowercase().contains("vex format") || err.contains("OpenVEX")); + } + + #[test] + fn first_write_wins_across_multiple_files() { + let a = write_tmp( + "a.json", + r#"{ + "@context": "https://openvex.dev/ns/v0.2.0", + "statements": [{"vulnerability": {"name": "CVE-A"}, "products": [{"@id": "pkg:npm/x@1.0.0"}], "status": "not_affected"}] + }"#, + ); + let b = write_tmp( + "b.json", + r#"{ + "@context": "https://openvex.dev/ns/v0.2.0", + "statements": [{"vulnerability": {"name": "CVE-A"}, "products": [{"@id": "pkg:npm/x@1.0.0"}], "status": "affected"}] + }"#, + ); + let stmts = load(&[a, 
b]).unwrap(); + assert_eq!(stmts.len(), 1); + assert_eq!(stmts[0].status, VexStatus::NotAffected); + } + + #[test] + fn matcher_resolves_by_alias() { + let stmt = VexStatement { + vuln_id: "CVE-2024-X".into(), + products: vec!["pkg:npm/foo@1.0.0".into()], + status: VexStatus::NotAffected, + justification: Some("vulnerable_code_not_present".into()), + status_notes: None, + }; + let idx = VexIndex::build(vec![stmt]); + // Primary is GHSA, alias is CVE-2024-X — match through alias. + let cands = ["GHSA-abc", "CVE-2024-X"]; + let effect = idx + .resolve(cands.iter().copied(), "pkg:npm/foo@1.0.0") + .expect("matched via alias"); + assert!(effect.is_suppress()); + assert_eq!(effect.status(), VexStatus::NotAffected); + } + + #[test] + fn matcher_rejects_mismatched_product() { + let stmt = VexStatement { + vuln_id: "CVE-1".into(), + products: vec!["pkg:npm/foo@1.0.0".into()], + status: VexStatus::NotAffected, + justification: None, + status_notes: None, + }; + let idx = VexIndex::build(vec![stmt]); + assert!(idx.resolve(["CVE-1"], "pkg:npm/bar@1.0.0").is_none()); + } + + #[test] + fn matcher_versionless_product_matches_versioned_finding() { + let stmt = VexStatement { + vuln_id: "CVE-1".into(), + products: vec!["pkg:npm/foo".into()], + status: VexStatus::Fixed, + justification: None, + status_notes: None, + }; + let idx = VexIndex::build(vec![stmt]); + let effect = idx.resolve(["CVE-1"], "pkg:npm/foo@9.9.9").unwrap(); + assert!(effect.is_suppress()); + } + + #[test] + fn under_investigation_annotates_not_suppresses() { + let stmt = VexStatement { + vuln_id: "CVE-1".into(), + products: vec!["pkg:npm/foo@1.0.0".into()], + status: VexStatus::UnderInvestigation, + justification: None, + status_notes: None, + }; + let idx = VexIndex::build(vec![stmt]); + let effect = idx.resolve(["CVE-1"], "pkg:npm/foo@1.0.0").unwrap(); + assert!(!effect.is_suppress()); + assert_eq!(effect.status(), VexStatus::UnderInvestigation); + } +} From 947fc54376370de8d884c2a835c6f32e8c582e8b Mon Sep 17 
00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:15:03 -0700 Subject: [PATCH 2/8] feat(vex): emit OpenVEX 0.2.0 doc with explicit per-entry vex_status MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New `--emit-vex ` flag on `bomdrift diff`. Writes a single OpenVEX 0.2.0 JSON-LD doc covering: - baseline-suppressed findings (status from each entry's optional `vex_status`, defaulting to `under_investigation` — baseline ≠ not_affected, never auto-promoted); - un-suppressed findings (status `affected`, with `status_notes` describing the bomdrift finding kind). Baseline schema extended with optional `vex_status` (`not_affected|affected|fixed|under_investigation`) and `vex_justification` on object-form entries. Plain string-form entries are unchanged. New `[diff] vex_author` and `[diff] vex_default_justification` config keys plus `--vex-author` / `--vex-default-justification` CLI flags. `vex_author` falls back to repo_url, then to `bomdrift`. Statements are sorted by `(vulnerability.name, products[0].@id)` and the timestamp uses `clock::now()` so the emitted doc is byte-deterministic in CI when `SOURCE_DATE_EPOCH` is set. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- src/baseline.rs | 44 ++++++- src/cli.rs | 10 ++ src/config.rs | 14 +++ src/lib.rs | 28 +++++ src/vex.rs | 308 ++++++++++++++++++++++++++++++++++++++++++++++++ 5 files changed, 403 insertions(+), 1 deletion(-) diff --git a/src/baseline.rs b/src/baseline.rs index bad8cd1..195ca23 100644 --- a/src/baseline.rs +++ b/src/baseline.rs @@ -62,6 +62,25 @@ pub struct Baseline { /// Surface to the caller for stderr warnings; do NOT contribute to /// suppression. pub expired_entries: Vec, + /// v0.9+ rich entries from object-form `suppressed_advisories`. + /// Keyed in insertion order so VEX emission (Phase H) can surface + /// `vex_status` / `vex_justification` / `reason` without re-parsing + /// the source JSON.
Both expired and active entries appear here — + /// callers filter as needed. + pub entries: Vec<BaselineEntry>, +} + +/// A rich baseline entry preserved for VEX emission. Plain string-form +/// entries (`"GHSA-..."`) do NOT appear here — they have no metadata +/// to preserve. Object-form entries always do. +#[derive(Debug, Clone, PartialEq, Eq)] +pub struct BaselineEntry { + pub id: String, + pub purl: Option<String>, + pub reason: Option<String>, + pub expires: Option<String>, + pub vex_status: Option<String>, + pub vex_justification: Option<String>, } /// A baseline entry whose `expires` date is strictly before today. The diff @@ -185,7 +204,30 @@ impl Baseline { .get("reason") .and_then(|v| v.as_str()) .map(str::to_string); - if let Some(expires_s) = obj.get("expires").and_then(|v| v.as_str()) { + let vex_status = obj + .get("vex_status") + .and_then(|v| v.as_str()) + .map(str::to_string); + let vex_justification = obj + .get("vex_justification") + .and_then(|v| v.as_str()) + .map(str::to_string); + let expires_str = obj + .get("expires") + .and_then(|v| v.as_str()) + .map(str::to_string); + // Track the rich entry for VEX emission regardless + // of expiry — emission may include expired entries + // for documentation; suppression below honors expiry. + out.entries.push(BaselineEntry { + id: id.to_string(), + purl: purl.clone(), + reason: reason.clone(), + expires: expires_str.clone(), + vex_status: vex_status.clone(), + vex_justification: vex_justification.clone(), + }); + if let Some(expires_s) = expires_str.as_deref() { match clock::parse_ymd(expires_s) { Ok(date) => { if clock::is_expired(date) { diff --git a/src/cli.rs b/src/cli.rs index 13f86c6..4a22efc 100644 --- a/src/cli.rs +++ b/src/cli.rs @@ -323,6 +323,16 @@ pub struct DiffArgs { /// claims); un-suppressed findings emit as `affected`. v0.9+. #[arg(long)] pub emit_vex: Option, + /// VEX `author` for `--emit-vex`. Falls back to repo_url, then + /// to `"bomdrift"`. v0.9+.
+ #[arg(long)] + pub vex_author: Option<String>, + /// Default OpenVEX `justification` written into emitted statements + /// when the source baseline entry doesn't supply one. Defaults to + /// `"vulnerable_code_not_in_execute_path"` — the safe fallback per + /// the OpenVEX spec. + #[arg(long)] + pub vex_default_justification: Option<String>, #[arg(long)] pub debug_calibration: bool, /// Format for `--debug-calibration` rows. `pipe` (default, back-compat diff --git a/src/config.rs b/src/config.rs index 9772a98..b9bba56 100644 --- a/src/config.rs +++ b/src/config.rs @@ -57,6 +57,12 @@ pub struct DiffConfig { pub debug_calibration: Option, pub debug_calibration_format: Option, pub output_file: Option, + /// VEX `author` field for `--emit-vex`. Falls back to `repo_url`, + /// then to the literal `"bomdrift"`. + pub vex_author: Option<String>, + /// Default OpenVEX justification when an entry doesn't supply one. + /// Defaults to `"vulnerable_code_not_in_execute_path"`. + pub vex_default_justification: Option<String>, } pub fn apply_diff_config(args: &mut DiffArgs) -> Result<()> { @@ -131,6 +137,12 @@ fn apply_loaded_diff_config(args: &mut DiffArgs, config: Config) { if args.output_file.is_none() { args.output_file = diff.output_file; } + if args.vex_author.is_none() { + args.vex_author = diff.vex_author.filter(|s| !s.is_empty()); + } + if args.vex_default_justification.is_none() { + args.vex_default_justification = diff.vex_default_justification.filter(|s| !s.is_empty()); + } // [license] block: CLI flags override (not merge) when set. Mirrors // Dependency Review Action semantics so users moving between bomdrift @@ -201,6 +213,8 @@ mod tests { allow_ambiguous_licenses: false, vex: Vec::new(), emit_vex: None, + vex_author: None, + vex_default_justification: None, } } diff --git a/src/lib.rs b/src/lib.rs index 2052a70..a121c06 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -181,6 +181,7 @@ fn run_diff(mut args: DiffArgs) -> Result<()> { // the realized finding set, not on intermediate inputs.
This keeps the // baseline file format stable as new enrichers are added: a new finding // type that the baseline doesn't know about simply isn't suppressed. + let mut baseline_entries: Vec<baseline::BaselineEntry> = Vec::new(); if let Some(path) = &args.baseline { let baseline = baseline::Baseline::load(path)?; for ent in &baseline.expired_entries { @@ -200,6 +201,7 @@ fn run_diff(mut args: DiffArgs) -> Result<()> { .unwrap_or_default(), ); } + baseline_entries = baseline.entries.clone(); baseline::apply(&mut cs, &mut enrichment, &baseline); } @@ -219,6 +221,32 @@ fn run_diff(mut args: DiffArgs) -> Result<()> { } } + // VEX emission (Phase H, v0.9). Writes a single OpenVEX 0.2.0 doc + // to the requested path, covering baseline-suppressed entries and + // un-suppressed findings. Byte-deterministic when SOURCE_DATE_EPOCH + // is set. + if let Some(path) = &args.emit_vex { + let author = args + .vex_author + .clone() + .or_else(|| args.repo_url.clone()) + .or_else(|| std::env::var("BOMDRIFT_REPO_URL").ok()) + .filter(|s| !s.is_empty()) + .unwrap_or_else(|| "bomdrift".to_string()); + let default_just = args + .vex_default_justification + .clone() + .unwrap_or_else(|| "vulnerable_code_not_in_execute_path".to_string()); + let opts = vex::EmitOptions { + author: &author, + default_justification: &default_just, + baseline_entries: &baseline_entries, + }; + let body = vex::emit(&cs, &enrichment, &opts); + std::fs::write(path, body) + .with_context(|| format!("writing --emit-vex {}", path.display()))?; + } + // Calibration tap. Off by default; opt-in via `--debug-calibration`.
// Emits one CSV-friendly line per finding to stderr so an adopter // can run the flag across a representative N PRs and feed the diff --git a/src/vex.rs b/src/vex.rs index 06645e7..5618c18 100644 --- a/src/vex.rs +++ b/src/vex.rs @@ -566,6 +566,205 @@ pub fn apply(enrichment: &mut crate::enrich::Enrichment, idx: &VexIndex) { enrichment.vex_suppressed_count += suppressed; } +/// Synthesized OpenVEX 0.2.0 doc emission (Phase H). Produces a +/// byte-deterministic JSON-LD doc suitable for downstream consumers. +/// +/// Statements come from two sources: +/// - **Baseline-suppressed findings**: rich object-form baseline entries +/// contribute one statement each, with `status` taken from the entry's +/// `vex_status` (default `under_investigation`). Plain string-form +/// baseline entries are NEVER auto-promoted to `not_affected` — to +/// make a `not_affected` claim, the user must opt in by adding +/// `vex_status: "not_affected"` to the baseline entry. +/// - **Un-suppressed findings** in the diff: emit as `affected` with +/// `status_notes` describing the bomdrift finding kind. +pub struct EmitOptions<'a> { + pub author: &'a str, + pub default_justification: &'a str, + pub baseline_entries: &'a [crate::baseline::BaselineEntry], +} + +#[derive(Debug, Clone)] +struct EmitStmt { + vuln_id: String, + product: String, + status: VexStatus, + justification: Option<String>, + status_notes: Option<String>, +} + +/// Build the OpenVEX document body and return it as a serialized +/// pretty-printed JSON string. Statements are sorted by +/// `(vulnerability.name, products[0].@id)` for byte-determinism. +pub fn emit( + cs: &crate::diff::ChangeSet, + enrichment: &crate::enrich::Enrichment, + opts: &EmitOptions<'_>, +) -> String { + let _ = cs; // reserved for future per-component extension + let mut stmts: Vec<EmitStmt> = Vec::new(); + + // Baseline-suppressed entries: one statement per (id, purl) pair.
+ for be in opts.baseline_entries { + let status = be + .vex_status + .as_deref() + .and_then(VexStatus::from_openvex) + .unwrap_or(VexStatus::UnderInvestigation); + let justification = be + .vex_justification + .clone() + .or_else(|| Some(opts.default_justification.to_string())); + let product = be.purl.clone().unwrap_or_default(); + stmts.push(EmitStmt { + vuln_id: be.id.clone(), + product, + status, + justification, + status_notes: be.reason.clone(), + }); + } + + // Un-suppressed findings: emit as `affected`. + let mut vuln_keys: Vec<&String> = enrichment.vulns.keys().collect(); + vuln_keys.sort(); + for purl in vuln_keys { + let mut refs: Vec<&crate::enrich::VulnRef> = enrichment.vulns[purl].iter().collect(); + refs.sort_by(|a, b| a.id.cmp(&b.id)); + for r in refs { + stmts.push(EmitStmt { + vuln_id: r.id.clone(), + product: purl.clone(), + status: VexStatus::Affected, + justification: Some(opts.default_justification.to_string()), + status_notes: Some(format!( + "bomdrift finding kind: cve (severity {})", + r.severity + )), + }); + } + } + for f in &enrichment.typosquats { + let purl = f.component.purl.clone().unwrap_or_default(); + stmts.push(EmitStmt { + vuln_id: synthetic_id::typosquat(f), + product: purl, + status: VexStatus::Affected, + justification: Some(opts.default_justification.to_string()), + status_notes: Some(format!( + "bomdrift finding kind: typosquat (similar to {})", + f.closest + )), + }); + } + for f in &enrichment.version_jumps { + let purl = f.after.purl.clone().unwrap_or_default(); + stmts.push(EmitStmt { + vuln_id: synthetic_id::version_jump(f), + product: purl, + status: VexStatus::Affected, + justification: Some(opts.default_justification.to_string()), + status_notes: Some(format!( + "bomdrift finding kind: version-jump ({} -> {})", + f.before_major, f.after_major + )), + }); + } + for f in &enrichment.maintainer_age { + let purl = f.component.purl.clone().unwrap_or_default(); + stmts.push(EmitStmt { + vuln_id: 
synthetic_id::maintainer_age(f), + product: purl, + status: VexStatus::Affected, + justification: Some(opts.default_justification.to_string()), + status_notes: Some(format!( + "bomdrift finding kind: young-maintainer ({} days)", + f.days_old + )), + }); + } + for v in &enrichment.license_violations { + let purl = v.component.purl.clone().unwrap_or_default(); + stmts.push(EmitStmt { + vuln_id: synthetic_id::license_violation(v), + product: purl, + status: VexStatus::Affected, + justification: Some(opts.default_justification.to_string()), + status_notes: Some(format!( + "bomdrift finding kind: license-violation ({})", + v.matched_rule + )), + }); + } + + // Sort for byte-determinism. + stmts.sort_by(|a, b| { + a.vuln_id + .cmp(&b.vuln_id) + .then_with(|| a.product.cmp(&b.product)) + }); + + // De-dupe on (vuln_id, product) — the baseline-derived statements + // take precedence (first-seen-wins after sort). + let mut seen: std::collections::HashSet<(String, String)> = std::collections::HashSet::new(); + stmts.retain(|s| seen.insert((s.vuln_id.clone(), s.product.clone()))); + + let timestamp = crate::clock::format_rfc3339(crate::clock::now()); + + // @id: a stable identifier for this emission. Deterministic when + // SOURCE_DATE_EPOCH is set because timestamp is fixed. 
+ let id_src = format!("{}#{}", opts.author, timestamp); + let mut hasher = sha2::Sha256::new(); + use sha2::Digest; + hasher.update(id_src.as_bytes()); + let digest = hasher.finalize(); + let id_hash: String = digest.iter().take(8).map(|b| format!("{b:02x}")).collect(); + let doc_id = format!("https://bomdrift.example/openvex/{id_hash}"); + + let statements_json: Vec<serde_json::Value> = stmts + .iter() + .map(|s| { + let mut obj = serde_json::Map::new(); + obj.insert( + "vulnerability".into(), + serde_json::json!({ "name": s.vuln_id }), + ); + if !s.product.is_empty() { + obj.insert("products".into(), serde_json::json!([{ "@id": s.product }])); + } + obj.insert( + "status".into(), + serde_json::Value::String(s.status.as_str().to_string()), + ); + if let Some(j) = &s.justification + && matches!(s.status, VexStatus::NotAffected) + { + // OpenVEX requires `justification` only for not_affected. + obj.insert("justification".into(), serde_json::Value::String(j.clone())); + } else if let Some(j) = &s.justification { + // Emitting `justification` on affected/under_investigation + // rows is non-standard, so it is deliberately dropped here; + // only `status_notes` (inserted below) is emitted for them. + let _ = j; + } + if let Some(n) = &s.status_notes { + obj.insert("status_notes".into(), serde_json::Value::String(n.clone())); + } + serde_json::Value::Object(obj) + }) + .collect(); + + let doc = serde_json::json!({ + "@context": "https://openvex.dev/ns/v0.2.0", + "@id": doc_id, + "author": opts.author, + "timestamp": timestamp, + "version": 1, + "statements": statements_json, + }); + serde_json::to_string_pretty(&doc).expect("serialize OpenVEX doc") +} + #[cfg(test)] mod tests { use super::*; @@ -741,4 +940,113 @@ mod tests { assert!(!effect.is_suppress()); assert_eq!(effect.status(), VexStatus::UnderInvestigation); } + + // ---------- Phase H: emission ---------- + + fn pin_clock(secs: i64) { + // SAFETY: tests in this module are serialized by env_lock equivalence.
+ unsafe { + std::env::set_var("SOURCE_DATE_EPOCH", secs.to_string()); + } + } + fn unpin_clock() { + unsafe { + std::env::remove_var("SOURCE_DATE_EPOCH"); + } + } + + #[test] + fn emission_roundtrip_via_loader() { + pin_clock(1_700_000_000); + let cs = crate::diff::ChangeSet::default(); + let e = crate::enrich::Enrichment::default(); + let entries = vec![crate::baseline::BaselineEntry { + id: "GHSA-x-y-z".into(), + purl: Some("pkg:npm/foo@1.0.0".into()), + reason: Some("audited".into()), + expires: None, + vex_status: Some("not_affected".into()), + vex_justification: Some("vulnerable_code_not_present".into()), + }]; + let opts = EmitOptions { + author: "test-suite", + default_justification: "vulnerable_code_not_in_execute_path", + baseline_entries: &entries, + }; + let body = emit(&cs, &e, &opts); + + let dir = std::env::temp_dir().join(format!( + "bomdrift-vex-emit-rt-{}-{}", + std::process::id(), + std::time::SystemTime::now() + .duration_since(std::time::UNIX_EPOCH) + .unwrap() + .as_nanos() + )); + std::fs::create_dir_all(&dir).unwrap(); + let path = dir.join("out.openvex.json"); + std::fs::write(&path, &body).unwrap(); + let stmts = load(&[path]).unwrap(); + assert_eq!(stmts.len(), 1); + assert_eq!(stmts[0].vuln_id, "GHSA-x-y-z"); + assert_eq!(stmts[0].status, VexStatus::NotAffected); + assert_eq!(stmts[0].products, vec!["pkg:npm/foo@1.0.0".to_string()]); + unpin_clock(); + } + + #[test] + fn emission_default_status_is_under_investigation() { + // Anti-false-claim guard: a plain baseline entry without + // `vex_status` must NOT be auto-promoted to `not_affected`. 
+ pin_clock(1_700_000_000); + let cs = crate::diff::ChangeSet::default(); + let e = crate::enrich::Enrichment::default(); + let entries = vec![crate::baseline::BaselineEntry { + id: "GHSA-no-status".into(), + purl: Some("pkg:npm/bar@1.0.0".into()), + reason: None, + expires: None, + vex_status: None, + vex_justification: None, + }]; + let opts = EmitOptions { + author: "x", + default_justification: "vulnerable_code_not_in_execute_path", + baseline_entries: &entries, + }; + let body = emit(&cs, &e, &opts); + assert!( + body.contains("\"status\": \"under_investigation\""), + "default status must be under_investigation, got body:\n{body}" + ); + assert!( + !body.contains("\"status\": \"not_affected\""), + "must not auto-promote to not_affected; got:\n{body}" + ); + unpin_clock(); + } + + #[test] + fn emission_byte_deterministic_with_source_date_epoch() { + pin_clock(1_700_000_000); + let cs = crate::diff::ChangeSet::default(); + let e = crate::enrich::Enrichment::default(); + let entries = vec![crate::baseline::BaselineEntry { + id: "GHSA-1".into(), + purl: Some("pkg:npm/foo@1.0.0".into()), + reason: None, + expires: None, + vex_status: Some("not_affected".into()), + vex_justification: None, + }]; + let opts = EmitOptions { + author: "x", + default_justification: "vulnerable_code_not_in_execute_path", + baseline_entries: &entries, + }; + let a = emit(&cs, &e, &opts); + let b = emit(&cs, &e, &opts); + assert_eq!(a, b); + unpin_clock(); + } } From 2ec657390b898fbc5451c8d581cb5daad360f62b Mon Sep 17 00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:21:06 -0700 Subject: [PATCH 3/8] feat(license): full SPDX expression evaluator via spdx crate MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Adds `spdx = "0.10"` and replaces the v0.8 atomic+glob matcher with proper SPDX evaluation: - `(MIT OR Apache-2.0)` with `allow=[MIT]` permits because the licensee can pick MIT. - `(MIT AND GPL-3.0-only)` with `deny=[GPL-3.0-only]` violates. 
- `(GPL-3.0-only OR MIT) AND BSD-3-Clause` with allow `[MIT, BSD-3-Clause]` and deny `[GPL-3.0-only]` violates: deny wins because a resolution path could pick GPL. - `Apache-2.0 WITH LLVM-exception` parses cleanly; the base license is checked. Per-exception allow/deny is informational only — that granularity is deferred to v1.0. Non-SPDX strings ("Custom", vendor-specific spellings) fall back to the v0.8 atomic+glob path so user-authored policies keep working. `NOASSERTION` / `OTHER` / empty stay ambiguous (fail-closed). `allow_ambiguous` is deprecated: a one-time stderr warning fires when set. The flag still works on the fallback path. Removal in v1.0. GNU licenses are special-cased: `spdx` strips `-only`/`-or-later` into a flag, so we expand candidate names back so user-authored policies that contain the original spelling match. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- Cargo.lock | 10 ++ Cargo.toml | 1 + docs/src/license-policy.md | 44 ++++++ src/enrich/license.rs | 265 +++++++++++++++++++++++++++++-------- 4 files changed, 266 insertions(+), 54 deletions(-) diff --git a/Cargo.lock b/Cargo.lock index f455489..256096d 100644 --- a/Cargo.lock +++ b/Cargo.lock @@ -134,6 +134,7 @@ dependencies = [ "serde", "serde_json", "sha2", + "spdx", "strsim", "supports-color 3.0.2", "thiserror", @@ -1111,6 +1112,15 @@ version = "1.15.1" source = "registry+https://github.com/rust-lang/crates.io-index" checksum = "67b1b7a3b5fe4f1376887184045fcf45c69e92af734b7aaddc05fb777b6fbd03" +[[package]] +name = "spdx" +version = "0.10.9" +source = "registry+https://github.com/rust-lang/crates.io-index" +checksum = "c3e17e880bafaeb362a7b751ec46bdc5b61445a188f80e0606e68167cd540fa3" +dependencies = [ + "smallvec", +] + [[package]] name = "stable_deref_trait" version = "1.2.1" diff --git a/Cargo.toml b/Cargo.toml index 7db67b2..e4add8f 100644 --- a/Cargo.toml +++ b/Cargo.toml @@ -33,6 +33,7 @@ directories = "6" toml = "0.8" time = { version = "0.3", 
default-features = false, features = ["serde", "parsing", "formatting", "macros", "std"] } sha2 = { version = "0.10", default-features = false } +spdx = { version = "0.10", default-features = false } [dev-dependencies] criterion = { version = "0.5", default-features = false, features = ["html_reports"] } diff --git a/docs/src/license-policy.md b/docs/src/license-policy.md index 2d95999..2ed20c3 100644 --- a/docs/src/license-policy.md +++ b/docs/src/license-policy.md @@ -74,3 +74,47 @@ suppression key. The v0.8 `expires` + `reason` fields work the same way. [GitHub Dependency Review Action]: https://github.com/actions/dependency-review-action + +## SPDX expression evaluation (v0.9+) + +bomdrift evaluates each license string as a full SPDX expression via +the `spdx` crate. Evaluation outcomes: + +| Expression | Allow | Deny | Outcome | +|---|---|---|---| +| `MIT` | `[MIT]` | — | Permitted (allow exact match) | +| `(MIT OR Apache-2.0)` | `[MIT]` | — | Permitted (one branch allowed) | +| `(MIT AND GPL-3.0-only)` | `[MIT]` | `[GPL-3.0-only]` | Violation (deny wins) | +| `(GPL-3.0-only OR MIT) AND BSD-3-Clause` | `[MIT, BSD-3-Clause]` | `[GPL-3.0-only]` | Violation (denial path could resolve to GPL) | +| `Apache-2.0 WITH LLVM-exception` | `[Apache-2.0]` | — | Permitted (base license allowed; exception identity is currently informational only) | +| `Custom` (non-SPDX) | `[MIT]` | — | Falls back to atomic match → not in allow list | +| `NOASSERTION` / `OTHER` / empty | `[MIT]` | — | Ambiguous → violation (fail-closed) | + +### Precedence + +1. **Deny wins** — any required atomic on the deny list (including any + OR-branch) trips a violation, because the resolved license could be + the denied alternative. +2. **Glob** — `*` suffix patterns work in both lists (e.g. `AGPL-*` + matches every `AGPL-*-only` family member). +3. **Allow** — when the allow list is non-empty, the SPDX expression + must `evaluate` to true under a closure that returns true for + allow-listed atomics. 
+4. **Non-SPDX strings** — fall through to the v0.8 atomic-string + matcher so vendor-specific license strings keep working. + +### Deprecated: `allow_ambiguous` + +The v0.8 `allow_ambiguous` flag flipped fail-closed behavior on +compound expressions. v0.9's evaluator handles compounds correctly, +so the flag is now a no-op when SPDX parsing succeeds. A one-time +deprecation warning is printed to stderr per run when the flag is +set. The flag still works on the fallback path (non-SPDX strings) for +back-compat; it will be removed in v1.0. + +### `WITH` (exception) granularity + +`WITH ` parses cleanly and the base license is checked +against allow/deny. Per-exception allow/deny granularity (e.g. +"allow `Apache-2.0 WITH LLVM-exception` but not other Apache-with-X +combos") is a future ask — not in v0.9 scope. diff --git a/src/enrich/license.rs b/src/enrich/license.rs index 59f065e..f0854df 100644 --- a/src/enrich/license.rs +++ b/src/enrich/license.rs @@ -1,24 +1,44 @@ -//! License-policy enrichment (v0.8+). +//! License-policy enrichment. //! -//! Distinct from [`crate::diff::ChangeSet::license_changed`] which detects -//! same-version license drift. This module evaluates each newly-added or -//! version-changed component's licenses against a configured allow / deny -//! policy and emits a [`LicenseViolation`] for every mismatch. +//! ## SPDX expression evaluation (v0.9+) //! -//! ## Matching rules (v0.8 — fail-closed) +//! Each license string from the SBOM is first attempted as an +//! [`spdx::Expression`]. When parsing succeeds the expression's +//! semantics drive the allow/deny decision: //! -//! - **Atomic** license string (no `AND`/`OR`/`WITH`/parentheses): exact -//! compare against allow/deny. Glob: `*` suffix matches any prefix -//! (`AGPL-*` matches `AGPL-3.0-only`, `AGPL-1.0-only`). -//! - **Compound** expression: ambiguous. With `allow_ambiguous=false` -//! (default) AND any policy is configured (allow OR deny non-empty), -//! 
emit an Ambiguous violation. With `allow_ambiguous=true`, permit. -//! - `NOASSERTION` / `OTHER` / empty: ambiguous (same fail-closed -//! semantics). +//! - **Deny check** — if ANY required SPDX atomic in the parsed +//! expression matches the deny list (exact ID or `*`-suffix glob), +//! the package is in violation. Deny is a stronger signal than +//! allow: the resolved license could be the denied alternative, so +//! we fail closed regardless of what the licensee picks. +//! - **Allow check** — when the allow list is non-empty, the +//! expression must `evaluate` to true under a closure that +//! returns true for allow-listed atomic IDs. `(MIT OR Apache-2.0)` +//! with `allow=[MIT]` permits because the licensee can pick MIT. +//! - **`WITH` operator** — handled by `spdx`'s parser; the base +//! license is checked against allow/deny. The exception identifier +//! is currently informational only — per-exception allow/deny is a +//! future ask not in v0.9 scope. //! -//! Deny wins when a license matches both allow and deny. +//! When SPDX parsing FAILS (non-SPDX strings like `"Custom"`, +//! `"Proprietary"`, vendor-specific spellings) we fall back to the +//! v0.8 atomic+glob matcher so policies authored against raw strings +//! keep working. //! -//! Full SPDX expression evaluation arrives in v0.9 via the `spdx` crate. +//! `NOASSERTION` / `OTHER` / empty are treated as ambiguous (same +//! fail-closed semantics as v0.8). +//! +//! ## Deprecated: `allow_ambiguous` +//! +//! In v0.8 this flag flipped fail-closed behavior on compound +//! expressions. v0.9's full SPDX evaluator handles compounds +//! correctly, so the flag is now a no-op when SPDX parsing +//! succeeds; it still works on the fallback path. A one-time +//! deprecation notice is printed to stderr when the flag is set. +//! +//! Deny wins when both allow and deny match. 
+ +use std::sync::atomic::{AtomicBool, Ordering}; use crate::diff::ChangeSet; use crate::enrich::{LicenseViolation, LicenseViolationKind}; @@ -45,6 +65,9 @@ pub fn enrich(cs: &ChangeSet, policy: &Policy) -> Vec { if !policy.is_active() { return Vec::new(); } + if policy.allow_ambiguous { + warn_deprecated_allow_ambiguous_once(); + } let mut out = Vec::new(); for c in &cs.added { evaluate_component(c, policy, &mut out); @@ -57,9 +80,6 @@ pub fn enrich(cs: &ChangeSet, policy: &Policy) -> Vec { fn evaluate_component(c: &Component, policy: &Policy, out: &mut Vec) { if c.licenses.is_empty() { - // Empty license set: treat as ambiguous (we can't claim it's - // allowed). Fail-closed when policy is active and - // allow_ambiguous=false. if !policy.allow_ambiguous { out.push(LicenseViolation { component: c.clone(), @@ -79,13 +99,9 @@ fn evaluate_component(c: &Component, policy: &Policy, out: &mut Vec Option { let trimmed = lic.trim(); - let is_compound = is_compound_expression(trimmed); - let is_unknown = matches!( - trimmed.to_ascii_uppercase().as_str(), - "" | "NOASSERTION" | "OTHER" - ); - - if is_compound || is_unknown { + let upper = trimmed.to_ascii_uppercase(); + let is_unknown_marker = matches!(upper.as_str(), "" | "NOASSERTION" | "OTHER"); + if is_unknown_marker { if policy.allow_ambiguous { return None; } @@ -97,7 +113,96 @@ fn evaluate_one(c: &Component, lic: &str, policy: &Policy) -> Option evaluate_spdx(c, trimmed, &expr, policy), + Err(_) => evaluate_atomic_fallback(c, trimmed, policy), + } +} + +/// SPDX-evaluation path. Deny wins; allow uses `Expression::evaluate`. 
+fn evaluate_spdx( + c: &Component, + raw: &str, + expr: &spdx::Expression, + policy: &Policy, +) -> Option<LicenseViolation> { + if !policy.deny.is_empty() { + for req in expr.requirements() { + for cand in canonical_names(&req.req.license) { + if let Some(rule) = matches_any(&cand, &policy.deny) { + return Some(LicenseViolation { + component: c.clone(), + license: raw.to_string(), + matched_rule: format!("deny: {rule}"), + kind: LicenseViolationKind::Deny, + }); + } + } + } + } + + if !policy.allow.is_empty() { + let ok = expr.evaluate(|req| { + canonical_names(&req.license) + .iter() + .any(|cand| matches_any(cand, &policy.allow).is_some()) + }); + if !ok { + return Some(LicenseViolation { + component: c.clone(), + license: raw.to_string(), + matched_rule: format!("not in allow list: {raw}"), + kind: LicenseViolationKind::NotAllowed, + }); + } + } + None +} + +/// The `spdx` crate normalizes GNU licenses by stripping the `-only` / +/// `-or-later` suffix into a flag on the `LicenseItem`. User-authored +/// allow/deny lists usually contain the original spelling (`GPL-3.0-only`, +/// `AGPL-3.0-or-later`), so we generate every candidate name an SPDX +/// `LicenseItem` could match. +fn canonical_names(item: &spdx::LicenseItem) -> Vec<String> { + match item { + spdx::LicenseItem::Spdx { id, or_later } => { + let mut names = vec![id.name.to_string()]; + if id.is_gnu() { + if *or_later { + names.push(format!("{}-or-later", id.name)); + } else { + names.push(format!("{}-only", id.name)); + } + } else if *or_later { + names.push(format!("{}+", id.name)); + } + names + } + spdx::LicenseItem::Other { lic_ref, .. } => vec![lic_ref.clone()], + } +} + +/// v0.8 atomic+glob fallback for non-SPDX strings.
+fn evaluate_atomic_fallback( + c: &Component, + trimmed: &str, + policy: &Policy, +) -> Option<LicenseViolation> { + let is_compound = is_compound_expression(trimmed); + if is_compound { + if policy.allow_ambiguous { + return None; + } + return Some(LicenseViolation { + component: c.clone(), + license: trimmed.to_string(), + matched_rule: format!("ambiguous: {trimmed}"), + kind: LicenseViolationKind::Ambiguous, + }); + } if let Some(rule) = matches_any(trimmed, &policy.deny) { return Some(LicenseViolation { component: c.clone(), @@ -117,8 +222,9 @@ fn evaluate_one(c: &Component, lic: &str, policy: &Policy) -> Option glob > raw string. Globs are +/// `*`-suffix. fn matches_any(lic: &str, patterns: &[String]) -> Option { for p in patterns { if matches_pattern(lic, p) { @@ -137,7 +243,6 @@ fn matches_pattern(lic: &str, pattern: &str) -> bool { } fn is_compound_expression(s: &str) -> bool { - // Any of the SPDX operators or parens makes this a compound expression. if s.contains('(') || s.contains(')') { return true; } @@ -149,6 +254,18 @@ fn is_compound_expression(s: &str) -> bool { false } +static ALLOW_AMBIGUOUS_WARNED: AtomicBool = AtomicBool::new(false); + +fn warn_deprecated_allow_ambiguous_once() { + if ALLOW_AMBIGUOUS_WARNED.swap(true, Ordering::Relaxed) { + return; + } + eprintln!( + "warning: [license] allow_ambiguous is deprecated since v0.9; \ + SPDX expressions are now evaluated properly."
+ ); +} + #[cfg(test)] mod tests { use super::*; @@ -196,7 +313,6 @@ mod tests { let v = enrich(&cs, &policy); assert_eq!(v.len(), 1); assert_eq!(v[0].kind, LicenseViolationKind::Deny); - assert!(v[0].matched_rule.contains("GPL-3.0-only")); } #[test] @@ -211,29 +327,6 @@ mod tests { assert_eq!(v[0].matched_rule, "deny: AGPL-*"); } - #[test] - fn compound_ambiguous_fails_closed_by_default() { - let cs = cs_with_added(comp("foo", vec!["(MIT OR GPL-3.0-only)"])); - let policy = Policy { - allow: vec!["MIT".into()], - ..Default::default() - }; - let v = enrich(&cs, &policy); - assert_eq!(v.len(), 1); - assert_eq!(v[0].kind, LicenseViolationKind::Ambiguous); - } - - #[test] - fn compound_ambiguous_permitted_when_flag_set() { - let cs = cs_with_added(comp("foo", vec!["(MIT OR GPL-3.0-only)"])); - let policy = Policy { - allow: vec!["MIT".into()], - allow_ambiguous: true, - ..Default::default() - }; - assert!(enrich(&cs, &policy).is_empty()); - } - #[test] fn deny_wins_over_allow_when_both_match() { let cs = cs_with_added(comp("foo", vec!["GPL-3.0-only"])); @@ -294,4 +387,68 @@ mod tests { let v = enrich(&cs, &policy); assert_eq!(v.len(), 1); } + + // ---------- v0.9 SPDX expression eval tests ---------- + + #[test] + fn spdx_or_with_one_allowed_branch_permits() { + let cs = cs_with_added(comp("foo", vec!["(MIT OR Apache-2.0)"])); + let policy = Policy { + allow: vec!["MIT".into()], + ..Default::default() + }; + assert!(enrich(&cs, &policy).is_empty()); + } + + #[test] + fn spdx_and_with_one_denied_branch_violates() { + let cs = cs_with_added(comp("foo", vec!["(MIT AND GPL-3.0-only)"])); + let policy = Policy { + deny: vec!["GPL-3.0-only".into()], + ..Default::default() + }; + let v = enrich(&cs, &policy); + assert_eq!(v.len(), 1); + assert_eq!(v[0].kind, LicenseViolationKind::Deny); + } + + #[test] + fn spdx_with_exception_resolves_base_license() { + let cs = cs_with_added(comp("foo", vec!["Apache-2.0 WITH LLVM-exception"])); + let policy = Policy { + allow: 
vec!["Apache-2.0".into()], + ..Default::default() + }; + assert!(enrich(&cs, &policy).is_empty()); + } + + #[test] + fn spdx_compound_denial_wins_over_or_branches() { + // (GPL-3.0-only OR MIT) AND BSD-3-Clause with allow=[MIT, + // BSD-3-Clause] AND deny=[GPL-3.0-only] → violation: the + // resolution path could pick GPL. + let cs = cs_with_added(comp("foo", vec!["(GPL-3.0-only OR MIT) AND BSD-3-Clause"])); + let policy = Policy { + allow: vec!["MIT".into(), "BSD-3-Clause".into()], + deny: vec!["GPL-3.0-only".into()], + ..Default::default() + }; + let v = enrich(&cs, &policy); + assert_eq!(v.len(), 1); + assert_eq!(v[0].kind, LicenseViolationKind::Deny); + } + + #[test] + fn unknown_spdx_id_falls_back_to_atomic_path() { + // "Custom" isn't a valid SPDX ID; the atomic fallback rejects it + // when allow is set and "Custom" isn't on the list. + let cs = cs_with_added(comp("foo", vec!["Custom"])); + let policy = Policy { + allow: vec!["MIT".into()], + ..Default::default() + }; + let v = enrich(&cs, &policy); + assert_eq!(v.len(), 1); + assert_eq!(v[0].kind, LicenseViolationKind::NotAllowed); + } } From e954b3b4484180ecae9623dd473c0e6c5e53fa97 Mon Sep 17 00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:26:17 -0700 Subject: [PATCH 4/8] feat(platform): Bitbucket + Azure DevOps support (auto-detect, footer, templates) Extends `cli::Platform` and `markdown::Platform` with two new variants. The exhaustive `From for markdown::Platform` match keeps the two enums in lockstep at compile time. - `--platform bitbucket` / `--platform azure-devops` flag values. - Auto-detection via `BITBUCKET_BUILD_NUMBER` and `TF_BUILD` envs. - `BITBUCKET_GIT_HTTP_ORIGIN` and `BUILD_REPOSITORY_URI` honored as `--repo-url` fallbacks. - Footer URLs: - Bitbucket: `/issues/new` + `bomdrift baseline add` suppress hint. - Azure DevOps: `/_workitems/create?templateName=false-positive` + `bomdrift baseline add` suppress hint. 
- Drop-in pipeline templates with READMEs: - `examples/bitbucket-pipelines/` - `examples/azure-devops/` - New docs chapters `bitbucket.md` and `azure-devops.md` linked from SUMMARY; CLI reference updated. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/src/SUMMARY.md | 2 + docs/src/azure-devops.md | 49 ++++++++++++ docs/src/bitbucket.md | 57 +++++++++++++ docs/src/cli-reference.md | 21 +++-- examples/azure-devops/README.md | 43 ++++++++++ examples/azure-devops/azure-pipelines.yml | 58 ++++++++++++++ examples/bitbucket-pipelines/README.md | 49 ++++++++++++ .../bitbucket-pipelines.yml | 46 +++++++++++ src/cli.rs | 13 +++ src/lib.rs | 13 ++- src/render/markdown.rs | 80 +++++++++++++++++-- 11 files changed, 413 insertions(+), 18 deletions(-) create mode 100644 docs/src/azure-devops.md create mode 100644 docs/src/bitbucket.md create mode 100644 examples/azure-devops/README.md create mode 100644 examples/azure-devops/azure-pipelines.yml create mode 100644 examples/bitbucket-pipelines/README.md create mode 100644 examples/bitbucket-pipelines/bitbucket-pipelines.yml diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index 973b5e3..59cde45 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -7,6 +7,8 @@ - [Quickstart](./quickstart.md) - [GitHub Action](./github-action.md) - [GitLab CI](./gitlab-ci.md) +- [Bitbucket Pipelines](./bitbucket.md) +- [Azure DevOps Pipelines](./azure-devops.md) - [CLI reference](./cli-reference.md) # Output diff --git a/docs/src/azure-devops.md b/docs/src/azure-devops.md new file mode 100644 index 0000000..6d8557f --- /dev/null +++ b/docs/src/azure-devops.md @@ -0,0 +1,49 @@ +# Azure DevOps Pipelines + +bomdrift runs in Azure Pipelines and posts a single upserted PR +thread per pull request. 
+
+## Quickstart
+
+Copy [`examples/azure-devops/azure-pipelines.yml`](https://github.com/Metbcy/bomdrift/blob/main/examples/azure-devops/azure-pipelines.yml)
+to your repo root and add a secret pipeline variable named
+`BOMDRIFT_API_TOKEN` containing a PAT with the `Code (Read & Write)`
+scope.
+
+## What the job does
+
+1. Installs Rust + bomdrift + Syft on the `ubuntu-latest` agent.
+2. Generates a CycloneDX SBOM for the PR target branch and the PR
+   head.
+3. Renders the diff to markdown with `bomdrift diff --platform
+   azure-devops`.
+4. Looks up the existing bomdrift PR thread (by the
+   `` marker) and either creates a new thread
+   or updates the existing comment.
+
+## Tokens & permissions
+
+| Variable | Scope | Why |
+|---|---|---|
+| `BOMDRIFT_API_TOKEN` | PAT, `Code (Read & Write)` | Creating / updating PR threads. |
+
+The default `System.AccessToken` is **not** used because most
+organizations don't grant it permission to create PR threads.
+
+## CLI auto-detection
+
+When `TF_BUILD` is set (Azure Pipelines sets it on every job), bomdrift
+auto-selects `--platform azure-devops` if the flag is omitted.
+
+`BUILD_REPOSITORY_URI` is honored as a `--repo-url` fallback. Note
+that this variable is empty for some local debug runs; in that case,
+pass `--repo-url` explicitly.
+
+## Suppressions
+
+Comment-driven suppression is not wired up for Azure DevOps in v0.9.
+Use `bomdrift baseline add` and commit the result.
+
+## Troubleshooting
+
+See [`examples/azure-devops/README.md`](https://github.com/Metbcy/bomdrift/blob/main/examples/azure-devops/README.md).
diff --git a/docs/src/bitbucket.md b/docs/src/bitbucket.md
new file mode 100644
index 0000000..a95efc9
--- /dev/null
+++ b/docs/src/bitbucket.md
@@ -0,0 +1,57 @@
+# Bitbucket Pipelines
+
+bomdrift runs in Bitbucket Cloud Pipelines and posts a single
+upserted PR comment per pull request, mirroring the GitHub Action
+and GitLab template flow.
+
+## Quickstart
+
+Copy [`examples/bitbucket-pipelines/bitbucket-pipelines.yml`](https://github.com/Metbcy/bomdrift/blob/main/examples/bitbucket-pipelines/bitbucket-pipelines.yml)
+to your repo root and add a Repository Variable named
+`BOMDRIFT_API_TOKEN` containing a Bitbucket App Password with the
+`pullrequest:write` scope.
+
+## What the job does
+
+1. Installs Syft and bomdrift in a `rust:1.88` container.
+2. Generates a CycloneDX SBOM for the PR target branch and the PR
+   head via `syft dir:`.
+3. Renders the diff to markdown with `bomdrift diff --platform
+   bitbucket`.
+4. Looks up the existing bomdrift comment on the PR (by the
+   `` marker) and either creates a new comment
+   or updates the existing one.
+
+## Tokens & permissions
+
+| Variable | Scope | Why |
+|---|---|---|
+| `BOMDRIFT_API_TOKEN` | App Password, `pullrequest:write` | Posting / updating PR comments. |
+
+The job never auto-pushes to your branch. Suppression is the manual
+`bomdrift baseline add` flow plus a commit on your branch.
+
+## CLI auto-detection
+
+Setting `BITBUCKET_BUILD_NUMBER` in the environment auto-selects
+`--platform bitbucket` when the flag is omitted. The Pipelines
+runner sets this variable on every build.
+
+`BITBUCKET_GIT_HTTP_ORIGIN` is honored as a `--repo-url` fallback,
+so the markdown footer's "Report it" link works without extra
+configuration.
+
+## Suppressions
+
+Comment-driven suppression is **not** wired up for Bitbucket in
+v0.9. The supported flow is:
+
+```sh
+bomdrift baseline add GHSA-... --reason "audit complete (PR #42)"
+git add .bomdrift/baseline.json
+git commit -m "baseline: suppress GHSA-..."
+```
+
+## Troubleshooting
+
+See [`examples/bitbucket-pipelines/README.md`](https://github.com/Metbcy/bomdrift/blob/main/examples/bitbucket-pipelines/README.md).
diff --git a/docs/src/cli-reference.md b/docs/src/cli-reference.md index a38f06b..92e48b4 100644 --- a/docs/src/cli-reference.md +++ b/docs/src/cli-reference.md @@ -94,21 +94,26 @@ Supported `[diff]` keys map to the CLI flags: `output`, `format`, #### `--platform ` -`github` (default) or `gitlab`. Drives the rendered markdown -comment's footer: +`github` (default), `gitlab`, `bitbucket`, or `azure-devops`. Drives +the rendered markdown comment's footer: - `github` — `/issues/new?...` URL shape, `/bomdrift suppress ` comment-driven flow (requires the [comment-suppress sub-action](./baseline.md#in-comment-suppression-v05)). - `gitlab` — `/-/issues/new?issuable_template=false-positive` URL - shape, points reviewers at `bomdrift baseline add ` instead - (the v0.5 `/bomdrift suppress` comment-driven flow on GitLab is - deferred to v0.8). + shape, points reviewers at `bomdrift baseline add ` (with an + optional advanced webhook bridge for in-comment suppression — see + [GitLab CI](./gitlab-ci.md)). +- `bitbucket` — `/issues/new` URL shape, `bomdrift baseline add ` + manual suppression flow. +- `azure-devops` — `/_workitems/create?templateName=false-positive` + URL shape, `bomdrift baseline add ` manual suppression flow. When the flag is omitted, bomdrift auto-detects from CI environment -variables: `GITLAB_CI=true` flips to GitLab; otherwise GitHub. The -explicit flag always wins. Also configurable via `[diff] platform = -"gitlab"` in `.bomdrift.toml`. +variables in this order: `GITLAB_CI=true` → GitLab, +`BITBUCKET_BUILD_NUMBER` → Bitbucket, `TF_BUILD` → Azure DevOps, +otherwise GitHub. The explicit flag always wins. Also configurable +via `[diff] platform = ""` in `.bomdrift.toml`. Set in lockstep with `--repo-url` (or `BOMDRIFT_REPO_URL`, or — on GitLab CI — `CI_PROJECT_URL`). 
Without a URL the footer is omitted diff --git a/examples/azure-devops/README.md b/examples/azure-devops/README.md new file mode 100644 index 0000000..932d2e9 --- /dev/null +++ b/examples/azure-devops/README.md @@ -0,0 +1,43 @@ +# bomdrift + Azure DevOps Pipelines + +Drop-in template for running bomdrift on Azure DevOps Repos PRs. + +## Quickstart + +1. Copy [`azure-pipelines.yml`](./azure-pipelines.yml) to your repo root. +2. Create a Personal Access Token with **Code (Read & Write)** scope. + Expose it as a masked pipeline secret variable named + `BOMDRIFT_API_TOKEN`. +3. Open a PR. The pipeline posts an inline thread. + +## Why a PAT and not `System.AccessToken`? + +`System.AccessToken`'s scope is too narrow to update PR threads on +most orgs. A maintainer-issued PAT is the most-portable option. + +## Token model + +| Step | Token used | Scope | +|---|---|---| +| `bomdrift_diff` | `BOMDRIFT_API_TOKEN` | PAT, `Code (Read & Write)` | + +## Caveats + +- The pipeline reads `System.PullRequest.PullRequestId` and + `Build.Repository.ID` at runtime. Manual builds outside a PR + context have neither. +- Comment-driven `/bomdrift suppress` is not wired up for Azure + DevOps in v0.9. + +## Troubleshooting + +| Symptom | Cause | Fix | +|---|---|---| +| 403 from `/_apis/git/repositories/.../threads` | PAT scope too narrow | Re-issue with `Code (Read & Write)`. | +| Multiple threads per PR | Marker not surviving Azure's HTML sanitizer | Confirm the comment body is sent as `commentType: 1` (text). | + +## What v0.9 does NOT ship + +- Comment-driven suppression. +- Pipeline auto-bootstrap (`bomdrift init` does not write an Azure + Pipelines YAML in v0.9). diff --git a/examples/azure-devops/azure-pipelines.yml b/examples/azure-devops/azure-pipelines.yml new file mode 100644 index 0000000..8be069f --- /dev/null +++ b/examples/azure-devops/azure-pipelines.yml @@ -0,0 +1,58 @@ +# Azure DevOps Pipelines template for bomdrift. 
+ +trigger: none + +pr: + branches: + include: + - "*" + +pool: + vmImage: ubuntu-latest + +variables: + BOMDRIFT_VERSION: "0.9.0" + +steps: + - script: | + set -euo pipefail + sudo apt-get update -qq && sudo apt-get install -y -qq curl jq + curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh -s -- -y + . "$HOME/.cargo/env" + cargo install bomdrift --locked --version "$BOMDRIFT_VERSION" + curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sudo sh -s -- -b /usr/local/bin + displayName: Install bomdrift + Syft + + - script: | + set -euo pipefail + . "$HOME/.cargo/env" + git fetch origin "$(System.PullRequest.TargetBranch)" + git worktree add ../before "origin/$(System.PullRequest.TargetBranch)" + syft dir:../before -o cyclonedx-json=before.cdx.json + syft dir:. -o cyclonedx-json=after.cdx.json + bomdrift diff before.cdx.json after.cdx.json \ + --output markdown --platform azure-devops > comment.md + printf '\n\n' >> comment.md + displayName: bomdrift diff + + - script: | + set -euo pipefail + ORG_URL="$(System.TeamFoundationCollectionUri)" + PROJECT="$(System.TeamProject)" + REPO_ID="$(Build.Repository.ID)" + PR_ID="$(System.PullRequest.PullRequestId)" + API="${ORG_URL}${PROJECT}/_apis/git/repositories/${REPO_ID}/pullRequests/${PR_ID}/threads?api-version=7.1" + AUTH_HEADER="Authorization: Basic $(printf ":%s" "$BOMDRIFT_API_TOKEN" | base64 -w0)" + EXISTING=$(curl -s -H "$AUTH_HEADER" "$API" \ + | jq -r '.value[] | select(.comments[0].content | tostring | test("")) | .id' | head -n1) + if [ -n "$EXISTING" ]; then + UPDATE_API="${ORG_URL}${PROJECT}/_apis/git/repositories/${REPO_ID}/pullRequests/${PR_ID}/threads/${EXISTING}/comments/1?api-version=7.1" + UBODY=$(jq -Rs --arg cmt "$(cat comment.md)" '{content:$cmt, commentType:1}' < /dev/null) + curl -s -H "$AUTH_HEADER" -H "Content-Type: application/json" -X PATCH "$UPDATE_API" -d "$UBODY" > /dev/null + else + BODY=$(jq -Rs --arg cmt "$(cat comment.md)" '{comments:[{parentCommentId:0, 
content:$cmt, commentType:1}], status:1}' < /dev/null) + curl -s -H "$AUTH_HEADER" -H "Content-Type: application/json" -X POST "$API" -d "$BODY" > /dev/null + fi + displayName: Upsert PR thread + env: + BOMDRIFT_API_TOKEN: $(BOMDRIFT_API_TOKEN) diff --git a/examples/bitbucket-pipelines/README.md b/examples/bitbucket-pipelines/README.md new file mode 100644 index 0000000..9c4fe9c --- /dev/null +++ b/examples/bitbucket-pipelines/README.md @@ -0,0 +1,49 @@ +# bomdrift + Bitbucket Pipelines + +Drop-in template for running bomdrift on Bitbucket Cloud PRs. The +pipeline runs on every PR build, generates SBOMs with Syft for the +target branch and the PR head, renders the diff to markdown, and +upserts a Bitbucket PR comment marked ``. + +## Quickstart + +1. Copy [`bitbucket-pipelines.yml`](./bitbucket-pipelines.yml) to your + project root, or `import:` it from a shared template repo. +2. Create a Bitbucket App Password with the `pullrequest:write` scope. + Expose it as a masked Pipelines repository variable named + `BOMDRIFT_API_TOKEN`. +3. Open a PR. The `bomdrift:diff` step runs and posts a comment. + Subsequent pushes update the same comment by the marker. + +## Token model + +| Step | Token used | Scope | +|---|---|---| +| `bomdrift:diff` | `BOMDRIFT_API_TOKEN` | App Password, `pullrequest:write` | + +bomdrift never auto-pushes a baseline change to your branch from a PR +build. To suppress a finding, run `bomdrift baseline add ` locally +and commit `.bomdrift/baseline.json` to your branch — same flow as +GitLab and Azure DevOps. + +## Caveats + +- Bitbucket's `pullrequest:write` App Password scope is broad on some + workspaces. Audit your workspace's permission bundles before issuing + the token. +- Comment-driven `/bomdrift suppress` is **not** wired up for + Bitbucket in v0.9. The recommended flow is the manual baseline edit. 
+ +## Troubleshooting + +| Symptom | Cause | Fix | +|---|---|---| +| `401 Unauthorized` from `/2.0/repositories/.../pullrequests/.../comments` | Token lacks `pullrequest:write` | Re-issue App Password with the right scope. | +| Multiple bomdrift comments accrue per PR | Marker stripped by upstream comment renderer | Confirm the marker `` survives a round-trip via the API. | + +## What v0.9 does NOT ship + +- Comment-driven suppression for Bitbucket. Use the manual + `bomdrift baseline add` flow. +- Pipeline auto-bootstrap (`bomdrift init` does not write a Bitbucket + YAML in v0.9). Copy this file manually. diff --git a/examples/bitbucket-pipelines/bitbucket-pipelines.yml b/examples/bitbucket-pipelines/bitbucket-pipelines.yml new file mode 100644 index 0000000..6c8fc86 --- /dev/null +++ b/examples/bitbucket-pipelines/bitbucket-pipelines.yml @@ -0,0 +1,46 @@ +# Bitbucket Pipelines template for bomdrift. +# +# Required variables: +# - BOMDRIFT_API_TOKEN: App Password with `pullrequest:write` scope. + +image: rust:1.88 + +definitions: + steps: + - step: &bomdrift-diff + name: bomdrift diff + caches: + - cargo + script: + - export BOMDRIFT_VERSION="0.9.0" + - apt-get update -qq && apt-get install -y -qq curl jq + - curl -sSfL https://raw.githubusercontent.com/anchore/syft/main/install.sh | sh -s -- -b /usr/local/bin + - cargo install bomdrift --locked --version "$BOMDRIFT_VERSION" + - git fetch origin "$BITBUCKET_PR_DESTINATION_BRANCH" + - git worktree add ../before "origin/$BITBUCKET_PR_DESTINATION_BRANCH" + - syft dir:../before -o cyclonedx-json=before.cdx.json + - syft dir:. 
-o cyclonedx-json=after.cdx.json + - bomdrift diff before.cdx.json after.cdx.json --output markdown --platform bitbucket > comment.md + - echo "" >> comment.md + - echo "" >> comment.md + - REPO="$BITBUCKET_REPO_FULL_NAME" + - PR_ID="$BITBUCKET_PR_ID" + - API="https://api.bitbucket.org/2.0/repositories/$REPO/pullrequests/$PR_ID/comments" + - | + EXISTING=$(curl -s -u "x-token-auth:$BOMDRIFT_API_TOKEN" "$API?pagelen=100" \ + | jq -r '.values[] | select(.content.raw | test("")) | .id' | head -n1) + - | + BODY=$(jq -Rs '{content:{raw: .}}' < comment.md) + - | + if [ -n "$EXISTING" ]; then + curl -s -u "x-token-auth:$BOMDRIFT_API_TOKEN" -X PUT -H "Content-Type: application/json" \ + "$API/$EXISTING" -d "$BODY" >/dev/null + else + curl -s -u "x-token-auth:$BOMDRIFT_API_TOKEN" -X POST -H "Content-Type: application/json" \ + "$API" -d "$BODY" >/dev/null + fi + +pipelines: + pull-requests: + "**": + - step: *bomdrift-diff diff --git a/src/cli.rs b/src/cli.rs index 4a22efc..a5a3d7b 100644 --- a/src/cli.rs +++ b/src/cli.rs @@ -158,6 +158,17 @@ pub enum Platform { /// comments). #[value(name = "gitlab")] GitLab, + /// Bitbucket Cloud or Bitbucket Data Center. Footer points + /// reviewers at the `/issues/new` form and uses `bomdrift baseline + /// add ` for suppression — Bitbucket has no in-comment + /// suppression flow in v0.9. + #[value(name = "bitbucket")] + Bitbucket, + /// Azure DevOps Repos (Azure Pipelines). Footer points reviewers at + /// the work-item create form and uses `bomdrift baseline add ` + /// for suppression. 
+ #[value(name = "azure-devops")] + AzureDevOps, } impl From for markdown::Platform { @@ -170,6 +181,8 @@ impl From for markdown::Platform { match value { Platform::GitHub => markdown::Platform::GitHub, Platform::GitLab => markdown::Platform::GitLab, + Platform::Bitbucket => markdown::Platform::Bitbucket, + Platform::AzureDevOps => markdown::Platform::AzureDevOps, } } } diff --git a/src/lib.rs b/src/lib.rs index a121c06..277f283 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -275,17 +275,22 @@ fn run_diff(mut args: DiffArgs) -> Result<()> { .clone() .or_else(|| std::env::var("BOMDRIFT_REPO_URL").ok()) .or_else(|| std::env::var("CI_PROJECT_URL").ok()) + .or_else(|| std::env::var("BITBUCKET_GIT_HTTP_ORIGIN").ok()) + .or_else(|| std::env::var("BUILD_REPOSITORY_URI").ok()) .filter(|s| !s.is_empty()); // Platform precedence: explicit `--platform` (or `[diff] platform` // in `.bomdrift.toml`, already merged into `args.platform`) wins; - // otherwise auto-detect from CI env. `GITLAB_CI=true` is GitLab's - // canonical CI marker — set unconditionally on every job in every - // GitLab pipeline. Fall through to `Platform::GitHub` (the default) - // so existing GitHub Action consumers see no behavior change. + // otherwise auto-detect from CI env. Detection order: GitLab + // (`GITLAB_CI=true`), Bitbucket (`BITBUCKET_BUILD_NUMBER`), Azure + // DevOps (`TF_BUILD`), then default GitHub. 
let platform = args.platform.unwrap_or_else(|| { if std::env::var("GITLAB_CI").is_ok_and(|v| v == "true") { crate::cli::Platform::GitLab + } else if std::env::var("BITBUCKET_BUILD_NUMBER").is_ok() { + crate::cli::Platform::Bitbucket + } else if std::env::var("TF_BUILD").is_ok() { + crate::cli::Platform::AzureDevOps } else { crate::cli::Platform::GitHub } diff --git a/src/render/markdown.rs b/src/render/markdown.rs index 943b743..e29baaa 100644 --- a/src/render/markdown.rs +++ b/src/render/markdown.rs @@ -41,6 +41,10 @@ pub enum Platform { /// `/bomdrift suppress` hint and points at `bomdrift baseline add` /// instead. GitLab, + /// Bitbucket Cloud or Bitbucket Data Center. + Bitbucket, + /// Azure DevOps Repos. + AzureDevOps, } /// Renderer toggles. Defaults match v0.2 behavior so existing callers keep @@ -397,12 +401,6 @@ fn write_footer(out: &mut String, opts: &Options) { ); } Platform::GitLab => { - // GitLab issue creation uses `/-/issues/new` (the `-/` is the - // namespace separator GitLab inserts between the project URL - // and the issue tracker route). `issuable_template=` selects a - // saved description template if the project has one named - // `false-positive`; projects without that template still get - // a working "new issue" form. let _ = writeln!( out, "**False positive?** [Report it]({repo}/-/issues/new?issuable_template=false-positive) · \ @@ -411,6 +409,31 @@ fn write_footer(out: &mut String, opts: &Options) { [Docs](https://metbcy.github.io/bomdrift/)", ); } + Platform::Bitbucket => { + // Bitbucket Cloud uses `/issues/new` (no labels query string). + // Comment-driven suppress is not in scope for v0.9 — point + // reviewers at the manual CLI flow. 
+ let _ = writeln!( + out, + "**False positive?** [Report it]({repo}/issues/new) · \ + **Suppress a finding?** Run `bomdrift baseline add ` and commit \ + `.bomdrift/baseline.json` to your PR branch · \ + [Docs](https://metbcy.github.io/bomdrift/)", + ); + } + Platform::AzureDevOps => { + // Azure DevOps work items use the `/_workitems/create` + // route. `templateName` is honored when the project has a + // matching work-item template; projects without one still + // get the default form. + let _ = writeln!( + out, + "**False positive?** [Report it]({repo}/_workitems/create?templateName=false-positive) · \ + **Suppress a finding?** Run `bomdrift baseline add ` and commit \ + `.bomdrift/baseline.json` to your PR branch · \ + [Docs](https://metbcy.github.io/bomdrift/)", + ); + } } } @@ -1269,6 +1292,51 @@ mod tests { assert!(md.contains("/bomdrift suppress")); } + #[test] + fn footer_renders_bitbucket_shape() { + let cs = ChangeSet { + added: vec![comp("a", "1.0", Ecosystem::Npm, None)], + ..Default::default() + }; + let md = render_with_options( + &cs, + &Enrichment::default(), + Options { + repo_url: Some("https://bitbucket.org/team/proj".to_string()), + platform: Platform::Bitbucket, + ..Default::default() + }, + ); + assert!( + md.contains("https://bitbucket.org/team/proj/issues/new"), + "expected Bitbucket /issues/new URL; got:\n{md}" + ); + assert!(md.contains("bomdrift baseline add")); + assert!(!md.contains("/bomdrift suppress")); + } + + #[test] + fn footer_renders_azure_devops_shape() { + let cs = ChangeSet { + added: vec![comp("a", "1.0", Ecosystem::Npm, None)], + ..Default::default() + }; + let md = render_with_options( + &cs, + &Enrichment::default(), + Options { + repo_url: Some("https://dev.azure.com/org/project/_git/repo".to_string()), + platform: Platform::AzureDevOps, + ..Default::default() + }, + ); + assert!( + md.contains("/_workitems/create?templateName=false-positive"), + "expected Azure DevOps work-item URL; got:\n{md}" + ); + 
assert!(md.contains("bomdrift baseline add")); + } + #[test] fn why_this_matters_link_appears_in_each_finding_section() { let cs = ChangeSet { From 18f564768d0a7db1059832768b0cc0c3a0b6d706 Mon Sep 17 00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:36:44 -0700 Subject: [PATCH 5/8] feat(enrich/registry): npm/PyPI/crates.io metadata + recently-published/deprecated/maintainer-set-changed findings MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit New `src/enrich/registry.rs` module with three best-effort fetchers mirroring the OSV / EPSS / KEV pattern: - npm: `https://registry.npmjs.org/` (URL-encoded `@scope/name`) — extracts `time`, per-version `deprecated`, per-version `maintainers[]`. - PyPI: `https://pypi.org/pypi//json` — extracts `info.yanked`, classifiers, `releases[].upload_time_iso_8601`. - crates.io: `https://crates.io/api/v1/crates/` (UA required) — extracts `crate.updated_at`, `versions[].yanked`, `versions[].published_at`. Three new finding kinds wired through Enrichment, JSON, markdown, SARIF (`bomdrift.recently-published`, `bomdrift.deprecated`, `bomdrift.maintainer-set-changed` with stable partialFingerprints), and the calibration tap. Disk cache at `/bomdrift/registry//.json`, 24h TTL, atomic temp-file + rename. Best-effort: any failure mode (timeout, parse error, unsupported ecosystem) returns no findings — diff rendering is never blocked. New flags / config: - `--no-registry` and `[diff] no_registry = true` to skip entirely. - `--recently-published-days ` and `[diff] recently_published_days = N` to tune the threshold (default 14 days). - `--fail-on recently-published` / `--fail-on deprecated` exit-2 thresholds. Maintainer-set-changed is npm-only (PyPI / crates.io don't expose a clean per-version maintainer view); fires for VersionChanged components. Address parallel v0.8 follow-up: extend OSV cache schema to carry `aliases` so cache hits don't lose alias data. 
Old entries without the field deserialize with an empty vec for graceful migration. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- docs/src/SUMMARY.md | 1 + docs/src/enrichers/registry.md | 75 ++++ src/cli.rs | 18 + src/config.rs | 10 + src/enrich/cache.rs | 48 ++- src/enrich/mod.rs | 16 + src/enrich/osv.rs | 13 +- src/enrich/registry.rs | 654 +++++++++++++++++++++++++++++++++ src/lib.rs | 47 +++ src/render/json.rs | 3 + src/render/markdown.rs | 105 ++++++ src/render/sarif.rs | 119 ++++++ 12 files changed, 1096 insertions(+), 13 deletions(-) create mode 100644 docs/src/enrichers/registry.md create mode 100644 src/enrich/registry.rs diff --git a/docs/src/SUMMARY.md b/docs/src/SUMMARY.md index 59cde45..660c411 100644 --- a/docs/src/SUMMARY.md +++ b/docs/src/SUMMARY.md @@ -28,6 +28,7 @@ - [Typosquat detection](./enrichers/typosquat.md) - [Multi-major version jumps](./enrichers/version-jump.md) - [Maintainer age signal](./enrichers/maintainer-age.md) +- [Registry metadata (npm/PyPI/crates.io)](./enrichers/registry.md) # Operations diff --git a/docs/src/enrichers/registry.md b/docs/src/enrichers/registry.md new file mode 100644 index 0000000..2963ff9 --- /dev/null +++ b/docs/src/enrichers/registry.md @@ -0,0 +1,75 @@ +# Registry-metadata enrichers (npm / PyPI / crates.io) + +bomdrift queries package registries for each newly-added component +(plus npm version-changed components for the maintainer-set check) +and surfaces three kinds of finding: + +- **Recently published** — the publish timestamp is within + `--recently-published-days` (default 14 days). Recent publishes + correlate with takeover swaps and namespace-reuse attacks. +- **Deprecated** — the package or version is flagged deprecated on + npm, yanked on PyPI / crates.io, or carries an "Inactive" PyPI + classifier. +- **Maintainer set changed (npm only)** — the maintainer set listed + for the new version differs from the maintainer set listed for the + old version. 
Classic xz / Jia Tan precursor.
+
+## Sources
+
+| Ecosystem | URL | Headers |
+|---|---|---|
+| npm | `https://registry.npmjs.org/` (URL-encoded `@scope/name`) | `User-Agent: bomdrift/` |
+| PyPI | `https://pypi.org/pypi//json` | none |
+| crates.io | `https://crates.io/api/v1/crates/` | `User-Agent: bomdrift/0.9.0 (https://github.com/Metbcy/bomdrift)` (required by crates.io) |
+
+## Disk cache
+
+Per ecosystem under `/bomdrift/registry//.json`,
+24-hour TTL, atomic temp-file + rename writes. Mirrors the OSV / EPSS
+/ KEV cache shape.
+
+## Best-effort
+
+A registry timeout, parse error, or unsupported ecosystem returns
+`Ok` with no findings. Diff rendering NEVER blocks on registry
+responses.
+
+## Flags
+
+- `--no-registry`: skip all three checks (equivalent to setting
+  `[diff] no_registry = true` in `.bomdrift.toml`).
+- `--recently-published-days `: override the default 14-day
+  threshold. Set `--recently-published-days 0` to disable that check
+  while keeping deprecation / maintainer-set-changed.
+- `--fail-on recently-published`, `--fail-on deprecated`: exit-2
+  thresholds.
+
+## Output
+
+- **Markdown**: three new sections ("Recently published",
+  "Deprecated upstream", "Maintainer set changed (npm)") in the
+  per-category area.
+- **JSON**: `enrichment.recently_published`,
+  `enrichment.deprecated`, `enrichment.maintainer_set_changed`.
+- **SARIF**: rules `bomdrift.recently-published`,
+  `bomdrift.deprecated`, `bomdrift.maintainer-set-changed` with
+  stable `partialFingerprints.primaryHash/v1`.
+- **Calibration** rows (`--debug-calibration`):
+  `recently-published|||14`,
+  `deprecated|||any`,
+  `maintainer-set-changed|||1`.
+
+## Why npm-only for maintainer-set-changed?
+
+PyPI and crates.io don't expose a clean "maintainers per version"
+view in their public REST API:
+
+- **PyPI**: the `info.maintainer` and `info.author` fields are
+  free-text and inconsistent across releases. There's no historical
+  record per release.
+- **crates.io**: `owners` is package-level, not version-level, so we
+  can't tell which owners had publish rights at the time of an
+  individual version.
+
+When the upstream APIs gain a per-version maintainer view, we'll
+extend the enricher in a future release.
diff --git a/src/cli.rs b/src/cli.rs
index a5a3d7b..14552db 100644
--- a/src/cli.rs
+++ b/src/cli.rs
@@ -336,6 +336,17 @@ pub struct DiffArgs {
     /// claims); un-suppressed findings emit as `affected`. v0.9+.
     #[arg(long)]
     pub emit_vex: Option,
+    /// Skip registry-metadata enrichers (npm/PyPI/crates.io) entirely.
+    /// Use for offline runs or when you don't want bomdrift to fan out
+    /// HTTP requests to package registries.
+    #[arg(long)]
+    pub no_registry: bool,
+    /// Recently-published threshold in days. Components published
+    /// within this window trip a `RecentlyPublished` finding. Default
+    /// 14 days; set to 0 to disable the kind without disabling the
+    /// other registry checks.
+    #[arg(long)]
+    pub recently_published_days: Option,
     /// VEX `author` for `--emit-vex`. Falls back to repo_url, then
     /// to `"bomdrift"`. v0.9+.
     #[arg(long)]
@@ -401,6 +412,13 @@ pub enum FailOn {
     Kev,
     /// Trip on a license-policy violation (Phase D, v0.8+).
     LicenseViolation,
+    /// Trip when a registry-metadata enricher (npm/PyPI/crates.io) flags
+    /// any added component as published within the
+    /// recently-published threshold (default 14 days). v0.9+.
+    RecentlyPublished,
+    /// Trip when a registry-metadata enricher flags any component as
+    /// deprecated or yanked upstream. v0.9+.
+    Deprecated,
     /// Trip on ANY finding (CVE, typosquat, version-jump, young-maintainer)
     /// OR any license-changed-without-version-bump pair (the suspicious case).
     Any,
diff --git a/src/config.rs b/src/config.rs
index b9bba56..55a6f45 100644
--- a/src/config.rs
+++ b/src/config.rs
@@ -63,6 +63,10 @@ pub struct DiffConfig {
     /// Default OpenVEX justification when an entry doesn't supply one.
/// Defaults to `"vulnerable_code_not_in_execute_path"`. pub vex_default_justification: Option, + /// Skip registry-metadata enrichers (npm/PyPI/crates.io). v0.9+. + pub no_registry: Option, + /// Override the default 14-day recently-published threshold. v0.9+. + pub recently_published_days: Option, } pub fn apply_diff_config(args: &mut DiffArgs) -> Result<()> { @@ -143,6 +147,10 @@ fn apply_loaded_diff_config(args: &mut DiffArgs, config: Config) { if args.vex_default_justification.is_none() { args.vex_default_justification = diff.vex_default_justification.filter(|s| !s.is_empty()); } + args.no_registry |= diff.no_registry.unwrap_or(false); + if args.recently_published_days.is_none() { + args.recently_published_days = diff.recently_published_days; + } // [license] block: CLI flags override (not merge) when set. Mirrors // Dependency Review Action semantics so users moving between bomdrift @@ -215,6 +223,8 @@ mod tests { emit_vex: None, vex_author: None, vex_default_justification: None, + no_registry: false, + recently_published_days: None, } } diff --git a/src/enrich/cache.rs b/src/enrich/cache.rs index 55bcfa0..92a0936 100644 --- a/src/enrich/cache.rs +++ b/src/enrich/cache.rs @@ -59,6 +59,12 @@ const OSV_SUBDIR: &str = "osv"; struct CacheEntry { fetched_at: u64, severity: Severity, + /// Cross-database aliases captured at fetch time. Newly added in + /// v0.9. Old cache entries without this field deserialize with an + /// empty vec; downstream consumers tolerate the empty case by + /// falling back to the primary advisory ID. + #[serde(default)] + aliases: Vec, } /// Filesystem-backed severity cache. Construct via [`Cache::open`] (production) @@ -87,10 +93,17 @@ impl Cache { Self { root, now_secs } } - /// Look up cached severity for `advisory_id`. Returns `None` on cache - /// miss, missing file, parse error, or expired entry — every failure - /// mode collapses to "go fetch fresh". + /// Look up cached severity + aliases for `advisory_id`. 
Returns + /// `None` on cache miss, missing file, parse error, or expired + /// entry — every failure mode collapses to "go fetch fresh". pub fn get(&self, advisory_id: &str) -> Option { + self.get_full(advisory_id).map(|(s, _)| s) + } + + /// Like [`Cache::get`] but also returns the aliases stored in the + /// cache entry. Empty when the entry was written by a pre-v0.9 + /// build. + pub fn get_full(&self, advisory_id: &str) -> Option<(Severity, Vec)> { let path = self.path_for(advisory_id); let body = std::fs::read(&path).ok()?; let entry: CacheEntry = serde_json::from_slice(&body).ok()?; @@ -98,27 +111,30 @@ impl Cache { if now.saturating_sub(entry.fetched_at) > CACHE_TTL_SECS { return None; } - Some(entry.severity) + Some((entry.severity, entry.aliases)) } /// Persist `severity` for `advisory_id`. Best-effort: filesystem errors /// are silently dropped because the caller has the live response in hand /// and we never want a write failure to corrupt the in-memory data path. pub fn put(&self, advisory_id: &str, severity: Severity) { + self.put_full(advisory_id, severity, &[]); + } + + /// Like [`Cache::put`] but stores aliases alongside the severity. + pub fn put_full(&self, advisory_id: &str, severity: Severity, aliases: &[String]) { if std::fs::create_dir_all(&self.root).is_err() { return; } let entry = CacheEntry { fetched_at: (self.now_secs)(), severity, + aliases: aliases.to_vec(), }; let Ok(body) = serde_json::to_vec(&entry) else { return; }; let target = self.path_for(advisory_id); - // Temp-file + rename pattern, mirroring src/refresh.rs's atomicity - // contract: a concurrent reader either sees the previous entry or - // the new one, never a torn write. 
let mut tmp = target.as_os_str().to_owned(); tmp.push(".tmp"); let tmp = PathBuf::from(tmp); @@ -196,6 +212,24 @@ mod tests { let _ = std::fs::remove_dir_all(&dir); } + #[test] + fn put_full_roundtrips_aliases() { + let dir = tempdir_unique("aliases"); + let cache = Cache::with_root(dir.clone(), fixed_clock); + cache.put_full( + "GHSA-aliases-1", + Severity::High, + &["CVE-2024-1".to_string(), "CVE-2024-2".to_string()], + ); + let (sev, aliases) = cache.get_full("GHSA-aliases-1").unwrap(); + assert_eq!(sev, Severity::High); + assert_eq!( + aliases, + vec!["CVE-2024-1".to_string(), "CVE-2024-2".to_string()] + ); + let _ = std::fs::remove_dir_all(&dir); + } + #[test] fn get_returns_none_for_missing_advisory() { let dir = tempdir_unique("miss"); diff --git a/src/enrich/mod.rs b/src/enrich/mod.rs index c311442..3334da4 100644 --- a/src/enrich/mod.rs +++ b/src/enrich/mod.rs @@ -15,6 +15,7 @@ pub mod kev; pub mod license; pub mod maintainer; pub mod osv; +pub mod registry; pub mod typosquat; pub mod version_jump; @@ -23,6 +24,7 @@ use std::collections::HashMap; use serde::{Deserialize, Serialize}; use maintainer::MaintainerAgeFinding; +use registry::{Deprecated, MaintainerSetChanged, RecentlyPublished}; use typosquat::TyposquatFinding; use version_jump::VersionJumpFinding; @@ -58,6 +60,17 @@ pub struct Enrichment { /// `cs.license_changed` which detects same-version license changes. /// Empty when no `[license]` block is configured. pub license_violations: Vec, + /// Components newly added in the diff whose registry-recorded + /// publish date is younger than the configured threshold (default + /// 14 days). v0.9+. + #[serde(default, skip_serializing_if = "Vec::is_empty")] + pub recently_published: Vec, + /// Components flagged deprecated / yanked upstream. v0.9+. + #[serde(default, skip_serializing_if = "Vec::is_empty")] + pub deprecated: Vec, + /// npm-only: maintainer set changed across a version bump. v0.9+. 
+ #[serde(default, skip_serializing_if = "Vec::is_empty")] + pub maintainer_set_changed: Vec, /// VEX annotations attached to findings whose status is `affected` /// or `under_investigation` (Phase G, v0.9). Keyed by an opaque /// finding-identity string; renderers look up by the same identity. @@ -89,6 +102,9 @@ impl Enrichment { || !self.version_jumps.is_empty() || !self.maintainer_age.is_empty() || !self.license_violations.is_empty() + || !self.recently_published.is_empty() + || !self.deprecated.is_empty() + || !self.maintainer_set_changed.is_empty() } } diff --git a/src/enrich/osv.rs b/src/enrich/osv.rs index 0db5fc1..542db3e 100644 --- a/src/enrich/osv.rs +++ b/src/enrich/osv.rs @@ -99,20 +99,18 @@ fn enrich_with( let mut cache_hits = 0usize; for id in &unique_ids { if let Some(c) = cache - && let Some(cached) = c.get(id) + && let Some((sev, aliases)) = c.get_full(id) { - // v0.7 cache only stored severity; aliases stay empty on a - // cache hit. EPSS/KEV (Phase B) tolerates empty aliases. - details.insert(id.clone(), (cached, Vec::new())); + details.insert(id.clone(), (sev, aliases)); cache_hits += 1; continue; } match fetch_detail(&agent, vuln_url_base, id) { Ok((sev, aliases)) => { - details.insert(id.clone(), (sev, aliases)); if let Some(c) = cache { - c.put(id, sev); + c.put_full(id, sev, &aliases); } + details.insert(id.clone(), (sev, aliases)); } Err(_) => { lookup_failures += 1; @@ -166,6 +164,9 @@ fn enrich_with( version_jumps: Vec::new(), maintainer_age: Vec::new(), license_violations: Vec::new(), + recently_published: Vec::new(), + deprecated: Vec::new(), + maintainer_set_changed: Vec::new(), vex_annotations: std::collections::HashMap::new(), vex_suppressed_count: 0, }) diff --git a/src/enrich/registry.rs b/src/enrich/registry.rs new file mode 100644 index 0000000..8206083 --- /dev/null +++ b/src/enrich/registry.rs @@ -0,0 +1,654 @@ +//! Registry-metadata enrichers (Phase K, v0.9). +//! +//! 
Three best-effort registry fetchers — npm, PyPI, crates.io — that +//! surface "recently published", "deprecated", and "maintainer set +//! changed" signals on newly added (or in npm's case, version-changed) +//! components. +//! +//! Best-effort contract: a network failure, parse error, or unknown +//! ecosystem returns Ok with no findings. Diff rendering must never +//! block on registry responses. +//! +//! Disk cache: `/bomdrift/registry//.json`, 24h +//! TTL, mirrors the OSV / EPSS cache shape. + +use std::path::PathBuf; +use std::time::{Duration, SystemTime, UNIX_EPOCH}; + +use serde::{Deserialize, Serialize}; + +use crate::diff::ChangeSet; +use crate::model::{Component, Ecosystem}; + +const SUBDIR: &str = "registry"; +const CACHE_TTL_SECS: u64 = 24 * 60 * 60; +const DEFAULT_TIMEOUT: Duration = Duration::from_secs(15); + +/// Default "recently published" age threshold (days). Components with +/// a publish timestamp younger than this trip a [`RecentlyPublished`] +/// finding. Tunable via `--recently-published-days`. +pub const MIN_PUBLISHED_AGE_DAYS: i64 = 14; + +/// Newly-added component whose registry-recorded publish date is +/// younger than the configured threshold (default 14 days). +#[derive(Debug, Clone, PartialEq, Eq, Serialize)] +pub struct RecentlyPublished { + pub component: Component, + pub published_at: String, + pub days_old: i64, +} + +/// Component flagged as deprecated or yanked upstream. +#[derive(Debug, Clone, PartialEq, Eq, Serialize)] +pub struct Deprecated { + pub component: Component, + pub message: Option, +} + +/// Maintainer set differs between two npm versions of the same package. +#[derive(Debug, Clone, PartialEq, Eq, Serialize)] +pub struct MaintainerSetChanged { + pub before: Component, + pub after: Component, + pub added: Vec, + pub removed: Vec, +} + +/// All registry-derived findings collected by [`enrich`]. 
+#[derive(Debug, Clone, Default, PartialEq, Eq)] +pub struct RegistryFindings { + pub recently_published: Vec, + pub deprecated: Vec, + pub maintainer_set_changed: Vec, +} + +#[derive(Debug, Clone, Default, Serialize, Deserialize)] +struct CacheEntry { + fetched_at: u64, + /// ISO 8601 publish/update timestamp (most-recent version). May + /// be `None` when the registry didn't expose a parseable date. + published_at: Option, + /// Per-version published_at (npm `time[""]`, crates.io + /// `versions[].published_at`). Empty when the registry doesn't + /// expose per-version dates cleanly. + #[serde(default)] + versions: std::collections::HashMap, + deprecated_message: Option, + /// npm-only: maintainers per version. + #[serde(default)] + maintainers: std::collections::HashMap>, +} + +/// Run the registry enrichers. `recently_published_days` overrides +/// [`MIN_PUBLISHED_AGE_DAYS`] when `Some`. +pub fn enrich(cs: &ChangeSet, recently_published_days: Option) -> RegistryFindings { + enrich_with(cs, recently_published_days, DEFAULT_TIMEOUT) +} + +fn enrich_with( + cs: &ChangeSet, + recently_published_days: Option, + timeout: Duration, +) -> RegistryFindings { + let mut out = RegistryFindings::default(); + let threshold = recently_published_days.unwrap_or(MIN_PUBLISHED_AGE_DAYS); + let agent = ureq::AgentBuilder::new().timeout(timeout).build(); + let cache_root = cache_root(); + + for c in &cs.added { + let Some(eco) = supported_ecosystem(c) else { + continue; + }; + let Some(entry) = lookup(&agent, cache_root.as_ref(), eco, &c.name) else { + continue; + }; + // Recently-published check: prefer per-version date if known, + // otherwise the top-level published_at. 
+ let date = entry + .versions + .get(&c.version) + .cloned() + .or_else(|| entry.published_at.clone()); + if let Some(d) = date.as_deref() + && let Some(days) = days_since(d) + && days < threshold + { + out.recently_published.push(RecentlyPublished { + component: c.clone(), + published_at: d.to_string(), + days_old: days, + }); + } + if let Some(msg) = entry.deprecated_message.clone() { + out.deprecated.push(Deprecated { + component: c.clone(), + message: Some(msg), + }); + } + } + + // Maintainer-set-changed (npm only). + for (before, after) in &cs.version_changed { + let Some(RegEco::Npm) = supported_ecosystem(after) else { + continue; + }; + let Some(entry) = lookup(&agent, cache_root.as_ref(), RegEco::Npm, &after.name) else { + continue; + }; + let bef = entry + .maintainers + .get(&before.version) + .cloned() + .unwrap_or_default(); + let aft = entry + .maintainers + .get(&after.version) + .cloned() + .unwrap_or_default(); + if bef.is_empty() && aft.is_empty() { + continue; + } + let bset: std::collections::BTreeSet<&String> = bef.iter().collect(); + let aset: std::collections::BTreeSet<&String> = aft.iter().collect(); + if bset == aset { + continue; + } + let added: Vec = aset.difference(&bset).map(|s| (*s).clone()).collect(); + let removed: Vec = bset.difference(&aset).map(|s| (*s).clone()).collect(); + out.maintainer_set_changed.push(MaintainerSetChanged { + before: before.clone(), + after: after.clone(), + added, + removed, + }); + } + + out +} + +/// Internal eco discriminator — Copy-able because we route on it +/// repeatedly. Only the three registry-supported ecosystems are +/// represented; everything else returns `None` from +/// [`supported_ecosystem`]. 
+#[derive(Debug, Clone, Copy, PartialEq, Eq)] +enum RegEco { + Npm, + PyPI, + Cargo, +} + +impl RegEco { + fn dir(self) -> &'static str { + match self { + RegEco::Npm => "npm", + RegEco::PyPI => "pypi", + RegEco::Cargo => "cargo", + } + } +} + +fn supported_ecosystem(c: &Component) -> Option { + match c.ecosystem { + Ecosystem::Npm => Some(RegEco::Npm), + Ecosystem::PyPI => Some(RegEco::PyPI), + Ecosystem::Cargo => Some(RegEco::Cargo), + _ => None, + } +} + +fn lookup( + agent: &ureq::Agent, + cache_root: Option<&PathBuf>, + eco: RegEco, + name: &str, +) -> Option { + if let Some(root) = cache_root + && let Some(cached) = read_cache(root, eco, name) + { + return Some(cached); + } + let entry = match eco { + RegEco::Npm => fetch_npm(agent, name), + RegEco::PyPI => fetch_pypi(agent, name), + RegEco::Cargo => fetch_cargo(agent, name), + }; + if let (Some(root), Some(e)) = (cache_root, entry.as_ref()) { + write_cache(root, eco, name, e); + } + entry +} +fn fetch_npm(agent: &ureq::Agent, name: &str) -> Option { + let url = format!("https://registry.npmjs.org/{}", url_encode(name)); + let resp = agent + .get(&url) + .set( + "user-agent", + concat!("bomdrift/", env!("CARGO_PKG_VERSION")), + ) + .call() + .ok()?; + let json: serde_json::Value = resp.into_json().ok()?; + let mut entry = CacheEntry { + fetched_at: now_secs(), + ..Default::default() + }; + if let Some(t) = json.get("time").and_then(|v| v.as_object()) { + entry.published_at = t + .get("modified") + .and_then(|v| v.as_str()) + .map(str::to_string); + for (k, v) in t { + if k == "modified" || k == "created" { + continue; + } + if let Some(s) = v.as_str() { + entry.versions.insert(k.clone(), s.to_string()); + } + } + } + if let Some(versions) = json.get("versions").and_then(|v| v.as_object()) { + // Use the latest version's deprecated message if any version is + // deprecated. npm sets this per-version. 
+ for v in versions.values() { + if let Some(d) = v.get("deprecated").and_then(|d| d.as_str()) { + entry.deprecated_message = Some(d.to_string()); + } + // Per-version maintainers. + if let Some(version) = v.get("version").and_then(|x| x.as_str()) + && let Some(maints) = v.get("maintainers").and_then(|m| m.as_array()) + { + let mut names: Vec = maints + .iter() + .filter_map(|m| m.get("name").and_then(|n| n.as_str()).map(str::to_string)) + .collect(); + names.sort(); + names.dedup(); + entry.maintainers.insert(version.to_string(), names); + } + } + } + Some(entry) +} + +fn fetch_pypi(agent: &ureq::Agent, name: &str) -> Option { + let url = format!("https://pypi.org/pypi/{}/json", url_encode(name)); + let resp = agent + .get(&url) + .set( + "user-agent", + concat!("bomdrift/", env!("CARGO_PKG_VERSION")), + ) + .call() + .ok()?; + let json: serde_json::Value = resp.into_json().ok()?; + let mut entry = CacheEntry { + fetched_at: now_secs(), + ..Default::default() + }; + let info = json.get("info"); + if let Some(yanked) = info.and_then(|i| i.get("yanked")).and_then(|v| v.as_bool()) + && yanked + { + let reason = info + .and_then(|i| i.get("yanked_reason")) + .and_then(|v| v.as_str()) + .unwrap_or("yanked"); + entry.deprecated_message = Some(format!("PyPI yanked: {reason}")); + } + if let Some(classifiers) = info + .and_then(|i| i.get("classifiers")) + .and_then(|v| v.as_array()) + { + for c in classifiers { + if let Some(s) = c.as_str() + && (s.contains("Inactive") || s.contains("Abandoned")) + { + entry + .deprecated_message + .get_or_insert_with(|| format!("PyPI classifier: {s}")); + } + } + } + if let Some(releases) = json.get("releases").and_then(|v| v.as_object()) { + for (ver, files) in releases { + if let Some(arr) = files.as_array() + && let Some(first) = arr.first() + && let Some(s) = first.get("upload_time_iso_8601").and_then(|v| v.as_str()) + { + entry.versions.insert(ver.clone(), s.to_string()); + } + } + } + Some(entry) +} + +fn fetch_cargo(agent: 
&ureq::Agent, name: &str) -> Option { + let url = format!("https://crates.io/api/v1/crates/{}", url_encode(name)); + let resp = agent + .get(&url) + .set( + "user-agent", + "bomdrift/0.9.0 (https://github.com/Metbcy/bomdrift)", + ) + .call() + .ok()?; + let json: serde_json::Value = resp.into_json().ok()?; + let mut entry = CacheEntry { + fetched_at: now_secs(), + ..Default::default() + }; + entry.published_at = json + .get("crate") + .and_then(|c| c.get("updated_at")) + .and_then(|v| v.as_str()) + .map(str::to_string); + if let Some(versions) = json.get("versions").and_then(|v| v.as_array()) { + for v in versions { + let Some(num) = v.get("num").and_then(|n| n.as_str()) else { + continue; + }; + if let Some(p) = v.get("published_at").and_then(|x| x.as_str()) { + entry.versions.insert(num.to_string(), p.to_string()); + } + if v.get("yanked").and_then(|y| y.as_bool()).unwrap_or(false) { + entry.deprecated_message = Some(format!("crates.io yanked: version {num} yanked")); + } + } + } + Some(entry) +} + +// --- cache I/O --- + +fn cache_root() -> Option { + crate::refresh::default_cache_root() + .ok() + .map(|r| r.join(SUBDIR)) +} + +fn cache_path(root: &std::path::Path, eco: RegEco, name: &str) -> PathBuf { + root.join(eco.dir()) + .join(format!("{}.json", sanitize(name))) +} + +fn read_cache(root: &std::path::Path, eco: RegEco, name: &str) -> Option { + let p = cache_path(root, eco, name); + let body = std::fs::read(&p).ok()?; + let entry: CacheEntry = serde_json::from_slice(&body).ok()?; + if now_secs().saturating_sub(entry.fetched_at) > CACHE_TTL_SECS { + return None; + } + Some(entry) +} + +fn write_cache(root: &std::path::Path, eco: RegEco, name: &str, entry: &CacheEntry) { + let p = cache_path(root, eco, name); + if let Some(parent) = p.parent() + && std::fs::create_dir_all(parent).is_err() + { + return; + } + let Ok(body) = serde_json::to_vec(entry) else { + return; + }; + let mut tmp = p.clone(); + tmp.set_extension("json.tmp"); + if std::fs::write(&tmp, 
body).is_err() { + return; + } + let _ = std::fs::rename(&tmp, &p); +} + +fn sanitize(name: &str) -> String { + name.chars() + .map(|c| { + if c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.') { + c + } else { + '_' + } + }) + .collect() +} + +fn url_encode(s: &str) -> String { + // Minimal encoder for `@scope/name` and the like. + let mut out = String::with_capacity(s.len()); + for c in s.chars() { + match c { + '/' => out.push_str("%2F"), + '@' => out.push_str("%40"), + ' ' => out.push_str("%20"), + _ => out.push(c), + } + } + out +} + +fn now_secs() -> u64 { + SystemTime::now() + .duration_since(UNIX_EPOCH) + .map(|d| d.as_secs()) + .unwrap_or(0) +} + +fn days_since(iso8601: &str) -> Option { + use time::OffsetDateTime; + use time::format_description::well_known::Rfc3339; + // PyPI / npm / crates.io all emit RFC 3339 timestamps. + let t = OffsetDateTime::parse(iso8601, &Rfc3339).ok()?; + let now = crate::clock::now(); + Some((now - t).whole_days()) +} + +// --- parsers exposed for tests (skip network) --- + +#[cfg(test)] +fn parse_npm_value(json: &serde_json::Value) -> CacheEntry { + let mut entry = CacheEntry { + fetched_at: 0, + ..Default::default() + }; + if let Some(t) = json.get("time").and_then(|v| v.as_object()) { + entry.published_at = t + .get("modified") + .and_then(|v| v.as_str()) + .map(str::to_string); + for (k, v) in t { + if k == "modified" || k == "created" { + continue; + } + if let Some(s) = v.as_str() { + entry.versions.insert(k.clone(), s.to_string()); + } + } + } + if let Some(versions) = json.get("versions").and_then(|v| v.as_object()) { + for v in versions.values() { + if let Some(d) = v.get("deprecated").and_then(|d| d.as_str()) { + entry.deprecated_message = Some(d.to_string()); + } + if let Some(version) = v.get("version").and_then(|x| x.as_str()) + && let Some(maints) = v.get("maintainers").and_then(|m| m.as_array()) + { + let mut names: Vec = maints + .iter() + .filter_map(|m| m.get("name").and_then(|n| 
n.as_str()).map(str::to_string)) + .collect(); + names.sort(); + names.dedup(); + entry.maintainers.insert(version.to_string(), names); + } + } + } + entry +} + +#[cfg(test)] +fn parse_pypi_value(json: &serde_json::Value) -> CacheEntry { + let mut entry = CacheEntry { + fetched_at: 0, + ..Default::default() + }; + let info = json.get("info"); + if let Some(yanked) = info.and_then(|i| i.get("yanked")).and_then(|v| v.as_bool()) + && yanked + { + let reason = info + .and_then(|i| i.get("yanked_reason")) + .and_then(|v| v.as_str()) + .unwrap_or("yanked"); + entry.deprecated_message = Some(format!("PyPI yanked: {reason}")); + } + if let Some(classifiers) = info + .and_then(|i| i.get("classifiers")) + .and_then(|v| v.as_array()) + { + for c in classifiers { + if let Some(s) = c.as_str() + && (s.contains("Inactive") || s.contains("Abandoned")) + { + entry + .deprecated_message + .get_or_insert_with(|| format!("PyPI classifier: {s}")); + } + } + } + if let Some(releases) = json.get("releases").and_then(|v| v.as_object()) { + for (ver, files) in releases { + if let Some(arr) = files.as_array() + && let Some(first) = arr.first() + && let Some(s) = first.get("upload_time_iso_8601").and_then(|v| v.as_str()) + { + entry.versions.insert(ver.clone(), s.to_string()); + } + } + } + entry +} + +#[cfg(test)] +fn parse_cargo_value(json: &serde_json::Value) -> CacheEntry { + let mut entry = CacheEntry { + fetched_at: 0, + ..Default::default() + }; + entry.published_at = json + .get("crate") + .and_then(|c| c.get("updated_at")) + .and_then(|v| v.as_str()) + .map(str::to_string); + if let Some(versions) = json.get("versions").and_then(|v| v.as_array()) { + for v in versions { + let Some(num) = v.get("num").and_then(|n| n.as_str()) else { + continue; + }; + if let Some(p) = v.get("published_at").and_then(|x| x.as_str()) { + entry.versions.insert(num.to_string(), p.to_string()); + } + if v.get("yanked").and_then(|y| y.as_bool()).unwrap_or(false) { + entry.deprecated_message = 
Some(format!("crates.io yanked: version {num} yanked")); + } + } + } + entry +} + +#[cfg(test)] +mod tests { + use super::*; + use serde_json::json; + + #[test] + fn npm_parse_recent_publish_and_deprecated() { + let v = json!({ + "time": { + "modified": "2026-04-29T00:00:00.000Z", + "1.0.0": "2024-01-01T00:00:00.000Z", + "2.0.0": "2026-04-29T00:00:00.000Z" + }, + "versions": { + "1.0.0": { + "version": "1.0.0", + "maintainers": [{"name": "alice"}, {"name": "bob"}] + }, + "2.0.0": { + "version": "2.0.0", + "deprecated": "use newer-pkg instead", + "maintainers": [{"name": "alice"}, {"name": "carol"}] + } + } + }); + let e = parse_npm_value(&v); + assert_eq!( + e.versions.get("2.0.0").map(|s| s.as_str()), + Some("2026-04-29T00:00:00.000Z") + ); + assert_eq!( + e.deprecated_message.as_deref(), + Some("use newer-pkg instead") + ); + assert_eq!( + e.maintainers.get("1.0.0").unwrap(), + &vec!["alice".to_string(), "bob".to_string()] + ); + assert_eq!( + e.maintainers.get("2.0.0").unwrap(), + &vec!["alice".to_string(), "carol".to_string()] + ); + } + + #[test] + fn pypi_parse_yanked() { + let v = json!({ + "info": { + "yanked": true, + "yanked_reason": "security", + "classifiers": ["Development Status :: 7 - Inactive"] + }, + "releases": { + "1.0.0": [{"upload_time_iso_8601": "2024-01-01T00:00:00Z"}] + } + }); + let e = parse_pypi_value(&v); + assert!(e.deprecated_message.as_deref().unwrap().contains("yanked")); + assert_eq!( + e.versions.get("1.0.0").map(|s| s.as_str()), + Some("2024-01-01T00:00:00Z") + ); + } + + #[test] + fn cargo_parse_yanked_and_recent() { + let v = json!({ + "crate": { "updated_at": "2026-04-29T00:00:00+00:00" }, + "versions": [ + { "num": "1.0.0", "yanked": false, "published_at": "2024-01-01T00:00:00+00:00" }, + { "num": "2.0.0", "yanked": true, "published_at": "2026-04-29T00:00:00+00:00" } + ] + }); + let e = parse_cargo_value(&v); + assert_eq!(e.published_at.as_deref(), Some("2026-04-29T00:00:00+00:00")); + 
assert!(e.deprecated_message.as_deref().unwrap().contains("yanked")); + assert_eq!(e.versions.len(), 2); + } + + #[test] + fn url_encode_handles_npm_scopes() { + assert_eq!(url_encode("@scope/name"), "%40scope%2Fname"); + assert_eq!(url_encode("plain"), "plain"); + } + + #[test] + fn days_since_zero_for_now() { + // SAFETY: serialized by other clock tests. + unsafe { + std::env::set_var("SOURCE_DATE_EPOCH", "1777593600"); + } + let d = days_since("2026-05-01T00:00:00Z").unwrap(); + assert_eq!(d, 0); + unsafe { + std::env::remove_var("SOURCE_DATE_EPOCH"); + } + } +} diff --git a/src/lib.rs b/src/lib.rs index 277f283..7d3aa5d 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -177,6 +177,15 @@ fn run_diff(mut args: DiffArgs) -> Result<()> { }; enrichment.license_violations = enrich::license::enrich(&cs, &license_policy); + // Registry-metadata enrichers (Phase K, v0.9). Best-effort — a + // registry timeout returns Ok with no findings. + if !args.no_registry { + let findings = enrich::registry::enrich(&cs, args.recently_published_days); + enrichment.recently_published = findings.recently_published; + enrichment.deprecated = findings.deprecated; + enrichment.maintainer_set_changed = findings.maintainer_set_changed; + } + // Apply the baseline AFTER all enrichers run — suppression operates on // the realized finding set, not on intermediate inputs. 
This keeps the // baseline file format stable as new enrichers are added: a new finding @@ -379,6 +388,8 @@ pub fn tripped(cs: &ChangeSet, e: &Enrichment, threshold: FailOn) -> bool { FailOn::LicenseChange => !cs.license_changed.is_empty(), FailOn::Kev => any_kev(e), FailOn::LicenseViolation => !e.license_violations.is_empty(), + FailOn::RecentlyPublished => !e.recently_published.is_empty(), + FailOn::Deprecated => !e.deprecated.is_empty(), FailOn::Any => e.has_findings() || !cs.license_changed.is_empty() || any_kev(e), } } @@ -520,6 +531,42 @@ fn write_calibration_lines( format, ); } + for f in &e.recently_published { + write_calibration_row( + out, + "recently-published", + f.component + .purl + .as_deref() + .unwrap_or(f.component.name.as_str()), + CalibrationScore::Int(f.days_old), + CalibrationThreshold::Int(crate::enrich::registry::MIN_PUBLISHED_AGE_DAYS), + format, + ); + } + for f in &e.deprecated { + write_calibration_row( + out, + "deprecated", + f.component + .purl + .as_deref() + .unwrap_or(f.component.name.as_str()), + CalibrationScore::Text(f.message.as_deref().unwrap_or("(deprecated)")), + CalibrationThreshold::Text("any"), + format, + ); + } + for f in &e.maintainer_set_changed { + write_calibration_row( + out, + "maintainer-set-changed", + f.after.purl.as_deref().unwrap_or(f.after.name.as_str()), + CalibrationScore::Int((f.added.len() + f.removed.len()) as i64), + CalibrationThreshold::Int(1), + format, + ); + } } /// Numeric or symbolic score for a calibration row. 
Float/Int rendered diff --git a/src/render/json.rs b/src/render/json.rs index da1a532..25be12d 100644 --- a/src/render/json.rs +++ b/src/render/json.rs @@ -205,6 +205,9 @@ mod tests { maintainer_age: Vec::new(), license_violations: Vec::new(), + recently_published: Vec::new(), + deprecated: Vec::new(), + maintainer_set_changed: Vec::new(), vex_annotations: HashMap::new(), vex_suppressed_count: 0, }; diff --git a/src/render/markdown.rs b/src/render/markdown.rs index e29baaa..18289e6 100644 --- a/src/render/markdown.rs +++ b/src/render/markdown.rs @@ -129,6 +129,23 @@ pub fn render_with_options(cs: &ChangeSet, enrichment: &Enrichment, opts: Option enrichment.license_violations.len() ); } + if !enrichment.recently_published.is_empty() { + let _ = writeln!( + out, + "| Recently published | {} |", + enrichment.recently_published.len() + ); + } + if !enrichment.deprecated.is_empty() { + let _ = writeln!(out, "| Deprecated | {} |", enrichment.deprecated.len()); + } + if !enrichment.maintainer_set_changed.is_empty() { + let _ = writeln!( + out, + "| Maintainer set changed | {} |", + enrichment.maintainer_set_changed.len() + ); + } if enrichment.vex_suppressed_count > 0 { let _ = writeln!( out, @@ -346,6 +363,94 @@ pub fn render_with_options(cs: &ChangeSet, enrichment: &Enrichment, opts: Option section_close(&mut out); } + if !enrichment.recently_published.is_empty() { + section_open( + &mut out, + "Recently published (added deps)", + enrichment.recently_published.len(), + None, + ); + out.push_str( + "These newly added dependencies were published to their registry within the \ + configured threshold (default 14 days). Recent publishes correlate with \ + takeover swaps and namespace-reuse attacks. 
\ + [Why this matters](https://metbcy.github.io/bomdrift/enrichers/registry.html)\n\n", + ); + out.push_str("| Ecosystem | Name | Version | Published | Days |\n|---|---|---|---|---:|\n"); + for f in &enrichment.recently_published { + let _ = writeln!( + out, + "| {} | {} | {} | {} | {} |", + f.component.ecosystem, + f.component.name, + f.component.version, + f.published_at, + f.days_old, + ); + } + section_close(&mut out); + } + + if !enrichment.deprecated.is_empty() { + section_open( + &mut out, + "Deprecated upstream", + enrichment.deprecated.len(), + None, + ); + out.push_str( + "These dependencies are flagged deprecated or yanked by their package registry. \ + [Why this matters](https://metbcy.github.io/bomdrift/enrichers/registry.html)\n\n", + ); + out.push_str("| Ecosystem | Name | Version | Message |\n|---|---|---|---|\n"); + for f in &enrichment.deprecated { + let _ = writeln!( + out, + "| {} | {} | {} | {} |", + f.component.ecosystem, + f.component.name, + f.component.version, + f.message.as_deref().unwrap_or("(deprecated upstream)"), + ); + } + section_close(&mut out); + } + + if !enrichment.maintainer_set_changed.is_empty() { + section_open( + &mut out, + "Maintainer set changed (npm)", + enrichment.maintainer_set_changed.len(), + None, + ); + out.push_str( + "These npm dependencies have a different set of maintainers compared to the \ + previous version. New publish-rights are a classic takeover-attack precursor. 
\ + [Why this matters](https://metbcy.github.io/bomdrift/enrichers/registry.html)\n\n", + ); + out.push_str("| Name | Before | After | Added | Removed |\n|---|---|---|---|---|\n"); + for f in &enrichment.maintainer_set_changed { + let _ = writeln!( + out, + "| {} | {} | {} | {} | {} |", + f.after.name, + f.before.version, + f.after.version, + if f.added.is_empty() { + "(none)".to_string() + } else { + f.added.join(", ") + }, + if f.removed.is_empty() { + "(none)".to_string() + } else { + f.removed.join(", ") + }, + ); + } + section_close(&mut out); + } + write_footer(&mut out, &opts); out diff --git a/src/render/sarif.rs b/src/render/sarif.rs index d8f3240..70661d8 100644 --- a/src/render/sarif.rs +++ b/src/render/sarif.rs @@ -151,6 +151,38 @@ fn rules() -> Value { advisory heuristic).", "https://metbcy.github.io/bomdrift/license-policy.html", ), + rule( + "bomdrift.recently-published", + "recently-published", + "Newly added component was published to its registry recently", + "The component's most recent registry publish timestamp is \ + younger than the configured threshold (default 14 days). \ + Recent publishes correlate with takeover swaps and \ + namespace-reuse attacks. Always informational severity \ + (`warning`).", + "https://metbcy.github.io/bomdrift/enrichers/registry.html", + ), + rule( + "bomdrift.deprecated", + "deprecated", + "Component is deprecated or yanked upstream", + "The component's package registry (npm / PyPI / crates.io) \ + marks this version (or the package) as deprecated, yanked, \ + or inactive. Severity `error` because the upstream signal \ + is unambiguous.", + "https://metbcy.github.io/bomdrift/enrichers/registry.html", + ), + rule( + "bomdrift.maintainer-set-changed", + "maintainer-set-changed", + "npm package's maintainer set changed across the version bump", + "The set of npm maintainers listed for the new version \ + differs from the maintainer set listed for the old \ + version. 
New maintainers gaining publish rights is a \ + classic takeover-attack precursor (cf. xz / Jia Tan). \ + Severity `warning`.", + "https://metbcy.github.io/bomdrift/enrichers/registry.html", + ), ]) } @@ -419,6 +451,90 @@ fn results(cs: &ChangeSet, e: &Enrichment) -> Value { })); } + // ---- bomdrift.recently-published ---- + for f in &e.recently_published { + let name = &f.component.name; + let purl_or_name = f.component.purl.as_deref().unwrap_or(name); + let fp = fingerprint(&["bomdrift.recently-published", purl_or_name, &f.published_at]); + out.push(json!({ + "ruleId": "bomdrift.recently-published", + "level": "warning", + "message": { + "text": format!( + "`{name}` was published {} day(s) ago ({}). Recent publishes correlate with takeover swaps.", + f.days_old, f.published_at, + ), + }, + "locations": [synthetic_location()], + "partialFingerprints": { "primaryHash/v1": fp }, + "properties": { + "purl": f.component.purl, + "name": name, + "version": f.component.version, + "publishedAt": f.published_at, + "daysOld": f.days_old, + }, + })); + } + + // ---- bomdrift.deprecated ---- + for f in &e.deprecated { + let name = &f.component.name; + let purl_or_name = f.component.purl.as_deref().unwrap_or(name); + let msg = f.message.as_deref().unwrap_or("(deprecated upstream)"); + let fp = fingerprint(&["bomdrift.deprecated", purl_or_name, msg]); + out.push(json!({ + "ruleId": "bomdrift.deprecated", + "level": "error", + "message": { + "text": format!("`{name}` is deprecated upstream: {msg}"), + }, + "locations": [synthetic_location()], + "partialFingerprints": { "primaryHash/v1": fp }, + "properties": { + "purl": f.component.purl, + "name": name, + "version": f.component.version, + "message": msg, + }, + })); + } + + // ---- bomdrift.maintainer-set-changed ---- + for f in &e.maintainer_set_changed { + let name = &f.after.name; + let purl_or_name = f.after.purl.as_deref().unwrap_or(name); + let added = f.added.join(","); + let removed = f.removed.join(","); + let fp = 
fingerprint(&[ + "bomdrift.maintainer-set-changed", + purl_or_name, + &added, + &removed, + ]); + out.push(json!({ + "ruleId": "bomdrift.maintainer-set-changed", + "level": "warning", + "message": { + "text": format!( + "`{name}` maintainer set changed: +{} / -{}.", + if added.is_empty() { "(none)".into() } else { added.clone() }, + if removed.is_empty() { "(none)".into() } else { removed.clone() }, + ), + }, + "locations": [synthetic_location()], + "partialFingerprints": { "primaryHash/v1": fp }, + "properties": { + "purl": f.after.purl, + "name": name, + "before": f.before.version, + "after": f.after.version, + "added": f.added, + "removed": f.removed, + }, + })); + } + Value::Array(out) } @@ -491,6 +607,9 @@ mod tests { "bomdrift.young-maintainer", "bomdrift.license-change", "bomdrift.license-violation", + "bomdrift.recently-published", + "bomdrift.deprecated", + "bomdrift.maintainer-set-changed", ], "rule IDs are stable public API — order also stable for byte-determinism", ); From a1348257944a6c7ea1f3304e5f75292ce53a989c Mon Sep 17 00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:44:33 -0700 Subject: [PATCH 6/8] feat(gitlab): comment-driven suppress with security-reviewed Cloudflare Worker bridge New `bomdrift baseline add --from-comment ` flag. Parses the raw note body, extracts the first `/bomdrift suppress [ reason: ]` directive, validates the ID (GHSA/CVE/MAL/OSV), and either appends the suppression or exits non-zero with a clear stderr message so a misconfigured webhook bridge fails loudly. The grammar mirrors comment-suppress/entrypoint.sh's; a cross-reference comment in both files notes the lockstep contract. examples/gitlab-ci/comment-bridge/ ships a Cloudflare Worker reference implementation enforcing five guards: 1. Webhook secret (constant-time X-Gitlab-Token compare). 2. Event-type filter (Note Hook only). 3. Project-ID allowlist. 4. Commenter access_level >= 30. 5. MR-context guard (rejects fork-MR exfiltration). 
The bridge triggers a GitLab pipeline that runs `bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY"` and pushes the change. suppress.gitlab-ci.yml is updated to handle both the manual (BOMDRIFT_SUPPRESS_ID) and bridge (BOMDRIFT_NOTE_BODY) paths. Includes a Vercel/Netlify/Lambda port note and a deployment guide with a documented threat model. gitlab-ci.md gains an Advanced: Comment-driven suppression section with the trade-off statement up front; baseline.md covers the new --from-comment flag. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- comment-suppress/entrypoint.sh | 5 + docs/src/baseline.md | 18 ++++ docs/src/gitlab-ci.md | 42 ++++++++ examples/gitlab-ci/comment-bridge/README.md | 82 ++++++++++++++ .../comment-bridge/vercel-equivalent.md | 28 +++++ examples/gitlab-ci/comment-bridge/worker.js | 92 ++++++++++++++++ examples/gitlab-ci/suppress.gitlab-ci.yml | 48 ++++++--- src/baseline.rs | 101 ++++++++++++++++++ src/cli.rs | 22 +++- src/lib.rs | 33 +++++- 10 files changed, 449 insertions(+), 22 deletions(-) create mode 100644 examples/gitlab-ci/comment-bridge/README.md create mode 100644 examples/gitlab-ci/comment-bridge/vercel-equivalent.md create mode 100644 examples/gitlab-ci/comment-bridge/worker.js diff --git a/comment-suppress/entrypoint.sh b/comment-suppress/entrypoint.sh index a46c903..bd470f3 100755 --- a/comment-suppress/entrypoint.sh +++ b/comment-suppress/entrypoint.sh @@ -72,6 +72,11 @@ fi # is preserved alongside the advisory id. Pattern matches the start of # any line (case-insensitive) so reviewers can write # `reason: awaiting upstream patch (issue #42)` on a continuation line. +# +# This shell parser MUST stay in lockstep with the Rust +# `baseline::parse_comment_directive` parser used by the GitLab +# webhook bridge (`bomdrift baseline add --from-comment`). Any +# grammar change has to land in both places. 
reason="$(printf '%s\n' "$comment_body" \ | grep -iE '^\s*reason:\s*' \ | head -n1 \ diff --git a/docs/src/baseline.md b/docs/src/baseline.md index 59099b3..48156bd 100644 --- a/docs/src/baseline.md +++ b/docs/src/baseline.md @@ -145,6 +145,24 @@ bomdrift baseline add CVE-2026-12345 --path custom/baseline.json The command is idempotent — re-adding an existing ID is a no-op. +### `--from-comment` (v0.9+) + +When the GitLab comment-suppress bridge (or any other webhook +handler) hands you a raw note body, pass it via `--from-comment` +and let bomdrift extract the directive: + +```bash +bomdrift baseline add --from-comment "Looks fine. /bomdrift suppress GHSA-mwcw-c2x4-8c55 reason: vendor PR #42 already merged" +``` + +The flag accepts the entire comment body. bomdrift parses the first +`/bomdrift suppress [ reason: ]` line, validates the ID +shape, and either appends the entry (writing object-form when a +reason is present) or exits non-zero with a clear stderr message +when no directive is found. The grammar is identical to the GitHub +`comment-suppress` sub-action — the two parsers are deliberately +kept in lockstep. + ## Workflow integration A typical CI pattern commits the baseline alongside the source code and diff --git a/docs/src/gitlab-ci.md b/docs/src/gitlab-ci.md index e768e1e..2336200 100644 --- a/docs/src/gitlab-ci.md +++ b/docs/src/gitlab-ci.md @@ -176,3 +176,45 @@ permissions). | In-comment suppression | ✅ | v0.8 | | Manual suppression job | n/a | ✅ | | `` marker | ✅ | ✅ (same shape — cross-platform tooling can grep one shape) | + +## Comment-driven suppression (advanced) + +> **Trade-off up front.** Comment-driven suppression turns a +> reviewer comment like `/bomdrift suppress GHSA-...` into an +> automatic baseline edit. To wire it up safely you need to operate +> a small public webhook handler. The manual suppression job +> documented above is supported and lower-risk; reach for the +> bridge only when the zero-click UX is worth running a service. 
+ +The GitHub flow ships out-of-the-box (`comment-suppress` sub-action +fronted by the existing webhook). GitLab requires a webhook handler +because GitLab's `Note Hook` doesn't include a command-prefix filter. + +### Bridge + +`examples/gitlab-ci/comment-bridge/` ships a Cloudflare Worker +reference implementation that enforces five security guards: + +1. Webhook secret verification (constant-time `X-Gitlab-Token`). +2. Event-type filter (`Note Hook` only). +3. Project-ID allowlist. +4. Commenter access_level >= 30 (Developer+ on the project). +5. MR-context guard (rejects fork-MR comment exfiltration). + +When the guards pass, the worker triggers the GitLab pipeline with +`BOMDRIFT_NOTE_BODY` set to the raw comment body. The +`bomdrift:suppress` job in `suppress.gitlab-ci.yml` then runs +`bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY"` to +extract the directive and update `.bomdrift/baseline.json`. + +The threat model is documented in +[`examples/gitlab-ci/comment-bridge/README.md`](https://github.com/Metbcy/bomdrift/tree/main/examples/gitlab-ci/comment-bridge#threat-model). +The same logic ports to Vercel / Netlify / AWS Lambda — see +[`vercel-equivalent.md`](https://github.com/Metbcy/bomdrift/blob/main/examples/gitlab-ci/comment-bridge/vercel-equivalent.md). + +### Recommended hosting + +Cloudflare Workers — the reference. The free tier covers most +webhook traffic. `wrangler tail` makes live debugging easy. +Vercel / Netlify Edge Functions are equally good if your team +already operates on those platforms. 
diff --git a/examples/gitlab-ci/comment-bridge/README.md b/examples/gitlab-ci/comment-bridge/README.md new file mode 100644 index 0000000..c13e19a --- /dev/null +++ b/examples/gitlab-ci/comment-bridge/README.md @@ -0,0 +1,82 @@ +# GitLab comment-driven suppress bridge + +Reference implementation of the webhook handler that turns a +`/bomdrift suppress ` MR comment on GitLab into a manual +pipeline trigger which runs +`bomdrift baseline add --from-comment ` on the MR branch. + +The bridge is **opt-in advanced infrastructure**. Most teams should +prefer the manual flow in `examples/gitlab-ci/README.md`. Only +deploy this if you've decided the zero-click suppression UX is worth +operating a small public service. + +## Architecture + +``` +┌─────────────────┐ Note Hook ┌─────────────────────┐ +│ GitLab MR note │ ───────────────▶ │ Cloudflare Worker │ +│ /bomdrift … │ X-Gitlab-Token │ (this directory) │ +└─────────────────┘ └─────────┬───────────┘ + │ verifies 5 guards + ▼ + ┌─────────────────────┐ + │ GitLab pipeline │ + │ trigger │ + └─────────┬───────────┘ + ▼ + ┌─────────────────────┐ + │ bomdrift baseline │ + │ add --from-comment│ + └─────────────────────┘ +``` + +## Threat model + +Five guards. Each prevents a distinct class of attack: + +| # | Guard | Attack prevented | +|---|---|---| +| 1 | **Webhook secret verification** (`X-Gitlab-Token` constant-time compare) | Unauthenticated POSTs from anyone on the internet. | +| 2 | **Event-type filter** (only `Note Hook`) | Type-confusion: a forged `Push Hook` body that contains a `/bomdrift suppress` line. | +| 3 | **Project-ID allowlist** | Foreign-project replay. | +| 4 | **Commenter-permission check** (`access_level >= 30`, Developer+) | Random outsiders commenting `/bomdrift suppress …` on a public project. | +| 5 | **MR-context guard** (`merge_request.state == "opened"` AND `target_project_id == project.id`) | Fork-MR exfiltration. | + +Failures return 4xx without invoking the pipeline trigger. 
Use +`wrangler tail` for live debugging. + +## Deployment + +1. `npm install -g wrangler` +2. `wrangler secret put` for `WEBHOOK_SECRET`, `PROJECT_ALLOWLIST`, + `GITLAB_API_URL`, `BOT_API_TOKEN`, `PIPELINE_TRIGGER_TOKEN`. +3. `wrangler deploy`. +4. In GitLab → Settings → Webhooks: add the Worker URL with + **Comments** events, SSL verification ON, and the + `WEBHOOK_SECRET` value. +5. Smoke-test by adding `/bomdrift suppress GHSA-test-1234-aaaa` to + an MR comment. `wrangler tail` should show the trigger firing. + +## CLI smoke test (no Worker) + +The actual `--from-comment` parser is unit-tested in the bomdrift +crate. To smoke-test locally: + +```sh +bomdrift baseline add --from-comment "Looks fine. /bomdrift suppress GHSA-mwcw-c2x4-8c55" +# stderr: bomdrift: added 'GHSA-mwcw-c2x4-8c55' to .bomdrift/baseline.json +bomdrift baseline add --from-comment "no directive here" +# exit code: 1 +``` + +## Troubleshooting + +| Symptom | Likely cause | +|---|---| +| 401 from worker | `X-Gitlab-Token` mismatch with `WEBHOOK_SECRET`. | +| 403 from worker | Project not allowlisted, or commenter lacks Developer access, or fork-MR. | +| 502 from worker, no pipeline trigger | `PIPELINE_TRIGGER_TOKEN` invalid; check `wrangler tail`. | + +## Hosting alternatives + +See [`vercel-equivalent.md`](./vercel-equivalent.md). diff --git a/examples/gitlab-ci/comment-bridge/vercel-equivalent.md b/examples/gitlab-ci/comment-bridge/vercel-equivalent.md new file mode 100644 index 0000000..16d06a9 --- /dev/null +++ b/examples/gitlab-ci/comment-bridge/vercel-equivalent.md @@ -0,0 +1,28 @@ +# Hosting the comment-suppress bridge on Vercel / Netlify / AWS Lambda + +The Cloudflare Worker reference implementation in `worker.js` uses +only the standard Web Fetch API (`Request`, `Response`, `fetch`, +`FormData`).
It ports to other edge-function platforms with minimal +adaptation: + +- **Vercel Edge Functions** — drop `worker.js` in + `api/bomdrift.js`, rename `export default { fetch }` to + `export default async function handler(request)`. Configure env + vars in the Vercel UI. +- **Netlify Edge Functions** — same shape; configure via + `netlify.toml` and the Netlify UI. +- **AWS Lambda@Edge / Lambda + API Gateway** — wrap the handler in + the Lambda event/response envelope; port `FormData` to + `URLSearchParams`. + +The threat model (five guards) is the same on every host. Only +these change per host: + +1. How env vars are injected. +2. How the body is read. +3. The deploy command. + +Recommend Cloudflare Workers as the reference because its free tier +(100k req/day) covers most webhook traffic and the deploy story +(`wrangler deploy`) is the simplest. Vercel / Netlify are equally +good if your team already operates on those platforms. diff --git a/examples/gitlab-ci/comment-bridge/worker.js b/examples/gitlab-ci/comment-bridge/worker.js new file mode 100644 index 0000000..253258d --- /dev/null +++ b/examples/gitlab-ci/comment-bridge/worker.js @@ -0,0 +1,92 @@ +/* Cloudflare Worker — GitLab comment-driven suppress bridge. + * + * Five guards before triggering the bomdrift suppress pipeline: + * 1. Webhook secret (constant-time compare). + * 2. Event-type filter (Note Hook only). + * 3. Project-ID allowlist. + * 4. Commenter access_level >= 30. + * 5. MR-context guard (state=opened AND target_project_id===project.id). + * + * Required secrets: + * WEBHOOK_SECRET, PROJECT_ALLOWLIST, GITLAB_API_URL, BOT_API_TOKEN, + * PIPELINE_TRIGGER_TOKEN. + */ + +export default { + async fetch(request, env) { + if (request.method !== "POST") return new Response("method", { status: 405 }); + + // Guard 1. + const provided = request.headers.get("X-Gitlab-Token") ?? ""; + if (!constantTimeEqual(provided, env.WEBHOOK_SECRET ?? 
"")) { + return new Response("forbidden", { status: 401 }); + } + + // Guard 2. + if ((request.headers.get("X-Gitlab-Event") ?? "") !== "Note Hook") { + return new Response("ignored", { status: 204 }); + } + + let body; + try { + body = await request.json(); + } catch { + return new Response("bad json", { status: 400 }); + } + + // Guard 3. + const projectId = body?.project?.id; + const allow = (env.PROJECT_ALLOWLIST ?? "").split(",").map((s) => s.trim()); + if (!projectId || !allow.includes(String(projectId))) { + return new Response("project not allowlisted", { status: 403 }); + } + + // Guard 5. + const mr = body?.merge_request; + if (!mr || mr.state !== "opened") { + return new Response("not an open MR", { status: 204 }); + } + if (mr.target_project_id !== projectId) { + return new Response("fork-MR refused", { status: 403 }); + } + + // Quick parse: comment looks like a directive? + const text = body?.object_attributes?.note ?? ""; + if (!/\/bomdrift\s+suppress\s+\S+/.test(text)) { + return new Response("no directive", { status: 204 }); + } + + // Guard 4. + const userId = body?.user?.id ?? body?.object_attributes?.author_id; + if (!userId) return new Response("no commenter id", { status: 400 }); + const memberUrl = `${env.GITLAB_API_URL}/api/v4/projects/${projectId}/members/all/${userId}`; + const memberResp = await fetch(memberUrl, { + headers: { "PRIVATE-TOKEN": env.BOT_API_TOKEN }, + }); + if (!memberResp.ok) return new Response("permission lookup failed", { status: 403 }); + const member = await memberResp.json(); + if ((member.access_level ?? 0) < 30) { + return new Response("commenter not Developer+", { status: 403 }); + } + + // All guards passed. Trigger the pipeline. The directive body is + // forwarded as `BOMDRIFT_NOTE_BODY`; the suppress job invokes + // `bomdrift baseline add --from-comment "$BOMDRIFT_NOTE_BODY"`. + const triggerUrl = `${env.GITLAB_API_URL}/api/v4/projects/${projectId}/trigger/pipeline`; + const ref = mr.source_branch ?? 
"main"; + const form = new FormData(); + form.append("token", env.PIPELINE_TRIGGER_TOKEN); + form.append("ref", ref); + form.append("variables[BOMDRIFT_NOTE_BODY]", text); + const trig = await fetch(triggerUrl, { method: "POST", body: form }); + if (!trig.ok) return new Response("pipeline trigger failed", { status: 502 }); + return new Response("triggered", { status: 204 }); + }, +}; + +function constantTimeEqual(a, b) { + if (a.length !== b.length) return false; + let acc = 0; + for (let i = 0; i < a.length; i++) acc |= a.charCodeAt(i) ^ b.charCodeAt(i); + return acc === 0; +} diff --git a/examples/gitlab-ci/suppress.gitlab-ci.yml b/examples/gitlab-ci/suppress.gitlab-ci.yml index 626f2d5..4d9207a 100644 --- a/examples/gitlab-ci/suppress.gitlab-ci.yml +++ b/examples/gitlab-ci/suppress.gitlab-ci.yml @@ -28,10 +28,15 @@ bomdrift:suppress: rules: - if: $CI_PIPELINE_SOURCE == "merge_request_event" when: manual + # When the comment-bridge worker triggers a pipeline, the source + # is `pipeline` and the directive arrives as `BOMDRIFT_NOTE_BODY`. + - if: $BOMDRIFT_NOTE_BODY + when: always variables: BOMDRIFT_SUPPRESS_ID: "" + BOMDRIFT_NOTE_BODY: "" BOMDRIFT_BASELINE_PATH: ".bomdrift/baseline.json" - BOMDRIFT_VERSION: "v0.7.0" + BOMDRIFT_VERSION: "v0.9.0" before_script: - apk add --no-cache bash curl git tar gzip ca-certificates script: @@ -39,22 +44,11 @@ bomdrift:suppress: set -euo pipefail bash <<'BOMDRIFT' - if [ -z "${BOMDRIFT_SUPPRESS_ID}" ]; then - echo "BOMDRIFT_SUPPRESS_ID is empty — pass an advisory ID (GHSA / CVE / MAL pattern) when triggering the job" >&2 - exit 1 - fi if [ -z "${BOMDRIFT_PUSH_TOKEN:-}" ]; then echo "BOMDRIFT_PUSH_TOKEN not set; cannot push baseline update" >&2 exit 1 fi - # --- Validate the ID shape early; the CLI does this too but a clear - # --- error here saves the download round-trip. 
----------------------- - case "${BOMDRIFT_SUPPRESS_ID}" in - GHSA-* | CVE-* | MAL-*) ;; - *) echo "Unrecognized advisory ID shape: ${BOMDRIFT_SUPPRESS_ID}" >&2; exit 1 ;; - esac - ARCH="$(uname -m)" case "$ARCH" in x86_64) ARCH="x86_64-unknown-linux-musl" ;; @@ -67,22 +61,42 @@ bomdrift:suppress: chmod +x ./bomdrift # --- Resolve MR head ref -------------------------------------------- - HEAD_REF="${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME}" + HEAD_REF="${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME:-${CI_COMMIT_REF_NAME}}" git fetch origin "${HEAD_REF}" git checkout "${HEAD_REF}" - # --- Append + commit ------------------------------------------------- - ./bomdrift baseline add "${BOMDRIFT_SUPPRESS_ID}" --path "${BOMDRIFT_BASELINE_PATH}" + # --- Two paths ------------------------------------------------------- + # Path A: comment-bridge supplied the raw note body — let the + # bomdrift CLI parse the `/bomdrift suppress [ reason: ...]` + # directive (matches the comment-suppress sub-action grammar). + # + # Path B (manual fallback): use the explicit BOMDRIFT_SUPPRESS_ID + # variable — same shape as v0.7 / v0.8. + if [ -n "${BOMDRIFT_NOTE_BODY}" ]; then + ./bomdrift baseline add \ + --from-comment "${BOMDRIFT_NOTE_BODY}" \ + --path "${BOMDRIFT_BASELINE_PATH}" + else + if [ -z "${BOMDRIFT_SUPPRESS_ID}" ]; then + echo "Either BOMDRIFT_NOTE_BODY or BOMDRIFT_SUPPRESS_ID must be set" >&2 + exit 1 + fi + case "${BOMDRIFT_SUPPRESS_ID}" in + GHSA-* | CVE-* | MAL-*) ;; + *) echo "Unrecognized advisory ID shape: ${BOMDRIFT_SUPPRESS_ID}" >&2; exit 1 ;; + esac + ./bomdrift baseline add "${BOMDRIFT_SUPPRESS_ID}" --path "${BOMDRIFT_BASELINE_PATH}" + fi if git diff --quiet -- "${BOMDRIFT_BASELINE_PATH}"; then - echo "Baseline unchanged — ${BOMDRIFT_SUPPRESS_ID} was already suppressed." + echo "Baseline unchanged — directive was a no-op." 
exit 0 fi git config user.email "bomdrift-suppress@${CI_SERVER_HOST}" git config user.name "bomdrift suppress" git add "${BOMDRIFT_BASELINE_PATH}" - git commit -m "chore(bomdrift): suppress ${BOMDRIFT_SUPPRESS_ID}" + git commit -m "chore(bomdrift): suppress (via $( [ -n "${BOMDRIFT_NOTE_BODY}" ] && echo comment || echo manual ))" AUTH_REMOTE="https://oauth2:${BOMDRIFT_PUSH_TOKEN}@${CI_SERVER_HOST}/${CI_PROJECT_PATH}.git" git push "${AUTH_REMOTE}" "HEAD:${HEAD_REF}" diff --git a/src/baseline.rs b/src/baseline.rs index 195ca23..b114ef2 100644 --- a/src/baseline.rs +++ b/src/baseline.rs @@ -439,6 +439,72 @@ pub enum AddOutcome { AlreadyPresent, } +/// Parse the body of a PR/MR comment and extract a single +/// `/bomdrift suppress [ reason: ]` directive. The grammar +/// is documented in CLI help and in +/// `examples/gitlab-ci/comment-bridge/`'s threat model. The same +/// shape is honored by `comment-suppress/entrypoint.sh` for the +/// GitHub flow — keep these in lockstep. +/// +/// Returns `Ok(Some((id, optional_reason)))` on a single match, +/// `Ok(None)` on no match, `Err` on a malformed ID. +pub fn parse_comment_directive(body: &str) -> Result<Option<(String, Option<String>)>> { + // Looks for `/bomdrift suppress [ reason: ]` on each + // line; the directive may be preceded by free-form prose and/or + // mention markers. A leading `^\s*` anchor on the directive itself + // is too strict — reviewers paste the directive after a comment.
+ for line in body.lines() { + let Some(idx) = line.find("/bomdrift") else { + continue; + }; + let rest = &line[idx + "/bomdrift".len()..]; + let rest = rest.trim_start(); + let Some(rest) = rest.strip_prefix("suppress") else { + continue; + }; + let rest = rest.trim_start(); + if rest.is_empty() { + continue; + } + let mut iter = rest.splitn(2, char::is_whitespace); + let raw_id = iter.next().unwrap_or("").trim(); + if raw_id.is_empty() { + continue; + } + if !is_valid_advisory_id(raw_id) { + anyhow::bail!( + "comment directive contained a malformed advisory ID: {raw_id:?} \ + (expected GHSA-/CVE-/MAL-/OSV- prefix and alnum/dash body)" + ); + } + let reason = iter.next().and_then(|tail| { + let tail = tail.trim(); + tail.strip_prefix("reason:") + .map(|r| r.trim().to_string()) + .filter(|s| !s.is_empty()) + }); + return Ok(Some((raw_id.to_string(), reason))); + } + Ok(None) +} + +fn is_valid_advisory_id(s: &str) -> bool { + // Aligns with comment-suppress/entrypoint.sh's regex: + // ^(GHSA-[a-z0-9-]+|CVE-[0-9]{4}-[0-9]+|MAL-[0-9]{4}-[0-9]+|OSV-[A-Z0-9-]+)$ + // Kept slightly looser here (we accept GHSA-uppercase and OSV-* too) + // so future advisory schemes don't trip the bridge unnecessarily. + let Some((prefix, rest)) = s.split_once('-') else { + return false; + }; + if !matches!(prefix, "GHSA" | "CVE" | "MAL" | "OSV") { + return false; + } + if rest.is_empty() { + return false; + } + rest.chars().all(|c| c.is_ascii_alphanumeric() || c == '-') +} + fn doc_kind(v: &serde_json::Value) -> &'static str { match v { serde_json::Value::Null => "null", @@ -892,4 +958,39 @@ mod tests { std::fs::create_dir_all(&path).unwrap(); path } + + // ---- v0.9 comment-directive parser ---- + + #[test] + fn parse_comment_directive_extracts_id_only() { + let body = "Looks fine. 
/bomdrift suppress GHSA-mwcw-c2x4-8c55"; + let r = parse_comment_directive(body).unwrap().unwrap(); + assert_eq!(r.0, "GHSA-mwcw-c2x4-8c55"); + assert_eq!(r.1, None); + } + + #[test] + fn parse_comment_directive_extracts_id_and_reason() { + let body = "/bomdrift suppress CVE-2024-12345 reason: vendor confirmed false-positive"; + let r = parse_comment_directive(body).unwrap().unwrap(); + assert_eq!(r.0, "CVE-2024-12345"); + assert_eq!(r.1.as_deref(), Some("vendor confirmed false-positive")); + } + + #[test] + fn parse_comment_directive_returns_none_when_no_directive() { + assert!( + parse_comment_directive("no directive here") + .unwrap() + .is_none() + ); + } + + #[test] + fn parse_comment_directive_rejects_malformed_id() { + let err = parse_comment_directive("/bomdrift suppress not-an-id") + .unwrap_err() + .to_string(); + assert!(err.contains("malformed")); + } } diff --git a/src/cli.rs b/src/cli.rs index 14552db..3bade4d 100644 --- a/src/cli.rs +++ b/src/cli.rs @@ -69,7 +69,10 @@ pub struct BaselineAddArgs { /// match by ID. Use the diff-output baseline format (the JSON shape /// emitted by `bomdrift diff --output json`) for finer per-purl /// suppression instead. - pub id: String, + /// + /// Optional when `--from-comment` is supplied — the directive in + /// the comment body provides the ID instead. + pub id: Option<String>, /// Path to the baseline file. Created if missing; parent directory is /// created if missing. @@ -88,6 +91,23 @@ pub struct BaselineAddArgs { /// the entry expires. Free-form text. #[arg(long)] pub reason: Option<String>, + + /// Parse the body of a forge-issued PR/MR comment and extract the + /// suppress directive. Accepts the raw note body as a single + /// string. The directive grammar (matched case-sensitively + /// anywhere on a line, so free-form prose may precede it): + /// + /// ```text + /// /bomdrift suppress [ reason: ] + /// ``` + /// + /// `` must match `(?:GHSA|CVE|MAL|OSV)-[A-Za-z0-9-]+`.
When no + /// matching line is found, the command exits with a non-zero code + /// and prints a clear stderr message — so a webhook bridge that + /// invokes this flag doesn't silently no-op on a non-suppress + /// comment. v0.9+. + #[arg(long)] + pub from_comment: Option<String>, } #[derive(Args, Debug)] diff --git a/src/lib.rs b/src/lib.rs index 7d3aa5d..f8d679f 100644 --- a/src/lib.rs +++ b/src/lib.rs @@ -75,24 +75,49 @@ fn run_baseline(action: BaselineAction) -> Result<()> { clock::parse_ymd(s) .with_context(|| format!("--expires must be YYYY-MM-DD, got {s:?}"))?; } + + // --from-comment overrides positional id/reason. Used by the + // GitLab webhook bridge (Phase L). Non-zero exit when the + // body has no directive — silent no-op would let mis-configured + // bridges look like they worked. + let (id, reason_owned) = if let Some(body) = &args.from_comment { + match baseline::parse_comment_directive(body)? { + Some((id, reason)) => (id, reason), + None => { + eprintln!( + "bomdrift: --from-comment body contained no `/bomdrift suppress ` directive" + ); + std::process::exit(1); + } + } + } else { + let Some(id) = args.id.clone() else { + eprintln!( + "bomdrift baseline add: missing required ADVISORY_ID (use a positional argument or --from-comment )" + ); + std::process::exit(2); + }; + (id, args.reason.clone()) + }; + let outcome = baseline::add_suppression_full( &args.path, - &args.id, + &id, args.expires.as_deref(), - args.reason.as_deref(), + reason_owned.as_deref(), )?; match outcome { baseline::AddOutcome::Added => { eprintln!( "bomdrift: added '{id}' to {path}", - id = args.id.trim(), + id = id.trim(), path = args.path.display(), ); } baseline::AddOutcome::AlreadyPresent => { eprintln!( "bomdrift: '{id}' already present in {path}; no change", - id = args.id.trim(), + id = id.trim(), path = args.path.display(), ); } From affaac087f7ba2c2b9d58fe438260e93e02efb56 Mon Sep 17 00:00:00 2001 From: bomdrift Date: Wed, 29 Apr 2026 14:46:14 -0700 Subject: [PATCH 7/8] docs:
explicit non-goals + pairing recommendations MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit README gains a fleshed-out Non-goals section covering reachability, tarball static analysis, auto-fix PR generation, continuous monitoring, container scans, SAST/secrets, risk-score dashboards, and closed-source advisory feeds — each with a one-line rationale. A Pair with... table summarizes recommended complementary tools. STATUS adopts the same content (shorter form) as 'Out-of-scope by design' alongside the now-shipped v0.9 capability rows. docs/src/roadmap.md is reorganized: v0.9 items move to Shipped, v0.8 gains a recap, and the Future candidates list is updated with items the v0.9 phases didn't cover (per-exception SPDX granularity, PyPI/crates.io maintainer-set, OpenVEX vocabulary). Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> --- README.md | 55 ++++++++++++++++++++-- STATUS.md | 29 ++++++++++-- docs/src/roadmap.md | 111 ++++++++++++++++++-------------------------- 3 files changed, 123 insertions(+), 72 deletions(-) diff --git a/README.md b/README.md index c808598..e4acd94 100644 --- a/README.md +++ b/README.md @@ -257,9 +257,58 @@ PRs welcome. The `good first issue` label tracks focused asks for new contributo ## Non-goals -- **SBOM generation.** Use [Syft](https://github.com/anchore/syft) — it's already great. bomdrift only consumes SBOMs (and as of v0.5 invokes Syft itself inside the Action so consumers don't have to). -- **Dependency-tree visualization.** [`cargo tree`](https://doc.rust-lang.org/cargo/commands/cargo-tree.html), [`pnpm why`](https://pnpm.io/cli/why), and friends do this well. -- **Replacing your SCA scanner.** OSV-scanner, Grype, Trivy all have richer vulnerability databases. bomdrift's CVE enrichment is *change-focused*: only on what's new in this diff. 
+bomdrift's design constraints (OSS-first, single-binary, no +telemetry, change-focused) put a number of capabilities deliberately +out of scope. We don't ship them, but we recommend pairing bomdrift +with tools that do. + +- **SBOM generation.** Use [Syft](https://github.com/anchore/syft) — + it's already great. bomdrift only consumes SBOMs (and as of v0.5 + invokes Syft itself inside the Action so consumers don't have to). +- **Dependency-tree visualization.** + [`cargo tree`](https://doc.rust-lang.org/cargo/commands/cargo-tree.html), + [`pnpm why`](https://pnpm.io/cli/why), and friends do this well. +- **Reachability / call-graph analysis.** "Is this CVE reachable + from my code's entry points?" requires AST + call-graph + infrastructure orthogonal to SBOM diffing. *Pair with Endor Labs + or Snyk Reachability.* +- **Static analysis of registry tarballs.** Detecting malicious code + inside a published package needs a sandbox + behavior heuristics. + *Pair with [Socket](https://socket.dev/).* +- **Auto-fix PR generation.** bomdrift surfaces findings; it doesn't + open follow-up PRs. *Pair with Renovate or Dependabot.* +- **Continuous monitoring / always-on agent.** bomdrift is a + one-shot CLI invoked from CI. There's no daemon, no telemetry, no + scheduled background polling. *Run bomdrift in a scheduled CI + workflow if you want periodic re-checks.* +- **Container / OCI image scanning.** SBOM + image-layer scanning is + Trivy / Grype's lane. Use them; bomdrift focuses on + application-dependency drift between two SBOMs. +- **SAST / secrets scanning.** Different problem space; well + served by GitHub Advanced Security, Semgrep, or gitleaks. +- **Risk-score dashboards / asset-context aggregation.** Cross-repo + dashboards inevitably require telemetry, which violates bomdrift's + no-telemetry tenet. 
*Pair with Endor / Snyk if your org needs + centralized risk reporting.* +- **Closed-source advisory databases.** bomdrift uses OSV.dev (the + open advisory aggregator). Closed proprietary feeds aren't + consumed in the OSS distribution. +- **Replacing your SCA scanner.** OSV-scanner, Grype, Trivy all + have richer vulnerability databases for *full-tree* scans. + bomdrift's CVE enrichment is **change-focused**: only on what's + new in this diff. + +### Pair with… + +| Need | Recommended tool | +|---|---| +| Reachability analysis | Endor Labs, Snyk Reachability | +| Tarball / behavior analysis | Socket | +| Auto-fix PRs | Renovate, Dependabot | +| Container image scans | Trivy, Grype | +| SAST / secrets | GitHub Advanced Security, Semgrep, gitleaks | +| Cross-repo risk dashboards | Endor, Snyk | +| SBOM generation | Syft (bomdrift bundles this in the Action) | ## License diff --git a/STATUS.md b/STATUS.md index 31878a2..a78b1d8 100644 --- a/STATUS.md +++ b/STATUS.md @@ -17,12 +17,35 @@ keeping the project OSS-first: no hosted dashboard, no account, no telemetry. 
 | CISA KEV (known-exploited) flagging | Supported (v0.8+) — auto, opt-out via `--no-kev` |
 | License allow/deny policy | Supported (v0.8+) — `[license]` block / CLI flags |
 | Suppression expiry (`expires` + `reason`) | Supported (v0.8+) — time-boxed risk acceptance |
-| GitLab CI merge requests | Supported through the `examples/gitlab-ci/` template (v0.7+); in-comment suppression deferred to v0.9 |
+| GitLab CI merge requests | Supported through the `examples/gitlab-ci/` template (v0.7+); comment-driven suppression supported via Cloudflare Worker bridge (v0.9+) |
 | GitHub Enterprise / self-hosted runners | Expected to work, not broadly tested yet |
-| Bitbucket / Azure DevOps | Planned for v0.9 |
-| VEX consume / emit | Planned for v0.9 |
+| Bitbucket Pipelines | Supported (v0.9+) — `examples/bitbucket-pipelines/` |
+| Azure DevOps Pipelines | Supported (v0.9+) — `examples/azure-devops/` |
+| VEX consume / emit | Supported (v0.9+) — OpenVEX 0.2.0 + CycloneDX VEX 1.6 |
+| SPDX expression evaluation | Supported (v0.9+) — full `Expression::evaluate` via `spdx` crate |
+| Registry-metadata enrichers (npm/PyPI/crates.io) | Supported (v0.9+) — recently-published, deprecated, maintainer-set-changed |
 | Hosted dashboard / SaaS | Not planned |

+## Out-of-scope by design
+
+bomdrift's design constraints (OSS-first, single-binary, no
+telemetry, change-focused) put a number of capabilities deliberately
+out of scope. Pair bomdrift with the suggested complementary tools
+when you need them — see the README's
+[Non-goals](https://github.com/Metbcy/bomdrift#non-goals) section
+for the rationale.
+
+| Out-of-scope | Pair with |
+|---|---|
+| Reachability / call-graph analysis | Endor Labs, Snyk Reachability |
+| Tarball / behavior analysis | Socket |
+| Auto-fix PR generation | Renovate, Dependabot |
+| Container / OCI image scanning | Trivy, Grype |
+| SAST / secrets scanning | GitHub Advanced Security, Semgrep, gitleaks |
+| Risk-score dashboards (cross-repo) | Endor, Snyk |
+| Continuous monitoring / always-on agent | Run bomdrift in scheduled CI |
+| Closed-source advisory feeds | bomdrift uses OSV.dev only |
+
 ## Known limitations

 - The zero-config Action path is built for `pull_request` workflows. For
diff --git a/docs/src/roadmap.md b/docs/src/roadmap.md
index a3ecc8c..03ce1b3 100644
--- a/docs/src/roadmap.md
+++ b/docs/src/roadmap.md
@@ -3,87 +3,66 @@
 What's planned, what's deliberately out of scope, and what the acceptance
 criteria for new contributions look like.

+## Shipped (v0.9 — interoperability + breadth)
+
+- **VEX consume** — `--vex ` accepts OpenVEX 0.2.0 + CycloneDX
+  VEX 1.6 statements; `not_affected` / `fixed` suppress findings,
+  `under_investigation` annotates.
+- **VEX emit** — `--emit-vex ` emits an OpenVEX 0.2.0 document
+  with explicit per-entry `vex_status` (default
+  `under_investigation`, never auto-promoted).
+- **Full SPDX expression evaluator** via the `spdx` crate. Deprecates
+  `allow_ambiguous`.
+- **Bitbucket Pipelines + Azure DevOps Pipelines** templates with
+  auto-detection (`BITBUCKET_BUILD_NUMBER`, `TF_BUILD`) and
+  per-platform footer shapes.
+- **Registry-metadata enrichers** — npm/PyPI/crates.io. New kinds:
+  recently-published, deprecated, maintainer-set-changed (npm only).
+- **GitLab comment-driven suppression** via a security-reviewed
+  Cloudflare Worker reference bridge (five guards).
+- **Explicit non-goals + pair-with recommendations** in README and
+  STATUS.
+
 ## Shipped (v0.8 — supply-chain hardening)

-- **SARIF + GitHub Code Scanning** with stable per-result fingerprints
+- SARIF + GitHub Code Scanning with stable per-result fingerprints
   and one-line action opt-in (`upload-to-code-scanning: true`).
-- **EPSS scoring** on every CVE-aliased advisory; `--fail-on-epss`
-  threshold gating.
-- **CISA KEV flagging** of known-exploited advisories;
-  `--fail-on kev`.
-- **License allow/deny policy** with `*`-suffix glob matching and
+- EPSS scoring on every CVE-aliased advisory; `--fail-on-epss`.
+- CISA KEV flagging of known-exploited advisories; `--fail-on kev`.
+- License allow/deny policy with `*`-suffix glob matching and
   fail-closed compound-expression handling. New
   `bomdrift.license-violation` SARIF rule.
-- **Baseline `expires` + `reason`** for time-boxed risk acceptance,
-  with stderr warnings on expired entries.
-- **`time` crate adoption + `clock` module** — single source of truth
-  for date/time, honors `SOURCE_DATE_EPOCH`.
-- **OSV CVE aliases** threaded through `VulnRef` (prerequisite for
-  EPSS / KEV / VEX).
-- **`--debug-calibration-format jsonl`** alternative to the v0.7
-  pipe-delimited format.
-- **`--output-file `** CLI flag (avoids `>` redirection in YAML).
-
-## Planned (v0.9 — interoperability + breadth)
-
-- **VEX consume** — `--vex ` accepts OpenVEX 0.2.0 + CycloneDX
-  VEX 1.6 statements; `not_affected` / `fixed` suppress findings,
-  `under_investigation` annotates.
-- **VEX emit** — `--emit-vex ` emits an OpenVEX document from
-  baseline-suppressed findings. Defaults to
-  `under_investigation` (the safe truth-claim); per-entry
-  `vex_status` override required for `not_affected`.
-- **SPDX expression evaluator** — replaces v0.8's atomic+glob matcher
-  with full `(MIT OR Apache-2.0)` evaluation via the `spdx` crate.
-  Deprecates `allow_ambiguous`.
-- **Multi-SCM templates** — Bitbucket Pipelines + Azure DevOps with
-  per-platform footer shapes and PR-comment upsert recipes.
-- **Registry-metadata enrichers** — npm `time.modified`, PyPI
-  `info.yanked`, crates.io `versions[].yanked`. New finding kinds:
-  `RecentlyPublished`, `Deprecated`, `MaintainerSetChanged`.
-- **GitLab in-comment suppression** with explicit security guards
-  (token verification, event filter, project allowlist, commenter
-  permissions, fork-MR safety). Reference Cloudflare Worker bridge.
-- **Explicit non-goals doc** — reachability, tarball static analysis,
-  auto-fix PR generation, container image scanning, SAST/secrets,
-  risk-score dashboards. Pair with Endor/Snyk for reachability,
-  Renovate/Dependabot for auto-fix.
+- Baseline `expires` + `reason` with stderr warnings on expiry.
+- `time` crate + `clock` module honoring `SOURCE_DATE_EPOCH`.
+- OSV CVE aliases threaded through `VulnRef`.
+- `--debug-calibration-format jsonl` and `--output-file `.

 ## Future candidates (not committed)

-- **GraphQL maintainer-age** — was investigated for v0.4 and deferred.
-  The current REST implementation already uses `?per_page=1` + Link-header
-  parsing for top contributor and contributor count. The remaining
-  round-trip cost is the per-author commit-history pagination, and
-  GitHub's GraphQL `history()` connection doesn't expose ASC ordering —
-  finding the oldest commit still requires cursor pagination. v0.5 may
-  approach this via `User.contributionsCollection` or accept that REST
-  is the right tool here.
+- **Per-exception SPDX allow/deny** — currently the WITH-exception
+  identity is informational only; allow/deny narrows to base
+  license. v1.0 candidate.
+- **PyPI / crates.io maintainer-set-changed** — blocked on
+  per-version maintainer data in upstream APIs.
+- **VEX vocabulary beyond OpenVEX's 8 justifications** — bomdrift
+  uses the spec's enum verbatim. If a richer vocab emerges we'll
+  follow.
+- **GraphQL maintainer-age** — was investigated for v0.4 and
+  deferred. Cursor-pagination cost still steers us toward REST.
 - **Custom rules / plugin system** — let consumers add
-  organization-specific enrichers (e.g. "flag any dep from
-  internal-mirror.example.com without a SHA-256 attestation").
-  Probably WASM-based for sandboxing.
-- **GitLab in-comment suppression** — v0.7 ships the GitLab CI
-  template + `--platform gitlab` (the diff path); v0.9 will add the
-  comment-driven `/bomdrift suppress ` flow with explicit
-  security guards.
-- **Calibration tuning from `--debug-calibration` data** — v0.7
-  added the diagnostic flag; v0.8 may revise
-  `SIMILARITY_THRESHOLD`, `YOUNG_MAINTAINER_DAYS`, and OSV cache
-  TTL defaults based on adopter-collected samples shared on
-  issue #5.
-- **OCI artifact attestation** — verify SBOMs are themselves signed
-  by the build system before diffing. Pairs with cosign attest.
+  organization-specific enrichers. Probably WASM-based.
+- **OCI artifact attestation** — verify SBOMs are signed by the
+  build system before diffing.

 ### Calibration backlog

-These are tunable thresholds where the v0.3 default may not be the
-right answer at scale. Adjusting requires real-world signal data, so
-they're tracked as "watch the false-positive rate":
+Tunable thresholds where the default may not be the right answer
+at scale:

 - Typosquat `SIMILARITY_THRESHOLD` (currently 0.92).
 - Maintainer-age `YOUNG_MAINTAINER_DAYS` (currently 90).
-- OSV severity cache TTL (currently 24h).
+- Registry `MIN_PUBLISHED_AGE_DAYS` (currently 14).
+- OSV / EPSS / KEV / Registry cache TTL (currently 24h).

 ## Non-goals

From 1f631a47f6fe2a91aa232c71c784590329e85bf2 Mon Sep 17 00:00:00 2001
From: bomdrift
Date: Wed, 29 Apr 2026 14:48:34 -0700
Subject: [PATCH 8/8] chore(release): prepare v0.9.0
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Bump Cargo.toml + Cargo.lock to 0.9.0.
- Bump README / quickstart / issue template references from v0.8.0 to
  v0.9.0.
- Add the 0.9.0 CHANGELOG entry covering Phases G–L2 with explicit
  Scope notes for what stayed deferred (per-exception SPDX allow/deny,
  PyPI/crates.io maintainer sets, Bitbucket/Azure DevOps
  comment-suppress, OpenVEX vocabulary extensions).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
---
 .github/ISSUE_TEMPLATE/action-broke.md |   2 +-
 CHANGELOG.md                           | 101 +++++++++++++++++++++++++
 Cargo.lock                             |   2 +-
 Cargo.toml                             |   2 +-
 README.md                              |   8 +-
 docs/src/quickstart.md                 |   6 +-
 6 files changed, 111 insertions(+), 10 deletions(-)

diff --git a/.github/ISSUE_TEMPLATE/action-broke.md b/.github/ISSUE_TEMPLATE/action-broke.md
index db76fdd..fb0d4d2 100644
--- a/.github/ISSUE_TEMPLATE/action-broke.md
+++ b/.github/ISSUE_TEMPLATE/action-broke.md
@@ -36,6 +36,6 @@ failure is usually obvious if you expand all groups. -->

 ## Environment

-- **bomdrift version pin**: `@v1` / `@v0.8.0` / `@`
+- **bomdrift version pin**: `@v1` / `@v0.9.0` / `@`
 - **Runner**:
 - **Trigger event**:
diff --git a/CHANGELOG.md b/CHANGELOG.md
index efa3763..d3fd153 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -7,6 +7,107 @@ project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

 ## [Unreleased]

+## [0.9.0] - 2026-05-01
+
+The "interoperability + breadth" milestone. v0.9 adds VEX (Vulnerability
+Exploitability eXchange) consumption + emission, full SPDX expression
+evaluation, multi-SCM templates (Bitbucket Pipelines + Azure DevOps),
+registry-metadata enrichers (npm/PyPI/crates.io), and a security-reviewed
+GitLab comment-driven suppression bridge.
+
+### Added
+
+- **VEX consume (`--vex `, repeatable).** Auto-detects OpenVEX
+  0.2.0 vs CycloneDX VEX 1.6 per file. Statements with status
+  `not_affected` / `fixed` suppress matching findings (counted in the
+  new "Suppressed by VEX" markdown summary row); `under_investigation`
+  annotates with a `VEX:` badge in markdown and
+  `properties.vexStatus` in SARIF; `affected` annotates as a no-op.
+  Match keys are `(VulnRef.id OR alias, purl_with_version)` with a
+  documented synthetic-id convention for non-CVE finding kinds
+  (`bomdrift.::`).
+- **VEX emit (`--emit-vex `).** Writes a single OpenVEX 0.2.0
+  document covering baseline-suppressed entries (status from the
+  per-entry `vex_status`, defaulting to `under_investigation` —
+  baseline ≠ "not affected", never auto-promoted) and un-suppressed
+  findings (status `affected` with `status_notes` describing the
+  finding kind). New baseline fields `vex_status` and
+  `vex_justification`. New `[diff] vex_author` and
+  `[diff] vex_default_justification` config keys.
+- **Full SPDX expression evaluator** via the `spdx = "0.10"` crate.
+  Replaces v0.8's atomic+glob matcher: `(MIT OR Apache-2.0)` with
+  `allow=[MIT]` permits; `(MIT AND GPL-3.0-only)` with
+  `deny=[GPL-3.0-only]` violates; `Apache-2.0 WITH LLVM-exception`
+  parses cleanly (base license checked, exception identity
+  informational only). Non-SPDX strings fall back to the v0.8
+  atomic+glob path. `allow_ambiguous` is deprecated with a one-time
+  stderr warning.
+- **Bitbucket Pipelines + Azure DevOps Pipelines** support.
+  `Platform::Bitbucket` and `Platform::AzureDevOps` variants on the
+  CLI; auto-detection via `BITBUCKET_BUILD_NUMBER` and `TF_BUILD`
+  envs; `BITBUCKET_GIT_HTTP_ORIGIN` and `BUILD_REPOSITORY_URI`
+  honored as `--repo-url` fallbacks. Per-platform footer URL shapes
+  (`/issues/new` for Bitbucket; `/_workitems/create` for Azure
+  DevOps). Drop-in templates with READMEs in
+  `examples/bitbucket-pipelines/` and `examples/azure-devops/`.
+- **Registry-metadata enrichers (`src/enrich/registry.rs`).** Three
+  best-effort fetchers — npm, PyPI, crates.io — with disk cache at
+  `/bomdrift/registry//.json` (24h TTL,
+  atomic temp-file + rename). Three new finding kinds:
+  `RecentlyPublished` (default <14d threshold, tunable via
+  `--recently-published-days`), `Deprecated` (npm
+  `versions[].deprecated`, PyPI `info.yanked` + Inactive
+  classifiers, crates.io `versions[].yanked`), and
+  `MaintainerSetChanged` (npm only — PyPI/crates.io don't expose
+  per-version maintainers cleanly). New `--no-registry` flag and
+  `[diff] no_registry = true` config key. New `--fail-on
+  recently-published` and `--fail-on deprecated` thresholds. New
+  SARIF rules `bomdrift.recently-published`, `bomdrift.deprecated`,
+  `bomdrift.maintainer-set-changed` with stable
+  `partialFingerprints.primaryHash/v1`.
+- **GitLab comment-driven suppress.** `bomdrift baseline add
+  --from-comment ` parses the raw note body, extracts the
+  first `/bomdrift suppress [ reason: ]` directive, and
+  fails non-zero with a clear stderr message when no directive is
+  found (so a misconfigured webhook bridge fails loudly).
+  `examples/gitlab-ci/comment-bridge/` ships a Cloudflare Worker
+  reference implementation enforcing five guards: webhook secret
+  (constant-time compare), event-type filter, project-ID
+  allowlist, commenter access_level >= 30, and an MR-context
+  guard that rejects fork-MR exfiltration. Vercel/Netlify/Lambda
+  port note included.
+- **Explicit non-goals + pair-with recommendations** in README,
+  STATUS, and the roadmap. Reachability, tarball static analysis,
+  auto-fix PR generation, continuous monitoring, container
+  scanning, SAST/secrets, risk-score dashboards, and closed-source
+  advisory feeds are deliberately out of scope; each is paired with
+  a recommended complementary tool.
+
+### Changed
+
+- OSV cache schema extended with `aliases: Vec` so cache
+  hits no longer drop alias data. Old cache entries without the
+  field deserialize with an empty vec (graceful migration).
+- `Command::Diff` argument is now boxed (`Box`) to satisfy
+  clippy's large-enum-variant lint after the v0.9 flag growth.
+  Internal-only change; no user-facing impact.
+- `BaselineAddArgs.id` becomes optional (`Option`) to allow
+  `--from-comment` invocations without a positional ID. Existing
+  positional callers are unchanged.
+
+### Scope notes (deferred)
+
+- **Per-exception SPDX allow/deny** — the WITH-exception identifier
+  is currently informational only; allow/deny narrows to the base
+  license.
+- **PyPI / crates.io maintainer-set-changed** — blocked on per-version
+  maintainer data in upstream APIs.
+- **Bitbucket / Azure DevOps comment-driven suppression** — only the
+  diff path ships in v0.9; comment-suppress is GitHub-only (and
+  GitLab via the bridge).
+- **VEX vocabulary beyond OpenVEX's 8 justifications** — bomdrift
+  uses the spec enum verbatim; no extension yet.
+
 ## [0.8.0] - 2026-04-29

 The "supply-chain hardening" milestone. v0.8 finishes SARIF for GitHub
diff --git a/Cargo.lock b/Cargo.lock
index 256096d..5e7fba4 100644
--- a/Cargo.lock
+++ b/Cargo.lock
@@ -123,7 +123,7 @@ dependencies = [

 [[package]]
 name = "bomdrift"
-version = "0.8.0"
+version = "0.9.0"
 dependencies = [
  "anyhow",
  "clap",
diff --git a/Cargo.toml b/Cargo.toml
index e4add8f..b0201ba 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -1,6 +1,6 @@
 [package]
 name = "bomdrift"
-version = "0.8.0"
+version = "0.9.0"
 edition = "2024"
 rust-version = "1.88"
 description = "SBOM diff with supply-chain risk signals (CVEs, typosquats, maintainer-age)."
diff --git a/README.md b/README.md
index e4acd94..1dc3be3 100644
--- a/README.md
+++ b/README.md
@@ -81,7 +81,7 @@ jobs:
           # verify-signatures: true (set false on trusted mirrors)
 ```

-Pin to `@v1` for the latest v0.x; pin to `@v0.8.0` for reproducible builds. Run `bomdrift init` if you want a checked-in `.bomdrift.toml` policy and both workflows scaffolded locally. See the [Action reference](https://metbcy.github.io/bomdrift/github-action.html) for every input.
+Pin to `@v1` for the latest v0.x; pin to `@v0.9.0` for reproducible builds. Run `bomdrift init` if you want a checked-in `.bomdrift.toml` policy and both workflows scaffolded locally. See the [Action reference](https://metbcy.github.io/bomdrift/github-action.html) for every input.

 #### Optional: in-comment suppression (v0.5+)

@@ -112,7 +112,7 @@ Comment `/bomdrift suppress GHSA-xxxx` on any PR; the sub-action appends to `.bo
 Pre-built binaries cover Linux x86_64 + aarch64, macOS aarch64, and Windows x86_64. Each archive is cosign-signed via Sigstore + GitHub OIDC.

 ```bash
-VERSION=v0.8.0
+VERSION=v0.9.0
 TARGET=x86_64-unknown-linux-gnu
 curl -sSL -o bomdrift.tar.gz \
   "https://github.com/Metbcy/bomdrift/releases/download/${VERSION}/bomdrift-${VERSION}-${TARGET}.tar.gz"
@@ -128,7 +128,7 @@ Verify the archive's signature before you trust the binary — see [Release sign
 ### From source

 ```bash
-cargo install --locked --git https://github.com/Metbcy/bomdrift --tag v0.8.0 bomdrift
+cargo install --locked --git https://github.com/Metbcy/bomdrift --tag v0.9.0 bomdrift
 ```

 Requires Rust 1.85+ (the project uses edition 2024).
@@ -230,7 +230,7 @@ Every release archive is signed with cosign keyless via Sigstore (GitHub OIDC).

 ```bash
 # Replace VERSION + TARGET with your downloaded archive's pair
-VERSION=v0.8.0
+VERSION=v0.9.0
 TARGET=x86_64-unknown-linux-gnu
 ARCHIVE=bomdrift-${VERSION}-${TARGET}.tar.gz

diff --git a/docs/src/quickstart.md b/docs/src/quickstart.md
index 7876ae0..29f24d1 100644
--- a/docs/src/quickstart.md
+++ b/docs/src/quickstart.md
@@ -25,7 +25,7 @@ jobs:
 ```

 The `@v1` mutable tag tracks the latest v0.x release. Pin to a specific
-version (`@v0.8.0`) if you prefer reproducible builds. See
+version (`@v0.9.0`) if you prefer reproducible builds. See
 [GitHub Action](./github-action.md) for every input.

 If you prefer a checked-in policy file, install the binary and run
@@ -39,7 +39,7 @@ Pre-built binaries cover Linux x86_64 + aarch64, macOS aarch64, and Windows
 x86_64. Each archive is cosign-signed via Sigstore + GitHub OIDC.

 ```bash
-VERSION=v0.8.0
+VERSION=v0.9.0
 TARGET=x86_64-unknown-linux-gnu
 curl -sSL -o bomdrift.tar.gz \
   "https://github.com/Metbcy/bomdrift/releases/download/${VERSION}/bomdrift-${VERSION}-${TARGET}.tar.gz"
@@ -56,7 +56,7 @@ To verify the archive's signature before you trust the binary, see
 ## From source

 ```bash
-cargo install --locked --git https://github.com/Metbcy/bomdrift --tag v0.8.0 bomdrift
+cargo install --locked --git https://github.com/Metbcy/bomdrift --tag v0.9.0 bomdrift
 ```

 Requires Rust 1.85+ (the project uses edition 2024).