Summary
ExposureDerivationService.DeriveAndUpsertPostgresAsync updates only LastObservedAt, Status, ResolvedAt, and LastSeenRunId on the ON CONFLICT DO UPDATE branch. The linkage fields (InstalledSoftwareId, SoftwareProductId, MatchedVersion, MatchSource) stay frozen at the original INSERT. Over time the exposure row stops describing the install that's currently driving it.
This isn't producing wrong dashboard counts, but it's a real debugging hazard. In a tenant we inspected:
- 5 335 critical-Open exposures have InstalledSoftwareId = NULL because the originally linked install was pruned by the stale-install sweep, and a later derivation re-emitted the exposure via a different install, but the upsert never re-pointed the linkage.
- 30 862 more critical-Open exposures are linked to an install whose current Version differs from the exposure's frozen MatchedVersion.
End result: questions like "which install is this exposure citing?" or "what version was matched against the applicability?" can't be answered from the row.
Recommended fix
In src/PatchHound.Infrastructure/Services/ExposureDerivationService.cs (the INSERT … ON CONFLICT … DO UPDATE statement in DeriveAndUpsertPostgresAsync), extend the DO UPDATE SET clause:
    ON CONFLICT ("TenantId", "DeviceId", "VulnerabilityId")
    DO UPDATE SET
        "LastObservedAt"      = GREATEST(EXCLUDED."LastObservedAt",
                                         "DeviceVulnerabilityExposures"."LastObservedAt"),
        "Status"              = 'Open',
        "ResolvedAt"          = NULL,
        "LastSeenRunId"       = EXCLUDED."LastSeenRunId",
        "InstalledSoftwareId" = EXCLUDED."InstalledSoftwareId",
        "SoftwareProductId"   = EXCLUDED."SoftwareProductId",
        "MatchedVersion"      = EXCLUDED."MatchedVersion",
        "MatchSource"         = EXCLUDED."MatchSource"
The deduped CTE already picks one row per (device_id, vulnerability_id) with a deterministic preference for Product over Cpe matches (see comment around CASE match_source WHEN 'Product' THEN 0 ELSE 1 END), so the EXCLUDED.* values are the canonical "current evidence" tuple.
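For readers who think in C# rather than SQL, the selection rule described above can be sketched with LINQ. This is an illustration only: the Candidate record, its property types, and the InstalledSoftwareId tiebreaker are assumptions. The issue itself only states that the CTE keeps one row per (device_id, vulnerability_id) and prefers Product over Cpe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one candidate match row; not the real CTE columns.
public sealed record Candidate(
    Guid DeviceId,
    Guid VulnerabilityId,
    Guid InstalledSoftwareId,
    string MatchSource,
    string MatchedVersion);

public static class ExposureDedup
{
    // Keeps one candidate per (DeviceId, VulnerabilityId), ranking "Product"
    // matches ahead of everything else, like the CASE expression in the CTE.
    public static List<Candidate> PickCanonical(IEnumerable<Candidate> candidates) =>
        candidates
            .GroupBy(c => (c.DeviceId, c.VulnerabilityId))
            .Select(g => g
                .OrderBy(c => c.MatchSource == "Product" ? 0 : 1)
                // Assumed tiebreaker so the pick is deterministic in this sketch.
                .ThenBy(c => c.InstalledSoftwareId)
                .First())
            .ToList();
}
```

Whatever row wins this selection is what arrives in EXCLUDED.*, which is why copying the linkage fields from it on conflict yields the current evidence rather than an arbitrary candidate.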
Mirror the change in the InMemory test path
DeviceVulnerabilityExposure.Reobserve(observedAt, runId) currently only updates LastObservedAt / Status / LastSeenRunId. Either:
- Add a richer Reobserve(observedAt, runId, installedSoftwareId, softwareProductId, matchedVersion, matchSource) overload and have InMemoryBulkExposureWriter.UpsertAsync call it, or
- Update the fields directly on the entity in the writer (less ceremony, but breaks encapsulation).
Prefer the entity overload for consistency with the rest of the codebase's factory-boundary discipline (see feedback_canonical_entity_factory_validation.md).
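A minimal sketch of the entity overload, assuming a trimmed-down entity. The property types (Guid?, string), the ExposureStatus enum, and the constructor are assumptions made for illustration; only the Reobserve parameter list and the field names come from this issue. The body mirrors the proposed DO UPDATE SET clause, including the GREATEST guard on LastObservedAt.

```csharp
using System;

// Hypothetical stand-in for the real status type.
public enum ExposureStatus { Open, Resolved }

// Hypothetical, trimmed-down version of DeviceVulnerabilityExposure;
// the real entity has more members and its own factory.
public sealed class DeviceVulnerabilityExposure
{
    public Guid? InstalledSoftwareId { get; private set; }
    public Guid? SoftwareProductId { get; private set; }
    public string MatchedVersion { get; private set; }
    public string MatchSource { get; private set; }
    public DateTimeOffset FirstObservedAt { get; }
    public DateTimeOffset LastObservedAt { get; private set; }
    public DateTimeOffset? ResolvedAt { get; private set; }
    public ExposureStatus Status { get; private set; }
    public Guid LastSeenRunId { get; private set; }

    public DeviceVulnerabilityExposure(
        DateTimeOffset observedAt, Guid runId,
        Guid? installedSoftwareId, Guid? softwareProductId,
        string matchedVersion, string matchSource)
    {
        FirstObservedAt = observedAt;
        LastObservedAt = observedAt;
        Status = ExposureStatus.Open;
        LastSeenRunId = runId;
        InstalledSoftwareId = installedSoftwareId;
        SoftwareProductId = softwareProductId;
        MatchedVersion = matchedVersion;
        MatchSource = matchSource;
    }

    // Re-observation that also re-points the linkage to the current evidence.
    public void Reobserve(
        DateTimeOffset observedAt, Guid runId,
        Guid? installedSoftwareId, Guid? softwareProductId,
        string matchedVersion, string matchSource)
    {
        // Same effect as GREATEST(...): never move LastObservedAt backwards.
        if (observedAt > LastObservedAt) LastObservedAt = observedAt;
        Status = ExposureStatus.Open;
        ResolvedAt = null;
        LastSeenRunId = runId;
        InstalledSoftwareId = installedSoftwareId;
        SoftwareProductId = softwareProductId;
        MatchedVersion = matchedVersion;
        MatchSource = matchSource;
        // FirstObservedAt is intentionally left alone.
    }
}
```

InMemoryBulkExposureWriter.UpsertAsync would then pass the incoming row's linkage values to this overload instead of calling the two-argument form, keeping the in-memory path aligned with the Postgres upsert.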
Acceptance criteria
- ExposureDerivationServiceCteTests (all 6 still pass).
- LastSeenRunId is still bumped on every conflict so ResolveStaleAsync continues to leave true stale exposures behind.
Notes
- FirstObservedAt stays untouched on conflict (correct: first observation is by definition a one-time event).