Proposal: graded confidence — an epistemic-reliability axis, distinct from integrity (#140) and citation (#92/#94)

Following up on my comment in #95, after reading how the trust surface developed
this week (#140 cryptographic provenance, #92/#94 SOURCES + inline citation, the
new SPEC §12 trust-and-safety in #58). Those cover three distinct senses of
"trust" — and there's a fourth still open, which is the one we've had to solve in
production.

**Four axes, so we don't talk past each other:**

| Axis | "Trust that…" | Where it lives now |
|---|---|---|
| Authenticity / integrity | the bundle is from who it claims, unaltered | #140 (signing, SHA-256 manifest) |
| Safety | content is data, not instructions (prompt-injection) | #58 §12 |
| Groundedness / citation | a claim is linked to evidence | #92 SOURCES, #94 inline citation |
| **Reliability (this proposal)** | **how much to believe the claim itself** | **— open —** |

A bundle can be perfectly signed (#140), injection-safe (#58), and have every
claim cited (#94), and an agent still can't distinguish a claim *verified against
the live system* from one that merely *restates vendor documentation* (often stale
or aspirational), or from one that's *inferred and unverified*. A citation proves a
claim is grounded; it doesn't grade the ground. For a retrieval agent that's
tolerable. For an agent that **acts on a real system**, it's where damage happens.
(This is the dimension #95 explicitly scoped out.)

**What we run in production** — single-domain agent KB, enterprise IT service
management, ~250 atomic notes over ~140 build cycles, agents acting on live
customer systems:
- a graded **`confidence`** value per concept in frontmatter —
  `HIGH | MEDIUM | LOW | UNVERIFIED` — orthogonal to "is it cited";
- a small **evidence-kind vocabulary** on claims (`observed-on-system` /
  `documentation` / `partner-confirmed` / `inferred`). This types the *kind* of
  evidence, not just its URL, so it is a different sense of "provenance" from
  #140's *origin* provenance;
- an explicit **trust ordering** when two cited sources disagree
  (`observed > partner-confirmed > observed+doc > doc > inferred`) and a
  **conflict** rule: *document both sides, never silently pick*.
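
To make the ordering and the conflict rule concrete, here is a minimal sketch in Python. It is not our production code: the `Claim` and `Resolution` types, the `reconcile` function, and the `source` field are hypothetical names for illustration. The ranking tokens are copied verbatim from the ordering above, and the sketch assumes each claim is already tagged with one of them.

```python
from dataclasses import dataclass

# Ordering copied from the trust ordering above, strongest evidence first.
TRUST_ORDER = ["observed", "partner-confirmed", "observed+doc", "doc", "inferred"]
RANK = {kind: position for position, kind in enumerate(TRUST_ORDER)}


@dataclass(frozen=True)
class Claim:
    text: str
    evidence_kind: str  # assumed to be one of TRUST_ORDER
    source: str         # hypothetical citation identifier


@dataclass(frozen=True)
class Resolution:
    preferred: Claim    # the claim backed by the strongest evidence kind
    dissenting: tuple   # every disagreeing claim, kept rather than discarded
    conflict: bool      # True when the sources disagree


def reconcile(claims):
    """Rank cited claims by evidence kind without dropping the losing side."""
    if not claims:
        raise ValueError("need at least one claim")
    for claim in claims:
        if claim.evidence_kind not in RANK:
            raise ValueError(f"unknown evidence kind: {claim.evidence_kind!r}")
    # sorted() is stable, so claims of equal rank keep their input order.
    ranked = sorted(claims, key=lambda claim: RANK[claim.evidence_kind])
    preferred = ranked[0]
    dissenting = tuple(c for c in ranked[1:] if c.text != preferred.text)
    return Resolution(preferred, dissenting, bool(dissenting))
```

The point of returning a `Resolution` instead of a single winner is the conflict rule: the weaker claim stays attached to the result, so whoever consumes it can see that two cited sources disagreed.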

The taxonomy is domain-specific; the *axis* is not. Every producer whose agents
*act* reinvents some version of "how sure are we", and without a shared optional
slot none of it survives the cross-org hop OKF exists for.

**The ask — minimal, optional, already spec-permitted ("producers MAY include
additional keys"):**
- an optional `confidence:` frontmatter key with a small recommended enum, and
- an optional convention for typing evidence-kind on inline citations (riding on
  #94, and complementary to #140, which uses "provenance" in a different sense).
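
As a rough illustration of how cheap the optional key is for a consumer, the sketch below reads `confidence:` from a note's frontmatter using only the standard library. Several things here are assumptions, not part of any spec text: the function name `read_confidence`, the naive flat `key: value` parsing (a real consumer would use a proper YAML parser), the case-insensitive matching, and the choice to treat an unrecognized value as `UNVERIFIED`.

```python
# Recommended enum from this proposal.
RECOMMENDED = ("HIGH", "MEDIUM", "LOW", "UNVERIFIED")


def read_confidence(note_text):
    """Return the note's confidence grade, or None when the key is absent."""
    lines = note_text.splitlines()
    if not lines or lines[0].strip() != "---":
        return None  # no frontmatter at all: a minimal producer
    for line in lines[1:]:
        if line.strip() == "---":
            break  # end of frontmatter
        key, sep, value = line.partition(":")
        if sep and key.strip() == "confidence":
            grade = value.strip().strip("'\"").upper()
            # Assumption: an out-of-enum value is treated as UNVERIFIED.
            return grade if grade in RECOMMENDED else "UNVERIFIED"
    return None  # frontmatter present, key omitted


# Hypothetical note; the title and body are made up for illustration.
example_note = """---
title: Password reset flow
confidence: MEDIUM
---
Body of the note.
"""
print(read_confidence(example_note))
```

A note without the key returns `None`, so a consumer can tell "producer did not grade this" apart from an explicit `UNVERIFIED`.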

Not a mandate; minimal producers ignore it. I'm not attached to the names — happy
to align with #140/#92 so we don't end up with three meanings of "provenance".

**Questions:**
1. Is graded epistemic confidence in scope as an optional OKF convention, or
   deliberately producer-side forever?
2. If in scope, should it be its own key, or folded into the #92 SOURCES /
   #140 provenance work?
3. Would a worked example bundle (an OKF-shaped slice of our corpus with the
   confidence layer applied) help the discussion?

Thanks for the pace and openness here — the trust model is clearly being built
right now, which is why I wanted to get this axis on the table before it settles.
