Skip to content

Add Intermediate Verbosity Mode for Minimal Failure Slices #995

@Alan-Jowett

Description

@Alan-Jowett

Issue: Add Intermediate Verbosity Mode for Minimal Failure Slices

Summary
Prevail currently supports two extremes of diagnostic output:

  1. Minimal mode — prints only the failure reason
  2. Verbose mode — prints the full CFG, full abstract state, and every instruction’s invariants

For large BPF programs (10k–20k+ instructions), verbose mode produces extremely large logs that are difficult for humans to navigate and expensive for downstream tooling to process. However, minimal mode often lacks enough context to understand why a failure occurred.

This issue proposes adding a new intermediate verbosity mode that emits a minimal semantic slice of the program relevant to the failure. This slice includes only the instructions, invariants, and abstract state that contributed to the failure decision.

This mode would dramatically improve debuggability without requiring the full CFG dump.


Motivation
Verbose output can easily exceed hundreds of thousands of lines for large programs. Most of this information is irrelevant to understanding the failure.

In practice, only a small subset of the program is needed:

  • the failing instruction
  • the predecessor chain that contributed to the abstract state
  • the registers and memory regions involved in the failure
  • the invariants that were actually consulted
  • the minimal abstract state projection at the failure point

This is analogous to:

  • counterexample traces in model checking
  • program slicing in static analysis
  • minimal unsat cores in SMT solving

Prevail already computes all the necessary information internally; it simply does not expose it in a compact form.


Proposal: Add a “failure slice” verbosity mode
Introduce a new flag, e.g.:

--failure-slice

This mode would emit a structured, compact diagnostic slice containing:

  1. Error classification
  2. Location of failure (block, instruction index)
  3. The failing instruction
  4. Minimal predecessor trace (semantic backward slice)
  5. Projected abstract state (only relevant registers and memory regions)
  6. Relevant invariants used in the failure decision

This provides significantly more insight than minimal mode, while remaining orders of magnitude smaller than verbose mode.


Proposed Algorithm

  1. Identify the failure point
    Prevail already knows:
  • the failing instruction
  • the block
  • the abstract state at that point
  • the error classification

Emit these directly.


  1. Compute the semantic backward slice
    Starting from the failing instruction:

  2. Identify which registers and memory regions were consulted in the failure decision.

    • e.g., pointer region, offset interval, helper arg type, loop bound
  3. Walk backwards through the CFG following only predecessor states that contributed to the merged state at the failure point.

    • ignore unreachable predecessors
    • ignore predecessors pruned by widening
    • ignore predecessors whose state was not merged
  4. For each predecessor:

    • include the instruction
    • update the set of “relevant registers” based on data dependencies
    • stop when all relevant registers reach their last write

This produces a minimal instruction trace (typically 10–40 instructions).


  1. Project the abstract state
    From the full abstract state at the failure point:

  2. Keep only registers that appear in the backward slice.

  3. Keep only memory regions reachable from those registers.

  4. Keep only invariants that were actually checked for the failure type.

  5. Drop all unrelated registers, regions, and invariants.

This produces a compact state representation.


  1. Emit the failure slice in a structured format
    Example structure:

`
[ERROR]
ERRSTACKOOB

[LOCATION]
block=17
instruction_index=4231

[INSTRUCTION]
r2 = (u64 )(r1 + r3)

[TRACE]
4230: r3 += 8
4229: if r3 < 64 goto 4231
...
4210: r1 = r10 - 64

[STATE]
R1: ptr(region=stack, offset=[0,64])
R3: scalar([0,72])
STACK[0..64]: known
`

This is typically 200–500 lines, even for very large programs.


Benefits

  • Dramatically reduces output size for large programs
  • Much easier for humans to read
  • Ideal for automated tooling
  • Avoids the cost of full CFG dumps
  • Provides a deterministic, minimal explanation of the failure
  • Aligns with best practices in static analysis and model checking

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions