Skip to content

Extend proven fragment to cover the full CompilationModel language surface #1723

@Th0rgal

Description

@Th0rgal

Goal

Widen the generic Layer 2 whole-contract preservation theorem from the current SupportedSpec fragment to the full CompilationModel language surface, so that every construct expressible through the verity_contract macro is covered by the generic compile-preservation proof — no unsupported-surface exclusions remain.

Background

The generic whole-contract theorem (compile_preserves_semantics in Contract.lean) is currently proved only for the SupportedSpec fragment. Six independent exclusion surfaces gate what enters the theorem:

Surface Gate function What it excludes
Core stmtTouchesUnsupportedCoreSurface Complex expression forms: mapping, mappingWord, mappingPackedWord, mapping2, mappingChain, structMember, keccak256, call/staticcall/delegatecall, returndataSize, extcodesize, returndataOptionalBoolAt, externalCall, internalCall, arrayLength, arrayElement, storageArrayLength, storageArrayElement, dynamicBytesEq, constructorArg
State stmtTouchesUnsupportedStateSurface Storage-rich forms: storage, storageAddr, all mapping* reads, structMember* reads, storageArray* reads, setMapping*, setStructMember*, storageArrayPush/storageArrayPop/setStorageArrayElement
Calls stmtTouchesUnsupportedCallSurface internalCall, internalCallAssign, externalCallBind, ecm, calldatacopy, returndataCopy, revertReturndata, mappingChain
Helpers stmtTouchesUnsupportedHelperSurface internalCall, internalCallAssign, mappingChain in certain positions
Effects stmtTouchesUnsupportedEffectSurface requireError, revertError, returnValues, returnArray, returnBytes, returnStorageWords, emit, rawLog, externalCallBind, ecm
Foreign stmtTouchesUnsupportedForeignSurface externalCallBind, ecm, mappingChain
Low-level stmtTouchesUnsupportedLowLevelSurface calldatacopy, returndataCopy, revertReturndata, low-level call/staticcall/delegatecall in expressions

Each surface represents a family of CompilationModel constructs that compile and execute correctly (tested by differential and property tests) but lack formal compile-preservation proofs. The task is to prove preservation for each family and thread the proof through the generic theorem.

What's Already in the Proven Fragment

The current proven fragment covers:

Expressions (proven in generic induction core)

  • Literals, parameters, local variables
  • Arithmetic: add, sub, mul, div, mod, sdiv, smod
  • Comparison: eq, lt, gt, le, ge, slt, sgt
  • Bitwise: bitAnd, bitOr, bitXor, bitNot, shl, shr, sar, signextend
  • Logic: logicalAnd, logicalOr, logicalNot
  • Higher-level: min, max, ceilDiv, ite, wMulDown, wDivUp, mulDivDown, mulDivUp
  • Environment: caller, contractAddress, chainid, msgValue, blockTimestamp, blockNumber, blobbasefee, calldatasize
  • Memory: mload, tload, calldataload

Statements (proven in generic induction core)

  • letVar, assignVar
  • setStorage (simple slot write)
  • setStorageAddr
  • require (with string message)
  • return (single value)
  • mstore, tstore
  • stop
  • ite (if/else branching — recursive into both branches)
  • forEach (bounded loops — recursive into body)

Layer 3 (all 8 Yul statement types fully proved)

Layer 3 (IR → Yul) is already generic for all statement types. The bottleneck is Layer 2 (CompilationModel → IR).

Detailed Widening Plan

Tier 1: Storage and Mapping Operations (highest impact)

Target: Eliminate stmtTouchesUnsupportedStateSurface entirely.

Construct What to prove Complexity
Expr.storage / Stmt.setStorage (already partial) Storage read is sload(slot); write is sstore(slot, val) Low — extend existing proof
Expr.storageAddr / Stmt.setStorageAddr Same as storage but with address masking Low
Expr.mapping / Stmt.setMapping sload(keccak256(abi.encode(key, base))) Medium — needs mappingSlot lemma
Expr.mappingWord / Stmt.setMappingWord mappingSlot + word offset Medium
Expr.mappingPackedWord / Stmt.setMappingPackedWord mappingSlot + word offset + bit masking Medium-High
Expr.mapping2 / Stmt.setMapping2 Double keccak nesting Medium
Expr.mapping2Word / Stmt.setMapping2Word Double keccak + word offset Medium
Expr.mappingUint / Stmt.setMappingUint Uint256-keyed variant Medium
Expr.mappingChain / Stmt.setMappingChain Arbitrary-depth nesting (recursive) High
Expr.structMember / Stmt.setStructMember Resolves to mappingPackedWord at compile time Medium (reduces to packed word case)
Expr.structMember2 / Stmt.setStructMember2 Double mapping variant Medium
Expr.storageArrayLength / Expr.storageArrayElement Dynamic array header + element access Medium
Stmt.storageArrayPush / Stmt.storageArrayPop / Stmt.setStorageArrayElement Dynamic array mutations Medium-High

Strategy: The mappingSlot kernel Keccak engine is already proven (solidityMappingSlot_lt_evmModulus is now a theorem). Build the storage-mapping bridge lemma library:

  1. Prove compileMapping correctness: compiled Yul sload(mappingSlot(base, key)) = source evalMapping field key
  2. Extend to word offsets and packed words via arithmetic lemmas
  3. Prove array operations via length-slot and element-slot arithmetic

Tier 2: Internal Helper Calls (already extensively scaffolded)

Target: Eliminate stmtTouchesUnsupportedHelperSurface entirely.

The proof infrastructure is already built — this is the most scaffolded gap:

  • SourceSemantics.lean defines execStmtListWithHelpers, evalExprWithHelpers, interpretInternalFunctionFuel
  • SupportedSpec.lean defines InternalHelperSummaryContract, helper-rank interfaces, expression-position world-preservation obligations
  • GenericInduction.lean defines the helper-aware induction seam (CompiledStmtStepWithHelpersAndHelperIR / StmtListGenericWithHelpersAndHelperIR) and all split bridges
  • The helper-free conservative extension is closed (interpretIRWithInternalsZeroConservativeExtensionGoal_closed)
  • The whole-contract helper-aware wrappers exist (compile_preserves_semantics_with_helper_proofs_and_helper_ir_supported)

What remains:

  1. Prove SupportedFunctionBodyWithHelpersAndHelperIRPreservationGoal for the four genuine internal-helper cases:
    • StmtListDirectInternalHelperCallStepInterface — void helper calls
    • StmtListDirectInternalHelperAssignStepInterface — helper-return bindings
    • StmtListExprInternalHelperStepInterface — expression-position helper calls
    • StmtListStructuralInternalHelperStepInterface — structural recursion through helpers
  2. Consume helper-summary soundness + decreasing rank evidence end-to-end
  3. Remove the SupportedStmtList.helperSurfaceClosed gate

Tier 3: Effects and Observables

Target: Eliminate stmtTouchesUnsupportedEffectSurface entirely.

Construct What to prove Complexity
Stmt.emit Event emission = LOG opcode sequence Medium — needs memory/ABI encoding bridge
Stmt.rawLog Raw log emission = LOG0–LOG4 Medium
Stmt.requireError ABI-encoded typed revert Medium-High — needs ABI encoding correctness
Stmt.revertError ABI-encoded typed revert (unconditional) Medium
Stmt.returnValues Multi-value ABI return encoding Medium-High
Stmt.returnArray Dynamic array ABI return encoding High
Stmt.returnBytes Dynamic bytes ABI return encoding High
Stmt.returnStorageWords Storage-read + ABI return High (combines storage + ABI)

Strategy: Build an ABI encoding correctness library first:

  1. Prove scalar ABI encoding (uint256, address, bool, bytes32)
  2. Prove dynamic offset/length encoding for bytes/arrays
  3. Compose with mstore correctness to show memory layout matches ABI spec
  4. Thread through event emission (topics = indexed args, data = non-indexed ABI encoding)

Tier 4: External Calls and Low-Level Mechanics

Target: Eliminate stmtTouchesUnsupportedCallSurface, stmtTouchesUnsupportedForeignSurface, and stmtTouchesUnsupportedLowLevelSurface.

Construct What to prove Complexity
Expr.call / staticcall / delegatecall Low-level EVM call = compiled Yul call instruction High — needs full call-frame semantics
Expr.returndataSize returndatasize() extraction Low once call-frame exists
Expr.returndataOptionalBoolAt Compound returndata check Medium
Expr.extcodesize extcodesize(addr) extraction Low
Stmt.calldatacopy / returndataCopy Memory copy operations Medium
Stmt.revertReturndata Forward returndata as revert Medium
Stmt.externalCallBind ABI-decoded external call High — needs ABI encode/decode roundtrip
Stmt.ecm External Call Module dispatch High — needs per-module correctness
Expr.externalCall Linked external function call Medium-High

Strategy: This tier requires modeling cross-contract call semantics. The recommended approach:

  1. Model calls as opaque state transitions with pre/post conditions (matching the ECM assumption pattern)
  2. Prove that the compiled Yul call sequence satisfies those pre/post conditions
  3. The actual external contract behavior remains an assumption (as it must — we can't prove other contracts correct)

Tier 5: Remaining Core Expressions

Target: Eliminate exprTouchesUnsupportedCoreSurface entirely.

Construct What to prove Complexity
Expr.constructorArg Constructor argument loading from bytecode Medium
Expr.keccak256 keccak256(offset, size) on memory region Medium — needs memory model
Expr.arrayLength / Expr.arrayElement Calldata array access with bounds checking Medium
Expr.dynamicBytesEq Dynamic bytes equality comparison Medium

Tier 6: Parameter Profile Widening

Target: Widen SupportedExternalParamType and SupportedExternalReturnProfile.

Currently supported To add
uint256, int256, uint8, address, bytes32, bool bytes, string, array, fixedArray, tuple
Zero-return or single-scalar return Multi-value returns, dynamic returns

Strategy: Each new parameter type needs:

  1. ABI decoding correctness from calldata
  2. ABI encoding correctness for return values
  3. Composition with the dispatch theorem

Execution Order

Tier 2 (helpers) ←── Already scaffolded, highest leverage
  ↓
Tier 1 (storage/mappings) ←── Highest contract coverage impact
  ↓
Tier 3 (effects) ←── Requires ABI library from Tier 1 work
  ↓
Tier 5 (remaining core exprs) ←── Small, can parallelize
  ↓
Tier 4 (external calls) ←── Most complex, benefits from all prior work
  ↓
Tier 6 (param widening) ←── Final step to full EDSL coverage

Tiers 1 and 2 can be parallelized. Tier 5 can be parallelized with Tier 3.

Success Criteria

  • stmtTouchesUnsupportedCoreSurface returns false for all CompilationModel constructs (function deleted or always-false)
  • stmtTouchesUnsupportedStateSurface eliminated
  • stmtTouchesUnsupportedCallSurface eliminated
  • stmtTouchesUnsupportedHelperSurface eliminated
  • stmtTouchesUnsupportedEffectSurface eliminated
  • stmtTouchesUnsupportedForeignSurface eliminated
  • stmtTouchesUnsupportedLowLevelSurface eliminated
  • SupportedExternalParamType accepts all ParamType constructors
  • SupportedExternalReturnProfile accepts all return configurations
  • The generic whole-contract theorem covers every verity_contract-generated CompilationModel
  • 0 sorry in proof modules
  • make check passes, no differential test regressions

Related Issues

Estimated Effort

3–6 months total across all tiers, with Tiers 1–2 deliverable in 4–8 weeks.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions