From 2c3d6276cbba453676576c7569e6b9902adf0634 Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Wed, 6 May 2026 23:15:27 +0800 Subject: [PATCH 01/15] add validator arch --- docs/SUMMARY.md | 1 + docs/node/validator-architecture.md | 278 ++++++++++++++++++++++++++++ 2 files changed, 279 insertions(+) create mode 100644 docs/node/validator-architecture.md diff --git a/docs/SUMMARY.md b/docs/SUMMARY.md index 48dcf2d..204c4b7 100644 --- a/docs/SUMMARY.md +++ b/docs/SUMMARY.md @@ -44,4 +44,5 @@ ## Node Operation - [Stateless Validation](node/stateless-validation.md) + - [Validator Architecture](node/validator-architecture.md) - [Get Block Witness](node/witness.md) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md new file mode 100644 index 0000000..c799baf --- /dev/null +++ b/docs/node/validator-architecture.md @@ -0,0 +1,278 @@ +--- +description: Architecture and implementation guide for building a MegaETH-compatible stateless validator. +--- + +# Validator architecture + +This page describes the reference architecture of MegaETH's stateless validator and the per-block validation pipeline it runs. +It is written for engineers building a compatible validator from scratch — in another language, against a different EVM stack, or against a custom workload. + +The reference implementation lives at [`megaeth-labs/stateless-validator`](https://github.com/megaeth-labs/stateless-validator) and is used throughout this page as the source of truth. +For day-to-day operation of that client, see [Stateless Validation](stateless-validation.md). +For the wire format of the witness, see [Get Block Witness](witness.md). + +## What a stateless validator does + +A stateless validator independently re-executes every MegaETH block against a compact cryptographic witness, then checks that every commitment in the block header matches the resulting post-state. 
+It holds **no chain state of its own** — a fresh witness arrives with each block and supplies just the slice of state that block touches. + +| Aspect | Detail | +| ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Input | A `(block, witness)` pair fetched per height. The witness is the response of [`mega_getBlockWitness`](witness.md). | +| Output | A locally-persisted record that the block validates, plus optional `mega_setValidatedBlocks` callbacks to a downstream service. | +| Trust input | One **anchor**: a block hash and the chain's genesis config, supplied at startup. The next block must extend the anchor exactly. | +| Non-goal | Picking the canonical fork. The validator validates whatever block sequence it is fed; pair it with a consensus client (e.g. `op-node`) to derive canonicality. | + +## Reference architecture + +The reference client is a three-stage async pipeline. +Each `(block, witness)` pair flows through the same stages; only the validator workers run in parallel. + +```text + ┌─────────────────┐ + RPC ────────► │ Block fetcher │ + └────────┬────────┘ + │ (block, witness) + ┌──────┴──────┐ + ▼ ▼ + ┌──────────┐ ┌──────────┐ ... N workers + │ Worker 1 │ │ Worker 2 │ + └─────┬────┘ └─────┬────┘ + └─────┬──────┘ + │ ValidatedBlock + ┌─────────▼─────────┐ + │ Chain advancer │ ────► local chain store + └───────────────────┘ +``` + +| Component | Role | Reference | +| ---------------- | --------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| Block fetcher | Streams `(block, witness)` pairs from RPC. Independent semaphores cap data and witness concurrency. 
| [`crates/stateless-common/src/rpc_client.rs`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs) | +| Validator worker | Verifies the witness, replays the block, computes post-state, compares against the header. | [`crates/stateless-core/src/executor.rs:411`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L411) (`validate_block`) | +| Chain advancer | Reorders out-of-order results, detects reorgs by parent-hash mismatch, persists in height order. | [`crates/stateless-core/src/pipeline/mod.rs:44`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/pipeline/mod.rs#L44) (`run_pipeline`) | +| Contract cache | Resolves contract bytecode by code hash with three tiers: in-memory → disk (redb) → RPC. | `crates/stateless-db/` | + +Workers do not coordinate. +A custom implementation can collapse the pipeline into a single sequential loop without changing correctness — parallelism is purely a throughput choice. + +## Validation pipeline + +The per-block sequence below is what `validate_block` performs. +A re-implementation MUST run every numbered step; reorderings are allowed only where they preserve all stated invariants. + +{% stepper %} + +{% step %} + +### Fetch the block and witness + +Call `eth_getBlockByHash` (or `eth_getBlockByNumber`) for the block, and [`mega_getBlockWitness`](witness.md) for the witness. +Pin the witness call to `(blockNumber, blockHash)`; a `blockNumber`-only call is non-deterministic across forks. + +The reference client fetches both in parallel from independent RPC pools — see [`get_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L402) and [`get_witness`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L530). 
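A rough sketch of fetching the two halves of the pair concurrently, using only the standard library. The reference client is async and uses its own RPC pools; `fetch_pair` and the closure-based fetchers here are hypothetical stand-ins, not its API.

```rust
use std::thread;

/// Runs the block fetch and the witness fetch on two scoped threads and
/// returns both results, or the first error encountered (block first).
/// The fetchers are hypothetical closures standing in for real RPC calls.
pub fn fetch_pair<B, W, E>(
    fetch_block: impl FnOnce() -> Result<B, E> + Send,
    fetch_witness: impl FnOnce() -> Result<W, E> + Send,
) -> Result<(B, W), E>
where
    B: Send,
    W: Send,
    E: Send,
{
    thread::scope(|scope| {
        // Both requests are in flight at the same time.
        let block_handle = scope.spawn(fetch_block);
        let witness_handle = scope.spawn(fetch_witness);

        // Join both threads; a failed fetch aborts the pair.
        let block = block_handle.join().expect("block fetch thread panicked")?;
        let witness = witness_handle.join().expect("witness fetch thread panicked")?;
        Ok((block, witness))
    })
}
```

In a real client both closures would be given the same `(blockNumber, blockHash)` pin, so the witness cannot silently belong to a different fork than the block.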
+ +{% endstep %} + +{% step %} + +### Decode the witness payload + +Strip the `v0:` prefix, base64-decode, Zstd-decompress, and bincode-deserialize (legacy config) into `(SaltWitness, MptWitness)`. +The exact pipeline and a Rust reference snippet live on [Get Block Witness](witness.md#decoding-pipeline). + +{% endstep %} + +{% step %} + +### Verify the SALT proof against the previous state root + +The `SaltProof` inside `SaltWitness` is a multi-point IPA opening on the Banderwagon curve. +Run the verifier against the **previous block's** `state_root` (taken from the parent header, or from the trusted anchor on the very first block). + +If the proof does not verify, **reject the block immediately** — every subsequent step assumes the witnessed key-value pairs are authenticated. + +For the proof's mathematical structure, see the [SALT repository](https://github.com/megaeth-labs/salt). + +{% endstep %} + +{% step %} + +### Build a state-read backend over the witness + +Treat `SaltWitness.kvs` as the only source of state for the duration of replay. +Every account or storage read during execution falls through to a lookup in this map and resolves to one of three outcomes: + +| Witness entry | Verifier behavior | +| --------------------------- | --------------------------------------------------------------------------------- | +| `Some(value)` — present | Return the decoded account or slot. | +| `Some(None)` — proven empty | Return the EVM "empty" sentinel (zero balance / nonce, no code; `0` for storage). | +| Key absent from `kvs` | **Error.** Halt validation immediately. | + +{% hint style="warning" %} +The third case is what blocks witness omission attacks. +A malicious witness producer that left a key out (rather than proving it empty) would otherwise let the verifier silently treat real state as zero. +The verifier MUST treat "absent from `kvs`" as a fatal error, not as "empty". 
+{% endhint %} + +In the reference client, this backend is [`WitnessDatabase`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/evm_database.rs#L64), which implements [`revm::DatabaseRef`](https://docs.rs/revm/latest/revm/trait.DatabaseRef.html). +A custom implementation needs the equivalent surface for whatever EVM it embeds. + +{% endstep %} + +{% step %} + +### Resolve contract bytecode by hash + +Account entries in the witness carry the `codehash`, not the bytecode itself. +This is intentional — bytecode is large, changes infrequently, and is content-addressed, so the witness only references it. + +Maintain a local cache keyed by `codehash`. +On a miss, fetch via `eth_getCodeByHash` (or fall back to `eth_getCode` for a known holder address) and **verify** that `keccak256(code) == codehash` before installing it. +A miss that cannot be resolved is a fatal error for the block being validated. + +{% endstep %} + +{% step %} + +### Replay the block's transactions + +Execute every transaction with the chain's hardfork rules and accumulate state changes, receipts, and the cumulative gas counter. + +A custom EVM must match MegaEVM's semantics exactly — see [Re-execution requirements](#re-execution-requirements). + +The reference client wires this through [`MegaBlockExecutorFactory` and `MegaEvmFactory`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L273) from the [`mega-evm`](https://github.com/megaeth-labs/mega-evm) crate, which extends `revm` rather than forking it. +Re-implementers can either link `mega-evm` directly or build a compatible EVM from the [Specification](https://docs.megaeth.com/spec/megaevm/dual-gas-model). 
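During replay, every account or storage read resolves through the witness-backed lookup from the state-read backend step. A minimal sketch of its three outcomes follows, assuming a `u64` stand-in for the key type and raw bytes for values; `WitnessView` and `ReadError` are hypothetical names, not the reference client's `WitnessDatabase`.

```rust
use std::collections::BTreeMap;

/// Stand-in key type for illustration only.
pub type Key = u64;

#[derive(Debug, PartialEq, Eq)]
pub enum ReadError {
    /// The key was neither proven present nor proven empty.
    MissingFromWitness(Key),
}

/// Read-only view over the witnessed key-value pairs.
pub struct WitnessView {
    kvs: BTreeMap<Key, Option<Vec<u8>>>,
}

impl WitnessView {
    pub fn new(kvs: BTreeMap<Key, Option<Vec<u8>>>) -> Self {
        Self { kvs }
    }

    /// Returns `Ok(Some(bytes))` for a present entry, `Ok(None)` for a
    /// proven-empty entry, and an error when the key is absent.
    pub fn read(&self, key: Key) -> Result<Option<&[u8]>, ReadError> {
        match self.kvs.get(&key) {
            Some(Some(value)) => Ok(Some(value.as_slice())),
            Some(None) => Ok(None),
            // Absent is never treated as empty: halt validation instead.
            None => Err(ReadError::MissingFromWitness(key)),
        }
    }
}
```

A caller would map `Ok(None)` to the EVM's empty account or zero slot, and propagate the error so that the block is rejected.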
+ +The `BLOCKHASH` opcode is served from the witnessed [EIP-2935](https://eips.ethereum.org/EIPS/eip-2935) history-storage contract entries — see [`evm_database.rs:149`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/evm_database.rs#L149). +There is no separate "ancestor headers" field in the witness. + +{% endstep %} + +{% step %} + +### Apply pre- and post-execution system updates + +Before the first transaction, apply OP-Stack pre-block hooks (e.g. L1-attributes deposit). +After the last transaction, apply post-block hooks (withdrawals processing on the L1 message-passer, beacon-root updates, etc.). +The exact set is fixed by the active hardfork; mirror the reference client's `replay_block` to stay aligned. + +{% endstep %} + +{% step %} + +### Update the withdrawals MPT and recompute `withdrawals_root` + +`MptWitness` carries the storage trie of the L2-to-L1 message-passer contract (`0x4200000000000000000000000000000000000016`) as RLP-encoded MPT nodes plus its pre-state root. + +Apply the block's withdrawal-message writes against this trie, then recompute the root. +This must match `block.withdrawals_root`. + +The reference path is [`MptWitness::verify`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/withdrawals.rs#L80). + +{% endstep %} + +{% step %} + +### Apply state changes to SALT and recompute `state_root` + +Flatten the EVM's collected state changes into `(SaltKey, SaltValue)` pairs. +The reference client uses two intermediate types — `PlainKey` (account address or `address ++ slot`) and `PlainValue` (encoded account or 32-byte slot) — defined in [`crates/stateless-core/src/data_types.rs`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/data_types.rs). 
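The plain keys and values in this step are fixed-width byte strings, so the layouts can be sketched with plain array copies. The helper names below are hypothetical and are not taken from `data_types.rs`; the widths follow the encoding table in this step.

```rust
pub type Address = [u8; 20];
/// A 256-bit value in big-endian byte order.
pub type Word = [u8; 32];

/// EOA value: 8-byte big-endian nonce followed by 32-byte big-endian balance.
pub fn encode_eoa(nonce: u64, balance_be: Word) -> [u8; 40] {
    let mut out = [0u8; 40];
    out[..8].copy_from_slice(&nonce.to_be_bytes());
    out[8..].copy_from_slice(&balance_be);
    out
}

/// Contract value: the EOA layout followed by the 32-byte code hash.
pub fn encode_contract(nonce: u64, balance_be: Word, code_hash: Word) -> [u8; 72] {
    let mut out = [0u8; 72];
    out[..40].copy_from_slice(&encode_eoa(nonce, balance_be));
    out[40..].copy_from_slice(&code_hash);
    out
}

/// Storage key: 20-byte address followed by the 32-byte slot index.
pub fn storage_key(address: Address, slot: Word) -> [u8; 52] {
    let mut out = [0u8; 52];
    out[..20].copy_from_slice(&address);
    out[20..].copy_from_slice(&slot);
    out
}
```

Returning fixed-size arrays rather than `Vec<u8>` makes a wrong length a compile-time error instead of a root mismatch discovered at validation time.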
+ +Encoding rules (mirrored on [Get Block Witness](witness.md#saltvalue) for the reverse direction): + +| Update | `key_len` | `value_len` | Layout | +| ---------------- | --------- | ----------- | ----------------------------------------------------------------------------------- | +| EOA account | 20 | 40 | 8-byte big-endian nonce ‖ 32-byte big-endian balance. | +| Contract account | 20 | 72 | 8-byte big-endian nonce ‖ 32-byte big-endian balance ‖ 32-byte keccak256 code hash. | +| Storage slot | 52 | 32 | Key is `address(20) ‖ slot(32)`; value is the 32-byte big-endian U256. | + +Apply these updates to the SALT trie in canonical (sorted) key order and recompute the root. +This must match `block.state_root`. + +{% endstep %} + +{% step %} + +### Compare every header commitment + +The block validates only if **every** check below passes: + +| Field | Source | +| ------------------ | ------------------------------------------------------------------ | +| `state_root` | Recomputed SALT root from the previous step. | +| `withdrawals_root` | Recomputed MPT root from step 8. | +| `receipts_root` | Merkle root of the transactions' receipts collected during replay. | +| `logs_bloom` | Aggregated 256-byte bloom filter over emitted logs. | +| `gas_used` | Cumulative gas counter from replay. | + +A single mismatch is a fatal error for the block — do **not** advance the local chain. +The reference comparisons are at [`executor.rs:534-559`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L534). + +{% endstep %} + +{% step %} + +### Advance the local chain + +If all checks pass, persist the block as the new tip and (optionally) report it via `mega_setValidatedBlocks`. + +If the validated block's `parent_hash` does not match the previous tip, treat it as a reorg: walk back to the divergence and re-validate forward along the new branch. 
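The advance rule can be sketched as a small buffer that holds out-of-order results and only extends the tip when the next height links to it by `parent_hash`. This is an illustrative model, not the reference client's `run_pipeline`; `ChainAdvancer`, `ValidatedBlock`, and `Advance` are hypothetical types, and persistence and the walk-back after a reorg are left out.

```rust
use std::collections::BTreeMap;

pub type Hash = [u8; 32];

#[derive(Clone, Debug, PartialEq, Eq)]
pub struct ValidatedBlock {
    pub number: u64,
    pub hash: Hash,
    pub parent_hash: Hash,
}

#[derive(Debug, PartialEq, Eq)]
pub enum Advance {
    /// The tip moved forward to this height.
    Extended(u64),
    /// The block was buffered; the next height has not arrived yet.
    Waiting,
    /// The block at this height does not extend the local tip.
    Reorg { at_height: u64 },
}

pub struct ChainAdvancer {
    tip_number: u64,
    tip_hash: Hash,
    pending: BTreeMap<u64, ValidatedBlock>,
}

impl ChainAdvancer {
    pub fn new(anchor_number: u64, anchor_hash: Hash) -> Self {
        Self { tip_number: anchor_number, tip_hash: anchor_hash, pending: BTreeMap::new() }
    }

    pub fn tip(&self) -> (u64, Hash) {
        (self.tip_number, self.tip_hash)
    }

    /// Buffers a validated block, then drains every consecutive height
    /// that links to the current tip.
    pub fn submit(&mut self, block: ValidatedBlock) -> Advance {
        self.pending.insert(block.number, block);
        let mut advanced = false;
        loop {
            let next = self.tip_number + 1;
            let parent = match self.pending.get(&next) {
                Some(candidate) => candidate.parent_hash,
                None => break,
            };
            if parent != self.tip_hash {
                // Drop the non-linking block; the caller handles the reorg.
                self.pending.remove(&next);
                return Advance::Reorg { at_height: next };
            }
            let linked = self.pending.remove(&next).expect("checked above");
            self.tip_number = linked.number;
            self.tip_hash = linked.hash;
            advanced = true;
        }
        if advanced { Advance::Extended(self.tip_number) } else { Advance::Waiting }
    }
}
```

Because workers may finish in any order, the buffer is keyed by height; the tip itself only ever moves one linked block at a time.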
+ +{% endstep %} + +{% endstepper %} + +## Re-execution requirements + +A custom EVM must implement standard Cancun/Shanghai semantics **plus** the MegaEVM-specific extensions below. +Each link points to the normative specification. + +| Topic | Reference | +| ----------------- | ------------------------------------------------------------------------------------------------------------------------------ | +| Dual gas model | [Dual Gas Model](https://docs.megaeth.com/spec/megaevm/dual-gas-model) — compute gas vs. storage gas accounting per opcode. | +| Resource limits | [Resource Limits](https://docs.megaeth.com/spec/megaevm/resource-limits) — per-block and per-transaction caps. | +| System contracts | [System Contracts](https://docs.megaeth.com/spec/system-contracts/overview) — addresses and behaviors that MUST be replicated. | +| Precompiles | [Precompiles](https://docs.megaeth.com/spec/megaevm/precompiles) — MegaETH-specific precompile semantics, if any. | +| Volatile data | [Volatile Data Access](../dev/execution/volatile-data.md) — non-deterministic reads and how they are handled at re-execution. | +| Hardfork schedule | The genesis JSON the validator is started with. Mirror the same schedule in your own client. | + +If `block_replay_time_seconds` (in the reference client) exceeds the chain's block period, you are not real-time — diagnose with the per-stage histograms in [Stateless Validation](stateless-validation.md#useful-metrics). + +## Trust model and reorgs + +The validator's **only** trust input is the anchor block hash supplied at startup; the genesis JSON, hardfork schedule, and chain ID are derived from it together with the operator-provided genesis file. +Everything downstream is verified: + +- The witness is verified cryptographically against the parent's state root before replay. +- Bytecode is verified by recomputing `keccak256(code)` on every cache miss. +- The post-state is verified by recomputing every header commitment. 
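The last check in the list above amounts to a field-by-field comparison between the header and the locally recomputed values. A minimal sketch follows, assuming the five commitments named in the header-comparison step; `Commitments` and `mismatches` are hypothetical names, not the reference client's types.

```rust
/// The header fields a validator recomputes and compares.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Commitments {
    pub state_root: [u8; 32],
    pub withdrawals_root: [u8; 32],
    pub receipts_root: [u8; 32],
    pub logs_bloom: [u8; 256],
    pub gas_used: u64,
}

/// Returns the names of all fields that differ. An empty result means
/// the block validates; anything else is fatal for the block.
pub fn mismatches(header: &Commitments, recomputed: &Commitments) -> Vec<&'static str> {
    let mut out = Vec::new();
    if header.state_root != recomputed.state_root {
        out.push("state_root");
    }
    if header.withdrawals_root != recomputed.withdrawals_root {
        out.push("withdrawals_root");
    }
    if header.receipts_root != recomputed.receipts_root {
        out.push("receipts_root");
    }
    if header.logs_bloom != recomputed.logs_bloom {
        out.push("logs_bloom");
    }
    if header.gas_used != recomputed.gas_used {
        out.push("gas_used");
    }
    out
}
```

Collecting every mismatch instead of stopping at the first one tends to make diagnosis easier: a lone `state_root` mismatch and a mismatch in all five fields point to different causes.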
+ +The validator does **not** decide which fork is canonical. +For a fully trust-minimized setup, pair it with `op-node` and a MegaETH replica that derive the canonical chain from L1 — see [Stateless Validation > Trust model](stateless-validation.md#trust-model). + +Reorgs are detected when a freshly validated block's `parent_hash` does not match the local tip. +The chain advancer truncates back to the divergence height and re-validates the new branch from there; the canonical-chain row cap (`canonical-chain-max-length`) bounds how far back this can reach. + +## Reference implementation + +The reference client is a Cargo workspace. +The crates below are the entry points a re-implementation will most often want to mirror: + +| Crate | Path | Role | +| --------------------- | ------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------ | +| `stateless-core` | [`crates/stateless-core/`](https://github.com/megaeth-labs/stateless-validator/tree/main/crates/stateless-core) | Validation pipeline, witness-backed `revm` database, SALT update path. | +| `stateless-common` | [`crates/stateless-common/`](https://github.com/megaeth-labs/stateless-validator/tree/main/crates/stateless-common) | Multi-endpoint RPC client with retry/backoff and independent concurrency caps. | +| `stateless-db` | [`crates/stateless-db/`](https://github.com/megaeth-labs/stateless-validator/tree/main/crates/stateless-db) | redb-backed chain store, contract cache, table layout. | +| `stateless-validator` | [`bin/stateless-validator/`](https://github.com/megaeth-labs/stateless-validator/tree/main/bin/stateless-validator) | CLI, configuration, metrics endpoint, signal handling. | + +Companion repositories: + +- [`megaeth-labs/salt`](https://github.com/megaeth-labs/salt) — the authenticated key-value store and IPA proof system. 
Defines [`SaltWitness`](https://github.com/megaeth-labs/salt/blob/main/salt/src/proof/salt_witness.rs#L46), [`SaltKey`](https://github.com/megaeth-labs/salt/blob/main/salt/src/types.rs#L198), [`SaltValue`](https://github.com/megaeth-labs/salt/blob/main/salt/src/types.rs#L274), and [`SaltProof`](https://github.com/megaeth-labs/salt/blob/main/salt/src/proof/prover.rs#L103). +- [`megaeth-labs/mega-evm`](https://github.com/megaeth-labs/mega-evm) — the MegaEVM execution layer, layered on top of `revm`. + +## Related pages + +- [Get Block Witness](witness.md) — wire format, decoding pipeline, and field-by-field type definitions for the witness payload. +- [Stateless Validation](stateless-validation.md) — operator guide for running the reference client (CLI, metrics, anchoring, troubleshooting). +- [Architecture](../architecture.md) — how transactions move through MegaETH and where validators fit in. +- [Specification](https://docs.megaeth.com/spec/megaevm/dual-gas-model) — normative MegaEVM behavior. From 0a21ac7b3e6c8cf83bb264ed881dfded5ce1510a Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Fri, 8 May 2026 11:29:20 +0800 Subject: [PATCH 02/15] docs(node): document what the validator reads from genesis Add a Genesis configuration section to the validator architecture guide. Spell out that the validator extracts only chain_id and the hardfork schedule (Ethereum, OP-Stack, MegaETH) from the genesis JSON, with file:line references into chain_spec.rs. Note the fields that are loaded but not consumed (alloc, gasLimit, baseFeePerGas, ...) and warn that any divergence in hardfork timestamps surfaces only as a state_root mismatch. Tighten the Trust model section to list the two startup trust inputs (genesis JSON, anchor block hash) explicitly. 
--- docs/node/validator-architecture.md | 33 +++++++++++++++++++++++++++-- 1 file changed, 31 insertions(+), 2 deletions(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index c799baf..fd01dad 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -20,9 +20,34 @@ It holds **no chain state of its own** — a fresh witness arrives with each blo | ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Input | A `(block, witness)` pair fetched per height. The witness is the response of [`mega_getBlockWitness`](witness.md). | | Output | A locally-persisted record that the block validates, plus optional `mega_setValidatedBlocks` callbacks to a downstream service. | -| Trust input | One **anchor**: a block hash and the chain's genesis config, supplied at startup. The next block must extend the anchor exactly. | +| Trust input | Two values supplied at startup: a **genesis JSON** (chain ID + hardfork schedule) and an **anchor block hash** (the chain head the next block must extend). | | Non-goal | Picking the canonical fork. The validator validates whatever block sequence it is fed; pair it with a consensus client (e.g. `op-node`) to derive canonicality. | +## Genesis configuration + +The genesis JSON is the validator's primary configuration anchor. +Misconfigure it and every subsequent fork-conditional check silently runs against the wrong rules — the validator will produce mismatched roots with no "wrong chain" error to point you at the cause. +Treat it like a chain-identity contract: load it once, persist it, and never edit it. 
+ +The reference client loads genesis via `--genesis-file` on first run, stores it in the local database with [`store_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/bin/stateless-validator/src/validator_db.rs#L88), and re-reads the stored copy on every subsequent boot. + +Despite the file carrying the full Genesis schema (allocations, gas limit, timestamp, base fee, ...), the validator consumes only two pieces of state from it: + +| Derived value | Source in `config` | Use during validation | +| ----------------- | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Chain ID | `chainId` | Drives the EVM `CHAINID` opcode and EIP-155 transaction-signature checks. | +| Hardfork schedule | `Block` and `Time` fields | Activates Ethereum (Cancun, Shanghai, ...), OP-Stack (Ecotone, Granite, Holocene, Isthmus, ...), and MegaETH (MiniRex, MiniRex1-2, Rex, Rex1-5) at their pre-declared block numbers or timestamps. | + +The genesis `alloc`, `gasLimit`, `baseFeePerGas`, and other initial-state fields are **not** consumed — initial state arrives via the witness, not from genesis. + +Reference: [`ChainSpec::from_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L59) reads `genesis.config.chain_id` directly, hands the full `Genesis` to `OpChainSpec::from_genesis` to extract Ethereum and OP-Stack fork conditions, and pulls MegaETH-specific forks via [`MegaethGenesisHardforks::extract_from`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L123). +The three sets are merged into a single ordered hardfork schedule that drives every fork-conditional code path: opcode availability, gas-cost tables, system-contract pre/post-block hooks, and resource limits. 
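To illustrate how a merged schedule drives fork-conditional code, here is a simplified sketch. `ForkCondition`, `merge_schedules`, and `active_forks` are hypothetical and do not mirror the reference client's `ChainSpec`; the sketch keeps the order in which the sets are supplied rather than sorting them.

```rust
/// A fork activates either at a block number or at a timestamp.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ForkCondition {
    Block(u64),
    Timestamp(u64),
}

impl ForkCondition {
    /// True once the block has reached the activation point.
    pub fn is_active(self, block_number: u64, timestamp: u64) -> bool {
        match self {
            ForkCondition::Block(n) => block_number >= n,
            ForkCondition::Timestamp(t) => timestamp >= t,
        }
    }
}

pub type Schedule = Vec<(&'static str, ForkCondition)>;

/// Concatenates several fork sets (for example Ethereum, OP-Stack, and
/// MegaETH) into one schedule, preserving the order they are given in.
pub fn merge_schedules(sets: &[&[(&'static str, ForkCondition)]]) -> Schedule {
    sets.iter().flat_map(|set| set.iter().copied()).collect()
}

/// Names of all forks active for a block with this number and timestamp.
pub fn active_forks(
    schedule: &[(&'static str, ForkCondition)],
    block_number: u64,
    timestamp: u64,
) -> Vec<&'static str> {
    schedule
        .iter()
        .filter(|(_, cond)| cond.is_active(block_number, timestamp))
        .map(|(name, _)| *name)
        .collect()
}
```

Two validators whose schedules differ in a single activation value would compute different `active_forks` sets for the affected blocks, and would replay those blocks under different rules.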
+ +{% hint style="warning" %} +All replicas of the chain MUST use byte-identical genesis JSON. +A divergence in any single hardfork timestamp produces a fork that the rest of the network will reject — and because the divergence only manifests as a `state_root` mismatch on the first affected block, it is hard to attribute after the fact. +{% endhint %} + ## Reference architecture The reference client is a three-stage async pipeline. @@ -240,7 +265,11 @@ If `block_replay_time_seconds` (in the reference client) exceeds the chain's blo ## Trust model and reorgs -The validator's **only** trust input is the anchor block hash supplied at startup; the genesis JSON, hardfork schedule, and chain ID are derived from it together with the operator-provided genesis file. +The validator has two trust inputs, both supplied at startup: + +- The **genesis JSON** — supplies the chain ID and hardfork schedule (see [Genesis configuration](#genesis-configuration)). Persisted on first run; reused thereafter. +- The **anchor block hash** — pins the chain head. The next validated block's `parent_hash` must equal this value. + Everything downstream is verified: - The witness is verified cryptographically against the parent's state root before replay. From 4152e7da34fa6818b9e8cf6da3b9a373aeb679d6 Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Fri, 8 May 2026 11:45:46 +0800 Subject: [PATCH 03/15] docs(node): address PR #46 review comments - Reorder validation steps so pre-execution system updates come before transaction replay and post-execution updates come after, matching the actual execution order an implementer must follow. - Drop the Rex1-5 / MiniRex1-2 range shorthands and list each fork individually (MiniRex, MiniRex1, MiniRex2, Rex, Rex1-Rex4) per the canonical terminology table in docs/AGENTS.md. - Note that eth_getCodeByHash is a MegaETH RPC extension and call out the eth_getCode fallback path for endpoints that do not implement it. 
- Replace 'fully trust-minimized' (banned in docs/node/AGENTS.md) with neutral wording about deriving canonicality from L1. --- docs/node/validator-architecture.md | 23 ++++++++++++++++------- 1 file changed, 16 insertions(+), 7 deletions(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index fd01dad..b12adf2 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -36,7 +36,7 @@ Despite the file carrying the full Genesis schema (allocations, gas limit, times | Derived value | Source in `config` | Use during validation | | ----------------- | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | | Chain ID | `chainId` | Drives the EVM `CHAINID` opcode and EIP-155 transaction-signature checks. | -| Hardfork schedule | `Block` and `Time` fields | Activates Ethereum (Cancun, Shanghai, ...), OP-Stack (Ecotone, Granite, Holocene, Isthmus, ...), and MegaETH (MiniRex, MiniRex1-2, Rex, Rex1-5) at their pre-declared block numbers or timestamps. | +| Hardfork schedule | `Block` and `Time` fields | Activates Ethereum (Cancun, Shanghai, ...), OP-Stack (Ecotone, Granite, Holocene, Isthmus, ...), and MegaETH (MiniRex, MiniRex1, MiniRex2, Rex, Rex1, Rex2, Rex3, Rex4) at their pre-declared block numbers or timestamps. | The genesis `alloc`, `gasLimit`, `baseFeePerGas`, and other initial-state fields are **not** consumed — initial state arrives via the witness, not from genesis. @@ -152,13 +152,23 @@ Account entries in the witness carry the `codehash`, not the bytecode itself. This is intentional — bytecode is large, changes infrequently, and is content-addressed, so the witness only references it. Maintain a local cache keyed by `codehash`. 
-On a miss, fetch via `eth_getCodeByHash` (or fall back to `eth_getCode` for a known holder address) and **verify** that `keccak256(code) == codehash` before installing it. +On a miss, fetch via `eth_getCodeByHash` — a MegaETH RPC extension that takes a code hash and returns the bytecode whose `keccak256` equals that hash — and **verify** that `keccak256(code) == codehash` before installing it. +If the endpoint does not support `eth_getCodeByHash`, fall back to `eth_getCode` against a known holder address and apply the same verification. A miss that cannot be resolved is a fatal error for the block being validated. {% endstep %} {% step %} +### Apply pre-execution system updates + +Before the first transaction, apply OP-Stack pre-block hooks (e.g. L1-attributes deposit). +The exact set is fixed by the active hardfork; mirror the reference client's `replay_block` to stay aligned. + +{% endstep %} + +{% step %} + ### Replay the block's transactions Execute every transaction with the chain's hardfork rules and accumulate state changes, receipts, and the cumulative gas counter. @@ -175,11 +185,10 @@ There is no separate "ancestor headers" field in the witness. {% step %} -### Apply pre- and post-execution system updates +### Apply post-execution system updates -Before the first transaction, apply OP-Stack pre-block hooks (e.g. L1-attributes deposit). -After the last transaction, apply post-block hooks (withdrawals processing on the L1 message-passer, beacon-root updates, etc.). -The exact set is fixed by the active hardfork; mirror the reference client's `replay_block` to stay aligned. +After the last transaction, apply OP-Stack post-block hooks: withdrawals processing on the L1 message-passer contract, beacon-root updates, and any hardfork-specific finalization. +As with pre-execution, the exact set is fixed by the active hardfork. {% endstep %} @@ -277,7 +286,7 @@ Everything downstream is verified: - The post-state is verified by recomputing every header commitment. 
The validator does **not** decide which fork is canonical. -For a fully trust-minimized setup, pair it with `op-node` and a MegaETH replica that derive the canonical chain from L1 — see [Stateless Validation > Trust model](stateless-validation.md#trust-model). +To derive canonicality from L1 instead of trusting the upstream RPC, pair it with `op-node` and a MegaETH replica — see [Stateless Validation > Trust model](stateless-validation.md#trust-model). Reorgs are detected when a freshly validated block's `parent_hash` does not match the local tip. The chain advancer truncates back to the divergence height and re-validates the new branch from there; the canonical-chain row cap (`canonical-chain-max-length`) bounds how far back this can reach. From c2e367923cd51f243c98fcd449bb4170071c1055 Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Fri, 8 May 2026 14:58:18 +0800 Subject: [PATCH 04/15] Update validator-architecture.md --- docs/node/validator-architecture.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index b12adf2..51d094e 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -33,9 +33,9 @@ The reference client loads genesis via `--genesis-file` on first run, stores it Despite the file carrying the full Genesis schema (allocations, gas limit, timestamp, base fee, ...), the validator consumes only two pieces of state from it: -| Derived value | Source in `config` | Use during validation | -| ----------------- | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | -| Chain ID | `chainId` | Drives the EVM `CHAINID` opcode and EIP-155 transaction-signature checks. 
| +| Derived value | Source in `config` | Use during validation | +| ----------------- | ------------------------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| Chain ID | `chainId` | Drives the EVM `CHAINID` opcode and EIP-155 transaction-signature checks. | | Hardfork schedule | `Block` and `Time` fields | Activates Ethereum (Cancun, Shanghai, ...), OP-Stack (Ecotone, Granite, Holocene, Isthmus, ...), and MegaETH (MiniRex, MiniRex1, MiniRex2, Rex, Rex1, Rex2, Rex3, Rex4) at their pre-declared block numbers or timestamps. | The genesis `alloc`, `gasLimit`, `baseFeePerGas`, and other initial-state fields are **not** consumed — initial state arrives via the witness, not from genesis. From c0ca33e89f1515b2cb639bbf970978a6f4a7df27 Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Fri, 8 May 2026 15:03:43 +0800 Subject: [PATCH 05/15] fix review - Fix step reference in the header-commitment table: withdrawals_root is recomputed in step 9 (Update the withdrawals MPT), not step 8. The numbering shifted when pre/post-execution system updates were split into separate steps. - Add MiniRex1 and MiniRex2 to the canonical spec-name list in docs/AGENTS.md. Both are real hardforks defined in crates/stateless-core/src/chain_spec.rs (mini_rex_1_time, mini_rex_2_time) with activation timestamps in the mainnet genesis. --- docs/AGENTS.md | 28 ++++++++++++++-------------- docs/node/validator-architecture.md | 2 +- 2 files changed, 15 insertions(+), 15 deletions(-) diff --git a/docs/AGENTS.md b/docs/AGENTS.md index d2e9be9..e09e848 100644 --- a/docs/AGENTS.md +++ b/docs/AGENTS.md @@ -59,20 +59,20 @@ When in doubt: top-level pages prioritize readability, layer pages prioritize de Use these exact forms consistently. Do not alternate between variants. 
-| Term | Correct | Incorrect |
-| --------------------- | ------------------------------------ | ----------------------------------------------------------- |
-| Project name | MegaETH | megaETH, Mega ETH, megaeth, MEGAETH |
-| EVM implementation | MegaEVM | MegaEvm, mega-evm, Mega EVM |
-| Mainnet (proper noun) | MegaETH Mainnet | MegaETH mainnet, main net, main-net |
-| Testnet (proper noun) | MegaETH Testnet | MegaETH testnet, test net, test-net |
-| Currency ticker | ETH | eth, Eth |
-| Currency name | ether | Ether, ETH (when referring to the currency, not the ticker) |
-| Block type | mini-block | miniblock, mini block, MiniBlock |
-| Onchain / offchain | onchain, offchain | on-chain, off-chain, on chain |
-| Smart contract | smart contract | Smart Contract, smartcontract |
-| Gas dimensions | compute gas, storage gas | Compute Gas, Storage Gas, Compute gas |
-| Spec names | MiniRex, Rex, Rex1, Rex2, Rex3, Rex4 | minirex, MINIREX, mini-rex, rex-3 |
-| State trie | SALT | Salt, salt |
+| Term | Correct | Incorrect |
+| --------------------- | -------------------------------------------------------- | ----------------------------------------------------------- |
+| Project name | MegaETH | megaETH, Mega ETH, megaeth, MEGAETH |
+| EVM implementation | MegaEVM | MegaEvm, mega-evm, Mega EVM |
+| Mainnet (proper noun) | MegaETH Mainnet | MegaETH mainnet, main net, main-net |
+| Testnet (proper noun) | MegaETH Testnet | MegaETH testnet, test net, test-net |
+| Currency ticker | ETH | eth, Eth |
+| Currency name | ether | Ether, ETH (when referring to the currency, not the ticker) |
+| Block type | mini-block | miniblock, mini block, MiniBlock |
+| Onchain / offchain | onchain, offchain | on-chain, off-chain, on chain |
+| Smart contract | smart contract | Smart Contract, smartcontract |
+| Gas dimensions | compute gas, storage gas | Compute Gas, Storage Gas, Compute gas |
+| Spec names | MiniRex, MiniRex1, MiniRex2, Rex, Rex1, Rex2, Rex3, Rex4 | minirex, MINIREX, mini-rex, rex-3 |
+| State trie | SALT | Salt, salt |

 **Capitalization rules:**

diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md
index 51d094e..b9d17ea 100644
--- a/docs/node/validator-architecture.md
+++ b/docs/node/validator-architecture.md
@@ -234,7 +234,7 @@ The block validates only if **every** check below passes:
 | Field | Source |
 | ------------------ | ------------------------------------------------------------------ |
 | `state_root` | Recomputed SALT root from the previous step. |
-| `withdrawals_root` | Recomputed MPT root from step 8. |
+| `withdrawals_root` | Recomputed MPT root from step 9. |
 | `receipts_root` | Merkle root of the transactions' receipts collected during replay. |
 | `logs_bloom` | Aggregated 256-byte bloom filter over emitted logs. |
 | `gas_used` | Cumulative gas counter from replay. |

From c684f82cc5c03bfe17c0eb33c592041c28d7699e Mon Sep 17 00:00:00 2001
From: "liquan.eth"
Date: Fri, 8 May 2026 16:30:32 +0800
Subject: [PATCH 06/15] docs: tighten precompiles row, defer details to spec page

---
 docs/node/validator-architecture.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md
index b9d17ea..7cbd6ae 100644
--- a/docs/node/validator-architecture.md
+++ b/docs/node/validator-architecture.md
@@ -261,14 +261,14 @@ If the validated block's `parent_hash` does not match the previous tip, treat it
 A custom EVM must implement standard Cancun/Shanghai semantics **plus** the MegaEVM-specific extensions below.
 Each link points to the normative specification.

-| Topic | Reference |
-| ----------------- | ------------------------------------------------------------------------------------------------------------------------------ |
-| Dual gas model | [Dual Gas Model](https://docs.megaeth.com/spec/megaevm/dual-gas-model) — compute gas vs. storage gas accounting per opcode. |
-| Resource limits | [Resource Limits](https://docs.megaeth.com/spec/megaevm/resource-limits) — per-block and per-transaction caps. |
-| System contracts | [System Contracts](https://docs.megaeth.com/spec/system-contracts/overview) — addresses and behaviors that MUST be replicated. |
-| Precompiles | [Precompiles](https://docs.megaeth.com/spec/megaevm/precompiles) — MegaETH-specific precompile semantics, if any. |
-| Volatile data | [Volatile Data Access](../dev/execution/volatile-data.md) — non-deterministic reads and how they are handled at re-execution. |
-| Hardfork schedule | The genesis JSON the validator is started with. Mirror the same schedule in your own client. |
+| Topic | Reference |
+| ----------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| Dual gas model | [Dual Gas Model](https://docs.megaeth.com/spec/megaevm/dual-gas-model) — compute gas vs. storage gas accounting per opcode. |
+| Resource limits | [Resource Limits](https://docs.megaeth.com/spec/megaevm/resource-limits) — per-block and per-transaction caps. |
+| System contracts | [System Contracts](https://docs.megaeth.com/spec/system-contracts/overview) — addresses and behaviors that MUST be replicated. |
+| Precompiles | OP-Stack Isthmus set, with MegaETH gas-cost overrides at a few standard addresses. See [Precompiles](https://docs.megaeth.com/spec/megaevm/precompiles) for the full list and per-address details. |
+| Volatile data | [Volatile Data Access](../dev/execution/volatile-data.md) — non-deterministic reads and how they are handled at re-execution. |
+| Hardfork schedule | The genesis JSON the validator is started with. Mirror the same schedule in your own client. |

 If `block_replay_time_seconds` (in the reference client) exceeds the chain's block period, you are not real-time — diagnose with the per-stage histograms in [Stateless Validation](stateless-validation.md#useful-metrics).

From e6c03af493d5e26436c3a48d97bd57f537778ac2 Mon Sep 17 00:00:00 2001
From: William Aaron Cheung
Date: Fri, 8 May 2026 16:54:30 +0800
Subject: [PATCH 07/15] docs(node): correct pre/post-execution hooks, wrap reference-impl in info boxes

---
 docs/node/validator-architecture.md | 106 +++++++++++++++++++++-------
 1 file changed, 79 insertions(+), 27 deletions(-)

diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md
index 7cbd6ae..adc7fa5 100644
--- a/docs/node/validator-architecture.md
+++ b/docs/node/validator-architecture.md
@@ -16,20 +16,27 @@ For the wire format of the witness, see [Get Block Witness](witness.md).
 A stateless validator independently re-executes every MegaETH block against a compact cryptographic witness, then checks that every commitment in the block header matches the resulting post-state.
 It holds **no chain state of its own** — a fresh witness arrives with each block and supplies just the slice of state that block touches.

-| Aspect | Detail |
-| ----------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| Input | A `(block, witness)` pair fetched per height. The witness is the response of [`mega_getBlockWitness`](witness.md). |
-| Output | A locally-persisted record that the block validates, plus optional `mega_setValidatedBlocks` callbacks to a downstream service. |
-| Trust input | Two values supplied at startup: a **genesis JSON** (chain ID + hardfork schedule) and an **anchor block hash** (the chain head the next block must extend). |
-| Non-goal | Picking the canonical fork. The validator validates whatever block sequence it is fed; pair it with a consensus client (e.g. `op-node`) to derive canonicality. |
+| Aspect | Detail |
+| ------ | ------------------------------------------------------------------------------------------------------------------ |
+| Input | A `(block, witness)` pair fetched per height. The witness is the response of [`mega_getBlockWitness`](witness.md). |
+| Output | A locally-persisted record that the block validates. |
+
+The validator's only startup trust input is a **genesis JSON** (chain ID + hardfork schedule) and an **anchor block hash** that the next validated block must extend.
+Both are detailed in [Genesis configuration](#genesis-configuration) below.
+
+**Non-goal:** picking the canonical fork.
+The validator validates whatever block sequence it is fed; pair it with a consensus client (e.g. `op-node`) to derive canonicality.

 ## Genesis configuration

 The genesis JSON is the validator's primary configuration anchor.
-Misconfigure it and every subsequent fork-conditional check silently runs against the wrong rules — the validator will produce mismatched roots with no "wrong chain" error to point you at the cause.
-Treat it like a chain-identity contract: load it once, persist it, and never edit it.
+Misconfigure it and every subsequent fork-conditional check silently runs against the wrong rules — the validator will produce mismatched state roots with no "wrong chain" error to point you at the cause.
+Treat it like a chain-identity contract: load it once, persist it, and never edit it by hand.
+Find the canonical mainnet genesis at [`test_data/mainnet/genesis.json`](https://github.com/megaeth-labs/stateless-validator/blob/main/test_data/mainnet/genesis.json) in the stateless-validator repo, and pull the updated copy whenever a new hardfork is scheduled.
-The reference client loads genesis via `--genesis-file` on first run, stores it in the local database with [`store_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/bin/stateless-validator/src/validator_db.rs#L88), and re-reads the stored copy on every subsequent boot.
+{% hint style="info" %}
+**Reference impl.** Loads genesis via `--genesis-file` on first run, stores it in the local database with [`store_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/bin/stateless-validator/src/validator_db.rs#L88), and re-reads the stored copy on every subsequent boot.
+{% endhint %}

 Despite the file carrying the full Genesis schema (allocations, gas limit, timestamp, base fee, ...), the validator consumes only two pieces of state from it:

@@ -40,8 +47,10 @@ Despite the file carrying the full Genesis schema (allocations, gas limit, times

 The genesis `alloc`, `gasLimit`, `baseFeePerGas`, and other initial-state fields are **not** consumed — initial state arrives via the witness, not from genesis.

-Reference: [`ChainSpec::from_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L59) reads `genesis.config.chain_id` directly, hands the full `Genesis` to `OpChainSpec::from_genesis` to extract Ethereum and OP-Stack fork conditions, and pulls MegaETH-specific forks via [`MegaethGenesisHardforks::extract_from`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L123).
+{% hint style="info" %}
+**Reference impl.** [`ChainSpec::from_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L59) reads `genesis.config.chain_id` directly, hands the full `Genesis` to `OpChainSpec::from_genesis` to extract Ethereum and OP-Stack fork conditions, and pulls MegaETH-specific forks via [`MegaethGenesisHardforks::extract_from`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L123).
 The three sets are merged into a single ordered hardfork schedule that drives every fork-conditional code path: opcode availability, gas-cost tables, system-contract pre/post-block hooks, and resource limits.
+{% endhint %}

 {% hint style="warning" %}
 All replicas of the chain MUST use byte-identical genesis JSON.
@@ -50,7 +59,7 @@ A divergence in any single hardfork timestamp produces a fork that the rest of t

 ## Reference architecture

-The reference client is a three-stage async pipeline.
+The current implementation of stateless validator is a three-stage async pipeline.
 Each `(block, witness)` pair flows through the same stages; only the validator workers run in parallel.

 ```text
@@ -83,7 +92,8 @@ A custom implementation can collapse the pipeline into a single sequential loop
 ## Validation pipeline

 The per-block sequence below is what `validate_block` performs.
-A re-implementation MUST run every numbered step; reorderings are allowed only where they preserve all stated invariants.
+A different implementation MUST run every numbered step.
+Reorderings are allowed only when they preserve the data dependencies between steps — most importantly, the SALT proof MUST verify (step 3) before any state is read (steps 4+), bytecode MUST be hash-verified before being installed in the cache, and each header recompute MUST include all the state changes it commits to.
 {% stepper %}

@@ -94,7 +104,9 @@ A re-implementation MUST run every numbered step; reorderings are allowed only w
 Call `eth_getBlockByHash` (or `eth_getBlockByNumber`) for the block, and [`mega_getBlockWitness`](witness.md) for the witness.
 Pin the witness call to `(blockNumber, blockHash)`; a `blockNumber`-only call is non-deterministic across forks.

-The reference client fetches both in parallel from independent RPC pools — see [`get_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L402) and [`get_witness`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L530).
+{% hint style="info" %}
+**Reference impl.** Fetches both in parallel from independent RPC pools — see [`get_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L402) and [`get_witness`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L530).
+{% endhint %}

 {% endstep %}

@@ -139,9 +151,12 @@ A malicious witness producer that left a key out (rather than proving it empty)
 The verifier MUST treat "absent from `kvs`" as a fatal error, not as "empty".
 {% endhint %}

-In the reference client, this backend is [`WitnessDatabase`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/evm_database.rs#L64), which implements [`revm::DatabaseRef`](https://docs.rs/revm/latest/revm/trait.DatabaseRef.html).
 A custom implementation needs the equivalent surface for whatever EVM it embeds.

+{% hint style="info" %}
+**Reference impl.** This backend is [`WitnessDatabase`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/evm_database.rs#L64), which implements [`revm::DatabaseRef`](https://docs.rs/revm/latest/revm/trait.DatabaseRef.html).
+{% endhint %}
+
 {% endstep %}

 {% step %}
@@ -152,7 +167,7 @@ Account entries in the witness carry the `codehash`, not the bytecode itself.
 This is intentional — bytecode is large, changes infrequently, and is content-addressed, so the witness only references it.

 Maintain a local cache keyed by `codehash`.
-On a miss, fetch via `eth_getCodeByHash` — a MegaETH RPC extension that takes a code hash and returns the bytecode whose `keccak256` equals that hash — and **verify** that `keccak256(code) == codehash` before installing it.
+On a miss, fetch via `eth_getCodeByHash` — a MegaETH RPC extension that takes a code hash and returns the bytecode whose `keccak256` equals that hash — and **verify** that `keccak256(code) == codehash` before using it.
 If the endpoint does not support `eth_getCodeByHash`, fall back to `eth_getCode` against a known holder address and apply the same verification.
 A miss that cannot be resolved is a fatal error for the block being validated.

@@ -162,8 +177,26 @@ A miss that cannot be resolved is a fatal error for the block being validated.

 ### Apply pre-execution system updates

-Before the first transaction, apply OP-Stack pre-block hooks (e.g. L1-attributes deposit).
-The exact set is fixed by the active hardfork; mirror the reference client's `replay_block` to stay aligned.
+Before the first transaction, apply hardfork-conditional system calls.
+Two layers run in order:
+
+1. **OP-Stack base hooks** (always active on MegaETH, since Isthmus is the floor):
+   - EIP-2935 history-storage contract update.
+   - EIP-4788 beacon root contract update.
+
+2. **MegaEVM system-contract deployments / updates**, gated by the active MegaETH hardfork:
+   - **MiniRex** — deploy the oracle contract and the high-precision timestamp oracle contract.
+   - **Rex2** — deploy the keyless-deploy contract.
+   - **Rex4** — deploy the access-control contract and the `MegaLimitControl` contract.
+   - **Rex5** — deploy `SequencerRegistry` (first-activation block only) and apply any pending role changes that are due.
+
+The L1-attributes deposit is **not** a pre-block hook: it is the block's first transaction and runs in the regular tx loop in step 7.
+
+The exact hook set is fixed by the active hardfork — see [System Contracts](https://docs.megaeth.com/spec/system-contracts/overview) for the canonical addresses and behaviors.
+
+{% hint style="info" %}
+**Reference impl.** [`apply_pre_execution_changes`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L364) inside [`replay_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L252) delegates to mega-evm's [`pre_execution_changes`](https://github.com/megaeth-labs/mega-evm/blob/main/crates/mega-evm/src/block/executor.rs#L172), which runs the OP-Stack base hooks followed by the MegaEVM-specific system-contract deployments listed above.
+{% endhint %}

 {% endstep %}

@@ -175,9 +208,12 @@ Execute every transaction with the chain's hardfork rules and accumulate state c

 A custom EVM must match MegaEVM's semantics exactly — see [Re-execution requirements](#re-execution-requirements).

-The reference client wires this through [`MegaBlockExecutorFactory` and `MegaEvmFactory`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L273) from the [`mega-evm`](https://github.com/megaeth-labs/mega-evm) crate, which extends `revm` rather than forking it.
 Re-implementers can either link `mega-evm` directly or build a compatible EVM from the [Specification](https://docs.megaeth.com/spec/megaevm/dual-gas-model).
+{% hint style="info" %}
+**Reference impl.** Wires this through [`MegaBlockExecutorFactory` and `MegaEvmFactory`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L273) from the [`mega-evm`](https://github.com/megaeth-labs/mega-evm) crate, which extends `revm` rather than forking it.
+{% endhint %}
+
 The `BLOCKHASH` opcode is served from the witnessed [EIP-2935](https://eips.ethereum.org/EIPS/eip-2935) history-storage contract entries — see [`evm_database.rs:149`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/evm_database.rs#L149).
 There is no separate "ancestor headers" field in the witness.

@@ -187,8 +223,13 @@ There is no separate "ancestor headers" field in the witness.

 ### Apply post-execution system updates

-After the last transaction, apply OP-Stack post-block hooks: withdrawals processing on the L1 message-passer contract, beacon-root updates, and any hardfork-specific finalization.
-As with pre-execution, the exact set is fixed by the active hardfork.
+After the last transaction, apply hardfork-conditional post-block system calls — including primarily EIP-7002 withdrawal-request and EIP-7251 consolidation-request processing on Isthmus+.
+The `withdrawals_root` is **not** computed here: it is recomputed separately in step 9 against the L1 message-passer storage trie.
+As with pre-execution, the exact hook set is fixed by the active hardfork.
+
+{% hint style="info" %}
+**Reference impl.** [`apply_post_execution_changes`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L373) inside [`replay_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L252) delegates to op-reth's `BlockExecutor` for the canonical hook list.
+{% endhint %}

 {% endstep %}

@@ -201,7 +242,9 @@ As with pre-execution, the exact set is fixed by the active hardfork.
 Apply the block's withdrawal-message writes against this trie, then recompute the root.
 This must match `block.withdrawals_root`.

-The reference path is [`MptWitness::verify`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/withdrawals.rs#L80).
+{% hint style="info" %}
+**Reference impl.** [`MptWitness::verify`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/withdrawals.rs#L80).
+{% endhint %}

 {% endstep %}

@@ -210,7 +253,10 @@ The reference path is [`MptWitness::verify`](https://github.com/megaeth-labs/sta
 ### Apply state changes to SALT and recompute `state_root`

 Flatten the EVM's collected state changes into `(SaltKey, SaltValue)` pairs.
-The reference client uses two intermediate types — `PlainKey` (account address or `address ++ slot`) and `PlainValue` (encoded account or 32-byte slot) — defined in [`crates/stateless-core/src/data_types.rs`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/data_types.rs).
+
+{% hint style="info" %}
+**Reference impl.** Uses two intermediate types — `PlainKey` (account address or `address ++ slot`) and `PlainValue` (encoded account or 32-byte slot) — defined in [`crates/stateless-core/src/data_types.rs`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/data_types.rs).
+{% endhint %}

 Encoding rules (mirrored on [Get Block Witness](witness.md#saltvalue) for the reverse direction):

@@ -240,7 +286,10 @@ The block validates only if **every** check below passes:
 | `gas_used` | Cumulative gas counter from replay. |

 A single mismatch is a fatal error for the block — do **not** advance the local chain.
-The reference comparisons are at [`executor.rs:534-559`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L534).
+
+{% hint style="info" %}
+**Reference impl.** Comparisons are at [`executor.rs:534-559`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L534).
+{% endhint %}

 {% endstep %}

@@ -248,7 +297,7 @@ The reference comparisons are at [`executor.rs:534-559`](https://github.com/mega

 ### Advance the local chain

-If all checks pass, persist the block as the new tip and (optionally) report it via `mega_setValidatedBlocks`.
+If all checks pass, persist the block as the new tip.

 If the validated block's `parent_hash` does not match the previous tip, treat it as a reorg: walk back to the divergence and re-validate forward along the new branch.

@@ -270,7 +319,9 @@ Each link points to the normative specification.
 | Volatile data | [Volatile Data Access](../dev/execution/volatile-data.md) — non-deterministic reads and how they are handled at re-execution. |
 | Hardfork schedule | The genesis JSON the validator is started with. Mirror the same schedule in your own client. |

-If `block_replay_time_seconds` (in the reference client) exceeds the chain's block period, you are not real-time — diagnose with the per-stage histograms in [Stateless Validation](stateless-validation.md#useful-metrics).
+{% hint style="info" %}
+**Reference impl.** If `block_replay_time_seconds` exceeds the chain's block period, you are not real-time — diagnose with the per-stage histograms in [Stateless Validation](stateless-validation.md#useful-metrics).
+{% endhint %}

 ## Trust model and reorgs

@@ -285,8 +336,9 @@ Everything downstream is verified:
 - Bytecode is verified by recomputing `keccak256(code)` on every cache miss.
 - The post-state is verified by recomputing every header commitment.

-The validator does **not** decide which fork is canonical.
-To derive canonicality from L1 instead of trusting the upstream RPC, pair it with `op-node` and a MegaETH replica — see [Stateless Validation > Trust model](stateless-validation.md#trust-model).
+The validator does **not** verify:
+
+- **Block canonicity.** The validator validates whatever block sequence is fed to it; it does not decide which fork is canonical. To derive canonicality from L1 instead of trusting the upstream RPC, pair the validator with `op-node`, which derives the canonical block sequence directly from L1, and the replica feeds those blocks to the validator.
 Reorgs are detected when a freshly validated block's `parent_hash` does not match the local tip.
 The chain advancer truncates back to the divergence height and re-validates the new branch from there; the canonical-chain row cap (`canonical-chain-max-length`) bounds how far back this can reach.

From 0a2c9578ba53fcb764d2439482eac301faedec20 Mon Sep 17 00:00:00 2001
From: "liquan.eth"
Date: Fri, 8 May 2026 17:30:17 +0800
Subject: [PATCH 09/15] fix review

---
 docs/node/validator-architecture.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md
index 362880d..d728681 100644
--- a/docs/node/validator-architecture.md
+++ b/docs/node/validator-architecture.md
@@ -188,7 +188,6 @@ Two layers run in order:
    - **MiniRex** — deploy the oracle contract and the high-precision timestamp oracle contract.
    - **Rex2** — deploy the keyless-deploy contract.
    - **Rex4** — deploy the access-control contract and the `MegaLimitControl` contract.
-   - **Rex5** — deploy `SequencerRegistry` (first-activation block only) and apply any pending role changes that are due.

 The L1-attributes deposit is **not** a pre-block hook: it is the block's first transaction and runs in the regular tx loop in step 7.

@@ -338,7 +337,8 @@ Everything downstream is verified:

 The validator does **not** verify:

-- **Block canonicity.** The validator validates whatever block sequence is fed to it; it does not decide which fork is canonical. To derive canonicality from L1 instead of trusting the upstream RPC, pair the validator with `op-node`, which derives the canonical block sequence directly from L1, and the replica feeds those blocks to the validator.
+- **Block canonicity.** The validator validates whatever block sequence is fed to it; it does not decide which fork is canonical.
+  To derive canonicality from L1 instead of trusting the upstream RPC, pair the validator with `op-node`, which derives the canonical block sequence directly from L1, and the replica feeds those blocks to the validator.

 Reorgs are detected when a freshly validated block's `parent_hash` does not match the local tip.
 The chain advancer truncates back to the divergence height and re-validates the new branch from there; the canonical-chain row cap (`canonical-chain-max-length`) bounds how far back this can reach.

From bfe4b5e2605c9ade69039ce27a894e17bf431afd Mon Sep 17 00:00:00 2001
From: "liquan.eth"
Date: Sat, 9 May 2026 08:50:34 +0800
Subject: [PATCH 10/15] docs(node): address PR #46 round 5 nits
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

- Fix grammar: 'implementation of stateless validator' → 'implementation of the stateless validator' (missing article).
- List the MegaETH forks that deploy no system contracts (MiniRex1, MiniRex2, Rex, Rex1, Rex3) so the per-hardfork hook list is exhaustive. Verified against mega-evm/crates/mega-evm/src/block/executor.rs (only MiniRex, Rex2, Rex4 install contracts in pre_execution_changes).

---
 docs/node/validator-architecture.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md
index d728681..5f28f45 100644
--- a/docs/node/validator-architecture.md
+++ b/docs/node/validator-architecture.md
@@ -59,7 +59,7 @@ A divergence in any single hardfork timestamp produces a fork that the rest of t

 ## Reference architecture

-The current implementation of stateless validator is a three-stage async pipeline.
+The current implementation of the stateless validator is a three-stage async pipeline.
 Each `(block, witness)` pair flows through the same stages; only the validator workers run in parallel.
 ```text
@@ -188,6 +188,7 @@ Two layers run in order:
    - **MiniRex** — deploy the oracle contract and the high-precision timestamp oracle contract.
    - **Rex2** — deploy the keyless-deploy contract.
    - **Rex4** — deploy the access-control contract and the `MegaLimitControl` contract.
+   - **MiniRex1, MiniRex2, Rex, Rex1, Rex3** — no new system-contract deployments. The fork still gates EVM behavior changes; the pre-execution hook list is just empty.

 The L1-attributes deposit is **not** a pre-block hook: it is the block's first transaction and runs in the regular tx loop in step 7.

From d1913cf6d3bf727c41fad205f4224e9c30ba1add Mon Sep 17 00:00:00 2001
From: "liquan.eth"
Date: Sat, 9 May 2026 09:01:44 +0800
Subject: [PATCH 11/15] fix review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The link goes to the Dual Gas Model spec page, not the spec root, so 'Specification' alone misled readers. Use 'MegaEVM specification — Dual Gas Model' on both the inline reference (re-execution section) and in Related pages, and note that the linked page is the entry point into the spec space.

---
 docs/node/validator-architecture.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md
index 5f28f45..31c8fb2 100644
--- a/docs/node/validator-architecture.md
+++ b/docs/node/validator-architecture.md
@@ -208,7 +208,7 @@ Execute every transaction with the chain's hardfork rules and accumulate state c

 A custom EVM must match MegaEVM's semantics exactly — see [Re-execution requirements](#re-execution-requirements).

-Re-implementers can either link `mega-evm` directly or build a compatible EVM from the [Specification](https://docs.megaeth.com/spec/megaevm/dual-gas-model).
+Re-implementers can either link `mega-evm` directly or build a compatible EVM from the [MegaEVM specification — Dual Gas Model](https://docs.megaeth.com/spec/megaevm/dual-gas-model) and the related spec pages linked under [Re-execution requirements](#re-execution-requirements).

 {% hint style="info" %}
 **Reference impl.** Wires this through [`MegaBlockExecutorFactory` and `MegaEvmFactory`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L273) from the [`mega-evm`](https://github.com/megaeth-labs/mega-evm) crate, which extends `revm` rather than forking it.
@@ -366,4 +366,4 @@ Companion repositories:
 - [Get Block Witness](witness.md) — wire format, decoding pipeline, and field-by-field type definitions for the witness payload.
 - [Stateless Validation](stateless-validation.md) — operator guide for running the reference client (CLI, metrics, anchoring, troubleshooting).
 - [Architecture](../architecture.md) — how transactions move through MegaETH and where validators fit in.
-- [Specification](https://docs.megaeth.com/spec/megaevm/dual-gas-model) — normative MegaEVM behavior.
+- [MegaEVM specification — Dual Gas Model](https://docs.megaeth.com/spec/megaevm/dual-gas-model) — normative MegaEVM behavior; entry point into the spec space.

From dc6263657aba2a02d9f6eaceab58195819e4bc19 Mon Sep 17 00:00:00 2001
From: "liquan.eth"
Date: Sat, 9 May 2026 09:22:59 +0800
Subject: [PATCH 12/15] fix review
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

The '— MegaETH is iterating quickly, ...' clause is removed; the bolded imperative carries the message on its own.
--- docs/node/validator-architecture.md | 19 ++++++++++++------- 1 file changed, 12 insertions(+), 7 deletions(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index 31c8fb2..2cc0d9c 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -5,7 +5,7 @@ description: Architecture and implementation guide for building a MegaETH-compat # Validator architecture This page describes the reference architecture of MegaETH's stateless validator and the per-block validation pipeline it runs. -It is written for engineers building a compatible validator from scratch — in another language, against a different EVM stack, or against a custom workload. +It is written for engineers building a compatible validator from scratch — in another language or against a different EVM stack. The reference implementation lives at [`megaeth-labs/stateless-validator`](https://github.com/megaeth-labs/stateless-validator) and is used throughout this page as the source of truth. For day-to-day operation of that client, see [Stateless Validation](stateless-validation.md). @@ -32,7 +32,8 @@ The validator validates whatever block sequence it is fed; pair it with a consen The genesis JSON is the validator's primary configuration anchor. Misconfigure it and every subsequent fork-conditional check silently runs against the wrong rules — the validator will produce mismatched state roots with no "wrong chain" error to point you at the cause. Treat it like a chain-identity contract: load it once, persist it, and never edit it by hand. -Find the canonical mainnet genesis at [`test_data/mainnet/genesis.json`](https://github.com/megaeth-labs/stateless-validator/blob/main/test_data/mainnet/genesis.json) in the stateless-validator repo, and pull the updated copy whenever a new hardfork is scheduled. 
+**Pull a fresh copy of the canonical mainnet genesis whenever a new hardfork is scheduled.** +For the file layout (not a runtime artifact), see the schema-shaped sample at [`test_data/mainnet/genesis.json`](https://github.com/megaeth-labs/stateless-validator/blob/main/test_data/mainnet/genesis.json) — `alloc` is stripped to keep the repo small. {% hint style="info" %} **Reference impl.** Loads genesis via `--genesis-file` on first run, stores it in the local database with [`store_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/bin/stateless-validator/src/validator_db.rs#L88), and re-reads the stored copy on every subsequent boot. @@ -45,7 +46,7 @@ Despite the file carrying the full Genesis schema (allocations, gas limit, times | Chain ID | `chainId` | Drives the EVM `CHAINID` opcode and EIP-155 transaction-signature checks. | | Hardfork schedule | `Block` and `Time` fields | Activates Ethereum (Cancun, Shanghai, ...), OP-Stack (Ecotone, Granite, Holocene, Isthmus, ...), and MegaETH (MiniRex, MiniRex1, MiniRex2, Rex, Rex1, Rex2, Rex3, Rex4) at their pre-declared block numbers or timestamps. | -The genesis `alloc`, `gasLimit`, `baseFeePerGas`, and other initial-state fields are **not** consumed — initial state arrives via the witness, not from genesis. +The genesis `alloc`, `gasLimit`, `baseFeePerGas`, and other initial-state fields are **not** consumed — once the chain has produced a single block, initial state is served by the witness, not by the genesis file. 
{% hint style="info" %} **Reference impl.** [`ChainSpec::from_genesis`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L59) reads `genesis.config.chain_id` directly, hands the full `Genesis` to `OpChainSpec::from_genesis` to extract Ethereum and OP-Stack fork conditions, and pulls MegaETH-specific forks via [`MegaethGenesisHardforks::extract_from`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/chain_spec.rs#L123). @@ -188,7 +189,8 @@ Two layers run in order: - **MiniRex** — deploy the oracle contract and the high-precision timestamp oracle contract. - **Rex2** — deploy the keyless-deploy contract. - **Rex4** — deploy the access-control contract and the `MegaLimitControl` contract. - - **MiniRex1, MiniRex2, Rex, Rex1, Rex3** — no new system-contract deployments. The fork still gates EVM behavior changes; the pre-execution hook list is just empty. + - **MiniRex1, MiniRex2, Rex, Rex1, Rex3** — no new system-contract deployments. + The fork still gates EVM behavior changes; the pre-execution hook list is just empty. The L1-attributes deposit is **not** a pre-block hook: it is the block's first transaction and runs in the regular tx loop in step 7. @@ -327,8 +329,10 @@ Each link points to the normative specification. The validator has two trust inputs, both supplied at startup: -- The **genesis JSON** — supplies the chain ID and hardfork schedule (see [Genesis configuration](#genesis-configuration)). Persisted on first run; reused thereafter. -- The **anchor block hash** — pins the chain head. The next validated block's `parent_hash` must equal this value. +- The **genesis JSON** — supplies the chain ID and hardfork schedule (see [Genesis configuration](#genesis-configuration)). + Persisted on first run; reused thereafter. +- The **anchor block hash** — pins the chain head. + The next validated block's `parent_hash` must equal this value. 
Everything downstream is verified: @@ -358,7 +362,8 @@ The crates below are the entry points a re-implementation will most often want t Companion repositories: -- [`megaeth-labs/salt`](https://github.com/megaeth-labs/salt) — the authenticated key-value store and IPA proof system. Defines [`SaltWitness`](https://github.com/megaeth-labs/salt/blob/main/salt/src/proof/salt_witness.rs#L46), [`SaltKey`](https://github.com/megaeth-labs/salt/blob/main/salt/src/types.rs#L198), [`SaltValue`](https://github.com/megaeth-labs/salt/blob/main/salt/src/types.rs#L274), and [`SaltProof`](https://github.com/megaeth-labs/salt/blob/main/salt/src/proof/prover.rs#L103). +- [`megaeth-labs/salt`](https://github.com/megaeth-labs/salt) — the authenticated key-value store and IPA proof system. + Defines [`SaltWitness`](https://github.com/megaeth-labs/salt/blob/main/salt/src/proof/salt_witness.rs#L46), [`SaltKey`](https://github.com/megaeth-labs/salt/blob/main/salt/src/types.rs#L198), [`SaltValue`](https://github.com/megaeth-labs/salt/blob/main/salt/src/types.rs#L274), and [`SaltProof`](https://github.com/megaeth-labs/salt/blob/main/salt/src/proof/prover.rs#L103). - [`megaeth-labs/mega-evm`](https://github.com/megaeth-labs/mega-evm) — the MegaEVM execution layer, layered on top of `revm`. ## Related pages From 4ae7aa15474f23a5bbd705c2aec5ba5aa4a29a39 Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Sat, 9 May 2026 09:47:23 +0800 Subject: [PATCH 13/15] fix the code line --- docs/node/validator-architecture.md | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index 2cc0d9c..1f84627 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -106,7 +106,7 @@ Call `eth_getBlockByHash` (or `eth_getBlockByNumber`) for the block, and [`mega_ Pin the witness call to `(blockNumber, blockHash)`; a `blockNumber`-only call is non-deterministic across forks. 
{% hint style="info" %} -**Reference impl.** Fetches both in parallel from independent RPC pools — see [`get_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L402) and [`get_witness`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L530). +**Reference impl.** Fetches both in parallel from independent RPC pools — see [`get_block`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L430) and [`get_witness`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L558). {% endhint %} {% endstep %} From e67f3de36ef29e2d8dd7f0ebdd9fc0bb06409f04 Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Sat, 9 May 2026 10:02:55 +0800 Subject: [PATCH 14/15] add eth_getCodeByHash --- docs/node/validator-architecture.md | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index 1f84627..584bc94 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -168,8 +168,9 @@ Account entries in the witness carry the `codehash`, not the bytecode itself. This is intentional — bytecode is large, changes infrequently, and is content-addressed, so the witness only references it. Maintain a local cache keyed by `codehash`. -On a miss, fetch via `eth_getCodeByHash` — a MegaETH RPC extension that takes a code hash and returns the bytecode whose `keccak256` equals that hash — and **verify** that `keccak256(code) == codehash` before using it. -If the endpoint does not support `eth_getCodeByHash`, fall back to `eth_getCode` against a known holder address and apply the same verification. 
+On a miss, fetch via [`eth_getCodeByHash`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs#L398) — a MegaETH RPC extension that takes a code hash and returns the bytecode whose `keccak256` equals that hash — and **verify** that `keccak256(code) == codehash` before using it. +If the endpoint does not support `eth_getCodeByHash`, fall back to `eth_getCode` against a known holder address, and **always pin the call to the exact block at which the witness anchors** (the parent block's number). +Apply the same `keccak256(code) == codehash` verification to the result. A miss that cannot be resolved is a fatal error for the block being validated. {% endstep %} From 53da99250d8de14c3a82dc956c8ec232b99fe78e Mon Sep 17 00:00:00 2001 From: "liquan.eth" Date: Sat, 9 May 2026 10:17:29 +0800 Subject: [PATCH 15/15] fix review --- docs/node/validator-architecture.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/docs/node/validator-architecture.md b/docs/node/validator-architecture.md index 584bc94..248331c 100644 --- a/docs/node/validator-architecture.md +++ b/docs/node/validator-architecture.md @@ -85,7 +85,7 @@ Each `(block, witness)` pair flows through the same stages; only the validator w | Block fetcher | Streams `(block, witness)` pairs from RPC. Independent semaphores cap data and witness concurrency. | [`crates/stateless-common/src/rpc_client.rs`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-common/src/rpc_client.rs) | | Validator worker | Verifies the witness, replays the block, computes post-state, compares against the header. | [`crates/stateless-core/src/executor.rs:411`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/executor.rs#L411) (`validate_block`) | | Chain advancer | Reorders out-of-order results, detects reorgs by parent-hash mismatch, persists in height order. 
| [`crates/stateless-core/src/pipeline/mod.rs:44`](https://github.com/megaeth-labs/stateless-validator/blob/main/crates/stateless-core/src/pipeline/mod.rs#L44) (`run_pipeline`) | -| Contract cache | Resolves contract bytecode by code hash with three tiers: in-memory → disk (redb) → RPC. | `crates/stateless-db/` | +| Contract cache | Resolves contract bytecode by code hash with three tiers: in-memory → disk (redb) → RPC. | [`crates/stateless-db/`](https://github.com/megaeth-labs/stateless-validator/tree/main/crates/stateless-db) | Workers do not coordinate. A custom implementation can collapse the pipeline into a single sequential loop without changing correctness — parallelism is purely a throughput choice. @@ -310,7 +310,7 @@ If the validated block's `parent_hash` does not match the previous tip, treat it ## Re-execution requirements -A custom EVM must implement standard Cancun/Shanghai semantics **plus** the MegaEVM-specific extensions below. +A custom EVM must implement [OP-Stack Isthmus](https://docs.megaeth.com/spec/overview) semantics — MegaETH's baseline, inherited unless explicitly overridden — **plus** the MegaEVM-specific extensions below. Each link points to the normative specification. | Topic | Reference |