Fix edge publishes with dynamic core ACK quorum#405
Conversation
d0fc724 to
9673062
Compare
| ); | ||
| } | ||
| const registeredAccountId = BigInt( | ||
| await this.contracts.publishingConvictionAccount.agentToAccountId(txSigner.address), |
There was a problem hiding this comment.
🔴 Bug: This preflight is checking agentToAccountId on PublishingConvictionAccount, but that getter exists on DKGPublishingConvictionNFT and KnowledgeAssetsV10.publish() also resolves the paying account through the NFT contract. As written, any gateway publish that sets publisherConvictionAccountId will fail before broadcast (or validate the wrong account model). Please switch this lookup to the same NFT-backed registration source that publish() uses.
| // Do not force an edge-owned CG to mint or reuse a node identity just to | ||
| // satisfy legacy ContextGraphStorage.hostingNodes metadata. | ||
| const effectiveParticipantIdentityIds = participantIdentityIds; | ||
| const effectiveRequiredSignatures = Number.isInteger(storedRequiredSignatures) && storedRequiredSignatures >= 0 |
There was a problem hiding this comment.
🔴 Bug: Allowing requiredSignatures to persist as 0 here creates on-chain context graphs that the existing verify flow cannot use. DKGAgent.verify() still treats the chain value as a strictly positive quorum and throws when getContextGraphConfig().requiredSignatures < 1 (dkg-agent.ts:5692-5709). Either keep the old floor here or update the verify path in the same PR so zero-quorum CGs remain operable.
| } | ||
| } | ||
|
|
||
| if (this.config.isSignerRegistered) { |
There was a problem hiding this comment.
🔴 Bug: The gateway only checks isSignerRegistered() before signing. After this PR, KnowledgeAssetsV10._verifySignature() also requires the publisher identity to be an active sharding-table core node, so a core that has dropped out of the active set will continue handing out signatures that deterministically revert on-chain. Please gate gateway signing on active-core membership as well (or reuse the same identity check the ACK path uses).
9673062 to
9ce2e42
Compare
| } | ||
|
|
||
| function resolvePublishGatewayForCommand(opts: ActionOpts, stored?: PublisherGatewayConfig): PublisherGatewayConfig | undefined { | ||
| if (!opts.publishGatewayPeer && !opts.pcaAccount && !opts.paymaster) return undefined; |
There was a problem hiding this comment.
🔴 Bug: This early return makes the saved publisher.gateway config effectively dead unless the user also passes one of the per-command flags. After dkg publisher gateway set ..., a plain dkg shared-memory publish still publishes without the gateway. Fall back to stored here and treat CLI flags as overrides, rather than requiring a flag to opt into the saved config.
| pcaAccountId: this.config.publishGateway?.pcaAccountId, | ||
| paymaster: this.config.publishGateway?.paymaster, | ||
| isSignerRegistered: async () => { | ||
| const isOperationalWalletRegistered = this.chain.isOperationalWalletRegistered; |
There was a problem hiding this comment.
🔴 Bug: The gateway liveness check only verifies that this wallet is still registered for the identity. KnowledgeAssetsV10._verifySignature() now also requires the publisher identity to be an active sharding-table core, so a node that has been unstaked/evicted will keep signing gateway responses and every publish will fail later on-chain. Wire this through verifyACKIdentity (or an equivalent active-core check) instead of isOperationalWalletRegistered alone.
| ); | ||
| } | ||
| if ( | ||
| params.publisherConvictionAccountId !== undefined && |
There was a problem hiding this comment.
🟡 Issue: The mock only checks that the requested conviction account exists, but the real EVM adapter also verifies that the transaction signer is registered to that account via agentToAccountId(txSigner.address). That means mock-backed tests can now pass for a gateway publish that will revert on a real chain when the sender is bound to a different account. Mirror the signer/account binding check here so tests keep parity with production behavior.
60ffdcc to
dd909bd
Compare
| this.config = config; | ||
| } | ||
|
|
||
| handler = async (data: Uint8Array, _peerId: PeerId): Promise<Uint8Array> => { |
There was a problem hiding this comment.
🔴 Bug: The requester identity is ignored here, so any connected peer can ask this core to sign publisher digests. When paymaster is configured, an untrusted peer can get free publishes charged to that paymaster on open context graphs, and even without a paymaster they can still attribute arbitrary produced value to this node. Gate publish-gateway requests on an explicit allowlist or another authorization check tied to peerId before signing.
| randomSamplingWalPath: config.randomSampling?.walPath, | ||
| randomSamplingTickIntervalMs: config.randomSampling?.tickIntervalMs, | ||
| randomSamplingUseWorkerThread: config.randomSampling?.useWorkerThread, | ||
| publishGateway: normalizePublisherGatewayConfig(config.publisher?.gateway), |
There was a problem hiding this comment.
🔴 Bug: config.publisher.gateway is created by dkg publisher gateway set as the preferred remote gateway, but this line now feeds it into DKGAgentConfig.publishGateway, which is also used for the local handler/profile on core nodes. A core that sets an upstream gateway will start advertising and serving a local gateway with the upstream PCA/paymaster settings. Split outbound gateway preference from local gateway capability instead of reusing the same config object.
| if (typeof gateway.peerId !== 'string' || gateway.peerId.trim().length === 0) { | ||
| throw new Error('"publishGateway.peerId" is required'); | ||
| } | ||
| if (typeof gateway.nodeIdentityId !== 'string' && typeof gateway.nodeIdentityId !== 'number' && typeof gateway.nodeIdentityId !== 'bigint') { |
There was a problem hiding this comment.
🟡 Issue: Accepting numeric nodeIdentityId/pcaAccountId values is unsafe here. JSON numbers above Number.MAX_SAFE_INTEGER are already rounded before BigInt(String(value)) runs, so large on-chain IDs can be silently corrupted. Require these fields to be strings (or explicitly reject unsafe integers) to avoid misrouting publishes to the wrong identity/account.
dd909bd to
bfa8214
Compare
| this.config = config; | ||
| } | ||
|
|
||
| handler = async (data: Uint8Array, _peerId: PeerId): Promise<Uint8Array> => { |
There was a problem hiding this comment.
🔴 Bug: _peerId is ignored here, so any connected peer can request a valid publisher signature from this core. When a paymaster is configured, that also gives arbitrary peers a way to route their own publish txs through that paymaster, because the only check below is “is this paymaster valid?”, not “is this requester allowed to use it?”. Gate this protocol on the requesting peer / publisher identity before returning a signature.
| `Publish gateway did not confirm requested PCA account ${options.publishGateway.pcaAccountId}`, | ||
| ); | ||
| } | ||
| if (gatewaySignature.pcaAccountId !== undefined) { |
There was a problem hiding this comment.
🔴 Bug: this accepts pcaAccountId and paymaster from the gateway response even when the caller did not request them. A buggy or compromised gateway can silently switch the publish onto a conviction-account or paymaster path the publisher never opted into. Only apply these optional fields when they were explicitly requested in options.publishGateway, otherwise reject unsolicited values.
| if (requiredSignatures == 0 || requiredSignatures > hostCount) { | ||
| revert KnowledgeAssetsLib.InvalidContextGraphConfig("invalid M/N threshold"); | ||
| } | ||
| _contextGraphs[contextGraphId].requiredSignatures = requiredSignatures; |
There was a problem hiding this comment.
🔴 Bug: removing the quorum validation here makes the contract accept requiredSignatures values that the shipped clients still treat as invalid. The agent verify path throws on < 1 and the daemon/API still rejects zero or inconsistent thresholds, so a CG updated here can become unverifiable/unmanageable through this repo’s own tooling. Either keep the storage invariant, or update the off-chain consumers in the same PR.
bfa8214 to
51478d4
Compare
| } | ||
| } | ||
|
|
||
| if (paymaster) { |
There was a problem hiding this comment.
🔴 Bug: A configured paymaster is still usable when allowedPeers is empty, so any connected peer can turn this core into an open sponsor endpoint. The docstring above already calls that out as unsafe. Fail closed here (or in the constructor) whenever a paymaster is configured without a non-empty allowlist.
| pcaAccount?: string; | ||
| paymaster?: string; | ||
| }): PublisherGatewayConfig { | ||
| return { |
There was a problem hiding this comment.
🟡 Issue: This accepts both --pca-account and --paymaster, but the new publish path rejects that combination later in DKGPublisher/EVMChainAdapter. Right now the CLI can persist an impossible gateway config and the user only finds out at publish time after the network round-trip. Reject the combination when building/saving the config.
| nodeRole: this.config.nodeRole ?? 'edge', | ||
| nodeRole: role, | ||
| nodeIdentityId: nodeIdentityId > 0n ? nodeIdentityId : undefined, | ||
| publishGateway: role === 'core' && nodeIdentityId > 0n |
There was a problem hiding this comment.
🟡 Issue: This advertises publish-gateway support for every core with a non-zero identity, even though the handler is only registered after attemptStorageACKRegistration() succeeds and it is explicitly unregistered when that signer drops off-chain. Peers can discover a gateway from the profile and then hit a missing protocol/timeout. Tie the profile flag to the actual handler-registration state instead of role && nodeIdentityId > 0.
51478d4 to
7bd3a1b
Compare
| ); | ||
| }, | ||
| }, this.eventBus); | ||
| const isGatewayActiveCore = async () => { |
There was a problem hiding this comment.
🔴 Bug: This PR makes on-chain signature validity depend on active sharding-table membership, but the StorageACKHandler configured just above still only re-checks operational-wallet registration before signing. Once a core is evicted from ShardingTableStorage, it will keep emitting ACKs until restart, and publishers can collect a quorum that now deterministically reverts on-chain with ACK signer is not an active core node. Pass the same active-core probe into StorageACKHandler (and unregister/refuse on false) instead of applying it only to the publish-gateway path.
| if (hostingNodes.length == 0) { | ||
| revert KnowledgeAssetsLib.InvalidContextGraphConfig("empty hosting nodes"); | ||
| } | ||
| if (hostingNodes.length > MAX_HOSTING_NODES) { |
There was a problem hiding this comment.
🔴 Bug: This relaxes the on-chain invariants for edge-owned context graphs, but the supported create entrypoints still enforce the old ones. /api/context-graph/create currently rejects requiredSignatures > participantIdentityIds.length, so a CLI/daemon client still cannot create the empty-host / legacy-quorum shape this PR is enabling here. Either update the request validation in the same PR or keep the old invariant until the off-chain create flow accepts it.
| if (requiredSignatures == 0 || requiredSignatures > hostCount) { | ||
| revert KnowledgeAssetsLib.InvalidContextGraphConfig("invalid M/N threshold"); | ||
| } | ||
| _contextGraphs[contextGraphId].requiredSignatures = requiredSignatures; |
There was a problem hiding this comment.
🔴 Bug: Allowing requiredSignatures = 0 here creates a context-graph state that current clients cannot consume. The agent-side verify flow still treats getContextGraphConfig().requiredSignatures < 1 as invalid and throws, so any graph updated to zero quorum becomes unverifiable unless callers manually override the value. Keep zero invalid here, or normalize the legacy 0 case across the agent/CLI before merging.
7bd3a1b to
d369470
Compare
Reject unsolicited PCA/paymaster from publish-gateway responses unless the caller requested them; strip the same fields in the agent gateway provider as defense in depth. Add `dkg publisher local-gateway set|clear|status` for publisher.localGateway (handler advertisement). Co-authored-by: Cursor <cursoragent@cursor.com>
d369470 to
fd686d8
Compare
| 'chain adapter cannot read conviction accounts', | ||
| ); | ||
| } | ||
| const account = await this.config.getConvictionAccountInfo(pcaAccountId); |
There was a problem hiding this comment.
🔴 Bug: getConvictionAccountInfo() is now a generic lookup that falls back to legacy PublishingConvictionAccount, but this gateway path later drives KnowledgeAssetsV10.publish() and the adapter only accepts NFT-backed agentToAccountId accounts. That means a core can confirm/advertise a legacy PCA id that will always fail at broadcast time. Please validate against the NFT-specific publish-conviction mapping here (or plumb a dedicated chain check) instead of the generic account lookup.
| contextGraphId, | ||
| merkleRoot, | ||
| ); | ||
| const signature = ethers.Signature.from( |
There was a problem hiding this comment.
🔴 Bug: this signs with signerWallet after only checking isActiveCore. If the operational key is rotated off nodeIdentityId while the identity stays active, /publish-gateway will keep returning signatures that the edge rejects in createPublishGatewayProvider() as an unregistered wallet. Add the same live signer-registration check/callback that StorageACKHandler uses and stop serving the protocol once it fails.
| // been torn down because the ACK signer is no longer confirmed | ||
| // on-chain — peers would discover a gateway and then hit a | ||
| // missing-protocol / timeout. | ||
| publishGateway: role === 'core' && nodeIdentityId > 0n && this.publishGatewayHandlerActive |
There was a problem hiding this comment.
🟡 Issue: tying profile advertisement to publishGatewayHandlerActive only affects the next publishProfile() call, but the daemon publishes the profile once at startup. If the handler comes up later via the retry loop, or is torn down after signer eviction, peers will keep seeing stale discovery data. Republish the profile when this flag flips so the advertised gateway state matches runtime reality.
Summary
hostingNodesor legacy quorum coupling.Verification
minimumRequiredSignaturesto3.1.pnpm -r --filter @origintrail-official/dkg-core --filter @origintrail-official/dkg-storage --filter @origintrail-official/dkg-query --filter @origintrail-official/dkg-publisher --filter @origintrail-official/dkg-chain --filter @origintrail-official/dkg-epcis --filter @origintrail-official/dkg-random-sampling --filter @origintrail-official/dkg-agent --filter @origintrail-official/dkg-adapter-openclaw --filter @origintrail-official/dkg-adapter-hermes --filter @origintrail-official/dkg-mcp run buildpnpm --dir packages/publisher exec vitest run test/publish-gateway-handler.test.tspnpm --dir packages/evm-module exec hardhat test --network hardhat --config hardhat.node.config.ts test/unit/ContextGraphStorage.test.ts test/unit/ContextGraphs.test.tspnpm --dir packages/evm-module exec hardhat test --network hardhat --config hardhat.node.config.ts --grep "empty-host CG|ACK from an operational key|raising quorum|duplicate identity ids|lowering quorum" test/unit/KnowledgeAssetsV10-extra.test.tsNotes
pnpm --filter @origintrail-official/dkg-evm-module test:unitstill includes unrelated intentional red audit/spec-gap cases, including Hub event/auth expectations, RandomSampling auth expectations, and the KAV10KnowledgeBatchCreatedspec-gap test.pnpm --dir packages/cli exec tsc --noEmitis still blocked by unrelated HermesChatMemoryManagertype errors inpackages/cli/src/daemon/hermes.tsandpackages/cli/src/daemon/routes/hermes.ts.