Background
The 2026-05-01 devnet incident (lake#556, lake#557) exposed a fundamental Borsh limitation: Vec<T> is encoded as (count: u32, elements: T[count]) — element count, not byte size. When an old SDK encounters a Vec<Interface> containing an element whose version it doesn't recognize, it can't skip the element to reach the next one — there's no framing telling it where the unknown element ends. The dispatcher consumes only the version byte and returns early, leaving the reader cursor stranded inside the unknown element's body. The next iteration reads garbage as the next discriminant. Cursor desync then propagates through the rest of the Vec and through the 12 Device fields after Interfaces.
This is a systemic forward-compatibility hole that recurs every time a new Interface version ships and an off-chain SDK consumer hasn't been upgraded.
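To make the failure mode concrete, here is a minimal sketch of a count-prefixed Vec reader. It is not the SDK's actual code: the element layout (a version byte followed by a 2-byte body for V1 and a 4-byte body for V2) and both function names are assumptions chosen for illustration.

```rust
/// Encodes elements the way Borsh encodes Vec<T>: a little-endian u32
/// element count, then the elements back to back with no per-element size.
fn encode_vec(elements: &[Vec<u8>]) -> Vec<u8> {
    let mut out = (elements.len() as u32).to_le_bytes().to_vec();
    for e in elements {
        out.extend_from_slice(e);
    }
    out
}

/// Hypothetical "old SDK" reader that only knows version 1 (2-byte body).
/// On an unknown version it consumes only the version byte, so the cursor
/// is left inside the unknown element's body.
/// Returns the version bytes the reader believes it saw.
fn read_versions_old_sdk(buf: &[u8]) -> Vec<u8> {
    let count = u32::from_le_bytes([buf[0], buf[1], buf[2], buf[3]]) as usize;
    let mut pos = 4;
    let mut seen = Vec::new();
    for _ in 0..count {
        if pos >= buf.len() {
            break;
        }
        let version = buf[pos];
        pos += 1;
        seen.push(version);
        if version == 1 {
            pos += 2; // known body size for V1
        }
        // Unknown version: nothing in the encoding says how many bytes to skip.
    }
    seen
}
```

With elements of versions 1, 2, 1 on the wire, this reader reports the third "version" as whatever byte happens to sit at the start of the V2 body, which is the desync described above.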
New Proposed Design
Add a new vector at the end of the Device struct to hold V3+ interfaces. For compatibility, every write stores each interface in both vectors: the new vector gets the full encoding, and the original location gets a truncated copy that matches the V2 format.
Original Proposed Design
Stop evolving Interface itself. Freeze it at the current shape (V2) forever. New per-interface fields live in a new sibling Vec on Device that is appended at the end of Device's on-chain layout:
pub struct Device {
    // ... all existing fields, unchanged ...
    pub interfaces: Vec<Interface>, // frozen at V2 shape, never evolves
    // ... rest of existing Device fields, unchanged ...
    pub max_multicast_publishers: u16, // currently the last on-chain field
    pub interface_additional: Vec<InterfaceAdditional>, // NEW, appended at end
}

pub struct InterfaceAdditional {
    pub length: u32, // total bytes of this struct after this field, used for skip-on-unknown
    pub name: String, // FK to interface.name; on-chain code MUST keep names in sync
    pub version: u8, // V1, V2, ... independent of Interface's discriminant
    // version-specific body follows
}
The on-chain program is responsible for keeping interface_additional[i].name in sync with the corresponding interfaces[j].name on every name mutation. (Names are mutable on-chain; this is the cost of using name as the join key, but no other Interface field is a better candidate.)
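A minimal sketch of that sync obligation follows. The structs are reduced to the fields needed for the join, and `rename_interface` is a hypothetical helper, not an existing instruction handler.

```rust
#[derive(Debug)]
struct Interface {
    name: String,
}

#[derive(Debug)]
struct InterfaceAdditional {
    name: String, // join key into Device.interfaces
    version: u8,
}

#[derive(Debug)]
struct Device {
    interfaces: Vec<Interface>,
    interface_additional: Vec<InterfaceAdditional>,
}

/// Renames an interface and every InterfaceAdditional entry joined to it.
/// Returns false, leaving the device untouched, if `old` does not exist
/// or `new` is already taken.
fn rename_interface(device: &mut Device, old: &str, new: &str) -> bool {
    if device.interfaces.iter().any(|i| i.name == new) {
        return false;
    }
    let iface = match device.interfaces.iter_mut().find(|i| i.name == old) {
        Some(i) => i,
        None => return false,
    };
    iface.name = new.to_string();
    // Update the sibling Vec in the same call so the two never diverge.
    for extra in device
        .interface_additional
        .iter_mut()
        .filter(|a| a.name == old)
    {
        extra.name = new.to_string();
    }
    true
}
```

Routing every name mutation through one function like this is one way to keep the join key consistent; the real program may enforce it differently.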
Why this preserves current SDKs
Current SDKs reading post-migration Device accounts:
- Read everything up through max_multicast_publishers correctly, because Interface is frozen at the V2 shape they already understand.
- Their DeserializeDevice has no code for interface_additional, so they simply stop reading. The reader cursor lands at the start of the new field but nothing tries to consume it. ByteReader returns zeros past EOF anyway, so even mistaken reads don't panic.
- They get the base interface data correctly and silently miss the additional data.
No SDK consumer needs to upgrade before the migration runs. Compare to Options A and B (below), which both require every consumer to be on the new SDK before the migration ships.
This is the property that elevates this design above any of the in-Vec framing approaches. Old CLIs, old admin tools, old monitoring scripts all keep working — they just don't see the additional fields.
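The behavior can be sketched with a toy reader. This is an illustration only: the two-field "device" layout (a one-byte `status` followed by `max_multicast_publishers`) is an assumption, and the `ByteReader` below merely mimics the zeros-past-EOF behavior described above rather than reproducing the Go implementation.

```rust
/// Minimal reader in which reads past the end of the buffer yield zeros
/// instead of failing.
struct ByteReader<'a> {
    buf: &'a [u8],
    pos: usize,
}

impl<'a> ByteReader<'a> {
    fn new(buf: &'a [u8]) -> Self {
        ByteReader { buf, pos: 0 }
    }

    fn read_u8(&mut self) -> u8 {
        let b = self.buf.get(self.pos).copied().unwrap_or(0);
        self.pos += 1;
        b
    }

    fn read_u16(&mut self) -> u16 {
        // Little-endian: low byte first.
        u16::from_le_bytes([self.read_u8(), self.read_u8()])
    }

    fn remaining(&self) -> usize {
        self.buf.len().saturating_sub(self.pos)
    }
}

/// Toy stand-in for an old deserializer: it knows `status` (hypothetical)
/// and max_multicast_publishers, and nothing after them.
/// Returns (status, max_multicast_publishers, unread trailing bytes).
fn deserialize_device_old(buf: &[u8]) -> (u8, u16, usize) {
    let mut r = ByteReader::new(buf);
    let status = r.read_u8();
    let max_pubs = r.read_u16();
    (status, max_pubs, r.remaining())
}
```

Feeding this reader an account with extra bytes appended yields the same known fields as before; the appended bytes simply show up as unread trailing data.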
Forward compat within interface_additional
The same evolution problem has to be solved within the new Vec itself. Future InterfaceAdditional versions will add fields. SDKs that know V_n but encounter V_n+1:
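- Read length (u32),
- Read name and version,
- If the version is unknown, call reader.Advance with length minus the bytes already consumed after the length field (the encoded name plus the 1-byte version) to reach the next element,
- Continue.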
Per-element length + version is the same length-prefix-per-element pattern that made Option B (below) robust on its own. We're applying it where it's needed (inside the new evolutionary container) but not paying for it on interfaces (which is now frozen).
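The skip-on-unknown loop can be sketched as follows, assuming the field order of the InterfaceAdditional struct above (length, name, version, body) and Borsh's String encoding (u32 byte length plus UTF-8). The V1 body used here (a single u16 called `payload`) and all function names are placeholders, not the real format. Rather than subtracting consumed bytes, the sketch computes each element's end offset from `length` once and jumps there.

```rust
#[derive(Debug, PartialEq)]
enum Additional {
    V1 { name: String, payload: u16 }, // hypothetical V1 body
    Unknown { name: String, version: u8 },
}

fn read_u32(buf: &[u8], pos: &mut usize) -> Option<u32> {
    let end = pos.checked_add(4)?;
    let b = buf.get(*pos..end)?;
    *pos = end;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Builds one framed element; `length` counts every byte after itself.
fn encode_element(name: &str, version: u8, body: &[u8]) -> Vec<u8> {
    let length = (4 + name.len() + 1 + body.len()) as u32;
    let mut out = length.to_le_bytes().to_vec();
    out.extend_from_slice(&(name.len() as u32).to_le_bytes());
    out.extend_from_slice(name.as_bytes());
    out.push(version);
    out.extend_from_slice(body);
    out
}

/// Reads a count-prefixed Vec of framed elements, skipping bodies whose
/// version this reader does not know. Returns None on malformed input.
fn read_interface_additional(buf: &[u8]) -> Option<Vec<Additional>> {
    let mut pos = 0usize;
    let count = read_u32(buf, &mut pos)?;
    let mut out = Vec::new();
    for _ in 0..count {
        let length = read_u32(buf, &mut pos)? as usize;
        let end = pos.checked_add(length)?; // where the next element starts
        if end > buf.len() {
            return None;
        }
        let name_len = read_u32(buf, &mut pos)? as usize;
        let name_end = pos.checked_add(name_len)?;
        if name_end >= end {
            return None; // no room left for the version byte
        }
        let name = String::from_utf8(buf[pos..name_end].to_vec()).ok()?;
        pos = name_end;
        let version = buf[pos];
        pos += 1;
        let item = if version == 1 {
            if end - pos < 2 {
                return None;
            }
            Additional::V1 { name, payload: u16::from_le_bytes([buf[pos], buf[pos + 1]]) }
        } else {
            Additional::Unknown { name, version }
        };
        out.push(item);
        pos = end; // skip any bytes this reader did not consume
    }
    Some(out)
}
```

Because the cursor is always reset to `end`, a known version that later gains trailing fields is also tolerated, which is a design choice of this sketch rather than something the proposal mandates.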
We choose a single catch-all interface_additional rather than one Vec per evolution so future fields grow into the existing structure rather than proliferating per-evolution Vecs.
Migration
Required:
- On-chain program upgrade that:
  - Treats Interface as frozen at V2 shape (drops V3-with-inline-flex-algo from the writer side).
  - Defines InterfaceAdditional and the interface_additional field on Device.
  - Adds a migration instruction that, for every existing Device, rewrites each Interface from V3 back to V2 and creates a corresponding InterfaceAdditional entry holding the moved flex_algo_node_segments data.
  - Keeps interface_additional[i].name in sync with interfaces[j].name on every Interface name mutation.
- SDK updates across Go (internal + external), TypeScript, Python: adding InterfaceAdditional deserialization. Crucially, this can roll out on a relaxed timeline because old SDKs continue to work in degraded-but-correct mode.
- Consumer rollouts: lake-indexer, CLIs, admin tools update at their own pace. Strict consumers (lake) should adopt new-format reads before downstream queries depend on extension fields.
The contrast with the original RFC-18 migration is the headline benefit: that one needed every consumer in lockstep before the migration ran on each env. This one doesn't.
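The per-Device step of the migration instruction might look roughly like the sketch below. The types are heavily simplified, the element type of flex_algo_node_segments (u16 here) is an assumption, and `migrate_device` is a hypothetical name.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Interface {
    V2 { name: String },
    V3 { name: String, flex_algo_node_segments: Vec<u16> },
}

#[derive(Debug, Clone, PartialEq)]
struct InterfaceAdditional {
    name: String,
    version: u8,
    flex_algo_node_segments: Vec<u16>,
}

#[derive(Debug)]
struct Device {
    interfaces: Vec<Interface>,
    interface_additional: Vec<InterfaceAdditional>,
}

/// Rewrites every V3 interface back to V2 and moves its
/// flex_algo_node_segments into a new InterfaceAdditional entry.
/// Interfaces that are already V2 are left alone, so a second run is a no-op.
fn migrate_device(device: &mut Device) {
    for iface in device.interfaces.iter_mut() {
        if let Interface::V3 { name, flex_algo_node_segments } = iface {
            let name = std::mem::take(name);
            let segments = std::mem::take(flex_algo_node_segments);
            device.interface_additional.push(InterfaceAdditional {
                name: name.clone(),
                version: 1,
                flex_algo_node_segments: segments,
            });
            *iface = Interface::V2 { name };
        }
    }
}
```

Making the step idempotent, as here, would let the instruction be retried safely per Device; whether the real migration needs that property is not stated in this proposal.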
Bonus cleanup
The pre-flight heuristic in deserialize.go:164 (length*18 > reader.Remaining()) hardcodes a minimum interface size that's already inaccurate post-V3. Once Interface is frozen at V2, the minimum is well-defined again — but better to drop the heuristic in favor of letting ByteReader's EOF handling do its job.
Discipline (what stays true forever)
- Interface struct shape never changes again.
- All future per-interface evolution lives in interface_additional with per-element length+version framing.
- interface_additional stays at the end of Device's on-chain layout.
- On-chain Interface name mutations propagate to corresponding interface_additional entries atomically.
Alternatives considered
A. Length-prefix the existing interfaces Vec body + sort by version
Encode the Vec as (byte_length: u32, count: u32, elements...), with on-chain code re-sorting interfaces ascending by version on every mutation. SDKs hitting an unknown version bail out of the Vec and call reader.Advance(remaining_bytes) to resync.
Why rejected: requires every off-chain consumer to upgrade before the migration runs (current SDKs can't read the new format). Also requires maintaining a sort invariant on every Interface mutator forever, and breaks any latent positional dependencies on Vec order.
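For comparison, the bail-and-resync read under Option A could be sketched as below. Several details are assumptions: byte_length is taken to cover everything after itself (the count plus all elements), V1 elements are given a 2-byte body, and a single u16 stands in for the Device fields that follow the Vec.

```rust
fn read_u32(buf: &[u8], pos: usize) -> Option<u32> {
    let b = buf.get(pos..pos.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Reads a (byte_length, count, elements...) Vec with a reader that only
/// knows version 1, then reads the u16 field that follows the Vec.
/// Returns (versions parsed before bailing, trailing field).
fn read_option_a(buf: &[u8]) -> Option<(Vec<u8>, u16)> {
    let byte_length = read_u32(buf, 0)? as usize;
    let end = 4usize.checked_add(byte_length)?; // first byte after the Vec
    if end > buf.len() {
        return None;
    }
    let count = read_u32(buf, 4)?;
    let mut pos = 8;
    let mut known = Vec::new();
    for _ in 0..count {
        if pos >= end {
            return None;
        }
        let version = buf[pos];
        if version != 1 {
            // Elements are sorted ascending by version, so everything
            // from here to `end` is unknown too: stop parsing elements.
            break;
        }
        known.push(version);
        pos += 1 + 2; // version byte + assumed 2-byte V1 body
    }
    // Resync: jump to the end of the Vec regardless of where parsing stopped.
    let tail = buf.get(end..end.checked_add(2)?)?;
    Some((known, u16::from_le_bytes([tail[0], tail[1]])))
}
```

The sketch shows why the sort invariant matters: the reader can only bail once, so any known-version element placed after an unknown one would be lost.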
B. Length-prefix per Interface element
Each Interface element gains its own (version: u8, body_length: u32, body) framing. SDKs skip unknown elements via body_length regardless of position.
Why rejected: also requires every off-chain consumer to upgrade before the migration runs. Marginally more robust than Option A (no sort needed) but doesn't preserve current SDKs at all.
C. Per-evolution sibling Vec on Device
Each new evolution gets its own typed Vec field on Device (e.g., interface_flex_algo: Vec<...>, later interface_bgp: Vec<...>). Same backward-compat properties as the chosen design.
Why not chosen: works identically from old-SDK perspective, but proliferates fields on Device as evolutions accumulate. The catch-all interface_additional keeps the Device struct cleaner.
D. Tagged Borsh enum for Interface
Doesn't help — Borsh's enum encoding is (discriminant: u8, body) with no length information. Old SDKs hitting an unknown discriminant still can't skip past it.
Related
convertDeviceInterfaces regression tests