
ADR-009: Actor-model NetFlow generation #1763

Open
blt wants to merge 2 commits into main from blt/adr-009-actor-model-netflow

blt (Collaborator) commented Feb 11, 2026

What does this PR do?

Proposes a dedicated NetFlow generator using an actor-model simulation
where network hosts execute behavioral sequences that produce correlated
flow records as a byproduct. Covers v5 and v9 wire formats, explains why
the block cache is inapplicable (temporal compression), and follows the
dedicated-generator precedent set by ProcessTree/FileTree/Container.
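
To make the idea of behaviors producing correlated flows concrete, here is a minimal Python sketch. It is not the proposed implementation. `web_request` and `pack_v5` are hypothetical names, and the addresses, packet counts, and byte counts are invented for illustration. Only the 24-byte header and 48-byte record layouts follow the NetFlow v5 wire format.

```python
import random
import socket
import struct

# NetFlow v5 layouts in network byte order: 24-byte header, 48-byte record.
V5_HEADER = struct.Struct("!HHIIIIBBH")
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")


def web_request(rng, host, resolver, server, now_ms):
    """Hypothetical behavior: a DNS lookup followed by the HTTPS fetch it enables.

    Returns flow tuples (src, dst, sport, dport, proto, packets, octets,
    first_ms, last_ms). The two flows are correlated: they share a client,
    and the HTTPS flow starts only when the DNS flow ends.
    """
    dns_end = now_ms + rng.randint(1, 20)
    https_end = dns_end + rng.randint(50, 500)
    return [
        (host, resolver, rng.randint(49152, 65535), 53, 17, 2, 180, now_ms, dns_end),
        (host, server, rng.randint(49152, 65535), 443, 6, 24, 18_000, dns_end, https_end),
    ]


def pack_v5(flows, sys_uptime_ms, unix_secs, flow_sequence):
    """Serialize 1 to 30 flow tuples into a single NetFlow v5 datagram."""
    if not 1 <= len(flows) <= 30:
        raise ValueError("a v5 packet carries between 1 and 30 records")
    # version, count, sys_uptime, unix_secs, unix_nsecs, flow_sequence,
    # engine_type, engine_id, sampling_interval
    header = V5_HEADER.pack(5, len(flows), sys_uptime_ms, unix_secs, 0, flow_sequence, 0, 0, 0)
    body = b"".join(
        V5_RECORD.pack(
            socket.inet_aton(src), socket.inet_aton(dst), socket.inet_aton("0.0.0.0"),
            0, 0,                # input/output interface indexes
            packets, octets, first_ms, last_ms,
            sport, dport,
            0, 0, proto, 0,      # pad1, tcp_flags, protocol, tos
            0, 0, 0, 0, 0,       # src_as, dst_as, src_mask, dst_mask, pad2
        )
        for (src, dst, sport, dport, proto, packets, octets, first_ms, last_ms) in flows
    )
    return header + body


# Addresses below are illustrative only (192.0.2.0/24 is a documentation range).
flows = web_request(random.Random(0), "10.0.0.5", "10.0.0.53", "192.0.2.10", 1_000)
packet = pack_v5(flows, sys_uptime_ms=2_000, unix_secs=1_770_000_000, flow_sequence=0)
```

The point of the sketch is that the flow records fall out of the behavior rather than being drawn field by field, so the DNS and HTTPS records stay consistent with each other.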

blt requested a review from a team as a code owner February 11, 2026 22:24
blt marked this pull request as draft February 11, 2026 22:36
Proposes a dedicated NetFlow generator using an actor-model simulation
where network hosts execute behavioral sequences that produce correlated
flow records as a byproduct. Covers v5 and v9 wire formats, explains why
the block cache is inapplicable (temporal compression), and follows the
dedicated-generator precedent set by ProcessTree/FileTree/Container.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
blt force-pushed the blt/adr-009-actor-model-netflow branch from 8603b88 to b2ea346 February 11, 2026 22:42
blt marked this pull request as ready for review February 11, 2026 22:43

garrison-stauffer left a comment

Looks very cool! One question I had that I might have missed in the doc: is there a single client for the agent? The agent aggregates by these fields, typically in a customer environment this is a router, firewall, or L3 switch, as opposed to individual clients on the network

blt (Collaborator, Author) commented Feb 12, 2026

> Looks very cool! One question I had that I might have missed in the doc: is there a single client for the agent? The agent aggregates by these fields, typically in a customer environment this is a router, firewall, or L3 switch, as opposed to individual clients on the network

Good question. Let me clarify the ADR.


Each field is individually valid. The flow, while technically valid, is not
'realistic' enough to support claims about the target, except in
extremis.

Contributor

fwiw I will likely end up needing a custom generator for CWS as well.

They will need a "shape" of kernel traffic that exercises the right code paths. Completely arbitrary kernel events could easily bypass the functionality of CWS.

So I am also super curious about what we do here.

- **Semantic assumptions**: The behavior definitions encode assumptions about
"realistic" traffic that may not match all deployment environments
- **Runtime generation cost**: Unlike block-cache generators, the dedicated

Contributor

Should we establish any baseline requirements for runtime generators?

- "Must have associated benchmarks"
  - "Throughput must be X"

Collaborator, Author

Hmm. Let's talk about this when I get into the implementation side. I think yes, but I'm not exactly sure what to claim yet.

Contributor

Happy to start with this and have a follow-up ADR that addresses the runtime generation costs.

It's easier for me to argue about this once we have some guarantees/better understanding of what we mean by "lading must run faster than the target"/ "lading must not be the bottleneck". The more benchmarking I'm iterating on, the clearer that's getting.

We don't currently enforce any invariants/constraints on lading - as such, it wouldn't make sense to constrain the solution we need for the problem you've highlighted.

Collaborator, Author

> It's easier for me to argue about this once we have some guarantees/better understanding of what we mean by "lading must run faster than the target"/ "lading must not be the bottleneck". The more benchmarking I'm iterating on, the clearer that's getting.

Agreed. So far that's been a design goal without a quantitative measure. I'd be real pleased to have that resolved.

Contributor

Note: we might want to add/tweak some of the documentation in AGENTS.md afterward.

blt (Collaborator, Author) Feb 12, 2026

Agreed. I'll ping you, I'm curious to work on that together. You've gotten ahead of me in this area.

Introduces exporters as a first-class config concept: each exporter
represents a router/firewall/switch with its own bind address, protocol
version, source_id, flows_per_second, and actor pool. The Agent
aggregates by ExporterAddr (UDP source IP), so distinct loopback
addresses (127.0.0.1, 127.0.0.2, etc.) model distinct network devices.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
blt (Collaborator, Author) commented Feb 12, 2026

> Looks very cool! One question I had that I might have missed in the doc: is there a single client for the agent? The agent aggregates by these fields, typically in a customer environment this is a router, firewall, or L3 switch, as opposed to individual clients on the network

@garrison-stauffer solid catch here. I had completely overlooked this. I've adjusted the proposed config so that we explicitly set exporters / clients. I'm thinking the interior implementation will just run separate forks of the same model, serialization code, etc., each bound to a unique socket on the same host.
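
As a rough illustration of the "unique sockets on the same host" idea, here is a small Python sketch. It is an assumption about how this could look, not the planned lading code. `open_exporter` and `group_by_exporter` are hypothetical helpers. The second one only mimics the aggregation by UDP source IP described above.

```python
import socket
from collections import defaultdict


def open_exporter(bind_ip, collector):
    """Open a UDP socket whose datagrams carry `bind_ip` as their source address."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_ip, 0))   # port 0: let the OS pick an ephemeral source port
    sock.connect(collector)   # fix the destination so send() needs no address
    return sock


def group_by_exporter(datagrams):
    """Group (payload, (src_ip, src_port)) pairs, as returned by recvfrom, by src_ip.

    This stands in for a collector that keys on the UDP source IP: two sockets
    bound to different loopback addresses show up as two distinct devices.
    """
    grouped = defaultdict(list)
    for payload, (src_ip, _src_port) in datagrams:
        grouped[src_ip].append(payload)
    return dict(grouped)
```

A call such as `open_exporter("127.0.0.2", ("127.0.0.1", 2055)).send(payload)` would then emit a datagram that appears to come from a second device. The collector port 2055 is an assumed value. On Linux any address in 127.0.0.0/8 can be bound without extra setup. Other platforms may need an explicit loopback alias.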

mask: 24
weight: 100

- addr: "127.0.0.2" # IoT gateway

Ah cool, I was wondering if we'd be able to use multiple local IPs (loopbacks? not sure what it is called), this looks perfect

Collaborator, Author

Cool.

preinlein (Contributor) left a comment

LGTM

preinlein (Contributor)

@blt question for ya, we don't speak of any limitations in the expression of the lading configuration: we allow unbounded lists to be expressed.

Eventually, would we want some kind of configuration validation to check that the user is not doing something that would cause lading to be very slow?
