## Summary
`DistWorker` currently accepts an `int level_` constructor parameter (3=L3, 4=L4, …) but stores it without enforcing any level-specific constraints or capabilities. Add a level-aware mode that validates sub-worker types, enforces dispatch rules, and surfaces the level identity through the Python API.
## Motivation / Use Case
The distributed runtime is designed as a recursive hierarchy: L3 dispatches to `ChipWorker` (L2) and `SubWorker`; L4 dispatches to `DistWorker(level=3)` and `SubWorker`; and so on. Today the `level_` field is purely informational: nothing prevents an L4 node from being given a `ChipWorker` directly, or an L3 node from holding another L3 node as a CHIP sub-worker.
Concrete problems this causes:
- No validation at construction time: incorrect wiring (wrong sub-worker type for the level) is silently accepted and only fails at runtime.
- No capability query: Python code cannot ask "what worker types are valid at this level?" and must know the hierarchy out-of-band.
- Dispatch routing is type-agnostic: `WorkerType::CHIP` is used for both the L2 `ChipWorker` and lower-level `DistWorker` nodes; a level-aware routing table would make this unambiguous.
## Proposed API / Behavior
```cpp
// C++ — enforce valid sub-worker types per level
class DistWorker : public IWorker {
public:
    // level 3  → accepts ChipWorker (CHIP) + SubWorker (SUB)
    // level 4+ → accepts DistWorker (DIST) + SubWorker (SUB)
    void add_worker(WorkerType type, IWorker* worker);  // throws if invalid for level

    // Query which worker types are accepted at this level
    bool accepts_worker_type(WorkerType type) const;
};
```
```python
# Python
dw = DistWorker(level=3)
dw.accepts_worker_type(WorkerType.CHIP)  # True
dw.accepts_worker_type(WorkerType.DIST)  # False (L3 does not host DistWorker sub-nodes)

dw4 = DistWorker(level=4)
dw4.accepts_worker_type(WorkerType.CHIP)  # False
dw4.accepts_worker_type(WorkerType.DIST)  # True
```
Level rules (proposed defaults):
- L3: CHIP (`ChipWorker` / L2 device) + SUB (`SubWorker` fork/shm)
- L4+: DIST (lower-level `DistWorker`) + SUB (`SubWorker` fork/shm)
## Alternatives Considered
Leave validation entirely to the caller (current behaviour). Rejected because multi-level composition is error-prone and the misuse is only caught at dispatch time, not at construction time.
## Additional Context
- `DistWorker` was introduced in the PR for Phase 2 (feat/chip-worker branch)
- Relevant files: `src/common/distributed/dist_worker.{h,cpp}`, `python/bindings/dist_worker_bind.h`
- Related architectural context: `.docs/PHASE2_HOST_WORKER.md`, `.docs/UNIFIED_RUNTIME_PLAN.md`