
Clean up: remove remaining multi-orchestrator scaffolding #438

@hw-native-sys-bot

Context

PR #437 hardcoded orchestrators[0] and removed the thread_local pto2_current_orch_idx / pto2_set_orch_thread_idx mechanism. However, several pieces of multi-orchestrator scaffolding remain in the codebase:

Remaining code to remove

  • PTO2_MAX_ORCH_THREADS constant (currently 4) in all pto_runtime2.h headers
  • PTO2Runtime::orchestrators[] array — can become a single PTO2OrchestratorState orchestrators[1] (or a plain member)
  • PTO2Runtime::orch_count field and its validation logic in pto2_runtime_create_from_sm
  • orch_idx parameter passed to orch_func_(args, orch_thread_num_, orch_idx) in the executor (it is always 0 now)
  • orch_thread_num_ / sched_thread_num_ split logic in aicpu_executor.cpp (orchestrator vs scheduler thread roles)
  • Multi-orchestrator references in the SUBMIT_BY_CLUSTER.md docs that mention pto2_current_orch_idx
  • perf_aicpu_set_orch_thread_idx (static __thread) in performance_collector_aicpu.cpp, which follows the same pattern and also relies on thread-local storage
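
A rough sketch of what collapsing the orchestrators[] array into a plain member could look like. This is only an illustration, not the actual runtime code: the submitted_tasks field, the before/after namespaces, and the helpers orch_count_valid and current_orch are hypothetical, and the real PTO2Runtime and PTO2OrchestratorState have more fields than shown here.

```cpp
#include <cstdint>

// Hypothetical stand-in for the real PTO2OrchestratorState.
// The single field is illustrative only.
struct PTO2OrchestratorState {
    uint64_t submitted_tasks = 0;
};

namespace before {
// Shape described in this issue: fixed-size array plus a count.
constexpr int PTO2_MAX_ORCH_THREADS = 4;

struct PTO2Runtime {
    PTO2OrchestratorState orchestrators[PTO2_MAX_ORCH_THREADS];
    int orch_count = 1;
};

// Hypothetical validation of the kind that becomes unnecessary
// once only one orchestrator can exist.
inline bool orch_count_valid(const PTO2Runtime& rt) {
    return rt.orch_count >= 1 && rt.orch_count <= PTO2_MAX_ORCH_THREADS;
}

// After PR #437 every access goes through index 0.
inline PTO2OrchestratorState& current_orch(PTO2Runtime& rt) {
    return rt.orchestrators[0];
}
}  // namespace before

namespace after {
// Proposed shape: a plain member, no constant, no count, no index.
struct PTO2Runtime {
    PTO2OrchestratorState orchestrator;
};

inline PTO2OrchestratorState& current_orch(PTO2Runtime& rt) {
    return rt.orchestrator;
}
}  // namespace after
```

With the plain member there is nothing left to validate, so the orch_count check in pto2_runtime_create_from_sm and the PTO2_MAX_ORCH_THREADS constant could presumably be dropped together.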

Files

  • src/a2a3/runtime/aicpu_build_graph/runtime/pto_runtime2.h
  • src/a2a3/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h
  • src/a5/runtime/tensormap_and_ringbuffer/runtime/pto_runtime2.h
  • src/a2a3/runtime/aicpu_build_graph/aicpu/aicpu_executor.cpp
  • src/a2a3/runtime/tensormap_and_ringbuffer/aicpu/aicpu_executor.cpp
  • src/a5/runtime/tensormap_and_ringbuffer/aicpu/aicpu_executor.cpp
  • src/{a2a3,a5}/platform/src/aicpu/performance_collector_aicpu.cpp
  • src/{a2a3,a5}/runtime/tensormap_and_ringbuffer/docs/SUBMIT_BY_CLUSTER.md
