F100: Dedicated roles/ namespace for agent roles #356

@pocky

Scope

In Scope

  • Migrate agent role discovery paths to a dedicated roles/ subdirectory (.awf/roles/, .agents/roles/, $XDG_CONFIG_HOME/awf/roles/, ~/.agents/roles/)
  • Add the XDG helpers AWFRolesDir() and LocalRolesDir(), mirroring the existing skills helpers
  • Update the CLI validation messages (USER.INPUT.MISSING_ROLE) to reflect the new paths
  • Apply the chosen migration strategy: Option A — clean replacement, no fallback (Option B, a deprecated agents/ fallback, is discarded as invalid)
  • Rename the override variable AWF_AGENTS_PATH → AWF_ROLES_PATH (no alias)
  • Clarify role: reference resolution: by name (local→global priority, independent of the workflow location) vs by relative path (relative to workflowDir), recommending the by-name mode for overrides
  • Synchronize the user documentation and the CHANGELOG with the effective paths
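
The two helpers named above could look roughly like the following. This is a minimal sketch, not the project's actual xdg.go: `awfRolesDirFrom` is a hypothetical pure helper introduced here so the fallback can be exercised without touching the environment, and the fallback to `<home>/.config` when XDG_CONFIG_HOME is empty follows the XDG Base Directory default rather than anything stated in this issue.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
)

// awfRolesDirFrom is a hypothetical pure core for AWFRolesDir: it takes the
// environment values as arguments so the fallback can be tested in isolation.
// An empty xdgConfigHome falls back to <home>/.config (the XDG default).
func awfRolesDirFrom(xdgConfigHome, home string) string {
	base := xdgConfigHome
	if base == "" {
		base = filepath.Join(home, ".config")
	}
	return filepath.Join(base, "awf", "roles")
}

// AWFRolesDir returns the global roles directory ($XDG_CONFIG_HOME/awf/roles).
func AWFRolesDir() string {
	home, err := os.UserHomeDir()
	if err != nil {
		home = "" // assumption: degrade to a relative path rather than fail
	}
	return awfRolesDirFrom(os.Getenv("XDG_CONFIG_HOME"), home)
}

// LocalRolesDir returns the project-local roles directory (.awf/roles).
func LocalRolesDir() string {
	return filepath.Join(".awf", "roles")
}

func main() {
	fmt.Println(AWFRolesDir())
	fmt.Println(LocalRolesDir())
}
```

Keeping the environment lookup in a thin wrapper is only one possible design; it makes the "XDG variables not defined" case easy to cover in a unit test.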

Out of Scope

  • Changing the injected file name AGENTS.md (remains the cross-client standard)
  • Modifying the injection logic itself (F098, already merged)
  • Reworking the skills namespace (.agents/skills/ stays unchanged)
  • Adding a new architecture component (infra-roles stays; only the paths change)

Deferred

| Item | Rationale | Follow-up |
|------|-----------|-----------|
| Definitive removal of the AWFAgentsDir() / LocalAgentsDir() helpers | Conditional on the absence of other usages (verify via grep), to be confirmed during implementation | future |

User Stories

US1: Role discovery in an isolated roles/ namespace (P1 - Must Have)

As an AWF user defining an agent step with role:,
I want my roles to be looked up in a dedicated roles/ subdirectory,
So that they no longer collide with skills or any other concept namespaced under .agents/.

Why this priority: This is the core of the fix. Without switching the search paths to roles/, the namespace collision persists and the F098 feature remains structurally fragile. No other requirement has value without this one.

Acceptance Scenarios:

  1. Given a .awf/roles/go-senior/AGENTS.md file and a workflow with role: go-senior, When I run awf validate, Then the role is resolved and the command exits 0.
  2. Given a role legitimately named skills (.agents/roles/skills/AGENTS.md) and a skills container .agents/skills/, When role discovery runs, Then the skills role is resolved without colliding with the skills directory.
  3. Given a workflow located in $XDG_CONFIG_HOME/awf/workflows/ referencing role: go-senior by name, a global role ~/.agents/roles/go-senior/AGENTS.md, and a local override .agents/roles/go-senior/AGENTS.md, When I run awf run from the project directory, Then the local role is injected (local → global priority, independent of the workflow location — FR-008).

Independent Test: Create .awf/roles/go-senior/AGENTS.md, reference role: go-senior in a workflow, run awf validate and verify it exits 0; verify that no root path (.agents/<role>/) is scanned anymore.
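
The local → global priority in these scenarios can be sketched as a first-match walk over an ordered list of roles/ directories. The function name `resolveByName`, the injected `exists` callback, and the exact ordering of the four paths (taken from the order in which this issue lists them) are assumptions for illustration; the real NewFilesystemAgentRoleRepository may be structured differently.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveByName walks searchPaths in priority order and returns the first
// <path>/<name>/AGENTS.md for which exists reports true.
func resolveByName(name string, searchPaths []string, exists func(string) bool) (string, bool) {
	for _, dir := range searchPaths {
		candidate := filepath.Join(dir, name, "AGENTS.md")
		if exists(candidate) {
			return candidate, true
		}
	}
	return "", false
}

func main() {
	// Assumed priority order: local paths first, then global ones.
	paths := []string{
		".awf/roles",
		".agents/roles",
		"/home/u/.config/awf/roles",
		"/home/u/.agents/roles",
	}
	// Fake filesystem: a local override, a global role, and a role that only
	// exists in an old root path (.agents/<role>/).
	files := map[string]bool{
		".agents/roles/go-senior/AGENTS.md":         true,
		"/home/u/.agents/roles/go-senior/AGENTS.md": true,
		".agents/legacy/AGENTS.md":                  true,
	}
	exists := func(p string) bool { return files[p] }

	fmt.Println(resolveByName("go-senior", paths, exists)) // local override wins
	fmt.Println(resolveByName("legacy", paths, exists))    // old root path is never probed
}
```

Because only roles/-suffixed directories are in the list, a role named skills resolves to .agents/roles/skills/AGENTS.md and never touches the .agents/skills/ container.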

US2: Validation messages reflecting the target paths (P2 - Should Have)

As an AWF user whose role cannot be found or is misplaced,
I want the validation error message to list the new roles/ paths,
So that I know exactly where to place my AGENTS.md without reading the code.

Why this priority: This greatly improves the ergonomics of the fix, but role resolution itself (US1) does not depend on it. Without US2, discovery works but the error diagnostics remain misleading.

Acceptance Scenarios:

  1. Given a workflow with role: missing-role and no matching AGENTS.md, When I run awf validate, Then the USER.INPUT.MISSING_ROLE error lists the 4 target paths (.awf/roles/, .agents/roles/, $XDG_CONFIG_HOME/awf/roles/, ~/.agents/roles/).
  2. Given an existing role directory without an AGENTS.md, When I run awf validate, Then the message disambiguates "directory without AGENTS.md" from "role not found", with the updated paths.

Independent Test: Run awf validate on a workflow referencing a missing role and verify the error output contains the 4 target paths namespaced under roles/.
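
One way to picture the diagnostics described in US2 is an error value that carries the four searched paths and distinguishes the two failure modes. Everything except the error code and the path list is hypothetical here: the type `missingRoleError`, its fields, and the message wording are illustrative and are not taken from validate.go.

```go
package main

import (
	"fmt"
	"strings"
)

// missingRoleError is a hypothetical shape for the USER.INPUT.MISSING_ROLE error.
type missingRoleError struct {
	Code      string
	Role      string
	Searched  []string
	DirNoFile string // set when a role directory exists but has no AGENTS.md
}

// Error renders the failure mode first, then the list of searched paths.
func (e *missingRoleError) Error() string {
	var b strings.Builder
	if e.DirNoFile != "" {
		fmt.Fprintf(&b, "%s: role %q: directory %s has no AGENTS.md\n", e.Code, e.Role, e.DirNoFile)
	} else {
		fmt.Fprintf(&b, "%s: role %q not found\n", e.Code, e.Role)
	}
	b.WriteString("searched:\n")
	for _, p := range e.Searched {
		fmt.Fprintf(&b, "  %s\n", p)
	}
	return b.String()
}

// newMissingRoleError fills in the four roles/-namespaced target paths.
func newMissingRoleError(role, dirNoFile string) *missingRoleError {
	return &missingRoleError{
		Code: "USER.INPUT.MISSING_ROLE",
		Role: role,
		Searched: []string{
			".awf/roles/",
			".agents/roles/",
			"$XDG_CONFIG_HOME/awf/roles/",
			"~/.agents/roles/",
		},
		DirNoFile: dirNoFile,
	}
}

func main() {
	fmt.Print(newMissingRoleError("missing-role", ""))
	fmt.Print(newMissingRoleError("go-senior", ".agents/roles/go-senior"))
}
```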

US3: Documented migration strategy and renamed env override (P3 - Nice to Have)

As an AWF maintainer preparing the v0.10.0 release,
I want the migration strategy (A or B) to be decided and documented, and the exclusive override env var to remain functional under its new name,
So that existing users understand the impact (breaking or not) and keep their override.

Why this priority: Concerns communication and backward compatibility rather than raw functionality. The feature works without it, but release tracking and the upgrade experience depend on it. Since Option A is a breaking change, CHANGELOG clarity is essential for existing users.

Acceptance Scenarios:

  1. Given the decided strategy (Option A — clean replacement; Option B discarded), When I read the commit and the CHANGELOG, Then the breaking impact (old agents/ paths no longer scanned) is explicitly documented.
  2. Given the AWF_ROLES_PATH variable is set, When I run a workflow with a role, Then the exclusive override is functional and AWF_AGENTS_PATH is no longer recognized.

Independent Test: Verify the CHANGELOG explicitly mentions the chosen strategy (Option A) and that setting AWF_ROLES_PATH resolves roles through the override path while AWF_AGENTS_PATH is ignored.
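
The exclusive-override behavior checked by this test can be sketched as follows. The function `roleSearchPaths` and the injected `getenv` parameter are hypothetical, and the sketch assumes AWF_ROLES_PATH holds a single directory; the issue does not say whether a path list is also accepted.

```go
package main

import "fmt"

// roleSearchPaths returns only the override directory when AWF_ROLES_PATH is
// set, short-circuiting the default chain. AWF_AGENTS_PATH is never read.
func roleSearchPaths(getenv func(string) string, defaults []string) []string {
	if override := getenv("AWF_ROLES_PATH"); override != "" {
		return []string{override}
	}
	return defaults
}

func main() {
	defaults := []string{".awf/roles", ".agents/roles"}
	env := map[string]string{"AWF_AGENTS_PATH": "/old/agents"}
	getenv := func(key string) string { return env[key] }

	// Only the legacy variable is set: it is ignored and the defaults apply.
	fmt.Println(roleSearchPaths(getenv, defaults))

	// The new variable is set: it replaces the whole chain.
	env["AWF_ROLES_PATH"] = "/ci/roles"
	fmt.Println(roleSearchPaths(getenv, defaults))
}
```

In real code `os.Getenv` could be passed directly as the `getenv` argument, since it has the same signature.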

Edge Cases

  • What happens when a role exists both in the old path (.agents/<role>/) and the new one (.agents/roles/<role>/)? → only the new (roles/) one is resolved; the old path is no longer scanned at all (Option A, no fallback — FR-007).
  • How does the system behave on a system without XDG_CONFIG_HOME / XDG_DATA_HOME defined (XDG fallback)?
  • What is the behavior when AWF_ROLES_PATH is set but points to a directory using the old structure without roles/?
  • What happens for a role named skills placed under .agents/roles/skills/ vs the .agents/skills/ container (structural disambiguation expected)?
  • What happens for role: ./roles/x (relative path) when the workflow is in $XDG_CONFIG_HOME/awf/workflows/? → resolved relative to workflowDir (the workflow location), not the current working directory; the by-name local override remains the recommended path (FR-008 / FR-009).

Requirements

Functional Requirements

  • FR-001: System MUST discover roles via NewFilesystemAgentRoleRepository in the target paths with a roles/ segment: .awf/roles/<role>/, .agents/roles/<role>/, $XDG_CONFIG_HOME/awf/roles/<role>/, ~/.agents/roles/<role>/.
  • FR-002: System MUST expose in xdg.go the helpers AWFRolesDir() ($XDG_CONFIG_HOME/awf/roles) and LocalRolesDir() (.awf/roles); keep AWFAgentsDir() / LocalAgentsDir() as long as other usages exist (verify via grep), otherwise remove them.
  • FR-003: System MUST report the roles/-namespaced target paths in the USER.INPUT.MISSING_ROLE error messages of internal/interfaces/cli/validate.go, while maintaining the "dir-without-AGENTS.md" vs "role-not-found" disambiguation.
  • FR-004: System MUST rename the exclusive override variable AWF_AGENTS_PATH → AWF_ROLES_PATH; when set, it short-circuits the entire roles/ discovery chain (exclusive-override semantics unchanged). For consistency with Option A (clean replacement), no backward-compatible alias is kept; the removal of AWF_AGENTS_PATH MUST be documented as breaking in the CHANGELOG (potential CI impact).
  • FR-005: Users MUST be able to place an AGENTS.md at <path>/roles/<name>/AGENTS.md and have the role resolved, the AGENTS.md file name remaining unchanged.
  • FR-006: System MUST apply Option A — clean replacement: role discovery paths switch entirely to the roles/ segment, with no fallback to the old agents/ paths. Option B (deprecated fallback + warning) is discarded (invalid, see Clarifications). The breaking nature of this replacement MUST be documented in the commit and the CHANGELOG.
  • FR-007: System MUST NO LONGER scan the old root paths agents/<role>/ (.awf/agents/, .agents/, $XDG_CONFIG_HOME/awf/agents/, ~/.agents/) when resolving a role; a role present only in the old location MUST be treated as not found and trigger USER.INPUT.MISSING_ROLE.
  • FR-008: For a role: referenced by name (no path separator and no ./, ~/, or / prefix), System MUST apply the roles/ search-path priority (local → global, relative to the current working directory), guaranteeing that a local role (.awf/roles/<name>/ or .agents/roles/<name>/) overrides a global role (~/.agents/roles/<name>/) regardless of the workflow file location, including a workflow located in $XDG_CONFIG_HOME/awf/workflows/. The workflowDir parameter MUST have no effect on by-name resolution.
  • FR-009: For a role: referenced by relative path (./…), System MUST resolve it relative to workflowDir (the workflow file directory), a portable and intentionally distinct behavior from the local→global priority of FR-008. This divergence MUST be documented explicitly in the user documentation so that the by-name reference is the recommended way to locally override a global role.
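
The split between FR-008 and FR-009 comes down to classifying the role: value before resolving it. The sketch below is an illustration under assumptions: `isByName` and `resolveRelative` are hypothetical names (the existing ResolveAgentRole / expandRolePath code is not reproduced here), and handling of ~/ and absolute references is deliberately left out because this issue only specifies the ./ case.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// isByName reports whether ref is a bare role name. The ./, ~/ and / prefixes
// all contain a separator, so a single separator check covers them too.
func isByName(ref string) bool {
	return ref != "" && !strings.ContainsAny(ref, `/\`)
}

// resolveRelative resolves a ./-style reference against the directory of the
// workflow file, not against the current working directory (FR-009).
func resolveRelative(workflowDir, ref string) string {
	return filepath.Join(workflowDir, ref)
}

func main() {
	for _, ref := range []string{"go-senior", "./roles/x"} {
		if isByName(ref) {
			// By-name: resolved through the roles/ search paths; workflowDir is unused.
			fmt.Printf("%s: by name, local → global search paths\n", ref)
			continue
		}
		fmt.Printf("%s: by path, %s\n", ref, resolveRelative("/cfg/awf/workflows", ref))
	}
}
```

Under this split, a workflow stored in $XDG_CONFIG_HOME/awf/workflows/ that writes role: go-senior still picks up a project-local override, while role: ./roles/x stays tied to the workflow's own directory.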

Non-Functional Requirements

  • NFR-001: make build && make lint && make test MUST pass with zero violations after implementation.
  • NFR-002: Role name validation MUST apply filepath.Clean and check for path-traversal patterns before any filepath.Join (no security regression vs F098).
  • NFR-003: Role discovery MUST behave deterministically when XDG_CONFIG_HOME / XDG_DATA_HOME are not defined (XDG fallback tested).
  • NFR-004: The migration decision (Option A) MUST be documented in the commit and the CHANGELOG; the user documentation MUST be synchronized with the effective paths.
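
For NFR-002, a name check performed before any filepath.Join might look like the sketch below. It is not the F098 implementation: `validateRoleName`, `rolePath`, and the exact rejection rules are assumptions chosen to illustrate the requirement.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

var errInvalidRoleName = errors.New("invalid role name")

// validateRoleName rejects anything that is not a single, clean path segment.
func validateRoleName(name string) error {
	if name == "" || name == "." || name == ".." {
		return errInvalidRoleName
	}
	if strings.ContainsAny(name, `/\`) {
		return errInvalidRoleName // separators would allow traversal such as ../x
	}
	if filepath.Clean(name) != name {
		return errInvalidRoleName // defensive: the name must already be in clean form
	}
	return nil
}

// rolePath joins base and name only after the name has been validated.
func rolePath(base, name string) (string, error) {
	if err := validateRoleName(name); err != nil {
		return "", fmt.Errorf("%w: %q", err, name)
	}
	return filepath.Join(base, name, "AGENTS.md"), nil
}

func main() {
	for _, name := range []string{"go-senior", "skills", "../etc", "a/../b"} {
		p, err := rolePath(".agents/roles", name)
		fmt.Println(p, err)
	}
}
```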

Success Criteria

  • SC-001: A go-senior role is resolved from each of the 4 roles/-namespaced target paths and awf validate exits 0 in every case.
  • SC-002: A role named skills is resolved with no collision against the skills container (0 collisions across 100% of tested configurations).
  • SC-003: 100% of role validation error messages list the up-to-date roles/ target paths.
  • SC-004: The chosen migration strategy (Option A) and its impact (breaking) are documented in the CHANGELOG and the user documentation, verifiable without reading the code.
  • SC-005: A role referenced by name is resolved with local → global priority independently of the workflow file location; the local override wins in 100% of tested configurations (including a workflow in $XDG_CONFIG_HOME/awf/workflows/).

Key Entities

| Entity | Description | Key Attributes |
|--------|-------------|----------------|
| Agent Role | A role injected onto an agent step via an AGENTS.md file | name (path segment), roles/-namespaced discovery path, AGENTS.md content |
| Role Search Path | A role lookup path, namespaced under roles/ | base (.awf, .agents, $XDG_CONFIG_HOME/awf, ~/.agents), roles/ segment, priority |

Assumptions

  • Option A (clean replacement, no fallback) is chosen; Option B (deprecated agents/ fallback) is discarded (see Clarifications). The breaking nature is accepted and documented in the CHANGELOG.
  • The injected file remains AGENTS.md (cross-client standard for Cursor/Cline); only the parent directory changes.
  • infra-roles remains an existing .go-arch-lint.yml component; only the search paths evolve, no new component is expected.
  • The skills namespace (.agents/skills/) does not change and serves as the symmetric convention model for roles/.

Metadata

  • Status: backlog
  • Version: v0.10.0
  • Priority: high
  • Estimation: S

Dependencies

  • Blocked by: none
  • Unblocks: none

Clarifications

Session 2026-05-27

  • Q: Migration strategy A (clean replacement) vs B (deprecated agents/ fallback)? → A: Option A chosen. Option B judged invalid and discarded. The old agents/ paths are no longer scanned (FR-007); the change is breaking and must be documented in the CHANGELOG. The US1 Independent Test ("no root path is scanned anymore") is therefore valid unconditionally.
  • Q: Rename the override variable AWF_AGENTS_PATH? → A: Yes → AWF_ROLES_PATH, with no backward-compatible alias (consistent with Option A). The removal of AWF_AGENTS_PATH is documented as breaking (FR-004).
  • Q: How to resolve the local-vs-global resolution blind spot (overriding a global role with a local one when the workflow is in $XDG_CONFIG_HOME/awf/workflows/)? → A: By-name reference = override mechanism (local→global priority via search paths, independent of workflowDir — FR-008); relative-path reference = resolution relative to workflowDir (portable, distinct — FR-009). The behavior is made explicit and documented, with the by-name mode recommended for overrides. No change to the existing expandRolePath semantics.

Notes

Architecture fix for F098 (Agent role injection via AGENTS.md, PR #352 merged). Source ID: F098-FIX-001.

Indicative implementation perimeter:

  • internal/infrastructure/roles/filesystem_repository.go: search paths + reading AWF_ROLES_PATH
  • internal/infrastructure/xdg/xdg.go: AWFRolesDir(), LocalRolesDir()
  • internal/application/role_loader.go (ResolveAgentRole / expandRolePath): confirm the by-name semantics (FR-008) vs relative-path workflowDir semantics (FR-009); add dedicated tests
  • internal/interfaces/cli/validate.go: validation messages
  • .go-arch-lint.yml: infra-roles consistency (no new component)
  • Tests: roles/filesystem_repository_test.go, validate_role_test.go, possible fixtures
  • Docs: docs/user-guide/agent-steps.md, docs/user-guide/workflow-syntax.md, CHANGELOG.md

Decision made (see Clarifications, session 2026-05-27): Option A — clean replacement, no fallback. Option B (deprecated agents/ fallback) discarded. Breaking migration to be documented in the CHANGELOG.
