Skip to content

Empirical verification of DOX (AGENTS.md) impact: +193% session turn capacity in developer logs #3

Description

@Korck-lab

Hello DOX team!

We wanted to share some exciting empirical data backing the design philosophy of the DOX (AGENTS.md) framework.

We developed claude-code-optimizer to ingest and analyze developer sessions. During a deep audit of 9,911 real-world Claude Code session logs, we isolated the metrics of sessions utilizing custom agent rulesets (AGENTS.md / DOX) against those without.

Here is what the data showed:

Metric With DOX (AGENTS.md) Without DOX Difference / Impact
Average Human Turns / Session 4.87 1.66 +193% turn capacity
Average Prompt Input Tokens 73,097.8 16,537.6 Deeper code context navigation
Average Tool Errors / Session 0.68 0.37 Contextually aligned tool executions
Average API Cost per Session $10.48 $2.09 Deeper execution per cached session

Key Takeaway

Sessions using DOX are almost 3x longer on average (measured in interactive human turns) and navigate significantly larger contexts. This suggests that structured instruction hierarchies successfully align the agent, allowing developers to execute longer, more complex coding tasks without the agent getting lost or drifting off-task.

Thank you for creating DOX! We hope this quantitative validation is helpful to the community.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions