chore(claude): add ddtrace-review code review skill#17179

Draft
bm1549 wants to merge 1 commit into main from
brian.marks/add-ddtrace-review-skill

@bm1549 bm1549 commented Mar 28, 2026

Description

Add a Claude Code skill (ddtrace-review) that performs automated code review for dd-trace-py PRs. The skill was built by mining 6 months of human review comments from this repo and distilling the most common patterns into a progressive-disclosure checklist.

How it was created

  1. Data collection: Fetched all review comments from the past 6 months using the GitHub API — 5,974 inline comments and 1,409 review summaries across 1,083 PRs.

  2. Pattern extraction: Analyzed comment frequency by category:

    • Test issues (12.3%), error handling (10.7%), code suggestions (9.8%), config/env (7.6%), monkey-patching (7.2%), cleanup (6.9%), thread safety (5.7%), naming (5.0%), type hints (3.3%), performance (2.5%), backward compat (2.0%), and more.
  3. Skill design: Built a three-pass progressive-disclosure review:

    • Pass 1 (fast): Python version compat, import discipline, release notes, PR title, dead code, test issues, CI/infra files, config/env vars
    • Pass 2 (medium): Error handling, thread safety, hot-path performance, backward compat, span metadata
    • Pass 3 (deep): Integration patterns, monkey-patching, API design, AppSec/LLMObs/Cython-specific patterns
  4. Iterative evaluation: Ran the skill against 20 real PRs (4 rounds of 5 PRs each) from different areas of the codebase, comparing output against actual human reviewer comments. Iterated on the skill after each round.
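The pattern-extraction step (step 2) can be illustrated with a minimal sketch. The PR does not say how comments were assigned to categories, so the keyword lists and the function names `categorize` and `category_shares` below are hypothetical; only the idea of tallying category frequency as a percentage of all comments comes from the description.

```python
from collections import Counter

# Hypothetical keyword lists: the PR does not describe the real categorization method.
CATEGORY_KEYWORDS = {
    "test issues": ("test", "assert", "fixture"),
    "error handling": ("except", "raise", "error"),
    "thread safety": ("lock", "thread", "race"),
}


def categorize(comment: str) -> str:
    """Return the first category whose keyword appears in the comment text."""
    text = comment.lower()
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return category
    return "other"


def category_shares(comments: list[str]) -> dict[str, float]:
    """Map each category to its percentage share of all comments."""
    counts = Counter(categorize(comment) for comment in comments)
    total = sum(counts.values())
    # An empty input yields an empty dict, so no division by zero occurs.
    return {cat: round(100 * n / total, 1) for cat, n in counts.most_common()}
```

A real pipeline would likely need something more robust than substring matching (for example, manual labeling or an LLM classifier), since one comment can touch several categories.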

Eval results

Tested against 20 PRs spanning profiling, contrib integrations, core internals, AppSec, LLMObs, openfeature, and CI/config:

| Eval Round | PRs | Pattern-Coverable Coverage |
|---|---|---|
| Round 1 (5 PRs) | azure_cosmos, profiling/cython, llmobs/meta_struct, agentless_json, appsec/waf | 82.7% |
| Round 2 (5 PRs) | httpx/events, core/events, azure_functions, vllm, config/env | 79.3% |
| Round 3 (5 PRs) | profiling/C++, MCP, core/subscribers, httpx_revamp, appsec/SSRF | 92.8% |
| Round 4 (5 PRs) | profiling/alloc, Python_3.14, openfeature, core-api/defer, llm_judge | 83.6% |
| Combined | 20 PRs, 407 comments | 84.5% (288/341 pattern-coverable) |

Coverage of total comments

  • 84% of all human review comments (341/407) are pattern-coverable (detectable from reading code, not requiring deep domain context)
  • 71% of all comments (288/407, including domain-specific/conversational ones) would be covered by this skill
  • The remaining 16% (66/407) are not pattern-coverable: domain-specific design discussions, acknowledgments, and conversational replies
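The percentages above all follow from three counts reported in this description (407 total comments, 341 pattern-coverable, 288 covered). A short check of the arithmetic, with the helper name `pct` chosen only for illustration:

```python
TOTAL_COMMENTS = 407      # all human review comments across the 20 PRs
PATTERN_COVERABLE = 341   # comments detectable from reading the code
COVERED_BY_SKILL = 288    # pattern-coverable comments the skill matched


def pct(part: int, whole: int) -> float:
    """Percentage of `whole` represented by `part`, rounded to one decimal."""
    return round(100 * part / whole, 1)


coverable_share = pct(PATTERN_COVERABLE, TOTAL_COMMENTS)                      # 83.8
skill_coverage = pct(COVERED_BY_SKILL, PATTERN_COVERABLE)                     # 84.5
overall_coverage = pct(COVERED_BY_SKILL, TOTAL_COMMENTS)                      # 70.8
not_coverable_share = pct(TOTAL_COMMENTS - PATTERN_COVERABLE, TOTAL_COMMENTS)  # 16.2
```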

Strongest areas

  • New integration PRs (azure_cosmos, vllm, MCP): consistently 80-100%
  • LLMObs changes: 87-94%
  • Core internal changes: 94-100%

Weakest areas

  • Deep C++ profiling internals (CPython struct layouts): ~54-77%
  • Config/linting PRs with unusual file types (.sg/ rules): ~50%

Testing

Evaluated against 20 real PRs from the past 6 months with a grading script that compares skill output against actual human reviewer comments. The grading measures file-level and concept-level coverage of pattern-coverable review comments.
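The grading script itself is not included in this description. As a rough sketch of what file-level and concept-level coverage could mean, assume each comment is reduced to a file path plus a normalized concept label; the `Finding` type, the `coverage` function, and the labels are hypothetical.

```python
from typing import NamedTuple


class Finding(NamedTuple):
    path: str     # file the comment is attached to
    concept: str  # hypothetical normalized label, e.g. "error handling"


def coverage(human: list[Finding], skill: list[Finding]) -> dict[str, float]:
    """Percentage of human comments matched by the skill at two granularities."""
    if not human:
        return {"file": 0.0, "concept": 0.0}
    skill_files = {finding.path for finding in skill}
    skill_pairs = set(skill)
    # File-level: the skill flagged something in the same file.
    file_hits = sum(finding.path in skill_files for finding in human)
    # Concept-level: the skill flagged the same concept in the same file.
    concept_hits = sum(finding in skill_pairs for finding in human)
    return {
        "file": round(100 * file_hits / len(human), 1),
        "concept": round(100 * concept_hits / len(human), 1),
    }
```

Concept-level coverage is the stricter of the two measures here, so it can never exceed file-level coverage for the same inputs.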

Risks

None — this is a Claude Code skill file only (.claude/skills/). It has no runtime impact on the library. It only affects Claude Code sessions that invoke /ddtrace-review.

Additional Notes

The skill can be invoked with /ddtrace-review, or it is triggered automatically when Claude Code is asked to review dd-trace-py changes. It uses the same P1/P2/P3 severity format that human reviewers use on this repo.

🤖 Generated with Claude Code

Add a Claude Code skill that performs automated code review for dd-trace-py
PRs, trained on 6 months of human review comments (5,974 comments across
1,083 PRs). Uses a progressive-disclosure approach with three review passes
of increasing depth.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@bm1549 added the "AI Generated" label (Largely based on code generated by an AI or LLM. This label is the same across all dd-trace-* repos) on Mar 28, 2026
@cit-pr-commenter-54b7da

Codeowners resolved as

.claude/skills/ddtrace-review/SKILL.md                                  @DataDog/apm-core-python

@bm1549 added the "changelog/no-changelog" label (A changelog entry is not required for this PR.) on Mar 28, 2026