[DOC] Add comprehensive release documentation and automation skill#97

Open
tazarov wants to merge 1 commit into main from doc/release-process-and-skill

@tazarov tazarov commented Nov 8, 2025

Summary

This PR adds comprehensive documentation and tooling for the release process, capturing the learnings from creating v0.1.2.

Changes

📚 Documentation

  • docs/RELEASE.md: Complete step-by-step release guide
    • Prerequisites and version selection guidance
    • Detailed 6-step process with exact commands
    • Troubleshooting section for common issues
    • Quick reference and release checklist
    • Platform support matrix

🤖 Automation

  • .claude/skills/SKILL.md: Automated release skill for Claude Code
    • Validates prerequisites automatically
    • Interactive version selection
    • Updates Cargo.toml
    • Requires user confirmation before git operations
    • Monitors workflows and handles failures
    • Auto-retries Go release if timing issue occurs
    • Provides final verification and summary

Key Features

Release Documentation

  • Clear explanation of dual release cycle (Rust + Go)
  • Emphasis on correct sequencing (Rust must complete before Go)
  • Troubleshooting for the "undefined symbol" timing issue
  • Common mistakes to avoid
  • Monitoring commands using gh CLI

Automation Skill

  • Semi-automated: handles complexity while requiring user confirmation
  • Shows exactly what will be committed/pushed before executing
  • Tracks progress with TodoWrite
  • Intelligent retry logic for timing-related failures
  • Comprehensive error handling
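
The skill's retry logic itself is not shown in this description. As a rough illustration only, a bounded retry with a fixed delay could look like the sketch below; the `retry` helper and its arguments are hypothetical, not taken from SKILL.md.

```sh
# Hypothetical helper: retry a command up to $1 times, sleeping $2 seconds
# between attempts. Returns 0 on the first success, 1 if every attempt fails.
retry() {
  attempts="$1"
  delay="$2"
  shift 2
  n=1
  while :; do
    "$@" && return 0                      # command succeeded, stop retrying
    [ "$n" -ge "$attempts" ] && return 1  # out of attempts, report failure
    n=$((n + 1))
    sleep "$delay"
  done
}

# Example: try a (placeholder) command up to 3 times with no delay.
retry 3 0 true
```

A bounded attempt count keeps a genuinely broken release from retrying forever, while the delay gives a slow upstream step time to finish.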

Testing

  • Skill created following Claude Code specs (SKILL.md as entrypoint)
  • Documentation includes all steps from successful v0.1.2 release
  • User confirmation step prevents accidental pushes

Usage

After merging, the skill can be invoked with:

/release

Or naturally:

"Create a new release"
"I want to release version 0.1.3"

Related

Based on the successful release of v0.1.2 which included:

Full Changelog: v0.1.1...v0.1.2

- Add docs/RELEASE.md with step-by-step release process
- Create .claude/skills/SKILL.md for automated releases
- Document dual release cycle (Rust + Go)
- Include troubleshooting for common issues
- Add user confirmation step before git operations
- Provide quick reference commands and checklists

claude bot commented Nov 8, 2025

PR Review: Release Documentation and Automation

Summary

This PR adds comprehensive release documentation and a Claude Code skill for semi-automated releases. Overall, this is a valuable addition that addresses the immediate need documented in issue #90. The documentation is clear, well-structured, and based on real experience from v0.1.2.


Strengths

  • Excellent documentation structure: step-by-step guide with a troubleshooting section
  • User safety: the skill requires explicit confirmation before git operations
  • Real-world validated: based on the successful v0.1.2 release
  • Good error handling: automatic retry for timing issues
  • Clear sequencing: properly documents the Rust-first release order


Issues & Suggestions

1. Cargo.lock Version Mismatch (Minor Bug)

Location: Cargo.lock line 1064

The Cargo.lock shows version 0.1.2 but this should likely be 0.1.3 or remain at the current version. This looks like it was accidentally included from a previous release test.

Action: Verify and update if needed.


2. Missing Integration with Issue #91

Context: Issue #91 proposes a unified release workflow with version matrix

The skill documentation doesn't mention the planned unified workflow from #91. If #91 is implemented, this skill would need updates.

Suggestion: Add a note in the skill or docs about the potential evolution to a unified workflow, or coordinate timing with #91.


3. Documentation References

Location: docs/RELEASE.md lines 272-277

The guide references CI-CD.md and DEPLOYMENT.md, which may not exist or may need updates per issue #90.

Action: Verify these files exist and are up-to-date, or remove the references.


4. Skill Robustness Improvements

a) Run ID Extraction

Location: .claude/skills/SKILL.md lines 112-114, 150-152

The skill mentions "Extract the run ID" but doesn't specify how. The gh run list output parsing could be fragile.

Suggestion:

# More robust run ID extraction
gh run list --workflow="Rust Release" --limit 1 --json databaseId --jq '.[0].databaseId'
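
The command above can still come back empty if no run has been registered yet. A minimal sketch of a guard around it, assuming the workflow name is passed as an argument (the `latest_run_id` helper is hypothetical):

```sh
# Hypothetical helper: print the newest run ID for workflow $1,
# or fail if the lookup errors out or yields something that is not a number.
latest_run_id() {
  workflow="$1"
  run_id="$(gh run list --workflow="$workflow" --limit 1 --json databaseId --jq '.[0].databaseId')" || return 1
  case "$run_id" in
    ''|null|*[!0-9]*)
      echo "no run found for workflow: $workflow" >&2
      return 1
      ;;
  esac
  printf '%s\n' "$run_id"
}

# Possible usage (not executed here):
#   run_id="$(latest_run_id "Rust Release")" && gh run watch "$run_id" --exit-status
```

Rejecting anything non-numeric means a missing run surfaces as an explicit error instead of being passed on to later `gh run` commands.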

b) Version Validation

The skill doesn't validate semantic versioning format. A typo in version input could cause issues.

Suggestion: Add a validation step that checks the version input matches the X.Y.Z format.
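
One possible shape for that check, as a sketch: the `validate_version` name is hypothetical, and rejecting leading zeros and pre-release suffixes is an assumption about what the release process wants.

```sh
# Hypothetical helper: returns 0 when $1 is a plain X.Y.Z version, non-zero otherwise.
# Leading zeros (01.2.3), a "v" prefix, and suffixes such as -rc1 are rejected.
validate_version() {
  printf '%s\n' "$1" | grep -Eq '^(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)\.(0|[1-9][0-9]*)$'
}

# Example: check a candidate version before touching Cargo.toml.
if validate_version "0.1.3"; then
  echo "version ok"
else
  echo "invalid version" >&2
fi
```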

c) Artifact Count Verification

Location: Lines 126-133

Instead of manually verifying the 7 artifacts, the skill could count them programmatically:

gh release view rust-vX.Y.Z --json assets --jq '.assets | length'
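
To turn that count into a pass/fail check, the skill could wrap it along these lines. This is a sketch: `verify_asset_count` is a hypothetical name, and the default of 7 comes from the artifact count mentioned above.

```sh
# Hypothetical helper: succeed only if release $1 has exactly $2 assets (default 7).
verify_asset_count() {
  tag="$1"
  expected="${2:-7}"
  actual="$(gh release view "$tag" --json assets --jq '.assets | length')" || return 1
  # -eq also fails on non-numeric output, so a malformed reply is treated as a mismatch.
  [ "$actual" -eq "$expected" ] && return 0
  echo "expected $expected assets on $tag, found: $actual" >&2
  return 1
}

# Possible usage (not executed here):
#   verify_asset_count "rust-vX.Y.Z" || echo "release is missing artifacts" >&2
```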

5. Documentation Completeness

a) Rollback Strategy

Neither the skill nor the docs cover how to roll back a failed release or unpublish a bad one.

Suggestion: Add a troubleshooting section for rollback procedures.
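
As a starting point for such a section, here is a sketch that only prints a rollback plan for review. It assumes the release and its tag share the name passed in (for example `rust-vX.Y.Z`), and the `print_rollback_plan` helper is hypothetical.

```sh
# Hypothetical helper: print (do not run) the commands that would undo release tag $1.
print_rollback_plan() {
  tag="$1"
  # Delete the GitHub release and its remote tag in one step.
  echo "gh release delete $tag --yes --cleanup-tag"
  # Remove the local tag so it is not pushed again by accident.
  echo "git tag -d $tag"
}

print_rollback_plan "rust-vX.Y.Z"
```

Printing rather than executing matches the skill's existing rule of showing what will happen before any git operation runs.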

b) CLAUDE.md Alignment

The CLAUDE.md file mentions version bumping should happen before tagging, but doesn't emphasize the full sequence as strongly as RELEASE.md does.

Suggestion: Consider updating CLAUDE.md lines 21-28 to reference the detailed RELEASE.md guide.


6. Testing Considerations

Question: How was the skill tested? Since it performs git operations, testing is non-trivial.

Suggestion: Consider adding:

  • A dry-run mode that shows what would happen without executing
  • Or test documentation showing it was validated against a test repo
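
A dry-run mode could be as small as a wrapper around every mutating command. The sketch below assumes a `DRY_RUN` environment variable and a `run` helper, both hypothetical; the tag name is a placeholder.

```sh
# Hypothetical wrapper: with DRY_RUN=1 the command is printed, otherwise it is executed.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    printf 'DRY RUN:'
    printf ' %s' "$@"
    printf '\n'
  else
    "$@"
  fi
}

# Example: with DRY_RUN=1 these lines only print what would be run.
DRY_RUN=1
run git tag "rust-vX.Y.Z"
run git push origin "rust-vX.Y.Z"
```

Routing git and `gh` calls through one wrapper keeps the dry-run path and the real path from drifting apart.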

Security Considerations

Good: User confirmation before destructive operations
Good: Uses gh CLI which inherits user's GitHub auth

⚠️ Note: The skill pushes directly to main. Ensure users understand this bypasses PR review for version bumps.


Related Work


Recommendation

Approve with minor changes. This is a solid foundation for the release process. Address the Cargo.lock discrepancy and consider the suggestions above for robustness.

Priority fixes before merge:

  1. Fix/explain Cargo.lock version
  2. Verify documentation cross-references exist
  3. Add a note about the relationship to issue #91 ([ENH] Implement unified release workflow with independent component versioning)

Nice-to-haves for follow-up:

  • More robust run ID extraction
  • Version format validation
  • Rollback documentation
  • Dry-run mode for skill


github-actions bot commented Nov 8, 2025

Benchmark Comparison

goos: linux
goarch: amd64
pkg: github.com/amikos-tech/pure-tokenizers
cpu: AMD EPYC 7763 64-Core Processor                
                                  │ base_bench.txt │            pr_bench.txt            │
                                  │     sec/op     │    sec/op     vs base              │
Encode/Short-4                        9.150µ ± ∞ ¹   9.424µ ± ∞ ¹  +2.99% (p=0.008 n=5)
Encode/Medium-4                       42.19µ ± ∞ ¹   42.25µ ± ∞ ¹       ~ (p=0.841 n=5)
Encode/Long-4                         324.7µ ± ∞ ¹   329.5µ ± ∞ ¹       ~ (p=0.690 n=5)
EncodeWithOptions/Default-4           42.08µ ± ∞ ¹   42.35µ ± ∞ ¹       ~ (p=0.310 n=5)
EncodeWithOptions/WithTypeIDs-4       42.78µ ± ∞ ¹   42.27µ ± ∞ ¹       ~ (p=0.095 n=5)
EncodeWithOptions/WithTokens-4        42.57µ ± ∞ ¹   42.54µ ± ∞ ¹       ~ (p=0.548 n=5)
EncodeWithOptions/WithOffsets-4       42.76µ ± ∞ ¹   42.41µ ± ∞ ¹       ~ (p=0.421 n=5)
EncodeWithOptions/AllOptions-4        45.17µ ± ∞ ¹   44.77µ ± ∞ ¹       ~ (p=0.222 n=5)
Decode/WithSpecialTokens-4            18.88µ ± ∞ ¹   19.04µ ± ∞ ¹       ~ (p=0.397 n=5)
Decode/SkipSpecialTokens-4            18.86µ ± ∞ ¹   18.93µ ± ∞ ¹       ~ (p=0.690 n=5)
BatchEncode-4                         434.6µ ± ∞ ¹   424.9µ ± ∞ ¹       ~ (p=0.421 n=5)
FromHuggingFace/CreationOnly-4        35.37m ± ∞ ¹   35.23m ± ∞ ¹       ~ (p=0.095 n=5)
FromHuggingFace/FullLifecycle-4       36.64m ± ∞ ¹   35.67m ± ∞ ¹  -2.64% (p=0.008 n=5)
VocabSize-4                           3.074m ± ∞ ¹   3.099m ± ∞ ¹       ~ (p=0.222 n=5)
EncodeDecode/Short-4                  13.82µ ± ∞ ¹   13.99µ ± ∞ ¹  +1.21% (p=0.032 n=5)
EncodeDecode/Medium-4                 65.17µ ± ∞ ¹   63.93µ ± ∞ ¹  -1.90% (p=0.016 n=5)
EncodeDecode/Long-4                   485.6µ ± ∞ ¹   486.9µ ± ∞ ¹       ~ (p=0.222 n=5)
Truncation-4                          324.7µ ± ∞ ¹   325.9µ ± ∞ ¹       ~ (p=1.000 n=5)
Padding-4                             114.2µ ± ∞ ¹   116.6µ ± ∞ ¹       ~ (p=0.548 n=5)
ConcurrentCacheRead-4                 4.588µ ± ∞ ¹   4.545µ ± ∞ ¹  -0.94% (p=0.008 n=5)
ConcurrentCacheValidation-4           5.419µ ± ∞ ¹   5.385µ ± ∞ ¹  -0.63% (p=0.008 n=5)
ConcurrentHFCacheLookup-4             9.004µ ± ∞ ¹   8.892µ ± ∞ ¹  -1.24% (p=0.032 n=5)
DownloadWithFailureRecovery-4          1.119 ± ∞ ¹    1.023 ± ∞ ¹       ~ (p=0.421 n=5)
ConcurrentDownloadsWithFailures-4     44.43m ± ∞ ¹   45.16m ± ∞ ¹       ~ (p=0.095 n=5)
FromHuggingFaceWithCache-4            10.72µ ± ∞ ¹   10.44µ ± ∞ ¹  -2.62% (p=0.008 n=5)
FromHuggingFaceWithoutCache-4         144.8µ ± ∞ ¹   140.6µ ± ∞ ¹  -2.90% (p=0.008 n=5)
geomean                               162.8µ         161.9µ        -0.57%
¹ need >= 6 samples for confidence interval at level 0.95

                                  │ base_bench.txt │             pr_bench.txt              │
                                  │      B/op      │     B/op       vs base                │
Encode/Short-4                         920.0 ± ∞ ¹     920.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
Encode/Medium-4                      1.516Ki ± ∞ ¹   1.516Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
Encode/Long-4                        6.703Ki ± ∞ ¹   6.703Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/Default-4          1.516Ki ± ∞ ¹   1.516Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/WithTypeIDs-4      1.609Ki ± ∞ ¹   1.609Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/WithTokens-4       1.516Ki ± ∞ ¹   1.516Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/WithOffsets-4      1.703Ki ± ∞ ¹   1.703Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/AllOptions-4       2.109Ki ± ∞ ¹   2.109Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
Decode/WithSpecialTokens-4             740.0 ± ∞ ¹     740.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
Decode/SkipSpecialTokens-4             740.0 ± ∞ ¹     740.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
BatchEncode-4                        11.30Ki ± ∞ ¹   11.30Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
FromHuggingFace/CreationOnly-4       6.124Mi ± ∞ ¹   6.143Mi ± ∞ ¹       ~ (p=0.310 n=5)
FromHuggingFace/FullLifecycle-4      6.129Mi ± ∞ ¹   6.138Mi ± ∞ ¹       ~ (p=0.421 n=5)
VocabSize-4                            288.0 ± ∞ ¹     288.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeDecode/Short-4                 1.516Ki ± ∞ ¹   1.516Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeDecode/Medium-4                2.242Ki ± ∞ ¹   2.242Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeDecode/Long-4                  8.430Ki ± ∞ ¹   8.430Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
Truncation-4                         5.500Ki ± ∞ ¹   5.500Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
Padding-4                            15.89Ki ± ∞ ¹   15.89Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
ConcurrentCacheRead-4                2.062Ki ± ∞ ¹   2.062Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
ConcurrentCacheValidation-4          3.023Ki ± ∞ ¹   3.023Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
ConcurrentHFCacheLookup-4            3.180Ki ± ∞ ¹   3.180Ki ± ∞ ¹       ~ (p=0.722 n=5)
DownloadWithFailureRecovery-4        62.45Ki ± ∞ ¹   69.96Ki ± ∞ ¹       ~ (p=0.310 n=5)
ConcurrentDownloadsWithFailures-4    18.80Ki ± ∞ ¹   18.74Ki ± ∞ ¹       ~ (p=0.421 n=5)
FromHuggingFaceWithCache-4           1.727Ki ± ∞ ¹   1.727Ki ± ∞ ¹       ~ (p=1.000 n=5) ²
FromHuggingFaceWithoutCache-4        16.21Ki ± ∞ ¹   16.21Ki ± ∞ ¹       ~ (p=0.889 n=5)
geomean                              5.433Ki         5.457Ki        +0.44%
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal

                                  │ base_bench.txt │             pr_bench.txt             │
                                  │   allocs/op    │  allocs/op    vs base                │
Encode/Short-4                         16.00 ± ∞ ¹    16.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
Encode/Medium-4                        35.00 ± ∞ ¹    35.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
Encode/Long-4                          165.0 ± ∞ ¹    165.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/Default-4            35.00 ± ∞ ¹    35.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/WithTypeIDs-4        36.00 ± ∞ ¹    36.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/WithTokens-4         35.00 ± ∞ ¹    35.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/WithOffsets-4        36.00 ± ∞ ¹    36.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeWithOptions/AllOptions-4         41.00 ± ∞ ¹    41.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
Decode/WithSpecialTokens-4             10.00 ± ∞ ¹    10.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
Decode/SkipSpecialTokens-4             10.00 ± ∞ ¹    10.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
BatchEncode-4                          261.0 ± ∞ ¹    261.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
FromHuggingFace/CreationOnly-4        92.19k ± ∞ ¹   92.19k ± ∞ ¹       ~ (p=1.000 n=5)
FromHuggingFace/FullLifecycle-4       92.20k ± ∞ ¹   92.20k ± ∞ ¹       ~ (p=0.683 n=5)
VocabSize-4                            5.000 ± ∞ ¹    5.000 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeDecode/Short-4                   26.00 ± ∞ ¹    26.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeDecode/Medium-4                  45.00 ± ∞ ¹    45.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
EncodeDecode/Long-4                    175.0 ± ∞ ¹    175.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
Truncation-4                           127.0 ± ∞ ¹    127.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
Padding-4                              535.0 ± ∞ ¹    535.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
ConcurrentCacheRead-4                  25.00 ± ∞ ¹    25.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
ConcurrentCacheValidation-4            43.00 ± ∞ ¹    43.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
ConcurrentHFCacheLookup-4              38.00 ± ∞ ¹    38.00 ± ∞ ¹       ~ (p=1.000 n=5) ²
DownloadWithFailureRecovery-4          456.0 ± ∞ ¹    457.0 ± ∞ ¹       ~ (p=1.000 n=5)
ConcurrentDownloadsWithFailures-4      231.0 ± ∞ ¹    231.0 ± ∞ ¹       ~ (p=1.000 n=5)
FromHuggingFaceWithCache-4             7.000 ± ∞ ¹    7.000 ± ∞ ¹       ~ (p=1.000 n=5) ²
FromHuggingFaceWithoutCache-4          217.0 ± ∞ ¹    217.0 ± ∞ ¹       ~ (p=1.000 n=5) ²
geomean                                89.78          89.79        +0.01%
¹ need >= 6 samples for confidence interval at level 0.95
² all samples are equal
