diff --git a/.Rbuildignore b/.Rbuildignore index 1f85bbf..252789d 100644 --- a/.Rbuildignore +++ b/.Rbuildignore @@ -12,3 +12,4 @@ ^\.vscode$ ^AGENTS\.md$ ^\.positai$ +^\.claude$ diff --git a/.github/skills/implement-issue/SKILL.md b/.github/skills/implement-issue/SKILL.md new file mode 100644 index 0000000..fe794f8 --- /dev/null +++ b/.github/skills/implement-issue/SKILL.md @@ -0,0 +1,57 @@ +--- +name: implement-issue +description: Implements a GitHub issue end-to-end. Use when asked to implement, work on, or fix a specific issue number. +compatibility: Requires the `gh` CLI and an authenticated GitHub session. +--- + +# Implement a GitHub issue + +This skill wraps the Standard Workflow defined in `AGENTS.md`. Run the steps +below before and after that workflow. + +## Before the standard workflow + +**A. Read the issue in full:** + +```bash +gh issue view {number} +``` + +If `gh` is not authenticated, stop and ask the user to authenticate before continuing. + +**B. Check/create the branch:** + +- If on `main`: `usethis::pr_init("fix-{number}-{description}")` +- Branch format: `fix-{number}-{description}` + - Parts separated by hyphens; `{description}` uses snake_case + - Example: `fix-279-ai_infrastructure` +- If a branch already exists for this issue, check it out instead + +## Run the Standard Workflow from AGENTS.md + +Steps 1–9 of the Standard Workflow are the core development loop. + +## After the standard workflow + +**C. Commit and push:** + +1. Review commits already on this branch (not on `main`) — these are all part + of the same PR and should inform the PR description: + ```bash + git log main..HEAD --oneline + ``` +2. Stage and commit all changes: + ```bash + git add -A + git commit -m "{short imperative summary}" + ``` +3. Push and open the PR: + ```bash + gh pr create --fill + ``` + Use `--title` and `--body` explicitly if `--fill` produces an inadequate + description. + +This step may be overridden — the user may ask you to stop before committing, +handle the push themselves, or complete only part of the workflow. Always follow +explicit user instructions over these defaults. diff --git a/.github/skills/r-code/SKILL.md b/.github/skills/r-code/SKILL.md new file mode 100644 index 0000000..e10e768 --- /dev/null +++ b/.github/skills/r-code/SKILL.md @@ -0,0 +1,139 @@ +--- +name: r-code +description: Guide for writing R code in datawrap. Use when writing new functions, designing APIs, or reviewing/modifying existing R code. +--- + +# R code in datawrap + +This skill covers how to design and write R functions in this package — +including signatures, API conventions, error handling, and common pitfalls. +For documenting functions, use the `document` skill. For tests, use the +`tdd-workflow` skill. + +## General API design patterns + +**Enum-like arguments** — declare choices as the default vector; resolve with +`arg_match0()` (exact match, no partial matching) at the top of the function: + +```r +my_function <- function(x, mode = c("fast", "safe")) { + mode <- arg_match0(mode, c("fast", "safe")) + # mode is now guaranteed to be "fast" or "safe" +} +``` + +**`NULL` as "not provided"** — use `NULL` as the default for optional +arguments where there is no sensible scalar fallback; check with `is.null()`: + +```r +my_function <- function(x, names_to = NULL) { + if (!is.null(names_to)) { ... } +} +``` + +**S3 object construction** — build as a named list, set class explicitly: + +```r +.my_object <- function(x, type) { + out <- list(value = x, type = type) + class(out) <- c(paste0("my_", type), "my_object") + out +} +``` + +**`.call` propagation in internal validators** — helpers that validate +arguments and may throw errors should accept and forward `.call`: + +```r +.check_something <- function(x, .call = caller_env()) { + if (bad(x)) cli::cli_abort("...", call = .call) +} + +my_exported_fn <- function(x, .call = caller_env()) { + .check_something(x, .call) +} +``` + +**Return tibbles, not data frames:** + +```r +my_function <- function(x) { + result |> tibble::as_tibble() +} +``` + +## Internal vs. exported functions + +Export a function when: +- Users will call it directly +- Other packages may want to extend it +- It is a stable, intentional part of the API + +Keep a function internal when: +- It is an implementation detail that may change +- It is only used within the package +- Exporting it would clutter the user-facing API + +Internal helpers use a dot prefix (e.g. `.my_helper()`). Mark them with +`@keywords internal` and omit `@export`. + +## Error handling + +```r +cli::cli_abort( + "Input {.arg x} cannot be empty.", + "i" = "Provide a non-empty vector.", + call = caller_env() +) +``` + +- Always pass `call = caller_env()` (or `call = .call` or `call = call`, if a + call is passed to the function) so errors point to the user's call frame, not + an internal helper. +- Use custom error classes for errors that callers may want to catch + programmatically: + ```r + cli::cli_abort( + "...", + class = c("my_specific_error", "datawrap_error"), + call = caller_env() + ) + ``` + +## Common package mistakes + +```r +# Never use library() inside package code +library(dplyr) # Wrong +dplyr::filter(...) # Right +# or `@importFrom dplyr filter` if used extensively + +# Never modify global state without restoring it +options(my_option = TRUE) # Wrong +withr::local_options(list(my_option = TRUE)) # Right + +# Use system.file() for package data, not hardcoded paths +read.csv("/home/user/data.csv") # Wrong +system.file("extdata", "data.csv", package = "datawrap") # Right +``` + +## Dependencies + +### Use existing imports first + +Packages that are already in `Imports` in @DESCRIPTION should be used in +preference to base R equivalents when relevant. + +For example: prefer `purrr::map()` over `lapply()`, `rlang::is_*()` predicates +over `is.*()`, `vctrs::vec_*()` over base length/NA checks, and +`withr::local_*()` over manual `on.exit()` state management. + +### When to add a new dependency + +Add a dependency when it provides significant functionality that would be +complex or brittle to reimplement — date parsing, web requests, complex string +manipulation. Stick with base R when the solution is straightforward and a +dependency would add overhead for little gain. + +datawrap's existing dependencies are intentional. **Adding a new dependency +requires explicit discussion with the developer.** diff --git a/.github/skills/tdd-workflow/SKILL.md b/.github/skills/tdd-workflow/SKILL.md new file mode 100644 index 0000000..b5b18d0 --- /dev/null +++ b/.github/skills/tdd-workflow/SKILL.md @@ -0,0 +1,186 @@ +--- +name: tdd-workflow +description: Test-driven development workflow for datawrap. Use when writing any R code (writing new features, fixing bugs, refactoring, or reviewing tests). +--- + +# TDD workflow for datawrap + +## Core principle + +Write a failing test first, then implement the minimal code to make it pass, +then refactor. Never write implementation code without a failing test driving it. + +## File naming + +Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R`. Place new tests +next to similar existing ones. + +## Running tests + +```r +# Full suite +devtools::test(reporter = "check") + +# Single file +devtools::test(filter = "name", reporter = "check") +``` + +Testing functions load code automatically. You do not need to call `library()` or `devtools::load_all()` separately. + +## Coverage + +Goal: **100%** for every edited file. After editing `R/file_name.R`, verify: + +```r +covr_res <- devtools:::test_coverage_active_file("R/file_name.R") +which(purrr::map_int(covr_res, "value") == 0) +``` + +Files excluded from the coverage requirement: +- `R/datawrap-package.R` +- Files matching `R/import-standalone-*.R` + +## Test types + +### Unit tests + +Test individual functions in isolation: + +```r +test_that("create_dataset_dictionary returns a tibble (#2)", { + result <- create_dataset_dictionary(data.frame(x = 1L)) + expect_s3_class(result, "tbl_df") +}) +``` + +### Snapshot tests + +For complex outputs that are hard to specify with equality assertions: + +```r +test_that("print method is stable (#123)", { + expect_snapshot(print(create_dataset_dictionary(mtcars))) +}) + +# For errors, wrap expect_error() inside expect_snapshot() so both the error +# class and the message text are captured in the snapshot: +test_that("errors on invalid input (#456)", { + expect_snapshot({ + (expect_error( + create_dataset_dictionary("not a data frame"), + class = "datawrap_error" + )) + }) +}) +``` + +When snapshots change intentionally: + +```r +testthat::snapshot_review("test_name") +testthat::snapshot_accept("test_name") +``` + +Snapshots are stored in `tests/testthat/_snaps/`. + +## Test design principles + +- **Self-sufficient:** each test contains its own setup, execution, and + assertion. Tests must be runnable in isolation. +- **Duplication over factoring:** repeat setup code rather than extracting it. + Clarity beats DRY in tests. +- **One concept per test:** a failing test should tell you exactly what broke. +- **Issue reference in description:** the `desc` of every new `test_that()` call + should end with a parenthetical issue reference: + ```r + test_that("create_dataset_dictionary() returns correct columns (#1)", { ... }) + test_that("errors on empty input (#noissue)", { ... }) + ``` + +## testthat Edition 3 — deprecated patterns + +```r +# Deprecated → Modern +context("Data validation") # Remove — filename serves this purpose +expect_equivalent(x, y) # expect_equal(x, y, ignore_attr = TRUE) +with_mock(...) # local_mocked_bindings(...) +expect_is(x, "data.frame") # expect_s3_class(x, "data.frame") +``` + +## Essential expectations + +### Equality & identity + +```r +expect_equal(x, y) # with numeric tolerance +expect_equal(x, y, tolerance = 0.001) +expect_equal(x, y, ignore_attr = TRUE) +expect_identical(x, y) # exact match +``` + +### Conditions + +```r +expect_error(code) +expect_error(code, "pattern") +expect_error(code, class = "datawrap_error") +expect_warning(code) +expect_no_warning(code) +expect_message(code) +expect_no_message(code) +``` + +### Collections + +```r +expect_setequal(x, y) # same elements, any order +expect_in(element, set) +expect_named(x, c("a", "b")) +``` + +### Type & structure + +```r +expect_type(x, "double") +expect_s3_class(x, "tbl_df") +expect_length(x, 10) +``` + +### Logical + +```r +expect_true(x) +expect_false(x) +expect_null(x) +``` + +## `withr` patterns for temporary state + +```r +withr::local_options(list(datawrap.option = TRUE)) +withr::local_envvar(MY_VAR = "value") +withr::local_tempfile(lines = c("a", "b")) +``` + +## Fixtures + +Store static test data in `tests/testthat/fixtures/` and access via: + +```r +test_path("fixtures", "sample.rds") +``` + +## Mocking + +```r +local_mocked_bindings( + external_fn = function(...) "mocked_result" +) +result <- my_function_that_calls_external_fn() +``` + +## Common mistakes + +- **Do not modify tests to make them pass.** Fix the implementation. +- **Do not write tests that depend on other tests' state.** Each test must + be independently runnable. diff --git a/AGENTS.md b/AGENTS.md index 3760f6a..b9cce39 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -18,6 +18,43 @@ --- +## Standard workflow + +For any feature, fix, or refactor: + +1. **Update packages**: `pak::pak()` +2. **Run existing tests** — confirm everything passes before touching code: + `devtools::test(reporter = "check")` + If any tests fail, stop and ask the user how to proceed. +3. **Plan** — identify which R files are affected; check whether new exported + functions are needed (→ r-code skill) +4. **Test first** — write a failing test, then implement (→ tdd-workflow skill): + `devtools::test(filter = "name", reporter = "check")` — should fail +5. **Implement** — minimal code to make the tests pass +6. **Refactor** — clean up while keeping all tests green +7. **Document** — for any new or changed exported functions, use the document skill +8. **Verify**: + ```r + devtools::test(reporter = "check") + covr_res <- devtools:::test_coverage_active_file("R/file_name.R") + which(purrr::map_int(covr_res, "value") == 0) + ``` + Then run `air format .` + Once, just before wrapping up: `devtools::check(error_on = "warning")`. + Warnings and errors must be resolved. NOTEs should be resolved too. +9. **News** — add a bullet at the top of `NEWS.md` (under the development version + heading). Rules: + - Only for user-facing changes; skip purely internal changes. + - One line per bullet; end with a full stop. + - Write for users, not developers. Frame positively, present tense: + `` * `create_dataset_dictionary()` now accepts ... `` not `* Fixed a bug where ...` + - Put the function name (in backticks with `()`) near the start. + - Issue number and contributor go in parentheses before the final period: + `` * `create_dataset_dictionary()` now accepts ... (@{username}, #{issue_number}). `` + - Get username: `gh api user --jq .login` + +--- + ## GitHub Use the `gh` CLI to interact with GitHub rather than fetching web URLs. Common examples: @@ -51,59 +88,28 @@ with `@eval` in package dataset documentation. --- -## Skills - -Skills in @.github/skills should be loaded when the user triggers them. - -| Triggers | Path | -------------------------|------------------------------------------------| -| document functions | @.github/skills/document/SKILL.md | -| search / rewrite code | @.github/skills/search-code/SKILL.md | -| create github issues | @.github/skills/create-issue/SKILL.md | - -## File Organization - -Each exported function should be defined in its own file named `R/{function_name}.R`. For example, `create_dataset_dictionary()` belongs in `R/create_dataset_dictionary.R`. Any helper functions used exclusively by that exported function should also live in the same file. General-purpose helpers shared across multiple functions belong in `R/utils.R`. - -## Testing - -### Workflow: Red-Green-Refactor TDD +## General -Unless explicitly told otherwise, follow strict red-green-refactor TDD for all development: +- When running R from the console, use `--quiet --vanilla`. +- Always run `air format .` after generating R code. +- Code comments should explain *why*, not *what*. Omit comments that restate the code. +- When writing or reviewing any code, load the relevant skills (usually `r-code`, `tdd-workflow`, and `document`). -1. **Red** — Write a failing test that captures the desired behavior. Run it to confirm it fails for the right reason. -2. **Green** — Write the minimal implementation needed to make the test pass. Run the test to confirm it passes. -3. **Refactor** — Clean up code and tests while keeping all tests green. Run the full test suite after each refactor step. - -Never write implementation code before there is a failing test for it. - -### General rules - -- Before starting any coding task, run the relevant tests and check coverage so you know the baseline state. -- Always run `air format .` before running tests, after every R file edit. -- Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R`. -- Use `devtools::test(reporter = "check")` to run all tests -- Use `devtools::test(filter = "name", reporter = "check")` to run tests for `R/{name}.R` -- All testing functions automatically load code; you don't need to. -- All new code should have an accompanying test. -- If there are existing tests, place new tests next to similar existing tests. -- Test descriptions should reference an issue that they are closed with "(#123)" (for issue 123), etc. If the test doesn't relate to a particular issue (for example, if it's testing an underlying piece of infrastructure), tag it with "(#noissue)". - -### Test coverage - -The goal is 100% file-level test coverage across all R source files. After editing a file, ensure that it still has 100% test coverage. - -To check coverage for a single file: +--- -```r -covr_res <- devtools:::test_coverage_active_file("R/file_name.R") -which(vapply(covr_res, `[[`, numeric(1), "value") == 0) -``` +## Skills -The following files are intentionally excluded from coverage requirements (no associated tests): +Load skills from @.github/skills when the user triggers them. -- `R/datawrap-package.R` +| Triggers | Path | +|---------------------------------------------------|------------------------------------------------| +| document functions | @.github/skills/document/SKILL.md | +| search / rewrite code | @.github/skills/search-code/SKILL.md | +| create github issues | @.github/skills/create-issue/SKILL.md | +| implement issue / work on #NNN | @.github/skills/implement-issue/SKILL.md | +| writing R functions / API design / error handling | @.github/skills/r-code/SKILL.md | +| writing or reviewing tests | @.github/skills/tdd-workflow/SKILL.md | -## Documentation +## File Organization -After adding or changing functions, create or update their documentation. See @.github/skills/document/SKILL.md for details on documentation for this project. +Each exported function should be defined in its own file named `R/{function_name}.R`. For example, `create_dataset_dictionary()` belongs in `R/create_dataset_dictionary.R`. Any helper functions used exclusively by that exported function should also live in the same file. General-purpose helpers shared across multiple functions belong in `R/utils.R`.