From c506cb605b0319e2253ed926ba9d594387da7ac4 Mon Sep 17 00:00:00 2001 From: Jon Harmon Date: Sun, 5 Apr 2026 13:12:00 -0500 Subject: [PATCH 1/2] infra: update ai and gha Closes #285 --- .github/skills/create-issue/SKILL.md | 28 ++- .github/skills/document/SKILL.md | 102 ++++++---- .github/skills/github/SKILL.md | 22 +++ .github/skills/implement-issue/SKILL.md | 18 +- .github/skills/r-code/SKILL.md | 223 +++++++++++++--------- .github/skills/search-code/SKILL.md | 84 ++++++++ .github/skills/tdd-workflow/SKILL.md | 153 +++++++++------ .github/workflows/R-CMD-check.yaml | 24 +-- .github/workflows/copilot-setup-steps.yml | 51 +++++ .github/workflows/install/action.yml | 118 ++++++++++++ .github/workflows/pkgdown.yaml | 47 +++-- .github/workflows/pr-commands.yaml | 50 +---- .github/workflows/qcthat.yaml | 19 +- .github/workflows/test-coverage.yaml | 32 ++-- AGENTS.md | 114 ++++++----- DESCRIPTION | 1 + R/aaa-conditions.R | 22 +++ 17 files changed, 758 insertions(+), 350 deletions(-) create mode 100644 .github/skills/github/SKILL.md create mode 100644 .github/skills/search-code/SKILL.md create mode 100644 .github/workflows/copilot-setup-steps.yml create mode 100644 .github/workflows/install/action.yml create mode 100644 R/aaa-conditions.R diff --git a/.github/skills/create-issue/SKILL.md b/.github/skills/create-issue/SKILL.md index 397f88f4..dc80e9e9 100644 --- a/.github/skills/create-issue/SKILL.md +++ b/.github/skills/create-issue/SKILL.md @@ -1,5 +1,6 @@ --- name: create-issue +trigger: create GitHub issues description: Creates GitHub issues for the package repository. Use when asked to create, file, or open a GitHub issue, or when planning new features or functions that need to be tracked. compatibility: Requires the `gh` CLI and an authenticated GitHub session. --- @@ -10,16 +11,29 @@ Use `gh api graphql` with the `createIssue` mutation to create issues. This sets If `gh` is not authenticated, stop and ask the user to authenticate before continuing. +## Looking up IDs + +The hardcoded IDs below are correct for this repo as of 2026-04-05 18:08:30 UTC. If they ever change, or if you're working in a fork, re-run these queries to get fresh values: + +```bash +# Repository node ID +gh api graphql -f query='{ repository(owner: "wranglezone", name: "tibblify") { id } }' + +# Available issue type IDs +gh api graphql -f query='{ repository(owner: "wranglezone", name: "tibblify") { issueTypes(first: 20) { nodes { id name description } } } }' +``` + ## Issue type Choose the type that best fits the issue: | Type | ID | Use for | |---|---|---| -| Feature | `IT_kwDODjbzj84BwJRW` | New exported functions or capabilities | -| Bug | `IT_kwDODjbzj84BwJRV` | Something broken or incorrect | -| Documentation | `IT_kwDODjbzj84BwprI` | Docs-only changes | -| Task | `IT_kwDODjbzj84BwJRU` | Maintenance, refactoring, chores | +| Task | `IT_kwDODjbzj84BwJRU` | A specific piece of work | +| Bug | `IT_kwDODjbzj84BwJRV` | An unexpected problem or behavior | +| Feature | `IT_kwDODjbzj84BwJRW` | A request, idea, or new functionality | +| Documentation | `IT_kwDODjbzj84BwprI` | An update to help, vignettes, etc. | +| Infrastructure | `IT_kwDODjbzj84B5OM3` | Infrastructure of a project, like GitHub Actions | ## Issue title @@ -55,12 +69,12 @@ Example: ```markdown ## Summary -> As a tibblify user, in order to handle optional fields gracefully, I would like `tib_chr()` to accept a `.default` argument. +> As a package developer, in order to set up agent skills quickly, I would like to generate a skill template from a single function call. ``` ### `## Details` (optional, all types) -For information that's important to capture but doesn't fit naturally into any other section. Use sparingly — if the content belongs in `## Behavior`, `## Proposed signature`, or `## References`, put it there instead. +For information that's important to capture but doesn't fit naturally into any other section, including implementation details such as packages to add to `Imports` in `DESCRIPTION` or files to add to `inst`. Use sparingly — if the content belongs in `## Behavior`, `## Proposed signature`, or `## References`, put it there instead. ### `## Proposed signature` (Feature only) @@ -93,8 +107,6 @@ Only include when there are specific reference implementations, external URLs, o ## Creating the issue Use the `repoId` and the `typeId` for the chosen issue type from the table above. -If the mutation fails with an ID-related error, load -`@references/lookup-ids.md` to re-query the correct values. ```bash gh api graphql \ diff --git a/.github/skills/document/SKILL.md b/.github/skills/document/SKILL.md index f5c01d42..9914cdd0 100644 --- a/.github/skills/document/SKILL.md +++ b/.github/skills/document/SKILL.md @@ -1,55 +1,67 @@ --- name: document +trigger: document functions description: Document package functions. Use when asked to document functions. --- # Document functions -*All* R functions in `R/` should be documented in {roxygen2} `#'` style, including internal/unexported functions. +*All* R functions in `R/` should be documented in roxygen2 `#'` style, including internal/unexported functions. - Run `air format .` then `devtools::document()` after changing any roxygen2 docs. - Use sentence case for all headings. +- Wrap roxygen comments at 80 characters. - Files matching `R/import-standalone-*.R` are imported from other packages and have their own conventions. Do not modify their documentation. +- After documenting functions, run `devtools::document(roclets = c('rd', 'collate', 'namespace'))`. +- If `_pkgdown.yml` exists and contains a `reference` section: + - Whenever you add a new (non-internal) documentation topic, also add the topic to `_pkgdown.yml`. + - Use `pkgdown::check_pkgdown()` to check that all topics are included in the reference index. ## Shared parameters -**Parameters used in more than one function** go in `R/aaa-shared_params.R` under `@name .shared-params`. Functions inherit them with `@inheritParams .shared-params`. See @R/aaa-shared_params.R for current definitions. - -Multiple shared-params groups exist for different contexts (`.shared-params-tib`, `.shared-params-spec_prep`). File-level groups are appropriate when parameters are only shared within a file and closely related files. +**Parameters used in more than one function** go in `R/aaa-shared_params.R` under `@name .shared-params`. Functions inherit them with `@inheritParams .shared-params`. See `R/aaa-shared_params.R` for current definitions (if it exists). Shared params blocks: alphabetize parameters, use `@name .shared-params` (with leading dot), include `@keywords internal`, end with `NULL`. +Multiple shared-params groups (e.g. `.shared-params-io`, `.shared-params-parsing`) are appropriate when parameters are only shared within a file and closely related files. + ## Parameter documentation format ```r -#' @param .param_name (`TYPE`) One sentence description. Can include [cross_references()]. -#' Additional details on continuation lines if needed. +#' @param param_name (`TYPE`) Brief description (usually 1-3 sentences. Can +#' include [cross_references()]. Additional details on continuation lines if +#' needed. ``` Function-specific `@param` definitions always appear *before* any `@inheritParams` lines. If all parameters are defined locally, omit `@inheritParams` entirely. ### Type notation -- "(`character`)" - Character vector -- "(`character(1)`)" - Single string -- "(`logical(1)`)" - Single logical -- "(`integer`)" - Integer vector -- "(`vector(0)`)" - A prototype (zero-length vector) -- "(`vector`)" - A vector of unspecified type -- "(`list`)" - List -- "(`function` or `NULL`)" - A function or NULL -- "(`tib_collector`)" - A class-specific type (use the actual class name) -- "(`any`)" - Any type +| Notation | Meaning | +|----------|---------| +| ``(`character`)`` | Character vector | +| ``(`character(1)`)`` | Single string | +| ``(`logical(1)`)`` | Single logical | +| ``(`integer`)`` | Integer vector | +| ``(`integer(1)`)`` | Single integer | +| ``(`double`)`` | Double vector | +| ``(`vector(0)`)`` | A prototype (zero-length vector) | +| ``(`vector`)`` | A vector of unspecified type | +| ``(`list`)`` | List | +| ``(`data.frame`)`` | Data frame or tibble | +| ``(`function` or `NULL`)`` | A function or NULL | +| ``(`my_class`)`` | A class-specific type (use the actual class name) | +| ``(`any`)`` | Any type | ### Enumerated values When a parameter takes one of a fixed set of values, document them with a bullet list: ```r -#' @param .input_form (`character(1)`) The structure of the input field. Can be -#' one of: -#' * `"vector"`: The field is a vector, e.g. `c(1, 2, 3)`. -#' * `"scalar_list"`: The field is a list of scalars, e.g. `list(1, 2, 3)`. +#' @param method (`character(1)`) The aggregation method. Can be one of: +#' * `"mean"`: Arithmetic mean. +#' * `"median"`: Median value. +#' * `"sum"`: Total sum. ``` ## Returns @@ -57,12 +69,21 @@ When a parameter takes one of a fixed set of values, document them with a bullet Use `@returns` (not `@return`). Include a type when it's informative: ```r -#' @returns A prepared tibblify specification. -#' @returns (`logical(1)`) `TRUE` if `x` is a `tib_scalar`. -#' @returns Either a tibble or a list, depending on the specification. +#' @returns A summary tibble. +#' @returns (`logical(1)`) `TRUE` if `x` is a valid record. +#' @returns Either a tibble or a list, depending on the input. #' @returns `NULL` (invisibly). ``` +**Structured returns with columns:** + +```r +#' @returns A [tibble::tibble()] with columns: +#' - `name`: Record name. +#' - `value`: Numeric value. +#' - `status`: Status (`"active"` or `"inactive"`). +``` + ## Exported functions ```r @@ -70,7 +91,7 @@ Use `@returns` (not `@return`). Include a type when it's informative: #' #' Description paragraph providing context and details. #' -#' @param .param (`TYPE`) Description. +#' @param param_name (`TYPE`) Description. #' @inheritParams .shared-params #' #' @returns Description of return value. @@ -87,39 +108,40 @@ Use `@returns` (not `@return`). Include a type when it's informative: ## Internal (unexported) functions -Internal functions (starting with `.`) use abbreviated documentation: +Internal helpers (identified by a dot prefix, e.g. `.parse_response()`) use abbreviated documentation. Mark them with `@keywords internal` and omit `@export`: ```r #' Title in sentence case #' -#' @param .one_off_param (`TYPE`) Description. +#' @param one_off_param (`TYPE`) Description. #' @inheritParams .shared-params #' @returns (`TYPE`) What it returns. #' @keywords internal ``` -No description paragraph, fewer blank `#'` lines, no `@examples`, `@keywords internal` instead of `@export`. +Description paragraph is optional (only include when usage isn't obvious), fewer blank `#'` lines, and no `@examples`. ## S3 methods and `@rdname` grouping Use `@rdname` to group related functions under one help page. This applies to: - **S3 methods we own** (generic defined in this package): generic gets full docs, methods get `@rdname` + `@export`. -- **Related exported functions** (e.g., `tspec_df`/`tspec_object`/`tspec_row`): primary function gets full docs, variants get `@rdname` + `@export`. +- **Related exported functions** (e.g. multiple variants of the same operation): primary function gets full docs, variants get `@rdname` + `@export`. ```r -#' Finalize a tibblify object +#' Format a summary object #' -#' @param field (`any`) The field value. -#' @inheritParams .shared-params-spec_prep +#' @param x (`any`) The object to format. +#' @param ... Additional arguments passed to methods. +#' @returns A formatted character string. #' @keywords internal -.finalize_tspec_object <- function(field_spec, field) { - UseMethod(".finalize_tspec_object") +.format_summary <- function(x, ...) { + UseMethod(".format_summary") } -#' @rdname .finalize_tspec_object +#' @rdname .format_summary #' @export -.finalize_tspec_object.tib_collector <- function(field_spec, field) { - field[[1]] +.format_summary.data_summary <- function(x, ...) { + # method implementation } ``` @@ -137,12 +159,14 @@ method.class <- function(x, ...) { ... } ## Style notes -**Cross-references:** Use square brackets — `[tspec_df()]` (internal), `[tibble::tibble()]` (external), `[tib_spec]` (topics). +**Cross-references:** Use square brackets — `[fetch_records()]` (internal), `[tibble::tibble()]` (external), `[topic_name]` (topics). -**Section comment headers** organize code within a file, lowercase with dashes to column 80: +**Section comment headers** optionally organize code within a file, lowercase with dashes to column 80: ```r # helpers ---------------------------------------------------------------------- ``` -**Examples:** Exported functions include `@examples`. Use `@examplesIf interactive()` for network-dependent functions. Use section-style comments (`# Section ---`) to organize longer example blocks. Internal functions do not get examples. +Only use such headers in complex files. The need for section comment headers might indicate that the file should be split into multiple files. + +**Examples:** Exported functions include `@examples`. Use `@examplesIf interactive()` for network-dependent or slow functions. Use section-style comments (`# Section ---`) to organize longer example blocks. Internal functions do not get examples. diff --git a/.github/skills/github/SKILL.md b/.github/skills/github/SKILL.md new file mode 100644 index 00000000..8fa04d59 --- /dev/null +++ b/.github/skills/github/SKILL.md @@ -0,0 +1,22 @@ +--- +name: github +trigger: from github +description: GitHub workflows using the `gh` CLI, including viewing issues/PRs and commit message conventions. Use when interacting with GitHub in any way, such as viewing, creating, or editing issues and pull requests, making commits, or running any `gh` command. +compatibility: Requires the `gh` CLI and an authenticated GitHub session. +--- + +# GitHub + +Use `gh` CLI, not web URLs: `gh issue view 123`, `gh issue list`, `gh pr view 456`, `gh pr list`. + +## Commit messages + +Conventional commits; backtick-quote function names; close issues in body with `- Closes #N`. + +``` +feat: add `create_skill()` + +Generates a new skill directory with a SKILL.md template. + +- Closes #3 +``` diff --git a/.github/skills/implement-issue/SKILL.md b/.github/skills/implement-issue/SKILL.md index ee1509a9..f17891dc 100644 --- a/.github/skills/implement-issue/SKILL.md +++ b/.github/skills/implement-issue/SKILL.md @@ -1,13 +1,13 @@ --- name: implement-issue +trigger: implement issue / work on #NNN description: Implements a GitHub issue end-to-end. Use when asked to implement, work on, or fix a specific issue number. compatibility: Requires the `gh` CLI and an authenticated GitHub session. --- # Implement a GitHub issue -This skill wraps the Standard Workflow defined in `AGENTS.md`. Run the steps -below before and after that workflow. +This skill wraps the Standard Workflow defined in `AGENTS.md`. Run the steps below before and after that workflow. ## Before the standard workflow @@ -21,10 +21,10 @@ If `gh` is not authenticated, stop and ask the user to authenticate before conti **B. Check/create the branch:** -- If on `main`: `git checkout -b fix-{number}-{description}` (run in the shell) +- If on `main`: `usethis::pr_init("fix-{number}-{description}")` - Branch format: `fix-{number}-{description}` - Parts separated by hyphens; `{description}` uses snake_case - - Example: `fix-279-ai_infrastructure` + - Example: `fix-42-validate_input` - If a branch already exists for this issue, check it out instead ## Run the Standard Workflow from AGENTS.md @@ -35,8 +35,7 @@ Steps 1–9 of the Standard Workflow are the core development loop. **C. Commit and push:** -1. Review commits already on this branch (not on `main`) — these are all part - of the same PR and should inform the PR description: +1. Review commits already on this branch (not on `main`) — these are all part of the same PR and should inform the PR description: ```bash git log main..HEAD --oneline ``` @@ -49,9 +48,6 @@ Steps 1–9 of the Standard Workflow are the core development loop. ```bash gh pr create --fill ``` - Use `--title` and `--body` explicitly if `--fill` produces an inadequate - description. + Use `--title` and `--body` explicitly if `--fill` produces an inadequate description. -This step may be overridden — the user may ask you to stop before committing, -handle the push themselves, or complete only part of the workflow. Always follow -explicit user instructions over these defaults. +This step may be overridden — the user may ask you to stop before committing, handle the push themselves, or complete only part of the workflow. Always follow explicit user instructions over these defaults. diff --git a/.github/skills/r-code/SKILL.md b/.github/skills/r-code/SKILL.md index 47d7b984..70350d8b 100644 --- a/.github/skills/r-code/SKILL.md +++ b/.github/skills/r-code/SKILL.md @@ -1,109 +1,175 @@ --- name: r-code -description: Guide for writing R code in tibblify. Use when writing new functions, designing APIs, or reviewing/modifying existing R code. +trigger: writing R functions / API design / error handling +description: Guide for writing R code. Use when writing new functions, designing APIs, or reviewing/modifying existing R code. --- -# R code in tibblify +# R code -This skill covers how to design and write R functions in this package — -including signatures, API conventions, error handling, and common pitfalls. -For documenting functions, use the `document` skill. For tests, use the -`tdd-workflow` skill. +This skill covers how to design and write R functions — including naming conventions, signatures, API conventions, input validation, error handling, and common pitfalls. For documenting functions, use the `document` skill. For tests, use the `tdd-workflow` skill. -## tibblify API conventions +## Naming conventions -`tib_*()` collector functions use dot-prefixed arguments: +### Functions -- `.key` — always the first argument; names the output column -- `.ptype` — prototype that determines the output type -- `.required` — whether the field must be present -- `.default` — value to use when the field is absent -- `.names_to` — captures element names into a column +Functions use `snake_case` and should be **verbs or verb phrases** that describe what the function does: -New `tib_*()` functions must follow this scheme. Use dot-prefixed names for -any non-standard or spec-controlling parameters to stay consistent with the -existing family. +```r +fetch_records() +build_summary() +validate_input() +``` -More broadly, dot-prefix is the package-wide convention for named parameters, -to avoid collisions with column names passed through `...`. The exceptions are -common positional parameters where collision is unlikely: `x`, `spec`, `name`, -and similar. When in doubt, use the dot prefix. +A function name should be descriptive enough to make its purpose clear without a comment. Prefer clarity over brevity — don't abbreviate unless there is a widely understood convention (e.g. `df` for data frame, `dir` for directory). -## General API design patterns +Internal helpers use a dot prefix: + +```r +.parse_response() +.validate_columns() +``` -**Enum-like arguments** — declare choices as the default vector; resolve with -`arg_match0()` (exact match, no partial matching) at the top of the function: +### Parameters + +Parameters use `snake_case` and should generally be **nouns**, occasionally adjectives. The same rule applies: clarity over brevity. ```r -my_function <- function(x, .mode = c("fast", "safe")) { - .mode <- arg_match0(.mode, c("fast", "safe")) - # .mode is now guaranteed to be "fast" or "safe" -} +# Good +fetch_records(file_path, page_size, overwrite) + +# Bad — unclear abbreviations +fetch_records(fp, ps, ow) ``` -**`NULL` as "not provided"** — use `NULL` as the default for optional -arguments where there is no sensible scalar fallback; check with `is.null()`: +## File organization + +One exported function per file: `R/{function_name}.R` (e.g. `fetch_records()` → `R/fetch_records.R`). Internal helpers used exclusively by that function live in the same file. Shared helpers go in `R/utils.R` or `R/utils-{topic}.R` (e.g. `R/utils-parsing.R`). + +## Coding style + +- Always run `air format .` after generating code. +- Use the base pipe operator (`|>`) not the magrittr pipe (`%>%`). +- Use `\() ...` for single-line anonymous functions. For all other cases, use `function() {...}`. + +## Function design + +**Functional core, imperative shell** — pure, testable functions that accept data and return data form the core. The imperative shell orchestrates program flow, manages state, and calls the functional core. + +Functions should be **small and single-purpose**. Each function should operate at a **single level of abstraction**: it either orchestrates calls to other functions, or performs a direct operation on data, but does not mix the two. ```r -my_function <- function(x, .names_to = NULL) { - if (!is.null(.names_to)) { ... } +# Orchestrator — delegates to focused helpers +build_report <- function(data, output_path) { + data <- .clean_data(data) + summary <- .compute_summary(data) + .write_report(summary, output_path) +} + +# Worker — performs one direct operation +.clean_data <- function(data) { + data |> + dplyr::filter(!is.na(value)) |> + dplyr::mutate(value = round(value, 2)) } ``` -**`...` usage** — there are two distinct patterns in tibblify: +Name functions well enough that their purpose is obvious from the call site. When reading the orchestrator above, each step is self-documenting — no comments needed. + +**Simplify control flow** — prefer guard clauses and returning early over complex if/else structures. + +**Pure conditionals** — the expression inside a conditional check should not cause side effects. Extract the pure check from the impure action into separate functions if needed. + +## General API design patterns + +**Enum-like arguments** — declare choices as the default vector; resolve with `rlang::arg_match()` at the top of the function: ```r -# tspec_*(): collect named collector specs with list2() -tspec_custom <- function(...) { - fields <- rlang::list2(...) +summarize_data <- function(x, method = c("mean", "median")) { + method <- rlang::arg_match(method) + # method is now guaranteed to be "mean" or "median" } +``` -# tib_*(): dots reserved for future extensions — must be empty -tib_custom <- function(.key, ..., .required = TRUE) { - rlang::check_dots_empty() +**`NULL` as "not provided"** — use `NULL` as the default for optional arguments where there is no sensible scalar fallback; check with `is.null()`: + +```r +fetch_records <- function(x, output_column = NULL) { + if (!is.null(output_column)) { ... } } ``` **S3 object construction** — build as a named list, set class explicitly: ```r -.my_object <- function(x, type) { - out <- list(value = x, type = type) - class(out) <- c(paste0("my_", type), "my_object") +.new_summary <- function(values, method) { + out <- list(values = values, method = method) + class(out) <- c(paste0("summary_", method), "data_summary") out } ``` -**`vctrs` for type-safe operations** — prefer `vctrs::vec_*` over base -equivalents for length, missing-value, and type checks: +**`call` propagation in internal validators** — helpers that validate arguments and may throw errors should accept and forward `call`: ```r -vctrs::vec_size(x) # instead of length(x) -vctrs::vec_any_missing(x) # instead of anyNA(x) -vctrs::vec_detect_missing(x) # instead of is.na(x) +.check_non_empty <- function(x, call = rlang::caller_env()) { + if (length(x) == 0L) { + .pkg_abort("Input {.arg x} cannot be empty.", "empty_input", call = call) + } +} + +process_data <- function(x, call = rlang::caller_env()) { + .check_non_empty(x, call = call) + ... +} ``` -**`.call` propagation in internal validators** — helpers that validate -arguments and may throw errors should accept and forward `.call`: +**Return tibbles, not data frames:** ```r -.check_something <- function(x, .call = caller_env()) { - if (bad(x)) cli::cli_abort("...", call = .call) +summarize_data <- function(x) { + result |> tibble::as_tibble() } +``` + +## Input validation + +Use `stbl::to_*()` and `stbl::stabilize_*()` to validate parameters. These functions coerce when safe and fail with clear error messages when not. + +- **`to_*()`** — simple type coercion. Use when you need to ensure a parameter is the right type but don't need additional constraints. +- **`stabilize_*()`** — coercion plus content validation (regex, ranges, etc.). Use when simple type coercion isn't enough. + +**Validate in the function that uses the parameter**, not in a caller that passes it through. This preserves R's lazy evaluation — if a parameter is never used on a code path, it is never evaluated or validated. -my_exported_fn <- function(x, .call = caller_env()) { - .check_something(x, .call) +```r +# Good — validation happens where the parameter is used +build_report <- function(data, title, page_size) { + data <- .clean_data(data) + summary <- .compute_summary(data, page_size) + .write_report(summary, title) } -``` -**Return tibbles, not data frames:** +.compute_summary <- function(data, page_size, call = rlang::caller_env()) { + page_size <- stbl::to_int_scalar(page_size, call = call) + ... +} + +.write_report <- function(summary, title, call = rlang::caller_env()) { + title <- stbl::to_chr_scalar(title, call = call) + ... +} +``` ```r -my_function <- function(x) { - result |> tibble::as_tibble() +# Bad — validates everything eagerly, breaking lazy evaluation +build_report <- function(data, title, page_size) { + title <- stbl::to_chr_scalar(title) + page_size <- stbl::to_int_scalar(page_size) + ... } ``` +When `call` is available (because the function accepts it), always pass it to `stbl` calls so error messages point to the user's call frame. + ## Internal vs. exported functions Export a function when: @@ -116,31 +182,21 @@ Keep a function internal when: - It is only used within the package - Exporting it would clutter the user-facing API -Internal helpers use a dot prefix (e.g. `.my_helper()`). Mark them with -`@keywords internal` and omit `@export`. +Internal helpers use a dot prefix (e.g. `.parse_response()`). ## Error handling +Use `.pkg_abort()` (defined in `R/aaa-conditions.R`) rather than calling `cli::cli_abort()` directly. This wraps `stbl::pkg_abort()` and ensures consistent error class formatting: + ```r -cli::cli_abort( - "Input {.arg x} cannot be empty.", - "i" = "Provide a non-empty vector.", - call = caller_env() +.pkg_abort( + "Column {.field {name}} not found in {.arg data}.", + "column_not_found", + call = call ) ``` -- Always pass `call = caller_env()` (or `call = .call` or `call = call`, if a - call is passed to the function) so errors point to the user's call frame, not - an internal helper. -- Use custom error classes for errors that callers may want to catch - programmatically: - ```r - cli::cli_abort( - "...", - class = c("my_specific_error", "tibblify_error"), - call = caller_env() - ) - ``` +Always pass `call = call` (or `call = rlang::caller_env()`) so errors point to the user's call frame, not an internal helper. ## Common package mistakes @@ -155,27 +211,18 @@ options(my_option = TRUE) # Wrong withr::local_options(list(my_option = TRUE)) # Right # Use system.file() for package data, not hardcoded paths -read.csv("/home/user/data.csv") # Wrong -system.file("extdata", "data.csv", package = "tibblify") # Right +read.csv("/home/user/data.csv") # Wrong +system.file("extdata", "data.csv", package = "mypkg") # Right ``` ## Dependencies ### Use existing imports first -Packages that are already in `Imports` in @DESCRIPTION should be used in -preference to base R equivalents when relevant. - -For example: prefer `purrr::map()` over `lapply()`, `rlang::is_*()` predicates -over `is.*()`, `vctrs::vec_*()` over base length/NA checks, and -`withr::local_*()` over manual `on.exit()` state management. +Packages already in `Imports` in `DESCRIPTION` should be preferred over base R equivalents: `purrr::map()` over `lapply()`, `rlang::is_*()` predicates over `is.*()`, and `withr::local_*()` over manual `on.exit()` state management. ### When to add a new dependency -Add a dependency when it provides significant functionality that would be -complex or brittle to reimplement — date parsing, web requests, complex string -manipulation. Stick with base R when the solution is straightforward and a -dependency would add overhead for little gain. +Add a dependency when it provides significant functionality that would be complex or brittle to reimplement — date parsing, web requests, complex string manipulation. Stick with base R or existing imports when the solution is straightforward. -tibblify's existing dependencies are intentional. **Adding a new dependency -requires explicit discussion with the developer.** +**Adding a new dependency requires explicit discussion with the developer.** diff --git a/.github/skills/search-code/SKILL.md b/.github/skills/search-code/SKILL.md new file mode 100644 index 00000000..c5cf9783 --- /dev/null +++ b/.github/skills/search-code/SKILL.md @@ -0,0 +1,84 @@ +--- +name: search-code +trigger: search / rewrite code +description: Search and rewrite R source code by syntax using astgrepr. Use when asked to find patterns in code, search for function calls, identify usage of specific arguments, locate structural patterns across R files, or perform find-and-replace on code structure. +--- + +# Search and rewrite code with astgrepr + +`{astgrepr}` enables AST-based code search — structural queries that regex can't express cleanly. If not installed: `install.packages("astgrepr")` + +```r +library(astgrepr) + +src <- " +add <- function(x, y) x + y +greet <- function(name, greeting, sep) paste0(greeting, sep, name) +square <- function(x) x^2 +" +root <- src |> tree_new() |> tree_root() # or tree_new(file = "R/my_file.R") + +# µNAME/µA/µB capture matched nodes; only `add` matches (2 params) +matches <- node_find_all(root, + ast_rule(id = "two_arg", pattern = "µNAME <- function(µA, µB) µBODY") +) +matches #> |--two_arg: 1 nodes +node_text_all(matches) # source text of each match +lapply(matches$two_arg, \(n) node_get_match(n, "NAME") |> node_text()) #> "add" +``` + +## Reference + +Metavariables: `µVAR` captures one node (uppercase only); `µµµ` captures zero or more (ellipsis). In *replacement* strings, refer to captures as `~~VAR~~`. + +`ast_rule()` — see `?ast_rule`. Key args: + +- `pattern` — code pattern with metavariables +- `kind` — tree-sitter node kind; see the [R grammar](https://github.com/r-lib/tree-sitter-r/blob/main/src/grammar.json) +- `regex` — match node text with a Rust regex +- `id` — names the rule; results accessible as `matches$id` +- `inside`, `has`, `precedes`, `follows` — relational rules, each takes another `ast_rule()` +- `all`, `any`, `not` — boolean combinators, each takes a list of `ast_rule()`s + +For advanced pattern syntax: [ast-grep pattern docs](https://ast-grep.github.io/guide/pattern-syntax.html). + +## Searching across files + +```r +lapply(list.files("R", pattern = "\\.R$", full.names = TRUE), \(f) { + root <- tree_root(tree_new(file = f)) + texts <- node_find_all(root, ast_rule(id = "r", pattern = "YOUR_PATTERN")) |> + node_text_all() |> _$r + if (length(texts) > 0) list(file = basename(f), matches = texts) +}) |> Filter(Negate(is.null), x = _) +``` + +## Patterns + +```r +ast_rule(pattern = "if (µµµ) { return(µµµ) } else µµµ") # if-else with return() +ast_rule(pattern = "util_fun(debug = µµµ)") # named argument +ast_rule(kind = "while_statement") # by node kind +ast_rule(pattern = "df$µCOL") # df$col, any column +ast_rule(pattern = "print(µA)", # relational: inside loop + inside = ast_rule(any = ast_rule(kind = c("for_statement", "while_statement")))) +ast_rule(any = list(ast_rule(pattern = "any(is.na(µµµ))"), # boolean OR + ast_rule(pattern = "any(duplicated(µµµ))"))) +``` + +## Find and replace + +See `?node_replace_all`, `?tree_rewrite`. + +```r +root <- tree_root(tree_new(file = "R/my_file.R")) +fixes <- root |> + node_find_all( + ast_rule(id = "any_na", pattern = "any(is.na(µVAR))"), + ast_rule(id = "any_dup", pattern = "any(duplicated(µVAR))") + ) |> + node_replace_all(any_na = "anyNA(~~VAR~~)", + any_dup = "anyDuplicated(~~VAR~~) > 0") +tree_rewrite(root, fixes) # preview +writeLines(tree_rewrite(root, fixes), con = "R/my_file.R") # write back +``` diff --git a/.github/skills/tdd-workflow/SKILL.md b/.github/skills/tdd-workflow/SKILL.md index a51faea4..1620b6ac 100644 --- a/.github/skills/tdd-workflow/SKILL.md +++ b/.github/skills/tdd-workflow/SKILL.md @@ -1,19 +1,18 @@ --- name: tdd-workflow -description: Test-driven development workflow for tibblify. Use when writing any R code (writing new features, fixing bugs, refactoring, or reviewing tests). +trigger: writing or reviewing tests +description: Test-driven development workflow. Use when writing any R code (writing new features, fixing bugs, refactoring, or reviewing tests). --- -# TDD workflow for tibblify +# TDD workflow ## Core principle -Write a failing test first, then implement the minimal code to make it pass, -then refactor. Never write implementation code without a failing test driving it. +Write a failing test first, then implement the minimal code to make it pass, then refactor. Never write implementation code without a failing test driving it. ## File naming -Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R`. Place new tests -next to similar existing ones. +Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R`. Place new tests next to similar existing ones. ## Running tests @@ -37,8 +36,8 @@ which(purrr::map_int(covr_res, "value") == 0) ``` Files excluded from the coverage requirement: +- `R/*-package.R` - `R/aaa-shared_params.R` -- `R/tibblify-package.R` - Files matching `R/import-standalone-*.R` ## Test types @@ -48,23 +47,21 @@ Files excluded from the coverage requirement: Test individual functions in isolation: ```r -test_that("tib_chr() accepts a .default argument", { - spec <- tspec_df(tib_chr("x", .default = "fallback")) - result <- tibblify(list(list()), spec) - expect_equal(result$x, "fallback") +test_that("fetch_records() returns a tibble (#2)", { + result <- fetch_records(sample_input) + expect_s3_class(result, "tbl_df") }) ``` ### Integration tests -Test end-to-end pipelines through the full `tibblify()` → output path: +Test end-to-end pipelines through multiple functions: ```r -test_that("nested tspec_df works end-to-end", { - input <- list(list(a = list(b = 1L))) - spec <- tspec_df(tib_df("a", tspec_df(tib_int("b")))) - result <- tibblify(input, spec) - expect_equal(result$a[[1]]$b, 1L) +test_that("build_report() produces expected output (#15)", { + input <- data.frame(value = c(1.123, 2.456, NA)) + result <- build_report(input, tempfile()) + expect_equal(nrow(result), 2L) }) ``` @@ -73,49 +70,30 @@ test_that("nested tspec_df works end-to-end", { For complex outputs that are hard to specify with equality assertions: ```r -test_that("print method is stable (#123)", { - expect_snapshot(print(tspec_df(tib_chr("x")))) -}) - -# For errors, wrap expect_error() inside expect_snapshot() so both the error -# class and the message text are captured in the snapshot: -test_that("errors if there are detached parts of the tree (#456)", { - df <- tibble::tibble( - id = c(1, 2, 3), - x = c("a", "b", "c"), - parent = c(NA, 3, 2) - ) - - expect_snapshot({ - (expect_error( - nest_tree(df, id, parent, children_to = "children"), - class = "tibblify_error-detached_tree_parts" - )) - }) +test_that("build_summary print method is stable (#123)", { + expect_snapshot(print(build_summary(sample_data))) }) ``` -When snapshots change intentionally: +When snapshots change intentionally, check the content of the file corresponding to the edited test file, then accept: ```r -testthat::snapshot_review("test_name") testthat::snapshot_accept("test_name") ``` -Snapshots are stored in `tests/testthat/_snaps/`. +Snapshots are stored in `tests/testthat/_snaps/`. The filename corresponds to the R file being tested, ending with `.md`. ## Test design principles -- **Self-sufficient:** each test contains its own setup, execution, and - assertion. Tests must be runnable in isolation. -- **Duplication over factoring:** repeat setup code rather than extracting it. - Clarity beats DRY in tests. +- **Self-sufficient:** each test contains its own setup, execution, and assertion. Tests must be runnable in isolation. +- **Duplication over factoring:** repeat setup code rather than extracting it. Clarity beats DRY in tests. - **One concept per test:** a failing test should tell you exactly what broke. -- **Issue reference in description:** the `desc` of every new `test_that()` call - should end with a parenthetical issue reference: +- **Minimal with few comments:** keep tests lean. Avoid over-commenting. +- **Issue reference in description:** the `desc` of every new `test_that()` call should end with one or more parenthetical issue references for the issue(s) *verified by those tests* — typically the issue currently being solved. **Never guess or invent issue numbers.** Determine the number from the user's prompt, the branch name (`git branch --show-current`), or `gh issue list`. Before writing a number, verify you can trace it to one of these sources. If no tracked issue applies, use `#noissue`. The numbers in the examples below are illustrative placeholders — do not copy them: ```r - test_that("tib_chr() accepts a .default argument (#123)", { ... }) - test_that("errors on empty input (#123, #456)", { ... }) + test_that("fetch_records() returns correct columns (#1)", { ... }) + test_that("build_summary() returns correct columns (#2, #3)", { ... }) + test_that(".check_record() errors on empty input (#noissue)", { ... }) ``` ## testthat Edition 3 — deprecated patterns @@ -141,10 +119,71 @@ expect_identical(x, y) # exact match ### Conditions +**Errors thrown by this package** (via `.pkg_abort()`) should always be tested +with `stbl::expect_pkg_error_snapshot()`, which captures both the error class +hierarchy and the user-facing message in one snapshot: + +```r +test_that("process_data() errors on empty input (#42)", { + stbl::expect_pkg_error_snapshot( + process_data(data.frame()), + "tibblify", + "empty_input" + ) +}) +``` + +Pass `transform = stbl::.transform_path(path)` to scrub volatile values (e.g. temp +paths) from the snapshot before comparison. + +**Errors thrown by `stbl`** (via `stbl::to_*()` / `stbl::stabilize_*()`) +should be tested with `stbl::expect_pkg_error_classes()`. Since the message +text is controlled by `stbl`, only the class hierarchy needs to be asserted: + +```r +test_that("process_data() errors on non-integer page_size (#43)", { + stbl::expect_pkg_error_classes( + process_data(sample_data, page_size = "abc"), + "stbl", + "incompatible_type" + ) +}) +``` + +For **composite** stbl error classes (where the class name contains dashes, +e.g. `stbl-error-coerce-character`), pass each dash-separated component as a +separate argument. Underscores within a component are kept as-is: + +```r +test_that("process_data() errors on non-coercible input (#43)", { + stbl::expect_pkg_error_classes( + process_data(sample_data, value = list(bad = "input")), + "stbl", + "coerce", + "character" + ) +}) +``` + +**Errors from other packages** can be tested with `expect_error()`, optionally +wrapped in `expect_snapshot()` to lock down the message text: + ```r -expect_error(code) expect_error(code, "pattern") -expect_error(code, class = "tibblify_error") +expect_error(code, class = "some-error-class") + +# Lock down both class and message text: +test_that("fetch_records errors on invalid input (#456)", { + expect_snapshot( + (expect_error( + fetch_records("not valid input"), + class = "pkg-error" + )) + ) +}) +``` + +```r expect_warning(code) expect_no_warning(code) expect_message(code) @@ -165,20 +204,22 @@ expect_named(x, c("a", "b")) expect_type(x, "double") expect_s3_class(x, "tbl_df") expect_length(x, 10) +expect_null(x) ``` ### Logical +These expectations are a last resort when more-specific checks aren't available. + ```r expect_true(x) expect_false(x) -expect_null(x) ``` ## `withr` patterns for temporary state ```r -withr::local_options(list(tibblify.option = TRUE)) +withr::local_options(list(tibblify.verbose = TRUE)) withr::local_envvar(MY_VAR = "value") withr::local_tempfile(lines = c("a", "b")) ``` @@ -193,15 +234,17 @@ test_path("fixtures", "sample.rds") ## Mocking +Mock functions that might have unstable output, hit external servers, etc. + ```r local_mocked_bindings( - external_fn = function(...) "mocked_result" + .other_fn = function(...) "mocked_result" ) -result <- my_function_that_calls_external_fn() +result <- my_function_that_calls_other_fn() ``` ## Common mistakes - **Do not modify tests to make them pass.** Fix the implementation. -- **Do not write tests that depend on other tests' state.** Each test must - be independently runnable. +- **Do not write tests that depend on other tests' state.** Each test must be independently runnable. +- **Ask for help if test is bad.** If you think a test might be invalid, do not loop through trying to make impossible tests pass. Ask for help if possible. diff --git a/.github/workflows/R-CMD-check.yaml b/.github/workflows/R-CMD-check.yaml index 68651f05..25cb2c5e 100644 --- a/.github/workflows/R-CMD-check.yaml +++ b/.github/workflows/R-CMD-check.yaml @@ -11,6 +11,7 @@ name: R-CMD-check jobs: R-CMD-check: runs-on: ${{ matrix.config.os }} + container: ${{ matrix.config.container }} name: ${{ matrix.config.os }} (${{ matrix.config.r }}) @@ -20,29 +21,22 @@ jobs: config: - {os: macos-latest, r: 'release'} - {os: windows-latest, r: 'release'} - - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} - - {os: ubuntu-latest, r: 'release'} - - {os: ubuntu-latest, r: 'oldrel-1'} - - env: - GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - R_KEEP_PKG_SOURCE: yes + - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release', container: 'ghcr.io/api2r/pkgskills-ci:devel'} + - {os: ubuntu-latest, r: 'release', container: 'ghcr.io/api2r/pkgskills-ci:release'} + - {os: ubuntu-latest, r: 'oldrel-1', container: 'ghcr.io/api2r/pkgskills-ci:oldrel-1'} steps: - uses: actions/checkout@v6 - - uses: r-lib/actions/setup-pandoc@v2 - - - uses: r-lib/actions/setup-r@v2 + - uses: ./.github/workflows/install with: + use-container: "${{ matrix.config.container != '' }}" + token: ${{ secrets.GITHUB_TOKEN }} r-version: ${{ matrix.config.r }} http-user-agent: ${{ matrix.config.http-user-agent }} - use-public-rspm: true - - - uses: r-lib/actions/setup-r-dependencies@v2 - with: - extra-packages: any::rcmdcheck needs: check + extra-packages: any::rcmdcheck + cache-version: "1" - uses: r-lib/actions/check-r-package@v2 with: diff --git a/.github/workflows/copilot-setup-steps.yml b/.github/workflows/copilot-setup-steps.yml new file mode 100644 index 00000000..e299ee30 --- /dev/null +++ b/.github/workflows/copilot-setup-steps.yml @@ -0,0 +1,51 @@ +name: "Copilot Setup Steps" + +# Automatically run when the file changes to allow for easy validation, +# and allow manual testing through the repository's "Actions" tab. +on: + workflow_dispatch: + push: + paths: + - .github/workflows/copilot-setup-steps.yml + pull_request: + paths: + - .github/workflows/copilot-setup-steps.yml + +jobs: + # The job MUST be called `copilot-setup-steps` or it will not be picked up by Copilot. + copilot-setup-steps: + runs-on: ubuntu-latest + + permissions: + contents: read + + steps: + - uses: actions/checkout@v6 + + - uses: ./.github/workflows/install + with: + token: ${{ secrets.GITHUB_TOKEN }} + cache-version: copilot + needs: build, check, website + # Extra packages include things referenced in skills, to make sure the + # agent has them available. + optional-packages: any::astgrepr + extra-packages: >- + any::cli + any::covr + any::devtools + any::magick + any::pak + any::pkgdown + any::purrr + any::rcmdcheck + any::rlang + any::roxygen2 + any::testthat + any::usethis + any::withr + wranglezone/stbl + local::. + + - name: Install air + uses: posit-dev/setup-air@v1 diff --git a/.github/workflows/install/action.yml b/.github/workflows/install/action.yml new file mode 100644 index 00000000..ff4ba5e1 --- /dev/null +++ b/.github/workflows/install/action.yml @@ -0,0 +1,118 @@ +name: "Install R and dependencies" +description: "Set up pandoc, R, and package dependencies" +inputs: + token: + description: "GitHub token, set to secrets.GITHUB_TOKEN" + required: true + r-version: + description: "R version, passed to r-lib/actions/setup-r@v2" + required: false + default: release + http-user-agent: + description: "HTTP user agent, passed to r-lib/actions/setup-r@v2" + required: false + default: "" + needs: + description: "Config/Needs tag(s), passed to r-lib/actions/setup-r-dependencies@v2" + required: false + default: "" + extra-packages: + description: "Extra packages, passed to r-lib/actions/setup-r-dependencies@v2" + required: false + default: any::rcmdcheck + optional-packages: + description: "Optional extra packages; installed individually, warn (not error) on failure" + required: false + default: "" + cache-version: + description: "Cache version for r-lib/actions/setup-r-dependencies@v2" + required: false + default: "1" + use-container: + description: "Set to 'true' when running inside a pre-warmed container (skips R and pandoc setup)" + required: false + default: "false" + +runs: + using: "composite" + steps: + - if: inputs.use-container != 'true' + uses: r-lib/actions/setup-pandoc@v2 + + - uses: r-lib/actions/setup-r@v2 + with: + install-r: ${{ inputs.use-container != 'true' }} + r-version: ${{ inputs.r-version }} + http-user-agent: ${{ inputs.http-user-agent }} + use-public-rspm: true + + # In pre-warmed containers every package may already be in the system + # library, so nothing is ever written to R_LIBS_USER. Ensure the + # directory exists with a sentinel so the actions/cache glob matches + # and the "Path Validation Error" warning doesn't fire at cache-save. + - if: inputs.use-container == 'true' + name: Ensure R_LIBS_USER exists + shell: bash + run: | + lib="$(Rscript -e 'cat(Sys.getenv("R_LIBS_USER"))')" + mkdir -p "$lib" + touch "$lib/.keep" + + # Container path: call pak::pak() directly so it checks .libPaths() and + # only installs/updates packages that are genuinely missing or outdated. + # setup-r-dependencies always runs a full lockfile-create → lockfile-install + # cycle which reinstalls everything even when the container already has it. + - if: inputs.use-container == 'true' + name: Update R package dependencies (container) + shell: Rscript {0} + env: + GITHUB_PAT: ${{ inputs.token }} + run: | + # Replicate needs → Config/Needs/ expansion from r-lib/actions + needs_parts <- strsplit("${{ inputs.needs }}", "[[:space:],]+")[[1]] + needs_parts <- needs_parts[nzchar(needs_parts)] + needs <- sprintf("Config/Needs/%s", needs_parts) + + extra_deps <- strsplit("${{ inputs.extra-packages }}", "[[:space:],]+")[[1]] + extra_deps <- extra_deps[nzchar(extra_deps)] + + # Over-protect for strangeness in the container + if (!requireNamespace("pak", quietly = TRUE)) { + install.packages("pak") + } + + pak::pak( + c("deps::.", extra_deps), + dependencies = c(needs, "all"), + upgrade = TRUE, + ask = FALSE + ) + + # Non-container path: standard setup-r-dependencies flow with caching. + - if: inputs.use-container != 'true' + uses: r-lib/actions/setup-r-dependencies@v2 + env: + GITHUB_PAT: ${{ inputs.token }} + with: + needs: ${{ inputs.needs }} + extra-packages: ${{ inputs.extra-packages }} + cache-version: ${{ inputs.cache-version }} + + - if: inputs.optional-packages != '' + name: Install optional packages + shell: Rscript {0} + env: + GITHUB_PAT: ${{ inputs.token }} + run: | + pkgs <- strsplit("${{ inputs.optional-packages }}", "\\s+")[[1]] + pkgs <- pkgs[nzchar(pkgs)] + for (pkg in pkgs) { + tryCatch( + pak::pak(pkg), + error = function(e) { + msg <- sprintf("Failed to install optional package %s: %s", pkg, conditionMessage(e)) + msg <- gsub("[\r\n]+", " ", msg) + cat("::warning::", msg, "\n", sep = "") + } + ) + } diff --git a/.github/workflows/pkgdown.yaml b/.github/workflows/pkgdown.yaml index 88044f4b..afc5f447 100644 --- a/.github/workflows/pkgdown.yaml +++ b/.github/workflows/pkgdown.yaml @@ -2,9 +2,9 @@ # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help on: push: - branches: [main, master] + branches: [main] pull_request: - branches: [main, master] + branches: [main] release: types: [published] workflow_dispatch: @@ -14,6 +14,8 @@ name: pkgdown jobs: pkgdown: runs-on: ubuntu-latest + container: + image: ghcr.io/api2r/pkgskills-ci:release # Only restrict concurrency for non-PR jobs concurrency: group: pkgdown-${{ github.event_name != 'pull_request' || github.run_id }} @@ -21,27 +23,50 @@ jobs: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} permissions: contents: write + pull-requests: write steps: - uses: actions/checkout@v6 - - uses: r-lib/actions/setup-pandoc@v2 - - - uses: r-lib/actions/setup-r@v2 - with: - use-public-rspm: true - - - uses: r-lib/actions/setup-r-dependencies@v2 + - uses: ./.github/workflows/install with: - extra-packages: any::pkgdown, local::. + use-container: "true" + token: ${{ secrets.GITHUB_TOKEN }} needs: website + extra-packages: any::pkgdown gilead-biostats/qcthat local::. any::glue - name: Build site run: pkgdown::build_site_github_pages(new_process = FALSE, install = FALSE) shell: Rscript {0} + - name: Deploy PR preview 🧪 + if: github.event_name == 'pull_request' + uses: JamesIves/github-pages-deploy-action@v4.8.0 + with: + clean: false + branch: gh-pages + folder: docs + target-folder: pr/${{ github.event.number }} + + - name: Comment with PR site URL 🌐 + if: github.event_name == 'pull_request' + run: | + intPRNumber <- ${{ github.event.pull_request.number }} + strOwner <- tolower(qcthat::GetGHOwner()) + strRepo <- qcthat::GetGHRepo() + strURL <- glue::glue( + "https://{strOwner}.github.io/{strRepo}/pr/{intPRNumber}" + ) + print(paste("🌐 URL:", strURL)) + qcthat::CommentIssue( + intPRNumber, + glue::glue("🌐 [PR pkgdown deployed]({strURL})"), + NULL + ) + shell: Rscript {0} + - name: Deploy to GitHub pages 🚀 if: github.event_name != 'pull_request' - uses: JamesIves/github-pages-deploy-action@v4.4.1 + uses: JamesIves/github-pages-deploy-action@v4.8.0 with: clean: false branch: gh-pages diff --git a/.github/workflows/pr-commands.yaml b/.github/workflows/pr-commands.yaml index 5edb3186..8c94262a 100644 --- a/.github/workflows/pr-commands.yaml +++ b/.github/workflows/pr-commands.yaml @@ -11,8 +11,10 @@ jobs: if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/document') }} name: document runs-on: ubuntu-latest - env: - GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} + container: + image: ghcr.io/api2r/pkgskills-ci:release + permissions: + contents: write steps: - uses: actions/checkout@v6 @@ -20,14 +22,12 @@ jobs: with: repo-token: ${{ secrets.GITHUB_TOKEN }} - - uses: r-lib/actions/setup-r@v2 + - uses: ./.github/workflows/install with: - use-public-rspm: true - - - uses: r-lib/actions/setup-r-dependencies@v2 - with: - extra-packages: any::roxygen2 + use-container: "true" + token: ${{ secrets.GITHUB_TOKEN }} needs: pr-document + extra-packages: any::roxygen2 - name: Document run: roxygen2::roxygenise() @@ -43,37 +43,3 @@ jobs: - uses: r-lib/actions/pr-push@v2 with: repo-token: ${{ secrets.GITHUB_TOKEN }} - - style: - if: ${{ github.event.issue.pull_request && (github.event.comment.author_association == 'MEMBER' || github.event.comment.author_association == 'OWNER') && startsWith(github.event.comment.body, '/style') }} - name: style - runs-on: ubuntu-latest - env: - GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - steps: - - uses: actions/checkout@v6 - - - uses: r-lib/actions/pr-fetch@v2 - with: - repo-token: ${{ secrets.GITHUB_TOKEN }} - - - uses: r-lib/actions/setup-r@v2 - - - name: Install dependencies - run: install.packages("styler") - shell: Rscript {0} - - - name: Style - run: styler::style_pkg() - shell: Rscript {0} - - - name: commit - run: | - git config --local user.name "$GITHUB_ACTOR" - git config --local user.email "$GITHUB_ACTOR@users.noreply.github.com" - git add \*.R - git commit -m 'Style' - - - uses: r-lib/actions/pr-push@v2 - with: - repo-token: ${{ secrets.GITHUB_TOKEN }} diff --git a/.github/workflows/qcthat.yaml b/.github/workflows/qcthat.yaml index 39f1dc59..a6ea54c7 100644 --- a/.github/workflows/qcthat.yaml +++ b/.github/workflows/qcthat.yaml @@ -1,5 +1,5 @@ # Workflow derived from -# https://github.com/Gilead-BioStats/qcthat/tree/v1.1.0/inst/workflows/qcthat.yaml. +# https://github.com/Gilead-BioStats/qcthat/tree/v1.1.1/inst/workflows/qcthat.yaml. on: pull_request: types: [opened, edited, reopened, synchronize, milestoned] @@ -48,22 +48,21 @@ env: jobs: qcthat: runs-on: ubuntu-latest + container: + image: ghcr.io/api2r/pkgskills-ci:release if: >- (github.event_name == 'issues' && contains(github.event.issue.labels.*.name, 'qcthat-uat')) || github.event_name != 'issues' env: GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - steps: - - uses: actions/checkout@v5 - - - uses: r-lib/actions/setup-r@v2 - with: - use-public-rspm: true + - uses: actions/checkout@v6 - - uses: r-lib/actions/setup-r-dependencies@v2 + - uses: ./.github/workflows/install with: - extra-packages: Gilead-BioStats/qcthat@dev + use-container: "true" + token: ${{ secrets.GITHUB_TOKEN }} + extra-packages: Gilead-BioStats/qcthat@main local::. - name: Manage User Acceptance Testing if: >- @@ -77,7 +76,6 @@ jobs: - name: Generate Issue-Test Matrix if: >- (env.qcthat_PR_REPORT == 'true' || env.qcthat_RELEASE_REPORT == 'true' || env.qcthat_FAIL_FOR_TEST_FAILURES == 'true') && ( - github.event_name == 'push' || github.event_name == 'pull_request' || github.event_name == 'release' || (github.event_name == 'workflow_dispatch' && inputs.issueNumber == '') @@ -144,7 +142,6 @@ jobs: - name: Flag failure for completed if: >- env.qcthat_FAIL_FOR_TEST_FAILURES == 'true' && ( - github.event_name == 'push' || github.event_name == 'pull_request' || github.event_name == 'release' || (github.event_name == 'workflow_dispatch' && inputs.issueNumber == '') diff --git a/.github/workflows/test-coverage.yaml b/.github/workflows/test-coverage.yaml index e717d78b..6878a499 100644 --- a/.github/workflows/test-coverage.yaml +++ b/.github/workflows/test-coverage.yaml @@ -11,35 +11,43 @@ name: test-coverage jobs: test-coverage: runs-on: ubuntu-latest - env: - GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} - + container: + image: ghcr.io/api2r/pkgskills-ci:release steps: - uses: actions/checkout@v6 - - uses: r-lib/actions/setup-r@v2 - with: - use-public-rspm: true - - - uses: r-lib/actions/setup-r-dependencies@v2 + - uses: ./.github/workflows/install with: - extra-packages: any::covr + use-container: "true" + token: ${{ secrets.GITHUB_TOKEN }} needs: coverage + extra-packages: any::covr any::xml2 - name: Test coverage run: | - covr::codecov( + cov <- covr::package_coverage( quiet = FALSE, clean = FALSE, - install_path = file.path(Sys.getenv("RUNNER_TEMP"), "package") + install_path = file.path(normalizePath(Sys.getenv("RUNNER_TEMP"), winslash = "/"), "package") ) + print(cov) + covr::to_cobertura(cov) shell: Rscript {0} + - uses: codecov/codecov-action@v5 + with: + # Fail if error if not on PR, or if on PR and token is given + fail_ci_if_error: ${{ github.event_name != 'pull_request' || secrets.CODECOV_TOKEN }} + files: ./cobertura.xml + plugins: noop + disable_search: true + token: ${{ secrets.CODECOV_TOKEN }} + - name: Show testthat output if: always() run: | ## -------------------------------------------------------------------- - find ${{ runner.temp }}/package -name 'testthat.Rout*' -exec cat '{}' \; || true + find '${{ runner.temp }}/package' -name 'testthat.Rout*' -exec cat '{}' \; || true shell: bash - name: Upload test results diff --git a/AGENTS.md b/AGENTS.md index 87c77926..82d4c9a6 100644 --- a/AGENTS.md +++ b/AGENTS.md @@ -1,75 +1,73 @@ # AGENTS.md -## Package overview +## Repository overview -{tibblify} is an R package that rectangles nested lists. It converts deeply -nested R lists (e.g., parsed JSON) into tibbles. Users provide a *spec* -describing the expected structure (a `tspec_df()` containing `tib_*()` -collectors); tibblify traverses the list and fills those collectors. +**tibblify** — Rectangle Nested Lists -Key architecture: -- **R layer** (`R/`): spec construction, guessing (`guess_tspec()`), and the - user-facing API (`tibblify()`, `untibblify()`, etc.). -- **C layer** (`src/`): high-performance parsing; called from R via `.Call()`. -- **Vendored rlang** (`src/rlang/`): read-only C API; updated by the - `rlang-c.yaml` workflow — do not edit directly. -- **Tests** (`tests/testthat/`): 100% file-level coverage goal. +A tool to rectangle a nested list, that is to convert it into a tibble. This is done automatically or according to a given specification. A common use case is for nested lists coming from parsing 'JSON' files, or the 'JSON' responses of 'REST' 'APIs'. Rectangling uses the 'vctrs' package, and therefore offers a wide support of vector types. + +pkgdown site: https://tibblify.wrangle.zone +source repo: https://github.com/wranglezone/tibblify + +### Overall structure + +The project follows standard R package conventions with these key directories: + +tibblify/ +├── R/ # R source code +│ ├── tibblify-package.R # Auto-generated package docs +│ └── *.R # Function definitions, 1 file ~= 1 exported function +├── tests/testthat/ # Test suite +├── .github/ +│ ├── ISSUE_TEMPLATE/ # GitHub issue templates +│ ├── skills/ # Agent skill definitions +│ └── workflows/ # CI/CD configurations +├── src/ # C source code +│ └── rlang/ # Vendored helpers from the rlang package +├── man/ # Generated documentation +├── AGENTS.md # Main agent setup file +├── DESCRIPTION # Package metadata +├── NAMESPACE # Auto-generated export information +├── NEWS.md # Changelog +└── Various config files # .gitignore, codecov.yml, etc. + +--- ## Standard workflow For any feature, fix, or refactor: 1. **Update packages**: `pak::pak()` -2. **Run existing tests** — confirm everything passes before touching code: - `devtools::test(reporter = "check")` - If any tests fail, stop and ask the user how to proceed. -3. **Plan** — identify which R and C files are affected; check whether new exported - functions are needed (→ r-code skill) or C changes are required (→ c-code skill) -4. **Test first** — write a failing test, then implement (→ tdd-workflow skill): - `devtools::test(filter = "name", reporter = "check")` — should fail -5. **Implement** — minimal code to make the tests pass -6. **Refactor** — clean up while keeping all tests green -7. **Document** — for any new or changed exported functions, use the document skill -8. **Verify**: - ```r - devtools::test(reporter = "check") - covr_res <- devtools:::test_coverage_active_file("R/file_name.R") - which(purrr::map_int(covr_res, "value") == 0) - ``` - Then run `air format .` - Once, just before wrapping up: `devtools::check(error_on = "warning")`. - Warnings and errors must be resolved. NOTEs should be resolved too, with one - exception: "Compiled code should not call non-API entry points in R" is a - known issue being addressed progressively (primarily via rlang's efforts) — - leave it alone. -9. **News** — add a bullet at the top of `NEWS.md` (under the development version - heading). Rules: - - Only for user-facing changes; skip purely internal changes. - - One line per bullet; end with a full stop. - - Write for users, not developers. Frame positively, present tense: - `* \`tib_chr()\` now accepts ...` not `* Fixed a bug where ...` - - Put the function name (in backticks with `()`) near the start. - - Issue number and contributor go in parentheses before the final period: - `* \`tib_chr()\` now accepts ... (@{username}, #{issue_number}).` - - Get username: `gh api user --jq .login` +2. **Run tests** — confirm passing before changes: `devtools::test(reporter = "check")`. If any fail, stop and ask. +3. **Plan** — identify affected R files; check if new exports are needed. +4. **Test first** — write failing test, then implement: `devtools::test(filter = "name", reporter = "check")`. +5. **Implement** — minimal code to pass tests. +6. **Refactor** — clean up, keep tests green. +7. **Document** — document any new or changed exports. +8. **Verify**: Run `devtools::test(reporter = "check")`, then `devtools::check(error_on = "warning")`. Resolve warnings, errors, and NOTEs. +9. **News** — add bullet at top of `NEWS.md` (under dev heading): + - User-facing changes only. 1 line, end with `.` + - Present tense, positive framing, function names (backticks + `()`) near start: `` * `fn()` now accepts ... `` not `* Fixed ...` + - Issue/contributor before final period: `` * `fn()` now accepts ... (@user, #N). `` where `#N` is the GitHub issue number being implemented (e.g. `#42`). + - Get username: `gh api user --jq .login`; get issue number from the user's prompt, the branch name (`git branch --show-current`), or `gh issue list`. + - **Never guess or invent an issue number.** Before writing it, verify: (1) you received it from the user or the branch name, OR (2) you looked it up with `gh`. If you cannot trace the number to a concrete source, use `#noissue`. + +--- ## General -- When running R from the console, use `--quiet --vanilla`. +- R console: use `--quiet --vanilla`. - Always run `air format .` after generating R code. -- Code comments should explain *why*, not *what*. Omit comments that restate the code. -- When writing or reviewing any code, load the relevant skills (usually `r-code`, `tdd-workflow`, and `document`). +- Comments explain *why*, not *what*. ## Skills -Load skills from @.github/skills when the user triggers them. - -| Trigger | Path | -|---------------------------------------------------|------------------------------------------------| -| tag tests with issues | @.github/skills/tag-tests-with-issues/SKILL.md | -| document functions | @.github/skills/document/SKILL.md | -| create a GitHub issue | @.github/skills/create-issue/SKILL.md | -| implement issue / work on #NNN | @.github/skills/implement-issue/SKILL.md | -| edit files in src/ / C code | @.github/skills/c-code/SKILL.md | -| writing R functions / API design / error handling | @.github/skills/r-code/SKILL.md | -| writing or reviewing tests | @.github/skills/tdd-workflow/SKILL.md | +| Triggers | Path | +|----------|------| +| create GitHub issues | @.github/skills/create-issue/SKILL.md | +| document functions | @.github/skills/document/SKILL.md | +| from github | @.github/skills/github/SKILL.md | +| implement issue / work on #NNN | @.github/skills/implement-issue/SKILL.md | +| writing R functions / API design / error handling | @.github/skills/r-code/SKILL.md | +| search / rewrite code | @.github/skills/search-code/SKILL.md | +| writing or reviewing tests | @.github/skills/tdd-workflow/SKILL.md | diff --git a/DESCRIPTION b/DESCRIPTION index 37a019bb..e416efe0 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -30,6 +30,7 @@ Imports: vctrs (>= 0.7.2), withr (>= 2.5.2) Suggests: + astgrepr, covr (>= 3.6.1), jsonlite (>= 1.8.0), knitr (>= 1.40), diff --git a/R/aaa-conditions.R b/R/aaa-conditions.R new file mode 100644 index 00000000..35f21bb5 --- /dev/null +++ b/R/aaa-conditions.R @@ -0,0 +1,22 @@ +#' Raise a package-scoped error +#' +#' @inheritParams .shared-params +#' @inheritParams stbl::pkg_abort +#' @returns Does not return. +#' @keywords internal +.pkg_abort <- function( + message, + subclass, + call = caller_env(), + message_env = caller_env(), + ... +) { + stbl::pkg_abort( + "tibblify", + message, + subclass, + call = call, + message_env = message_env, + ... + ) +} From a1ae2281f076715ba21398085f8518ac29bbc53d Mon Sep 17 00:00:00 2001 From: Jon Harmon Date: Sun, 5 Apr 2026 13:26:36 -0500 Subject: [PATCH 2/2] add stbl to imports for error messaging --- DESCRIPTION | 1 + 1 file changed, 1 insertion(+) diff --git a/DESCRIPTION b/DESCRIPTION index e416efe0..20c57d67 100644 --- a/DESCRIPTION +++ b/DESCRIPTION @@ -25,6 +25,7 @@ Imports: lifecycle, purrr (>= 1.0.2), rlang (>= 1.1.0), + stbl, tibble (>= 3.2.1), tidyselect (>= 1.2.0), vctrs (>= 0.7.2),