Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .Rbuildignore
Original file line number Diff line number Diff line change
Expand Up @@ -12,3 +12,4 @@
^\.vscode$
^AGENTS\.md$
^\.positai$
^\.claude$
57 changes: 57 additions & 0 deletions .github/skills/implement-issue/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,57 @@
---
name: implement-issue
description: Implements a GitHub issue end-to-end. Use when asked to implement, work on, or fix a specific issue number.
compatibility: Requires the `gh` CLI and an authenticated GitHub session.
---

# Implement a GitHub issue

This skill wraps the Standard Workflow defined in `AGENTS.md`. Run the steps
below before and after that workflow.

## Before the standard workflow

**A. Read the issue in full:**

```bash
gh issue view {number}
```

If `gh` is not authenticated, stop and ask the user to authenticate before continuing.

**B. Check/create the branch:**

- If on `main`: `usethis::pr_init("fix-{number}-{description}")`
- Branch format: `fix-{number}-{description}`
- Parts separated by hyphens; `{description}` uses snake_case
- Example: `fix-279-ai_infrastructure`
Comment thread
jonthegeek marked this conversation as resolved.
- If a branch already exists for this issue, check it out instead

## Run the Standard Workflow from AGENTS.md

Steps 1–9 of the Standard Workflow are the core development loop.

## After the standard workflow

**C. Commit and push:**

1. Review commits already on this branch (not on `main`) — these are all part
of the same PR and should inform the PR description:
```bash
git log main..HEAD --oneline
```
2. Stage and commit all changes:
```bash
git add -A
git commit -m "{short imperative summary}"
```
3. Push and open the PR:
```bash
gh pr create --fill
```
Use `--title` and `--body` explicitly if `--fill` produces an inadequate
description.

This step may be overridden — the user may ask you to stop before committing,
handle the push themselves, or complete only part of the workflow. Always follow
explicit user instructions over these defaults.
139 changes: 139 additions & 0 deletions .github/skills/r-code/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,139 @@
---
name: r-code
description: Guide for writing R code in datawrap. Use when writing new functions, designing APIs, or reviewing/modifying existing R code.
---

# R code in datawrap

This skill covers how to design and write R functions in this package —
including signatures, API conventions, error handling, and common pitfalls.
For documenting functions, use the `document` skill. For tests, use the
`tdd-workflow` skill.

## General API design patterns

**Enum-like arguments** — declare choices as the default vector; resolve with
`arg_match0()` (exact match, no partial matching) at the top of the function:

```r
my_function <- function(x, mode = c("fast", "safe")) {
mode <- arg_match0(mode, c("fast", "safe"))
# mode is now guaranteed to be "fast" or "safe"
}
```

**`NULL` as "not provided"** — use `NULL` as the default for optional
arguments where there is no sensible scalar fallback; check with `is.null()`:

```r
my_function <- function(x, names_to = NULL) {
if (!is.null(names_to)) { ... }
}
```

**S3 object construction** — build as a named list, set class explicitly:

```r
.my_object <- function(x, type) {
out <- list(value = x, type = type)
class(out) <- c(paste0("my_", type), "my_object")
out
}
```

**`.call` propagation in internal validators** — helpers that validate
arguments and may throw errors should accept and forward `.call`:

```r
.check_something <- function(x, .call = caller_env()) {
if (bad(x)) cli::cli_abort("...", call = .call)
}

my_exported_fn <- function(x, .call = caller_env()) {
.check_something(x, .call)
}
```

**Return tibbles, not data frames:**

```r
my_function <- function(x) {
result |> tibble::as_tibble()
}
```

## Internal vs. exported functions

Export a function when:
- Users will call it directly
- Other packages may want to extend it
- It is a stable, intentional part of the API

Keep a function internal when:
- It is an implementation detail that may change
- It is only used within the package
- Exporting it would clutter the user-facing API

Internal helpers use a dot prefix (e.g. `.my_helper()`). Mark them with
`@keywords internal` and omit `@export`.

## Error handling

```r
cli::cli_abort(
"Input {.arg x} cannot be empty.",
"i" = "Provide a non-empty vector.",
call = caller_env()
)
```
Comment thread
jonthegeek marked this conversation as resolved.

- Always pass `call = caller_env()` (or `call = .call` or `call = call`, if a
call is passed to the function) so errors point to the user's call frame, not
an internal helper.
- Use custom error classes for errors that callers may want to catch
programmatically:
```r
cli::cli_abort(
"...",
class = c("my_specific_error", "datawrap_error"),
call = caller_env()
)
```

## Common package mistakes

```r
# Never use library() inside package code
library(dplyr) # Wrong
dplyr::filter(...) # Right
# or `@importFrom dplyr filter` if used extensively

# Never modify global state without restoring it
options(my_option = TRUE) # Wrong
withr::local_options(list(my_option = TRUE)) # Right

# Use system.file() for package data, not hardcoded paths
read.csv("/home/user/data.csv") # Wrong
system.file("extdata", "data.csv", package = "datawrap") # Right
```

## Dependencies

### Use existing imports first

Packages that are already in `Imports` in @DESCRIPTION should be used in
preference to base R equivalents when relevant.

For example: prefer `purrr::map()` over `lapply()`, `rlang::is_*()` predicates
over `is.*()`, `vctrs::vec_*()` over base length/NA checks, and
`withr::local_*()` over manual `on.exit()` state management.
Comment thread
jonthegeek marked this conversation as resolved.

### When to add a new dependency

Add a dependency when it provides significant functionality that would be
complex or brittle to reimplement — date parsing, web requests, complex string
manipulation. Stick with base R when the solution is straightforward and a
dependency would add overhead for little gain.

datawrap's existing dependencies are intentional. **Adding a new dependency
requires explicit discussion with the developer.**
186 changes: 186 additions & 0 deletions .github/skills/tdd-workflow/SKILL.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,186 @@
---
name: tdd-workflow
description: Test-driven development workflow for datawrap. Use when writing any R code (writing new features, fixing bugs, refactoring, or reviewing tests).
---

# TDD workflow for datawrap

## Core principle

Write a failing test first, then implement the minimal code to make it pass,
then refactor. Never write implementation code without a failing test driving it.

## File naming

Tests for `R/{name}.R` go in `tests/testthat/test-{name}.R`. Place new tests
next to similar existing ones.

## Running tests

```r
# Full suite
devtools::test(reporter = "check")

# Single file
devtools::test(filter = "name", reporter = "check")
```

Testing functions load code automatically. You do not need to call `library()` or `devtools::load_all()` separately.

## Coverage

Goal: **100%** for every edited file. After editing `R/file_name.R`, verify:

```r
covr_res <- devtools:::test_coverage_active_file("R/file_name.R")
which(purrr::map_int(covr_res, "value") == 0)
```
Comment thread
jonthegeek marked this conversation as resolved.

Files excluded from the coverage requirement:
- `R/datawrap-package.R`
- Files matching `R/import-standalone-*.R`

## Test types

### Unit tests

Test individual functions in isolation:

```r
test_that("create_dataset_dictionary returns a tibble (#2)", {
result <- create_dataset_dictionary(data.frame(x = 1L))
expect_s3_class(result, "tbl_df")
})
```

### Snapshot tests

For complex outputs that are hard to specify with equality assertions:

```r
test_that("print method is stable (#123)", {
expect_snapshot(print(create_dataset_dictionary(mtcars)))
})

# For errors, wrap expect_error() inside expect_snapshot() so both the error
# class and the message text are captured in the snapshot:
test_that("errors on invalid input (#456)", {
expect_snapshot({
(expect_error(
create_dataset_dictionary("not a data frame"),
class = "datawrap_error"
))
})
Comment thread
jonthegeek marked this conversation as resolved.
})
```

When snapshots change intentionally:

```r
testthat::snapshot_review("test_name")
testthat::snapshot_accept("test_name")
```

Snapshots are stored in `tests/testthat/_snaps/`.

## Test design principles

- **Self-sufficient:** each test contains its own setup, execution, and
assertion. Tests must be runnable in isolation.
- **Duplication over factoring:** repeat setup code rather than extracting it.
Clarity beats DRY in tests.
- **One concept per test:** a failing test should tell you exactly what broke.
- **Issue reference in description:** the `desc` of every new `test_that()` call
should end with a parenthetical issue reference:
```r
test_that("create_dataset_dictionary() returns correct columns (#1)", { ... })
test_that("errors on empty input (#noissue)", { ... })
```

## testthat Edition 3 — deprecated patterns

```r
# Deprecated → Modern
context("Data validation") # Remove — filename serves this purpose
expect_equivalent(x, y) # expect_equal(x, y, ignore_attr = TRUE)
with_mock(...) # local_mocked_bindings(...)
expect_is(x, "data.frame") # expect_s3_class(x, "data.frame")
```

## Essential expectations

### Equality & identity

```r
expect_equal(x, y) # with numeric tolerance
expect_equal(x, y, tolerance = 0.001)
expect_equal(x, y, ignore_attr = TRUE)
expect_identical(x, y) # exact match
```

### Conditions

```r
expect_error(code)
expect_error(code, "pattern")
expect_error(code, class = "datawrap_error")
expect_warning(code)
expect_no_warning(code)
expect_message(code)
expect_no_message(code)
```

### Collections

```r
expect_setequal(x, y) # same elements, any order
expect_in(element, set)
expect_named(x, c("a", "b"))
```

### Type & structure

```r
expect_type(x, "double")
expect_s3_class(x, "tbl_df")
expect_length(x, 10)
```

### Logical

```r
expect_true(x)
expect_false(x)
expect_null(x)
```

## `withr` patterns for temporary state

```r
withr::local_options(list(datawrap.option = TRUE))
withr::local_envvar(MY_VAR = "value")
withr::local_tempfile(lines = c("a", "b"))
```

## Fixtures

Store static test data in `tests/testthat/fixtures/` and access via:

```r
test_path("fixtures", "sample.rds")
```

## Mocking

```r
local_mocked_bindings(
external_fn = function(...) "mocked_result"
)
result <- my_function_that_calls_external_fn()
```

## Common mistakes

- **Do not modify tests to make them pass.** Fix the implementation.
- **Do not write tests that depend on other tests' state.** Each test must
be independently runnable.
Loading