Skip to content

Split run_sim() dataset preparation into create_sim_dataset()#48

Open
roninsightrx wants to merge 8 commits intomainfrom
split-sim
Open

Split run_sim() dataset preparation into create_sim_dataset()#48
roninsightrx wants to merge 8 commits intomainfrom
split-sim

Conversation

@roninsightrx
Copy link
Copy Markdown
Contributor

Summary

  • Extracts dataset preparation logic (regimen building, covariate sampling, observation record creation) from run_sim() into a new exported function create_sim_dataset()
  • run_sim() no longer accepts regimen, t_obs, covariates, or n_subjects — callers build the dataset first with create_sim_dataset(), then pass it as data
  • create_sim_dataset() accepts a Pharmpy model object or a model file path, an optional data override (data.frame or CSV), and all dataset-shaping arguments
  • Adds a full test suite for create_sim_dataset() covering input validation, t_obs, regimen, covariates, and n_subjects behaviour

Test plan

  • devtools::test(filter = "create_sim_dataset") passes on CI (Pharmpy available)
  • devtools::test(filter = "run_sim") passes on CI
  • devtools::check() clean

🤖 Generated with Claude Code

roninsightrx and others added 2 commits March 27, 2026 22:03
Move regimen building, covariate sampling, and t_obs record creation
out of run_sim() into a new exported function create_sim_dataset().
run_sim() now accepts a pre-built data.frame from create_sim_dataset()
as its `data` argument; the regimen, t_obs, covariates, and n_subjects
arguments have been removed from run_sim().

create_sim_dataset() accepts a model object or file path, an optional
data override (data.frame or CSV), and the dataset-shaping arguments.
It uses get_required_input_variables() to map CSV columns to $INPUT
names when a raw data file is supplied.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers input validation, return structure, CMT handling, n_subjects
truncation, data override, t_obs record creation, regimen replacement,
and covariate substitution. Pure validation tests run without Pharmpy;
functional tests skip when NONMEM/Pharmpy is not configured.

Also fixes create_sim_dataset() to only apply positional column
renaming (via get_required_input_variables) for CSV file inputs, not
for data.frames that already carry correct NONMEM column names.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR refactors simulation dataset construction out of run_sim() into a new exported helper create_sim_dataset(), and updates tests/docs to reflect the new workflow (prepare dataset first, then pass it to run_sim(data=...)).

Changes:

  • Added exported create_sim_dataset() to build NONMEM-format simulation datasets (regimen, covariates, observation records).
  • Simplified run_sim() to consume a prepared dataset via data and removed dataset-shaping arguments from its interface.
  • Updated/added documentation and tests for the new split API.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 11 comments.

Show a summary per file
File Description
R/create_sim_dataset.R New exported dataset-prep function extracted from run_sim() logic.
R/run_sim.R Removes regimen/t_obs/covariate shaping; loops over .regimen datasets and runs NONMEM simulations.
tests/testthat/test-create_sim_dataset.R New test suite for create_sim_dataset().
tests/testthat/test-run_sim.R Updates tests to use create_sim_dataset() + run_sim() integration and adjusts prior stubs.
man/create_sim_dataset.Rd New generated docs for create_sim_dataset().
man/run_sim.Rd Updates run_sim() docs to reflect new data expectations.
man/create_regimen.Rd, man/combine_regimens.Rd Points regimen helpers at create_sim_dataset() instead of run_sim().
man/calc_pk_variables.Rd Updates generated docs after parameter inheritance changes.
NAMESPACE Exports create_sim_dataset().
DESCRIPTION Version bump.
R/create_regimen.R, R/set_simulation_clean.R Minor doc adjustments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

roninsightrx and others added 6 commits March 27, 2026 23:10
Passing a file path as `data` to run_sim() would silently assign a
character scalar to sim_data and then error cryptically on
sim_data[[".regimen"]]. Add an explicit cli_abort() with a hint to
use create_sim_dataset() first, and add a matching test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
run_sim() now extracts the last dose (AMT where EVID==1) from
sim_data_regimen and passes it to calc_pk_variables() so AUC_SS =
last_dose / CL is computed when CL is in the output table.
Previously, regimen = NULL was always passed, silently suppressing
AUC_SS after the run_sim refactor.

Also updates the add_pk_variables doc and adds a regression test.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…set()

When data is a CSV file, always rename columns positionally to the
names declared in $INPUT (not only when addl_cols > 0). This means
$INPUT aliases such as ID=SUBJ or DV=CONC are normalised to the
canonical NONMEM names (ID, DV, …) that downstream code relies on,
even when the column count matches exactly.

The rename is intentionally restricted to the character/CSV branch;
data.frames supplied by the caller already carry correct NONMEM names
and are used as-is.

Adds a regression test for the alias-rename path.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants