Split run_sim() dataset preparation into create_sim_dataset()#48
Open
roninsightrx wants to merge 8 commits intomainfrom
Open
Split run_sim() dataset preparation into create_sim_dataset()#48roninsightrx wants to merge 8 commits intomainfrom
roninsightrx wants to merge 8 commits intomainfrom
Conversation
Move regimen building, covariate sampling, and t_obs record creation out of run_sim() into a new exported function create_sim_dataset(). run_sim() now accepts a pre-built data.frame from create_sim_dataset() as its `data` argument; the regimen, t_obs, covariates, and n_subjects arguments have been removed from run_sim(). create_sim_dataset() accepts a model object or file path, an optional data override (data.frame or CSV), and the dataset-shaping arguments. It uses get_required_input_variables() to map CSV columns to $INPUT names when a raw data file is supplied. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Covers input validation, return structure, CMT handling, n_subjects truncation, data override, t_obs record creation, regimen replacement, and covariate substitution. Pure validation tests run without Pharmpy; functional tests skip when NONMEM/Pharmpy is not configured. Also fixes create_sim_dataset() to only apply positional column renaming (via get_required_input_variables) for CSV file inputs, not for data.frames that already carry correct NONMEM column names. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Contributor
There was a problem hiding this comment.
Pull request overview
This PR refactors simulation dataset construction out of run_sim() into a new exported helper create_sim_dataset(), and updates tests/docs to reflect the new workflow (prepare dataset first, then pass it to run_sim(data=...)).
Changes:
- Added exported
create_sim_dataset()to build NONMEM-format simulation datasets (regimen, covariates, observation records). - Simplified
run_sim()to consume a prepared dataset viadataand removed dataset-shaping arguments from its interface. - Updated/added documentation and tests for the new split API.
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
R/create_sim_dataset.R |
New exported dataset-prep function extracted from run_sim() logic. |
R/run_sim.R |
Removes regimen/t_obs/covariate shaping; loops over .regimen datasets and runs NONMEM simulations. |
tests/testthat/test-create_sim_dataset.R |
New test suite for create_sim_dataset(). |
tests/testthat/test-run_sim.R |
Updates tests to use create_sim_dataset() + run_sim() integration and adjusts prior stubs. |
man/create_sim_dataset.Rd |
New generated docs for create_sim_dataset(). |
man/run_sim.Rd |
Updates run_sim() docs to reflect new data expectations. |
man/create_regimen.Rd, man/combine_regimens.Rd |
Points regimen helpers at create_sim_dataset() instead of run_sim(). |
man/calc_pk_variables.Rd |
Updates generated docs after parameter inheritance changes. |
NAMESPACE |
Exports create_sim_dataset(). |
DESCRIPTION |
Version bump. |
R/create_regimen.R, R/set_simulation_clean.R |
Minor doc adjustments. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Passing a file path as `data` to run_sim() would silently assign a character scalar to sim_data and then error cryptically on sim_data[[".regimen"]]. Add an explicit cli_abort() with a hint to use create_sim_dataset() first, and add a matching test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
run_sim() now extracts the last dose (AMT where EVID==1) from sim_data_regimen and passes it to calc_pk_variables() so AUC_SS = last_dose / CL is computed when CL is in the output table. Previously, regimen = NULL was always passed, silently suppressing AUC_SS after the run_sim refactor. Also updates the add_pk_variables doc and adds a regression test. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…set() When data is a CSV file, always rename columns positionally to the names declared in $INPUT (not only when addl_cols > 0). This means $INPUT aliases such as ID=SUBJ or DV=CONC are normalised to the canonical NONMEM names (ID, DV, …) that downstream code relies on, even when the column count matches exactly. The rename is intentionally restricted to the character/CSV branch; data.frames supplied by the caller already carry correct NONMEM names and are used as-is. Adds a regression test for the alias-rename path. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
…sim_dataset()" This reverts commit 335e7ba.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
run_sim()into a new exported functioncreate_sim_dataset()run_sim()no longer acceptsregimen,t_obs,covariates, orn_subjects— callers build the dataset first withcreate_sim_dataset(), then pass it asdatacreate_sim_dataset()accepts a Pharmpy model object or a model file path, an optionaldataoverride (data.frame or CSV), and all dataset-shaping argumentscreate_sim_dataset()covering input validation, t_obs, regimen, covariates, and n_subjects behaviourTest plan
devtools::test(filter = "create_sim_dataset")passes on CI (Pharmpy available)devtools::test(filter = "run_sim")passes on CIdevtools::check()clean🤖 Generated with Claude Code