fix(import_phoenix_ast): handle wide-format Phoenix exports #103
Open
efosternyarko wants to merge 1 commit into AMRverse:main from
…per sample, drug triplet columns)

- Detect wide format via multiple '(MIC)'/'(MOC)' columns and pivot to long format before column resolution
- Handle use_expertized logic during pivot (expert falls back to interp per row)
- Drop untested drug rows (both MIC and SIR absent)
- Add 'accession' to sample column auto-detect patterns, prioritised before 'isolate'
- Switch final select from relocate() to select() to drop pivot intermediate columns

Tested on three real Phoenix files: NMIC-422 (677 samples, 25 drugs), PMIC-84 (1228 samples, 18 drugs), Phoenix-Antibiogramm-Daten.xls (headerless)
CI failure note
The two failing jobs (
The most recent push to |
Problem
import_phoenix_ast() failed silently or produced incorrect results when given Phoenix export files in wide format (one row per sample, with drug results stored as column triplets: XX (MIC), XX (Interp), XX (Expert)). The function was written for long-format Phoenix exports (one row per drug) and had no path for the wide format common in Phoenix XLSX batch exports.
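For reviewers unfamiliar with the layout, here is a minimal sketch of what a wide-format export looks like and one way to detect and pivot it with tidyr. The sample IDs, drug names, values, and the variable names (wide, is_wide, long) are made up for illustration, and this is not the code in R/import_pheno.R.

```r
library(tidyr)

# Hypothetical wide-format export: one row per sample, one column triplet per drug.
wide <- data.frame(
  Accession = c("S1", "S2"),
  `Ampicillin (MIC)` = c("<=2", NA),
  `Ampicillin (Interp)` = c("S", NA),
  `Ampicillin (Expert)` = c(NA_character_, NA_character_),
  `Ciprofloxacin (MIC)` = c(">2", "0.5"),
  `Ciprofloxacin (Interp)` = c("R", "S"),
  `Ciprofloxacin (Expert)` = c("R", NA),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Assumed detection rule: the file is wide when more than one
# column name ends in "(MIC)".
is_wide <- sum(grepl("\\(MIC\\)$", names(wide))) > 1

# Split each "<drug> (<field>)" column name into a drug value and a
# field name; ".value" turns MIC / Interp / Expert into output columns.
long <- pivot_longer(
  wide,
  cols = matches("\\((MIC|Interp|Expert)\\)$"),
  names_to = c("drug", ".value"),
  names_pattern = "^(.*) \\((MIC|Interp|Expert)\\)$"
)
```

Here long has one row per sample and drug, with the metadata column (Accession) repeated on each row.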
Changes
R/import_pheno.R

- Detect wide format via multiple (MIC)/(MOC) columns. When detected, pivot to long format: metadata columns are preserved per row, and each drug triplet becomes one row with standardised drug, mic, and Interp columns.
- Handle use_expertized logic during pivot: expert interpretation is preferred over raw Interp per row when use_expertized = TRUE (default), with per-row fallback to Interp where expert is NA.
- Add "accession" to the sample ID detection patterns, prioritised before "^isolate$". Phoenix wide exports commonly use "Accession" as the sample identifier; previously "Isolate" (the within-batch isolate number) was matched first, collapsing hundreds of samples to a handful of unique IDs.
- Switch the final select from relocate() to select() so pivot intermediate columns (drug, Interp) are not leaked into the returned data frame.

Testing
Tested on three real Phoenix export files:
- NMIC-422 MIC Results 20026-Feb.xlsx
- PMIC-84 MIC Results 2026 -Feb .xlsx
- Phoenix-Antibiogramm-Daten.xls

All three return the standardised 8-column output (id, drug_agent, mic, disk, method, platform, pheno_provided, spp_pheno). The existing headerless positional path is unaffected.
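As a rough illustration of the row-level rules listed under Changes (expert fallback, dropping untested rows, and sample ID pattern priority), a base R sketch follows. The data, the pheno column, and the pick_sample_col() helper are hypothetical and only approximate the behaviour described; the actual implementation in R/import_pheno.R may differ.

```r
# Hypothetical long-format rows, as they might look right after the pivot.
long <- data.frame(
  Accession = c("S1", "S1", "S2", "S2"),
  drug      = c("Ampicillin", "Ciprofloxacin", "Ampicillin", "Ciprofloxacin"),
  MIC       = c("<=2", ">2", NA, "0.5"),
  Interp    = c("S", "I", NA, "S"),
  Expert    = c(NA, "R", NA, NA),
  stringsAsFactors = FALSE
)

use_expertized <- TRUE

# Prefer the expert call; fall back to Interp on rows where Expert is NA.
long$pheno <- if (use_expertized) {
  ifelse(is.na(long$Expert), long$Interp, long$Expert)
} else {
  long$Interp
}

# Drop untested rows: both the MIC and the interpretation are missing.
tested <- long[!(is.na(long$MIC) & is.na(long$pheno)), ]

# Hypothetical helper: try patterns in priority order and return the
# first matching column name, so "accession" wins over "^isolate$".
pick_sample_col <- function(col_names,
                            patterns = c("accession", "^isolate$")) {
  for (p in patterns) {
    hit <- grep(p, col_names, ignore.case = TRUE, value = TRUE)
    if (length(hit) > 0) return(hit[1])
  }
  NA_character_
}

sample_col <- pick_sample_col(c("Isolate", "Accession", "Organism"))
```

With both "Isolate" and "Accession" present, sample_col is "Accession"; a file with only "Isolate" still resolves to that column.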