46 changes: 42 additions & 4 deletions R/data.R
@@ -705,9 +705,9 @@
"btESBL_AST"


-#' DASSIM genotype data (AMRfinderplus)
+#' DASSIM genotype data (AMRFinderPlus)
#'
-#' The DASSIM dataset screened for Antimicrobial resistance genes (ARGs) using AMRfinderplus v4.0.23
+#' The DASSIM dataset screened for Antimicrobial resistance genes (ARGs) using AMRFinderPlus v4.0.23
#'
#' @format `DASSIM_geno` A data frame with 12414 rows and 31 columns:
#' - `Name`: Name.
@@ -758,6 +758,7 @@
#' - `method`, `platform`, `guideline`: Test method and platform and interpretation guideline.
#' - `pheno_provided`: S/I/R interpretation as provided in the raw input.
#' - `spp_pheno`: Species identifier, interpreted from `Scientific name` using `as.mo`, used to interpret `ecoff` and `pheno` columns.
#' - ...: Additional data columns from the NCBI AST Browser
#' @source <https://www.ncbi.nlm.nih.gov/pathogens/ast#chloramphenicol%20AND%20Escherichia>
"NCBI_Ecoli_AST_chl"

@@ -769,13 +770,15 @@
#'
#' @format `MICROBIGGE_Ecoli_CHLR` A data frame with 95,776 rows and 27 columns:
#' - `id`: BioSample.
-#' - `drug_agent`, `drug_class`: Antibiotic agent and class, determined by parsing AMRfinderplus `subclass` field in the downloaded file.
+#' - `drug_agent`, `drug_class`: Antibiotic agent and class, determined by parsing AMRFinderPlus `subclass` field in the downloaded file.
#' - `gene`, `node`, `marker`: gene identifiers.
#' - `mutation`: mutation within gene, parsed into HGVS nomenclature format from `amr_element_symbol` field in the downloaded file.
#' - `% Coverage of reference`: % Coverage of reference.
#' - `% Identity to reference`: % Identity to reference.
-#' - ...: Additional data columns from AMRfinderplus#' @source <https://www.ncbi.nlm.nih.gov/pathogens/microbigge/#chloramphenicol%20AND%20Escherichia>
+#' - ...: Additional data columns from AMRFinderPlus
+#' @source <https://www.ncbi.nlm.nih.gov/pathogens/microbigge/#chloramphenicol%20AND%20Escherichia>
"MICROBIGGE_Ecoli_CHLR"

#' Example Resistance Gene Identifier (RGI) v6.0.6 Genotype Data
#'
#' Raw RGI v6.0.6 results file (run with `--include_loose`) for 12 genomes of multiple species, one AMR determinant per row.
@@ -823,3 +826,38 @@
#' @source ENA BioProject [PRJEB10018](https://www.ebi.ac.uk/ena/browser/view/PRJEB10018).
#' See David *et al.* (2019) <https://doi.org/10.1038/s41564-019-0492-8>.
"rgi_EuSCAPE_raw"


#' Example AMRFinderPlus Genotype Data from the EuSCAPE Project
#'
#' AMRFinderPlus results for Klebsiella pneumoniae isolates from the EuSCAPE project, one AMR determinant per row, downloaded from the EBI AMR Portal using [download_ebi()] and imported using [import_geno()].
#'
#' @format `kp_mero_amrfp` A data frame with 32,385 rows and 34 columns:
#' - `id`: BioSample.
#' - `drug_agent`, `drug_class`: Antibiotic agent and class, determined by parsing AMRFinderPlus `subclass` field in the downloaded file.
#' - `gene`, `node`, `marker`: gene identifiers.
#' - `mutation`: mutation within gene, parsed into HGVS nomenclature format from `amr_element_symbol` field in the downloaded file.
#' - `% Coverage of reference`: % Coverage of reference.
#' - `% Identity to reference`: % Identity to reference.
#' - ...: Additional data columns from AMRFinderPlus
#' @source [EBI AMR Portal](https://www.ebi.ac.uk/amr).
#' See David *et al.* (2019) <https://doi.org/10.1038/s41564-019-0492-8>.
"kp_mero_amrfp"


#' Meropenem Phenotype Data from the EuSCAPE Project
#'
#' Meropenem phenotype data for Klebsiella pneumoniae isolates from the EuSCAPE project, one sample per row, downloaded from the EBI AMR Portal using [download_ebi()] and imported using [import_pheno()].
#'
#' @format `kp_mero_euscape` A data frame with 1,490 rows and 43 columns:
#' - `id`: Sample identifier, imported from the `BioSample` column in the raw input.
#' - `drug_agent`: Antibiotic code, interpreted from `Antibiotic` using `as.ab`.
#' - `mic`: Minimum inhibitory concentration, formatted using `as.mic`.
#' - `disk`: Disk diffusion zone, formatted using `as.disk`.
#' - `method`, `platform`, `guideline`: Test method, platform, and interpretation guideline.
#' - `pheno_provided`: S/I/R interpretation as provided in the raw input.
#' - `spp_pheno`: Species identifier, interpreted from `Scientific name` using `as.mo`, used to interpret `ecoff` and `pheno` columns.
#' - ...: Additional data columns from EBI AMR Portal
#' @source [EBI AMR Portal](https://www.ebi.ac.uk/amr).
#' See David *et al.* (2019) <https://doi.org/10.1038/s41564-019-0492-8>.
"kp_mero_euscape"
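
The two datasets documented above share the `id` column (BioSample), so genotype calls can be matched to phenotypes per sample. The following is a minimal base R sketch of that idea on toy data only: the ids, the marker names, and the plain character coding of `pheno_provided` are assumptions for illustration and are not taken from `kp_mero_amrfp` or `kp_mero_euscape`.

```r
# Toy phenotype table: one row per sample (hypothetical ids and values).
pheno <- data.frame(
  id = c("S1", "S2", "S3"),
  pheno_provided = c("R", "S", "R"),
  stringsAsFactors = FALSE
)

# Toy genotype table: one row per AMR determinant (hypothetical markers).
geno <- data.frame(
  id = c("S1", "S1", "S3"),
  marker = c("markerA", "markerB", "markerA"),
  stringsAsFactors = FALSE
)

# Samples carrying the marker of interest.
has_marker <- unique(geno$id[geno$marker == "markerA"])

# Flag each phenotyped sample by marker presence, matching on `id`.
pheno$markerA <- pheno$id %in% has_marker

# Cross-tabulate marker presence against the provided S/I/R call.
tab <- table(markerA = pheno$markerA, pheno = pheno$pheno_provided)
print(tab)
```

Because the genotype table has one row per determinant and the phenotype table one row per sample, collapsing the genotype side to a per-sample flag before comparing avoids duplicating phenotype rows.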
23 changes: 12 additions & 11 deletions R/import_pheno.R
@@ -2228,17 +2228,18 @@ import_phoenix_ast <- function(input,
#' )
#' }
import_sirscan_ast <- function(
-    mic_file = NULL,
-    disk_file = NULL,
-    interpr_file = NULL,
-    source = NULL,
-    species = NULL,
-    ab = NULL,
-    instrument_guideline = NULL,
-    sirscan_codes = sirscan_codes,
-    interpret_eucast = FALSE,
-    interpret_clsi = FALSE,
-    interpret_ecoff = FALSE) {
+  mic_file = NULL,
+  disk_file = NULL,
+  interpr_file = NULL,
+  source = NULL,
+  species = NULL,
+  ab = NULL,
+  instrument_guideline = NULL,
+  sirscan_codes = sirscan_codes,
+  interpret_eucast = FALSE,
+  interpret_clsi = FALSE,
+  interpret_ecoff = FALSE
+) {
  if (is.null(mic_file) && is.null(disk_file) && is.null(interpr_file)) {
    stop("At least one of 'mic_file', 'disk_file', or 'interpr_file' must be provided")
  }
36 changes: 36 additions & 0 deletions data-raw/prep_euscape_example_data.R
@@ -0,0 +1,36 @@
library(dplyr) # provides %>% and filter() used below

kp_mero <- download_ebi(
  antibiotic = "meropenem",
  species = "Klebsiella pneumoniae",
  reformat = TRUE,
  interpret_eucast = TRUE,
  interpret_ecoff = TRUE
)

# Filter for isolates in EuSCAPE paper (PMID: 31358985)
kp_mero_euscape <- kp_mero %>% filter(grepl("31358985", source))

# Some of these assemblies are flagged by NCBI for contamination and should be excluded.
# For example, see SAMEA3729690 (https://www.ncbi.nlm.nih.gov/datasets/genome/?biosample=SAMEA3729690)

contaminated_assemblies <- c(
  "SAMEA3729690", "SAMEA3721062", "SAMEA3721052", "SAMEA3720966",
  "SAMEA3673128", "SAMEA3538742", "SAMEA3721188", "SAMEA3649589",
  "SAMEA3538652", "SAMEA3649503", "SAMEA3538911", "SAMEA3727711",
  "SAMEA3649452", "SAMEA3649453", "SAMEA3649454", "SAMEA3649467",
  "SAMEA3721063", "SAMEA3538862", "SAMEA3538667", "SAMEA3673004",
  "SAMEA3729818", "SAMEA3729660", "SAMEA3673078", "SAMEA3673097"
)

# Remove contaminated assemblies from phenotype list
kp_mero_euscape <- kp_mero_euscape %>%
  filter(!id %in% contaminated_assemblies)

usethis::use_data(kp_mero_euscape, internal = FALSE, overwrite = TRUE)


kp_mero_amrfp <- download_ebi(
  data = "genotype",
  species = "Klebsiella pneumoniae",
  reformat = TRUE
)

# Keep only genotype rows for EuSCAPE isolates that have meropenem phenotypes
kp_mero_amrfp <- kp_mero_amrfp %>% filter(id %in% kp_mero_euscape$id)

# Contaminated assemblies were already dropped from kp_mero_euscape above,
# so this second filter is a safeguard rather than a required step.
kp_mero_amrfp <- kp_mero_amrfp %>%
  filter(!id %in% contaminated_assemblies)

usethis::use_data(kp_mero_amrfp, internal = FALSE, overwrite = TRUE)
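
The script above applies three filters in sequence: keep phenotype rows whose `source` mentions the EuSCAPE PMID, drop flagged BioSamples, then restrict the genotype table to the remaining ids. A minimal base R sketch of the same pattern on toy data follows; the ids, the `PMID:` prefix in `source`, and the gene names are assumptions made for illustration, not values returned by `download_ebi()`.

```r
# Toy phenotype table (hypothetical ids and source strings).
pheno <- data.frame(
  id = c("S1", "S2", "S3", "S4"),
  source = c("PMID:31358985", "PMID:31358985", "PMID:00000000", "PMID:31358985"),
  stringsAsFactors = FALSE
)

# Hypothetical list of ids flagged for exclusion.
contaminated <- c("S2")

# Step 1: keep rows whose source string mentions the publication of interest.
in_paper <- pheno[grepl("31358985", pheno$source, fixed = TRUE), ]

# Step 2: drop the flagged ids.
clean <- in_paper[!in_paper$id %in% contaminated, ]

# Toy genotype table: one row per determinant (hypothetical gene names).
geno <- data.frame(
  id = c("S1", "S1", "S2", "S3", "S4"),
  gene = c("geneA", "geneB", "geneA", "geneC", "geneB"),
  stringsAsFactors = FALSE
)

# Step 3: keep genotype rows only for samples that survived steps 1 and 2.
geno_clean <- geno[geno$id %in% clean$id, ]
print(geno_clean)
```

Filtering the genotype table by the already cleaned phenotype ids means the exclusion list only has to be applied once for both tables to stay consistent.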
Binary file added data/kp_mero_amrfp.rda
Binary file not shown.
Binary file added data/kp_mero_euscape.rda
Binary file not shown.
4 changes: 2 additions & 2 deletions man/DASSIM_geno.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

5 changes: 3 additions & 2 deletions man/MICROBIGGE_Ecoli_CHLR.Rd

1 change: 1 addition & 0 deletions man/NCBI_Ecoli_AST_chl.Rd

29 changes: 29 additions & 0 deletions man/kp_mero_amrfp.Rd

30 changes: 30 additions & 0 deletions man/kp_mero_euscape.Rd
