diff --git a/R/data.R b/R/data.R
index 8d0414b..501c057 100644
--- a/R/data.R
+++ b/R/data.R
@@ -705,9 +705,9 @@
 "btESBL_AST"
 
 
-#' DASSIM genotype data (AMRfinderplus)
+#' DASSIM genotype data (AMRFinderPlus)
 #'
-#' The DASSIM dataset screened for Antimicrobial resistance genes (ARGs) using AMRfinderplus v4.0.23
+#' The DASSIM dataset screened for Antimicrobial resistance genes (ARGs) using AMRFinderPlus v4.0.23
 #'
 #' @format `DASSIM_geno` A data frame with 12414 rows and 31 columns:
 #' - `Name`:  Name.
@@ -758,6 +758,7 @@
 #' - `method`, `platform`, `guideline`: Test method and platform and interpretation guideline.
 #' - `pheno_provided`: S/I/R interpretation as provided in the raw input.
 #' - `spp_pheno`: Species identifier, interpreted from `Scientific name` using `as.mo`, used to interpret `ecoff` and `pheno` columns.
+#' - ...: Additional data columns from the NCBI AST Browser
 #' @source <https://www.ncbi.nlm.nih.gov/pathogens/ast#chloramphenicol%20AND%20Escherichia>
 "NCBI_Ecoli_AST_chl"
 
@@ -769,13 +770,15 @@
 #'
 #' @format `MICROBIGGE_Ecoli_CHLR` A data frame with 95,776 rows and 27 columns:
 #' - `id`:  BioSample.
-#' - `drug_agent`, `drug_class`: Antibiotic agent and class, determined by parsing AMRfinderplus `subclass` field in the downloaded file.
+#' - `drug_agent`, `drug_class`: Antibiotic agent and class, determined by parsing AMRFinderPlus `subclass` field in the downloaded file.
 #' - `gene`, `node`, `marker`: gene identifiers.
 #' - `mutation`: mutation within gene, parsed into HGVS nomenclature format from `amr_element_symbol` field in the downloaded file.
 #' - `% Coverage of reference`:  % Coverage of reference.
 #' - `% Identity to reference`:  % Identity to reference.
-#' - ...: Additional data columns from AMRfinderplus#' @source <https://www.ncbi.nlm.nih.gov/pathogens/microbigge/#chloramphenicol%20AND%20Escherichia>
+#' - ...: Additional data columns from AMRFinderPlus
+#' #' @source <https://www.ncbi.nlm.nih.gov/pathogens/microbigge/#chloramphenicol%20AND%20Escherichia>
 "MICROBIGGE_Ecoli_CHLR"
+
 #' Example Resistance Gene Identifier (RGI) v6.0.6 Genotype Data
 #'
 #' Raw RGI v6.0.6 results file (run with `--include_loose`) for 12 genomes of multiple species, one AMR determinant per row.
@@ -823,3 +826,38 @@
 #' @source ENA BioProject [PRJEB10018](https://www.ebi.ac.uk/ena/browser/view/PRJEB10018).
 #' See David *et al.* (2019) <https://doi.org/10.1038/s41564-019-0492-8>.
 "rgi_EuSCAPE_raw"
+
+
+#' Example AMRFinderPlus Genotype Data from EuSCAPE project
+#'
+#' AMRFinderPlus results file for Klebsiella pneumoniae from EuSCAPE project, one AMR determinant per row, downloaded from the EBI AMR portal using [download_ebi()] and imported using [import_geno()].
+#'
+#' @format `kp_mero_amrfp` A data frame with 32,385 rows and 34 columns:
+#' - `id`:  BioSample.
+#' - `drug_agent`, `drug_class`: Antibiotic agent and class, determined by parsing AMRFinderPlus `subclass` field in the downloaded file.
+#' - `gene`, `node`, `marker`: gene identifiers.
+#' - `mutation`: mutation within gene, parsed into HGVS nomenclature format from `amr_element_symbol` field in the downloaded file.
+#' - `% Coverage of reference`:  % Coverage of reference.
+#' - `% Identity to reference`:  % Identity to reference.
+#' - ...: Additional data columns from AMRFinderPlus
+#' @source [EBI AMR Portal](https://www.ebi.ac.uk/amr).
+#' See David *et al.* (2019) <https://doi.org/10.1038/s41564-019-0492-8>.
+"kp_mero_amrfp"
+
+
+#' Meropenem Phenotype Data from EuSCAPE project
+#'
+#' Meropenem phenotype data for Klebsiella pneumoniae from EuSCAPE project, one sample per row, downloaded from the EBI AMR portal using [download_ebi()] and imported using [import_pheno()].
+#'
+#' @format `kp_mero_euscape` A data frame with 1,490 rows and 43 columns:
+#' - `id`: Sample identifier, imported from the `BioSample` column in the raw input.
+#' - `drug_agent`: Antibiotic code, interpreted from `Antibiotic` using `as.ab`.
+#' - `mic`: Minimum inhibitory concentration, formatted using `as.mic`.
+#' - `disk`: Disk diffusion zone, formatted using `as.disk`.
+#' - `method`, `platform`, `guideline`: Test method and platform and interpretation guideline.
+#' - `pheno_provided`: S/I/R interpretation as provided in the raw input.
+#' - `spp_pheno`: Species identifier, interpreted from `Scientific name` using `as.mo`, used to interpret `ecoff` and `pheno` columns.
+#' - ...: Additional data columns from EBI AMR Portal
+#' @source [EBI AMR Portal](https://www.ebi.ac.uk/amr).
+#' See David *et al.* (2019) <https://doi.org/10.1038/s41564-019-0492-8>.
+"kp_mero_euscape"
diff --git a/R/import_pheno.R b/R/import_pheno.R
index dddcdb8..668d7d9 100644
--- a/R/import_pheno.R
+++ b/R/import_pheno.R
@@ -2228,17 +2228,18 @@ import_phoenix_ast <- function(input,
 #' )
 #' }
 import_sirscan_ast <- function(
-    mic_file = NULL,
-    disk_file = NULL,
-    interpr_file = NULL,
-    source = NULL,
-    species = NULL,
-    ab = NULL,
-    instrument_guideline = NULL,
-    sirscan_codes = sirscan_codes,
-    interpret_eucast = FALSE,
-    interpret_clsi = FALSE,
-    interpret_ecoff = FALSE) {
+  mic_file = NULL,
+  disk_file = NULL,
+  interpr_file = NULL,
+  source = NULL,
+  species = NULL,
+  ab = NULL,
+  instrument_guideline = NULL,
+  sirscan_codes = sirscan_codes,
+  interpret_eucast = FALSE,
+  interpret_clsi = FALSE,
+  interpret_ecoff = FALSE
+) {
   if (is.null(mic_file) && is.null(disk_file) && is.null(interpr_file)) {
     stop("At least one of 'mic_file', 'disk_file', or 'interpr_file' must be provided")
   }
diff --git a/data-raw/prep_euscape_example_data.R b/data-raw/prep_euscape_example_data.R
new file mode 100644
index 0000000..4d91de5
--- /dev/null
+++ b/data-raw/prep_euscape_example_data.R
@@ -0,0 +1,36 @@
+kp_mero <- download_ebi(
+  antibiotic = "meropenem",
+  species = "Klebsiella pneumoniae",
+  reformat = TRUE,
+  interpret_eucast = TRUE,
+  interpret_ecoff = TRUE
+)
+
+# Filter for isolates in EuSCAPE paper (PMID: 31358985)
+kp_mero_euscape <- kp_mero %>% filter(grepl("31358985", source))
+
+# There are assemblies from NCBI that are flagged for contamination and supposed to be excluded. For example, see SAMEA3729690 (https://www.ncbi.nlm.nih.gov/datasets/genome/?biosample=SAMEA3729690)
+
+contaminated_assemblies <- c("SAMEA3729690", "SAMEA3721062", "SAMEA3721052", "SAMEA3720966", "SAMEA3673128", "SAMEA3538742", "SAMEA3721188", "SAMEA3649589", "SAMEA3538652", "SAMEA3649503", "SAMEA3538911", "SAMEA3727711", "SAMEA3649452", "SAMEA3649453", "SAMEA3649454", "SAMEA3649467", "SAMEA3721063", "SAMEA3538862", "SAMEA3538667", "SAMEA3673004", "SAMEA3729818", "SAMEA3729660", "SAMEA3673078", "SAMEA3673097")
+
+# Remove contaminated assemblies from phenotype list
+kp_mero_euscape <- kp_mero_euscape %>%
+  filter(!id %in% contaminated_assemblies)
+
+usethis::use_data(kp_mero_euscape, internal = FALSE, overwrite = TRUE)
+
+
+kp_mero_amrfp <- download_ebi(
+  data = "genotype", species = "Klebsiella pneumoniae",
+  reformat = T
+)
+
+# Filter for isolates in EuSCAPE paper with meropenem phenotypes and remove contaminated samples
+kp_mero_amrfp <- kp_mero_amrfp %>% filter(id %in% kp_mero_euscape$id)
+
+# There are assemblies from NCBI that are supposed to be excluded and flagged for contamination. For example, see SAMEA3729690 (https://www.ncbi.nlm.nih.gov/datasets/genome/?biosample=SAMEA3729690)
+
+kp_mero_amrfp <- kp_mero_amrfp %>%
+  filter(!id %in% contaminated_assemblies)
+
+usethis::use_data(kp_mero_amrfp, internal = FALSE, overwrite = TRUE)
\ No newline at end of file
diff --git a/data/kp_mero_amrfp.rda b/data/kp_mero_amrfp.rda
new file mode 100644
index 0000000..b1ddeca
Binary files /dev/null and b/data/kp_mero_amrfp.rda differ
diff --git a/data/kp_mero_euscape.rda b/data/kp_mero_euscape.rda
new file mode 100644
index 0000000..83cad34
Binary files /dev/null and b/data/kp_mero_euscape.rda differ
diff --git a/man/DASSIM_geno.Rd b/man/DASSIM_geno.Rd
index acdcba2..3f7b6ec 100644
--- a/man/DASSIM_geno.Rd
+++ b/man/DASSIM_geno.Rd
@@ -3,7 +3,7 @@
 \docType{data}
 \name{DASSIM_geno}
 \alias{DASSIM_geno}
-\title{DASSIM genotype data (AMRfinderplus)}
+\title{DASSIM genotype data (AMRFinderPlus)}
 \format{
 \code{DASSIM_geno} A data frame with 12414 rows and 31 columns:
 \itemize{
@@ -49,6 +49,6 @@
 DASSIM_geno
 }
 \description{
-The DASSIM dataset screened for Antimicrobial resistance genes (ARGs) using AMRfinderplus v4.0.23
+The DASSIM dataset screened for Antimicrobial resistance genes (ARGs) using AMRFinderPlus v4.0.23
 }
 \keyword{datasets}
diff --git a/man/MICROBIGGE_Ecoli_CHLR.Rd b/man/MICROBIGGE_Ecoli_CHLR.Rd
index b19dcdd..dfc5a63 100644
--- a/man/MICROBIGGE_Ecoli_CHLR.Rd
+++ b/man/MICROBIGGE_Ecoli_CHLR.Rd
@@ -8,12 +8,13 @@
 \code{MICROBIGGE_Ecoli_CHLR} A data frame with 95,776 rows and 27 columns:
 \itemize{
 \item \code{id}:  BioSample.
-\item \code{drug_agent}, \code{drug_class}: Antibiotic agent and class, determined by parsing AMRfinderplus \code{subclass} field in the downloaded file.
+\item \code{drug_agent}, \code{drug_class}: Antibiotic agent and class, determined by parsing AMRFinderPlus \code{subclass} field in the downloaded file.
 \item \code{gene}, \code{node}, \code{marker}: gene identifiers.
 \item \code{mutation}: mutation within gene, parsed into HGVS nomenclature format from \code{amr_element_symbol} field in the downloaded file.
 \item \verb{\% Coverage of reference}:  \% Coverage of reference.
 \item \verb{\% Identity to reference}:  \% Identity to reference.
-\item ...: Additional data columns from AMRfinderplus#' @source \url{https://www.ncbi.nlm.nih.gov/pathogens/microbigge/#chloramphenicol\%20AND\%20Escherichia}
+\item ...: Additional data columns from AMRFinderPlus
+#' @source \url{https://www.ncbi.nlm.nih.gov/pathogens/microbigge/#chloramphenicol\%20AND\%20Escherichia}
 }
 }
 \usage{
diff --git a/man/NCBI_Ecoli_AST_chl.Rd b/man/NCBI_Ecoli_AST_chl.Rd
index 3d63dad..9e37ef9 100644
--- a/man/NCBI_Ecoli_AST_chl.Rd
+++ b/man/NCBI_Ecoli_AST_chl.Rd
@@ -14,6 +14,7 @@
 \item \code{method}, \code{platform}, \code{guideline}: Test method and platform and interpretation guideline.
 \item \code{pheno_provided}: S/I/R interpretation as provided in the raw input.
 \item \code{spp_pheno}: Species identifier, interpreted from \verb{Scientific name} using \code{as.mo}, used to interpret \code{ecoff} and \code{pheno} columns.
+\item ...: Additional data columns from the NCBI AST Browser
 }
 }
 \source{
diff --git a/man/kp_mero_amrfp.Rd b/man/kp_mero_amrfp.Rd
new file mode 100644
index 0000000..2d27341
--- /dev/null
+++ b/man/kp_mero_amrfp.Rd
@@ -0,0 +1,29 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/data.R
+\docType{data}
+\name{kp_mero_amrfp}
+\alias{kp_mero_amrfp}
+\title{Example AMRFinderPlus Genotype Data from EuSCAPE project}
+\format{
+\code{kp_mero_amrfp} A data frame with 32,385 rows and 34 columns:
+\itemize{
+\item \code{id}:  BioSample.
+\item \code{drug_agent}, \code{drug_class}: Antibiotic agent and class, determined by parsing AMRFinderPlus \code{subclass} field in the downloaded file.
+\item \code{gene}, \code{node}, \code{marker}: gene identifiers.
+\item \code{mutation}: mutation within gene, parsed into HGVS nomenclature format from \code{amr_element_symbol} field in the downloaded file.
+\item \verb{\% Coverage of reference}:  \% Coverage of reference.
+\item \verb{\% Identity to reference}:  \% Identity to reference.
+\item ...: Additional data columns from AMRFinderPlus
+}
+}
+\source{
+\href{https://www.ebi.ac.uk/amr}{EBI AMR Portal}.
+See David \emph{et al.} (2019) \url{https://doi.org/10.1038/s41564-019-0492-8}.
+}
+\usage{
+kp_mero_amrfp
+}
+\description{
+AMRFinderPlus results file for Klebsiella pneumoniae from EuSCAPE project, one AMR determinant per row, downloaded from the EBI AMR portal using \code{\link[=download_ebi]{download_ebi()}} and imported using \code{\link[=import_geno]{import_geno()}}.
+}
+\keyword{datasets}
diff --git a/man/kp_mero_euscape.Rd b/man/kp_mero_euscape.Rd
new file mode 100644
index 0000000..9b39381
--- /dev/null
+++ b/man/kp_mero_euscape.Rd
@@ -0,0 +1,30 @@
+% Generated by roxygen2: do not edit by hand
+% Please edit documentation in R/data.R
+\docType{data}
+\name{kp_mero_euscape}
+\alias{kp_mero_euscape}
+\title{Meropenem Phenotype Data from EuSCAPE project}
+\format{
+\code{kp_mero_euscape} A data frame with 1,490 rows and 43 columns:
+\itemize{
+\item \code{id}: Sample identifier, imported from the \code{BioSample} column in the raw input.
+\item \code{drug_agent}: Antibiotic code, interpreted from \code{Antibiotic} using \code{as.ab}.
+\item \code{mic}: Minimum inhibitory concentration, formatted using \code{as.mic}.
+\item \code{disk}: Disk diffusion zone, formatted using \code{as.disk}.
+\item \code{method}, \code{platform}, \code{guideline}: Test method and platform and interpretation guideline.
+\item \code{pheno_provided}: S/I/R interpretation as provided in the raw input.
+\item \code{spp_pheno}: Species identifier, interpreted from \verb{Scientific name} using \code{as.mo}, used to interpret \code{ecoff} and \code{pheno} columns.
+\item ...: Additional data columns from EBI AMR Portal
+}
+}
+\source{
+\href{https://www.ebi.ac.uk/amr}{EBI AMR Portal}.
+See David \emph{et al.} (2019) \url{https://doi.org/10.1038/s41564-019-0492-8}.
+}
+\usage{
+kp_mero_euscape
+}
+\description{
+Meropenem phenotype data for Klebsiella pneumoniae from EuSCAPE project, one sample per row, downloaded from the EBI AMR portal using \code{\link[=download_ebi]{download_ebi()}} and imported using \code{\link[=import_pheno]{import_pheno()}}.
+}
+\keyword{datasets}
diff --git a/vignettes/KpneumoMeropenem_ComparingAMRgenotypers.Rmd b/vignettes/KpneumoMeropenem_ComparingAMRgenotypers.Rmd
new file mode 100644
index 0000000..857e98a
--- /dev/null
+++ b/vignettes/KpneumoMeropenem_ComparingAMRgenotypers.Rmd
@@ -0,0 +1,1130 @@
+---
+title: "Example comparing AMR genotypers"
+author: "Kara Tsang"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Example comparing AMR genotypers}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>",
+  fig.width = 10,
+  out.width = "100%"
+)
+
+# If the package is not installed, load the development version
+if (!requireNamespace("AMRgen", quietly = TRUE)) {
+  devtools::load_all()
+}
+```
+
+# Introduction
+
+This vignette demonstrates an example of how to investigate associations between *Klebsiella pneumoniae* meropenem antibiotic susceptibility testing (AST) data and AMR genotype data (Kleborate, AMRFinderPlus and RGI). We will be using the AST data and AMR genotyping outputs for n=1490 isolates from the European Survey of Carbapenemase-Producing Enterobacteriaceae (EuSCAPE) published as ['Epidemic of carbapenem-resistant *Klebsiella pneumoniae* in Europe is driven by nosocomial spread' by David, S., et al. Nature Microbiology, 2016](doi.org/10.1038/s41564-019-0492-8). 
+
+Whole genome sequence reads were downloaded from [NCBI Bioproject PRJEB10018](https://www.ebi.ac.uk/ena/browser/view/PRJEB10018) / [European Nucleotide Archive ERP011196](https://www.ebi.ac.uk/ena/browser/view/ERP011196), trimmed using Trim Galore v0.5.0, and assembled using Unicycler v0.5.0. The assembled genomes were then run through each AMR genotyper:
+
+- [Kleborate v3.1.3](https://github.com/klebgenomics/Kleborate/releases/tag/v3.1.3)
+- [Kleborate version development branch - commit #4ec1dcb](https://github.com/klebgenomics/Kleborate/tree/development) on March 17, 2026
+- [Resistance Gene Idenfier (RGI v6.0.6)](https://github.com/arpcard/rgi/releases/tag/6.0.6) - using the Comprehensive Antibiotic Resistance Database (CARD, v4.0.1)
+
+AMRFinderPlus results were generated by [EMBL-EBI Antimicrobial Resistance Portal](https://www.ebi.ac.uk/amr/methods/) using 
+- [AMRFinderPlus v4.0](https://github.com/ncbi/amr/) with [NCBI Reference Gene Catalog](https://www.ncbi.nlm.nih.gov/pathogens/refgene/) database version 2025-07-16.1.
+
+
+## Load the required packages
+
+```{r setup, warning=FALSE, message=FALSE}
+library(AMRgen)
+library(dplyr)
+library(ggplot2)
+library(tidyr)
+library(stringr)
+```
+
+# Phenotype Data 
+
+The `download_ebi()` function lets you load phenotype data and interpret susceptible, intermediate, and resistant phenotypes using EUCAST breakpoints and ECOFF. A copy of the data objected produced below is available in the AMRgen package as `kp_mero_euscape`.
+
+``` {r download_ebi_phenotype, warning=FALSE, message=FALSE}
+# Download Klebsiella pneumoniae AST data from EBI, filtering for meropenem and re-interpret with EUCAST breakpoints and ECOFF
+
+kp_mero <- download_ebi(
+  antibiotic = "meropenem",
+  species = "Klebsiella pneumoniae",
+  reformat = TRUE,
+  interpret_eucast = TRUE,
+  interpret_ecoff = TRUE
+)
+
+# Filter for isolates in EuSCAPE paper (PMID: 31358985)
+kp_mero_euscape <- kp_mero %>% filter(grepl("31358985", source))
+
+# There are assemblies from NCBI that are flagged for contamination and supposed to be excluded. For example, see SAMEA3729690 (https://www.ncbi.nlm.nih.gov/datasets/genome/?biosample=SAMEA3729690)
+
+contaminated_assemblies <- c("SAMEA3729690", "SAMEA3721062", "SAMEA3721052", "SAMEA3720966", "SAMEA3673128", "SAMEA3538742", "SAMEA3721188", "SAMEA3649589", "SAMEA3538652", "SAMEA3649503", "SAMEA3538911", "SAMEA3727711", "SAMEA3649452", "SAMEA3649453", "SAMEA3649454", "SAMEA3649467", "SAMEA3721063", "SAMEA3538862", "SAMEA3538667", "SAMEA3673004", "SAMEA3729818", "SAMEA3729660", "SAMEA3673078", "SAMEA3673097")
+
+# Remove contaminated assemblies from phenotype list
+kp_mero_euscape <- kp_mero_euscape %>%
+  filter(!id %in% contaminated_assemblies)
+```
+
+Check the data frame
+``` {r}
+head(kp_mero_euscape)
+```
+
+## Phenotype Data Summary 
+
+Summarize the downloaded phenotype data and plot the minimum inhibitory concentration (MIC) distributions with EUCAST breakpoints and ECOFF. 
+
+```{r phenotype_summary, fig.height=5}
+# Summary of meropenem phenotype data including S/I/R count using EUCAST breakpoint and ECOFF
+summarise_pheno(kp_mero_euscape, pheno_cols = c("pheno_eucast", "ecoff"))
+
+# MIC distribution coloured by phenotype interpretation using EUCAST breakpoint
+assay_by_var(
+  pheno_table = kp_mero_euscape,
+  antibiotic = "Meropenem",
+  measure = "mic",
+  colour_by = "pheno_eucast",
+  species = "Klebsiella pneumoniae"
+)
+
+# Summary of meropenem phenotypes using ECOFF
+kp_mero_euscape %>% count(ecoff)
+
+# MIC distribution coloured by ECOFF
+assay_by_var(
+  pheno_table = kp_mero_euscape,
+  antibiotic = "Meropenem",
+  measure = "mic",
+  colour_by = "ecoff",
+  species = "Klebsiella pneumoniae"
+)
+```
+# Kleborate 
+
+[Kleborate](https://github.com/klebgenomics/Kleborate) screens *Klebsiella pneumoniae* species complex (KpSC) genome assemblies to identify sequence types (MLST), species, antimicrobial resistance (AMR) genes, virulence loci (e.g., yersiniabactin, aerobactin), and capsule/LPS serotypes (K and O antigens),  published [here]( https://doi.org/10.1038/s41467-021-24448-3). It can be run on the [command line](https://github.com/klebgenomics/Kleborate) or via [Pathogenwatch](https://pathogen.watch/).
+
+## Import Kleborate Genotype Data
+
+The `import_kleborate()` function imports the output table from  [Kleborate](https://github.com/klebgenomics/Kleborate), extracts the AMR genotyping data, and formats it to be used with AMRgen functions. 
+
+Mutation notation in Kleborate changed after v3.1.3 to adhere to [HGVS Nomenclature](https://hgvs-nomenclature.org/stable/), so:
+
+- To import Kleborate output <=v3.1.3 (using informal nomenclature (e.g. [gene]-[mutation], [gene]-X%, OmpK36GD)), in the `import_kleborate()` function, set `hgvs = FALSE`. 
+
+- To import Kleborate output >v3.1.3 (using HGVS Nomenclature), in the `import_kleborate()` function, set to `hgvs=TRUE` (which is already the default option).
+
+We are importing the latest version of Kleborate which is in the [development branch - commit #4ec1dcb](https://github.com/klebgenomics/Kleborate/tree/development) from March 17, 2026. This version uses HGVS Nomenclature for describing mutations and includes an updated AMR database compared to the most recent [Kleborate release v3.2.4](https://github.com/klebgenomics/Kleborate/releases/tag/v3.2.4).
+
+A table of Kleborate results generated for the EuSCAPE genomes is available in the AMRgen package as `kleborate_raw`. Let's import this to AMRgen genotype table format and summarise the content:
+
+```{r import_kleborate_dev}
+# Updated Kleborate results from the development branch as of March 17, 2026 (commit #4ec1dcb)
+head(kleborate_raw, n = 10)
+
+# Import Kleborate
+kleborate_dev <- import_kleborate(kleborate_raw)
+
+# View summary of genotypes
+summarise_geno(kleborate_dev)
+```
+
+## Kleborate Genotype and Phenotype Summary
+
+Summarize how many markers are associated with the beta-lactam and carbapenem drug class since Kleborate only operates at a drug class level. 
+
+```{r kleborate_geno_pheno}
+summarise_geno_pheno(kleborate_dev, kp_mero_euscape,
+  pheno_cols = c("pheno_eucast", "ecoff")
+)
+```
+## Generate Binary Matrix for Kleborate AMR Markers
+
+Most AMRgen analysis functions require a binary matrix with one sample per row, and columns indicating the phenotype and genotype data in columns, where a `1` indicates the presence and `0` indicates the absence of the phenotype or genotypic marker in that sample. This is produced using the `get_binary_matrix` function:
+
+```{r kleborate_binary_matrix}
+kleborate_binary_matrix <- get_binary_matrix(
+  geno_table = kleborate_dev,
+  pheno_table = kp_mero_euscape,
+  antibiotic = "Meropenem",
+  drug_class_list = c("Carbapenems"),
+  sir_col = "pheno_eucast",
+  keep_assay_values = TRUE,
+  keep_assay_values_from = "mic",
+  marker_col = "marker.label"
+)
+
+head(kleborate_binary_matrix, n = 10)
+```
+## Solo PPV Analysis for Kleborate AMR Markers
+
+To understand the individual contribution of an AMR marker found "solo" (i.e., in the absence of another carbapenem resistance determinant), we use the `solo_ppv_analysis()` function. 
+
+The `combined_plot` is a visual representation of each AMR marker found solo, the phenotypic distribution of isolates, and the positive predictive values (PPVs). The `solo_stats` table provides the PPVs, standard error (`se`), lower confidence interval (`ci.lower`), and upper confidence interval (`ci.upper`). 
+
+```{r kleborate_soloPPV, fig.height=6}
+soloPPV_kleborate_mero <- solo_ppv_analysis(binary_matrix = kleborate_binary_matrix)
+
+soloPPV_kleborate_mero$solo_stats
+```
+## Combinatorial PPV Analysis for Kleborate AMR Markers
+
+To understand the contribution of AMR markers found in combination with one another, we use the `ppv()` function. 
+The `plot` is a visual summary of each AMR marker combination observed in an UpSet plot format, including phenotypic distribution and PPVs for each combination. The `summary` table includes each AMR marker combination observed, including number of resistant isolates, positive predictive values, and median assay values (and interquartile range) where relevant.
+
+
+```{r kleborate_comboPPV, fig.height=10}
+comboPPV_kleborate_mero <- ppv(
+  binary_matrix = kleborate_binary_matrix,
+  order = "value",
+  min_set_size = 2,
+  antibiotic = "Meropenem",
+  upset_grid = TRUE,
+  plot_assay = TRUE,
+  assay = "mic"
+)
+
+comboPPV_kleborate_mero$summary
+```
+
+## UpSet Plot for Kleborate AMR Markers
+
+Similar to the previous `ppv()` function, the `amr_upset()` function will generate a `summary` table and `plot` that shows the combinations of AMR markers found in the isolates and their phenotypic distribution. The `plot` from `amr_upset()` is similar to the `plot` generated by the previous `ppv()` function, but missing the PPV panel. 
+
+```{r kleborate_upset_plot, fig.height=10}
+# Changing the OmpK35:- and OmpK36:- labels to match the figures used in AMRgen paper (for figure aesthetics)
+kleborate_binary_matrix_plotting <- kleborate_binary_matrix %>%
+  rename(`OmpK36-Δ` = `OmpK36..-`) %>%
+  rename(`OmpK35-Δ` = `OmpK35..-`)
+
+kp_mic_upset_kleborate <- amr_upset(kleborate_binary_matrix_plotting, assay = "mic", bp_R = "8", bp_S = "2", ecoff_bp = "0.125", min_set_size = 1)
+kp_mic_upset_kleborate$summary
+```
+## Group Kleborate AMR Markers
+
+Since there are a number of carbapenem resistance gene alleles, we want to group them by their respective gene families (e.g., KPC, VIM, OXA, etc.) to simplify. In addition, we want to generalize porin defects, so any OmpK35 or OmpK36 mutation detected with a `Ter` or `del` in the HGVS nomenclature, we will group together as `OmpK35-\u0394` or `OmpK35-\u0394` representing a loss/truncation of the protein. 
+
+```{r kleborate_grouping}
+# Selecting the relevant columns for relabeling and plotting later
+kp_mero_euscape_mic <- kp_mero_euscape %>% select(id, measurement)
+kp_Carb_kleborate_euscape <- kleborate_raw %>% select(strain, Omp_mutations, Bla_Carb_acquired)
+kp_kleborate_mic_euscape <- left_join(kp_mero_euscape_mic, kp_Carb_kleborate_euscape, by = c("id" = "strain"))
+
+# Grouping by carbapenem resistance genes into their gene family
+kp_kleborate_mic_euscape <- kp_kleborate_mic_euscape %>%
+  mutate(Carb = case_when(
+    grepl(";", Bla_Carb_acquired) ~ "multiple",
+    grepl("KPC", Bla_Carb_acquired) ~ "KPC",
+    grepl("NDM", Bla_Carb_acquired) ~ "NDM",
+    grepl("OXA", Bla_Carb_acquired) ~ "OXA",
+    grepl("VIM", Bla_Carb_acquired) ~ "VIM",
+    grepl("IMP", Bla_Carb_acquired) ~ "IMP",
+    grepl("CTX-M", Bla_Carb_acquired) ~ "CTX-M",
+    TRUE ~ "none"
+  )) %>%
+  # Assigning wildtype (wt) or mutation (mut) porin status (i.e., any change from wt is considered mut)
+  mutate(Porin = if_else(Omp_mutations == "-", "wt", "mut"))
+
+# Grouping porin mutations
+kp_kleborate_mic_euscape <- kp_kleborate_mic_euscape %>%
+  mutate(ompK35_label = case_when(
+    grepl("OmpK35[^;]*(Ter|del)", Omp_mutations) ~ "OmpK35-\u0394",
+    TRUE ~ "OmpK35-wt"
+  )) %>%
+  mutate(ompK36_label = case_when(
+    grepl("OmpK36[^;]*(Ter|del)", Omp_mutations) ~ "OmpK36-\u0394",
+    grepl("OmpK36:p.134_135insGD", Omp_mutations) ~ "OmpK36:p.134_135insGD",
+    grepl("OmpK36:p.136_137insTD", Omp_mutations) ~ "OmpK36:p.136_137insTD",
+    grepl("OmpK36:c.25C>T", Omp_mutations) ~ "OmpK36:c.25C>T",
+    grepl("OmpK36:p.135_136insD", Omp_mutations) ~ "OmpK36:p.135_136insD",
+    TRUE ~ "OmpK36-wt"
+  )) %>%
+  mutate(porin_label = paste0(ompK35_label, "; ", ompK36_label)) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-wt; OmpK36-wt", "wt OmpK35 OmpK36")) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-wt; OmpK36-\u0394", "OmpK36-\u0394")) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-wt; OmpK36:c.25C>T", "OmpK36:c.25C>T")) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-wt; OmpK36:p.134_135insGD", "OmpK36:p.134_135insGD")) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-wt; OmpK36:p.135_136insD ", "OmpK36:p.135_136insD")) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-\u0394; OmpK36-wt", "OmpK35-\u0394")) %>%
+  mutate(porin_label = str_replace_all(porin_label, "OmpK35-wt; OmpK36:p.135_136insD", "OmpK36:p.135_136insD"))
+
+# Setting the orders for plotting
+kp_kleborate_mic_euscape$Porin <- factor(kp_kleborate_mic_euscape$Porin, levels = c("wt", "mut"))
+kp_kleborate_mic_euscape$Carb <- factor(kp_kleborate_mic_euscape$Carb, levels = c("none", "KPC", "NDM", "OXA", "VIM", "IMP", "CTX-M", "multiple"))
+kp_kleborate_mic_euscape$porin_label <- factor(kp_kleborate_mic_euscape$porin_label,
+  levels = c(
+    "wt OmpK35 OmpK36",
+    "OmpK35-\u0394",
+    "OmpK35-\u0394; OmpK36:p.134_135insGD",
+    "OmpK35-\u0394; OmpK36-\u0394",
+    "OmpK35-\u0394; OmpK36:p.136_137insTD",
+    "OmpK35-\u0394; OmpK36:c.25C>T",
+    "OmpK35-\u0394; OmpK36:p.135_136insD",
+    "OmpK36-\u0394",
+    "OmpK36:p.134_135insGD",
+    "OmpK36:p.135_136insD",
+    "OmpK36:c.25C>T"
+  )
+)
+```
+
+## Plot the Grouped Kleborate AMR Markers
+
+We will now generate a combined box plot + scatter plot that shows each isolate with its meropenem MIC value, generalized grouping of carbapenem resistance gene families and porin status, with detailed porin information. 
+
+```{r kleborate_grouping_plot, fig.height=6, fig.width=8}
+# Counting the porin types to be used in the plot
+counts <- kp_kleborate_mic_euscape %>%
+  count(porin_label)
+
+labels_with_n <- setNames(
+  paste0(counts$porin_label, " (n=", counts$n, ")"),
+  counts$porin_label
+)
+
+# Generate the plot
+kp_kleborate_grouped_plot <- ggplot(
+  kp_kleborate_mic_euscape,
+  aes("", as.numeric(measurement))
+) +
+  geom_boxplot(width = 0.7, outlier.shape = NA) +
+  geom_jitter(aes(color = porin_label), size = 1.5, alpha = 0.7) +
+  scale_color_manual(
+    name = "Porin Status",
+    values = c(
+      "wt OmpK35 OmpK36" = "grey",
+      "OmpK35-\u0394" = "#DE4BAD",
+      "OmpK35-\u0394; OmpK36:p.134_135insGD" = "#3B81E3",
+      "OmpK35-\u0394; OmpK36-\u0394" = "#7FB550",
+      "OmpK35-\u0394; OmpK36:p.136_137insTD" = "#ff8a5e",
+      "OmpK35-\u0394; OmpK36:c.25C>T" = "#733808",
+      "OmpK35-\u0394; OmpK36:p.135_136insD" = "pink",
+      "OmpK36-\u0394" = "#fff2a6",
+      "OmpK36:p.134_135insGD" = "#935BBD",
+      "OmpK36:p.135_136insD" = "red",
+      "OmpK36:c.25C>T" = "black"
+    ),
+    labels = labels_with_n
+  ) +
+
+  # log2 scale for MIC distribution
+  scale_y_continuous(
+    trans = "log2", breaks = c(0.25, 0.5, 1, 2, 4, 8, 16, 32),
+    name = "Meropenem MIC (mg/L)"
+  ) +
+
+  # horizontal reference lines for EUCAST and ECOFF
+  geom_hline(yintercept = 2, linetype = "solid", color = "black", linewidth = 0.4) +
+  geom_hline(yintercept = 8, linetype = "solid", color = "black", linewidth = 0.4) +
+  geom_hline(yintercept = 0.125, linetype = "dashed", color = "black", linewidth = 0.4) +
+  facet_grid(
+    . ~ Carb + Porin,
+    switch = "x"
+  ) +
+  theme_bw() +
+  theme(
+    strip.placement = "outside",
+    strip.background = element_blank(),
+    panel.border = element_blank(),
+    panel.grid = element_blank(),
+    panel.spacing = unit(0, "lines"),
+    axis.line = element_line(color = "black", linewidth = 0.4),
+    plot.margin = margin(5.5, 5.5, 5.5, 60)
+  ) +
+  labs(x = "")
+
+kp_kleborate_grouped_plot
+
+```
+
+``` {r, eval=F}
+# if you have the 'cowplot' library you can use it to add labels "Carb" and "Porin" to the rows at the bottom of the plot
+library(cowplot)
+ggdraw(kp_kleborate_grouped_plot) +
+  draw_label("Carb", x = 0.08, y = 0.11, hjust = 0, fontface = "bold", size = 10) +
+  draw_label("Porin", x = 0.08, y = 0.065, hjust = 0, fontface = "bold", size = 10)
+```
+
+From the above `kp_kleborate_grouped_plot` plot (with ECOFF = 0.125mg/L, EUCAST Susceptible breakpoint at <=2mg/L, EUCAST Resistant breakpoint >8mg/L), we can see that in isolates:
+
+- with no carbapenemase and wildtype (wt) porin, the median meropenem MIC is <0.25mg/L.
+- with any porin defect (no carbapenemase), the median meropenem MIC is elevated to 1mg/L. 
+
+Across isolates with only a single carbapenemase (and wildtype porin), median MICs generally remained below the EUCAST resistant breakpoint (>8 mg/L), except for those carrying NDM, IMP, or multiple carbapenemases, which exceeded the resistant breakpoint.
+
+Notably, isolates with KPC or OXA carbapenemases had median MICs ≤2 mg/L and the presence of a porin defect increased the median MICs to above the EUCAST resistant breakpoint.
+
+## Mean (Median) Meropenem MIC Table 
+
+While the `kp_kleborate_grouped_plot` plot is useful for visually inspecting the median MIC values, to numerically show the mean (median) meropenem MIC (mg/L) for isolates with different carbapenem resistance gene families and porin statuses, we will generate a table.
+
+```{r kleborate_mic_table}
+# Simplifying porin status
+kp_kleborate_mic_euscape <- kp_kleborate_mic_euscape %>%
+  mutate(ompK35 = case_when(
+    grepl("OmpK35[^;]*(Ter|del)", Omp_mutations) ~ "\u0394",
+    TRUE ~ "wt"
+  )) %>%
+  mutate(ompK36 = case_when(
+    grepl("OmpK36[^;]*(Ter|del)", Omp_mutations) ~ "\u0394",
+    grepl("OmpK36:p.134_135insGD", Omp_mutations) ~ "Loop mutation",
+    grepl("OmpK36:p.136_137insTD", Omp_mutations) ~ "Loop mutation",
+    grepl("OmpK36:p.135_136insD", Omp_mutations) ~ "Loop mutation",
+    grepl("OmpK36:c.25C>T", Omp_mutations) ~ "c25t mutation",
+    TRUE ~ "wt"
+  ))
+
+# Convert the table into wide format with porin status as rows and carbapenemase status as columns
+mean_MIC_table <- kp_kleborate_mic_euscape %>%
+  group_by(ompK35, ompK36, Carb) %>%
+  summarise(
+    mean_MIC = round(mean(as.numeric(measurement), na.rm = TRUE), 1),
+    median_MIC = round(median(as.numeric(measurement), na.rm = TRUE), 1),
+    .groups = "drop"
+  ) %>%
+  mutate(mean_median = paste0(mean_MIC, " (", median_MIC, ")")) %>%
+  select(-mean_MIC, -median_MIC) %>%
+  pivot_wider(
+    names_from = Carb,
+    values_from = mean_median
+  ) %>%
+  select(ompK35, ompK36, none, KPC, NDM, OXA, VIM, IMP, `CTX-M`, multiple) %>%
+  mutate(order_group = case_when(
+    ompK35 == "wt" & ompK36 == "wt" ~ 1,
+    ompK35 == "\u0394" & ompK36 == "wt" ~ 2,
+    ompK35 == "\u0394" & ompK36 == "Loop mutation" ~ 3,
+    ompK35 == "\u0394" & ompK36 == "c25t mutation" ~ 4,
+    ompK35 == "\u0394" & ompK36 == "\u0394" ~ 5,
+    ompK35 == "wt" & ompK36 == "Loop mutation" ~ 6,
+    ompK35 == "wt" & ompK36 == "c25t mutation" ~ 7,
+    ompK35 == "wt" & ompK36 == "\u0394" ~ 8,
+    TRUE ~ 9
+  )) %>%
+  arrange(order_group) %>%
+  select(-order_group)
+
+mean_MIC_table
+```
+
+``` {r eval=F}
+# If you have the gt pacakge, you can use it to make the table aesthetically pleasing
+library(gt)
+mean_MIC_table_aes <- mean_MIC_table %>%
+  gt() %>%
+  cols_label(
+    ompK35 = html("<b>OmpK35</b>"),
+    ompK36 = html("<b>OmpK36</b>"),
+    none = html("<b>None</b>"),
+    KPC = html("<b>KPC</b>"),
+    NDM = html("<b>NDM</b>"),
+    OXA = html("<b>OXA</b>"),
+    VIM = html("<b>VIM</b>"),
+    IMP = html("<b>IMP</b>"),
+    `CTX-M` = html("<b>CTX-M</b>"),
+    multiple = html("<b>Multiple</b>")
+  ) %>%
+  tab_spanner(
+    label = html("<b>Porin status</b>"),
+    columns = c("ompK35", "ompK36")
+  ) %>%
+  tab_spanner(
+    label = html("<b>Carbapenemase status</b>"),
+    columns = c("none", "KPC", "NDM", "OXA", "VIM", "IMP", "CTX-M", "multiple")
+  ) %>%
+  fmt_missing(
+    columns = everything(),
+    missing_text = "-"
+  ) %>%
+  # Background color based on EUCAST breakpoints
+  data_color(
+    columns = c("none", "KPC", "NDM", "OXA", "VIM", "IMP", "CTX-M", "multiple"),
+    fn = function(x) {
+      numeric_vals <- as.numeric(str_extract(x, "^[0-9.]+"))
+
+      case_when(
+        is.na(numeric_vals) ~ "transparent",
+        numeric_vals <= 2 ~ "#3CAEA3",
+        numeric_vals <= 8 ~ "#F6D55C",
+        numeric_vals > 8 ~ "#ED553B"
+      )
+    }
+  ) %>%
+  # Changing orange cells to have white font so it is easier to see the numbers
+  data_color(
+    columns = c("none", "KPC", "NDM", "OXA", "VIM", "IMP", "CTX-M", "multiple"),
+    fn = function(x) {
+      numeric_vals <- as.numeric(str_extract(x, "^[0-9.]+"))
+
+      case_when(
+        is.na(numeric_vals) ~ "black",
+        numeric_vals > 8 ~ "white",
+        TRUE ~ "black"
+      )
+    },
+    apply_to = "text"
+  ) %>%
+  # aligning columns
+  cols_align(
+    align = "center",
+    columns = everything()
+  ) %>%
+  # text options
+  tab_options(
+    table.font.names = "Arial",
+    table.font.size = 14,
+    heading.title.font.size = 16,
+    table.border.top.color = "black",
+    table.border.bottom.color = "black"
+  ) %>%
+  # column widths
+  cols_width(
+    ompK35 ~ px(125),
+    ompK36 ~ px(125),
+    everything() ~ px(100)
+  )
+
+# Adding title and legend for colour
+mean_MIC_table_aes <- mean_MIC_table_aes %>%
+  tab_header(
+    title = html("<b>Mean (Median) Meropenem Minimum Inhibitory Concentration (mg/L)</b>"),
+    subtitle = html(
+      "<span style='font-size:12px;'>
+      <b>EUCAST clinical breakpoint:</b>
+      <span style='color:#3CAEA3;'>■</span> Susceptible (≤ 2 mg/L) &nbsp;
+      <span style='color:#F6D55C;'>■</span> Intermediate (susceptible, increased exposure; 2–8 mg/L) &nbsp;
+      <span style='color:#ED553B;'>■</span> Resistant (> 8 mg/L)
+      </span>"
+    )
+  )
+
+mean_MIC_table_aes
+```
+
+In the `mean_MIC_table` table above, you can see that isolates have:
+- a mean meropenem MIC of 0.3mg/L, when they have wildtype OmpK35/36 porin status with no carbapenemase  
+- a mean meropenem MIC that exceeds the EUCAST R breakpoint (>8mg/L) when a KPC, NDM, or IMP or multiple carbapenemases are harboured
+
+Specifically, for OXA-encoding isolates, the only porin defect that does not change the median meropenem MIC is OmpK35-Δ, whereas all other porin defects shift the median MIC to 16-32mg/L. 
+
+Isolates with VIM carbapenemases alone had a median MIC of 2 mg/L, which increased to 8 mg/L when combined with porin defects (particularly OmpK36-Δ or the loss of both porins).
+
+The only porin defects that raise the mean meropenem MIC above EUCAST S breakpoint (<=2mg/L) are:
+
+- OmpK35-Δ and OmpK36 β-strand Loop 3 insertion, 
+- OmpK35-Δ and OmpK36-Δ, 
+- OmpK36 β-strand Loop 3 insertion,
+- OmpK36-Δ
+
+In summary, the presence of any carbapenemase with any porin defect elevates the mean meropenem MIC to above the EUCAST R breakpoint, with the exception of isolates harbouring an OXA carbapenemase and OmpK35-Δ. 
+
+### Make Figure for AMRgen manuscript
+
+The code below was used to combine the UpSet plot (created using the `amr_upset()` function), the grouped carbapenemase/porin status plot, and the mean (median) meropenem MIC table as shown in the AMRgen manuscript. To make these components you need the `cowplot` and `gt` packages as noted above, as well as the `png` packages.
+
+```{r, eval=FALSE}
+library(png)
+
+# Save the table as a temporary file to retrieve it later when combining the panels
+tmp <- tempfile(fileext = ".png")
+
+gtsave(
+  mean_MIC_table_aes,
+  tmp,
+  vwidth = 4000,
+  vheight = 300
+)
+
+table_grob <- rasterGrob(
+  png::readPNG(tmp),
+  interpolate = TRUE
+)
+
+# Add whitespace to the right and left of table
+table_with_padding <- plot_grid(
+  NULL, # left spacer
+  table_grob,
+  NULL, # right spacer
+  ncol = 3,
+  rel_widths = c(0.001, 0.85, 0.149) # 5% white on each side
+)
+
+# Save the figure with panel labels
+ggsave("Figure5_Kp_EuSCAPE.png",
+  plot = plot_grid(
+    kp_mic_upset_kleborate$plot,
+    kp_kleborate_mic_euscape_plot_label,
+    table_with_padding,
+    labels = c("a)", "b)", "c)"),
+    ncol = 1,
+    rel_heights = c(1, 0.9, 0.5)
+  ),
+  width = 12, height = 15,
+  bg = "white"
+)
+```
+
+# Kleborate (<= v3.1.3)
+
+## Import Kleborate (<= v3.1.3) Genotype Data
+
+We will now compare the latest version of Kleborate (which we've been using up until this point) [development branch; commit #4ec1dcb](https://github.com/klebgenomics/Kleborate/tree/development) to an older version of Kleborate v3.1.3. The older versions (<=v3.1.3) of Kleborate uses informal nomenclature to describe mutations (e.g. [gene]-[mutation], [gene]-X%, OmpK36GD), whereas the updated version follows the HGVS nomenclature standard.
+
+A table of Kleborate v3.1.3 results generated for the EuSCAPE genomes is available in the AMRgen package as `kleborate_raw_v313`. Let's import it and summarise its contents:
+
+```{r import_kleborate}
+# View Kleborate output v3.1.3 (using informal nomenclature (e.g. [gene]-[mutation], [gene]-X%, OmpK36GD))
+head(kleborate_raw_v313, n = 10)
+
+# Use import_kleborate() function and set `hgvs = FALSE` for Kleborate outputs generated from <=v3.1.3 (using non-HGVS nomenclature)
+kleborate_v313 <- import_kleborate(
+  input_table = kleborate_raw_v313,
+  sample_col = "strain",
+  hgvs = FALSE
+)
+
+# Summarize genotype table
+summarise_geno(kleborate_v313, sample_col = "id", marker_col = "marker.label")
+```
+
+# Compare Kleborate versions
+
+Comparing only the carbapenem resistance determinants from an updated version Kleborate (development branch) to a previous version (v3.1.3).
+
+```{r compare_kleborate}
+# Grouping Kleborate development branch carbapenem resistance determinants, so that there is one row per sample
+kleborate_dev_markers_grouped <- kleborate_dev %>%
+  filter(Kleborate_Class == "Omp_mutations" | Kleborate_Class == "Bla_Carb_acquired") %>%
+  select(id, marker.label) %>%
+  rename(kleborate = marker.label) %>%
+  group_by(id) %>%
+  summarise(
+    Kleborate_dev_markers = kleborate %>%
+      sort() %>%
+      str_c(collapse = ";")
+  )
+
+# Since we know that Kleborate v3.1.3 does not use HGVS notation, we will change the names of the OmpK36 mutations to match HGVS notation for comparison purposes
+# After, we group them so there is one row per sample
+kleborate_v313_markers_grouped <- kleborate_v313 %>%
+  filter(Kleborate_Class == "Omp_mutations" | Kleborate_Class == "Bla_Carb_acquired") %>%
+  select(id, marker.label) %>%
+  mutate(
+    marker.label = str_replace_all(marker.label, "OmpK36GD", "OmpK36:p.134_135insGD"),
+    marker.label = str_replace_all(marker.label, "OmpK36_c25t", "OmpK36:c.25C>T"),
+    marker.label = str_replace_all(marker.label, "OmpK36TD", "OmpK36:p.136_137insTD")
+  ) %>%
+  rename(kleborate = marker.label) %>%
+  group_by(id) %>%
+  summarise(
+    Kleborate_v313_markers = kleborate %>%
+      sort() %>%
+      str_c(collapse = ";")
+  )
+
+# Joining Kleborate version tables for comparison
+compare_kleborate_versions <- full_join(
+  kleborate_dev_markers_grouped,
+  kleborate_v313_markers_grouped
+)
+
+# Comparing Kleborate versions and creating two new columns to show what each version is missing
+compare_kleborate_versions <- compare_kleborate_versions %>%
+  rowwise() %>%
+  mutate(
+    dev_missing = {
+      v313_vec <- str_split(Kleborate_v313_markers, ";")[[1]]
+      dev_vec <- str_split(Kleborate_dev_markers, ";")[[1]]
+      missing <- setdiff(v313_vec, dev_vec)
+      if (length(missing) == 0) NA_character_ else str_c(missing, collapse = ";")
+    },
+    v313_missing = {
+      v313_vec <- str_split(Kleborate_v313_markers, ";")[[1]]
+      dev_vec <- str_split(Kleborate_dev_markers, ";")[[1]]
+      missing <- setdiff(dev_vec, v313_vec)
+      if (length(missing) == 0) NA_character_ else str_c(missing, collapse = ";")
+    }
+  ) %>%
+  ungroup()
+
+# Table listing the AMR markers that are missing from Kleborate v3.1.3
+compare_kleborate_versions %>% count(v313_missing, sort = TRUE)
+
+# Table listing the AMR markers that are missing from the updated Kleborate version
+compare_kleborate_versions %>% count(dev_missing, sort = TRUE)
+```
+
+We can see that there are no carbapenem resistance determinants missed by the updated Kleborate development version, in fact, the new version now additionally calls OmpK36:p.135_136insD (n=12), and OmpK35:- (n=4), and OmpK36:- (n=2) that were previously unidentified. 
+
+Noting that there are no new carbapenem resistance genes identified in this new version of Kleborate which includes an updated AMR database. 
+
+Up until this point, we have only explored using different versions of Kleborate as the AMR genotyper. The next few sections will explore using different AMR genotypers and comparing their results. 
+
+First up, we have AMRFinderPlus!
+
+# AMRFinderPlus
+
+NCBI has developed [AMRFinderPlus](https://github.com/ncbi/amr/wiki), a tool that identifies AMR genes, resistance-associated point mutations, and select other classes of genes using protein annotations and/or assembled nucleotide sequence, published [here](https://doi.org/10.1038/s41598-021-91456-0). It can only be used via command line and also has [organism-specific options](https://github.com/ncbi/amr/wiki/Running-AMRFinderPlus#--organism-option).
+
+## Import AMRFinderPlus Genotype Data
+
+The download_ebi() can download AMRFinderPlus genotype data from the [EBI AMR portal](https://ftp.ebi.ac.uk/pub/databases/amr_portal/releases/), filter your species of interest, and reformat into AMRgen genotype table. The AMRFinderPlus data being used here is from the [2025-12 release using AMRFinderPlus version v4.0](https://www.ebi.ac.uk/amr/methods/). Noting that as of AMRFinderPlus (v4.2.5), there is additional functionality to identify putatively function disrupting mutations in genes (i.e., ompk35 and ompk36 loss and truncations) that lead to resistance, which Kleborate already identifies. 
+
+```{r download_ebi_amrfp_genotype, eval=F}
+# Download Klebsiella pneumoniae genotype AMRFinderPlus data and re-format the data into an AMRgen genotype table
+amrfp <- download_ebi(
+  data = "genotype", species = "Klebsiella pneumoniae",
+  reformat = T
+)
+
+# Filter for isolates in EuSCAPE paper with meropenem phenotypes and remove contaminated samples
+kp_mero_amrfp <- kp_mero_amrfp %>% 
+  filter(id %in% kp_mero_euscape$id) %>%
+  filter(!id %in% contaminated_assemblies)
+```
+
+A copy of this data frame is avaiable in the AMRgen pacakge as `kp_mero_amrfp`:
+``` {r}
+head(kp_mero_amrfp)
+
+# Summary of carbapenem resistance determinants
+summarise_geno(kp_mero_amrfp)
+```
+
+## Compare AMRFinderPlus to Kleborate genotype results
+
+We are going to compare the AMRFinderPlus results that we just downloaded and the Kleborate development branch results. We know that there are differences between AMRFinderPlus and Kleborate development branch in detecting porin defects: 
+
+- Kleborate development branch does not identify OmpK35_E132K. As per NCBI Reference Gene Catalog, the citation related to this mutation is PMID: 20660684. In the paper, the mutation (ompK35_E132K) is detected in a strain (AIS080884) with an ompk36 mutation (Ser255Thr) that is categorized as low carbapenem resistance. There is no other other literature that experimentally tests this mutation alone, other papers only report the presence of the mutation (often in combination with other mutations/carbapenemases).
+
+- AMRFinderPlus (<v4.2.5) only does not detect nucleotide mutations (e.g., OmpK36_c25t), nor does it detect loss of OmpK35/36 (e.g., OmpK35:- or OmpK36:-). 
+
+As such, these differences make it difficult to compare, so we will simplify and remove `OmpK35:-` and `OmpK36:-` from the Kleborate development branch results. 
+
+
+```{r amrfp_v_kleborate}
+# To count and see the names of the carbapenem resistance determinants
+kp_mero_amrfp %>%
+  filter(drug_class == "Carbapenems") %>%
+  count(marker, sort = TRUE)
+
+# Massaging AMRfp marker names to match Kleborate names
+amrfp_simplified <- kp_mero_amrfp %>%
+  filter(drug_class == "Carbapenems") %>%
+  mutate(
+    marker_amrfp = str_replace_all(marker, "bla", ""),
+    marker_amrfp = str_replace_all(marker_amrfp, "ompK36_D135DGD", "OmpK36:p.134_135insGD"),
+    marker_amrfp = str_replace_all(marker_amrfp, "ompK36_D135DD", "OmpK36:p.135_136insD"),
+    marker_amrfp = str_replace_all(marker_amrfp, "ompK36_T136TDT", "OmpK36:p.136_137insTD")
+  ) %>%
+  select(id, marker_amrfp) %>%
+  rename(AMRfp = marker_amrfp) %>%
+  group_by(id) %>%
+  summarise(
+    AMRfp_markers = AMRfp %>%
+      sort() %>%
+      str_c(collapse = ";")
+  )
+
+# Filtering Kleborate AMR markers
+# Excluding `OmpK35:-` and `OmpK36:-` since we know that AMRfinderplus (<v4.2.5) does not detect loss/truncations of OmpK35 and OmpK36
+kleborate_dev_simplified <- kleborate_dev %>%
+  filter(Kleborate_Class == "Omp_mutations" | Kleborate_Class == "Bla_Carb_acquired") %>%
+  select(id, marker.label) %>%
+  rename(kleborate = marker.label) %>%
+  filter(!kleborate %in% c("OmpK35:-", "OmpK36:-")) %>%
+  group_by(id) %>%
+  summarise(
+    Kleborate_markers = kleborate %>%
+      sort() %>%
+      str_c(collapse = ";")
+  )
+
+# Joining AMRFinderPlus and Kleborate tables
+compare_amrfp_kleborate <- full_join(
+  amrfp_simplified,
+  kleborate_dev_simplified
+)
+
+# Comparing AMRFinderPlus and Kleborate tables and creating two new columns to show what AMRFinderPlus is missing and what Kleborate is missing
+compare_amrfp_kleborate <- compare_amrfp_kleborate %>%
+  rowwise() %>%
+  mutate(
+    Kleborate_dev_missing = {
+      amr_vec <- str_split(AMRfp_markers, ";")[[1]]
+      kleb_vec <- str_split(Kleborate_markers, ";")[[1]]
+      missing <- setdiff(amr_vec, kleb_vec)
+      if (length(missing) == 0) NA_character_ else str_c(missing, collapse = ";")
+    },
+    AMRfp_missing = {
+      amr_vec <- str_split(AMRfp_markers, ";")[[1]]
+      kleb_vec <- str_split(Kleborate_markers, ";")[[1]]
+      missing <- setdiff(kleb_vec, amr_vec)
+      if (length(missing) == 0) NA_character_ else str_c(missing, collapse = ";")
+    }
+  ) %>%
+  ungroup()
+
+# Table listing the AMR markers that are missing from Kleborate (but detected in AMRFinderPlus)
+compare_amrfp_kleborate %>% count(Kleborate_dev_missing, sort = TRUE)
+
+# Table listing the AMR markers that are missing from AMRFinderPlus (but detected in Kleborate)
+compare_amrfp_kleborate %>% count(AMRfp_missing, sort = TRUE)
+```
+
+Since we know there are differences in porin defect detection between Kleborate and AMRFinderPlus, we can focus on the carbapenemase detection. AMRFinderPlus is missing the detection of `CTX-M-33` in one genome and  `KPC-12` in another genome.
+
+### CTX-M-33
+
+However, AMRFinderPlus is not "missing" `CTX-M-33` and `KPC-12` in their database.
+In the case of `CTX-M-33`, it is detected in n=1 genome and is annotated as conferring resistance to `Cephalosporins (3rd gen.)` instead of `Carbapenems`, which is why it has been excluded from the AMR genotype table since we filtered for `drug_class=="Carbapenems"`. 
+
+```{r amrfp_ctxm33}
+# CTX-M-33 is annotated as conferring resistance to Cephalosporins (3rd gen.) and is identified by AMRFinderPlus in Sample SAMEA3721133
+kp_mero_amrfp %>%
+  filter(gene == "blaCTX-M-33") %>%
+  select(id, gene, drug_class)
+
+# Confirming that CTX-M-33 is identified in SAMEA3721133 using Kleborate development branch
+compare_amrfp_kleborate %>%
+  filter(AMRfp_missing == "CTX-M-33") %>%
+  select(id, AMRfp_markers, Kleborate_markers)
+
+# Checking phenotype of SAMEA3721133
+kp_mero_euscape %>%
+  filter(id == "SAMEA3721133") %>%
+  select(id, mic, pheno_eucast, ecoff)
+```
+
+The primary literature that describes CTX-M-33 by [Galani, et al.](https://doi.org/10.1016/j.ijantimicag.2006.11.010) describes the clinical strain of *E. coli* 2439 harbouring CTX-M-33, *E. coli* RC85 recipient, and *E. coli* 2439 transconjugant (aka E. coli RC85 recipient + CTX-M-33). The authors performed antibiotic susceptibility tests on *E. coli* RC85 recipient and *E. coli* 2439 transconjugant which showed that MIC for third gen cephalosporins increased, but the MIC for imipenem remained the when harbouring CTX-M-33. In the Comprehensive Antibiotic Resistance Database (CARD, v4.0.1)  [CTX-M-33](https://card.mcmaster.ca/ontology/38295) is annotated as conferring resistance to cephalosporins (not carbapenems). The EuSCAPE isolate (SAMEA3721133 from above) only harbours CTM-M-33 and is meropenem resistant. Based on this conflicting evidence, it is unclear if `CTX-M-33` in *K. pneumoniae* confers resistance to carbapenems. The experimental work was performed in an *E. coli* strain and not *K. pneumoniae* and only imipenem was the only carbapenem tested. Additional experimental work and epidemiological support from *K. pneumoniae* strains harbouring CTX-M-33 with carbapenem susceptibility test results is needed to understand the substrate activity of CTX-M-33.
+
+### KPC-12
+
+Similarly, in the case of `KPC-12`, it is detected in n=1 genome using AMRFinderPlus and is annotated as conferring resistance to `Cephalosporins (3rd gen.)` instead of `Carbapenems`, which is why it has been excluded from the genotype table since we filtered for `drug_class=="Carbapenems"`. 
+
+```{r amrfp_kpc12}
+# KPC-12 is annotated as conferring resistance to Cephalosporins (3rd gen.)  and is identified by AMRFinderPlus in Sample SAMEA3649729
+kp_mero_amrfp %>%
+  filter(gene == "blaKPC-12") %>%
+  select(id, gene, drug_class)
+
+# Confirming that KPC-12 is identified in SAMEA3649729 using Kleborate. It also harbours OmpK36:p.134_135insGD
+compare_amrfp_kleborate %>%
+  filter(AMRfp_missing == "KPC-12") %>%
+  select(id, AMRfp_markers, Kleborate_markers)
+
+# Checking phenotype of SAMEA3649729
+kp_mero_euscape %>%
+  filter(id == "SAMEA3649729") %>%
+  select(id, mic, pheno_eucast, ecoff)
+```
+
+The only primary literature discussing KPC-12 is by [Han, et al.](https://doi.org/10.2147/IDR.S465699) where they show that a strain of *E. coli* DH5alpha + empty plasmid vs. *E. coli* DH5alpha + plasmid with `KPC-12` does not change meropenem MIC (0.06mg/L) with small elevation in imipenem (0.25 vs. 1 mg/L) and ertapenem (<=0.12mg/L vs. 0.25 mg/L) MICs. Whereas, *E. coli* DH5alpha + empty plasmid vs. *E. coli* DH5alpha + plasmid with `KPC-12` elevates the MICs for ceftriaxone (<=0.25 vs. >=64 mg/L, 3rd gen cephalosporin) and cefuroxime (8 vs. 256 mg/L, 2nd gen cephalosporin). This experiment was performed in an *E. coli* strain, not *K. pneumoniae*. In the EuSCAPE dataset (from above), there is only one isolate (SAMEA3649729) with KPC-12, which harbours both KPC-12 and OmpK36:p.134_135insGD and is meropenem resistant. Based on this conflicting evidence, it is not clear whether `KPC-12` should be changed to conferring resistance to 3rd generation cephalosporins or carbapenems. Similar to Kleborate, CARD v4.0.1 also has [KPC-12](https://card.mcmaster.ca/ontology/38722) annotated as a carbapenemase. Further experimental work performed in a *K. pneumoniae* strain and having additional evidence from *K. pneumoniae* strains with genotype-phenotype data can help strengthen our understanding of its substrate specificity. 
+
+We will not continue to investigate what is missing in the Kleborate development branch, compared to AMRFinderPlus, for the purpose and lengthiness of this vignette. These two examples act as ways to investigate differences in AMR genotypers and how ultimately, choosing a specific genotyper will impact the foundation upon which we understand AMR genotype-phenotype relationships. 
+
+Next up, we have the Resistance Gene Identifier (RGI)!
+
+# Resistance Gene Identifier (RGI) 
+
+[RGI](https://github.com/arpcard/rgi) identifies resistance determinants from protein or nucleotide data using homology and mutation models, published [here](https://doi.org/10.1093/nar/gkac920). The software uses reference data from the [Comprehensive Antibiotic Resistance Database](https://card.mcmaster.ca/). It can be run via the  [website](https://card.mcmaster.ca/analyze/rgi) or on the [command line](https://github.com/arpcard/rgi).
+
+CARD is an ontology-drive database including resistance genes, their products, and the antibiotics they confer resistance towards. CARD operates at both a drug class and drug-specific level, where curators can establish `gene confers_resistance_to_drug antibiotic` relationships. Lastly, CARD includes both intrinsic/core and acquired resistance determinants. 
+
+## Import Resistance Gene Identifier (RGI) results
+
+Import RGI results using the `import_rgi()` function. This function imports and processes genotyping results from RGI extracting antimicrobial resistance determinants and mapping them to standardised drug classes/antibiotics. It also shortens determinant names using `CARD Short Names` as provided by CARD (https://card.mcmaster.ca/download in `aro_index.tsv`).
+
+**Note**: Check the number of genomes that you expect using `summarise_geno()`. RGI text output will be empty if there are no AMR determinants identified in the submitted genome, so you have to either:
+
+1) Add sample IDs with no AMR determinants into the `ORF_ID` column of the RGI text output that you are importing, or
+
+2) List your sample IDs in a vector using the `samples_no_amr=` parameter in the `import_rgi()` function. For example, `import_rgi(rgi_EuSCAPE_raw, samples_no_amr = c("SampleA_noAMR", "SampleB_noAMR", "SampleC_noAMR"))`
+
+The data frame `rgi_EuSCAPE_raw` included in the AMRgen package provides CARD RGI calls for the EuSCAPE genomes:
+```{r import_rgi}
+# Sample IDs with no AMR determinants have been added to rgi_EuSCAPE_raw under the `ORF_ID` column with the rest of the rows left blank
+tail(rgi_EuSCAPE_raw, n = 10)
+
+# Import RGI output with n=1490 isolates
+rgi <- import_rgi(rgi_EuSCAPE_raw)
+
+# Summarize genotype data
+summarise_geno(rgi)
+```
+
+## Generate Binary Matrix for RGI AMR Markers
+```{r}
+rgi_binary_matrix <- get_binary_matrix(
+  geno_table = rgi,
+  pheno_table = kp_mero_euscape,
+  antibiotic = "Meropenem",
+  drug_class_list = c("Carbapenems"),
+  sir_col = "pheno_eucast",
+  marker_col = "marker.label",
+  keep_assay_values = TRUE,
+  keep_assay_values_from = "mic"
+)
+```
+
+## Solo PPV Analysis for RGI AMR Markers
+```{r rgi_soloPPV, eval=FALSE}
+# No solo markers error when you run solo_ppv_analysis()! Since CARD/RGI includes intrinsic and acquired resistance determinants, there could be intrinsic / core resistance determinants that are found across most (if not all) genomes which obstructs our view of carbapenem resistance determinants found alone.
+
+soloPPV_rgi_mero <- solo_ppv_analysis(binary_matrix = rgi_binary_matrix)
+```
+
+As such, we will exclude the core/intrinsic resistance determinants, using their prevalence and exclude any determinants identified across more than 80% of genomes.
+```{r rgi_binary_prev_filt}
+rgi_binary_matrix_prev80 <- rgi_binary_matrix %>%
+  select(where(~ {
+    if (is.numeric(.x)) {
+      prop_ones <- mean(.x == 1, na.rm = TRUE) # fraction of 1s
+      prop_ones <= 0.80 # keep only if ≤ 80% prevalent across all genomes
+    } else {
+      TRUE
+    }
+  }))
+```
+
+Try running the `solo_ppv_analysis()` function again. 
+```{r rgi_soloPPV_2, fig.height=5}
+soloPPV_rgi_mero <- solo_ppv_analysis(binary_matrix = rgi_binary_matrix_prev80)
+# Count number of genomes that have either `MdtQ` or `MdtQ:-`
+sum(rgi_binary_matrix_prev80$MdtQ == "1" | rgi_binary_matrix_prev80$`MdtQ..-` == "1", na.rm = TRUE)
+```
+
+Only MdtQ and MdtQ variants (MdtQ:-) were identified in the absence of other carbapenem resistance determinants. [MdtQ](https://card.mcmaster.ca/ontology/46464) is an outer-membrane porin identified in *K. pneumoniae* - however the primary paper by [Fan, et al. ](10.1016/j.micpath.2023.106325) only reports a clinical strain with MdtQ resistant to carbapenems, but does not show antibiotic susceptibility tests for proper controls of the same strain with MdtQ vs. without MdtQ. In addition, 96% (n=1429/1490) of the EuSCAPE *K. pneumoniae* genomes have MdtQ or a variant of MdtQ, therefore it could be considered a core gene. In summary, because is no compelling evidence that MdtQ confers resistance to carbapenems and that it is likely a core gene, we will exclude it from further analyses.
+
+```{r rgi_binary_prev_filt_mdtq}
+# Exclude MdtQ and MdtQ:- from the binary matrix
+rgi_binary_matrix_prev80 <- rgi_binary_matrix_prev80 %>% select(-MdtQ, -`MdtQ..-`)
+```
+
+Try running the `solo_ppv_analysis()` function... again. 
+
+```{r rgi_soloPPV_3, fig.height=6,fig.height=10}
+soloPPV_rgi_mero <- solo_ppv_analysis(binary_matrix = rgi_binary_matrix_prev80)
+```
+
+We can finally see carbapenem resistance determinants alone! Since RGI does not detect porin defects, many AMR markers are found alone compared to Kleborate's and AMRFinderPlus' `solo_ppv_analysis()` (see below). For example, NDM-1 alone was found in 59 resistant genomes (using RGI), 31 resistant genomes (using Kleborate), and 41 resistant genomes (using AMRFinderPlus). 
+
+```{r, soloPPV_kleborate_show, fig.height=6,fig.height=10}
+soloPPV_kleborate_mero
+```
+
+```{r soloPPV_amrfp}
+# Generate binary matrix for AMRFinderPlus
+amrfp_binary_matrix <- get_binary_matrix(
+  geno_table = kp_mero_amrfp,
+  pheno_table = kp_mero_euscape,
+  antibiotic = "Meropenem",
+  drug_class_list = c("Carbapenems"),
+  sir_col = "pheno_eucast",
+  keep_assay_values = TRUE,
+  keep_assay_values_from = "mic"
+)
+# Solo PPV analysis
+soloPPV_amrfp_mero <- solo_ppv_analysis(binary_matrix = amrfp_binary_matrix)
+```
+
+## Combinatorial PPV Analysis for RGI AMR Markers
+
+```{r rgi_comboPPV, fig.height=8, fig.width=10}
+comboPPV_rgi_mero <- ppv(
+  binary_matrix = rgi_binary_matrix_prev80,
+  order = "value",
+  min_set_size = 2,
+  antibiotic = "Meropenem",
+  upset_grid = TRUE,
+  plot_assay = TRUE,
+  assay = "mic"
+)
+# Summary of combinatorial PPV
+comboPPV_rgi_mero$summary
+```
+
+Evidently, RGI does not detect OmpK35 or OmpK36 defects so we can only compare the carbapenemases that are detected by RGI vs. Kleborate development branch.
+
+## Compare RGI to Kleborate Genotype Results
+
+```{r rgi_v_kleborate}
+rgi_simplified <- rgi %>%
+  filter(drug_class == "Carbapenems") %>%
+  filter(`Resistance Mechanism` == "antibiotic inactivation") %>%
+  select(id, marker.label) %>%
+  distinct() %>%
+  rename(rgi = marker.label) %>%
+  group_by(id) %>%
+  summarise(
+    rgi_markers = rgi %>%
+      sort() %>%
+      str_c(collapse = ";")
+  )
+
+kleborate_simplified <- kleborate_dev %>%
+  filter(Kleborate_Class == "Omp_mutations" | Kleborate_Class == "Bla_Carb_acquired") %>%
+  select(id, marker.label) %>%
+  rename(kleborate = marker.label) %>%
+  filter(!grepl("OmpK35|OmpK36", kleborate)) %>%
+  group_by(id) %>%
+  summarise(
+    Kleborate_markers = kleborate %>%
+      sort() %>%
+      str_c(collapse = ";")
+  )
+
+compare_rgi_kleborate <- full_join(
+  rgi_simplified,
+  kleborate_simplified
+)
+
+compare_rgi_kleborate <- compare_rgi_kleborate %>%
+  rowwise() %>%
+  mutate(
+    Kleborate_missing = {
+      rgi_vec <- str_split(rgi_markers, ";")[[1]]
+      kleb_vec <- str_split(Kleborate_markers, ";")[[1]]
+      missing <- setdiff(rgi_vec, kleb_vec)
+      if (length(missing) == 0) NA_character_ else str_c(missing, collapse = ";")
+    },
+    rgi_missing = {
+      rgi_vec <- str_split(rgi_markers, ";")[[1]]
+      kleb_vec <- str_split(Kleborate_markers, ";")[[1]]
+      missing <- setdiff(kleb_vec, rgi_vec)
+      if (length(missing) == 0) NA_character_ else str_c(missing, collapse = ";")
+    }
+  ) %>%
+  ungroup()
+
+compare_rgi_kleborate %>% count(Kleborate_missing, sort = TRUE)
+compare_rgi_kleborate %>% count(rgi_missing, sort = TRUE)
+```
+
+## Combining Kleborate, AMRFinderPlus and RGI results
+
+To merge Kleborate (development branch), AMRFinderPlus (from EBI), and RGI results, we will combine the binary matrices generated by `get_binary_matrix()`.
+
+Note that you will have to inspect how each of the AMR markers are named and change them so that they match and can be merged, for example AMRFinderPlus appends "bla" in front of all beta-lactamases, whereas RGI and Kleborate do not. **This assumes that the same name is referring to the same reference sequence that is used in each tool/database which is not necessarily true** (even if we wish it were true). Hypothetical example, the NDM-1 sequence in CARD/RGI is `ABCD` vs. AMRFinderPlus NDM-1 sequence is `ACCD` vs. Kleborate NDM-1 sequence is `ACDD`. All AMR databases strive to use the same reference accessions and sequences, but sometimes there can be discrepancies, which need to be kept in mind. 
+
+In the following code, unique AMR markers (i.e., only identified by one AMR genotyper) will be have a suffix to describe the AMR genotyper that it is found by (e.g., Kpne_KpnG will be Kpne_KpnG_rgi). AMR markers identified by more than one genotyper will be merged, where if it was identified by any genotyper in that sample, the binary matrix will have a `1` (present), otherwise `0` (absent).
+
+```{r combine_kleborate_amrfp_rgi}
+# Phenotype columns to remove (that we can put back in later)
+cols_to_remove <- c("pheno", "ecoff", "mic", "R", "NWT")
+
+# Remove columns
+# We will be using the RGI binary matrix where core/intrinsic genes are removed
+df_rgi <- rgi_binary_matrix_prev80 %>% select(-cols_to_remove)
+df_kleborate <- kleborate_binary_matrix %>% select(-cols_to_remove)
+
+# Massage AMRFinderPlus marker names to match RGI/Kleborate
+df_amrfp <- amrfp_binary_matrix %>%
+  select(-cols_to_remove) %>%
+  rename("OmpK36..p.134_135insGD" = "ompK36_D135DGD") %>%
+  rename("OmpK36..p.135_136insD" = "ompK36_D135DD") %>%
+  rename("OmpK36..p.136_137insTD" = "ompK36_T136TDT")
+colnames(df_amrfp) <- gsub("bla", "", colnames(df_amrfp))
+
+# Sort each dataframe by id to make sure the samples are all in the same order
+df_rgi <- df_rgi[order(df_rgi[[1]]), ]
+df_amrfp <- df_amrfp[order(df_amrfp[[1]]), ]
+df_kleborate <- df_kleborate[order(df_kleborate[[1]]), ]
+
+# All column names (excluding id)
+cols_rgi <- colnames(df_rgi)[-1]
+cols_amrfp <- colnames(df_amrfp)[-1]
+cols_kleb <- colnames(df_kleborate)[-1]
+
+all_cols <- unique(c(cols_rgi, cols_amrfp, cols_kleb))
+
+# Function to safely get column or return 0s
+get_col <- function(df, col) {
+  if (col %in% colnames(df)) {
+    df[[col]]
+  } else {
+    rep(0, nrow(df))
+  }
+}
+
+# Initialize final df (assuming same order of ids)
+combined_binary_matrix <- data.frame(id = df_rgi[[1]])
+
+# Merge all columns using OR
+for (col in all_cols) {
+  combined_binary_matrix[[col]] <- as.integer(
+    get_col(df_rgi, col) |
+      get_col(df_amrfp, col) |
+      get_col(df_kleborate, col)
+  )
+}
+
+# Identify unique columns (present in ONLY one binary matrix)
+unique_rgi <- setdiff(cols_rgi, union(cols_amrfp, cols_kleb))
+unique_amrfp <- setdiff(cols_amrfp, union(cols_rgi, cols_kleb))
+unique_kleb <- setdiff(cols_kleb, union(cols_rgi, cols_amrfp))
+
+# Rename unique columns with suffix
+colnames(combined_binary_matrix)[colnames(combined_binary_matrix) %in% unique_rgi] <- paste0(unique_rgi, "_rgi")
+colnames(combined_binary_matrix)[colnames(combined_binary_matrix) %in% unique_amrfp] <- paste0(unique_amrfp, "_amrfp")
+colnames(combined_binary_matrix)[colnames(combined_binary_matrix) %in% unique_kleb] <- paste0(unique_kleb, "_kleborate")
+
+# Joining back the phenotype columns
+phenotype_cols <- rgi_binary_matrix_prev80 %>%
+  select(id, pheno, ecoff, mic, R, NWT)
+
+# Merge back into your final_df
+combined_binary_matrix <- combined_binary_matrix %>%
+  left_join(phenotype_cols, by = "id")
+
+# Relocate phenotype columns to the front
+combined_binary_matrix <- combined_binary_matrix %>%
+  relocate(pheno, ecoff, mic, R, NWT, .after = id)
+```
+
+## Solo PPV Analysis for AMRFinderPlus, RGI, Kleborate AMR Markers
+```{r soloPPV_combined, message=FALSE}
+combined_solo_ppv <- solo_ppv_analysis(binary_matrix = combined_binary_matrix)
+```
+From this solo_ppv_analysis() plot, we can see that there are markers that are found alone which have strong support for a particular phenotype, e.g., NDM-1 association with meropenem resistance (n=31/38 R isolates), LptD:- identified by RGI (associated with susceptibility). Noting that LptD:- indicates a variant of LptD, so the variants need to be further investigated to see if there is a particular mutation/defect that is associated with meropenem susceptibility. 
+
+Another way to investigate the association between AMR markers and meropenem susceptibility is to use the `amr_logistic()` function to perform logistic regression to analyse the relationship between the markers and a specified antibiotic.
+
+## Logistic regression for AMRFinderPlus, RGI, Kleborate AMR Markers
+```{r log_reg_combined}
+combined_logist <- amr_logistic(
+  binary_matrix = combined_binary_matrix,
+  antibiotic = "meropenem",
+  ecoff_col = "ecoff",
+  maf = 10, # filter for AMR markers in at least 10 samples
+  single_plot = TRUE
+)
+# model coefficients
+combined_logist$modelR
+```
+A coefficient above zero indicates that the presence of the AMR marker increases the likelihood of resistance, whereas a coefficient below zero indicates a decrease in probability of resistance. From the plot above showing AMR markers found in more than 10 isolates, majority have a positive association with resistance with the exception of LptD:- (identified by RGI), ompK35_E132K (identified by AMRFinderPlus), and VIM-4:- (identified by RGI). 
+
+
+## Combinatorial PPV Analysis for AMRFinderPlus, RGI, Kleborate AMR Markers
+```{r comboPPV_combined}
+comboPPV_combined_mero <- ppv(
+  binary_matrix = combined_binary_matrix,
+  order = "value",
+  min_set_size = 2,
+  antibiotic = "Meropenem",
+  upset_grid = TRUE,
+  plot_assay = TRUE,
+  assay = "mic"
+)
+
+comboPPV_combined_mero$summary
+```
+
+
+