Comparative Analysis of Diverse Bioinformatics Pipeline Configurations Based on Mock Communities and Biopsy Samples
Statistical report for a comparative analysis of diverse pipeline configurations using mock community samples and biopsy samples obtained during regular colonoscopy from patients with primary sclerosing cholangitis (PSC).
This study highlights the biases that bioinformatics processing itself can introduce, and emphasizes that results from similar studies should be presented with caution when the variation caused by data processing has not been considered. It presents a comparative analysis of a custom dataset of 16S rRNA amplicon sequencing reads, processed with various settings across three commonly used pipelines: DADA2, VSEARCH, and Deblur.
This study is based on data from research on the pathology of primary sclerosing cholangitis (PSC). The biological material was obtained during routine protocol colonoscopies from patients who had undergone liver transplantation for PSC and regularly attended follow-up visits at IKEM. The study was performed in accordance with the Declaration of Helsinki, including the changes accepted during the 59th WMA General Assembly, and was approved by the Joint Ethics Committee of IKEM and Thomayer Hospital. All patients provided dedicated informed consent.
In addition, mock community samples (ZymoBIOMICS Microbial Community DNA Standard) were used as internal standards. The resulting dataset of 62 samples comprises 22 internal standards, 20 cecum biopsy samples from patients who underwent liver transplantation for PSC and did not develop recurrence, and 20 control cecum biopsy samples from patients who underwent liver transplantation due to alcohol use disorder (AUD).
The report is divided into two parts.
Part 1 covers the mock community analysis together with the negative control samples, which are used to assess the degree of contamination in individual libraries.
Report Part 1 is available at: https://xpolak37.github.io/CA-Microbiome/Report%20Part1.html
Part 2 covers the analysis of the real biopsy samples.
Report Part 2 is available at: https://xpolak37.github.io/CA-Microbiome/Report%20Part2.html