STOLEN is a collection of self-written code snippets that turned out to be useful in the everyday life of a bioinformatician.
- batch_correct.py: Corrects RNA expression data from different sources for potential batch effects using pyComBat.
- merge_megsap_expression_files.py: Merges RNA count files from megSAP into a single TSV file containing TPM values.
- remove_duplicate_read_names.py: Removes duplicate read names from a fastq.gz file. Redundant reads with the same name are discarded.
- prepare_strelka_for_pvacseq.py: Adds the annotations to Strelka2 VCF files that are necessary for the pVACseq neoantigen prediction pipeline.
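
To illustrate the idea behind read-name deduplication, here is a minimal sketch using only the Python standard library. It is not the code of remove_duplicate_read_names.py: the function name is hypothetical, and it assumes standard four-line FASTQ records and that the first record seen for each read name is the one to keep.

```python
import gzip


def drop_duplicate_read_names(in_path, out_path):
    """Copy a gzipped FASTQ file, keeping one record per read name.

    Hypothetical helper for illustration. Assumes four-line records and
    keeps the first occurrence of each name (an assumption, not
    necessarily what the script in this repository does).
    Returns a (kept, dropped) tuple of record counts.
    """
    seen = set()
    kept = dropped = 0
    with gzip.open(in_path, "rt") as fin, gzip.open(out_path, "wt") as fout:
        while True:
            # One FASTQ record: header, sequence, separator, qualities.
            record = [fin.readline() for _ in range(4)]
            if not record[0].strip():
                break  # end of file (or trailing blank line)
            # Read name: header without the leading "@", up to the first whitespace.
            name = record[0][1:].split()[0]
            if name in seen:
                dropped += 1
                continue
            seen.add(name)
            fout.writelines(record)
            kept += 1
    return kept, dropped
```

Called as `drop_duplicate_read_names("in.fastq.gz", "out.fastq.gz")`, it returns how many records were written and how many were skipped. Holding all read names in a set is simple but memory-hungry for very large files.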