BreakTracer is designed to identify inserted sequence fragments at structural variant (SV) breakpoints using long-read sequencing data. It can detect mobile element insertions, viral sequence integrations, and internal tandem duplications (ITDs), such as FLT3-ITDs. In contrast to generic SV callers like delly, BreakTracer can resolve complex SVs mediated by such insertions and link the inserted sequences to adjacent genomic rearrangements.
BreakTracer is available as a Bioconda package, as a pre-compiled statically linked binary, as a minimal Docker container, or as a Singularity container (SIF file). After downloading the static binary, make it executable, e.g.:
chmod a+x ./breaktracer-v0.2.8-linux-amd64
./breaktracer-v0.2.8-linux-amd64
BreakTracer can be built from source using a recursive clone and make. BreakTracer depends on HTSlib and Boost.
git clone --recursive https://github.com/tobiasrausch/breaktracer.git
cd breaktracer/
make all
To identify L1 fragments at SV breakpoints in long-read sequencing data:
breaktracer find -n L1 -g hg38.fa input.bam > bp.ins.vcf
BreakTracer can also identify a custom FASTA sequence inserted at SV breakpoints. For instance, to identify a human papillomavirus (HPV) integration, you can use:
breaktracer find -e hpv.seq.fa -g hg38.fa input.bam > bp.ins.vcf
or with BCF output:
breaktracer find -e hpv.seq.fa -g hg38.fa -o bp.ins.bcf input.bam
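As a quick sanity check on the VCF written by the commands above (bp.ins.vcf), you can count its variant records. The sketch below assumes only the standard VCF layout, in which header lines start with `#`; the function name `count_vcf_records` is a hypothetical helper and not part of BreakTracer.

```shell
# Hypothetical helper (not part of BreakTracer): count the variant records
# in an uncompressed VCF file.
# Assumption: standard VCF layout, where every header line starts with '#'.
count_vcf_records() {
    # Skip header lines and blank lines; "n + 0" prints 0 when nothing matched.
    awk '!/^#/ && NF { n++ } END { print n + 0 }' "$1"
}
```

For example, `count_vcf_records bp.ins.vcf` prints the number of calls in the file. A BCF file would first need to be converted to text, for instance with bcftools.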
Because BreakTracer also identifies plain insertions, it can be used to detect tandem duplications such as FLT3 internal tandem duplications (FLT3-ITDs). In this case, the duplicated FLT3 segment serves as the insertion source sequence, which is extracted from the reference and indexed:
samtools faidx hg38.fa chr13:28033983-28034368 | sed 's/>.*$/>source/' > tdsource.fa
samtools faidx tdsource.fa
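samtools region strings are 1-based and inclusive, so the length of the extracted source segment is end - start + 1. The following sketch computes that length for the FLT3 region used above; `region_length` is a hypothetical helper and assumes a region of the form `contig:start-end` without thousands separators.

```shell
# Hypothetical helper: length of a samtools-style region "contig:start-end".
# Assumption: coordinates are 1-based, inclusive, and written without commas.
region_length() {
    range="${1##*:}"      # strip the contig name, e.g. "28033983-28034368"
    start="${range%-*}"   # text before the dash
    end="${range#*-}"     # text after the dash
    echo $(( end - start + 1 ))
}

region_length chr13:28033983-28034368   # prints 386
```

The source sequence in tdsource.fa is therefore 386 bp long.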
Make sure to adjust the insertion parameters so that small insertions (≥15 bp) are detected. To speed up the analysis, you may also want to subset the BAM file to the region of interest:
samtools view -b input.bam chr13:28033983-28034368 > sub.bam
samtools index sub.bam
breaktracer find -m 15 -c 15 -s 15 -r 0 -e tdsource.fa -g hg38.fa sub.bam
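When screening another locus, the region has to be changed consistently in both the source extraction and the BAM subsetting step. The sketch below prints (it does not run) the same five commands for a given region, BAM file, and reference; `print_itd_workflow` is a hypothetical helper, and it uses only the commands and flags shown above.

```shell
# Hypothetical helper: print, without running, the tandem-duplication
# workflow for one region. Only commands and flags from this README are used.
print_itd_workflow() {
    region="$1"   # e.g. chr13:28033983-28034368
    bam="$2"      # indexed long-read alignment
    ref="$3"      # reference FASTA
    cat <<EOF
samtools faidx $ref $region | sed 's/>.*\$/>source/' > tdsource.fa
samtools faidx tdsource.fa
samtools view -b $bam $region > sub.bam
samtools index sub.bam
breaktracer find -m 15 -c 15 -s 15 -r 0 -e tdsource.fa -g $ref sub.bam
EOF
}

print_itd_workflow chr13:28033983-28034368 input.bam hg38.fa
```

Review the printed commands before running them, for example by redirecting the output into a script file.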
BreakTracer is free and open source (BSD). Consult the accompanying LICENSE file for more details.
HTSlib is heavily used for alignment processing, Boost for various data structures and algorithms, and Edlib for pairwise alignments using edit distance.