Error in function annotateCn  #3

@nalcala

Hi,

Thanks for the great tool! I am getting an error with a very segmented tumor (more CN segments than variants), caused by this line:

vcfGR$cn[S4Vectors::to(overlaps)] <- cnaGR$cn[S4Vectors::from(overlaps)]

The indices appear to be swapped in the code; the line should read
vcfGR$cn[S4Vectors::from(overlaps)] <- cnaGR$cn[S4Vectors::to(overlaps)]
instead of
vcfGR$cn[S4Vectors::to(overlaps)] <- cnaGR$cn[S4Vectors::from(overlaps)]

Indeed, the overlaps are computed as overlaps <- GenomicRanges::findOverlaps(vcfGR, cnaGR), so "from" indexes the query, vcfGR (1827 ranges in my case), and "to" indexes the subject, cnaGR (2188 ranges in my case). The code does the opposite: it extracts the "from" values from cnaGR and assigns them to the "to" positions in vcfGR. This does not throw an error unless cnaGR has more ranges than vcfGR, in which case to(overlaps) will likely contain indices beyond the length of vcfGR, hence the crash. Even when it runs without an error, the swap leads to a wrong annotation of the VCF.
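To make the index direction concrete, here is a minimal sketch on toy data (the coordinates and copy numbers are made up for illustration and are not from my sample). It assumes the GenomicRanges package is installed and only shows the proposed from/to assignment, not the full annotateCn function:

```r
library(GenomicRanges)

# Hypothetical toy data: 2 variants (query) and 3 CN segments (subject),
# i.e. more segments than variants, as in the failing case.
vcfGR <- GRanges(
  seqnames = c("chr1", "chr2"),
  ranges   = IRanges(start = c(150, 50), width = 1)
)
cnaGR <- GRanges(
  seqnames = c("chr1", "chr1", "chr2"),
  ranges   = IRanges(start = c(1, 101, 1), end = c(100, 200, 100)),
  cn       = c(2, 4, 3)
)

# Query is vcfGR, subject is cnaGR.
overlaps <- GenomicRanges::findOverlaps(vcfGR, cnaGR)
variantIdx <- S4Vectors::from(overlaps)  # indices into vcfGR (query)
segmentIdx <- S4Vectors::to(overlaps)    # indices into cnaGR (subject)

# Proposed assignment: write into vcfGR at the query indices,
# reading from cnaGR at the subject indices.
vcfGR$cn <- rep(NA_real_, length(vcfGR))
vcfGR$cn[variantIdx] <- cnaGR$cn[segmentIdx]

# chr1:150 falls in segment chr1:101-200 (cn = 4),
# chr2:50 falls in segment chr2:1-100 (cn = 3).
print(vcfGR)
```

Here to(overlaps) contains the index 3, which does not exist in the 2-element vcfGR, so using it to index vcfGR is what goes wrong in the swapped version.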

For example, in my case, the 1000th entry in overlaps leads to:

cnaGR[S4Vectors::from(overlaps)[1000],]
GRanges object with 1 range and 1 metadata column:
      seqnames            ranges strand |        cn
         <Rle>         <IRanges>  <Rle> | <numeric>
  [1]     chr8 94391001-95165000      |    4.9207
vcfGR[S4Vectors::to(overlaps)[1000],]
GRanges object with 1 range and 6 metadata columns:
                      seqnames    ranges strand | paramRangeID            REF                ALT      QUAL      FILTER        cn
                         <Rle> <IRanges>  <Rle> |     <factor> <DNAStringSet> <DNAStringSetList> <numeric> <character> <numeric>
  chr10:122588467_G/A    chr10 122588467       |           NA              G                  A        NA        PASS    7.9278

which are not even on the same chromosome, while the correct pairing would be:

vcfGR[S4Vectors::from(overlaps)[1000],] 
GRanges object with 1 range and 6 metadata columns:
                     seqnames    ranges strand | paramRangeID            REF                ALT      QUAL      FILTER        cn
                        <Rle> <IRanges>  <Rle> |     <factor> <DNAStringSet> <DNAStringSetList> <numeric> <character> <numeric>
  chr9:129660438_G/C     chr9 129660438       |           NA              G                  C        NA        PASS    5.8201
cnaGR[S4Vectors::to(overlaps)[1000],]
GRanges object with 1 range and 1 metadata column:
      seqnames              ranges strand |        cn
         <Rle>           <IRanges>  <Rle> | <numeric>
  [1]     chr9 129060001-129755000       |    5.8201

Thanks!

Nicolas
