Questions about the tvr_consensus in tlens_by_allele.tsv file #33

@Flora076

Hello Zach,
Wonderful software! I am using it on Schmidt’s data. In other discussions you suggested matching the reads in tel_reads.fa.gz against the supporting reads column of tlens_by_allele.tsv and pulling out the matching sequences.
I have a question about this: each allele lists more than one read ID in the supporting reads column of the tsv file. I extracted the corresponding sequences and they are not identical, yet they are still clustered together. May I know why?
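A minimal sketch of the extraction step described above, for reference. It assumes the supporting reads value is a comma-separated list of read IDs and that each FASTA header starts with the read ID; both are assumptions, and the helper names `read_fasta` and `extract_supporting_reads` are hypothetical, not part of the tool.

```python
import gzip


def read_fasta(handle):
    """Yield (read_id, sequence) pairs from an open text-mode FASTA handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Assumption: the read ID is the first whitespace-delimited token.
            parts = line[1:].split()
            name, chunks = (parts[0] if parts else ""), []
        else:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def extract_supporting_reads(fasta_handle, supporting_reads, sep=","):
    """Return {read_id: sequence} for the IDs listed in one tsv cell.

    Assumption: IDs in the supporting reads cell are separated by `sep`.
    """
    wanted = {r.strip() for r in supporting_reads.split(sep) if r.strip()}
    return {name: seq for name, seq in read_fasta(fasta_handle) if name in wanted}


# Example usage (file name from the post; the cell value is a placeholder):
# with gzip.open("tel_reads.fa.gz", "rt") as fh:
#     reads = extract_supporting_reads(fh, "read_id_1,read_id_2")
```

Collecting the sequences into a dictionary keyed by read ID makes it easy to compare the reads that support a single allele side by side.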
I got the following result from the tvr_consensus in the tsv file:

CCCCCCAAAAAAAAVVVVVVVVCCCCCCCGGGGGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNCCCNCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFCFFFFFFFFFFFFFFCFFFCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTTTTTTTVVVVVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCVVVVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCLLLLLLLCCCCCCCCCCCCCCCCLLLLLLCCCCCCCCLLLCCCCCCCLLLLLLLLLLLLLLLLLLLCCCCMCLLLLLLLLLLLLLLLLLLLCCCLLLLLLLLLLLLCVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTTTTTTTTTT

The canonical motif TTAGGG is represented by the symbol C. Does the C in the tvr_consensus string above represent that symbol, or the base C (cytosine) from the four DNA bases A, C, G and T?
Are the colours in all_final_alleles.png reflected in the tvr_consensus string?
I matched tel_reads.fa.gz against the ‘supporting reads’ column in tlens_by_allele.tsv, then converted the sequence motifs into their respective symbols. However, my result differs from the tvr_consensus in the tsv file. Is there an explanation for the difference?
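
A minimal sketch of the motif-to-symbol conversion described above, to make the comparison concrete. Only the TTAGGG to C mapping comes from this post; the function name `encode_motifs`, the `?` placeholder for unmatched bases, and the `per_base` option are assumptions for illustration. Whether the tool writes one symbol per motif or one per base, and in which strand orientation it reads the sequences, is not confirmed here.

```python
def encode_motifs(seq, motif_symbols, unknown="?", per_base=False):
    """Greedily replace known motifs in `seq` with their symbols.

    motif_symbols: dict mapping motif sequence -> single-character symbol.
    unknown:       symbol emitted for a base that starts no known motif.
    per_base:      if True, repeat the symbol once per base of the motif
                   (hypothetical option; the tool's convention is unknown).
    """
    seq = seq.upper()
    # Try longer motifs first so a short motif cannot shadow a longer one.
    motifs = sorted(motif_symbols, key=len, reverse=True)
    out = []
    i = 0
    while i < len(seq):
        for motif in motifs:
            if seq.startswith(motif, i):
                repeat = len(motif) if per_base else 1
                out.append(motif_symbols[motif] * repeat)
                i += len(motif)
                break
        else:
            # No motif starts here: emit the placeholder and advance one base.
            out.append(unknown)
            i += 1
    return "".join(out)


# Only the canonical mapping is taken from the post; the other symbols
# would have to come from the tool's own legend.
MOTIF_SYMBOLS = {"TTAGGG": "C"}

print(encode_motifs("TTAGGGTTAGGGACTTAGGG", MOTIF_SYMBOLS))
```

Running both settings of `per_base` on the same read shows how strongly the encoding convention changes the string length, which is one thing to rule out when a hand-made encoding does not match the reported consensus.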

Looking forward to your clarification!
Thank you.
Flora
