Hello Zach,
Wonderful software; I am using it on Schmidt’s data. I noticed that in other discussions you suggested matching the read IDs in the supporting reads column of tlens_by_allele.tsv against the FASTA records in tel_reads.fa.gz, and pulling out the matching sequences.
I have a question about this: the supporting reads column in the TSV file lists more than one read ID per allele. I extracted the corresponding sequences and they are not identical, yet they are still clustered together. May I know why?
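The matching step described above can be sketched roughly as follows. This is only a minimal sketch under my own assumptions, not Telogator's code: the column header `supporting_reads` and the comma separator between read IDs are guesses that should be checked against the actual TSV header, and the function names are hypothetical.

```python
import csv
import gzip


def read_fasta(path):
    """Return {read_id: sequence} from a plain or gzipped FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    records, name, chunks = {}, None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                # Keep only the ID, i.e. the header text up to the first space.
                name, chunks = line[1:].split()[0], []
            else:
                chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def extract_supporting_sequences(tsv_path, fasta_path,
                                 column="supporting_reads", sep=","):
    """Return one {read_id: sequence_or_None} dict per allele row.

    `column` and `sep` are assumptions about the TSV layout; a read ID
    that is missing from the FASTA file maps to None.
    """
    sequences = read_fasta(fasta_path)
    alleles = []
    with open(tsv_path, newline="") as handle:
        for row in csv.DictReader(handle, delimiter="\t"):
            read_ids = [r.strip() for r in row[column].split(sep) if r.strip()]
            alleles.append({rid: sequences.get(rid) for rid in read_ids})
    return alleles


# Example call (paths are placeholders):
# alleles = extract_supporting_sequences("tlens_by_allele.tsv", "tel_reads.fa.gz")
```

Each returned dict holds the reads listed for one allele, so the sequences within a dict are the ones I am comparing with each other.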
I got the following result from the tvr_consensus column in the TSV file:
CCCCCCAAAAAAAAVVVVVVVVCCCCCCCGGGGGCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCNNNNNNNNNNCCCNCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFCFFFFFFFFFFFFFFCFFFCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTTTTTTTVVVVVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCVVVVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHHVVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCLLLLLLLCCCCCCCCCCCCCCCCLLLLLLCCCCCCCCLLLCCCCCCCLLLLLLLLLLLLLLLLLLLCCCCMCLLLLLLLLLLLLLLLLLLLCCCLLLLLLLLLLLLCVVCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCTTTTTTTTTTTTTTTTTTTTTT
The canonical motif TTAGGG is represented by the symbol C. Does the C in the tvr_consensus string above represent that symbol, or the nucleotide C from the usual A, C, G, T bases of DNA?
Do the colours in all_final_alleles.png correspond to the symbols in tvr_consensus?
I matched the read IDs in the ‘supporting reads’ column of tlens_by_allele.tsv to the records in tel_reads.fa.gz, then reduced each sequence to symbols according to its motifs. However, my result differs from the tvr_consensus in the TSV file. Is there an explanation for the difference?
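The reduction to symbols mentioned above can be sketched like this. It is a minimal sketch under my own assumptions rather than Telogator's algorithm: only the mapping TTAGGG to C comes from what I described, the function name is hypothetical, emitting one symbol per motif occurrence is my own choice, and `-` for bases that match no motif is my own placeholder.

```python
def reduce_to_symbols(seq, motif_symbols, unknown="-"):
    """Greedily replace each motif occurrence with its one-letter symbol.

    Scans left to right; at each position the longest matching motif wins.
    Bases that start no known motif are emitted as `unknown` (a placeholder
    chosen for this sketch, not a Telogator symbol).
    """
    seq = seq.upper()
    motifs = sorted(motif_symbols, key=len, reverse=True)
    out, i = [], 0
    while i < len(seq):
        for motif in motifs:
            if seq.startswith(motif, i):
                out.append(motif_symbols[motif])
                i += len(motif)
                break
        else:
            # No motif starts here: record one unmatched base and move on.
            out.append(unknown)
            i += 1
    return "".join(out)


# Only the canonical motif is taken from the discussion above; any further
# motif-to-symbol pairs would have to come from the tool's own definitions.
MOTIF_SYMBOLS = {"TTAGGG": "C"}

print(reduce_to_symbols("TTAGGGTTAGGGA", MOTIF_SYMBOLS))  # CC-
```

If the tool instead writes one symbol per base, or aligns the reads before building the consensus, the output of a sketch like this would not be expected to match tvr_consensus character for character.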
Looking forward to your clarification!
Thank you.
Flora