Saturation calculation #5

@afkoeppel

This isn't really an issue so much as a question. First off, thanks for posting this tutorial. I'm analyzing Tn-seq data for the first time, and your pipeline has been incredibly helpful. One other thing I'm trying to do is compute the saturation of Tn insertion sites (the percentage of TA sites in the reference that are covered).
In my initial calculation, I took the number of rows in the hits.txt file and divided it by the number of TAs in the reference genome, which I had counted with a custom script. One mistake (at least I'm pretty sure it was a mistake) was that I also counted the ATs. I'm used to thinking of TA and AT as pretty much the same thing, since TA on one strand is AT on the other. After reviewing your diagrams here, though, I believe this was the wrong move and gave a much lower saturation than it should have, because there are many more ATs than TAs in my reference sequence.
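For concreteness, the calculation described above can be sketched roughly as follows. This is a minimal illustration, not the pipeline's own method: the helper names (`fasta_records`, `count_ta_sites`, `percent_saturation`) are made up for this example, and it assumes the hit count corresponds to one row per distinct TA site in hits.txt (no header row, no duplicates), which should be checked against the actual file. Only TA dinucleotides are counted. Read 5' to 3', TA is its own reverse complement, so each TA site appears once when scanning a single strand, and AT is a different site.

```python
def fasta_records(lines):
    """Yield one upper-cased sequence string per FASTA record."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if chunks:
                yield "".join(chunks).upper()
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks).upper()


def count_ta_sites(lines):
    """Count TA dinucleotides per record, so no site spans two records."""
    # "TA" cannot overlap itself, so str.count finds every occurrence.
    return sum(seq.count("TA") for seq in fasta_records(lines))


def percent_saturation(n_hit_sites, n_ta_sites):
    """Percentage of TA sites in the reference with at least one insertion."""
    if n_ta_sites <= 0:
        raise ValueError("reference contains no TA sites")
    return 100.0 * n_hit_sites / n_ta_sites


if __name__ == "__main__":
    # Toy reference for illustration only; a real run would pass
    # open("reference.fasta") (hypothetical file name) instead.
    toy_reference = [">toy_reference", "ATGTACCTAGGA", "TATTTA"]
    n_ta = count_ta_sites(toy_reference)
    # Assumed: 3 distinct TA sites were hit (e.g. the row count of hits.txt).
    print(n_ta, percent_saturation(3, n_ta))
```

Counting per record rather than over one concatenated string avoids inventing a TA where one contig ends in T and the next begins with A.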
Could you confirm my suspicion that ATs shouldn't be counted? Or, if I'm going about this the wrong way entirely, could you recommend a method for computing % saturation?
Thanks so much, and thanks again for the great tutorial.
