Add prefixsplit function by Tixii · Pull Request #168 · lh3/seqtk

Tixii · 2021-02-22T21:06:29Z

This adds the ability to read a fastq/fasta file and split the file based on the prefix of each read to enable faster sorting of read sets.

Usage: seqtk prefixsplit [options] <output_filename> <in.fa>
Options:
-p INT length of prefix
-A force FASTA output (discard quality)
-C drop comments at the header lines

It will create files for each prefix of the specified length, e.g.
output_filename.AA.fa
output_filename.AC.fa
....
plus a single file that contains those reads with an N at any position in the prefix:
output_filename.N.fa

Currently only prefix lengths of 1, 2, or 3 are possible, as I felt that creating more than 64 files wouldn't be useful.
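To illustrate the bucketing described above, here is a minimal sketch of how a read could be mapped to its output file. It is not the code from this PR: `prefix_label` and `bucket_path` are hypothetical helper names, and sending lowercase bases to their uppercase bucket, or reads shorter than the prefix to the N file, are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_PREFIX 3 /* the PR limits -p to 1, 2, or 3 */

/* Hypothetical helper: write the bucket label for `seq` into `label`,
 * which must hold at least MAX_PREFIX + 1 bytes. Any character in the
 * first `p` positions that is not A, C, G, or T (including the end of a
 * too-short read) sends the read to the single "N" bucket. */
static void prefix_label(const char *seq, int p, char *label)
{
    int i;
    assert(p >= 1 && p <= MAX_PREFIX);
    for (i = 0; i < p; ++i) {
        char c = (char)toupper((unsigned char)seq[i]);
        if (c != 'A' && c != 'C' && c != 'G' && c != 'T') {
            strcpy(label, "N");
            return;
        }
        label[i] = c;
    }
    label[p] = '\0';
}

/* Hypothetical helper: build "<base>.<label>.fa" in `path`.
 * Returns 0 on success, -1 if the result would not fit. */
static int bucket_path(const char *base, const char *label,
                       char *path, size_t size)
{
    int n = snprintf(path, size, "%s.%s.fa", base, label);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

With `-p 2`, a read starting with `AC` would go to `output_filename.AC.fa` and a read starting with `AN` to `output_filename.N.fa`. A prefix of length p gives at most 4^p labelled files plus the N file.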

There are options to remove the quality scores and drop comments using the same methods as the seqtk seq function.

I have tried to follow the coding style of the rest of the file. However, this is my first time coding in C, so I am sure there are improvements that could be made.


typo

Unknown added 4 commits February 22, 2021 13:04

fix usage help message

2950ce1

typo

Add option to only print the sequence with no other information

77fcebc

add seqtk exe to gitignore

0eee545

Tixii wants to merge 4 commits into lh3:master from Tixii:prefixsplit
