
Positional arguments (especially seqkit_stats_nosecondary) in duplex_tools assess_split_on_adapter #40

@rocpv1977


Hi!

I am trying to assess how well duplex_tools split_on_adapter is doing its job, and duplex_tools assess_split_on_adapter asks for the following positional arguments:
seqkit_stats_nosecondary
edited_reads
unedited_reads
split_multiple_times

I imagine the last three are the .pkl files that are created in the folder for split files, but I am not sure what "seqkit_stats_nosecondary" is. I have tried passing the output of

seqkit stats path/to/file --all

and

seqkit stats path/to/file --all

but I get this error:

/media/seq-ur/65225E7076CF2AF3/basecalling_bacterias/K_oxytoca/K_oxytoca_29_03_2023/pass/split/seqkit_stats contains 1 reads
Traceback (most recent call last):
  File "/home/seq-ur/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'read'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/seq-ur/venv/bin/duplex_tools", line 33, in <module>
    sys.exit(load_entry_point('duplex-tools==0.3.2', 'console_scripts', 'duplex_tools')())
  File "/home/seq-ur/venv/lib/python3.9/site-packages/duplex_tools/__init__.py", line 39, in main
    args.func(args)
  File "/home/seq-ur/venv/lib/python3.9/site-packages/duplex_tools/assess_split_on_adapter.py", line 129, in main
    assess(
  File "/home/seq-ur/venv/lib/python3.9/site-packages/duplex_tools/assess_split_on_adapter.py", line 32, in assess
    txt = txt[txt['read'].isin(expected_read_ids)]
  File "/home/seq-ur/venv/lib/python3.9/site-packages/pandas/core/frame.py", line 3760, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/seq-ur/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3654, in get_loc
    raise KeyError(key) from err
KeyError: 'read'
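The last frames of the traceback suggest that the first argument is loaded as a table and then filtered on a column named "read", so the KeyError would mean the file I passed has no such column in its header. Below is a minimal sketch for checking that before running the command. It assumes the file is tab-separated with a header row; the helper names `header_columns` and `has_column` and the sample header are hypothetical, for illustration only, and are not part of duplex_tools or actual seqkit output.

```python
import csv
import io


def header_columns(text, delimiter="\t"):
    """Return the column names found on the first line of a delimited table."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return next(reader, [])


def has_column(text, column="read", delimiter="\t"):
    """Report whether the table header contains the given column name."""
    return column in header_columns(text, delimiter)


# Made-up per-file summary table (an assumption, not real tool output):
# it has one data row and no "read" column.
sample = "file\tformat\ttype\tnum_seqs\tsum_len\nreads.fastq\tFASTQ\tDNA\t1\t1000\n"

print(header_columns(sample))
print(has_column(sample))
```

A per-file summary like this has a single data row and no per-read identifiers, which would be consistent with both the "contains 1 reads" message and the missing "read" column.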

Could you help me understand what "seqkit_stats_nosecondary" is?

Thanks!
