Hi!
I am trying to assess how well duplex_tools split_on_adapter is doing its job, and duplex_tools assess_split_on_adapter asks for the following positional arguments:
seqkit_stats_nosecondary
edited_reads
unedited_reads
split_multiple_times
I imagine the last three are the .pkl files that are created in the folder for split files, but I am not sure what "seqkit_stats_nosecondary" is. I have tried passing in the output of
seqkit stats path/to/file --all
and
seqkit stats path/to/file --all
but I get this error:
/media/seq-ur/65225E7076CF2AF3/basecalling_bacterias/K_oxytoca/K_oxytoca_29_03_2023/pass/split/seqkit_stats contains 1 reads
Traceback (most recent call last):
  File "/home/seq-ur/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3652, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 147, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 176, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 7080, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 7088, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'read'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/seq-ur/venv/bin/duplex_tools", line 33, in <module>
    sys.exit(load_entry_point('duplex-tools==0.3.2', 'console_scripts', 'duplex_tools')())
  File "/home/seq-ur/venv/lib/python3.9/site-packages/duplex_tools/__init__.py", line 39, in main
    args.func(args)
  File "/home/seq-ur/venv/lib/python3.9/site-packages/duplex_tools/assess_split_on_adapter.py", line 129, in main
    assess(
  File "/home/seq-ur/venv/lib/python3.9/site-packages/duplex_tools/assess_split_on_adapter.py", line 32, in assess
    txt = txt[txt['read'].isin(expected_read_ids)]
  File "/home/seq-ur/venv/lib/python3.9/site-packages/pandas/core/frame.py", line 3760, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/seq-ur/venv/lib/python3.9/site-packages/pandas/core/indexes/base.py", line 3654, in get_loc
    raise KeyError(key) from err
KeyError: 'read'
Could you help me understand what "seqkit_stats_nosecondary" is?
Thanks!
Thanks for the question. You're definitely on the right track. The first argument should be the output of seqkit bam (not seqkit stats) run on a BAM file that contains no secondary alignments. If your alignment was done in a way that includes secondary alignments, you need to filter them out first, for example with samtools view.
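For anyone unsure what "no secondary alignments" means in practice: a record is secondary when the 0x100 (256) bit of its SAM FLAG is set, and `samtools view -F 256` excludes records with that bit. The sketch below applies the same rule to SAM text lines in plain Python. The function name `drop_secondary` and the example records are made up for illustration and are not part of duplex_tools.

```python
SECONDARY = 0x100  # SAM FLAG bit that marks a secondary alignment


def drop_secondary(sam_lines):
    """Yield header lines and alignment records that are not secondary."""
    for line in sam_lines:
        if not line.strip():
            continue  # skip blank lines
        if line.startswith("@"):
            yield line  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        flag = int(fields[1])  # FLAG is the second column of a SAM record
        if not flag & SECONDARY:
            yield line


if __name__ == "__main__":
    # Hypothetical records: primary, secondary, secondary on the reverse
    # strand, and supplementary (supplementary is not secondary, so it stays).
    example = [
        "@HD\tVN:1.6\tSO:unsorted",
        "read1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "read1\t256\tchr1\t900\t0\t4M\t*\t0\t0\tACGT\tIIII",
        "read2\t272\tchr1\t300\t0\t4M\t*\t0\t0\tACGT\tIIII",
        "read3\t2064\tchr1\t500\t60\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    for record in drop_secondary(example):
        print(record)
```

On a real BAM file the equivalent filter would be something like `samtools view -b -F 256 -o nosecondary.bam aligned.bam` (file names are placeholders), followed by running seqkit bam on the filtered file.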