The current advice from ONT on how to perform duplex basecalling is here: https://community.nanoporetech.com/posts/guppy-v6-0-0-release (dated 6th December 2021; login required to view). It makes no mention of duplex-tools, but says to `pip install ont-guppy-duplex-pipeline` and then run the script from that package, `guppy_duplex`, on the original fast5 files.
As far as I can see, this script is a rather clunky wrapper that calls Guppy in simplex mode, then performs the equivalent of `duplex_tools pairs_from_summary` (the code for this is in `ont_guppy_duplex_pipeline/channel_neighbours.py` and looks related to your `duplex_tools/pairs_from_summary.py`, but the logic is not quite the same), and then runs `guppy_basecaller_duplex` to get the final result.
My main interest just now is to get a good but quick assessment of the approximate number of duplex reads in each dataset, for QC purposes, so duplex-tools seems the more useful approach. But to save others from having to peer through source code like I've been doing, could you please add some info to the README.md about the relationship between these two ONT-developed packages?
Cheers!
Sorry, my mistake: I see `ont-guppy-duplex-pipeline` does also incorporate an alignment-based filtering step, but it does not yield the same results as this package; I get about twice the number of candidate duplex pairs. I guess I'll need to actually basecall these to see how many are false positives.
The scripts in the current version of Guppy were taken from an earlier version of this repository, hence the similarities. Guppy needs updating; IIRC the major difference is compute performance. @ollenordesjo can comment on the output differences.
Depending on your requirements, you may want to set this threshold lower than the default (I would suggest including the best ~85% of reads or something similar, wherever that threshold may be for your dataset). We had some discussions about setting this threshold more adaptively, but decided that a constant threshold would keep it more reproducible on a per-read level.
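The adaptive variant mentioned above could be sketched as picking the cutoff at a fixed quantile of the per-dataset qscore distribution, e.g. the value that keeps the best ~85% of reads (the function name and quantile are illustrative, not part of duplex-tools):

```python
def threshold_for_best_fraction(qscores, keep_fraction=0.85):
    """Return the qscore cutoff that retains the top `keep_fraction` of reads.

    Reads with qscore >= cutoff survive; the cutoff sits at the
    (1 - keep_fraction) quantile of the sorted per-read scores.
    """
    ordered = sorted(qscores)
    # Index of the first read that survives the cut.
    cut_index = int(len(ordered) * (1 - keep_fraction))
    return ordered[cut_index]
```

This is exactly the reproducibility trade-off described: the cutoff moves with each dataset's quality distribution, so the same read can pass in one run and fail in another, which is why a constant threshold was preferred.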