Question - Split on adapter #29
Hi, thanks for the question. It's not yet possible, but I suspect it would be useful. We hope to release an improved version of template/complement splitting today, which should work better than adapter splitting for duplex.
Thanks for your quick reply. I will try it out when it is released.
Hi @jagos01, v0.2.20 is now out, and you can use this to recover reads which are non-split. Feel free to try it out by
This should give you new pod5s in the Feel free to try it out and let me know how things are working.
Hello @onordesjo, I followed the directions outlined in the readme for duplex calling with dorado. I generated the pair_id files for both step 2a and 2b. They contained 4667 and 7867 pairs respectively. When stereo basecalling those reads, dorado only basecalled 4114 and 1338 reads. Why is the number of stereo basecalled reads less than the number of read pairs?
Hi @jagos01. Can I ask what type of data you have been looking at? Whole genome? Any amplification? Dorado applies some filtering to ensure that bad pairs don't get through, so some loss is to be expected. I would expect fewer pairs generated in step 2b than in 2a, but greater retention of good pairs. 2a would also necessarily have to be generated without a subset (or alternatively with a selection of channels). Any of this information would help to explain what you are seeing.
Hello @onordesjo. This is bacterial whole genome sequence data. No amplification was carried out. The data is split over two runs (I had to restart the sequencer a couple of hours into the run). I was also expecting fewer pairs from 2b. 2a was generated from the complete data set.
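For reference, the counts quoted above can be turned into a retention fraction per step. This is only a small arithmetic sketch using the numbers reported in this thread; the `retention` helper and the `reported` dictionary are illustrative names, not part of duplex-tools or dorado.

```python
# Pair counts and stereo-basecalled read counts as reported in this thread.
# The structure and names here are illustrative assumptions only.
reported = {
    "2a": {"pairs": 4667, "stereo_reads": 4114},
    "2b": {"pairs": 7867, "stereo_reads": 1338},
}


def retention(pairs: int, stereo_reads: int) -> float:
    """Fraction of candidate pairs that produced a stereo-basecalled read."""
    if pairs <= 0:
        raise ValueError("pairs must be positive")
    return stereo_reads / pairs


for step, counts in reported.items():
    frac = retention(counts["pairs"], counts["stereo_reads"])
    print(f"step {step}: {counts['stereo_reads']}/{counts['pairs']} = {frac:.1%}")
```

With these numbers, step 2a retains roughly 88% of its pairs and step 2b roughly 17%, which is the gap being asked about.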
Ah, then my next question is if they were basecalled at the same time (did the SAM contain reads from both of the runs?)
I inspected the pod5 reads for each run and the unmapped BAM file contains reads from both runs.
Thanks, that helps. Would be keen to take a look at the bam. If you're happy to share it, feel free to email me at olle.nordesjo at nanoporetech.com and I can take a closer look at it.
Thanks, I have emailed a link to the bam file.
Hello,
I am duplex basecalling with dorado. Can split_on_adapter accept unmapped bam files for input/output?
Thanks