Prevent pilon from changing contig names after polishing #7
Comments
Pilon will not be implemented in this pipeline. |
I am sorry to ask, but would it be possible to (1) add the full contig name to the polished FASTA headers, and (2) get a consensus polished sequence instead of many contigs? Thank you in advance. These are quite basic features and would be valuable additions to Pilon.
|
Dear @Isoris, |
Hello, yes! I just wonder whether it would be possible to add a new option directly in Pilon to produce better headers, and also to call a consensus the way bcftools does. For example, if we choose --diploid, it could call with heterozygosity, like -H 1 in bcftools. But it seems that Pilon is not maintained anymore, right? In my case I found a way to polish my genome, so it is still a very useful tool. Thank you, and sorry for posting in the wrong repo. |
No worries! |
In my case, I first created a meryl database of k-mers from the HiFi reads
and from the Illumina reads. Then I filtered it to keep the k-mers that
appear at least 2 times.
After that I ran Merqury and got the BED files of the assembly errors, based
on the k-mers missing in the assembly but present in the meryl database.
Then I used this BED file of assembly error positions as the --targets
for Pilon.
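
The BED-to-targets step above could be sketched roughly as follows. This is only an illustration, not the script actually used in this thread: the function name `bed_to_pilon_targets` and the `pad` parameter are hypothetical, and the sketch assumes that Pilon target ranges are written as `name:start-stop` with 1-based inclusive coordinates, which is worth checking against the Pilon documentation for your version.

```python
def bed_to_pilon_targets(bed_lines, pad=0):
    """Convert BED records into 'name:start-stop' target strings.

    BED intervals are 0-based and half-open. The output assumes 1-based,
    inclusive ranges (an assumption about Pilon's --targets syntax).
    `pad` optionally widens each interval on both sides; the upper bound
    is NOT clamped to the contig length here.
    """
    targets = []
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines, comments and BED track/browser lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split()
        name, start, end = fields[0], int(fields[1]), int(fields[2])
        first = max(1, start + 1 - pad)
        last = end + pad
        targets.append(f"{name}:{first}-{last}")
    return targets


if __name__ == "__main__":
    example = ["scaffold_1\t0\t21", "scaffold_1\t999\t1020", "scaffold_2\t49\t70"]
    # One target per line, suitable for writing to a targets file.
    print("\n".join(bed_to_pilon_targets(example)))
```

Writing one target per line to a file keeps the command line short when the BED file contains many error positions.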
My QV increased from 41 to 52 in both haplotypes of my eukaryotic
genome (1 Gb).
I am now comparing the results contig by contig. It seems that targeted
Pilon is much better than Racon and other tools, perhaps because it can be
more haplotype-aware? But I am not sure; it seems that it only corrects
small-scale SVs. Maybe it would need another tool to increase the QV further, to 60?
My data was 12X HiFi, 33X Nanopore, 36X Hi-C and 36X Illumina.
At first the genome was at QV 31. Then I used NextPolish2 (3 times) and the
QV increased to 40. Then Pilon, run on each scaffold separately with --targets,
increased it from 40 to 52.
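
To put these QV values in perspective: QV is a Phred-scaled score, so the per-base error rate is 10^(-QV/10). QV 40 corresponds to about one error per 10,000 bases and QV 60 to about one per million. A small sketch of the conversion (the function name `qv_to_error_rate` is just for illustration):

```python
def qv_to_error_rate(qv):
    """Phred-scaled consensus quality -> estimated per-base error rate."""
    return 10 ** (-qv / 10)


if __name__ == "__main__":
    genome_size = 1_000_000_000  # 1 Gb, as mentioned in this thread
    for qv in (31, 40, 52, 60):
        rate = qv_to_error_rate(qv)
        print(f"QV {qv}: {rate:.1e} errors per base, "
              f"about {rate * genome_size:,.0f} expected errors in 1 Gb")
```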
Do you have any recommendations on what to do next? Or should I simply stop
here?
Thank you.
…On Thu, Oct 17, 2024, 2:00 PM Håkon Kaspersen ***@***.***> wrote:
No worries!
I actually stopped using Pilon for my polishing because recently it has
been found that it can introduce errors into the assembly.
Have a look here:
https://rrwick.github.io/2023/05/15/short-read-polishing-short-read-assemblies.html
Depending on your assembly method, polishing a short-read assembly may not
be optimal.
|
That seems like a reasonable solution, I was not aware of the possibility of targeted correction like that! |
But now I am left with 1000 contigs in addition to the 27 chromosomes. Even
after scaffolding with Hi-C there are still 500+ gaps in the 27 chromosomes,
in addition to the gaps in the 1000 unplaced contigs.
Yes, the --targets option of Pilon is amazing, but I ran it on separate
scaffolds, and maybe it would be much better to use it directly on the full
genome in one run. However, I created many jobs with 25 GB of memory for the
R1 short reads (--frags) + 25 GB of RAM for the R2 short reads + 20 GB for the
HiFi alignment BAM (--bam) + 20 GB for the ONT reads (--bam) + 1 GB per Mb of
genome, so if we add it up, a single run would require at least 1 TB of RAM.
So it is impossible to run it in one go.
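
The arithmetic behind that estimate can be written out as below. All numbers are the figures quoted in this comment (per-input allowances plus 1 GB per Mb of genome), not measurements, and the function name `estimate_pilon_memory_gb` is hypothetical.

```python
def estimate_pilon_memory_gb(genome_mb, frags_r1_gb=25, frags_r2_gb=25,
                             hifi_bam_gb=20, ont_bam_gb=20, gb_per_mb=1.0):
    """Add up the per-input memory allowances and the per-Mb genome cost.

    Defaults are the figures quoted in the comment above; they are rough
    job-sizing numbers, not profiled values.
    """
    fixed = frags_r1_gb + frags_r2_gb + hifi_bam_gb + ont_bam_gb
    return fixed + gb_per_mb * genome_mb


if __name__ == "__main__":
    # Whole 1 Gb genome (1000 Mb) in a single run.
    print(estimate_pilon_memory_gb(1000))  # 1090.0 GB, i.e. just over 1 TB
    # A single 40 Mb scaffold, for comparison (size chosen for illustration).
    print(estimate_pilon_memory_gb(40))    # 130.0 GB
```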
Maybe the authors of Pilon could make this easier by splitting the FASTA
based on the targets and then running the polishing separately on each
scaffold to minimize the memory footprint?
It would be a great improvement if that were possible.
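
Until something like that exists in Pilon itself, the splitting can be done outside the tool. The sketch below is one possible way to do it, under stated assumptions: targets are strings of the form `name` or `name:start-stop`, sequence names contain no colon, and the helper names (`split_fasta`, `group_targets`, `plan_jobs`) are hypothetical. It only plans the per-scaffold jobs; it does not run Pilon.

```python
def split_fasta(lines):
    """Return {sequence_name: [header_line, seq_line, ...]} for each record."""
    records = {}
    name = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            name = line[1:].split()[0]  # first word of the header
            records[name] = [line]
        elif name is not None and line.strip():
            records[name].append(line.strip())
    return records


def group_targets(targets):
    """Group target strings by the scaffold they refer to."""
    grouped = {}
    for target in targets:
        # Assumes sequence names themselves contain no ':' character.
        name = target.split(":", 1)[0]
        grouped.setdefault(name, []).append(target)
    return grouped


def plan_jobs(fasta_lines, targets):
    """Pair each scaffold that has targets with its comma-separated target list."""
    records = split_fasta(fasta_lines)
    grouped = group_targets(targets)
    return [(name, records[name], ",".join(grouped[name]))
            for name in records if name in grouped]


if __name__ == "__main__":
    fasta = [">s1 example", "ACGTACGT", ">s2", "GGGGCCCC", ">s3", "TTTT"]
    for name, record, target_list in plan_jobs(fasta, ["s1:1-4", "s1:6-8", "s3"]):
        print(name, len(record) - 1, "sequence line(s); targets:", target_list)
```

Scaffolds without any target are skipped, so no job is spent on sequences that Merqury did not flag.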
…On Thu, Oct 17, 2024, 8:53 PM Håkon Kaspersen ***@***.***> wrote:
That seems like a reasonable solution, I was not aware of the possibility
of targeted correction like that!
I don't think I am the right person to ask, it all depends on your plans
and your goals!
|
I agree with you that Pilon is not good for polishing the whole assembly. But
when used in a targeted way, as a second polishing step after a first
polishing tool, it seems to work to some extent.
|
When working on issue #5 I noticed that Pilon changes the contig headers in the resulting fasta file. This is unfortunate due to the "circular=true" and depth information that Unicycler provides in the headers.
Asked for help on the Pilon github:
broadinstitute/pilon#151
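
As a workaround outside Pilon, the original headers can be copied back onto the polished sequences afterwards. The sketch below assumes that Pilon keeps only the first word of each header and appends `_pilon` to it; that suffix, like the function name `restore_headers`, is an assumption to verify on your own output. Note that a `length=` field copied from the original header may no longer match the polished sequence if Pilon inserted or deleted bases.

```python
def restore_headers(original_lines, polished_lines, suffix="_pilon"):
    """Replace polished FASTA headers with the full original headers.

    Records are matched on the first word of the header, after removing
    one trailing `suffix` from the polished name. Unmatched records keep
    the de-suffixed name.
    """
    full_headers = {}
    for line in original_lines:
        if line.startswith(">"):
            header = line[1:].strip()
            full_headers[header.split()[0]] = header

    restored = []
    for line in polished_lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            name = line[1:].split()[0]
            if name.endswith(suffix):
                name = name[: -len(suffix)]
            line = ">" + full_headers.get(name, name)
        restored.append(line)
    return restored


if __name__ == "__main__":
    original = [">1 length=8 depth=1.00x circular=true", "ACGTACGA"]
    polished = [">1_pilon", "ACGTACGT"]
    print("\n".join(restore_headers(original, polished)))
```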