Some questions about using pseudohaploid. #5
Hi Kun,
Please see my answers below.
On Wed, Mar 18, 2020 at 12:42 AM xiekunwhy ***@***.***> wrote:
Hi,
I have some questions about using pseudohaploid.
1. For polishing, should pseudohaploid be run before or after
polishing?
I would recommend before polishing since you are going to be filtering out
about half of your assembly.
2. For repeat masking, should pseudohaploid be run before or after
repeat masking?
If you use nucmer for the alignments, I would do it before repeat masking
but then you'll need to tune the parameters to avoid computing too many
alignments. The critical values are -l (minimum exact match length) and -c
(minimum cluster length of alignments). Depending on the species, etc, you
will probably need to set -l around 50 to 100 and -c around 100 to 500. If
this takes excessively long you could increase the lengths to -l 250 -c
2500 (or larger).
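As a rough illustration of this tuning, the Python sketch below assembles a nucmer command with -l and -c set explicitly. It assumes the assembly is aligned against itself; the file name contigs.fa, the output prefix, and the helper nucmer_self_align_cmd are placeholders and not part of pseudohaploid. The default values are simply picked from the ranges mentioned above.

```python
import shlex


def nucmer_self_align_cmd(assembly, min_match=100, min_cluster=500, prefix="self_aln"):
    """Build (but do not run) a nucmer command aligning an assembly to itself.

    Hypothetical helper for illustration. Larger -l / -c values mean fewer
    seeds and clusters, so fewer alignments are computed.
    """
    if min_match <= 0 or min_cluster <= 0:
        raise ValueError("-l and -c must be positive")
    return [
        "nucmer", "--maxmatch",
        "-l", str(min_match),    # minimum exact match length
        "-c", str(min_cluster),  # minimum cluster length
        "-p", prefix,            # output prefix, giving <prefix>.delta
        assembly, assembly,      # reference and query are the same file (assumption)
    ]


# Starting point inside the suggested ranges
print(shlex.join(nucmer_self_align_cmd("contigs.fa")))
# Stricter settings if the run takes excessively long
print(shlex.join(nucmer_self_align_cmd("contigs.fa", min_match=250, min_cluster=2500)))
```

The printed command lines can be pasted into a shell or passed to a job scheduler; nothing is executed by the sketch itself.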
3. How long will it take to run the whole pseudohaploid pipeline
for a ~3.2G plant genome (~4.5G assembly generated from wtdbg2)?
The longest phase will be computing the whole genome alignments. If you
have access to a cluster I would recommend sge_mummer:
https://github.com/fritzsedlazeck/sge_mummer
Once you have the alignments, the postprocessing will take a few hours.
Good luck
Mike
Best wishes,
Kun
You could try decreasing MIN_CONTAIN and increasing MAX_CHAIN_GAP, but
these parameters are very sample specific. Unfortunately, I don't have an
automated procedure for setting them right now.
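To make the interaction of the two parameters concrete, here is a toy Python sketch. It is only one reading of the parameter names, not the actual pseudohaploid code: it assumes MAX_CHAIN_GAP is the largest gap allowed when joining neighbouring alignments into a chain, and MIN_CONTAIN is the minimum percent of a contig that the chain must span for the contig to be treated as redundant. The function names and the toy coordinates are hypothetical.

```python
def longest_chain_percent(contig_len, alignments, max_chain_gap):
    """Percent of a contig spanned by its longest chain of alignments.

    alignments: (start, end) intervals on the contig, 0-based, end exclusive.
    Neighbouring intervals are chained when the gap between them is
    at most max_chain_gap (assumed meaning of MAX_CHAIN_GAP).
    """
    best = 0
    chain_start = chain_end = None
    for start, end in sorted(alignments):
        if chain_end is not None and start - chain_end <= max_chain_gap:
            chain_end = max(chain_end, end)      # extend the current chain
        else:
            chain_start, chain_end = start, end  # start a new chain
        best = max(best, chain_end - chain_start)
    return 100.0 * best / contig_len


def looks_redundant(contig_len, alignments, min_contain, max_chain_gap):
    """True if the longest chain spans at least min_contain percent of the contig."""
    return longest_chain_percent(contig_len, alignments, max_chain_gap) >= min_contain


# Toy 100 kb contig with two alignments to a longer contig, 30 kb apart
toy = [(0, 40_000), (70_000, 100_000)]
print(looks_redundant(100_000, toy, min_contain=93, max_chain_gap=20_000))
print(looks_redundant(100_000, toy, min_contain=93, max_chain_gap=50_000))
```

Under these assumptions the toy contig is only flagged once the gap allowance exceeds the 30 kb break between its two alignments, which is why the right values depend so much on the sample.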
Good luck
Mike
On Thu, Mar 19, 2020 at 6:20 AM xiekunwhy ***@***.***> wrote:
Hi Mike,
Thank you for your suggestions. I ran pseudohaploid on the raw contig
output from wtdbg2 (~4.2G, when the expected size is 3.2G; N50 ~375k, L50
~2800), but only two small contigs (smaller than 10k) were removed. All
parameters are listed below. Any other suggestions?
MIN_IDENTITY=90
MIN_LENGTH=1000
MIN_CONTAIN=93
MAX_CHAIN_GAP=20000
nucmer --maxmatch -c 100 -l 500
Best,
Kun
I'm afraid I would need to review the data to offer any more specific
advice. Have you tried plotting a dotplot to look for co-linear contigs?
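One possible way to draw such a dotplot is mummerplot from the MUMmer package, restricted to a single pair of contigs. The Python sketch below only builds the command line; the delta file name, the contig IDs ctg1 and ctg2, the output prefix, and the helper mummerplot_pair_cmd are all placeholders.

```python
import shlex


def mummerplot_pair_cmd(delta, ref_id, qry_id, prefix="dotplot"):
    """Build (but do not run) a mummerplot command for one pair of contigs.

    Hypothetical helper for illustration; the IDs must match sequence names
    in the delta file produced by nucmer.
    """
    return [
        "mummerplot", "--png",
        "-p", prefix,  # output prefix for the plot files
        "-r", ref_id,  # contig drawn on the X axis
        "-q", qry_id,  # contig drawn on the Y axis
        delta,
    ]


print(shlex.join(mummerplot_pair_cmd("self_aln.delta", "ctg1", "ctg2")))
```

A long diagonal between two different contigs would suggest they are co-linear copies of the same region.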
Good luck
Mike
On Thu, Mar 19, 2020 at 9:15 PM xiekunwhy ***@***.***> wrote:
Hi Mike,
It still did not work well when I changed these two parameters
(MIN_CONTAIN=50, MAX_CHAIN_GAP=50000); only 10 short contigs were removed.
Any other suggestions?
Or should I try some other tools? Do you have any
recommendations?
Best,
Kun