estimation of footprint to assemble a large plant genome #3
Comments
Hello Dario, Best regards, Changes for ra.py:
Thank you Robert,
Yesterday I ran Ra on a small subset of my input file to test its behavior on a cluster.
Is this indicative of too little memory? I am fairly sure the alignment step had already completed.
This test was run with 352,825 FASTA sequences (5% of what I want to use), requesting 36 CPUs (hyperthreaded to 72, on a single node) and 144 GB of RAM. The command was
Also, would it be possible to restart the assembly from the step just after the minimap2 alignment? That way I could submit two jobs, each with resources optimized for its task (CPUs first, memory later).
Dario
I would not say this is a memory issue; it may be a bug. Would you mind sharing the subset that causes the error so I can reproduce it locally? Unfortunately, you cannot restart the assembly after the minimap2 step with
Best regards,
@dcopetti, you can also try the updated version (all in memory) here: https://github.com/lbcb-sci/raven.
Hello,
We would like to use Ra to assemble a ~5 Gb plant genome (the genome is actually 2.5 Gb in size, but it is highly heterozygous, so we want to separate the two alleles into distinct contigs). I have about 45x coverage (of 5 Gb) of ONT data (N50 15 kb, QV >7), and we wonder whether there is a way to predict the size of the minimap2 output file and optimize the alignment step, since it is the most expensive.
For example, one could add or tweak the minimap2 options -X, -p, and -N to simplify the output, and -A, -B, -O, -E, and -z to be more sensitive.
Do you think there is room in that step to increase the specificity of the alignments and reduce the footprint and computation time?
Lastly, can you estimate the memory usage for an assembly where we have up to 36 CPUs (hyperthreading to 72) and max 500 GB RAM available? Will the polishing/graph construction steps be more memory demanding than the alignment step?
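For scale, here is a rough back-of-the-envelope sketch of the input size, using only the figures mentioned in this thread (45x of 5 Gb, and the 352,825-sequence subset described as 5% of the data). The function name and the bytes-per-base figures are assumptions for illustration; they say nothing about how Ra or minimap2 actually store reads or overlaps, so this is only a lower bound on raw sequence volume, not a memory prediction.

```python
def estimate_input(genome_size_bp, coverage, subset_reads, subset_fraction):
    """Rough input-size estimate (hypothetical helper, not part of Ra)."""
    total_bases = genome_size_bp * coverage
    # Scale the subset read count up to the full data set.
    total_reads = round(subset_reads / subset_fraction)
    return {
        "total_bases": total_bases,
        "total_reads": total_reads,
        # Assumption: 1 byte per base (plain ASCII), sequence only.
        "raw_gb_1byte_per_base": total_bases / 1e9,
        # Assumption: 2 bits per base (packed), sequence only.
        "raw_gb_2bit_per_base": total_bases / 4 / 1e9,
    }


est = estimate_input(
    genome_size_bp=5_000_000_000,  # ~5 Gb target (both alleles)
    coverage=45,                   # 45x of 5 Gb
    subset_reads=352_825,          # size of the 5% test subset
    subset_fraction=0.05,
)
for key, value in est.items():
    print(f"{key}: {value:,}")
```

Under these assumptions the raw sequence alone is about 225 GB uncompressed, which is already a large share of the 500 GB available before any overlaps or graph structures are counted.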
Thanks,
Dario