minimap2-2.20(r1061) gets killed #749

Closed
harisankarsadasivan opened this issue May 27, 2021 · 14 comments

@harisankarsadasivan

Dr Li, (@lh3 ),

With the two most recent releases,
./minimap2 -t 1 -x ava-ont reads.fq reads.fq > /dev/null
gets killed right after the print for "collected minimizers" is completed.

@lh3
Owner

lh3 commented May 27, 2021

What's your input? I tested the recent versions on E. coli reads and got the right output.

@harisankarsadasivan
Author

harisankarsadasivan commented May 27, 2021

Dr. Li (@lh3), the input is the first 100K human ONT long reads from "HG002_GM24385_1_2_3_Guppy_3.6.0_prom.fastq.gz", available online from dnanexus. It fails for both compressed and uncompressed reads.

@lh3
Owner

lh3 commented May 27, 2021

I don't have access to dnanexus, but on another HG002 Nanopore dataset, minimap2 doesn't stop at "collected minimizers". Do you have enough memory? Does older minimap2 work?

BTW, these days, it is not recommended to use minimap2 for assembly. There are better Nanopore assemblers for large datasets.

@harisankarsadasivan
Author

@lh3
Yes, definitely seems like a memory problem as it works with 1K reads but not 100K. My system memory is 32GB and peak consumption with 60 threads on 30 cores does not seem to go beyond 25% right before it is killed.
However, this is an interesting observation for me, as my 32GB of RAM was enough for overlapping/aligning 100K human reads with minimap2-2.18(r1015). Could you confirm whether the new release has higher memory requirements for some reason, and whether it works for large datasets (100K reads)?
Thank you for your opinion, Dr. Li. I'm just trying to understand how minimap2 works.
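
A process that ends with a bare "Killed" line, as in the reports in this thread, was terminated by SIGKILL. On Linux this is commonly the kernel's out-of-memory killer, though that is an assumption to verify rather than a certainty. The sketch below decodes a shell exit status (128 plus the signal number when a signal ended the process). `describe_exit` is a hypothetical helper name and is not part of minimap2; the `sh -c` line is only a stand-in for a killed minimap2 run.

```shell
# Decode a shell exit status: values above 128 mean
# "terminated by signal (status - 128)".
describe_exit() {
    status=$1
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

# Stand-in for a killed run: the child shell sends SIGKILL to itself.
sh -c 'kill -KILL $$' || describe_exit $?

# Real usage would look like this (command taken from the report above):
# ./minimap2 -t 1 -x ava-ont reads.fq reads.fq > /dev/null || describe_exit $?
```

If the status decodes to signal 9, `dmesg | grep -i 'killed process'` (which may need root) is one way to check whether the OOM killer was responsible.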

@lh3
Owner

lh3 commented May 28, 2021

Could you make your dataset available? 32GB should be enough for 100k reads. v2.20 shouldn't use more memory than v2.18 in the ava-ont mode.
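
One way to check whether v2.20 really needs more memory than v2.18 is to record the peak memory of both builds on the same input. This is a minimal sketch, assuming GNU time is installed at /usr/bin/time (its -v report includes a "Maximum resident set size" line). `peak_rss_kb` and the two binary paths are hypothetical names chosen for illustration.

```shell
# Print the peak resident set size (in kB) found in a GNU `time -v`
# report read from stdin. Other lines (e.g. minimap2's own stderr) are ignored.
peak_rss_kb() {
    awk -F': ' '/Maximum resident set size/ { print $2 }'
}

# Hypothetical usage: the binary paths are placeholders for two
# separately built releases.
# for mm2 in ./minimap2-2.18/minimap2 ./minimap2-2.20/minimap2; do
#     /usr/bin/time -v "$mm2" -t 1 -x ava-ont reads.fq reads.fq 2> time.log > /dev/null
#     echo "$mm2: $(peak_rss_kb < time.log) kB"
# done
```

GNU time still writes its report when the command is killed by a signal, so the peak reached before the kill should remain visible.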

@Nisha-Hemandhar-Kumar

I have also been facing a similar problem. The available disk space is 712GB, with 32GB RAM. I have been trying to align nanopore datasets to the mouse genome (using minimap v2.18), but the process is killed. @lh3, could you please suggest possible solutions?

@msierk

msierk commented Jul 26, 2021

I'm having a similar problem aligning reads against a 16S index. I was able to run this earlier with ~v2.17 on an older iMac with 16GB memory. Now I've tried running the same job on my newer MacBook Air running Big Sur with 8GB memory and an EC2 t2.2xlarge instance running ubuntu with 32GB memory and they both get killed. I do not get a message saying insufficient memory, just the following after running for a few minutes (version 2.21-r1071):

ubuntu@ip-172-31-1-218:~/Classification$ cat run_minimap2.out
[M::main::0.357*0.35] loaded/built the index for 21699 target sequence(s)
[M::mm_mapopt_update::0.369*0.37] mid_occ = 5776
[M::mm_idx_stat] kmer size: 15; skip: 10; is_hpc: 0; #seq: 21699
[M::mm_idx_stat::0.376*0.38] distinct minimizers: 394317 (48.91% are singletons); average occurrences: 14.321; average spacing: 5.589; total length: 31559225
./run_minimap2.sh: line 4: 1734 Killed ../apps/minimap2/minimap2 -ax map-ont ../apps/minimap2/ncbi_16SRNA.mmi -o barcode06_all_fast_t2xlarge2.sam --secondary=no -t 8 barcode06_all.fastq.gz

@donthuanalyst

Hello,

I am also having a similar issue. I am trying to align two different versions of genome assemblies of the same species using the following command, and my job is getting killed in the same way as for the other people who posted here. Please see the error message below.

Error Message:

[M::mm_idx_gen::9.428*1.31] collected minimizers
[M::mm_idx_gen::10.605*1.83] sorted minimizers
[M::main::10.607*1.83] loaded/built the index for 177 target sequence(s)
[M::mm_mapopt_update::11.006*1.80] mid_occ = 50
[M::mm_idx_stat] kmer size: 19; skip: 19; is_hpc: 0; #seq: 177
[M::mm_idx_stat::11.283*1.78] distinct minimizers: 20039135 (95.27% are singletons); average occurrences: 1.130; average spacing: 9.947; total length: 225250884
/cm/local/apps/slurm/var/spool/job1799308/slurm_script: line 14: 13532 Killed                  minimap2 -t 15 -ax asm5 /work/user/thu/genomes/AmelHAv3.1/GCF_003254395.2_Amel_HAv3.1_genomic.fna /work/user/thu/HB_assemblies/asm.v2.FINAL.fasta > asm_vs_AmelHAv3.1.mMap2.out

Script used to run minimap2:

#SBATCH --cpus-per-task 14
#SBATCH --mem-per-cpu 20000
#SBATCH -J minMap2

cd $SLURM_SUBMIT_DIR

cd ../results/AmelHAv3.1_vs_asm

module load minimap2/7025b0b

minimap2 -t 13 -ax asm5 /work/user/thu/genomes/AmelHAv3.1/GCF_003254395.2_Amel_HAv3.1_genomic.fna /work/user/thu/HB_assemblies/asm.v2.FINAL.fasta > asm_vs_AmelHAv3.1.mMap2.out
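
For reference, the memory this script asks SLURM for can be worked out from its two #SBATCH lines. This is a small sketch of that arithmetic; the variable names are illustrative, and the commented `sacct` line assumes job accounting is enabled on the cluster (the job ID is the one visible in the log path above).

```shell
# Values copied from the #SBATCH lines above.
cpus_per_task=14
mem_per_cpu_mb=20000

# --mem-per-cpu is applied per allocated CPU, so the job's total
# request is the product of the two values.
job_mem_mb=$((cpus_per_task * mem_per_cpu_mb))
echo "requested memory: ${job_mem_mb} MB (~$((job_mem_mb / 1024)) GiB)"

# To compare the request with what the job actually used:
# sacct -j 1799308 --format=JobID,ReqMem,MaxRSS,State
```

A request of this size suggests the job was not short of memory on paper, so comparing it with the recorded MaxRSS may help show whether the kill was memory-related at all.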

@lh3
Owner

lh3 commented Aug 3, 2021

Do you have sequences such that I can reproduce the issue? Thanks.

@msierk

msierk commented Aug 3, 2021 via email

lh3 added a commit that referenced this issue Aug 4, 2021
This is more apparent when there are many candidate chains. Although only a
small number of them are extended, they are still occupying memory. A
realloc() solves this problem. This is a long-existing issue.
@lh3
Owner

lh3 commented Aug 4, 2021

Thanks, @msierk. This is indeed caused by an unusual memory leak. Please try the dev branch. It should have this issue fixed. All versions of minimap2 have this problem. For mapping against a large number of 16S, you may add -f1000. This may speed up alignment with little/no impact to accuracy.

@donkirkby Your issue looks different because your reference is not repetitive (based on mid_occ = 50). If you still have the issue, could you provide test data? Thanks.
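
For the 16S case, the suggestion above amounts to adding -f1000 to the original command. The sketch below reuses msierk's paths from the log earlier in the thread. Per the minimap2 manual, an integer -f value makes minimap2 ignore minimizers occurring more than that many times, so the cap would presumably drop from the reported mid_occ = 5776 to 1000. The fallback branch that only prints the command is an addition for illustration.

```shell
# msierk's original command with lh3's suggested -f1000 added.
set -- ../apps/minimap2/minimap2 -ax map-ont -f1000 \
    ../apps/minimap2/ncbi_16SRNA.mmi -o barcode06_all_fast_t2xlarge2.sam \
    --secondary=no -t 8 barcode06_all.fastq.gz
cmd="$*"

if [ -x "$1" ]; then
    "$@"    # run the mapping when the binary is present at that path
else
    echo "minimap2 not found at $1; would run: $cmd"
fi
```

After a run, the mid_occ value in the log is a quick way to confirm that the option took effect.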

@donkirkby
Contributor

I suspect that @lh3's comment above is for @donthuanalyst instead of me.

@lh3
Owner

lh3 commented Aug 4, 2021

@donkirkby Yes. Thanks, and sorry for mentioning you instead.

@msierk

msierk commented Aug 4, 2021

> Thanks, @msierk. This is indeed caused by an unusual memory leak. Please try the dev branch. It should have this issue fixed. All versions of minimap2 have this problem. For mapping against a large number of 16S, you may add -f1000. This may speed up alignment with little/no impact to accuracy.
>
> @donkirkby Your issue looks different because your reference is not repetitive (based on mid_occ = 50). If you still have the issue, could you provide a test data? Thanks.

That seems to have fixed it - thanks!

@lh3 lh3 closed this as completed Dec 22, 2021
xenshinu pushed a commit to Minimap2onGPU/mm2-gb that referenced this issue Jun 7, 2022