Bus error while trying to cluster a nearly 800GB FASTA file using mmseqs easy-linclust #921

Open
dezhi0730 opened this issue Dec 16, 2024 · 2 comments

@dezhi0730

Hello, author:

I encountered a Bus error while trying to cluster a nearly 800GB FASTA file using mmseqs easy-linclust. Below are my command, error message, and system configuration details. I would appreciate your guidance on resolving this issue.
Command:

#!/bin/bash
#SBATCH --job-name=clust  # Job name
#SBATCH --output=logs/easy_clust_%j.log       # Output log file (%j will be replaced with the job ID)
#SBATCH --error=logs/easy_clust_%j.log         # Error log file (%j will be replaced with the job ID)
#SBATCH --ntasks=1                   # Number of tasks
#SBATCH --nodes=1                    # Number of nodes
#SBATCH --cpus-per-task=40
#SBATCH --gres=gpu:1                 # Number of GPUs
#SBATCH --partition=stdg_defq        # Partition name
#SBATCH --time=168:00:00               # Time limit (hh:mm:ss)
 
# Load necessary modules
module load mamba-24.3     # Example: load any necessary modules
source activate /exchange/xx
 
# Print job information
echo "Job ID: $SLURM_JOB_ID"
echo "Node List: $SLURM_JOB_NODELIST"
echo "Submit Directory: $SLURM_SUBMIT_DIR"
 
# Run your application
mmseqs easy-linclust /dfs/is/home/x266288/data_process/assets/FASTA/merged_all.fasta /dfs/is/home/x266288/data_process/assets/db/clustered/indi+oas/clustedRes /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir --min-seq-id 0.95 --cov-mode 1 -c 1.0      # Replace with your actual application command
 

Error Message:

Job ID: 192313
Node List: stdg22
Submit Directory: /home-cdo/x266288/data_process/utils
easy-linclust /dfs/is/home/x266288/data_process/assets/FASTA/merged_all.fasta /dfs/is/home/x266288/data_process/assets/db/clustered/indi+oas/clustedRes /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir --min-seq-id 0.95 --cov-mode 1 -c 1.0
 
MMseqs Version:                         13.45111
Cluster mode                            0
Max connected component depth           1000
Similarity type                         2
Threads                                 40
Compressed                              0
Verbosity                               3
Substitution matrix                     nucl:nucleotide.out,aa:blosum62.out
Add backtrace                           false
Alignment mode                          0
Alignment mode                          0
Allow wrapped scoring                   false
E-value threshold                       0.001
Seq. id. threshold                      0.95
Min alignment length                    0
Seq. id. mode                           0
Alternative alignments                  0
Coverage threshold                      1
Coverage mode                           1
Max sequence length                     65535
Compositional bias                      1
Max reject                              2147483647
Max accept                              2147483647
Include identical seq. id.              false
Preload mode                            0
Pseudo count a                          1
Pseudo count b                          1.5
Score bias                              0
Realign hits                            false
Realign score bias                      -0.2
Realign max seqs                        2147483647
Gap open cost                           nucl:5,aa:11
Gap extension cost                      nucl:2,aa:1
Zdrop                                   40
Alphabet size                           nucl:5,aa:21
k-mers per sequence                     21
Spaced k-mers                           0
Spaced k-mer pattern                    
Scale k-mers per sequence               nucl:0.200,aa:0.000
Adjust k-mer length                     false
Mask residues                           1
Mask lower case residues                0
k-mer length                            0
Shift hash                              67
Split memory limit                      0
Include only extendable                 false
Skip repeating k-mers                   false
Rescore mode                            0
Remove hits by seq. id. and coverage    false
Sort results                            0
Remove temporary files                  true
Force restart with latest tmp           false
MPI runner                              
Database type                           0
Shuffle input database                  true
Createdb mode                           1
Write lookup file                       0
Offset of numeric ids                   0
 
linclust /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/input /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/clu /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/clu_tmp -e 0.001 --min-seq-id 0.95 -c 1 --cov-mode 1 --spaced-kmer-mode 0 --remove-tmp-files 1
 
Set cluster mode GREEDY MEM.
kmermatcher /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/input /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/clu_tmp/12397887837406899853/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size nucl:5,aa:13 --min-seq-id 0.95 --kmer-per-seq 21 --spaced-kmer-mode 0 --kmer-per-seq-scale nucl:0.200,aa:0.000 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 1 -k 0 -c 1 --max-seq-len 65535 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 40 --compressed 0 -v 3
 
kmermatcher /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/input /dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/clu_tmp/12397887837406899853/pref --sub-mat nucl:nucleotide.out,aa:blosum62.out --alph-size nucl:5,aa:13 --min-seq-id 0.95 --kmer-per-seq 21 --spaced-kmer-mode 0 --kmer-per-seq-scale nucl:0.200,aa:0.000 --adjust-kmer-len 0 --mask 0 --mask-lower-case 0 --cov-mode 1 -k 0 -c 1 --max-seq-len 65535 --hash-shift 67 --split-memory-limit 0 --include-only-extendable 0 --ignore-multi-kmer 0 --threads 40 --compressed 0 -v 3
 
Database size: 2080936687 type: Nucleotide
 
Not enough memory to process at once need to split
[=================================================================] 2.08B 33m 39s 920ms
Process file into 11 parts
Generate k-mers list for 1 split
[=================================================================] 2.08B 37m 43s 776ms
 
Adjusted k-mer length 19
Sort kmer 0h 4m 42s 840ms
Sort by rep. sequence 0h 1m 40s 458ms
Generate k-mers list for 2 split
[=================================================================] 2.08B 37m 40s 661ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 55s 392ms
Sort by rep. sequence 0h 1m 43s 902ms
Generate k-mers list for 3 split
[=================================================================] 2.08B 36m 51s 84ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 55s 543ms
Sort by rep. sequence 0h 1m 41s 750ms
Generate k-mers list for 4 split
[=================================================================] 2.08B 37m 24s 796ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 52s 357ms
Sort by rep. sequence 0h 1m 40s 557ms
Generate k-mers list for 5 split  
[=================================================================] 2.08B 37m 57s 412ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 57s 804ms
Sort by rep. sequence 0h 1m 39s 453ms
Generate k-mers list for 6 split
[=================================================================] 2.08B 37m 10s 891ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 55s 794ms
Sort by rep. sequence 0h 1m 38s 542ms
Generate k-mers list for 7 split
[=================================================================] 2.08B 36m 53s 9ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 55s 788ms
Sort by rep. sequence 0h 1m 40s 551ms
Generate k-mers list for 8 split
[=================================================================] 2.08B 36m 54s 754ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 49s 532ms
Sort by rep. sequence 0h 1m 40s 244ms
Generate k-mers list for 9 split
[=================================================================] 2.08B 36m 24s 93ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 58s 556ms
Sort by rep. sequence 0h 1m 37s 893ms
Generate k-mers list for 10 split
[=================================================================] 2.08B 36m 46s 198ms
 
Adjusted k-mer length 19
Sort kmer 0h 2m 57s 392ms
Sort by rep. sequence 0h 1m 36s 238ms
Generate k-mers list for 11 split
[=================================================================
/dfs/is/home/x266288/data_process/tmp_dir/tmp_dir/1053738512421706396/clu_tmp/12397887837406899853/linclust.sh: line 26: 23857 Bus error               (core dumped) $RUNNER "$MMSEQS" kmermatcher "$INPUT" "${TMP_PATH}/pref" ${KMERMATCHER_PAR}
Error: kmermatcher died
Error: Search died
 

System Configuration:

MMseqs2 version: 13.45111
Memory: 378 GB

From the error message, it seems related to memory allocation or hardware limitations, but I am unsure how to debug or fix this issue. If you could provide any suggestions or debugging tips, it would be greatly appreciated!
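
One way to narrow this down is sketched below. On Linux, a Bus error (SIGBUS) is commonly raised when a process touches a memory-mapped file whose backing storage cannot be provided, for example when the filesystem holding the temporary files fills up or a network filesystem returns an I/O error. Whether that applies here is an assumption, not a confirmed diagnosis. The helper name `free_kb_on` and the variable `TMP_DIR` are hypothetical and not part of MMseqs2.

```shell
#!/bin/bash
# Hypothetical diagnostic sketch; none of these helpers belong to MMseqs2.

# Print the free space (in KiB) on the filesystem that holds a directory.
# -P forces the POSIX one-line-per-filesystem layout, -k reports KiB.
free_kb_on() {
    df -Pk "$1" | awk 'NR==2 {print $4}'
}

# Assumption: TMP_DIR points at the directory passed to mmseqs as tmp dir.
TMP_DIR="${TMP_DIR:-/tmp}"

echo "Free KiB on $TMP_DIR: $(free_kb_on "$TMP_DIR")"

# Virtual memory limit of the current shell ("unlimited" if none is set).
echo "ulimit -v: $(ulimit -v)"

# If Slurm accounting is available, show what the job requested and used.
# SLURM_JOB_ID is only set inside a running job.
if command -v sacct >/dev/null 2>&1 && [ -n "${SLURM_JOB_ID:-}" ]; then
    sacct -j "$SLURM_JOB_ID" --format=JobID,State,ReqMem,MaxRSS,ExitCode
fi
```

Running this inside the batch job, before the `mmseqs` call, would show whether the temporary directory has enough free space for the split files and whether the shell runs under a memory limit lower than the node's physical memory.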

@dezhi0730
Author

While the final split was being processed, the program stalled for a long time. However, htop showed that a large portion of memory was still available and that CPU utilization was low.

@dezhi0730
Author

When I ran the same command again, I got this error message:

Adjusted k-mer length 19
Sort kmer 0h 2m 52s 516ms
Sort by rep. sequence 0h 1m 36s 729ms
Generate k-mers list for 11 split
[=================================================================
/dfs/is/home/x266288/data_process/tmp_dir/temp_dir/11178384644005550917/clu_tmp/9926663674530773728/linclust.sh: line 26: 12453 Killed                  $RUNNER "$MMSEQS" kmermatcher "$INPUT" "${TMP_PATH}/pref" ${KMERMATCHER_PAR}
Error: kmermatcher died
Error: Search died
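
A `Killed` message means the process received SIGKILL, which often comes from the kernel's out-of-memory killer or from the scheduler enforcing the job's memory allocation. The batch script above does not request memory with `--mem`, so the job may be running under a partition default that is smaller than the node's 378 GB, while `Split memory limit 0` lets MMseqs2 size its splits from the total system memory. This is an assumption about the cause, not a confirmed one. A minimal sketch of deriving an explicit `--split-memory-limit` from the Slurm allocation follows; the helper `split_limit_from_mb`, the 80% headroom, and the 300000 MiB fallback are all hypothetical choices.

```shell
#!/bin/bash
# Hypothetical helper: turn a memory allocation in MiB into a value for
# --split-memory-limit, keeping 20% headroom (an arbitrary choice).
split_limit_from_mb() {
    local mb="$1"
    echo "$(( mb * 80 / 100 ))M"
}

# Slurm sets SLURM_MEM_PER_NODE (in MiB) when the job requests --mem.
# The fallback of 300000 MiB is a placeholder, not a recommendation.
ALLOC_MB="${SLURM_MEM_PER_NODE:-300000}"
LIMIT="$(split_limit_from_mb "$ALLOC_MB")"

# Print the command instead of running it, so the sketch has no side effects.
# The file names are shortened stand-ins for the paths used in the job script.
echo "mmseqs easy-linclust merged_all.fasta clustedRes tmp_dir" \
     "--min-seq-id 0.95 --cov-mode 1 -c 1.0 --split-memory-limit $LIMIT"
```

Pairing this with an explicit `#SBATCH --mem=...` line would keep the scheduler's limit and the MMseqs2 split size consistent with each other.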
