Getting died with <Signals.SIGKILL: 9> when trying to run Lighter with the human genome size (3.2 Gb) #36

Open
PeterSu92 opened this issue Aug 15, 2023 · 10 comments

@PeterSu92

I'm trying to run this on my computer with a large FASTQ input file, and am running it as a subprocess in Python:

# Set the desired parameters
kmer_size = 31
genome_size = 2000000000
error_rate = 0.1
num_threads = 10

# Construct the Lighter command
lighter_command = [
    lighter_executable_path,
    '-r', input_reads_path,
    '-k', str(kmer_size),
    str(genome_size), str(error_rate),  # Remaining -k arguments: genome size and alpha
    '-t', str(num_threads)
]

However, if I set the genome size any larger than the above, it won't work, as I get the following error message:

line 526, in run
raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '['../../Lighter/lighter', '-r', '100000_NG1D7PJA9F_1.fq', '-k', '31', '3200000000', '0.1', '-t', '10']' died with <Signals.SIGKILL: 9>.

The README says to put in at least the size of the genome of the organism in question, which in this case is the human genome. Am I doing something wrong that's a simple fix? Thank you!
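
For reference, a minimal sketch of how a Python wrapper might tell a signal kill apart from an ordinary non-zero exit. The helper name `run_and_explain` is hypothetical, and the self-killing shell command is only a stand-in for the Lighter invocation; the out-of-memory explanation is a common cause of SIGKILL on Linux, not something confirmed for this run.

```python
import signal
import subprocess


def run_and_explain(command):
    """Run a command and describe how it ended (hypothetical helper)."""
    result = subprocess.run(command)
    if result.returncode < 0:
        # On POSIX, a negative return code means "terminated by signal -returncode".
        sig = signal.Signals(-result.returncode)
        if sig == signal.SIGKILL:
            return f"killed by {sig.name}: possibly the kernel out-of-memory killer"
        return f"terminated by {sig.name}"
    return f"exited with status {result.returncode}"


if __name__ == "__main__":
    # Stand-in for the Lighter command: a shell that sends SIGKILL to itself.
    print(run_and_explain(["sh", "-c", "kill -9 $$"]))
```

Using `subprocess.run` without `check=True` avoids the `CalledProcessError` traceback and leaves the decision about what to do next to the caller.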

@mourisl
Owner

mourisl commented Aug 15, 2023

I think signal 9 means the process is killed by the system. Did you run Lighter on a server? How much memory did you specify?

@PeterSu92
Author

No, just on my local machine, which runs Windows, but I'm running Lighter under WSL. Do I need to change the memory allocation? I remember reading about that somewhere.

@mourisl
Owner

mourisl commented Aug 16, 2023

I have never tested Lighter in that environment. For the human genome, I think you may need about 15 GB of memory.
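
A quick way to compare that figure against what the Linux side of WSL actually sees is to read `MemAvailable` from `/proc/meminfo`. The sketch below is illustrative only: the function names are hypothetical, and the 15 GiB default is just the rough estimate quoted above, not a measured requirement.

```python
import os


def mem_available_gib(meminfo_text):
    """Return MemAvailable from /proc/meminfo-style text, in GiB."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # the kernel reports this value in kB
            return kib / (1024 ** 2)
    raise ValueError("MemAvailable not found")


def enough_memory(meminfo_text, required_gib=15.0):
    # 15 is the rough figure quoted in this thread, not a measured requirement.
    return mem_available_gib(meminfo_text) >= required_gib


if __name__ == "__main__":
    if os.path.exists("/proc/meminfo"):
        with open("/proc/meminfo") as handle:
            text = handle.read()
        print(f"available: {mem_available_gib(text):.1f} GiB, "
              f"enough: {enough_memory(text)}")
```

On WSL 2 the virtual machine's memory cap can be raised with the `memory` setting under `[wsl2]` in `.wslconfig`; whether that cap is the limiting factor here has not been established in this thread.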

@PeterSu92
Author

Hmm, what happens if I set the genome size to 2 Gb instead of 3.2 Gb, in terms of how the algorithm works?

@mourisl
Owner

mourisl commented Aug 16, 2023

Worth a try; the genome size is not a very strict parameter. With a 2G specification, memory usage would probably be around 10 GB.

@PeterSu92
Author

Looks like it was able to run the job, any way to tell if it did anything super awry?

[2023-08-15 22:34:19] =============Start====================
[2023-08-15 22:34:31] Bad quality threshold is "D"
[2023-08-15 22:34:53] Finish sampling kmers
[2023-08-15 22:34:53] Bloom filter A's false positive rate: 0.000000
[2023-08-15 22:34:59] Finish storing trusted kmers
[2023-08-15 22:36:39] Finish error correction
Processed 768132 reads:
36267 are error-free
Corrected 295242 bases(0.403410 corrections for reads with errors)
Trimmed 0 reads with average trimmed bases 0.000000
Discard 0 reads
Error correction with Lighter is complete.
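
One rough sanity check is to recompute the summary ratios from the raw counts in the log. The sketch below assumes that the "corrections for reads with errors" figure is corrected bases divided by the number of reads that were not error-free; that reading is inferred from the numbers, not taken from Lighter's documentation. The function name `summarize` is hypothetical.

```python
def summarize(reads_total, reads_error_free, bases_corrected):
    """Recompute summary ratios from the counts printed in the log."""
    reads_with_errors = reads_total - reads_error_free
    return {
        "error_free_fraction": reads_error_free / reads_total,
        # Assumed definition: corrected bases per read that had errors.
        "corrections_per_read_with_errors": bases_corrected / reads_with_errors,
    }


if __name__ == "__main__":
    # Counts copied from the log above.
    stats = summarize(reads_total=768132, reads_error_free=36267,
                      bases_corrected=295242)
    for name, value in stats.items():
        print(f"{name}: {value:.6f}")
```

Under that assumption the recomputed ratio agrees with the 0.403410 in the log, and only about 4.7% of the reads were classified as error-free.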

@mourisl
Owner

mourisl commented Aug 17, 2023

The number of reads, 768132, seems too low for the human genome. Is your data Illumina or PacBio/Nanopore?

@PeterSu92
Author

Oh, I purposefully used a subset of reads to troubleshoot code so that it wouldn't take forever since I was just trying to play around. In theory though, it should've been 10M reads in that file that Lighter processed (minus however many had an 'N' base call or too low of a Phred quality score), which is why I was wondering if there's still a problem, because this is still an order of magnitude off (hundreds of thousands vs. millions).
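
To check how many reads the input file really contains, independently of Lighter, the records can be counted directly. This is a minimal sketch that assumes the common four-lines-per-record FASTQ layout (no line-wrapped sequences); `count_fastq_reads` is a hypothetical helper name.

```python
import gzip


def count_fastq_reads(path):
    """Count records in a FASTQ file, assuming exactly four lines per record."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        line_count = sum(1 for _ in handle)
    if line_count % 4 != 0:
        raise ValueError(f"{path}: {line_count} lines is not a multiple of 4")
    return line_count // 4
```

For example, `count_fastq_reads("100000_NG1D7PJA9F_1.fq")` would give the number to compare against the "Processed ... reads" line in the log.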

@mourisl
Owner

mourisl commented Aug 18, 2023

Lighter usually fails to correct if the read coverage is too low (below about 8x). So the downsampled reads might be too sparse, which would explain why the number of corrected bases seems too low. I guess it might be fine on the full data set.
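
The coverage figure can be estimated as reads × read length ÷ genome size. The sketch below uses a read length of 150 bp, which is an assumption for illustration only (the thread does not state the read length), and both function names are hypothetical.

```python
import math


def estimated_coverage(num_reads, read_length, genome_size):
    """Average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def reads_needed(target_coverage, read_length, genome_size):
    """Smallest number of reads that reaches the target coverage."""
    return math.ceil(target_coverage * genome_size / read_length)


if __name__ == "__main__":
    genome_size = 3_200_000_000  # human genome size used in this thread
    read_length = 150            # ASSUMED; not stated in the thread
    print(f"{estimated_coverage(768132, read_length, genome_size):.3f}x")
    print(reads_needed(8, read_length, genome_size))
```

Under that assumed read length, the 768132 processed reads correspond to well under 0.1x coverage, and even 10M reads would be below 1x, far from the roughly 8x mentioned above.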

@PeterSu92
Author

Alright I tried it on the full dataset and it gave the signal 9 error again. So do you think I just need more RAM then?
