Finding structural varianCommand terminated by signal 11 #11
Hi, I will leave the VALOR segmentation fault issue to @f0t1h, but I can take a look at the SONIC-related part of your bug report. A larger SONIC file with empty output is definitely not expected. Is it possible for you to share your files so I can try to pinpoint the problem?
Hi, Thanks for the super quick reply!
The reference genome is missing :-) Can you send the FASTA as well?
My guess is that the sonic->chromosome_names array is null. I will double check tomorrow.
Hi again, Sorry for the delay, I was checking whether or not I could share the reference genome. Best,
Well, I will need something to test. But a quick look at the gaps file shows a problem: it is not a valid BED file. You need something like "chromosome1 45400 55400", with the three fields separated by tabs. Those "edges=5300..117305 left=156508 right=143120 ver=1.10 style=3" parts have to go.
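A quick way to catch this kind of problem before building the Sonic file is to check each line of the BED file for three tab-separated fields. The following is only a minimal sketch: the function name `find_invalid_bed_lines` and the exact set of checks are assumptions made for illustration, not something SONIC itself provides or is known to enforce.

```python
def find_invalid_bed_lines(lines):
    """Return (line_number, reason) pairs for lines that are not minimal 3-column BED.

    Hypothetical helper: it only checks the basics discussed in this thread
    (tab-separated fields, a chromosome name without whitespace, integer coordinates).
    """
    problems = []
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) < 3:
            problems.append((number, "fewer than 3 tab-separated fields"))
            continue
        chrom, start, end = fields[:3]
        if any(ch.isspace() for ch in chrom):
            problems.append((number, "whitespace in chromosome name"))
        elif not (start.isdecimal() and end.isdecimal()):
            problems.append((number, "start/end are not non-negative integers"))
        elif int(start) > int(end):
            problems.append((number, "start is greater than end"))
    return problems
```

Calling it as `find_invalid_bed_lines(open("Gaps/gaps.bed"))` would list the offending lines; an empty list means the file passes these basic checks, which is not a guarantee that every tool will accept it.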
I asked and I have the right to share a subset of my data. Here is an archive containing everything: The invalid BED file you mentioned came from the way my reference genome's chromosomes were named. I updated their names and tried to run Sonic / Valor again, but nothing changed: I ended up with an empty Sonic file, and Valor stopped upon start.
Well, at least for this set it looks fine. Can you also try with the same subset?
with real files
sonic --ref reference.fasta --gaps Gaps/gaps.bed --reps Repeats/repeats.out --dups SegDups/segdups.bed --make-sonic ref.sonic
calkan@donut:~/tmp/sonic/DataSonic/ExampleDataset$ ls -l ref.sonic
with null files
calkan@donut:
I tried with this subset, yes. Here is what I get:
./sonic --ref reference.fasta --gaps Gaps/gaps.bed --reps Repeats/repeats.out --dups SegDups/segdups.bed --make-sonic ref.sonic
ll ref.sonic
I tried on two different computers and got the same output.
Apparently, this could be caused by the version of Sonic, which is not the latest by default when cloning Valor. I manually pulled the latest version and managed to properly create the Sonic file. I will attempt to run Valor again, but need to update my BAM file, since my reference genome's headers contained spaces when the BAM file was generated, which seems to be incompatible with Valor. Thanks a lot for your answers!
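For anyone hitting the same header problem, here is a small sketch for spotting FASTA headers that contain whitespace and cutting each header at the first whitespace. The function names are hypothetical, and the only basis for the "no spaces" requirement is the observation in this thread, not Valor's documentation.

```python
def headers_with_spaces(fasta_lines):
    """Return FASTA header names (without the leading '>') that contain whitespace."""
    flagged = []
    for line in fasta_lines:
        if line.startswith(">"):
            name = line[1:].rstrip("\r\n")
            if any(ch.isspace() for ch in name):
                flagged.append(name)
    return flagged


def strip_header_descriptions(fasta_lines):
    """Yield FASTA lines with each header cut at its first whitespace.

    Sequence lines are passed through unchanged.
    """
    for line in fasta_lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            yield ">" + (tokens[0] if tokens else "") + "\n"
        else:
            yield line
```

Two caveats: truncating headers can produce duplicate names if several headers share the same first word, and any BAM file aligned against the old headers still has to be regenerated against the renamed reference.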
Oh, I didn't notice it was a versioning issue. I will update the version in this repo then.
I updated the SONIC submodule in this repo. Let us know if you still have the segmentation fault problem. I won't be able to help with the VALOR code itself; it will be in @f0t1h's court if the problem persists.
Hi again, I modified my reference genome so it does not contain spaces; it now only has simple headers such as ">0", ">1", etc. I then tried to re-run Valor with these new files, and using 1 thread, it still reports that it ended by signal 11. Here are the last few lines of the log:
This is not a memory issue, since I reserved 50GB, and Valor barely used 300MB. Is there anything else I am doing wrong? Thanks in advance.
Leaving this to @f0t1h
Can you send me the sonic file, so I can test what is wrong?
Also, can you send all logs from the beginning? @morispi
Here is an archive containing the Sonic file and the logs from the run: I ran again on a single chromosome in order to get a smaller sonic / log.
Does this error always happen on the first chromosome in the sample, or randomly?
Well, the data I shared only contains one chromosome, so it always happens on this one.
Apart from that, on a regular file with multiple chromosomes, it seems to happen randomly.
and
I ran the above code, loading the sonic file you provided, and got no errors. Can you try the same? The Sonic file looks healthy. Is it possible that the BAM file is corrupted?
Here is the output from Valgrind: Valgrind.log. I don't think the BAM file is corrupted, since it was generated recently with LongRanger 2.2.2, and LongRanger did not report any error while running. Is there any way I can check whether the BAM file is corrupted, though?
You can use samtools quickcheck. I will be offline now since it is past 2AM here. I will follow up tomorrow.
Samtools quickcheck did not report any error. I tried to use samtools view to write my BAM file to another BAM file, and attempted to run Valor on that new BAM file again. It still ended with the same error. No problem with following up tomorrow, thanks for your answers!
Can you set --contig_count or -c to the number of chromosomes in your reference? By default, it assumes human sequencing and runs the first 24 (22 + X + Y) chromosomes. It may be the cause of your problem. If this fixes it, I will update the software accordingly.
Sorry for not answering yesterday, it was a holiday in France. Anyway, thank you very much for your answers!
Edit: Well, it seemed to work at first... It still terminated with signal 11, but much later than in previous runs.
I am in the process of cleaning up the command line interface; once done, I will update the README as well. Can you let me know the minimum number of bases in these contigs? Valor is designed for large variants, so the problem might be caused by the smaller contigs. I will update the code tomorrow to check this.
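Both numbers asked for in this thread, the contig count for --contig_count / -c and the length of the smallest contig, can be read straight from the reference FASTA. Below is a minimal sketch; `contig_lengths` is a hypothetical helper written for this purpose, and it assumes a plain (uncompressed) FASTA file.

```python
def contig_lengths(fasta_lines):
    """Return a dict mapping each contig name to its sequence length in bases.

    The contig name is taken as the first whitespace-delimited token of the header.
    """
    lengths = {}
    current = None
    for line in fasta_lines:
        line = line.strip()
        if line.startswith(">"):
            tokens = line[1:].split()
            current = tokens[0] if tokens else ""
            lengths[current] = 0
        elif current is not None:
            # Sequence may be wrapped over several lines; add each piece.
            lengths[current] += len(line)
    return lengths
```

With `lengths = contig_lengths(open("reference.fasta"))`, `len(lengths)` gives the number of contigs and `min(lengths.values())` the size of the smallest one.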
Sorry, it seems like I forgot to reply. The main contig of interest is about 3.9Mbp. I managed to run Valor successfully on this contig only, by subsampling my BAM and FASTA files. However, Valor did not output any variant. Is it because the contig is too small?
Hello,
I am attempting to run Valor on a non-human dataset for which no default Sonic file is available.
I thus followed the guidelines on Sonic's GitHub page and created my own file. I generated two Sonic files, one with empty bed / out files passed to the --dups, --reps, and --gaps parameters, and another with files containing actual data passed to the parameters. The first thing that seemed strange to me is that the Sonic file generated from empty bed/out files is larger than the other one. Is that expected behaviour?
I generated the segmental duplication file using SEDEF, as mentioned here calkan/sonic#11, but since my assembly is not soft-masked, SEDEF generated an empty file. RepeatMasker ran correctly, and I ran a script of my own to find the gaps. Both the gaps and repeats files were thus correct and contained data.
Anyway, I then attempted to run Valor with these two Sonic files, to see if there would be any difference. However, Valor fails with both files, reporting that "Finding structural varianCommand terminated by signal 11". Both iterations of Valor fail on the same chromosome.
Do you know what might be causing this issue? Is there anything I'm doing wrong with Sonic files? I'm at a loss here.
Thanks in advance.
Best,
Pierre