Reference impute using maximal matches: Killed #49

Open
jerrywzy opened this issue Jul 1, 2021 · 2 comments

jerrywzy commented Jul 1, 2021

Hi there,

I am currently trying to use PBWT to impute a reference panel A onto another reference panel B, and vice versa. I am able to impute reference panel A with reference panel B. However, when imputing reference panel B with reference panel A, the process gets Killed on my Linux server.

Here are some lines from the output with file paths taken out:

read genotypes from panel_B/ch1_A.vcf.gz with 2504 sample names and 3738240 sites on chromosome 1: M, N are 5008, 3738240
user    500.356235      system  4.765889        max_RSS 448492  Memory  583036203
impute against reference panel_A_pbwt/chr1
read pbwt PBW3 file with 133049391 bytes: M, N are 9620, 7687647
read 7687647 sites on chromosome 1 from file
read 4810 sample names
1874940 sites selected from 7687647, pbwt size for 9620 haplotypes is 61442035
built reverse PBWT - size 61408995
1874940 sites selected from 3738240, pbwt size for 5008 haplotypes is 47999564
Imputation preliminaries: user  213.029846      system  1.175402        max_RSS 692372  Memory  2156881417
Reference impute using maximal matches: Killed 

What could be causing this? I've tried it on two different servers, with the same result.
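
To compare memory use between runs, the timing lines in the output above can be pulled into numbers. This is a minimal Python sketch, assuming the field layout seen in this output (`user`, `system`, `max_RSS`, `Memory`). The name `parse_timing` is hypothetical, and the units are whatever pbwt reports, which the log does not state.

```python
import re

# Matches timing lines of the form seen in the pbwt output, e.g.
#   user    500.356235      system  4.765889        max_RSS 448492  Memory  583036203
# An optional label such as "Imputation preliminaries:" may precede "user".
TIMING_RE = re.compile(
    r"user\s+(?P<user>[\d.]+)\s+system\s+(?P<system>[\d.]+)\s+"
    r"max_RSS\s+(?P<max_rss>\d+)\s+Memory\s+(?P<memory>\d+)"
)


def parse_timing(line):
    """Return the timing fields of one log line as a dict, or None if absent."""
    m = TIMING_RE.search(line)
    if m is None:
        return None
    return {
        # Text before "user", without the trailing colon (empty if none).
        "label": line[:m.start()].strip().rstrip(":"),
        "user": float(m["user"]),
        "system": float(m["system"]),
        "max_rss": int(m["max_rss"]),
        "memory": int(m["memory"]),
    }


if __name__ == "__main__":
    sample = [
        "user    500.356235      system  4.765889        max_RSS 448492  Memory  583036203",
        "Imputation preliminaries: user  213.029846      system  1.175402        max_RSS 692372  Memory  2156881417",
        "Reference impute using maximal matches: Killed",
    ]
    for line in sample:
        print(parse_timing(line))
```

The last sample line has no timing fields, so it yields `None`: the process was killed before pbwt could report usage for that step.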

richarddurbin (Owner) commented Jul 1, 2021 via email

jerrywzy (Author) commented Jul 2, 2021

Hi Richard,

Thanks for the reply. I've rerun the command with "-check". Unfortunately, the process was killed again at the same point.

Here are the last few lines of the output:

written 95762838 chars pbwt: M, N are 5008, 3700000
written 3700000 sites from 10177 to 247152709
written 2504 samples
read genotypes from panelB_chr1_IDfixed.vcf with 2504 sample names and 3738240 sites on chromosome 1: M, N are 5008, 3738240
user    441.591560      system  16.612428       max_RSS 449228  Memory  583036114
impute against reference panelA/pbwt/chr1
read pbwt PBW3 file with 133049391 bytes: M, N are 9620, 7687647
read 7687647 sites on chromosome 1 from file
read 4810 sample names
1874940 sites selected from 7687647, pbwt size for 9620 haplotypes is 61442035
built reverse PBWT - size 61408995
written haplotype file: 1874940 rows of 9620
1874940 sites selected from 3738240, pbwt size for 5008 haplotypes is 47999564
Imputation preliminaries: user  397.107193      system  15.921011       max_RSS 692460  Memory  2157093083
Reference impute using maximal matches: ./merge_reciprocal.sh: line 13: 23977 Killed                  $PBWT/pbwt -checkpoint 100000 -check -readVcfGT panelB_chr1_IDfixed.vcf -referenceImpute panelA/pbwt/chr1 -writeVcfGz panelB_chr1_IDfixed_panelA_test.dose.vcf.gz
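
The shell's "Killed" message means the process received SIGKILL. On Linux that signal is often sent by the kernel's out-of-memory killer, but it can also come from a job scheduler or from another user, so the message alone does not prove a memory problem. As a first check, a small wrapper can confirm which signal ended the run. This is a sketch only: `describe_exit` and `run_and_describe` are hypothetical helper names, and the example command assumes `pbwt` is on the `PATH`.

```python
import signal
import subprocess


def describe_exit(returncode):
    """Turn a subprocess return code into a short human-readable string.

    subprocess reports termination by signal N as a return code of -N.
    """
    if returncode < 0:
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            # Signal number without a symbolic name on this platform.
            name = f"signal {-returncode}"
        return f"terminated by {name}"
    return f"exited with status {returncode}"


def run_and_describe(argv):
    """Run a command, wait for it to finish, and describe how it ended."""
    proc = subprocess.run(argv)
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # Stand-in command that kills itself with SIGKILL, to show the wrapper.
    # For the real run, pass the pbwt arguments from the script instead, e.g.
    # ["pbwt", "-checkpoint", "100000", "-check", "-readVcfGT", ...].
    print(run_and_describe(["sh", "-c", "kill -9 $$"]))
```

If the wrapper reports SIGKILL, the kernel log (for example via `dmesg`) is the usual place to check whether the out-of-memory killer was responsible.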
