Segmentation fault with 1000G but not with HM3 #76

Open
swvanderlaan opened this issue Mar 30, 2022 · 2 comments

Comments


swvanderlaan commented Mar 30, 2022

Hi,

I am following this protocol: https://www.nature.com/articles/nprot.2010.116. At some point it will make use of smartpca. This works beautifully for HapMap 3 data:

perl ~/git/EIG/bin/smartpca.perl \
    -i rawdata/rawdata.hapmap3r2.pruned.bed \
    -a rawdata/rawdata.hapmap3r2.pruned.pedsnp \
    -b rawdata/rawdata.hapmap3r2.pruned.pedind \
    -k 10 \
    -o rawdata/rawdata.hapmap3r2.pruned.pca \
    -p rawdata/rawdata.hapmap3r2.pruned.plot \
    -e rawdata/rawdata.hapmap3r2.pruned.eval \
    -l rawdata/rawdata.hapmap3r2.pruned.log \
    -m 5 \
    -t 10 \
    -s 6.0 \
    -w reference/hapmap3r2_CEU.CHB.JPT.YRI-pca-populations.txt
smartpca -p rawdata/rawdata.hapmap3r2.pruned.pca.par >rawdata/rawdata.hapmap3r2.pruned.log
ploteig -i rawdata/rawdata.hapmap3r2.pruned.pca.evec -c 1:2  -p Case:Control:3:4:5:6  -x  -y  -o rawdata/rawdata.hapmap3r2.pruned.plot.xtxt
evec2pca.perl 10 rawdata/rawdata.hapmap3r2.pruned.pca.evec rawdata/rawdata.hapmap3r2.pruned.pedind rawdata/rawdata.hapmap3r2.pruned.pca

Here's the head, for instance, of the result:

cat rawdata/rawdata.hapmap3r2.pruned.pca.evec | head
           #eigvals:    43.658    23.724     1.738     1.055     1.052     1.046     1.046     1.042     1.038     1.034
                 1:1     0.0050      0.0807     -0.0021     -0.0024     -0.0045      0.0080      0.0033      0.0031     -0.0133      0.0112             Case
                 2:2     0.0045      0.0821     -0.0035      0.0012     -0.0004      0.0122     -0.0046      0.0051     -0.0101     -0.0056          Control
                 3:3     0.0055      0.0813     -0.0026     -0.0064      0.0005      0.0153      0.0010      0.0047     -0.0042      0.0110             Case
                 4:4     0.0049      0.0801      0.0029      0.0037      0.0022      0.0126      0.0022      0.0040     -0.0078      0.0047             Case
                 5:5     0.0047      0.0812     -0.0004      0.0002     -0.0034      0.0124     -0.0003      0.0025     -0.0063      0.0129             Case
                 6:6     0.0045      0.0807     -0.0021      0.0026     -0.0155      0.0168      0.0090      0.0083     -0.0026      0.0086             Case
                 7:7     0.0051      0.0809      0.0007      0.0009     -0.0017      0.0190      0.0068      0.0027     -0.0019      0.0062             Case
                 8:8     0.0050      0.0813      0.0016     -0.0046     -0.0023      0.0152     -0.0028      0.0042     -0.0048     -0.0002             Case
                 9:9     0.0053      0.0816     -0.0024     -0.0001     -0.0009      0.0107      0.0047      0.0046     -0.0095      0.0011             Case

But when I try to do the exact same thing using the 1000G data, it breaks down:

perl ~/git/EIG/bin/smartpca.perl \
    -i rawdata/rawdata.1kg_phase1.pruned.bed \
    -a rawdata/rawdata.1kg_phase1.pruned.pedsnp \
    -b rawdata/rawdata.1kg_phase1.pruned.pedind \
    -k 2 \
    -o rawdata/rawdata.1kg_phase1.pruned.pca \
    -p rawdata/rawdata.1kg_phase1.pruned.plot \
    -e rawdata/rawdata.1kg_phase1.pruned.eval \
    -l rawdata/rawdata.1kg_phase1.pruned.log \
    -t 2 \
    -w reference/1kg_phase1_all/1kg-pca-populations.txt
smartpca -p rawdata/rawdata.1kg_phase1.pruned.pca.par >rawdata/rawdata.1kg_phase1.pruned.log
sh: line 1: 40371 Segmentation fault: 11  smartpca -p rawdata/rawdata.1kg_phase1.pruned.pca.par > rawdata/rawdata.1kg_phase1.pruned.log
ploteig -i rawdata/rawdata.1kg_phase1.pruned.pca.evec -c 1:2  -p   -x  -y  -o rawdata/rawdata.1kg_phase1.pruned.plot.xtxt
evec2pca.perl 2 rawdata/rawdata.1kg_phase1.pruned.pca.evec rawdata/rawdata.1kg_phase1.pruned.pedind rawdata/rawdata.1kg_phase1.pruned.pca

Is this a memory issue? Or something else I am missing?

I should add that I did have it running once, back in 2019, using the exact same input. I attach the log it produced at the time; that run used smartpca version 16000.

rawdata.1kg_phase1.pruned.log

Many thanks

Sander
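
One sanity check that may help narrow this down is whether the three input files agree with each other. The sketch below is only a suggestion, not a diagnosis of the crash. It assumes that the .pedsnp and .pedind files have one line per SNP and one line per sample (as when they are copies of a PLINK .bim and .fam), and that the .bed file is a SNP-major PLINK binary file, whose size is 3 header bytes plus ceil(samples / 4) bytes per SNP. The function names are made up for this illustration.

```python
import math
from pathlib import Path


def count_lines(path):
    """Count non-empty lines in a text file."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


def expected_bed_size(n_samples, n_snps):
    """Size in bytes of a SNP-major PLINK .bed file.

    3 header bytes, then ceil(n_samples / 4) bytes per SNP
    (each genotype takes 2 bits).
    """
    return 3 + n_snps * math.ceil(n_samples / 4)


def check_prefix(prefix):
    """Return (actual, expected) .bed size in bytes for a file prefix.

    Assumption: <prefix>.pedsnp has one line per SNP and
    <prefix>.pedind has one line per sample.
    """
    n_snps = count_lines(f"{prefix}.pedsnp")
    n_samples = count_lines(f"{prefix}.pedind")
    actual = Path(f"{prefix}.bed").stat().st_size
    return actual, expected_bed_size(n_samples, n_snps)
```

Calling `check_prefix("rawdata/rawdata.1kg_phase1.pruned")` and comparing the two returned numbers would show whether the .bed file matches the SNP and sample counts. A mismatch would point at the input files rather than at memory.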


bumblenick commented Mar 30, 2022 via email


maspil commented Apr 23, 2024

Hi,
I am following the same protocol and am having the exact same issue, only with the HapMap 3 data (segmentation fault (core dumped)). Did you find a solution for this?
Log:

parameter file: RAW.hapmap3r2.pruned.pca.par
### THE INPUT PARAMETERS
##PARAMETER NAME: VALUE
genotypename: RAW.hapmap3r2.pruned.bed
snpname: RAW.hapmap3r2.pruned.pedsnp
indivname: RAW.hapmap3r2.pruned.pedind
evecoutname: RAW.hapmap3r2.pruned.pca.evec
evaloutname: RAW.hapmap3r2.pruned.eval
altnormstyle: NO
numoutevec: 2
numoutlieriter: 5
numoutlierevec: 2
outliersigmathresh: 6
qtmode: 0
poplistname: pca-populations.txt
## smartpca version: 18140
norm used

lsqproject used
*** warning.  genetic distances are in cM not Morgans
1	rs6688969	119	3743391	T	C

1:Sample1 ignored
2:Sample2 ignored
3:Sample3 ignored
4:Sample4 ignored ...etc etc...

Thanks,
Mari
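
The log above reports every sample as ignored, which may or may not be related to the crash. One neutral thing to look at is which labels actually occur in the .pedind file and whether each population named in the poplistname file has any samples at all. The sketch below assumes the label is the last whitespace-separated column of each .pedind line and that the population list has one name per line. The function names are hypothetical.

```python
from collections import Counter


def pedind_labels(path):
    """Tally the last whitespace-separated column of each .pedind line."""
    counts = Counter()
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if fields:
                counts[fields[-1]] += 1
    return counts


def poplist_labels(path):
    """Read population names from a list file, one name per line (assumed)."""
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]


def compare_labels(pedind_path, poplist_path):
    """Return the label counts and the listed populations with no samples."""
    counts = pedind_labels(pedind_path)
    without_samples = [pop for pop in poplist_labels(poplist_path)
                       if counts[pop] == 0]
    return counts, without_samples
```

For example, `compare_labels("RAW.hapmap3r2.pruned.pedind", "pca-populations.txt")` returns the per-label sample counts and any population from the list that has no samples in the .pedind file.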
