kmds error #30

Closed
shimbalama opened this issue Apr 14, 2016 · 19 comments
@shimbalama

Hi,

I tried to run kmds as per the updated instructions and received the following error:

$ kmds -p metadata.pheno --mds_concat subsampled_matrices.txt -o all_structure --threads 16 --write_distances

kmds: control for population structure
Detected binary phenotype
Reading subsampled matrices from subsampled_matrices.txt
Joined matrix 1
Distance matrix calculated in: 0.000732998 s

Intel MKL ERROR: Parameter 10 was incorrect on entry to DGEMM .

Intel MKL ERROR: Parameter 10 was incorrect on entry to DGEMM .

Intel MKL ERROR: Parameter 2 was incorrect on entry to DSYEVD.

Intel MKL ERROR: Parameter 2 was incorrect on entry to DSYEV.
terminate called after throwing an instance of 'std::runtime_error'
what(): Could not calculate eignvalues of B matrix in metric MDS
Aborted (core dumped)

@tseemann

I get the same error on the 1.1.1 binaries at the second kmds step:

kmds -p meta.pheno --mds_concat matrices.txt -o all_structure --threads 16 --write_distances
kmds: control for population structure
Detected continuous phenotype
Reading subsampled matrices from matrices.txt
Joined matrix 1
Distance matrix calculated in: 0.0711929 s

Intel MKL ERROR: Parameter 10 was incorrect on entry to DGEMM .

Intel MKL ERROR: Parameter 10 was incorrect on entry to DGEMM .

Intel MKL ERROR: Parameter 2 was incorrect on entry to DSYEVD.

Intel MKL ERROR: Parameter 2 was incorrect on entry to DSYEV.
terminate called after throwing an instance of 'std::runtime_error'
  what():  Could not calculate eignvalues of B matrix in metric MDS
Aborted (core dumped)

@johnlees
Owner

I've seen this when either

  • The pheno file was not formatted correctly (first two columns sample id, matching those in the kmer files, third column phenotype)
  • All kmers were filtered out

Do you have an example of meta.pheno and one of the .dsm files in matrices.txt?
Was the distance matrix written? (all_structure.distances.csv)
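
A quick way to rule out the first cause is to check the layout of the pheno file before running kmds. The function below is a minimal sketch, not part of seer: the name check_pheno is hypothetical, and it assumes whitespace-separated fields with the same sample id repeated in the first two columns, which is one reading of the description above. It does not check that the ids match those in the kmer files.

```shell
# check_pheno FILE (hypothetical helper, not shipped with seer)
# Prints every line that does not have exactly three fields, or whose
# first two columns differ (assumption: both hold the same sample id).
# Returns non-zero if any such line is found.
check_pheno() {
  awk '
    NF != 3 || $1 != $2 { printf "line %d: %s\n", NR, $0; bad = 1 }
    END { exit bad + 0 }
  ' "$1"
}
```

For example, `check_pheno metadata.pheno && echo "layout looks ok"` prints the message only when every line passes.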

@johnlees johnlees self-assigned this Apr 14, 2016
@andersgs

Hi John.

I am having the same issue.

The distance matrix was written, but no idea if it is correct.

I'll send you some files offline.

Anders.

@johnlees
Owner

@andersgs Your distance matrix looks ok to me, and I have been able to project it into three dimensions (attached below)
all_structure.zip

Are you using the dynamically or statically compiled versions in the v1.1.1 release, or have you compiled from source yourself?

@andersgs

I am using the statically compiled version (v1.1.1).

That file is produced, but none of the other files I assume seer needs (*.sample?) are.

@johnlees
Owner

I wonder if this might be an error with the way I have linked the math libraries in that case. Could you try this version:
https://github.com/johnlees/seer/releases/download/v1.1.1/seer_v1.1.1_static_all.tar.gz

and see if it works?

@andersgs

Thanks, John.

Still no go, unfortunately.

It segfaults, even with just --help.

@johnlees
Owner

What OS and version are you using?

I'm going to make an alternative script (requiring R) that builds the required files from the distance matrix, as a workaround for now.

@johnlees
Owner

The script should deal with this for practical purposes (see commit e817cee) but ideally this should compile properly, so any more information on your platforms would be appreciated.

@andersgs

Thank you for the script, John. I'll give it a go.

I am running RHEL7 (Red Hat).

@mgalardini

I don't know if this is related, but I believe the 'static_all' kmds segfaults even when it exits successfully, or when it is called with '-h'.

@johnlees
Owner

As far as I can tell, the statically compiled versions won't work on RHEL (kmds at least). An alternative would be to use the sanger-pathogens VM (import ftp://ftp.sanger.ac.uk/pub/pathogens/pathogens-vm/pathogens-vm.latest.ova as a resource in VirtualBox).

@mgalardini

I'm in the process of trying the proposed pipeline on RHEL7; up to the first pass of kmds it seems to be working fine (if you ignore the segfault when the program exits). Will let you know if I can get the second pass to work.
A virtual machine could be cool, but if it can't be run in a cluster it would still crash pretty soon when it starts requiring large amounts of RAM.
Is there anything I (we?) can do to provide more info on this issue and related ones?

@mgalardini

Hi, I can confirm that the second pass of kmds segfaults on RHEL6 (the same thing happens on RHEL7).

Output:
kmds: control for population structure
Detected binary phenotype
Reading subsampled matrices from subsampled_matrices.txt
Joined matrix 1
Joined matrix 2
Joined matrix 3
Joined matrix 4
Joined matrix 5
Joined matrix 6
Joined matrix 7
Joined matrix 8
Joined matrix 9
Joined matrix 10
Joined matrix 11
Joined matrix 12
Joined matrix 13
Joined matrix 14
Joined matrix 15
Joined matrix 16
/ebi/lsf/ebi-spool/02/1464773096.2749204: line 8: 37456 Segmentation fault (core dumped)

@johnlees
Owner

johnlees commented Jun 1, 2016

I'm afraid that without a RHEL system to test on, and with no knowledge of cross-compiling, I am unlikely to be able to produce a pre-compiled version that works on your system.

If you are unable to use the VM I would suggest compiling from source. I am happy to try and help with any issues you have with this.

Alternatively, you could use mash to produce a k-mer-based distance matrix, which can then be used as input to the R script referenced above. This would avoid the step with high RAM usage.

@mgalardini

Thank you John, I'll give it a shot with mash and let you know.

@mgalardini

Hi John,

Just to let you know that mash worked fine for generating the distance matrix, and seer is now running happily.
In case someone else ends up in a similar situation, here are the commands used:

# generate a sketch for each genome
for infile in genome/*;
do
  # name each sketch after its input file, minus the .fasta extension
  mash sketch "$infile" -o "$(basename "$infile" .fasta)";
done
# run pairwise distance calculations (each sketch against all sketches)
for genome in $(find . -maxdepth 1 -type f -name '*.msh');
do
  mash dist "$genome" *.msh > "$(basename "$genome" .msh).dist";
done
cat *.dist > distances.txt
# ad-hoc script to convert mash output to square matrix
# no column and row names allowed
# row/columns are sorted alphabetically
# cells are comma separated
./mash2mat distances.txt > distances.csv
# project distance matrix using the script provided by seer
perl R_mds.pl -d distances.csv -p phenotypes.txt -o projection
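
The ad-hoc mash2mat script is not shown in this thread. As a rough illustration of the conversion described in the comments above, the sketch below uses sort and awk. The function name mash_to_matrix is hypothetical, and this is neither the script used here nor the mash2distances script from the repository. It assumes the tab-separated `mash dist` output with reference, query and distance in the first three columns, and that every genome has been compared against every genome (including itself), as the loop above does.

```shell
# mash_to_matrix FILE (hypothetical helper)
# Turns concatenated all-vs-all `mash dist` output into a square,
# comma-separated matrix with no row or column names.
mash_to_matrix() {
  # Sort by reference, then query (byte order), so rows and columns
  # follow the same alphabetical order.
  LC_ALL=C sort -t "$(printf '\t')" -k1,1 -k2,2 "$1" |
    awk -F '\t' '
      # a new reference id starts a new row of the matrix
      ($1 "") != (row "") { if (NR > 1) printf "\n"; row = $1; sep = "" }
      # append the distance (third column) to the current row
      { printf "%s%s", sep, $3; sep = "," }
      END { if (NR > 0) printf "\n" }
    '
}
```

It would be called as `mash_to_matrix distances.txt > distances.csv`. A missing pair would silently shift the cells in that row, so it is worth confirming that the result has as many columns as rows before passing it on.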

johnlees added a commit that referenced this issue Jun 7, 2016
@johnlees
Owner

johnlees commented Jun 7, 2016

@mgalardini - happy to hear that! Thanks for the commands, I've added them to the wiki along with a mash2distances script in commit ac735f1

@tseemann

tseemann commented Jun 7, 2016

@johnlees can you try it on a CentOS VM? Either on VirtualBox, or maybe @andrewjpage can advise on a Docker or real VM option available at Sanger?
