How to use compare with both signatures and *.sbt.json files? #672

Closed
jolespin opened this issue May 3, 2019 · 3 comments · Fixed by #1059

jolespin commented May 3, 2019

I'm working on a pipeline that uses sourmash. I want to compare the signatures within a database plus one or more additional signatures. It doesn't look like I can do this currently. I also can't assume that I will have the signatures available for the sequences in the database.

Is there a flag I can use to do this?
or
Is there a way to extract signatures from the database?

bash-4.1$ sourmash compare -k 31 -o out.txt bin.3.orig.fa.sig db.sbt.json
== This is sourmash version 2.0.0. ==
== Please cite Brown and Irber (2016), doi:10.21105/joss.00027. ==

Error in parsing signature; quitting.
Exception:

warning: no signatures loaded at given ksize/molecule type from db.sbt.json
loaded 1 signatures total.

0-output/M-1...	[ 1.]
min similarity in matrix: 1.000
saving labels to: out.txt.labels.txt
saving distance matrix to: out.txt

ctb commented Aug 24, 2019

Hi @jolespin, we don't have a canonical way to get the signatures out of an SBT, but they are all stored in individual files that you can retrieve if need be. For example, for podar-ref.sbt.json, you will find all of the signatures under .sbt.podar-ref/, named as hashes. So you can do something like

find .sbt.podar-ref \! -name internal.\* -print

to get a list.
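
As a rough sketch of how that list might be put to use, the find call can be wrapped in a small shell function. The name list_sbt_leaves is hypothetical and not part of sourmash. The sketch assumes the layout described above (internal nodes named internal.*, every other file a leaf signature) and adds -type f so the directory itself is not listed.

```shell
# Hypothetical helper, not part of sourmash: print the leaf signature files
# stored next to an SBT index, one path per line.
# Assumes internal nodes are named internal.* and all other files are leaves.
list_sbt_leaves() {
    # -type f skips the directory entry itself; sort makes the order stable.
    find "$1" -type f ! -name 'internal.*' -print | sort
}
```

Assuming the leaf files load as ordinary signatures and their paths contain no whitespace, the output could then be passed on the command line, e.g. sourmash compare -k 31 -o out.txt bin.3.orig.fa.sig $(list_sbt_leaves .sbt.podar-ref).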

This is obviously suboptimal :)

Two ideas:

  • allow compare to take in SBT and LCA files.
  • support a sourmash signature operation to extract signatures from SBT and LCA files.

ctb commented Jun 26, 2020

#1044 provides sourmash sig split, which extracts signatures from SBT/LCA files; resolving #875 could provide a single loading API for compare.

ctb commented Jun 28, 2020

This is fixed in #1059.
