State of streaming sequence IO in khmer #654

Closed
10 of 12 tasks
SensibleSalmon opened this issue Nov 12, 2014 · 11 comments

Comments

@SensibleSalmon
Contributor

Currently: There isn't any. Neither the current screed (0.7) nor the current read_parsers (not the seqan one) supports streaming.

Soon: When screed 0.7.1/screed 1.rc1 is cut/merged in, we will have uncompressed and bz2 streaming across all scripts that use screed. gzip streaming was never backported to Python 2, and they've said they won't be porting it later. We could port it ourselves.
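For reference, a rough sketch of what streaming gzip decompression could look like without the `gzip` module's seek requirement. This is not screed code: `stream_gunzip` is a hypothetical helper name, and the approach (feeding chunks to `zlib.decompressobj` with gzip framing enabled) is just one assumed way to do it. It only handles a single gzip member, so concatenated gzip files would need extra work.

```python
import gzip
import io
import zlib


def stream_gunzip(fileobj, chunk_size=64 * 1024):
    """Yield decompressed byte chunks from a gzip stream.

    Hypothetical helper: `fileobj` only needs a read(n) method, so a pipe
    or stdin should work. Only the first gzip member is decompressed.
    """
    # wbits = 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        data = decomp.decompress(chunk)
        if data:
            yield data
    # Emit anything still buffered inside the decompressor.
    tail = decomp.flush()
    if tail:
        yield tail


def demo():
    """Round-trip a tiny FASTA snippet through gzip, reading 8 bytes at a time."""
    raw = b">seq1\nACGT\n>seq2\nGGCC\n"
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        gz.write(raw)
    compressed = io.BytesIO(buf.getvalue())
    return b"".join(stream_gunzip(compressed, chunk_size=8))
```

The generator never calls `seek` or `tell` on the input, which is the property that matters for reading from a pipe.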

When seqan is merged in we still won't have read_parser streaming, since its high-level interfaces don't provide any streaming support. We can look into using the lower-level APIs or asking the seqan folks to support streaming in their high-level interfaces; I don't believe there's currently a lean either way.

So:

  • screed uncompressed fasta
  • screed uncompressed fastq
  • screed gzip fasta
  • screed gzip fastq
  • screed bzip fasta
  • screed bzip fastq
  • read_parser uncompressed fasta
  • read_parser uncompressed fastq
  • read_parser gzip fasta
  • read_parser gzip fastq
  • read_parser bzip2 fasta
  • read_parser bzip2 fastq

Comments/thoughts?

Relevant to #633
The space-check override mentioned in #618 could potentially be useful here.

@ctb
Member

ctb commented Nov 12, 2014

cc #393

@ctb
Member

ctb commented Nov 12, 2014

Has anyone been in touch with seqan folk to see if they've got it on their roadmap?

@ctb changed the title from "State of streaming in khmer" to "State of streaming sequence IO in khmer" on Nov 12, 2014
@mr-c
Contributor

mr-c commented Nov 12, 2014

Thank you @bocajnotnef for the testing and writeup. A reminder: we have to add the FASTA / FASTQ dimension to the testing matrix.
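One possible way to enumerate the full matrix so no cell gets missed, as a sketch only. The names `PARSERS`, `COMPRESSIONS`, `FORMATS`, and `streaming_matrix` are made up for illustration and are not taken from the khmer test suite; the values just mirror the checklist in the issue description.

```python
from itertools import product

# Dimensions of the streaming test matrix (labels assumed from the checklist).
PARSERS = ("screed", "read_parser")
COMPRESSIONS = ("uncompressed", "gzip", "bzip2")
FORMATS = ("fasta", "fastq")


def streaming_matrix():
    """Return every (parser, compression, format) combination as a list."""
    return list(product(PARSERS, COMPRESSIONS, FORMATS))


if __name__ == "__main__":
    for parser, compression, fmt in streaming_matrix():
        print(parser, compression, fmt)
```

Each tuple could then feed a parametrized test, so adding a dimension only means adding one more sequence to the `product` call.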

@ctb Nope, no contact yet with SeqAn. I guess it is time I wrote a letter of introduction.

@SensibleSalmon
Contributor Author

I cover all the matrix cells in testing, I believe. I shall update the relevant comment.

@ctb
Member

ctb commented Nov 13, 2014

Agreed re letter of intro.

@ctb
Member

ctb commented Jan 18, 2015

Do the checkboxes above need to be updated, @mr-c and @bocajnotnef ?

@mr-c
Contributor

mr-c commented Mar 27, 2015

Checkboxes have been updated.

@ctb
Member

ctb commented May 12, 2015

@bocajnotnef can you update #700 and close this (or explain to me why we shouldn't)? thanks!

@SensibleSalmon
Contributor Author

superseded by #700

@ctb
Member

ctb commented May 31, 2015

@bocajnotnef was there anything to be updated on #700?

@SensibleSalmon
Contributor Author

AFAIK no, it was accurate.
