Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Resolved issue #577 (mostly); most scripts output filenames written to at end of main #596

Merged
merged 17 commits into from
Sep 26, 2014
Merged

Resolved issue #577 (mostly); most scripts output filenames written to at end of main #596

merged 17 commits into from
Sep 26, 2014

Conversation

SensibleSalmon
Copy link
Contributor

modified:   scripts/abundance-dist-single.py
modified:   scripts/abundance-dist.py
modified:   scripts/count-median.py
modified:   scripts/count-overlap.py
modified:   scripts/extract-long-sequences.py
modified:   scripts/extract-paired-reads.py
modified:   scripts/fastq-to-fasta.py
modified:   scripts/filter-abund-single.py
modified:   scripts/interleave-reads.py
modified:   scripts/load-graph.py
modified:   scripts/load-into-counting.py
modified:   scripts/make-initial-stoptags.py
modified:   scripts/merge-partitions.py
modified:   scripts/partition-graph.py
modified:   scripts/split-paired-reads.py

scripts that WERE updated && notes:

abundance-dist          already does earlier in script
abundance-dist-single       already does earlier in script
count-overlap
extract-long-sequences
extract-paried-reads        explicitly two output files; listed on one line:
                print('wrote to: ' + outfile + '.se' + ' and ' + outfile + '.pe')
fastq-to-fasta          file output optional; either outputs 'wrote to' message or 'did not write output to file'
filter-abund-single     already does earlier in script
interleave-reads        file output optional; either outputs 'wrote to' message or 'did not write output to file'
load-graph
load-into-counting      creates various partway files, messages for them already exists.
                'wrote to: ' message only prints on successful program end to ensure being printed
                at the end of output.
make-initial-stoptags       uncertain how output is written since script uses "htable.save_stop_tags()" instead
                of explicit file write object. 'wrote out: ' message added anyway.
merge-partitions        No API entry exists for this script!
                Already messages earlier in script
partition-graph         threading is confusing; potentially multiple files
split-paired-reads      explicitly two output files; message is:
                "'wrote to: ' + out1 + ' and ' + out2"

scripts that were NOT updated && notes on why:

annotate-partitions         already does, multiple files
count-median            already does
do-partition            already does, multiple files
extract-paritions       already does, multiple files
filter-abund            already does, multiple files
filter-stoptags         already does, multiple files
find-knots          already does (mostly), multiple files, renames some things without outputting message
normalize-by-median     personally uncertain about logic flow && script appears to warn on output anyway
sample-reads-randomly       already does, multiple files

	modified:   scripts/abundance-dist-single.py
	modified:   scripts/abundance-dist.py
	modified:   scripts/count-median.py
	modified:   scripts/count-overlap.py
	modified:   scripts/extract-long-sequences.py
	modified:   scripts/extract-paired-reads.py
	modified:   scripts/fastq-to-fasta.py
	modified:   scripts/filter-abund-single.py
	modified:   scripts/interleave-reads.py
	modified:   scripts/load-graph.py
	modified:   scripts/load-into-counting.py
	modified:   scripts/make-initial-stoptags.py
	modified:   scripts/merge-partitions.py
	modified:   scripts/partition-graph.py
	modified:   scripts/split-paired-reads.py

scripts that WERE updated && notes:

abundance-dist			already does earlier in script
abundance-dist-single		already does earlier in script
count-overlap
extract-long-sequences
extract-paried-reads 		explicitly two output files; listed on one line:
				print('wrote to: ' + outfile + '.se' + ' and ' + outfile + '.pe')
fastq-to-fasta			file output optional; either outputs 'wrote to' message or 'did not write output to file'
filter-abund-single		already does earlier in script
interleave-reads		file output optional; either outputs 'wrote to' message or 'did not write output to file'
load-graph
load-into-counting		creates various partway files, messages for them already exists.
				'wrote to: ' message only prints on successful program end to ensure being printed
				at the end of output.
make-initial-stoptags		uncertain how output is written since script uses "htable.save_stop_tags()" instead
				of explicit file write object. 'wrote out: ' message added anyway.
merge-partitions		No API entry exists for this script!
				Already messages earlier in script
partition-graph			threading is confusing; potentially multiple files
split-paired-reads		explicitly two output files; message is:
				"'wrote to: ' + out1 + ' and ' + out2"

scripts that were NOT updated && notes on why:

annotate-partitions 		already does, multiple files
count-median 			already does
do-partition 			already does, multiple files
extract-paritions		already does, multiple files
filter-abund			already does, multiple files
filter-stoptags			already does, multiple files
find-knots			already does (mostly), multiple files, renames some things without outputting message
normalize-by-median		personally uncertain about logic flow && script appears to warn on output anyway
sample-reads-randomly		already does, multiple files
@ged-jenkins
Copy link

Can one of the admins verify this patch?

1 similar comment
@ged-jenkins
Copy link

Can one of the admins verify this patch?

@mr-c
Copy link
Contributor

mr-c commented Sep 14, 2014

add to testlist

@ctb
Copy link
Member

ctb commented Sep 14, 2014

@bocajnotnef, looks like the tests failed - this is because 'print' goes to stdout by default, not stderr, so the output statements were being put into output files in some cases (in cases where the scripts output to stdout intentionally).

Also, can you change the statements to match the Python 2 style used elsewhere in the scripts?

For example:

 print >>sys.stderr, 'output placed in file', filename

@@ -82,6 +82,6 @@ def main():
if ksize <= len(seq):
medn, ave, stdev = htable.get_median_count(seq)
print >> output, record.name, medn, ave, stdev, len(seq)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unintentional change in line spacing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed. Odd. Didn't know that happened.

@SensibleSalmon
Copy link
Contributor Author

Acknowledging all. Will work on this evening.

@SensibleSalmon
Copy link
Contributor Author

Should I match python 2 convention even where the majority of the script adheres to python 3? i.e. abundance-dist.py

@ctb
Copy link
Member

ctb commented Sep 15, 2014

On Mon, Sep 15, 2014 at 01:45:06PM -0700, bocajnotnef wrote:

Should I match python 2 convention even where the majority of the script adheres to python 3? i.e. abundance-dist.py

yes

--t

@ctb
Copy link
Member

ctb commented Sep 15, 2014

On Mon, Sep 15, 2014 at 01:45:06PM -0700, bocajnotnef wrote:

Should I match python 2 convention even where the majority of the script adheres to python 3? i.e. abundance-dist.py

I guess "match the script" is OK. I'll take that into account when
reviewing.

@ctb
Copy link
Member

ctb commented Sep 16, 2014

Could you fix the test failures, please?

@ctb
Copy link
Member

ctb commented Sep 16, 2014

@mr-c
Copy link
Contributor

mr-c commented Sep 16, 2014

@ctb The latest getting started guide is at
http://khmer.readthedocs.org/en/latest/dev/getting-started.html
On Sep 16, 2014 10:01 AM, "C. Titus Brown" notifications@github.com wrote:

See development workflow here:
http://khmer.readthedocs.org/en/docs-hackathon/dev/getting-started.html


Reply to this email directly or view it on GitHub
#596 (comment).

@ctb
Copy link
Member

ctb commented Sep 16, 2014

On Tue, Sep 16, 2014 at 07:19:36AM -0700, Michael R. Crusoe wrote:

@ctb The latest getting started guide is at
http://khmer.readthedocs.org/en/latest/dev/getting-started.html

Yep, but that's not necessarily a stable URL. => release, I think.

On Sep 16, 2014 10:01 AM, "C. Titus Brown" notifications@github.com wrote:

See development workflow here:
http://khmer.readthedocs.org/en/docs-hackathon/dev/getting-started.html

???
Reply to this email directly or view it on GitHub
#596 (comment).


Reply to this email directly or view it on GitHub:

#596 (comment)

C. Titus Brown, ctb@msu.edu

@SensibleSalmon
Copy link
Contributor Author

New issue. interleave-reads.py doesn't properly detect if an output file argument is given (in order to tell if it's writing to file or writing to stdout). Obviously the argparser handles all of this but how can I tell from within the main function if a specific argument has been given?

@luizirber
Copy link
Member

test this please

@SensibleSalmon
Copy link
Contributor Author

"new issue" resolved, thanks to Mr C. New question @mr-c: Should I make the test for make-initial-stoptags before finishing this pull?

@mr-c
Copy link
Contributor

mr-c commented Sep 18, 2014

@bocajnotnef Did you paste in the checklist yet?

Yes, item 3 in the checklist requires it

@@ -102,6 +102,7 @@ def main():

if sofar == total:
break
print('wrote to: ' + args.output_histogram_filename)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

stderr as well (pretend I say this for every print statement in your diff :-)

@mr-c
Copy link
Contributor

mr-c commented Sep 22, 2014

@@ -38,6 +38,8 @@ def main():
args = get_parser().parse_args()
print >> sys.stderr, ('fastq from ', args.input_sequence)

write_out = sys.stdout
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default values should go in the argparse definition (https://github.com/bocajnotnef/khmer/blob/lowhangingfruit/script-output-file-listing/scripts/fastq-to-fasta.py#L27)

While you are at it you can simplify the code by eliminating the test for args.output and always using write_out

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

unsure of how to set the write_out default value; telling the default value of the -o argument to be sys.stdout just breaks everything. Granted, I was working off of the argparser from interleave-reads (in which there isn't a write_out variable, so...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Breakage is expected as it requires a small refactoring. Try pair coding with a teammate or with me if you'd like some assistance.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it! Just had to go through interleave-reads for a bit 'cause it's already implemented there.

@mr-c mr-c added this to the 1.2+ milestone Sep 22, 2014
Updated output file listing to write to sys.stderr (imported sys as well)
Generated files for testing
Changed output listing print statements to print to sys.stderr
@mr-c
Copy link
Contributor

mr-c commented Sep 23, 2014

Yay for tests! Look all that glorious coverage for make-initial-stopgaps.py!

@@ -103,6 +103,8 @@ def main():
if sofar == total:
break

sys.stderr.write("this is some string")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

gorramit. I knew there was something in that commit that wasn't supposed to get comitted.

Updated interleave-reads output listing to point to the proper place

refactored load-graph output listing to drop an else clause
(may have blatantly copied from mr-c here)

removed debug string from abundance-dist that somehow made it into
the last commit

refactored make-initial-stoptags test generate test data from
load-graph.py to reduce payload size of test_data directory
note: this assumes load-graph will work
@SensibleSalmon
Copy link
Contributor Author

restest this please

refactoring make-initial-stoptags test

Fixed pep8 formatting issues

Fixed argparse documentation formatting issues in fasta-to-fastq.py
* scripts/{fastq-to-fasta, interleave-reads}.py:
added output file listing sensitive to optional -o argument
* tests/test_scripts.py: added test for scripts/make-initial-stoptags.py
* tests/test-data/test-reads.{info,pt,stoptags,tagset}: test data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete this and the following line

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Delete the entirety of the changelog entry?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait. Assuming test-data stuff. Unsure of why that wasn't highligted in the line comment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are line comments, so just line 13 (and 14 by reference)

You no longer are including these files so don't mention them in the ChangeLog.

mr-c added a commit that referenced this pull request Sep 26, 2014
…t-file-listing

Resolved issue #577 (mostly); most scripts output filenames written to at end of main
@mr-c mr-c merged commit d1e2003 into dib-lab:master Sep 26, 2014
@mr-c
Copy link
Contributor

mr-c commented Sep 26, 2014

@bocajnotnef Congratulations on your first commit to the khmer project! Your name will be included in the release notes for the next version and you'll be listed amongst our other contributors in the next software release paper.

@mr-c mr-c modified the milestones: 1.3+, 1.3 Dec 19, 2014
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants