Resolved issue #577 (mostly); most scripts output filenames written to at end of main #596

SensibleSalmon · 2014-09-12T16:43:29Z

modified:   scripts/abundance-dist-single.py
modified:   scripts/abundance-dist.py
modified:   scripts/count-median.py
modified:   scripts/count-overlap.py
modified:   scripts/extract-long-sequences.py
modified:   scripts/extract-paired-reads.py
modified:   scripts/fastq-to-fasta.py
modified:   scripts/filter-abund-single.py
modified:   scripts/interleave-reads.py
modified:   scripts/load-graph.py
modified:   scripts/load-into-counting.py
modified:   scripts/make-initial-stoptags.py
modified:   scripts/merge-partitions.py
modified:   scripts/partition-graph.py
modified:   scripts/split-paired-reads.py

scripts that WERE updated && notes:

abundance-dist          already does earlier in script
abundance-dist-single       already does earlier in script
count-overlap
extract-long-sequences
extract-paired-reads        explicitly two output files; listed on one line:
                print('wrote to: ' + outfile + '.se' + ' and ' + outfile + '.pe')
fastq-to-fasta          file output optional; either outputs 'wrote to' message or 'did not write output to file'
filter-abund-single     already does earlier in script
interleave-reads        file output optional; either outputs 'wrote to' message or 'did not write output to file'
load-graph
load-into-counting      creates various partway files, messages for them already exist.
                'wrote to: ' message only prints on successful program end to ensure being printed
                at the end of output.
make-initial-stoptags       uncertain how output is written since script uses "htable.save_stop_tags()" instead
                of explicit file write object. 'wrote out: ' message added anyway.
merge-partitions        No API entry exists for this script!
                Already messages earlier in script
partition-graph         threading is confusing; potentially multiple files
split-paired-reads      explicitly two output files; message is:
                "'wrote to: ' + out1 + ' and ' + out2"

scripts that were NOT updated && notes on why:

annotate-partitions         already does, multiple files
count-median            already does
do-partition            already does, multiple files
extract-partitions      already does, multiple files
filter-abund            already does, multiple files
filter-stoptags         already does, multiple files
find-knots          already does (mostly), multiple files, renames some things without outputting message
normalize-by-median     personally uncertain about logic flow && script appears to warn on output anyway
sample-reads-randomly       already does, multiple files
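
A minimal sketch of the message shapes listed in the notes above (one file, two files joined with ' and ', or no file at all). The helper name `wrote_message` and the sample basename are hypothetical and not part of khmer; the scripts in this PR inline their messages instead. The helper only builds the string, so the caller still chooses which stream to print it on.

```python
def wrote_message(*filenames):
    """Build the end-of-main message for zero, one, or several output files.

    Hypothetical helper for illustration only.
    """
    if not filenames:
        # optional-output scripts: nothing was written to a file
        return 'did not write output to file'
    return 'wrote to: ' + ' and '.join(filenames)


outfile = 'reads.fq'  # made-up basename
print(wrote_message(outfile + '.se', outfile + '.pe'))
```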


ged-jenkins · 2014-09-12T16:43:31Z

Can one of the admins verify this patch?

mr-c · 2014-09-14T00:11:36Z

add to testlist

ctb · 2014-09-14T11:17:02Z

@bocajnotnef, looks like the tests failed - this is because 'print' goes to stdout by default, not stderr, so the output statements were being put into output files in some cases (in cases where the scripts output to stdout intentionally).

Also, can you change the statements to match the Python 2 style used elsewhere in the scripts?

For example:

 print >>sys.stderr, 'output placed in file', filename
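
To illustrate why the stream matters, here is a small sketch using the Python 3 spelling `print(..., file=sys.stderr)`, which is the equivalent of the Python 2 statement above. The `emit` function and the sample records are made up for the example; it captures both streams to show that the status line stays out of the data.

```python
import contextlib
import io
import sys


def emit(records, filename=None):
    """Write data to stdout and the status line to stderr."""
    for record in records:
        print(record)  # data: may be redirected into an output file
    # status: kept off stdout so it never lands in redirected data
    print('output placed in file', filename or '(stdout)', file=sys.stderr)


# Capture both streams separately to show that they do not mix.
data, status = io.StringIO(), io.StringIO()
with contextlib.redirect_stdout(data), contextlib.redirect_stderr(status):
    emit(['>read1', 'ACGT'])
```

With a plain `print(...)` for the status line, that text would end up in `data` alongside the sequence records, which is the failure mode described above.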

ctb · 2014-09-14T11:17:41Z

scripts/count-median.py

@@ -82,6 +82,6 @@ def main():
        if ksize <= len(seq):
            medn, ave, stdev = htable.get_median_count(seq)
            print >> output, record.name, medn, ave, stdev, len(seq)
-
+    


unintentional change in line spacing?

Indeed. Odd. Didn't know that happened.

SensibleSalmon · 2014-09-14T16:58:41Z

Acknowledging all. Will work on this evening.

SensibleSalmon · 2014-09-15T20:45:06Z

Should I match python 2 convention even where the majority of the script adheres to python 3? i.e. abundance-dist.py

ctb · 2014-09-15T20:46:22Z

On Mon, Sep 15, 2014 at 01:45:06PM -0700, bocajnotnef wrote:

Should I match python 2 convention even where the majority of the script adheres to python 3? i.e. abundance-dist.py

yes

--t

ctb · 2014-09-15T20:55:19Z

On Mon, Sep 15, 2014 at 01:45:06PM -0700, bocajnotnef wrote:

Should I match python 2 convention even where the majority of the script adheres to python 3? i.e. abundance-dist.py

I guess "match the script" is OK. I'll take that into account when
reviewing.

ctb · 2014-09-16T13:50:25Z

Could you fix the test failures, please?

ctb · 2014-09-16T14:01:12Z

See development workflow here: http://khmer.readthedocs.org/en/docs-hackathon/dev/getting-started.html

mr-c · 2014-09-16T14:19:36Z

@ctb The latest getting started guide is at
http://khmer.readthedocs.org/en/latest/dev/getting-started.html

ctb · 2014-09-16T14:21:11Z

On Tue, Sep 16, 2014 at 07:19:36AM -0700, Michael R. Crusoe wrote:

@ctb The latest getting started guide is at
http://khmer.readthedocs.org/en/latest/dev/getting-started.html

Yep, but that's not necessarily a stable URL. => release, I think.


SensibleSalmon · 2014-09-18T14:10:14Z

New issue. interleave-reads.py doesn't properly detect if an output file argument is given (in order to tell if it's writing to file or writing to stdout). Obviously the argparser handles all of this but how can I tell from within the main function if a specific argument has been given?
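
The thread does not record which fix was eventually used. One common approach, sketched here with a hypothetical `-o/--output` option and made-up file names, is to give the option a default of `None` and test for it in the main function:

```python
import argparse


def get_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('infiles', nargs='+')
    # default=None lets the caller distinguish "not given" from a real value
    parser.add_argument('-o', '--output', default=None)
    return parser


def describe_output(argv):
    args = get_parser().parse_args(argv)
    if args.output is None:
        return 'writing to stdout'
    return 'writing to file ' + args.output
```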

luizirber · 2014-09-18T14:49:10Z

test this please

SensibleSalmon · 2014-09-18T16:56:43Z

"new issue" resolved, thanks to Mr C. New question @mr-c: Should I make the test for make-initial-stoptags before finishing this pull?

mr-c · 2014-09-18T17:03:17Z

@bocajnotnef Did you paste in the checklist yet?

Yes, item 3 in the checklist requires it

mr-c · 2014-09-22T22:13:45Z

scripts/abundance-dist.py

@@ -102,6 +102,7 @@ def main():

        if sofar == total:
            break
+    print('wrote to: ' + args.output_histogram_filename)


stderr as well (pretend I say this for every print statement in your diff :-)

mr-c · 2014-09-22T22:15:44Z

there is no scripts/cake.txt please remove https://github.com/bocajnotnef/khmer/blob/lowhangingfruit/script-output-file-listing/scripts/cake.txt

mr-c · 2014-09-22T22:19:15Z

scripts/fastq-to-fasta.py

@@ -38,6 +38,8 @@ def main():
    args = get_parser().parse_args()
    print >> sys.stderr, ('fastq from ', args.input_sequence)

+    write_out = sys.stdout


Default values should go in the argparse definition (https://github.com/bocajnotnef/khmer/blob/lowhangingfruit/script-output-file-listing/scripts/fastq-to-fasta.py#L27)

While you are at it you can simplify the code by eliminating the test for args.output and always using write_out

unsure of how to set the write_out default value; telling the default value of the -o argument to be sys.stdout just breaks everything. Granted, I was working off of the argparser from interleave-reads (in which there isn't a write_out variable, so...)

Breakage is expected as it requires a small refactoring. Try pair coding with a teammate or with me if you'd like some assistance.

Got it! Just had to go through interleave-reads for a bit 'cause it's already implemented there.
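
For readers following along, a rough sketch of what "default in the argparse definition" can look like, written in Python 3 and assuming a `-o/--output` option; this is not the code that was committed. `argparse.FileType('w')` opens the named file, and a non-string default such as `sys.stdout` is used as-is, so `write_out` is always a writable file object and the separate test for `args.output` goes away.

```python
import argparse
import sys


def get_parser():
    parser = argparse.ArgumentParser()
    parser.add_argument('input_sequence')
    # FileType('w') opens the named file; with no -o the default applies
    parser.add_argument('-o', '--output', type=argparse.FileType('w'),
                        default=sys.stdout)
    return parser


def main(argv):
    args = get_parser().parse_args(argv)
    write_out = args.output  # always a writable file object
    write_out.write('>read1\nACGT\n')  # placeholder record for illustration
    name = getattr(write_out, 'name', '<stdout>')
    print('wrote to: ' + name, file=sys.stderr)
    return write_out
```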

mr-c · 2014-09-23T22:55:41Z

Yay for tests! Look at all that glorious coverage for make-initial-stoptags.py!

mr-c · 2014-09-23T22:57:26Z

scripts/abundance-dist.py

@@ -103,6 +103,8 @@ def main():
        if sofar == total:
            break

+    sys.stderr.write("this is some string")


gorramit. I knew there was something in that commit that wasn't supposed to get committed.

Updated interleave-reads output listing to point to the proper place.
Refactored load-graph output listing to drop an else clause (may have blatantly copied from mr-c here).
Removed debug string from abundance-dist that somehow made it into the last commit.
Refactored make-initial-stoptags test to generate test data from load-graph.py, reducing the payload size of the test-data directory. Note: this assumes load-graph will work.

SensibleSalmon · 2014-09-26T16:26:58Z

retest this please

mr-c · 2014-09-26T18:58:29Z

ChangeLog

+    * scripts/{fastq-to-fasta, interleave-reads}.py: 
+    added output file listing sensitive to optional -o argument
+    * tests/test_scripts.py: added test for scripts/make-initial-stoptags.py
+    * tests/test-data/test-reads.{info,pt,stoptags,tagset}: test data


Delete this and the following line

Delete the entirety of the changelog entry?

Wait. Assuming test-data stuff. Unsure of why that wasn't highlighted in the line comment.

These are line comments, so just line 13 (and 14 by reference)

You no longer are including these files so don't mention them in the ChangeLog.

mr-c · 2014-09-26T20:03:57Z

@bocajnotnef Congratulations on your first commit to the khmer project! Your name will be included in the release notes for the next version and you'll be listed amongst our other contributors in the next software release paper.

SensibleSalmon mentioned this pull request Sep 12, 2014

Code in scripts/ should explicitly state output file names #577

Closed

ctb reviewed Sep 14, 2014

SensibleSalmon added 5 commits September 16, 2014 11:05

refactored scripts/fastq-to-fasta.py as per CTB instructions; set wri…

36f93a0

…te_out to sys.stdout by default and cleaned up output file listing

Removed redundant file listings. Adjusted spacing.

3a26325

fixed broken interleave-reads.py testing; now detects if outputting t…

0745d00

…o file or stdout and lists as such. fixed induced brokenness to fastq-to-fasta due to inattention in editor management.

Adjusted formatting to comply to pep8 standards.

6399078

Updated changelog to reflect earlier changes

5df0971

Attempting to resolve optional file issue; primarily commiting to get…

09ac168

… Jenkins to test.

Fixed interleave-reads.py's optional file issue.

602f1f5

mr-c reviewed Sep 22, 2014

mr-c added this to the 1.2+ milestone Sep 22, 2014

SensibleSalmon added 2 commits September 23, 2014 11:00

Added test for scripts/make-initial-stoptags.py

c15adca

Updated output file listing to write to sys.stderr (imported sys as well) Generated files for testing

Fixed formatting errors

1e64d8e

Changed output listing print statements to print to sys.stderr

mr-c reviewed Sep 23, 2014

removed some extraneous files in the test-data directory after

b455eb2

refactoring make-initial-stoptags test. Fixed pep8 formatting issues. Fixed argparse documentation formatting issues in fastq-to-fasta.py.

mr-c reviewed Sep 26, 2014

SensibleSalmon added 2 commits September 26, 2014 15:51

Refactored things that slipped through the cracks in fastq-to-fasta.py

cb213f9

updated changelog entry to no longer include nonexistent files.

edb22da

mr-c added a commit that referenced this pull request Sep 26, 2014

Merge pull request #596 from bocajnotnef/lowhangingfruit/script-outpu…

d1e2003

…t-file-listing Resolved issue #577 (mostly); most scripts output filenames written to at end of main

mr-c merged commit d1e2003 into dib-lab:master Sep 26, 2014

SensibleSalmon mentioned this pull request Sep 30, 2014

make-initial-stoptags has no script level tests #392

Closed

mr-c modified the milestones: 1.3+, 1.3 Dec 19, 2014
