some bulk sequence loading tests that nail down current ACGTN behavior. #1633
Codecov Report
@@ Coverage Diff @@
## master #1633 +/- ##
=========================================
+ Coverage 69.82% 69.93% +0.1%
=========================================
Files 66 66
Lines 8974 8976 +2
Branches 3060 3062 +2
=========================================
+ Hits 6266 6277 +11
+ Misses 1025 1018 -7
+ Partials 1683 1681 -2
Continue to review full report at Codecov.
Wow, these tests are like a showcase of horrible inconsistency in loading sequences.
tests/test_read_parsers.py
Outdated
kmer = "caggcgcccaccacc".upper()
assert x.get(kmer) == 1

# the 2nd read with this k-mer in it has an N in it; 'consume' will ignore.
I'd rephrase the comment. At the moment I am very puzzled after reading it (ah, it's being ignored, so the count should be one, but it is asserted to be two ...??)
Maybe "consume will ignore the invalid base and continue consuming the read, so this k-mer after the N should have abundance 2"
fixed in fe48787
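To make the reworded comment concrete, here is a minimal pure-Python sketch of the "skip the invalid base and keep consuming" behaviour it describes. This is not khmer's implementation: `count_kmers`, the uppercasing step, and the toy reads are assumptions made for illustration, and the sketch ignores reverse complements.

```python
from collections import Counter

VALID_BASES = set("ACGT")


def count_kmers(reads, k):
    """Count k-mers, skipping any window that contains a non-ACGT base.

    Illustration only: a window with an invalid base is dropped, but the
    rest of the read is still consumed.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()  # assumption: lowercase is normalised first
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if set(kmer) <= VALID_BASES:
                counts[kmer] += 1
    return counts


# The second read has an N near its start; the k-mer after the N is still counted.
reads = ["TTACGG", "ANTACG"]
counts = count_kmers(reads, 4)
assert counts["TACG"] == 2   # seen once in each read
assert counts["TTAC"] == 1   # only in the first read
```

Under these assumptions the k-mer that follows the N ends up with abundance 2, which is what the assertion in the test expects.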
I'd move these tests to a new file maybe. Because it is easy to do, I would parametrise the tests (that make sense) on the class so we test all combinations of Count/Node and table/graph. In which case we should get the parametrisation stuff from
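A rough sketch of what parametrising on the class could look like with a pytest fixture. The two table classes below are hypothetical stand-ins rather than khmer types, and the fixture name `Tabletype` is likewise an assumption.

```python
import pytest


class FakeCounttable:
    """Hypothetical stand-in for a counting table type."""

    def __init__(self, ksize):
        self.ksize = ksize
        self.counts = {}

    def consume(self, seq):
        for i in range(len(seq) - self.ksize + 1):
            kmer = seq[i:i + self.ksize]
            self.counts[kmer] = self.counts.get(kmer, 0) + 1

    def get(self, kmer):
        return self.counts.get(kmer, 0)


class FakeNodetable(FakeCounttable):
    """Hypothetical presence/absence table: counts saturate at 1."""

    def consume(self, seq):
        for i in range(len(seq) - self.ksize + 1):
            self.counts[seq[i:i + self.ksize]] = 1


@pytest.fixture(params=[FakeCounttable, FakeNodetable])
def Tabletype(request):
    # pytest runs each test that requests this fixture once per class
    return request.param


def test_kmer_is_present(Tabletype):
    table = Tabletype(4)
    table.consume("ACGTAC")
    assert table.get("ACGT") >= 1
    assert table.get("GGGG") == 0
```

With this layout a single test body covers every class in `params`, so adding another table or graph type only means extending that list.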
tests/test_read_parsers.py
Outdated
x.output_partitions(infile, savepath)

read_names = [ read.name for read in ReadParser(savepath) ]
The pep8 🚓 doesn't like the extra white space (hence the Travis failure)
fixed 756454e
+1 on merging this before 2.1, then #1590 can update the behaviour and fix these tests (remove the "in the future ..." comments)
Tests moved to
Linux build fails because of a pep8 violation. The OSX build fails because some tests fail and then we exit ungracefully because there are too many open files. Will take a look at the latter.
Ha, locally on OSX all tests pass as well and no "too many open files" warnings.
With the fixture we can explicitly close all the ReadParser which might help with the too many open files error on OSX Travis
Switched to using a fixture for
🎉
This is now ready for review & merge (although #1661 should be merged first :)
(Upon merge, we should switch #1590 over to be against master.)
tests/test_sequence_validation.py
Outdated
savepath = utils.get_temp_filename('foo')

# read this in using "approved good" behavior w/cleaned_seq
x = _Nodegraph(8, PRIMES_1m)
-> graphtype?
LGTM, except for that one comment. Nitpick: if you feel like switching graphtype to Graphtype so that types start with an upper case letter I would ❤️ that.
tests/test_sequence_validation.py
Outdated
return request.param


@pytest.fixture
Now that I've looked this up: can we change this back to yield and change the decorator to @pytest.yield_fixture? Link to the pytest 2.9 docs: http://doc.pytest.org/en/2.9.2/fixture.html#fixture-finalization-executing-teardown-code
done
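For readers following the yield discussion, here is a small sketch of a fixture that hands out parser objects and closes them all in its teardown, which is the idea behind avoiding the "too many open files" failure. `FakeParser`, `tracked_parsers` and `parser_factory` are hypothetical names, not the code in this PR. The sketch uses plain `@pytest.fixture`, which accepts `yield` from pytest 3.0 onward; on pytest 2.9 the decorator would be `@pytest.yield_fixture`, as requested above.

```python
import pytest


class FakeParser:
    """Hypothetical stand-in for a parser that holds an open file handle."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


def tracked_parsers():
    """Yield a factory, then close every parser it created."""
    opened = []

    def make(path):
        parser = FakeParser(path)
        opened.append(parser)
        return parser

    try:
        yield make
    finally:
        # teardown: runs after the test finishes, even if it failed
        for parser in opened:
            parser.close()


@pytest.fixture
def parser_factory():
    # delegate to the plain generator so the teardown logic stays testable
    yield from tracked_parsers()


def test_parser_is_open_during_test(parser_factory):
    parser = parser_factory("reads.fq")
    assert not parser.closed
```

Keeping the teardown in an ordinary generator is a design choice of this sketch: it lets the close-on-exit behaviour be checked without going through pytest's fixture machinery.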
all comments addressed in e419a7b
🎉 |
These tests nail down behavior prior to #1590, which, when merged, will alter how we handle non-ACTG characters. Note, no behavior is changed in this PR; it's just (lots of) new tests.
This explicitly puts in place tests for sequences that contain one of lowercase, Ns, and non-ACGTN characters, for:
- consume_fasta and all other bulk-sequence loading functions on Hashtables and derived classes;
- trim_on_abundance, trim_below_abundance and find_spectral_error_positions
This PR includes #1661.
Adds new test file tests/test_sequence_validation.py and data file tests/test-data/valid-read-testing.fq.
- make test Did it pass the tests?
- make clean diff-cover If it introduces new functionality in scripts/ is it tested?
- make format diff_pylint_report cppcheck doc pydocstyle Is it well formatted?
additions are allowed without a major version increment. Changing file
formats also requires a major version number increment.
documented in CHANGELOG.md? See keepachangelog for more details.
changes were made?
tested for streaming IO?)