Re-organize tests to avoid shell processes (branch: feature/reorg_tests) #47
Conversation
Tagging @amontalenti @dfdeshom @talentless @mlaprise @msukmanowsky @dan-blanchard looking for input on testing. Don't want this lingering, so if other people agree with the ideas I'll knock this out quickly and move on to other things.
@@ -29,6 +29,8 @@
 # we'll redirect stdout, but we'll save original for communication to Storm
 _stdout = sys.stdout
 _stderr = sys.stderr
+# this way we don't have to mock sys.stdin for testing, which lets us use pdb
+_stdin = sys.stdin
This is actually how the original developer of the Perl library that I've recently taken over (and am reworking to look like Streamparse), IO::Storm, set up their unit tests. It seems like it makes debugging a lot simpler. 👍
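For illustration, here is a minimal sketch of the pattern being discussed (the helper name and message format are hypothetical, not streamparse's actual API): the component reads from a module-level `_stdin` reference rather than `sys.stdin`, so a test can swap in a fake stream while pdb still has the real `sys.stdin`.

```python
import sys
from io import BytesIO

# Module-level reference; a test replaces this instead of patching
# sys.stdin, so pdb (which reads from the real sys.stdin) keeps working.
_stdin = sys.stdin


def read_command():
    # Hypothetical helper: read one newline-terminated message from _stdin.
    return _stdin.readline().strip()


if __name__ == "__main__":
    # Simulate what a test does: swap the module-level reference for a fake
    # stream and check that read_command() sees the canned input.
    _stdin = BytesIO(b'{"command": "next"}\n')
    print(read_command())  # b'{"command": "next"}'
```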
👍 This looks great. IMHO, it's definitely the right direction to go for testing.
+1 to not using shell processes to mock Storm. Looks good.
@kbourgoin The tests here are intended to replace those in
Wow. I had forgotten this even existed. Yeah, that was the goal.
Okay, cool. I definitely prefer the mock approach, so I'll put the finishing touches on this.
# Make sure the exception gets from the worker thread to the main
for i in range(1):
    self.bolt._run()
    self.assertRaises(NotImplementedError, lambda: time.sleep(0.1))
@kbourgoin Why would this work? I can't find anything in the Python docs saying that time.sleep() would ever throw a NotImplementedError.
@dan-blanchard note that this method doesn't run when calling unittest.main() on the file. If you rename it to test_exception_handling(), it will run and fail.
@dfdeshom Thanks for pointing that out! I'm still confused by it though, since later in the file there are a couple lines like:

try:
    time.sleep(0.1)
except NotImplementedError:
    pass

which again imply that somehow time.sleep() is going to raise a NotImplementedError, which is strange.
It's been a while since I wrote this, but the except is there because the exception is thrown in a background thread. These look like tests for the BatchingBolt, which has a background thread to read and write tuples.
I'm not 100% sure what this thread did, or why I'm even testing for NotImplementedError. At a guess, it was for meaningless test coverage. Other tests with this could be more legit, however.
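For readers puzzled by the time.sleep() trick above, here is a minimal, self-contained sketch (not streamparse's actual code) of how an exception raised in a background thread can appear to come out of a blocking call in the main thread: the worker saves sys.exc_info() and signals the process, and the main thread's signal handler re-raises the exception, interrupting whatever the main thread happens to be doing, such as sleeping. This relies on SIGUSR1, so it is Unix-only.

```python
import os
import signal
import sys
import threading
import time

exc_info = None


def _handle_worker_exception(signum, frame):
    # Runs in the main thread; re-raise the exception the worker saved.
    raise exc_info[1]


signal.signal(signal.SIGUSR1, _handle_worker_exception)


def worker():
    global exc_info
    try:
        raise NotImplementedError("failure in background thread")
    except Exception:
        exc_info = sys.exc_info()
        os.kill(os.getpid(), signal.SIGUSR1)  # poke the main thread


threading.Thread(target=worker).start()
try:
    time.sleep(0.1)  # interrupted by the re-raised exception
except NotImplementedError:
    print("caught the worker's exception in the main thread")
```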
…ckage. This will eventually turn into the separate "pystorm" package we plan to release. For now, the unit tests aren't passing, so some things have to be tweaked. The main issue is that some of the mocking is no longer necessary, because you can pass input_stream and output_stream parameters to the Component initializer for testing. They just have to be lists of bytes.
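A rough sketch of the testing pattern that commit describes follows; the class and method here are hypothetical stand-ins, since the real Component initializer in streamparse/pystorm may consume the streams differently.

```python
# Hypothetical stand-in for the Component initializer the commit mentions;
# the real streamparse/pystorm signature and behavior may differ.
class Component(object):
    def __init__(self, input_stream=None, output_stream=None):
        # For tests these can simply be lists of bytes, as the commit notes.
        self.input_stream = input_stream if input_stream is not None else []
        self.output_stream = output_stream if output_stream is not None else []

    def run_once(self):
        # Consume one "message" from the input and write a reply to the output.
        message = self.input_stream.pop(0)
        self.output_stream.append(b"ack: " + message)


# A test can then drive the component entirely in memory:
component = Component(input_stream=[b'{"command": "next"}'])
component.run_once()
assert component.output_stream == [b'ack: {"command": "next"}']
```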
@dan-blanchard I'll take a look Monday. My brain is too far away from this to do a review today.
…in the future. Attempted to make _handle_worker_exception mock work.
# Calculate coverage on success
after_success:
    - coveralls --config_file .coveragerc
I've been deleting them, but we're actually getting duplicate coveralls comments because it's commenting for every single Travis job that succeeds (i.e., 2.7, 3.3, 3.4, and pypy), so we should probably just pick a version and have coveralls only get called for that version of Python.
My vote is either 2.7 or 3.4. If we want to live in the future, we can just do 3.4. The coverage information ought to be the same for them all anyway, no?
3.4 sounds good to me. The only reason they would be different is if we had different code paths based on the Python version, but we don't have much of that. For SKLL, the 2.7 and 3.x versions do a couple of things rather differently, so we record coverage for both, duplicate comments be damned.
Okay, I just fixed that, and once the build succeeds, I'll merge this.
 self.exc_info = None
-signal.signal(signal.SIGINT, self._handle_worker_exception)
+signal.signal(signal.SIGUSR1, self._handle_worker_exception)
Why the change from SIGINT to SIGUSR1?
I was running tests and got irritated that CTRL-C would cause an exception in _handle_exception, because self.exc_info wouldn't be set. It made me realize that what we really wanted was a signal that didn't conflict with anything else.
Yeah, that makes a lot of sense. I'm cool with that. There's still the fact that a BatchingBolt has to exist on the main thread, or signal handlers don't attach properly, but I guess that hasn't been an issue for anyone yet.
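The main-thread caveat is a CPython restriction: signal.signal() raises ValueError when called from any thread other than the main one, so a component that registers handlers in its constructor has to be built on the main thread. A small standalone demonstration, independent of streamparse:

```python
import signal
import threading


def handler(signum, frame):
    pass


def install_handler():
    try:
        signal.signal(signal.SIGUSR1, handler)
        print("handler installed")
    except ValueError as exc:
        # CPython refuses to install signal handlers outside the main thread.
        print("failed:", exc)


install_handler()                              # main thread: handler installed
t = threading.Thread(target=install_handler)   # worker thread: fails
t.start()
t.join()
```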
It has been a very long time since I started down this path, but this all looks good to me. Although, for a testing rewrite branch, we should probably get Travis green before we merge. :)
It was before I merged in master, which is weird, because the changes there were really small. I'll figure out why.
Re-organize tests to avoid shell processes (branch: feature/reorg_tests)
I'm opening this for a review of the approach, not because all the tests have been reorganized. If there's support for the ideas in here, then I'll finish it up and replace our current tests with this.
Overview
Right now, our tests are a bit brittle, the barrier to entry for adding new tests is too high, and debugging failing tests or gathering code coverage statistics is fairly difficult. This is by no means to say the current tests are bad: they work, they test things, and they aren't too hard to add new things to. I did this because I want adding new tests to be stupid easy. Anything that makes new tests easier to write and debug means that more and better tests will get written.
To do this, the big thing I wanted to do was remove the subprocess calls entirely. While they look like Storm, they aren't Storm, and they add a good bit of complication to the setup. For one, they make using pdb to debug a test impossible, and the break in process space makes code coverage harder to collect.
Finally, we didn't have true integration tests. True integration tests would integrate with Storm and verify that communication was successful; instead, our tests integrated with a mock that we created to mimic Storm. Rather than using a subprocess to mock Storm, I'm opting for the mock library to mock individual components. The mock library is a common Python testing tool that any contributor walking in will know. Using a common tool like that removes the barrier of learning how our subprocess Storm mock works, which makes writing tests that much easier.
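As a hypothetical example of the approach (not a test from this branch), patching a bolt's emit method with mock lets a test assert on exactly what would have been sent to Storm, with no subprocess involved:

```python
from unittest import mock  # on Python 2 this was the separate `mock` package


# Hypothetical bolt used only for illustration; not code from this branch.
class WordCountBolt(object):
    def __init__(self):
        self.counts = {}

    def emit(self, values):
        raise NotImplementedError("normally writes to Storm over stdout")

    def process(self, tup):
        word = tup[0]
        self.counts[word] = self.counts.get(word, 0) + 1
        self.emit([word, self.counts[word]])


def test_process_emits_count():
    bolt = WordCountBolt()
    # Patch emit so nothing tries to talk to Storm; just record the call.
    with mock.patch.object(bolt, "emit") as mock_emit:
        bolt.process(["hello"])
        mock_emit.assert_called_once_with(["hello", 1])


test_process_emits_count()
```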
Disclaimer: I've no idea how this will play with python3. Testing that will come later.
Benefits
Next steps
If other people agree this is a good approach, I'll finish test coverage for the current streamparse version. @msukmanowsky has a large pull request coming in, so I'll wait until that's resolved and write all the necessary testing for what's there. Once that's complete, I'll hook up coveralls for streamparse to get code coverage stats. I don't know how useful they'll be, but I'm curious about it and its integration with Travis-CI.