-
Notifications
You must be signed in to change notification settings - Fork 270
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Delegate port generation for the tests to the OS #1192
Delegate port generation for the tests to the OS #1192
Conversation
Remove the test with mode 2 ('mode_2': During the download process, the server blocks the download by sending just several characters every few seconds.) from test_slow_retrieval. This test is marked as "expected failure" with the purpose of rewriting it one day, but slow retrievals have been removed from the specification and soon it will be removed from the tuf reference implementation as a whole. That means that the chances of making this test useful are close to 0 if not none. The other test (with mode 1) in test_slow_retrieval is not removed. For reference: - theupdateframework/specification#111 - theupdateframework#1156 Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
By giving 0 to the port argument we ask the OS to give us an arbitrary unused port. This method is much better than if we generate a port, because even though we are generating a random port there is always the chance that the port could be already in use. In this commit, I also make sure that there is no place in the tests where we are manually generating and passing ports. Also, because none of the tests pass a port to the TestServerProcess class and its __init__ function I have removed the "port" argument. Finally, until now slow_retrieval.py couldn't use the TestServerProcess class from utils.py port generation because we were using httpd.handle_request() which handles only ONE request. Then, what happened was that when we use wait_for_server() to make a test connection and verify that the server is up, the slow_retrieval server handles that connection (which it accepts as a request) and exits. We avoided that use-case by passing timeout = 0 and avoiding calling wait_for_server() on this special value. Now, when we use httpd.serve_forever() this problem is resolved and no longer we need to make those checks. Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Now, after we can use wait_for_server and the retry mechanism of TestServerProcess in utils.py we no longer need to use sleep in this test file. Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
We want to make sure that server are successfully started in the common use cases and that the new port generation works. Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
I had some discussions with @jku should we add tests for the One solution I can think of is using the default timeout and moving these tests to a different dir which will be executed only by |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this! The only hack here is parsing server logs to get the port but I believe that's a decent price to pay (and arguably less of a hack than retrying on bind() failure). Tests should be more reliable and faster because of this -- and not just because one very slow test was removed.
Only real question is why _start_server()
still has two loops: I thought the point was that the outer loop was replaced by letting bind() pick a port...
Yes, I was thinking that if we have a problem starting the server we can try again if we haven't finished the timeout? |
Yes: we have no idea why it might have failed so no reason to believe it would be fixed by retrying. |
Oh one more thing about the new tests: Apart from the clean-test I think everything else is already tested by the current tests. The new tests are obviously more "unit-testy" since they only test a single thing ... but still: is there value in adding tests that are basically subsets of existing tests? |
They provide us a way to clearly see when we have a bug while starting any of the subprocesses running a server script, which is not always easy to be seen given that most of those server subprocesses are initialized in The tests on my virtual machine are run for around |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting close... LGTM except for the loop removal that seems incorrect (indent change needed for the elapsed time calculation)
d260a70
to
de8b896
Compare
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
de8b896
to
097c78b
Compare
I realized that now when we had removed the outer loop I can add tests that check the case when a server has exited before the timeout has expired. |
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
As discussed with Jussi, using try and except blocks when instantiating the servers don't bring a lot of value (the "bind failed") message could be useful, but there are other messages explaining almost the same thing which will be logged from the father process as well. Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Man from Del Monte says yes 👍
Signed-off-by: Martin Vrachev <mvrachev@vmware.com>
@@ -37,6 +37,7 @@ | |||
try: | |||
# is defined in Python 3 | |||
TimeoutError | |||
ChildProcessError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will not work:
- if TimeoutError fails, you define class ChildProcessError?
- the code that uses this is in another file and uses utils.ChildProcessError -- that won't be defined in python 3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to just always define an error of your own since ChildProcessError is not in python2 (it's not 100% correct even in python3): TestServerError
or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both TimeoutError
and ChildProcessError
are available on python 3.X and not available on python 2.X, so if TimeoutError
is not found then it makes sense to define both of them.
the code that uses this is in another file and uses utils.ChildProcessError -- that won't be defined in python 3
for that you are right. I was mistaken in my assumption.
I think that ChildProcessError is a suitable name for that error.
I just think that in test_utils.py
I will fix that error with:
child_process_error = None
if sys.version_info.major == 2:
child_process_error = utils.ChildProcessError
else:
child_process_error = ChildProcessError
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Both
TimeoutError
andChildProcessError
are available on python 3.X and not available on python 2.X, so ifTimeoutError
is not found then it makes sense to define both of them.
no. You can either decide things based on python version or you can decide things based on feature testing. Using feature testing as if it is python version testing is wrong even if it happens to work right now.
Since the failures are now hidden in the UI: Earlier test runs (e.g. https://ci.appveyor.com/project/theupdateframework/tuf/builds/36033643/job/0pm05aeiihcn9nad for commit 9e53c4a) had error like this:
That's worrying and needs to be investigated. |
After the findings made by @jku and described here #1196 That's why it makes sense to close this pr and open another pr which uses this new method. |
Fixes #1124, Fixes #1111
Description of the changes being introduced by the pull request:
This pr aims to make the creation of new server subprocesses more stable.
It fixes issues related to the usage of ports already in use like
[Errno 98] Address already in use
orConnectionRefusedError: [Errno 111]
By using
0
as a port argument when we start the servers we ask the OS to give usan arbitrary unused port.
This method is much better than if we generate a port, because
instead of us generating a port, verifying that it's not busy, and use a retry mechanism (if the port is busy)
we can simply rely on the OS to do that job for us.
This means that the server scripts receive their ports when they are created and then send it back to the
father process which had created them.
Other changes/cleanups in this pr include
tests/utils.py
This pr is related to the closed #1169 pr but because
I decided to use a whole different approach I thought it made more sense to close that one and create this one.
PS: Great thanks to @jku with whom we had discussions about this pr on regular occasions!
Please verify and check that the pull request fulfills the following
requirements: