Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support requests.response.raw being a file-like object #1094

Merged
merged 16 commits into from
Jul 6, 2021

Conversation

IlyaSukhanov
Copy link
Contributor

@IlyaSukhanov IlyaSukhanov commented Jun 19, 2021

Previously httpie relied on requests.models.Response.raw being
urllib3.HTTPResponse. The Requests documentation specifies that
requests.models.Response.raw
is a File-like object but allows for other types for internal use.

This change introduces graceful handling for scenarios when
requests.models.Response.raw is not urllib3.HTTPResponse. In such a scenario
httpie now falls back to extracting metadata from requests.models.Response
directly instead of direct access from protected protected members such as
response.raw._original_response. A side effect in this fallback procedure is
that we can no longer determine HTTP protocol version and report it as ??.

This change is necessary to make it possible to implement TransportPlugins
without having to also needing to emulate internal behavior of urlib3 and
http.client.

This pull requests is an attempt to address #1093

httpie/models.py Outdated Show resolved Hide resolved
httpie/client.py Outdated Show resolved Hide resolved
Copy link
Contributor

@BoboTiG BoboTiG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That seems promising 💪

Let's wait on Jakub for one specific area of the patch.
In the meantime, could you stick with the project quotes guidelines (" -> ')?

httpie/models.py Outdated Show resolved Hide resolved
httpie/models.py Outdated Show resolved Hide resolved
httpie/models.py Outdated Show resolved Hide resolved
httpie/utils.py Outdated Show resolved Hide resolved
@IlyaSukhanov IlyaSukhanov force-pushed the transport_plugin_sans_urlib3 branch 2 times, most recently from db9953e to 2c2cf0d Compare June 22, 2021 12:19
@IlyaSukhanov
Copy link
Contributor Author

IlyaSukhanov commented Jun 22, 2021

Thank you for review and feedback.

Latest changes:

  • fixed quotes for consistency
  • in unknown case http protocol version is set to 1.1
  • removed stray debug comment

@codecov-commenter
Copy link

codecov-commenter commented Jun 22, 2021

Codecov Report

Merging #1094 (4e2e969) into master (7ceb313) will increase coverage by 0.11%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1094      +/-   ##
==========================================
+ Coverage   95.86%   95.98%   +0.11%     
==========================================
  Files          62       64       +2     
  Lines        4040     4107      +67     
==========================================
+ Hits         3873     3942      +69     
+ Misses        167      165       -2     
Impacted Files Coverage Δ
httpie/client.py 100.00% <ø> (+1.39%) ⬆️
httpie/models.py 91.89% <100.00%> (+0.71%) ⬆️
httpie/utils.py 98.03% <100.00%> (+0.26%) ⬆️
tests/test_cookie.py 100.00% <100.00%> (ø)
tests/test_sessions.py 100.00% <100.00%> (ø)
tests/test_transport_plugin.py 100.00% <100.00%> (ø)
tests/test_redirects.py 97.77% <0.00%> (+0.15%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 7ceb313...4e2e969. Read the comment docs.

@BoboTiG BoboTiG linked an issue Jun 23, 2021 that may be closed by this pull request
2 tasks
Makefile Outdated Show resolved Hide resolved
Previously httpie relied on requests.models.Response.raw being
urllib3.HTTPResponse.  The Requests documentation specifies that
(requests.models.Response.raw)[https://docs.python-requests.org/en/master/api/#requests.Response.raw]
is a File-like object but allows for other types for internal use.

This change introduces graceful handling for scenarios when
requests.models.Response.raw is not urllib3.HTTPResponse. In such a scenario
httpie now falls back to extracting metadata from requests.models.Response
directly instead of direct access from protected protected members such as
response.raw._original_response. A side effect in this fallback procedure is
that we can no longer determine HTTP protocol version and report it as `??`.

This change is necessary to make it possible to implement TransportPlugins
without having to also needing to emulate internal behavior of urlib3 and
http.client.
…response.msg._headers

response.cookies was not utilized as it not possible to construct original
payload from http.cookiejar.Cookie. Data is stored in lossy format. For example
Cookie.secure defaults to False so we cannot distinguish if Cookie.secure was
set to False or was not set at all. Same problem applies to other fields also.
@IlyaSukhanov
Copy link
Contributor Author

  • Make file alterations have been reverted.
  • Simplified envelope extraction added.

httpie/models.py Outdated Show resolved Hide resolved
@BoboTiG BoboTiG requested a review from jkbrzt June 24, 2021 12:18
Comment on lines +366 to +369
(
'hello=world; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly, '
'pea=pod; Path=/ab; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly'
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These should now be a lists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cookies were stored as a list in response.raw._original_response.msg._headers however with this change cookies are extracted from response.headers.get('Set-Cookie', '') which is a comma separated string.

get_expired_cookies has been changed from accepting list-of-tuple headers to string of comma separated strings.

I don't see of a way to access more-or-less raw cookie data from requests in any way but parsing them out of the comma separated string without resorting to using protected members. Cookie jar has more formatted cookies but they are stored in a lossy way.

With all this in mind, to keep tests as lists the cookie splitting logic can be moved to outside of get_expired_cookies(), but I'm not sure if this is what you had in mind by your comment.

Thanks
--IAS

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IlyaSukhanov Is it an option to add a test case to ensure response.headers.get('Set-Cookie', '') returns a string with concatenated cookies? It will help us prevent potential regressions for upcoming requests versions and to be more comfortable with those changes too :)

After that, we will be good to merge.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added test to ensure that multiple cookies get extracted correctly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like cicd checks is failing. Is there a tried approach for starting a local http server to test against?

The way it's erroring out it's not clear what the issue is. Few of suspicions I have:

  1. Multiprocess does not start for some reason
  2. Multiprocess does not start fast enough (single threaded machine?)
  3. Flask is unable to open port for some reason (seems unlikely as no error to that effect is presented)

Without knowing what the error is or much experience with GitHub check infrastructure. My only idea is to add a retry functionality in test, to call httpie multiple time until hopefully the background server becomes responsive.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was referring to the incorrect usage of tuples for what by nature is a list.

(  # <= tuple because fixed-length collection where each item is a different thing

    ( # <= list because it’s a simple list of things
        'hello=world; Path=/; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly, '
        'pea=pod; Path=/ab; Expires=Thu, 01-Jan-1970 00:00:00 GMT; HttpOnly'
    ),
    None,
    [  # <= list because it’s a simple list of things
        {'name': 'hello', 'path': '/'},
        {'name': 'pea', 'path': '/ab'}
    ]
),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I see what you mean. But the cookie values are not tuples its a string that is line wrapped, and thus grouped using parenthesis.

Tuple:

(
    'foo',
    'bar'
)

String:

(
    'foo'
    'bar'
)

Note how in this example and in the pr the lines are not separated by comma which is required in tuple definition.

I hope this clears up any confusion.

If project has different style requirement for line wrapping long strings, please let me know and I'll make this change conform to that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@IlyaSukhanov right, I see it now! Yeah, this style of string-wrapping is the preferred one, but in this context, next to a bunch of tuples, it’s easy to misread it (🙋‍♂️), so I’d vote for using a different style here.

CHANGELOG.rst Outdated Show resolved Hide resolved
Copy link
Contributor

@BoboTiG BoboTiG left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to craft the Response without using multiprocessing and flask?

httpie/models.py Outdated Show resolved Hide resolved
@IlyaSukhanov
Copy link
Contributor Author

IlyaSukhanov commented Jul 4, 2021

If the objective is to test internal of requests to ensure that it returns cookies in right format (and does not regress in future), the viable options I can think of are:

  1. Call a local http server and have it respond with desired payload. This is what e57adec does. e57adec makes use of flask and multiprocessing to start the local http server. There are other options besides flask (http.server.HTTPServer) for this task, but it would still need to rely on multiprocess to ensure server and request can run in parallel.. and it would possibly be sort of fragile unless of course there are ready made solutions made for this that are thoroughly tested.
  2. Mock downstream of requests; either urllib3 or socket and pass up a fake payload through requests ensuring request handles it correctly. I suspect this is not a trivial implementation but could be made easier via something like HTTPretty / Mocket

Its also possible to mock requests.Response directly but it would not protect from internal implementation changes in requests down the road.

@BoboTiG
Copy link
Contributor

BoboTiG commented Jul 4, 2021

Maybe something "lighter" like done in https://github.com/getsentry/responses/blob/bdf4c3b1273025338c6a475a556d741552ffa819/responses/test_responses.py#L845 could be interesting?

I would love to see something using "raw" objects (from urllib3/requests/...) to simply tell the response there is several cookies. And then, check they all are processed as expected.

I do not have a strong opinion as soon as tests are green et possibly not too random.

tests/test_cookie.py Outdated Show resolved Hide resolved
tests/test_cookie.py Outdated Show resolved Hide resolved
@BoboTiG
Copy link
Contributor

BoboTiG commented Jul 6, 2021

@IlyaSukhanov I'll handle the cookies test, it is a bonus and you've done so much already 💪

setup.py Outdated Show resolved Hide resolved
tests/test_cookie.py Outdated Show resolved Hide resolved
@BoboTiG
Copy link
Contributor

BoboTiG commented Jul 6, 2021

OK it should be good now. Do you mind having a final look at the test @IlyaSukhanov?

httpie/models.py Outdated Show resolved Hide resolved
httpie/utils.py Outdated Show resolved Hide resolved
@IlyaSukhanov
Copy link
Contributor Author

Elegant!

An observation, and not a problem: The thread is not actively terminated. I'm not sure if pytest kills threads of between each test. Even if it doesn't I cannot think of a scenario where current implementation would become problematic, if more tests of this sort were added each test server would be isolated to its own thread/port. And obviously pytest finishes execution so it must be terminating the thread at least at the completion of all tests.

👍 🎆

@BoboTiG
Copy link
Contributor

BoboTiG commented Jul 6, 2021

Yes, about the zombie thread, it is not an issue right now. And I failed to find a multi-platform solution. It could be improved if we ever need more tests like that.

@BoboTiG BoboTiG merged commit 147a066 into httpie:master Jul 6, 2021
@BoboTiG
Copy link
Contributor

BoboTiG commented Jul 6, 2021

That's awesome, thanks a lot for your work and patience @IlyaSukhanov 🥧 🍾

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unsafe use of protected member response.raw._original_response
4 participants