A more memory-efficient HttpResponseProvider
#95
Comments
Hey @BurnzZ! I'm not sure it's going to be an issue. The largest field in HttpResponse is Though maybe you're right, as we're converting it to HttpResponseBody instances. I'm not sure what Python's behavior is in this case.
I think the byte string is still copied over by value and not by reference:
from web_poet import HttpResponse
text = b"Hello World"
r1 = HttpResponse(url="example.com", body=text)
r2 = HttpResponse(url="example.com", body=text)
r1.body == r2.body # True
r1.body is r2.body # False
id(r1.body) == id(r2.body) # False
I think it has something to do with
If we define it as something like the following, then the object reference is used:
class HttpResponse:
    def __init__(self, url, body):
        self.url = url
        self.body = body
text = b"Hello World"
r1 = HttpResponse(url="example.com", body=text)
r2 = HttpResponse(url="example.com", body=text)
r1.body == r2.body # True
r1.body is r2.body # True
id(r1.body) == id(r2.body) # True
So I guess one approach is to modify how we define
It could be more about HttpResponseBody objects created from the response.body passed to
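A minimal sketch of the behavior discussed above, assuming the body is wrapped in a `bytes` subclass on assignment. `Body`, `Response`, and `to_body` are hypothetical stand-ins written for this example, not web_poet's actual classes; the point is only that constructing a subclass instance from a byte string produces a new object each time, while a plain attribute assignment keeps the original reference.

```python
class Body(bytes):
    """Hypothetical stand-in for a bytes subclass such as HttpResponseBody."""


def to_body(value):
    # Hypothetical converter: reuse the object if it is already a Body,
    # otherwise wrap it (which allocates a new object holding the same bytes).
    return value if isinstance(value, Body) else Body(value)


class Response:
    """Hypothetical stand-in for a response class that converts its body."""

    def __init__(self, url, body):
        self.url = url
        self.body = to_body(body)


text = b"Hello World"
r1 = Response("example.com", text)
r2 = Response("example.com", text)

same_value = r1.body == r2.body   # equal contents
same_object = r1.body is r2.body  # but two separate Body objects

# Converting once and passing the converted object avoids the second allocation.
shared = Body(text)
r3 = Response("example.com", shared)
r4 = Response("example.com", shared)
shared_object = r3.body is r4.body
```

Under these assumptions, one way to avoid duplicate bodies is to convert the raw bytes once and hand the same converted object to every response built from it.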
Currently, the HttpResponseProvider creates a new HttpResponse instance each time it's called:
https://github.com/scrapinghub/scrapy-poet/blob/master/scrapy_poet/page_input_providers.py#L165-L180
From another thread:
It's not a crucial issue for now, but it can certainly be made more efficient by having the provider return the same HttpResponse instance given a response identifier.
HttpResponseProvider already inherits from CacheDataProviderMixin. Perhaps we can use an in-memory cache to determine if we can return the same instance instead of creating a new one.
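The in-memory cache idea could look roughly like the sketch below. Everything here is an assumption made for illustration: `RawResponse`, `ConvertedResponse`, and `CachingProvider` are hypothetical stand-ins and do not reflect scrapy-poet's real provider interface or CacheDataProviderMixin. The response object itself is used as the identifier, and the cache holds it weakly so an entry disappears once the response is no longer referenced elsewhere.

```python
import weakref


class RawResponse:
    """Hypothetical stand-in for the downloader's response object."""

    def __init__(self, url, body):
        self.url = url
        self.body = body


class ConvertedResponse:
    """Hypothetical stand-in for the object the provider returns."""

    def __init__(self, url, body):
        self.url = url
        self.body = body


class CachingProvider:
    """Returns one ConvertedResponse per RawResponse instead of rebuilding it."""

    def __init__(self):
        # Weak keys: an entry is dropped when its raw response is garbage collected.
        self._cache = weakref.WeakKeyDictionary()

    def __call__(self, response):
        converted = self._cache.get(response)
        if converted is None:
            converted = ConvertedResponse(response.url, response.body)
            self._cache[response] = converted
        return converted


provider = CachingProvider()
raw = RawResponse("https://example.com", b"Hello World")

first = provider(raw)
second = provider(raw)  # served from the cache
other = provider(RawResponse("https://example.com", b"Hello World"))
```

Keying on the response object keeps the lookup cheap and avoids choosing a fingerprint scheme; a URL or request fingerprint would be an alternative key if the same response can arrive as distinct objects.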