-
-
Notifications
You must be signed in to change notification settings - Fork 947
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: ASGI support #1573
feat: ASGI support #1573
Conversation
OK folks, this is ready for review! |
afb1d38
to
4a314fa
Compare
Implements #1358 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job, I haven't managed to review it in depth yet so just some quick things I reacted to.
Furthermore, I'll try to just fiddle with the whole thing with semi-real apps so I get a better "feel" of it, and can hopefully provide better feedback.
One general question regarding Cython, do we need to make them [Cython and ASGI] mutually exclusive?
I'm thinking if we couldn't just exclude ASGI files (basically anything involving async
or await
keywords) from cythonization? If async stuff is calling "normal" Python functions, those can be well be cythonized, or can't they?
elif self.text is not None: | ||
block += 'data: ' + self.text + '\n' | ||
elif self.json is not None: | ||
block += 'data: ' + json_dumps(self.json, ensure_ascii=False) + '\n' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Composing string this way from many items (local_var += '...something...'
etc) is optimized on CPython if you keep concatenating towards a local variable, but there is no guarantee it wouldn't perform poorly on PyPy (unless you benchmarked it?)
See also here -- String concatenation is expensive, the recommended way is
parts = []
for x in mylist:
parts.append(foo(x))
s = "".join(parts)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC PyPy actually optimizes string concatenation pretty well, at least for small numbers of concatenations. In the past when I've benchmarked this strategy, concatenation tended to be faster for at least up to ~5 concatenations (at the expense of a tiny bit of memory overhead), and it doesn't seem likely that we would go beyond that here. That being said, I will do some specific benchmarking with this method as-is vs. using join()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
5 small string concatenations would definitely compare in favour or +=
.
I was just thinking here we actually had more concatenations, like block += 'data: ' + self.text + '\n'
is actually 3 +=
concatenations (although the two first reduce to BINARY_ADD
, but the final concat to block is INPLACE_ADD
) and so on.
But I wouldn't be surprised it still outperforms join()
(at least on CPython).
falcon/util/misc.py
Outdated
@@ -305,3 +311,89 @@ def get_http_status(status_code, default_reason='Unknown'): | |||
except AttributeError: | |||
# not found | |||
return str(code) + ' ' + default_reason | |||
|
|||
|
|||
@functools.lru_cache(maxsize=64) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've commented on #1135 (comment), functools.lru_cache
had thread-safety issues which where fixed post- the CPython 3.6+ release.
Maybe the issue is less serious here with async being mixed-and-matched with threads less often, and us mandating Python 3.6+, but still thought it might be worth bringing attention to the issue. Maybe we should check which CPython 3.5.x and 3.6.x versions are susceptible to the issue, and make a decision based on the findings. And check PyPy 3.6 if we are going to support ASGI on it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch. I think we will have to do something about this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted to just use a nop for older pythons. The performance decrease probably won't be noticeable for most people, and over time those platforms will be used less and less.
I started the whole thing (testing in progress), issues found so far: Serious issues: Cosmetic issues/desired improvements etc:
|
""" | ||
|
||
data = await stream.read() | ||
return self.deserialize(io.BytesIO(data), content_type, content_length) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to override content_length
here ➡️ len(data)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To explain a bit more how I was reasoning here.
Although we'll be changing the media API to not always mandate the content_length
, but this might be helpful to legacy handlers that may be expecting the perfect content_length
.
OTOH multipart form part media handling has the same requirement even on the WSGI side, so maybe there is no point in shimming this way if we cannot consistently guarantee it anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't hurt. I went ahead and made the change.
I've rebased this on master (non-trivial!) and made a few tweaks. I still need to take care of the following:
|
tests/asgi/test_asgi.py
Outdated
import falcon.testing as testing | ||
|
||
_SERVER_HOST = '127.0.0.1' | ||
_SERVER_PORT = 9000 + os.getpid() % 100 # Facilitates parallel test execution |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that while OS PIDs are normally assigned sequentially on Unix-like systems, one needs to walk around existing PIDs for other programs, furthermore, allocation is restarted (often to PID 300) at some point to recycle IDs. The approach presented here is probably good enough for practical use, but could it be worth adding a disclaimer note?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I went ahead and just switched to using a simple RNG with a fairly large range. That should be sufficient and be easier to reason about.
a45ae55
to
2331955
Compare
374b655
to
c7b5cf9
Compare
@vytas7 OK, I think I've addressed all your feedback so far. |
I'm going to start working on docstrings and towncrier fragments. Everyone please let me know if you see any other implementation/design issues. |
Codecov Report
@@ Coverage Diff @@
## master #1573 +/- ##
======================================
Coverage 100% 100%
======================================
Files 45 45
Lines 3102 3102
Branches 479 479
======================================
Hits 3102 3102 Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM for the alpha 🎉
One area that may need improvement is recipes how to test SSE events in a test suite employing simulated requests, but I don't perceive this as a blocker for the alpha. I also need to experiment more with SSE to provide meaningful suggestions.
To be more specific, it is quite common that SSE events are emitted due to server state changes caused by other requests. Is it possible to simulate parallel ASGI requests towards an asgi.App
? I.e. that I keep the simulated SSE response stream open while simulating PATCH
, POST
, DELETE
etc and validating that I'm receiving the corresponding SSE messages?
tests/test_headers.py
Outdated
def cors_client(): | ||
app = falcon.App(cors_enable=True) | ||
def cors_client(asgi): | ||
app = create_app(asgi, cors_enable=True) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get CompatibilityError
when using cors_enable=True
with asgi App.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
@vytas7 re SSE testing that may be worth creating a standalone issue so we don't forget about it? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
Due to the cython bug re In some cases we could simply not do the check (when it is just to highlight a developer mistake), but in other cases it is less clear how we would avoid using See also: |
Yeah, I guess we'll just have to be transparent in the docs and make sure the warnings are easily discoverable. OTOH we could make everything explicit, and always expect coroutines where applicable, i.e.
Going this path would make porting clumsier, and easier to make mistakes due to lacking checks though... FWIW I was reading more on the subject, and although Cython maintainers state on cython/cython#2273 they would accept a PR exposing a faux |
Agreed. It seems that the proposed solution will only work for ...hence the proposal to try and create an attribute-based protocol for it in core python. If a PR does land that makes Regardless, I think it does make sense to provide an escape hatch for anyone using cythonized async functions in their app. I could provide alternative functions that just assume the thing is a coroutine. I would then sprinkle the docs with warnings re the issue and the workaround (using the alternative function names). I could even catch any TypeErrors that are raised while attempting to await the result of calling a non-coroutine function, check for cython in the environment, and finally raise an error with a message that talks about the workaround. If we are able to come to a rough consensus on this, I can tackle it following 3.0.0a1. |
# hasn't cleaned up yet. | ||
# NOTE(kgriffs): Use our own Random instance because we don't want | ||
# pytest messing with the seed. | ||
server_port = _random.randint(50000, 60000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We'll probably need to make this more robust in the future, because it may randomly clash with other stuff on the system; although the probability is low, it may clash between tests as well.
As evidenced in CI (I have now rerun the job in question):
[Server process start succeeded]
---------------------------- Captured stderr setup -----------------------------
2020-01-20 08:12:03,049 INFO Starting server at tcp:port=53348:interface=127.0.0.1
2020-01-20 08:12:03,049 INFO HTTP/2 support not enabled (install the http2 and tls Twisted extras)
2020-01-20 08:12:03,050 INFO Configuring endpoint tcp:port=53348:interface=127.0.0.1
2020-01-20 08:12:03,050 CRITICAL Listen failure: Couldn't listen on 127.0.0.1:53348: [Errno 98] Address already in use.
This patch adds the much-anticipated async support to Falcon by way of a
new ASGI interface, additional testing helpers, and updated internals.
Note that only the HTTP ASGI interface is implemented. WebSocket support
is planned to follow.
Docs will be fully fleshed-out in a follow-up PR.
Changelog snippets will also be added in a follow-up PR.
In order to reconcile differences between the WSGI and ASGI interfaces,
several breaking changes were made in this patch, as follows:
BREAKING CHANGE: create_environ no longer sets a default user agent header
BREAKING CHANGE: Renamed
protocol
kwarg for create_environ() tohttp_version
and also the renamed kwarg only takes the version string(no longer prefixed with "HTTP/")
BREAKING CHANGE: Renamed
app
kwarg for create_environ() toroot_path
.and deprecated, may be removed in a future release.
BREAKING CHANGE: get_http_status() is deprecated, no longer accepts floats
BREAKING CHANGE: BoundedStream.writeable() changed to writable() per the
standard file-like I/O interface (the old name was a misspelling).
BREAKING CHANGE: api_helpers.prepare_middleware() no longer accepts a single
object; the value that is passed must be an iterable.
BREAKING CHANGE: Removed outer "finally" block from API and APP; add an
exception handler for the base Exception type if you need to deal with
unhandled exceptions.
BREAKING CHANGE: falcon.request.access_route will now include the value of
the remote_addr property as the last element in the route, if not already present
in one of the headers that are checked.
BREAKING CHANGE: When the 'REMOTE_ADDR' field is not present in the WSGI
environ, Falcon will assume '127.0.0.1' for the value, rather than
simply returning
None
for Request.remote_addr.