WIP: Alternative approach to streaming, using async generators. #1645

Closed · wants to merge 1 commit

Conversation

Tronic
Member

Tronic commented Jul 26, 2019

A proof of concept test. I wanted to avoid the use of a callback in an otherwise async framework.

@app.route("/")
async def index(req):
    res = yield sanic.response.text(None)
    await asyncio.sleep(1)
    await res.write("Hello\n")
    await asyncio.sleep(1)
    await res.write("World\n")
    await asyncio.sleep(1)

This approach would also be compatible with HTTP/2 Server Push: multiple responses could be yielded and even sent in parallel, because each yield returns a response object associated with that stream.
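For context, here is a minimal sketch of how a server core could drive such a generator handler. Response and drive_handler are illustrative stand-ins, not Sanic internals: the handler yields once to configure the response, then resumes holding a writable response object.

```python
import asyncio

# Illustrative stand-in for a streaming response; a real server would
# write chunks to the socket instead of collecting them in a list.
class Response:
    def __init__(self, content_type):
        self.content_type = content_type
        self.chunks = []

    async def write(self, data):
        self.chunks.append(data)

async def drive_handler(gen):
    # Run the handler up to its first yield; the yielded value
    # carries the response setup.
    setup = await gen.asend(None)
    res = Response(setup)
    try:
        # Resume the handler, handing it the live response object.
        await gen.asend(res)
    except StopAsyncIteration:
        pass
    return res

async def index():
    res = yield "text/plain"   # yield once to configure the response
    await res.write("Hello\n")
    await res.write("World\n")

res = asyncio.run(drive_handler(index()))
```

The single yield is what makes the pattern work: it is both the point where the handler declares its response parameters and the point where control returns to the server to send headers.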

…a callback (still built on StreamingHTTPResponse).
@ahopkins ahopkins changed the title Alternative approach to streaming, using async generators. WIP: Alternative approach to streaming, using async generators. Aug 11, 2019
@ahopkins
Member

I like this idea. I think we need to flesh it out some and provide some tests to see how it would look.

@ahopkins (Member) left a comment:

Add tests for both Sanic server (app.test_client) and ASGI (app.asgi_client).

@abuckenheimer
Contributor

This is a clever little change! Reducing the boilerplate of setting up a streaming response is definitely worth a look. Still, the use of a generator here feels a little unnatural to me: I would expect it to yield the data that we want to stream, but instead it is used exactly once, just to set the response params. res = yield sanic.response.text(None) is also a bit magical and doesn't lead users to an explicit Sanic API call, which makes it harder to understand what's going on here.

Just for fun, here is an alternative approach I came up with:

@app.stream("/stream")
async def index(request, response):
    await asyncio.sleep(1)
    await response.write("Hello\n")
    await asyncio.sleep(1)
    await response.write("World\n")
    await asyncio.sleep(1)

We could implement this rather easily with something like:

def stream(self, *args, headers=None, content_type="text/plain; charset=utf-8", **kwargs):
    route_dec = self.route(*args, **kwargs)

    def dec(func):
        def wrapper(request):
            async def inner_wrapper(response):
                return await func(request, response)
            return StreamingHTTPResponse(inner_wrapper, headers=headers, content_type=content_type)
        return route_dec(wrapper)
    return dec

To me this has even less boilerplate and is a bit more intuitive, but it's worth noting that it is strictly less powerful than the async generator approach: the user can't change the response params (headers/content_type/status) after inspecting the request.

@Tronic
Member Author

Tronic commented Aug 22, 2019

@abuckenheimer I also considered something along those lines (many other frameworks always pass separate request and response objects to your handler). However, as you correctly state, that approach is not very powerful, as it is tied to that one response.

Another way to do it, maybe less invasive for user code, is to make the request object aware of the server connection. Then one could do something like res = await req.stream(response.json()) which would immediately send response headers and return an object that can be used to send body chunks.
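A rough sketch of that shape, with Request, Stream, and FakeConnection as made-up stand-ins rather than Sanic API: req.stream(...) sends the response headers immediately and hands back an object used to write body chunks.

```python
import asyncio

# Illustrative stand-in for the server connection; a real one would
# write to the transport instead of appending to a list.
class FakeConnection:
    def __init__(self):
        self.sent = []

    async def send(self, data):
        self.sent.append(data)

class Stream:
    """Returned by req.stream(); used to send body chunks."""
    def __init__(self, conn):
        self._conn = conn

    async def write(self, chunk):
        await self._conn.send(chunk)

class Request:
    """A request object that is aware of the server connection."""
    def __init__(self, conn):
        self._conn = conn

    async def stream(self, headers):
        # Send headers right away, then hand back a writable stream.
        await self._conn.send(headers)
        return Stream(self._conn)

async def handler(req):
    res = await req.stream("HTTP/1.1 200 OK\r\n\r\n")
    await res.write("Hello\n")
    await res.write("World\n")

conn = FakeConnection()
asyncio.run(handler(Request(conn)))
```

The appeal of this shape is that the handler stays a plain coroutine, and because the response object comes from an explicit await on the request, one connection could hand out several such streams (the HTTP/2 case discussed below).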

HTTP/2 pushes are always based on a client-made request. The push promise, sent by the server, contains the stream ID of the client request, a server-generated new stream ID, and new/modified request headers (a request body is not allowed). Typically all header fields are copied off the original request, with minor changes (like a new path). After this, the server "responds" to its own request on this newly created stream as if it were a normal request from the client. The client may cancel this stream at any point, but it is also likely that the server has already sent the entire response before the client even sees the push frame.

Extending the request object would be natural for await req.push_promise("/some/other/handler") (a no-op on HTTP/1.1; on HTTP/2, delegate to another handler working in parallel, without awaiting for that handler to be called), or pushres = await req.push_stream(path, overridden request headers..., also a response object?), returning a new stream on which the pushed response may be sent by the same handler. It is essential that awaiting these causes the PUSH_PROMISE frame to be sent before later code sends HTML or anything else that refers to these resources; otherwise the browser may issue new requests for files that are already being pushed.

If such push APIs were implemented, I would expect for push_promise to deliver the responses in parallel, while push_stream would probably be used in strictly sequential manner (first make all promises, then send main response header & body, then send pushed responses), which is probably slower overall (unless big gains come from avoided session processing or other such things since other handlers need not be called).

An added complexity with push streams (in addition to Sanic not yet having any HTTP/2 support) is that any of the streams may be closed by the client without affecting others (if it doesn't want that file). As such, code writing to one stream should not exit but finish the other streams even if it encounters a write error on a push stream, and Sanic should ensure that all streams are closed when the responsible handler exits, because the HTTP/2 server connection is probably still being used for entirely unrelated requests and it has a limited number of active streams allowed.

I've also investigated possible issues with asyncgens; the most relevant is PEP 533, which would be quite necessary for safe use of asyncgens. Meanwhile, whoever iterates the asyncgen (i.e. Sanic core) must make sure it is iterated to its end even in exceptional situations. For instance, if the connection is lost, throw an exception into the asyncgen and then keep iterating until it ends. Otherwise, finalizers and other cleanup code in user code (anywhere inside that generator) may run at some random later point, when the generator object is garbage collected. TL;DR: somewhat problematic, but can be done.
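The cleanup discipline described here can be sketched as follows (ConnectionLost and the driver are illustrative, not Sanic code): throw into the generator at its current yield point, then make sure it runs to completion, so that finally blocks in user code execute deterministically rather than at garbage-collection time.

```python
import asyncio

# Hypothetical exception the server would raise on a dropped connection.
class ConnectionLost(Exception):
    pass

events = []

async def handler():
    try:
        yield "chunk 1"
        yield "chunk 2"   # never reached if the connection dies first
    finally:
        events.append("cleanup ran")

async def drive(gen):
    await gen.asend(None)            # first chunk is produced normally
    events.append("connection lost")
    try:
        # Throw into the generator at its current yield point...
        await gen.athrow(ConnectionLost())
    except (ConnectionLost, StopAsyncIteration):
        pass
    # ...and ensure it has fully finished, so finalizers have run.
    await gen.aclose()

asyncio.run(drive(handler()))
```

Because the driver throws and then closes the generator itself, the handler's finally clause runs inside the event loop, at a well-defined point, instead of whenever the garbage collector gets around to the abandoned generator object.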

@Tronic
Member Author

Tronic commented Aug 22, 2019

@ahopkins The current implementation is a least-amount-of-work hack, as doing it properly would require some architectural changes. If you wish, I can see what can be done, but in particular I do not wish to depend on the callback-based streaming response but rather reimplement it on top of the asyncgen-based approach (if needed for backward compatibility).

What is your take on extending the request object, as discussed above, for an awaitable response API instead of asyncgens? Any strong opinions for or against?

@abuckenheimer
Contributor

So this is still a bit ugly, but I mocked up a comparison using some of the ideas you mention, without an asyncgen: #1661

@Tronic
Member Author

Tronic commented Aug 27, 2019

I couldn't even begin to think of how to implement this in the asyncio-protocol-callback mess of server.py that somehow calls async code in app.py, that then calls the request handler and somehow returns the answer by a pair of callback functions.

As a result, I rewrote the whole thing in Trio, and will be experimenting on top of that. #1662

@Tronic
Member Author

Tronic commented Sep 2, 2019

This and @abuckenheimer's version have evolved into something I am now sketching in the Trio branch. I believe it is best to avoid asyncgens for now, so I am closing this pull request.
