Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Gracefully hanlde exceptions that happen after successful request. #214

Closed
wants to merge 1 commit into from

Conversation

tasn
Copy link

@tasn tasn commented Dec 22, 2021

With the current implementation, any exception in the ASGI app will
make mangum return a response with 500, even though the application may
have already correctly generated a valid response. In those cases we
want to return the valid response and log the error, rather than fail.

One example where this would manifest: failures in BackgroundTasks in
FastAPI.

This makes it inline with running FastAPI (and other ASGI) applications
standalone where failures in the background tasks will return not make
the whole request fail.

With the current implementation, any exception in the ASGI app will
make mangum return a response with 500, even though the application may
have already correctly generated a valid response. In those cases we
want to return the valid response and log the error, rather than fail.

One example where this would manifest: failures in BackgroundTasks in
FastAPI.

This makes it inline with running FastAPI (and other ASGI) applications
standalone where failures in the background tasks will return not make
the whole request fail.
@tasn
Copy link
Author

tasn commented Dec 31, 2021

Oops, fixed formatting, linting should now pass.

@jordaneremieff
Copy link
Collaborator

Thanks. I’m traveling atm and can look at this next week.

@tasn
Copy link
Author

tasn commented Jan 2, 2022

Great, thanks!

Btw, a side comment: I wonder if there's a way to get lambda to flush the response to the user and still continue running in the background. So that the user will get the response immediately rather than what's going on now that lambda is waiting for the background tasks to complete as well before returning the HTTP response.

@jordaneremieff
Copy link
Collaborator

jordaneremieff commented Jan 20, 2022

Hi @tasn, I'm able to starting looking at this, could you provide a minimal example app that I could look at to better understand what you're trying to achieve here?

I've not used BackgroundTasks extensively, but based on my current understanding I don't think this feature would be useful in AWS Lambda because it is intended to run operations after the response has already been returned for the request, but AWS Lambda would return the response before and not run these tasks. There may be different architectural solutions for what you are trying to accomplish with BackgroundTasks that are better suited for AWS Lambda deployments, though I cannot advise anything specific.

@tasn
Copy link
Author

tasn commented Jan 20, 2022

Example fastapi app:

# Setup: pip install fastapi uvicorn[standard]
# Running: uvicorn main:app
from fastapi import FastAPI, BackgroundTasks

app = FastAPI()

@app.get("/")
async def root(background_task: BackgroundTasks):
    def background_fail():
        raise RuntimeError("Background task failed")
    background_task.add_task(background_fail)

    return {"message": "Hello World"}

If you run curl localhost:8000 you'll get a 200 response with the correct json, and the error will trigger still, but just not affect the response of the message.
This is with uvicorn.
The same code with mangum will trigger a 500 before my change, with my change it behaves the same as uvicorn.

So it's just a matter of maintaining compatibility between mangum and other ASGI applications.

Is it useful? Well yeah, background tasks are less useful on lambda because they still block the returning of the request so they are not really "background", though having mangum behave like uvicorn is important for compatibility. Especially when one is migrating slowly to lambda. You don't want the two to behave differently in hard to spot ways (or at all).

I hope this helps.

@jordaneremieff
Copy link
Collaborator

The same code with mangum will trigger a 500 before my change, with my change it behaves the same as uvicorn.

Did you test the example you shared in a deployment? I tried to reproduce in a fresh deployment, but the code you provided successfully returned the expected JSON response message and also logged the background task error. The existing exception behavior should be handling the response correctly according to the ASGI cycle state when the exception occurs already.

@tasn
Copy link
Author

tasn commented Jan 20, 2022

We found out about it in production (were getting 5xx and couldn't understand what happened), that's why I made this change in the first place.

@jordaneremieff
Copy link
Collaborator

jordaneremieff commented Jan 20, 2022

My last question was regarding the example provided in your previous comment. I did not encounter a 500 error response using that example, and the RuntimeError was logged.

The ASGI HTTP cycle state (https://mangum.io/http/#httpcyclestate) is intended to comply with the ASGI HTTP specification (https://asgi.readthedocs.io/en/latest/specs/www.html) as closely as possible. If there is some adjustment required to accommodate some specific behavior of AWS Lambda, then I must be able to reproduce the issue to determine if it should actually be handled in Mangum or in your implementation.

@tasn
Copy link
Author

tasn commented Jan 21, 2022

I don't know why you are unable to verify it. I haven't tested this example in prod though, no.

Look at the actual code though. What happens if a request was completed (as per ASGI) but app still throws? See the flow, it won't actually return, it'll throw itself. Which is exactly the issue I'm talking about.

@jordaneremieff
Copy link
Collaborator

jordaneremieff commented Jan 21, 2022

I am unable to verify it because the example provided did not reproduce the problem as described in this issue. I need a reliable means to debug the conditions of the event flow to understand what problem we're actually trying to solve and ensure this does not introduce either unnecessary/overly-specific changes or potential side-effects. Based solely on the available information in this PR this change appears to introduce a coupling of the protocol state and error-handling behavior with a feature that is supported by a specific framework but not included in ASGI specification.

It was also mentioned that a motivation for this change is to support a gradual transition from Uvicorn to AWS Lambda (via Mangum), but it is not necessarily an objective of Mangum to maintain feature parity with traditional ASGI servers or to support the extended capabilities of any particular ASGI framework - this is touched on briefly in the documentation. This library intends to be as compliant as possible with the ASGI spec in a way that both balances the limitations of AWS Lambda and considers framework support more broadly.

Consider that, for example, a fairly prominent ASGI framework Quart has implemented its own concept of background tasks, would this solution also apply to Quart or require some adjustment to support a workaround for both? Alternatively, what if this change solves the problem for background tasks in Starlette but produces some side-effect that causes the implementation in Quart, some other framework, or even future versions of Starlette to behave unexpectedly? I want to avoid going down a path of routinely patching critical areas of this project to accommodate/workaround features in individual frameworks.

I am not outright rejecting your solution, but in addition to my concerns above I was very deliberate when implementing the ASGI protocol event flow so any proposed change that references the application state, even if it may seem minor or obvious, needs to be justified with more information than has already been presented.

@jordaneremieff
Copy link
Collaborator

I will close this, but please open an issue with more details if you still want this to be considered.

@jordaneremieff
Copy link
Collaborator

Just wanted to add a follow-up on this after playing around with background tasks recently. I discovered one potential way to reproduce the 500 which was to have the background task block until it exceeded the Lambda timeout.

Another way might be encountering encode/starlette#919 when using some middleware.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants