Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for form data request #225

Merged
merged 4 commits into from
Aug 4, 2022

Conversation

yedpodtrzitko
Copy link
Collaborator

@yedpodtrzitko yedpodtrzitko commented May 28, 2022

taking over where #185 was left. This is still very much WIP

  • the previous PR made form and json exclusive (ie. only one can be defined at the time), which I dont think it should be enforced. Here's a use-case where we have right now: we have /login/ endpoint, which can be called either from a legacy web which submits <form>, and it can be also used on a new frontend, which submits JSON payload. As such, one route can have both form and json.

  • previous PR is adding usage of cgi, which is about to be deprecated, so it is again removed here

@yedpodtrzitko yedpodtrzitko force-pushed the yed/form-data-support branch 6 times, most recently from b6e257d to 05b0d98 Compare June 26, 2022 14:36
# TODO - possible to pass the BodyPart here?
# req_form = {x.name: x for x in req.get_media()}
req_form = {x.name: x.stream.read() for x in req.get_media()}
req.context.form = form.parse_obj(req_form)
Copy link
Collaborator Author

@yedpodtrzitko yedpodtrzitko Jun 26, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vytas7 hi, you informed me previously about multipart forms support in Falcon here in this PR. I was looking into it, but I cant make it work in a way I'd be happy with.

Here we're parsing the request data & passing it into Pydantic model for validation, which is then making the data available in a view. I'd like to pass into Pydantic model directly the items returned from get_media() (is it BodyPart right?), but I think their content is getting evaluated while passing through Pydantic model, so when trying to read them again in the view later, it's returning just an empty bytes sequence. Which is what the docstring in get_media() says, but I dont know how to access the cached version of the data. Any advice please?

reading in view:

# TODO - pass the BodyPart here?
# file_content = form.file.stream.read()
file_content = form.file
resp.media = {"file": file_content.decode("utf-8")}

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi again @yedpodtrzitko!
Yes, Falcon's multipart support is designed for parsing the form on-the-fly, without buffering any data into the memory or temp files behind the user's back (because the former can make you run out of memory, and the latter kill performance, depending on what you're doing).

I'm not familiar with Spectree's codebase, what would you like to send into Pydantic, only the parts serialized as JSON(/YAML/msgpack/etc) and text fields, or also files? Would you like to spool file parts to tempfiles?

In general, if you want to pass a synchronous data structure into Pydantic, you'll need it to create a wrapper to store that data into when parsing.

res_data = {}
async for x in form_data:
res_data[x.name] = x
await x.data # TODO - how to avoid this?
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vytas7 here in async version I dont know how to avoid passing the object through Pydantic model without awaiting it here. Any advice appreciated.

Copy link

@vytas7 vytas7 Aug 2, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@yedpodtrzitko It depends what you are trying to achieve. Do you want to buffer whole files in memory? Or do you want to spool them to temp files like some other frameworks do?

Falcon even puts a cap on a maximum amount of data that can be referenced this way (await part.get_data()) in order to avoid surprises such as running out of memory.

Use await part.stream.read() to read the whole part as a bytestring, or await part.stream.pipe(async_file), or read by chunks, and store the result somewhere. You'll probably need to introduce some new object type to hold these attributes.

@yedpodtrzitko
Copy link
Collaborator Author

yedpodtrzitko commented Jul 22, 2022

@kemingy how would you feel about making a breaking change and drop support for Falcon 2? It was released 2019 and never got any patch above 2.0.0 (see https://pypi.org/project/falcon/#history ), so it seems unsupported anyway.

The reason why I ask is it looks like it does not support form data. The pipeline right now fails because of get_media() does not exist in Falcon 2, so I tried to use req.media instead, and I got the following error (with Falcon 2)

AssertionError: {"title": "Unsupported media type", "description": "multipart/form-data is an unsupported media type."}

or maybe there's a way how to make it work I dont know about.

@kemingy
Copy link
Member

kemingy commented Jul 24, 2022

Hi @yedpodtrzitko. I need to take some time to check whether it's possible to support this in falcon 2.

My concern is that falcon 2 will be the default one if users type pip install falcon between 2019-04-27 and 2021-04-06 (almost two years). There may be lots of legacy projects using falcon 2. If we are going to deprecate it, we'd better create a release with deprecate warning about falcon 2 before the final deprecation.

@yedpodtrzitko
Copy link
Collaborator Author

we'd better create a release with deprecate warning about falcon 2 before the final deprecation.

yes that's the standard practice and under normal circumstances I'd go that way too. However it seems like Falcon 2 is dead in the water - I haven't found any mention about its support anywhere and it didnt get any patch above 2.0.0. So if someone installed it in the timespan you mentioned and they havent upgraded by now, then Spectree's breaking change is the latest problem they could have.

Here's the docs for supported media types in Falcon 2 and Falcon 3. v3 has MEDIA_MULTIPART there, v2 has nothing even remotely close to it.

So I can understand the appeal to do things The Right Way, but I dont think it's worth it in this case. If this will be a breaking change, the next version should be 1.0.0, since breaking change should increase the major version number, right? And for the user who are really stuck with Falcon 2, the branch 0.10.x can still receive bugfixes for a period of time.

@kemingy
Copy link
Member

kemingy commented Jul 27, 2022

So I can understand the appeal to do things The Right Way, but I dont think it's worth it in this case. If this will be a breaking change, the next version should be 1.0.0, since breaking change should increase the major version number, right? And for the user who are really stuck with Falcon 2, the branch 0.10.x can still receive bugfixes for a period of time.

I see. I'm okay with this breaking change in 1.0.0.

@yedpodtrzitko
Copy link
Collaborator Author

I see. I'm okay with this breaking change in 1.0.0.

Ok. Do you feel ready (feature-wise) to have the project bumped to version 1.0? I assume that's how you'd do it anyway, but will there be at least one beta/rc release before having final v1.0.0?

@kemingy
Copy link
Member

kemingy commented Jul 28, 2022

I see. I'm okay with this breaking change in 1.0.0.

Ok. Do you feel ready (feature-wise) to have the project bumped to version 1.0? I assume that's how you'd do it anyway, but will there be at least one beta/rc release before having final v1.0.0?

Yep. I think we can have an RC release before the final v1.0.0.

@yedpodtrzitko yedpodtrzitko force-pushed the yed/form-data-support branch from 61147e3 to a44af02 Compare July 29, 2022 06:18
@yedpodtrzitko
Copy link
Collaborator Author

yedpodtrzitko commented Jul 29, 2022

Seems like Jinja 3.1+ has some breaking changes, so it needs to be pinned too. Probably in main branch instead of here.

edit: this seems to be issue only with Flask 1.x, but then it's up to the project to pin the jinja dependency as needed.

Pypy tests seems to be broken, will take a look later.

@yedpodtrzitko yedpodtrzitko force-pushed the yed/form-data-support branch 2 times, most recently from 8effa18 to 6decd6e Compare July 30, 2022 08:35
@yedpodtrzitko yedpodtrzitko force-pushed the yed/form-data-support branch from 6decd6e to 5a2e0df Compare July 30, 2022 08:47
@yedpodtrzitko yedpodtrzitko changed the title WIP: form data support add support for form data request Jul 30, 2022
@yedpodtrzitko
Copy link
Collaborator Author

I had to revert some changes from #185 which were making PyPy pipeline fail, but it seems to be all passing now.

I would prefer to not squash the commits, so the work of folks who were working on this previously is properly attributed to them (I squashed it to one commit per each author already).

@yedpodtrzitko yedpodtrzitko requested a review from kemingy July 30, 2022 09:00
@yedpodtrzitko yedpodtrzitko force-pushed the yed/form-data-support branch from 5a2e0df to af3fff0 Compare July 30, 2022 14:01
@@ -101,7 +94,22 @@ def on_post(self, req, resp, source, target):
resp.media = {"loc": "unknown", "msg": "bad luck", "typ": "random"}
return
resp.media = {"label": int(10 * random()), "score": random()}
# resp.media = Resp(label=int(10 * random()), score=random())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's another way to return the response (#212). I guess we can make it more clear in the comments.

@kemingy
Copy link
Member

kemingy commented Aug 4, 2022

  • So far, most of the code changes look good to me. I'm going to merge this PR(also the original Form data support #185) since it's opened for a very long time.
  • Really appreciate your contributions @Michae1Weiss @yedpodtrzitko @vytas7
  • Also, I'll open another issue to keep track of the falcon form data read part. I'm not going to release a stable version until it's well handled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants