-
-
Notifications
You must be signed in to change notification settings - Fork 230
Fix multipart body arrays #938
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix multipart body arrays #938
Conversation
651fddc
to
fe841d7
Compare
Thanks for adding this support! I believe this is a breaking change, since it's possible someone was using the JSON serialization behavior before (even though it seems wrong in general). I'll need to do some manual testing just to make sure I know how to describe the breaking change 😅. I'd also love to get someone to test a real running API and verify it all functions (maybe you've already done this?) |
Yes, I already use the updated model template as a custom template in a project. Unfortunately this project is not public yet and the API it is developed for, isn't publicly accessible, too |
Hi, just tested this out with an api I am working on. It works great. |
@dbanty @micha91 Thank you so much for providing the I tested it in the following way:
"/upload": {
"post": {
"tags": [
"Microsoft.KernelMemory.ServiceAssembly"
],
"summary": "Upload a new document to the knowledge base",
"description": "Upload a document consisting of one or more files to extract memories from. The extraction process happens asynchronously. If a document with the same ID already exists, it will be overwritten and the memories previously extracted will be updated.",
"operationId": "UploadDocument",
"requestBody": {
"description": "Document to upload and extract memories from",
"content": {
"multipart/form-data": {
"schema": {
"type": "object",
"properties": {
"index": {
"type": "string",
"description": "Name of the index where to store memories generated by the files."
},
"documentId": {
"type": "string",
"description": "Unique ID used for import pipeline and document ID."
},
"tags": {
"type": "object",
"additionalProperties": {
"type": "string"
},
"description": "Tags to apply to the memories extracted from the files."
},
"steps": {
"type": "array",
"items": {
"type": "string"
},
"description": "How to process the files, e.g. how to extract/chunk etc."
},
"files": {
"type": "array",
"items": {
"type": "string",
"format": "binary"
},
"description": "Files to process and extract memories from."
}
}
}
}
}
}, In general it works! @dbanty What do you think about an alternative where a user could just overwrite how a "files": {
"type": "array",
"items": {
"type": "string",
"format": "binary"
}, is handled? This could either generate: files: Union[Unset, tuple[None, bytes, str]] = UNSET
if not isinstance(self.files, Unset):
_temp_files = []
for files_item_data in self.files:
files_item = files_item_data.to_tuple()
_temp_files.append(files_item)
files = (None, json.dumps(_temp_files).encode(), "application/json") or files: Union[Unset, tuple[None, bytes, str]] = UNSET
if not isinstance(self.files, Unset):
files = files The later works well in the case and should be just a minimal, non-breaking and optional change? |
453d017
to
5e8ad7a
Compare
I reimplemented the changes based on the latest main branch. I tried to merge changes from main in here, but there were just too many changes in the end and the review view was just messed up completely. |
Arrays of files are being tested in other endpoints, and this one was slightly malformed.
@micha91 thanks for updating! I'm slowing making some changes as I get time. A couple things still left to look into:
Hoping to get this all wrapped up soon. Thanks for your hard work and patience on this 🙇 |
Integration tests are updated, so left to do:
|
@micha91 (and all following) thanks for all the work, I think this is almost ready! One thing I've noticed as I go through and test/fix more stuff around multipart is that having a top-level schema of an array really didn't work anyway. The test we had with this schema: "multipart/form-data": {
"schema": {
"type": "array",
"items": {
"type": "string",
"format": "binary"
}
}
} Produced code like this: for body_item_data in body:
body_item = body_item_data.to_tuple()
_body.append(body_item)
_kwargs["files"] = _body Which as far as I can tell is completely invalid, since each Any reason you can think of that we should keep some sort of top-level array-of-files support? How would that even work in multipart? |
@dbanty Thanks for your additional efforts on this! I will give the current implementation a try today to see if it also works in our real world client 👍
I think it is simply unsupported when using multipart, so it should be absolutely correct to drop it |
Works as expected. I still need custom some custom templates, but that's due to issues in the specification |
> [!IMPORTANT] > Merging this pull request will create this release ## Breaking Changes - Raise minimum httpx version to 0.23 ### Removed ability to set an array as a multipart body Previously, when defining a request's body as `multipart/form-data`, the generator would attempt to generate code for both `object` schemas and `array` schemas. However, most arrays could not generate valid multipart bodies, as there would be no field names (required to set the `Content-Disposition` headers). The code to generate any body for `multipart/form-data` where the schema is `array` has been removed, and any such bodies will be skipped. This is not _expected_ to be a breaking change in practice, since the code generated would probably never work. If you have a use-case for `multipart/form-data` with an `array` schema, please [open a new discussion](https://github.com/openapi-generators/openapi-python-client/discussions) with an example schema and the desired functional Python code. ### Change default multipart array serialization Previously, any arrays of values in a `multipart/form-data` body would be serialized as an `application/json` part. This matches the default behavior specified by OpenAPI and supports arrays of files (`binary` format strings). However, because this generator doesn't yet support specifying `encoding` per property, this may result in now-incorrect code when the encoding _was_ explicitly set to `application/json` for arrays of scalar values. PR #938 fixes #692. Thanks @micha91 for the fix, @ratgen and @FabianSchurig for testing, and @davidlizeng for the original report... many years ago 😅. Co-authored-by: knope-bot[bot] <152252888+knope-bot[bot]@users.noreply.github.com>
As described in #692 arrays of files are not handled correctly, if they are part of multipart/form-data. This is fixed in this PR by letting
to_multipart
return aList[Tuple[str, Any]]
instead of aDict[str, Any]
.