Array of files in multipart/form-data is not handled correctly #692

Closed
davidlizeng opened this issue Oct 21, 2022 · 3 comments · Fixed by #1267
Labels
🐞bug Something isn't working 🆘 help wanted Extra attention is needed

Comments

@davidlizeng

Describe the bug
For multipart/form-data with an array of files, generated code tries to serialize the array of files as JSON.

To Reproduce
Steps to reproduce the behavior:

  1. Using the spec included in this bug report, run openapi-python-client generate --path spec.json
  2. Try to run the following code:
from multiple_upload_client.client import Client
from multiple_upload_client.models import UploadMultipleMultipartData
from multiple_upload_client.api.files import upload_multiple
from multiple_upload_client.types import File

client = Client(base_url="http://localhost:8080")
upload_multiple.sync_detailed(
  client=client,
  multipart_data=UploadMultipleMultipartData(
    files=[
      File(
        payload=open("path to some local file", "rb"),
        file_name="sample.jpeg",
        mime_type="image/jpeg"
      )
    ]
  )
)

The following error occurs:

Traceback (most recent call last):
  File "test_multiple_upload.py", line 7, in <module>
    upload_multiple.sync_detailed(
  File "/Users/davidzeng/butler/src/experimental/test-codegen/opc/multiple-upload-client/multiple_upload_client/api/files/upload_multiple.py", line 62, in sync_detailed
    kwargs = _get_kwargs(
  File "/Users/davidzeng/butler/src/experimental/test-codegen/opc/multiple-upload-client/multiple_upload_client/api/files/upload_multiple.py", line 20, in _get_kwargs
    multipart_multipart_data = multipart_data.to_multipart()
  File "/Users/davidzeng/butler/src/experimental/test-codegen/opc/multiple-upload-client/multiple_upload_client/models/upload_multiple_multipart_data.py", line 47, in to_multipart
    files = (None, json.dumps(_temp_files).encode(), "application/json")
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/__init__.py", line 231, in dumps
    return _default_encoder.encode(obj)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 199, in encode
    chunks = self.iterencode(o, _one_shot=True)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 257, in iterencode
    return _iterencode(o, 0)
  File "/usr/local/opt/python@3.8/Frameworks/Python.framework/Versions/3.8/lib/python3.8/json/encoder.py", line 179, in default
    raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type BufferedReader is not JSON serializable

Expected behavior
The generated code for handling an array of files currently tries to serialize the file tuples as JSON, which fails because the file payloads are binary streams:

    def to_multipart(self) -> Dict[str, Any]:
        files: Union[Unset, Tuple[None, bytes, str]] = UNSET
        if not isinstance(self.files, Unset):
            _temp_files = []
            for files_item_data in self.files:
                files_item = files_item_data.to_tuple()

                _temp_files.append(files_item)
            files = (None, json.dumps(_temp_files).encode(), "application/json")

        field_dict: Dict[str, Any] = {}
        field_dict.update(
            {key: (None, str(value).encode(), "text/plain") for key, value in self.additional_properties.items()}
        )
        field_dict.update({})
        if files is not UNSET:
            field_dict["files"] = files

        return field_dict

Based on https://www.python-httpx.org/advanced/#multipart-file-encoding, we should probably be doing something more like the following, treating the multipart data as a list of tuples, with field keys that can repeat. Each file is added to the list under the same "files" key:

    def to_multipart(self) -> List[Tuple[str, FileJsonType]]:
        field_list = []
        if not isinstance(self.files, Unset):
            for files_item_data in self.files:
                files_item = files_item_data.to_tuple()
                field_list.append(("files", files_item))

        for key, value in self.additional_properties.items():
            field_list.append((key, (None, str(value).encode(), "text/plain")))

        return field_list
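
To illustrate the proposal above, here is a minimal standalone sketch that uses only the standard library. It is not the generated client code: `FileTuple` and `to_multipart_fields` are hypothetical names, and the `(file_name, payload, mime_type)` tuple shape is an assumption about what `File.to_tuple()` returns. It shows why `json.dumps` rejects the file tuples, and why a list of pairs can carry several files under the same "files" key where a dict cannot.

```python
import io
import json
from typing import Any, BinaryIO, List, Optional, Tuple

# Hypothetical alias for the (file_name, payload, mime_type) tuple that
# File.to_tuple() is assumed to return in the generated client.
FileTuple = Tuple[Optional[str], BinaryIO, Optional[str]]


def to_multipart_fields(files: List[FileTuple]) -> List[Tuple[str, Any]]:
    """Emit one ("files", tuple) pair per file so the field name can repeat."""
    return [("files", file_tuple) for file_tuple in files]


# In-memory streams stand in for files opened with open(path, "rb").
uploads: List[FileTuple] = [
    ("a.jpeg", io.BytesIO(b"first"), "image/jpeg"),
    ("b.jpeg", io.BytesIO(b"second"), "image/jpeg"),
]

# Serializing the tuples as JSON fails because the payload is a binary stream.
try:
    json.dumps(uploads)
    json_error = None
except TypeError as exc:
    json_error = str(exc)

fields = to_multipart_fields(uploads)
field_names = [name for name, _ in fields]

# A dict keyed by field name would keep only the last file.
collapsed = dict(fields)
```

According to the httpx documentation linked above, a list of `(field_name, file_tuple)` pairs in this shape can be passed as the `files=` argument of a request, so repeated field names survive into the multipart body.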

OpenAPI Spec File

{
  "openapi": "3.0.0",
  "paths": {
    "/api/files/upload_multiple": {
      "post": {
        "operationId": "uploadMultiple",
        "summary": "Uploads multiple files",
        "requestBody": {
          "content": {
            "multipart/form-data": {
              "schema": {
                "type": "object",
                "properties": {
                  "files": {
                    "type": "array",
                    "items": {
                      "type": "string",
                      "format": "binary"
                    }
                  }
                }
              }
            }
          },
          "required": true
        },
        "parameters": [],
        "responses": {
          "201": {
            "description": "Returns some random string",
            "content": {
              "application/json": {
                "schema": {
                  "type": "string"
                }
              }
            }
          }
        },
        "tags": [
          "files"
        ]
      }
    }
  },
  "info": {
    "title": "Multiple Upload",
    "description": "Test spec for array of files in multipart/form-data",
    "version": "0.0.1",
    "contact": {}
  },
  "tags": [],
  "servers": [],
  "components": {
    "schemas": {}
  }
}

Desktop (please complete the following information):

  • OS: macOS 10.15.7
  • Python Version: 3.8.13
  • openapi-python-client version: 0.11.6


@davidlizeng davidlizeng added the 🐞bug Something isn't working label Oct 21, 2022
@davidlizeng
Author

BTW, I'd be happy to contribute a fix for this if this is indeed a bug.

But I wanted to first check whether there was a reason that the list of files was being serialized as JSON.

@hegdeashwin

I'm facing the same issue. Any plans to merge the fix?

@dbanty
Collaborator

dbanty commented Jan 16, 2023

This will work today if you have a schema like this:

"schema": {
    "type": "array",
    "items": {
        "type": "string",
        "format": "binary"
    }
}

You're right though, that the object format does need to be supported so that explicit file names can be set via the object keys. This probably needs to be a different object (not File) since the consumer should not set the key 🤔.

Also, for completeness, the encoding object will give better control over how this works. For an initial fix, though, using all of the defaults it defines should be enough (e.g., arrays inherit the default of their item type, and application/json is the default for objects).

I'm totally open to a PR implementing a solution here!

@dbanty dbanty added the 🆘 help wanted Extra attention is needed label Jan 16, 2023
github-merge-queue bot pushed a commit that referenced this issue Jun 6, 2025
As described in #692 arrays of files are not handled correctly, if they
are part of multipart/form-data. This is fixed in this PR by letting
`to_multipart` return a `List[Tuple[str, Any]]` instead of a `Dict[str,
Any]`.

---------

Co-authored-by: Dylan Anthony <dbanty@users.noreply.github.com>
Co-authored-by: Dylan Anthony <43723790+dbanty@users.noreply.github.com>
@knope-bot knope-bot bot mentioned this issue Jun 6, 2025
github-merge-queue bot pushed a commit that referenced this issue Jun 6, 2025
> [!IMPORTANT]
> Merging this pull request will create this release

## Breaking Changes

- Raise minimum httpx version to 0.23

### Removed ability to set an array as a multipart body

Previously, when defining a request's body as `multipart/form-data`, the
generator would attempt to generate code
for both `object` schemas and `array` schemas. However, most arrays
could not generate valid multipart bodies, as
there would be no field names (required to set the `Content-Disposition`
headers).

The code to generate any body for `multipart/form-data` where the schema
is `array` has been removed, and any such
bodies will be skipped. This is not _expected_ to be a breaking change
in practice, since the code generated would
probably never work.

If you have a use-case for `multipart/form-data` with an `array` schema,
please [open a new
discussion](https://github.com/openapi-generators/openapi-python-client/discussions)
with an example schema and the desired functional Python code.

### Change default multipart array serialization

Previously, any arrays of values in a `multipart/form-data` body would
be serialized as a single `application/json` part. Each array item is
now sent as its own part under the same field name.
This matches the default behavior specified by OpenAPI and supports
arrays of files (`binary` format strings).
However, because this generator doesn't yet support specifying
`encoding` per property, this may result in
now-incorrect code when the encoding _was_ explicitly set to
`application/json` for arrays of scalar values.
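
As a rough illustration of the difference for an array of scalar values, here is a minimal sketch. It is not the generator's actual output: `as_json_part` and `as_repeated_parts` are hypothetical helpers, and the `text/plain` content type for each item is an assumption. The first helper builds the single `application/json` part; the second builds one part per item under a repeated field name.

```python
import json
from typing import List, Tuple

# (file_name, content, content_type); file_name is None for plain form fields.
Part = Tuple[None, bytes, str]


def as_json_part(name: str, values: List[str]) -> List[Tuple[str, Part]]:
    """Single-part form: the whole array becomes one application/json part."""
    return [(name, (None, json.dumps(values).encode(), "application/json"))]


def as_repeated_parts(name: str, values: List[str]) -> List[Tuple[str, Part]]:
    """Repeated form: one part per item, all sharing the same field name."""
    # text/plain per item is an assumption made for this sketch.
    return [(name, (None, value.encode(), "text/plain")) for value in values]


tags = ["red", "green"]
json_parts = as_json_part("tags", tags)
repeated_parts = as_repeated_parts("tags", tags)
```

A server that expects a JSON-encoded array in a single part would see two separate `tags` fields with the repeated form, which is the incompatibility this note warns about.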

PR #938 fixes #692. Thanks @micha91 for the fix, @ratgen and
@FabianSchurig for testing, and @davidlizeng for the original report...
many years ago 😅.

Co-authored-by: knope-bot[bot] <152252888+knope-bot[bot]@users.noreply.github.com>