.msg files cannot be uploaded #4795

Closed
LaurensBurger opened this issue Oct 25, 2024 · 2 comments · Fixed by #4961 or #4976
Assignees
Labels
bug Something isn't working needs-backport Fix must be backported to stable release branch owner: utrecht

Comments

@LaurensBurger
Collaborator

LaurensBurger commented Oct 25, 2024

Product version

Tested on 2.7.x and 2.8.x

Customer reference

Utr 306

Describe the bug

Sentry: 313314

After uploading a .msg file and navigating to the next step, the upload component shows an error:

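ValidationError {'data': {'bestandsupload': {0: {'type': [ErrorDetail(string="Het bestandstype kon niet bepaald worden. Controleer of de bestandsnaam met een extensie eindigt (bijvoorbeel '.pdf' of '.png').", code='blank')]}}}}

(The Dutch message reads: "The file type could not be determined. Check that the file name ends with an extension (for example '.pdf' or '.png').")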

Tested with multiple operating systems, files, and environments.

@LaurensBurger LaurensBurger added bug Something isn't working triage Issue needs to be validated. Remove this label if the issue considered valid. labels Oct 25, 2024
@joeribekker
Contributor

Resembles #4773

@joeribekker joeribekker removed the triage Issue needs to be validated. Remove this label if the issue considered valid. label Oct 28, 2024
@sergei-maertens sergei-maertens self-assigned this Dec 19, 2024
@sergei-maertens
Member

I'll provide a .msg file

@robinmolen robinmolen moved this from Todo to In Progress in Development Dec 19, 2024
@sergei-maertens sergei-maertens removed their assignment Dec 19, 2024
robinmolen added a commit that referenced this issue Dec 19, 2024
The SDK cannot determine which content type belongs to a .msg file. This is because (at least) Linux and macOS don't know this file type.

To make sure these files can be uploaded, the type property on the FileSerializer is now optional. For .msg files, a new rule has been added to the MimeTypeValidator.
@robinmolen robinmolen mentioned this issue Dec 19, 2024
10 tasks
robinmolen added a commit that referenced this issue Dec 19, 2024
@robinmolen robinmolen moved this from In Progress to Implemented in Development Dec 19, 2024
@robinmolen robinmolen moved this from Implemented to In Progress in Development Dec 19, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
robinmolen added a commit that referenced this issue Dec 23, 2024
@robinmolen robinmolen moved this from In Progress to Implemented in Development Dec 23, 2024
sergei-maertens pushed a commit that referenced this issue Dec 27, 2024
sergei-maertens pushed a commit that referenced this issue Dec 27, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
@sergei-maertens sergei-maertens moved this from Implemented to In Progress in Development Dec 30, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
The SDK cannot reliably determine which content type belongs to a .msg
file, most notably on Linux and macOS, because the extension is not in
the MIME type database. This manifests as a file being uploaded with an
empty content type.

To allow these files to go through, the serializer must allow empty
values for the 'type' field, which contains the detected content type,
and the backend must perform additional processing to determine the file
type. We do this by falling back to the generic 'binary file'
(application/octet-stream) content type and letting libmagic inspect the
magic bytes to determine what kind of file was provided. The provided
file extension is then checked against the list of valid extensions for
the detected file type.
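
A minimal sketch of the fallback described in this commit message, not the actual Open Forms code. It assumes a tiny hand-written signature table as a stand-in for libmagic, and the names `SIGNATURES`, `sniff` and `check_upload` are hypothetical. The OLE2 compound-file signature is shared by several Office formats (including .msg), so the content type label and extension set used for it here are illustrative only.

```python
from pathlib import PurePath

GENERIC = "application/octet-stream"

# Hypothetical stand-in for libmagic: magic-byte prefix -> (content type, valid extensions).
# Real libmagic inspects far more than a fixed prefix; this table is illustrative.
SIGNATURES = {
    bytes.fromhex("D0CF11E0A1B11AE1"): (
        "application/x-ole-storage",
        frozenset({".msg", ".doc", ".xls"}),
    ),
    b"%PDF-": ("application/pdf", frozenset({".pdf"})),
}


def sniff(head: bytes) -> tuple[str, frozenset[str]]:
    """Return (content type, valid extensions) for the first bytes of a file."""
    for magic, (content_type, extensions) in SIGNATURES.items():
        if head.startswith(magic):
            return content_type, extensions
    return GENERIC, frozenset()


def check_upload(filename: str, declared_type: str, head: bytes) -> str:
    """Return the detected content type, or raise ValueError on a mismatch."""
    # An empty 'type' from the client falls back to the generic binary type.
    declared = declared_type or GENERIC
    detected, valid_extensions = sniff(head)
    extension = PurePath(filename).suffix.lower()

    if detected == GENERIC:
        # What to do with unrecognised content is a policy choice;
        # this sketch simply rejects it.
        raise ValueError("content type could not be detected")
    if extension not in valid_extensions:
        raise ValueError(f"extension {extension!r} does not match {detected}")
    if declared not in (GENERIC, detected):
        raise ValueError(f"declared type {declared!r} does not match {detected}")
    return detected
```

With this sketch, a file named `mail.msg` uploaded with an empty type and OLE2 magic bytes is accepted, while the same bytes under the name `mail.pdf` are rejected because the extension is not valid for the detected type.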
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
@sergei-maertens sergei-maertens added the needs-backport Fix must be backported to stable release branch label Dec 30, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
The validator now rejects (temporary) file uploads that don't have an
extension, as this prevents validating that the extension and content
type match. This used to pass and would then be caught later when
linking the file upload component and temporary file upload.
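
As a rough illustration of the extension requirement described above (the function name `require_extension` is hypothetical and this is not the validator from the referenced commit), the check could look like this using only the standard library:

```python
from pathlib import PurePath


def require_extension(filename: str) -> str:
    """Return the lowercased extension, or raise ValueError if there is none."""
    suffix = PurePath(filename).suffix.lower()
    # Without an extension there is nothing to compare against the
    # detected content type, so the upload is rejected up front.
    if len(suffix) < 2:
        raise ValueError(f"file name {filename!r} has no extension")
    return suffix
```

Rejecting early keeps the failure at the upload step instead of surfacing later, when the temporary upload is linked to the file upload component.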
@sergei-maertens sergei-maertens moved this from In Progress to Implemented in Development Dec 30, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
Backport-of: #4961
sergei-maertens pushed a commit that referenced this issue Dec 30, 2024
Backport-of: #4961
@github-project-automation github-project-automation bot moved this from Implemented to Done in Development Dec 30, 2024
sergei-maertens added a commit that referenced this issue Dec 30, 2024
This makes the cases that need to be handled a bit more readable than
the contrived if/elif flows.
sergei-maertens added a commit that referenced this issue Dec 30, 2024
Validated with Sonny, who is also using Arch: since libmagic 5.46, the
detected content type for these 'exotic' zip formats is no longer
application/zip but application/octet-stream. In other words, libmagic
doesn't know those magic bytes (anymore).

Given the earlier patches, all we can do is allow these files to go
through.

Our Docker images are based on Debian bookworm, which ships libmagic
5.44. Debian unstable currently still has 5.44.
sergei-maertens added a commit that referenced this issue Dec 31, 2024
Projects
Status: Done
4 participants