ogg vorbis files recognized as audio/vorbis #64

Closed
ClearlyClaire opened this issue Sep 12, 2021 · 1 comment · Fixed by #65

ClearlyClaire commented Sep 12, 2021

Detecting the MIME type of an Ogg Vorbis file returns audio/vorbis instead of audio/ogg. According to https://wiki.xiph.org/MIME_Types_and_File_Extensions, audio/vorbis is meant for Vorbis streams without a container.

irb(main):003:0> Marcel::Magic.by_magic(File.open('spec/fixtures/files/boop.ogg'))
=> #<Marcel::Magic:0x0000560870431ec0 @mediatype="audio", @subtype="vorbis", @type="audio/vorbis">
irb(main):002:0> Marcel::MimeType.for(Pathname.new('spec/fixtures/files/boop.ogg'))
=> "audio/vorbis"
irb(main):004:0> Marcel::MimeType.for(File.open('spec/fixtures/files/boop.ogg'), name: 'boop.ogg')
=> "audio/vorbis"

This is a bit surprising and might throw some tools off, as audio/vorbis is generally neither expected nor associated with any file format or extension (since it describes the streams themselves, not files or containers).

Passing declared_type works, but it might not be provided, or it might be provided by an untrusted source:

irb(main):006:0> Marcel::MimeType.for(File.open('spec/fixtures/files/boop.ogg'), name: 'boop.ogg', declared_type: 'audio/ogg')
=> "audio/ogg"

EDIT: this seems to come from https://github.com/rails/marcel/blob/main/data/tika.xml#L5135-L5146, which was introduced in Apache Tika by apache/tika@41c6749, but I do think it's wrong, as audio/vorbis seems to be defined by RFC 5215 and to be specific to RTP streams.
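
Until detection is fixed, one possible stopgap is to normalize the detected type on the caller's side rather than relying on declared_type. The sketch below is only an illustration, not part of Marcel: `ogg_container?` and `normalize_ogg_type` are hypothetical helper names. It relies on the fact that every Ogg page begins with the 4-byte capture pattern "OggS", and it assumes the IO is seekable.

```ruby
require "stringio"

# Every Ogg page starts with the 4-byte capture pattern "OggS".
OGG_CAPTURE_PATTERN = "OggS".b

# Hypothetical helper (not a Marcel API): true if the IO begins with an
# Ogg page header. Rewinds before and after so the caller's position is reset.
def ogg_container?(io)
  io.rewind
  header = io.read(OGG_CAPTURE_PATTERN.bytesize)
  io.rewind
  header == OGG_CAPTURE_PATTERN
end

# Hypothetical helper (not a Marcel API): maps the bare-stream type to the
# container type only when the data really is wrapped in an Ogg container.
# Any other detected type is returned unchanged.
def normalize_ogg_type(detected_type, io)
  return "audio/ogg" if detected_type == "audio/vorbis" && ogg_container?(io)

  detected_type
end
```

With Marcel loaded, this could presumably be used as `normalize_ogg_type(Marcel::MimeType.for(file), file)`, where `file` is an open File. It only papers over the symptom; the underlying magic definition is still what needs correcting.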

gmcgibbon (Member) commented Sep 22, 2021

This is mentioned in #48 as a regression between 0.3.3 and 1.0.0. Regardless of whether it is right or wrong, it is still a regression from the other MIME DB we were using. I'll try fixing it!
