Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MimeTypeDetector fails for local URIs #824

Closed
henini opened this issue Feb 16, 2016 · 8 comments
Closed

MimeTypeDetector fails for local URIs #824

henini opened this issue Feb 16, 2016 · 8 comments
Assignees
Labels
bug Confirmed bugs or reports that are very likely to be bugs

Comments

@henini
Copy link

henini commented Feb 16, 2016

When a URL encoded local URI is passed to the function

MimeTypeDetector.isPdfContentType(String url)

the return value turns out to be False if url is a local URI.
If one likes to implement FullTextFinder using a temporary copy of some fetched PDF this needs to be changed.

@stefan-kolb stefan-kolb self-assigned this Feb 16, 2016
@stefan-kolb stefan-kolb added the bug Confirmed bugs or reports that are very likely to be bugs label Feb 16, 2016
henini pushed a commit to henini/jabref that referenced this issue Feb 16, 2016
@stefan-kolb
Copy link
Member

You are thinking of URIs like file:/C:/Users/Stefan/Desktop/Github/jabref/build/resources/test/net/sf/jabref/logic/io/empty.pdf?

@henini
Copy link
Author

henini commented Feb 17, 2016

Exactly, the current implementation seems to work for http URLs only.

I also tried to write a test, but trying to get absolute paths in Java annoyed me a bit.

@stefan-kolb
Copy link
Member

The question is what URLs/Paths should MimeTypeDetector be able to categorize.
Is C:/Users/Stefan/Desktop/Github/jabref/build/resources/test/net/sf/jabref/logic/io/empty.pdf also a valid input here? Do we really need to check local URLs and paths? At the moment there is no use case (FullTextFinder) for this.

@henini
Copy link
Author

henini commented Feb 17, 2016

URLConnection works for local URIs as well as http URLs.
Some institutional APIs return a PDF file instead of a link to a PDF. Generalising MimeTypeDetector eases writing custom fetchers a bit.

@simonharrer
Copy link
Contributor

👍 for adding a test :)

@tobiasdiez
Copy link
Member

Mhhh... the code checks the HTTP header Content-Type and thus fails for local URIs since these don't have a HTTP header.

To be honest, I don't see the use-case to allow local URIs. Are you trying to check that a file with .pdf ending is really a pdf? Can you please elaborate a bit more on what you are trying to achieve.

@henini
Copy link
Author

henini commented Feb 17, 2016

java.net.URLConnection has getContentType(), which returns the Content-Type of a file:// URI as well as the Content-Type of a http:// URL, replacing the code in MimeTypeDetector does not break the current tests

@stefan-kolb
Copy link
Member

I already created a test, but will only push it later so it doesn't break the build.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Confirmed bugs or reports that are very likely to be bugs
Projects
None yet
Development

No branches or pull requests

4 participants