-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make require case-sensitive on all platforms #13542
Conversation
I'd expect readdir to be really slow here. |
I don't think there's any other reliable solution. I wonder how much of a concern it is in practice - the check can be disabled on linux, which is the OS where it's most likely there will remote volumes where If the folders are local, then I'm sceptical the cost of ~20 readdir's on package src directories is meaningful compared the total runtime of importing a package. |
@simonster suggested using |
Simon correctly points out that the |
But you could still use it as long as |
OK, the only case where the pr falls back to listing a directory is if the file is a symbolic link on OS X. I think I have a working solution for Windows - at least it works on my Windows 7 x64 box. https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man2/getattrlist.2.html might help with OS X. |
@osx_only function isfile_casesensitive(path) | ||
isfile(path) || return false | ||
islink(path) && return isfile_casesensitive_slow(path) | ||
realpath(path) == path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you can find a better solution, I think this should check basename(realpath(path)) == basename(path)
in case path
is not a symlink but is in a symlinked directory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes.
Excellent – thanks for tackling this. |
needs a test |
Good news - I've got working C code for getting the canonical capitalization in OS X. Just a matter of wrapping it in Julia now. |
OK, all the functionality is here. I'll clean it up and tests and then it should be g2g. |
@@ -2,6 +2,52 @@ | |||
|
|||
# Base.require is the implementation for the `import` statement | |||
|
|||
# Generic case-sensitive version of isfile | |||
function isfile_casesensitive_slow(path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like it would be cleaner to do
if OS_NAME == :Windows
function isfile_casesensitive(path)
...
end
elseif OS_NAME == :Linux
...
else
...
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed
Also fixes #9007. |
See also how Python 2.7 does it. |
Looks like it's walking the directory tree. |
Yes, looks like it. Of course, they had to support OS/2 and other ancient systems, so maybe the things you are doing were not available at some point. |
buf_size *= 2 | ||
continue | ||
end | ||
canon_path = bytestring(pointer(buf, 13)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couldn't you just do canon_path = UTF8String(buf)
in order to avoid making a copy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The path is stored at an offset from the beginning of the buffer, so I had to use a somewhat inelegant workaround.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see.
I'm not sure this is handling Unicode normalization properly. Suppose you have a file named On MacOS, On Windows, I'm not 100% sure of the situation. The documentation says There is no need to perform any Unicode normalization on path and file name strings for use by the Windows file I/O API functions because the file system treats path and file names as an opaque sequence of WCHARs. This makes me think that (On Linux and probably most other Unix-like systems, the NFC and NFD versions are distinct filenames.) |
Yes, I can confirm |
Great catch @stevengj . |
And what happens on OS X when a different filesystem from HFS+ is used? Should we even care about that scenario? |
You're right, that's a real concern. If a file is stored on a mounted non-hfs+ directory, has a unicode character, and isn't NFD-normalized, it would be invisible to |
The same issue applies to linux: What if the file happens to be on a case-insensitive filesystem? Then loading.jl:9 is wrong. Although at least it's not as serious, since it would be no more wrong than the current behavior of |
Alright, I think I worked around the OS X non-HFS+ concern wrt utf normalization. |
if normed_path == path | ||
false | ||
else | ||
isfile_casesensitive(normed_path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not just:
isascii(path_basename) && return false # shortcut for common case of ASCII paths
normed_basename = normalize_string(path_basename, :NFD)
return normed_basename.data == casepreserved_basename
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed.
LGTM once Travis is green and the commits are squashed. |
@@ -10,3 +10,17 @@ thefname = "the fname!//\\&\0\1*" | |||
@test include_string("Base.source_path()", thefname) == Base.source_path() | |||
@test basename(@__FILE__) == "loading.jl" | |||
@test isabspath(@__FILE__) | |||
|
|||
let true_filename = "cAsEtEsT.jl", lowered_filename="casetest.jl" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add a comment here:
# Issue #5789 and PR #13542:
d61579f
to
b78f560
Compare
OK, squashed and tests are green. |
Make require case-sensitive on all platforms
Thanks, @malmaud, for pushing this through! |
Thanks for the reviewing!
|
Can you submit a quick PR for |
# GetLongPathName Win32 function returns the case-preserved filename on NTFS. | ||
function isfile_casesensitive(path) | ||
isfile(path) || return false # Fail fast | ||
longpath(path) == path |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Possibly this should be basename(longpath(path)) == basename(path)
? Two reasons:
- On MacOS, we only check that the basename matches, so this is inconsistent.
- The intent of this was that
import Foo
would check forFoo
in a case-sensitive way. But this does more than that, and checks the whole path. This means that usersLOAD_PATH
andHOME
variables becomes case-sensitive, for example, which is unexpected on case-insensitive systems, and has led to at least one confused user on discourse.
Resolves #5789.