Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Normalize case of context root under Win32 #3966

Merged
merged 2 commits into from
Jan 27, 2021
Merged

Conversation

nojb
Copy link
Collaborator

@nojb nojb commented Nov 23, 2020

As discussed on Slack, the case of Sys.getcwd can vary from invocation to invocation of dune depending on the environment. This can result in gratuitous recompilations.

To avoid this, this PR proposes to normalize the case of Sys.getcwd (as exposed by Path) to lowercase (under Windows only).

cc @dra27 @rgrinberg

@nojb nojb requested a review from rgrinberg as a code owner November 23, 2020 18:22
@rgrinberg
Copy link
Member

One thing I didn't understand from the slack conversation is if its the drive letter that is case insensitive or the entire path? This change is only right if it's the latter.

@nojb
Copy link
Collaborator Author

nojb commented Nov 23, 2020

One thing I didn't understand from the slack conversation is if its the drive letter that is case insensitive or the entire path? This change is only right if it's the latter.

I think it can be the whole path. I am trying to come up with a reproduction case.

@dra27
Copy link
Member

dra27 commented Nov 23, 2020

It's true for the entire path, but I don't think you want to do this routinely - it can break if you happen to be on a case sensitive file system which is mounted and it looks suspicious when you start seeing c:\documents and settings\username\my documents. It's that comparisons want to be case insensitive.

Actually get the correct path involves a fairly simply wrapping of FindFirstFile IIRC.

@nojb
Copy link
Collaborator Author

nojb commented Nov 23, 2020

It's true for the entire path, but I don't think you want to do this routinely - it can break if you happen to be on a case sensitive file system which is mounted and it looks suspicious when you start seeing c:\documents and settings\username\my documents. It's that comparisons want to be case insensitive.

But I think that changing Path so that comparisons are made case-insensitive on Windows is a big change, and it is not even clear to me which notion of "case-insensitive" we would be talking about...

Actually get the correct path involves a fairly simply wrapping of FindFirstFile IIRC.

This sounds simple enough. Do I understand your suggestions correctly to mean that we should call FindFirstFile on the result of Sys.getcwd?

@dra27
Copy link
Member

dra27 commented Nov 23, 2020

It's true for the entire path, but I don't think you want to do this routinely - it can break if you happen to be on a case sensitive file system which is mounted and it looks suspicious when you start seeing c:\documents and settings\username\my documents. It's that comparisons want to be case insensitive.

But I think that changing Path so that comparisons are made case-insensitive on Windows is a big change, and it is not even clear to me which notion of "case-insensitive" we would be talking about...

Yeah, I realised that was likely to be the case... but equally the lowercase trick might just defer the problem to another day 🙂

Actually get the correct path involves a fairly simply wrapping of FindFirstFile IIRC.

This sounds simple enough. Do I understand your suggestions correctly to mean that we should call FindFirstFile on the result of Sys.getcwd?

Yes I believe that it will do the conversion - although FindFirstFile is expensive, you I think you'd want to check whether Sys.getcwd has changed and only if it has call FindFirstFile (i.e. store the pair (last_getcwd_result, last_normalised_getcwd_result) and just return snd if Sys.getcwd () returns fst.

@nojb
Copy link
Collaborator Author

nojb commented Dec 5, 2020

@dra27 Coming back to this, the FindFirstFile trick would serve to normalize the different path components, but not the drive letter, right? For the drive letter, shouldn't we just decide to normalize to lowercase or uppercase?

@dra27
Copy link
Member

dra27 commented Dec 5, 2020

I'm not sure - I'd expect that it would normalize it to uppercase, to be honest. It'd obviously be quite safe to do that in Dune to be sure. Some prior art (from a brief look around - the pull requests are old):

  • PowerShell API (not sure if it's current) solving exactly this problem: https://github.com/PowerShell/PowerShell/blob/1e940f55b1f265c9640d622937383208e13a849b/src/System.Management.Automation/namespaces/FileSystemProvider.cs#L124
  • Proposal to add a similar function to .NET: https://github.com/dotnet/corefx/pull/2219/files

(links as code to avoid GitHub spamming other repos!)

@nojb
Copy link
Collaborator Author

nojb commented Dec 5, 2020

I'm not sure - I'd expect that it would normalize it to uppercase, to be honest.

The problem I have is that FindFirstFile only returns the normalized basename, so one has to call it recursively to normalize every path component, but it fails when passed a path consisting of only a drive letter, eg c:/.

It'd obviously be quite safe to do that in Dune to be sure. Some prior art (from a brief look around - the pull requests are old):

  • PowerShell API (not sure if it's current) solving exactly this problem: https://github.com/PowerShell/PowerShell/blob/1e940f55b1f265c9640d622937383208e13a849b/src/System.Management.Automation/namespaces/FileSystemProvider.cs#L124

Looks like it calls FindFirstFile recursively, will take a closer look.

  • Proposal to add a similar function to .NET: https://github.com/dotnet/corefx/pull/2219/files

This one uses GetLongPathName; I tried that, it normalizes the path components, but not the drive letter.

Base automatically changed from master to main January 14, 2021 17:09
@nojb nojb changed the title Normalize case of Sys.getcwd under Win32 Normalize case of context root under Win32 Jan 15, 2021
@nojb
Copy link
Collaborator Author

nojb commented Jan 15, 2021

Coming back to this as this issue is blocking for us as it causes a lot of needless recompilation. I pushed a simpler fix which just normalizes the context root to lowercase under Windows. This fixes the recompilation problem for us.

Would appreciate a review! cc @dra27 @rgrinberg

Copy link
Member

@rgrinberg rgrinberg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me. There are other uses of Sys.getcwd () in the codebase, but if this fix is enough then so be it.

This needs a CHANGES entry as well.

Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
@nojb
Copy link
Collaborator Author

nojb commented Jan 20, 2021

Looks good to me. There are other uses of Sys.getcwd () in the codebase, but if this fix is enough then so be it.

This needs a CHANGES entry as well.

Done. I also cleaned the patch up a bit by adding Path.External.lowercase_ascii. @dra27 any objections to merging this?

@nojb
Copy link
Collaborator Author

nojb commented Jan 27, 2021

I believe @dra27 it busy these days, so unless someone objects, I am planning to merge this as it is in the next few hours since it solves an actual problem and the change is small.

@nojb
Copy link
Collaborator Author

nojb commented Jan 27, 2021

We discussed the PR with @dra27 on Slack, and he didn't see anything blocking (even if maybe the patch can be improved later on), so I will merge as soon as the CI is green.

@nojb nojb merged commit cabf7eb into ocaml:main Jan 27, 2021
@nojb nojb deleted the win32_getcwd branch January 27, 2021 12:00
@emillon
Copy link
Collaborator

emillon commented Feb 3, 2021

I was looking at how to make cram tests work better on windows and I was surprised to see that now the result of Path.to_absolute_filename passed to cygpath is not detected as a cygwin path anymore:

$ cygpath 'c:\ocaml64\home'
/cygdrive/c/ocaml64/home
$ cygpath 'c:\OCaml64\home'
/home

@nojb
Copy link
Collaborator Author

nojb commented Feb 3, 2021

I was looking at how to make cram tests work better on windows and I was surprised to see that now the result of Path.to_absolute_filename passed to cygpath is not detected as a cygwin path anymore:

Uf for once I thought this was going to be painless, but I was wrong...

@nojb
Copy link
Collaborator Author

nojb commented Feb 3, 2021

I was looking at how to make cram tests work better on windows and I was surprised to see that now the result of Path.to_absolute_filename passed to cygpath is not detected as a cygwin path anymore:

$ cygpath 'c:\ocaml64\home'
/cygdrive/c/ocaml64/home
$ cygpath 'c:\OCaml64\home'
/home

Just to make sure I understand: is c:\OCaml64\ the cygwin root?

@emillon
Copy link
Collaborator

emillon commented Feb 4, 2021

Yes, the cygwin root is c:\OCaml64\ (that's the default in fdopen's installer)

@nojb
Copy link
Collaborator Author

nojb commented Feb 11, 2021

I was looking at how to make cram tests work better on windows and I was surprised to see that now the result of Path.to_absolute_filename passed to cygpath is not detected as a cygwin path anymore:

To be clear, the problem you have is with one path (namely, the context root) when passed to Path.to_absolute_filename, right? Because the function Path.to_absolute_filename has not itself been modified in any way...

@emillon
Copy link
Collaborator

emillon commented Feb 11, 2021

I first tried to change cram rules so that both the POSIX and Windows versions of cwd are replaced by $TESTCASE_ROOT. At the moment this is using to_absolute_filename:

{ source = Path.to_absolute_filename cwd; target = "$TESTCASE_ROOT" }

In turn, this function uses abs_root (which has been lowercased):

dune/src/stdune/path.ml

Lines 588 to 593 in 8330833

let to_absolute_filename t =
match t with
| External s -> External.to_string s
| In_source_dir l ->
External.to_string
(External.relative (Lazy.force abs_root) (Local.to_string l))

Because of this, the value that's being replaced as $TESTCASE_ROOT is something like "lowercase\windows\path\to\root/_build/path/to/dir, and it does not match anything 😕. I think passing this path to cygpath would work to create a POSIX path, but it doesn't because of the lowercase section. I'm a bit less concerned about the mix of forward and backslashes since cygpath can also be used to build a clean Windows path.

@nojb nojb added this to the 2.8.3 milestone Feb 17, 2021
rgrinberg pushed a commit that referenced this pull request Mar 7, 2021
Signed-off-by: Nicolás Ojeda Bär <n.oje.bar@gmail.com>
rgrinberg added a commit to rgrinberg/opam-repository that referenced this pull request Mar 7, 2021
…ne-action-plugin, dune-private-libs and dune-glob (2.8.3)

CHANGES:

- Make `patdiff` show refined diffs (ocaml/dune#4257, fixes ocaml/dune#4254, @hakuch)

- Fixed a bug that could result in needless recompilation under Windows due to
  case differences in the result of `Sys.getcwd` (observed under `emacs`).
  (ocaml/dune#3966, @nojb).

- Restore compatibility with Coq < 8.10 for coq-lang < 0.3 , document
  that `(using coq 0.3)` does require Coq 8.10 at least (ocaml/dune#4224, fixes
  ocaml/dune#4142, @ejgallego)

- Add a META rule for 'compiler-libs.native-toplevel' (ocaml/dune#4175, @AltGr)

- No longer call `chmod` on symbolic links (fixes ocaml/dune#4195, @dannywillems)

- Dune no longer automatically create or edit `dune-project` files
  (ocaml/dune#4239, fixes ocaml/dune#4108, @jeremiedimino)

- Have `dune` communicate the location of the standard library directory to
  `merlin` (ocaml/dune#4211, fixes ocaml/dune#4188, @nojb)

- Workaround incorrect exception raised by Unix.utimes (OCaml PR#8857) in
  Path.touch on Windows (ocaml/dune#4223, @dra27)

- `dune ocaml-merlin` is now able to provide configuration for source files in
  the `_build` directory. (ocaml/dune#4274, @voodoos)

- Automatically delete left-over Merlin files when rebuilding for the first time
  a project previously built with Dune `<= 2.7`. (ocaml/dune#4261, @voodoos, @aalekseyev)

- Fix `ppx.exe` being compiled for the wrong target when cross-compiling
  (ocaml/dune#3751, fixes ocaml/dune#3698, @toots)

- `dune top` correctly escapes the generated toplevel directives, and make it
  easier for `dune top` to locate C stubs associated to concerned libraries.
  (ocaml/dune#4242, fixes ocaml/dune#4231, @nojb)

- Do not pass include directories containing native objects when compiling
  bytecode (ocaml/dune#4200, @nojb)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants