Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Opam for windows : make cold fails with OPAMW_SHGetFolderPath #5861

Closed
Chimrod opened this issue Feb 28, 2024 · 24 comments · Fixed by #5862, janestreet/spawn#58 or #5880
Closed

Opam for windows : make cold fails with OPAMW_SHGetFolderPath #5861

Chimrod opened this issue Feb 28, 2024 · 24 comments · Fixed by #5862, janestreet/spawn#58 or #5880

Comments

@Chimrod
Copy link

Chimrod commented Feb 28, 2024

I’m trying to compile opam for windows from scratch. I have an error in the command make cold, which comes after the compiler has been bootstrapped:

$ make cold OCAML_PORT=mingw64
…
Opam will be built WITH its default built-in solver

Executables will be installed in /usr/local/bin
Manual pages will be installed in /usr/local/share/man

Downloading vendored source dependencies...
done
Extracting vendored source dependencies in src_ext/... done
env PATH="`pwd`/bootstrap/ocaml/bin:$PATH" CAML_LD_LIBRARY_PATH= make
make[1]: Entering directory '/home/opam'
/home/opam/src_ext/dune-local/dune.exe build --profile=release --root .  --promote-install-files -- opam-installer.install opam.install
Fatal error: exception Failure("OPAMW_SHGetFolderPath")
Fatal error: exception Failure("OPAMW_SHGetFolderPath")
File "doc/man/dune", line 5, characters 0-142:
5 | (rule
6 |   (targets opam.1)
7 |   (deps opam-topics.inc opam-admin-topics.inc)
8 |   (action (with-stdout-to %{targets} (run %{bin:opam} --help=groff))))
Fatal error: exception Failure("OPAMW_SHGetFolderPath")
make[1]: *** [Makefile:144: build-opam] Error 1
make[1]: Leaving directory '/home/opam'
make: *** [Makefile:283: cold] Error 2
@kit-ty-kate
Copy link
Member

Sorry you're encountering this. I've never seen this issue before and I'm not familiar with the windows API internals (https://learn.microsoft.com/en-us/windows/win32/api/shlobj_core/nf-shlobj_core-shgetfolderpatha doesn't really tell me much). @dra27 do you have any idea what could be going wrong?

On a side note: Is there any reason why you want to compile opam from scratch instead of using the precompiled binary provided? (see the first line of https://opam.ocaml.org/blog/opam-2-2-0-beta1/#How-to-Test-opam-on-Windows) Compiling opam is of course fine but between #5859 and this ticket you seem to keep encountering issues and I feel like it would be easier to just use the precompiled binary.

If you've seen this blog post but didn't see the line I could try to change it to make it more visible if that helps.

@kit-ty-kate
Copy link
Member

After reading some of the code, I opened #5862, which might or might not fix your issue (I don't know as I don't know how to reproduce it). In any case i believe it to be a general improvement so it might be worth merging it.

@dra27
Copy link
Member

dra27 commented Feb 28, 2024

@kit-ty-kate's fix is definitely the right way to tackle this, but out of curiosity, what are the values of USERPROFILE and LOCALAPPDATA in your environment (echo %USERPROFILE% and echo %LOCALAPPDATA% from a command prompt)?

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

Hello, this is the content of the two values:

C:\Users\Sébastien>echo %USERPROFILE%
C:\Users\Sébastien

C:\Users\Sébastien>echo %LOCALAPPDATA%
C:\Users\Sébastien\AppData\Local

For the context, I’m trying to configure a Windows VM I would use to compile code for windows. I won’t code in this environment, only compile and run tests.

I’ve started with a fresh windows installation, installed the required driver to help in the VM, cygwin and that’s all. No Vs-code, no msvc, I’ve only installed what’s in the installation documentation.

I got errors when using opam binary, and I wanted to reproduce using the latest git version before opening a ticket. I know this branch is still in beta, so it’s ok to get errors like this. I can’t help a lot, but if I can report you some more cases it’s ok for me :)

@kit-ty-kate
Copy link
Member

kit-ty-kate commented Feb 29, 2024

C:\Users\Sébastien

Looking around it looks like SHGetFolderPath refers by defaut to SHGetFolderPathA (where the A seems to stand for ASCII) so this function we are using doesn't support unicode characters. Instead we could have used SHGetFolderPathW which does support unicode characters.

So it looks like #5862 should fix your issue as SHGetKnownFolderPath (the non-deprecated function) supports unicode by default.

@kit-ty-kate
Copy link
Member

For the context, I’m trying to configure a Windows VM I would use to compile code for windows. I won’t code in this environment, only compile and run tests.

I’ve started with a fresh windows installation, installed the required driver to help in the VM, cygwin and that’s all. No Vs-code, no msvc, I’ve only installed what’s in the installation documentation.

I got errors when using opam binary, and I wanted to reproduce using the latest git version before opening a ticket. I know this branch is still in beta, so it’s ok to get errors like this. I can’t help a lot, but if I can report you some more cases it’s ok for me :)

This is definitely appreciated, thank you so much :3

@kit-ty-kate kit-ty-kate added this to the 2.2.0~beta2 milestone Feb 29, 2024
@kit-ty-kate
Copy link
Member

@Chimrod I believe #5862 is now ready to be tested. Would you be able to test it locally when you have some time?

git clone https://github.com/kit-ty-kate/opam -b windows-SHGetKnownFolderPath
cd opam
make cold

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

I still have errors:

Extracting vendored source dependencies in src_ext/... done
env PATH="`pwd`/bootstrap/ocaml/bin:$PATH" CAML_LD_LIBRARY_PATH= make
make[1]: Entering directory '/home/opam'
cd src_ext/dune-local && ocaml bootstrap.ml
ocamlc -output-complete-exe -w -24 -g -o .duneboot.exe -I boot unix.cma boot/libs.ml boot/duneboot.ml
.\.duneboot.exe
/home/opam/src_ext/dune-local/dune.exe build --profile=release --root .  --promote-install-files -- opam-installer.install opam.install
File "doc/man/opam-admin-topics.inc", line 25, characters 0-362:
25 | (rule
26 |   (targets opam-admin-add-hashes.1 opam-admin-add-hashes.err)
27 |   (deps using-built-opam)
28 |   (action (progn (with-stderr-to opam-admin-add-hashes.err
29 |                    (with-stdout-to opam-admin-add-hashes.1 (run %{bin:opam} admin add-hashes --help=groff)))
30 |                  (diff opam-admin-add-hashes.err %{dep:opam-admin-add-hashes.0})))
31 |   (package opam))
Command exited with code 2.

…

File "doc/man/opam-topics.inc", line 175, characters 0-292:
175 | (rule
176 |   (targets opam-update.1 opam-update.err)
177 |   (deps using-built-opam)
178 |   (action (progn (with-stderr-to opam-update.err
179 |                    (with-stdout-to opam-update.1 (run %{bin:opam} update --help=groff)))
180 |                  (diff opam-update.err %{dep:opam-update.0})))
181 |   (package opam))
Command exited with code 2.
Fatal error: exception Failure("OPAMW_SHGetKnownFolderPath")
Fatal error: exception Failure("OPAMW_SHGetKnownFolderPath")
File "doc/man/dune", line 5, characters 0-142:
5 | (rule
6 |   (targets opam.1)
7 |   (deps opam-topics.inc opam-admin-topics.inc)
8 |   (action (with-stdout-to %{targets} (run %{bin:opam} --help=groff))))
Fatal error: exception Failure("OPAMW_SHGetKnownFolderPath")
make[1]: *** [Makefile:144: build-opam] Error 1
make[1]: Leaving directory '/home/opam'
make: *** [Makefile:283: cold] Error 2

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

$ git status
On branch windows-SHGetKnownFolderPath
Your branch is up to date with 'origin/windows-SHGetKnownFolderPath'.

@kit-ty-kate
Copy link
Member

Could you try with the debug-windows-SHGetKnownFolderPath branch (on my fork too) ?
I've added a debug printf HRESULT = <ID of the error> that you should be able to see in the output when compiling.
This would be useful to see what we're actually dealing with.

Side note: you can just run make directly if you use the same directory, as the make cold you did before would have already set everything up

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

$ make
/home/opam/src_ext/dune-local/dune.exe build --profile=release --root .  --promote-install-files -- opam-installer.install opam.install
Fatal error: exception Failure("OPAMW_SHGetKnownFolderPath")
HRESULT = 3
Fatal error: exception Failure("OPAMW_SHGetKnownFolderPath")
HRESULT = 3
File "doc/man/dune", line 5, characters 0-142:
5 | (rule
6 |   (targets opam.1)
7 |   (deps opam-topics.inc opam-admin-topics.inc)
8 |   (action (with-stdout-to %{targets} (run %{bin:opam} --help=groff))))
Fatal error: exception Failure("OPAMW_SHGetKnownFolderPath")
HRESULT = 3
make: *** [Makefile:144: build-opam] Error 1

@kit-ty-kate
Copy link
Member

arf, sorry i made a mistake in the debug code. I've pushed a commit fixing it. Could you try again with that new code?

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

HRESULT = 80070003

@kit-ty-kate
Copy link
Member

The error looks to be ERROR_PATH_NOT_FOUND.
I've pushed one more debug commit to be sure of which path doesn't exists. Could you try the new code on the same debug branch one more time?

My hunch is that it's probably LocalAppData. If that is the case, could you check that this directory exists?

C:\Users\Sébastien\AppData\Local

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

Indeed, the error is located with LocalAppData. But the directory exists, and is populated:

In windows terminal:

C:\Users\Sébastien>dir AppData

19/02/2024  14:47    <DIR>          ..
29/02/2024  14:15    <DIR>          Local
29/12/2023  18:12    <DIR>          LocalLow
29/12/2023  18:11    <DIR>          Roaming

In cygwin:

$ ls /cygdrive/c/Users/Sébastien/AppData/Local
'Application Data'          D3DSCache      Microsoft   PeerDistRepub               Temp                        opam
 Comms                      Historique     OneDrive    PlaceholderTileLogoFolder  'Temporary Internet Files'
 ConnectedDevicesPlatform   IconCache.db   Packages    Publishers                  VirtualStore

(the opam directory comes from my firsts test, I’ve renamed it in the case the compiler did not managed well the previous installation but it did not changed the result)

@Chimrod
Copy link
Author

Chimrod commented Feb 29, 2024

Also, the environment variable exist:

dir %LocalAppData%

…

29/02/2024  14:15    <DIR>          .
29/12/2023  19:01    <DIR>          Comms
29/12/2023  18:11    <DIR>          ConnectedDevicesPlatform
28/02/2024  10:04    <DIR>          D3DSCache
25/02/2024  14:27    <DIR>          Microsoft
18/02/2024  20:51    <DIR>          OneDrive
28/02/2024  14:17    <DIR>          opam
25/02/2024  10:25    <DIR>          Packages
25/02/2024  10:42    <DIR>          PeerDistRepub
30/12/2023  15:47    <DIR>          PlaceholderTileLogoFolder
29/12/2023  18:32    <DIR>          Publishers
29/02/2024  13:50    <DIR>          Temp
08/02/2024  09:46    <DIR>          VirtualStore

@dra27
Copy link
Member

dra27 commented Mar 1, 2024

@Chimrod - I think the build is proceeding far enough that you should be able to run _build/default/src/client/opamMain.exe and I think that if you run that you will see the same exception? Could you try running that executable not from a Cygwin shell but from a fresh command prompt? Does it also raise the same exception?

@Chimrod
Copy link
Author

Chimrod commented Mar 1, 2024

FYI, I’ve created a new user named "sebastien" (no accent, all in lowercase) and I’ve been able to build opam without issue. I wanted to be sure there was’nt any other causes:

  • I’ve keeped the same cygwin installation (same root, same packages)
  • Cloned the repo in the new home (I didn’t want to bother with permission issue)
  • Ran make cold

Now I will switch to the user "Sébastien" and test your actions.

@Chimrod
Copy link
Author

Chimrod commented Mar 1, 2024

With the previous user, I’ve been able to run opam init:

This is the test in cygwin:

Sébastien@DESKTOP-LFFRUL8 /home/opam
$ _build/default/src/client/opamMain.exe update
LocalAppData

<><> Updating package repositories ><><><><><><><><><><><><><><><><><><><><>  🐫

System
[default] synchronised from git+https://github.com/ocaml-opam/opam-repository-mingw.git#sunset
[upstream] synchronised from https://opam.ocaml.org
Now run 'opam upgrade' to apply any package updates.

and from windows terminal:

C:\cygwin64\home\opam\_build\default\src\client>opamMain.exe update
Home
LocalAppData

<><> Updating package repositories ><><><><><><><><><><><><><><><><><><><><>  🐫
System
[default] synchronised from git+https://github.com/ocaml-opam/opam-repository-mingw.git#sunset
[upstream] no changes from https://opam.ocaml.org

(I’ve keeped the branch created by kit-ty-kate, and we can see the trace in the console).

@kit-ty-kate
Copy link
Member

While debugging this i encountered another (somewhat related it seems) problem in dune ocaml/dune#10180

I tried to reproduce your setup with a small difference: ě instead of é which makes the UTF16 encoding break in more spectacular ways and way sooner

@Chimrod
Copy link
Author

Chimrod commented Mar 1, 2024

So you got the initial issue where my journey started! I had this issue with the opam binary when I started to use it, but without any clue on the cause of the error, I switched to the sources and here we are.

@dra27
Copy link
Member

dra27 commented Mar 1, 2024

Aha - it was the caller, just not the one I was looking at it; nice find, @kit-ty-kate!

@kit-ty-kate
Copy link
Member

Aha - it was the caller, just not the one I was looking at it; nice find, @kit-ty-kate!

Ah! Does Windows make the processing of strings dependent on the parent processes? (e.g. the parent process use CreateProcess in ASCII mode, then the children processes will also inherit this?)

If so this issue would finally make some sense and i'd feel relieved :D

@dra27
Copy link
Member

dra27 commented Mar 1, 2024

It's definitely not supposed to, no - the string processing is fixed at compile-time (that's the defining _UNICODE and UNICODE stuff). I was wanting to eliminate Cygwin from the equation at least - but it appears to be Dune!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment