Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fully test native Windows in the testsuite #3260

Merged
merged 6 commits into from
Jun 28, 2019
Merged

Conversation

dra27
Copy link
Member

@dra27 dra27 commented Mar 17, 2018

This PR contains one commit which is still work-in-progress and all the work is dependent on #3242, #3248 and #3255. It also has to point the testsuite to my unstable fork of opam-repository, so it's definitely not for merging just yet. The squashing of the three base PRs means that GitHub's review interface is largely useless, but this can start the conversations going about the architectural changes.

This PR adds sufficient native Windows support to opam to create a full testsuite pass, and in full VT100 TechniColor. There is still more to come, as it excludes all of the shell integration and additional features required to support Windows well.

The PR is best viewed in two sections: the first 22 commits add/fix the required support; the remaining 14 commits concern porting the testsuite.

(Relatively) Minor changes

  • Added extlib's Option.map_default to OpamStd.Option
  • The detection of the C++ compiler I added in Ensure a C++ compiler exists if MCCS is enabled #3216 broke MSVC, which doesn't require additional detection anyway (the Microsoft C compiler has in fact been a C++ compiler for decades...)
  • Filename module is now included in OpamCompat (Filename.extension is required)
  • There's a bug in a presumably previously untested Windows (code-)path in OpamSystem.resolve_commandwhich caused it to return the directory the resolved command resided in, rather than the command!
  • Filename.check_suffix added to OpamFilename.Base and also a function OpamFilename.Base.add_extension

Windows-related changes

  • opam is going to need to use tools provided by either Cygwin/MSYS2 or WSL for some time to come. In many cases, Cygwin tools can accept Windows-style paths if the back-slashes are converted to forward-slashes, but this is not universally the case. Cygwin provides a utility called cygpath for converting between all the various styles (Microsoft have recently introduced wslpath for WSL). Support for this is added in opam with three low-level functions. One OpamStd.is_cygwin_variant determines whether a given executable itself requires Cygwin, doesn't require Cygwin itself, but is linked to a library which does, or is a native Windows executable. This is used to provide OpamSystem.get_cygpath_function which returns a lazy function for a given command which will convert a supplied path if this is required. Finally, OpamUrl.map_file_url allows this function to be applied to file://-style URLs. This support is then used to patch invocations to git, rsync, and tar which allows the Cygwin versions of these commands to be used by opam.
  • Windows does not permit files or directories to be unlinked if anything is open (it is possible to open files and directories with special flags which do permit this, but it's best to pretend these don't exist in terms of architecture, because an annoying user can always open the file or have a shell with a directory you want to delete as a working directory!). This causes two problems:
    • Getting rid of lock files when an error has occurred. Where possible, I have "fixed" this by releasing locks (This solution is potentially unsound, and I think needs replacing).
    • This doesn't help for unhandled exceptions, so I've added a sledgehammer for this purpose OpamSystem.release_all_locks. The locking subsystem now keeps a Hashtbl of all lock fds which it knows haven't been released, and this function simply closes them all.
  • Windows file locking doesn't support promotion or demotion or recursive/re-entrant locking in the same way as Unix (underlying differences of architecture). I've added a guard which prevents a write lock from being acquired a second time and I've changed the promotion code for read->write. This adds the possibility that promoting a lock could result in losing it. This should probably be changed to have a weird Windows-only lock which is taken out in order to promote/demote a lock
  • The locking features added in Pass fd to functions in OpamFilename.with_flock* #3248 are utilised to fix the writing of state files in OpamRepository
  • Cygwin executables follow a different command line escaping principle from Windows executables. There 's a huge piece of work in OpamProcess which means that parameters should definitely be passed correctly to Cygwin executables and which also allows scripts to be executed as though Windows supported #!.
  • New function OpamSystem.classify_executable determines whether a given executable file is a script or a binary. This is used to fix OpamSystem.install for Windows, since it really doesn't make sense to be using Cygwin's install command. Various warnings are displayed if inappropriate things get installed (like shell scripts) simply because I couldn't be bothered to split that off into the branch with the shell integration.
  • A shim (with warning) is added when processing .install files so that if an executable file foo is not found, but foo.exe exists then foo.exe is installed.
  • Several bugs in the URL handling were fixed - mainly down to not using filename concatenation functions and/or ensuring that backslashes don't get introduced when manipulating file:// URLs, etc.
  • Interpolating variables is fundamentally broken once those variables start including back-slashes, since it means that interpolating them into .config files doesn't work. Tempting as it is to introduce yet another interpolation operator, a simpler scheme is used here: if the first line of the file being interpolated begins opam-version:, then opam assumes that any interpolation within double-quotes should escape backslashes. It's worked quite nicely in tests - looking at opam-repository, it didn't look as though this was likely to introduce regressions (since anything beginning opam-version: is extremely likely to be a file where having unescaped backslash characters before was a syntax error).

LF/CRLF endings

This is a fundamental problem which I think must be addressed before 2.0.0 is released. Git-for-Windows by default sets core.autocrlf to true, which means that files in git remotes are very likely to get checked out with CRLF line endings, even if they were checked in with LF line endings originally (e.g. opam-repository). This has two key consequences for opam:

  1. It means that checksums need to be checked both against the file and, if it's a text file, the "other" version of the file. I've implemented this, admittedly in a sub-optimal manner (this shouldn't matter too much, since we're talking about very small files here, typically)
  2. It means that patch files, especially when processing git remotes, need a lot of special handling or you get CRLF errors and warnings from the patch utility, which is surprisingly obtuse about CRLF mismatches. There is a WIP commit in this branch handling the common case for this. The procedure I have come up with here is to pre-process diff files before patch is called. One effect of this is to ensure that the patch headers will always use Unix LF line-endings (this should eliminate warnings from patch about stripping trailing CRs). Each file to be patched is then analysed, and the patch updated to either include or strip CR characters based on what is actually on the disk. For example, git diff will (correctly) produce a patch with Unix line endings for two commits (which have LF normalised endings) even though the files on disk have CRLF checked out line-endings (from core.autocrlf) - so patch won't be able to apply the resulting diff file. This seems to be working quite nicely - there's also the possibility (not yet implemented) to work around an inadequacy of some versions of patch on Unix which don't correctly implement file renaming. My implementation needs more some error handling and the code-paths for context diffs aren't yet implemented, but this is both necessary for the testsuite to pass and, more to the point, working. The intention is that this processing would be always run, since a Unix opam could still be presented with patch files generated on Windows.

Testsuite changes

There's a little catalogue of alterations to the testsuite to permit correct operation for AppVeyor:

  • opam_check was linked against opam-client, but this is a nuisance for mingw as it pulls opam-solver in (which means it needs manifesting on mingw as in Add --with-private-runtime for mingw builds #3255). I've changed it link opam-state only, as that's sufficient.
  • Various tweaks to tests/Makefile and the shell scripts in the testsuite both to cope with spaces in PATH and also the stray \r characters which come from the output of native Windows executables.
  • Running under MSVC means .a cannot be used as a hard-coded extension for library files. I have better solutions in the pipeline (Switch defaults configuration file #2930, etc.) for passing this around, but for now opam config set is used to create a variable ext_lib containing the correct extension.
  • a few tweaks to the build commands mean that jbuilder runtest is now run with --no-buffer, so you get to see the results as they happen!
  • The git commands in tests/Makefile were a real mess of copy-and-paste with subtle differences, so I factored them out to a macro to ensure that noone will ever understand them again!
  • The trick for interpolating variables with backslashes mentioned above necessitates using proper .config.in files. I don't think that's relevant to any of the tests (nor even good practice to have a .config with no opam-version header now?)
  • The testsuite now executes opam from _build/install. The commit mentions that this is done to allow execution of opam-putenv, but this isn't yet required. It is however required for the mingw builds, since the manifests are only available in the install directory, not the source directory.
  • AppVeyor doesn't recognise [m as equivalent to [0m. I like colour, so this is altered both in tests/Makefile and also in a couple of places in opam itself. I can't imagine a regression there, since recognising [m as reset and not [0m as reset would be extremely strange.
  • AppVeyor is then finally tweaked to run the testsuite on all 6 Windows ports and also to install the ocamlfind package. At the moment, this has to be done from my personal fork of opam-repository.
  • AppVeyor is configured to test Git-for-Windows and Cygwin's Git for the 4 native ports. Note that using Git-for-Windows from a Cygwin port should remain an unsupported piece of stupidity (if you're using Cygwin, you should use Cygwin. Period.)

Allow for:

 - Spaces in PATH
 - Native tools returning \r
 - Converting paths between Cygwin and Native

Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Signed-off-by: David Allsopp <david.allsopp@metastack.com>
Replaces simply running the opam command!
@dra27
Copy link
Member Author

dra27 commented Mar 17, 2018

Travis is expected to fail. Fingers crossed for AppVeyor... (it's passed on my AppVeyor fork)

@dra27 dra27 mentioned this pull request Mar 29, 2018
@AltGr
Copy link
Member

AltGr commented Mar 30, 2018

Thanks a lot!
Just some random comments on your big summary above:

  • Getting rid of lock files when an error has occurred. Where possible, I have "fixed" this by releasing locks (This solution is potentially unsound, and I think needs replacing).
  • This doesn't help for unhandled exceptions, so I've added a sledgehammer for this purpose OpamSystem.release_all_locks. The locking subsystem now keeps a Hashtbl of all lock fds which it knows haven't been released, and this function simply closes them all.

Maybe it's not done properly everywhere, but locks should almost always be used through the with_* wrapper functions, which take care of releasing them when an exception was raised (this was actually one of the reasons I needed reraise_with_backtrace or I couldn't get a useful backtrace without awful hacks).
If that approach works and just a few cases need a cleanup, that would be worth it to avoid the sledgehammer approach :)

Git-for-Windows by default sets core.autocrlf to true, which means that files in git remotes are very likely to get checked out with CRLF line endings, even if they were checked in with LF line endings originally (e.g. opam-repository).

I am a bit uncomfortable with tweaking the files before validating checksums. I mean, this becomes a weird notion of "file integrity". Wouldn't it be simpler to locally force the value of core.autocrlf, if that works ? We already do it for a few breaking git options on Linux.
As for patches, that does sound more tricky. What are the cases where we end up with different CRLF from what was expected, if we can force git not to convert ? For these cases, having a wrapper to patch that makes it more tolerant to such issues would be OK, if the behaiour can remain predictable.

if the first line of the file being interpolated begins opam-version:, then opam assumes that any interpolation within double-quotes should escape backslashes. It's worked quite nicely in tests

err, this is a heuristic at best, we shouldn't base different processing behaviours based on this. It is nowhere specified — and not required — that the opam-version: field comes first in a file¹, and there might still be some cases where it is not mandatory.
For this issue, I believe interpolations shouldn't contain escapes in any case, i.e. a \ within the replacement string should always be interpreted as the backslash character.
Or is this just a matter for the substs: field ? Or for packages that would generate <pkg>.config files mistakenly containing backslashes ? Not sure I understand where the issue actually arises...

On a side-note, there is already an implementation of de-escaping+re-escaping, for the cases where we incrementally expand different variables in interpolated strings (so as to avoid double-deescaping or double expansion)

Windows file locking doesn't support promotion or demotion or recursive/re-entrant locking in the same way as Unix

IIRC there were some cases for which promoting or demoting wasn't supported either on Linux, and we just hope nobody hijacks the lock in the meantime :).

Finally, note that a few points above (e.g. the wrapper-based locking mechanism) are new or have been refactored since 1.2, so maybe your work was started before they were there ?

Ah, and don't let the nitpicking above reduce the appreciation of all the work involved in this!

¹ which, I agree, may be a design mistake, since it assumes there are no changes between versions at the parsing level. But that's a different matter.

@dra27 dra27 force-pushed the windows-testsuite branch 3 times, most recently from 066aa32 to 1c775df Compare April 6, 2018 10:46
@XVilka
Copy link
Contributor

XVilka commented Apr 27, 2018

Will this become a part of 2.0 release or a later release?

@dra27
Copy link
Member Author

dra27 commented Apr 27, 2018

There are some technical parts of this (the patching, and so forth) which I think should be in 2.0 but I'd be surprised if @AltGr (and the wider community) is willing to wait for opam 2.0.0 while the Windows support is completed!

@dra27
Copy link
Member Author

dra27 commented May 10, 2018

Rebased on to master and split 3346-3350. This PR now only contains the final parts to get the testsuite working.

@dra27 dra27 changed the title Native Windows Support Fully test native Windows in the testsuite May 10, 2018
@dra27 dra27 force-pushed the windows-testsuite branch 4 times, most recently from 9eac4b9 to a134008 Compare June 17, 2018 11:32
@dra27 dra27 force-pushed the windows-testsuite branch from a134008 to 4813016 Compare July 21, 2018 12:35
@XVilka
Copy link
Contributor

XVilka commented Dec 20, 2018

So what is exactly a stopper before this get merged?

@dra27 dra27 force-pushed the windows-testsuite branch from 4813016 to e36a32f Compare April 26, 2019 08:26
@dra27 dra27 force-pushed the windows-testsuite branch from e36a32f to 0250f61 Compare June 18, 2019 15:47
@dra27 dra27 force-pushed the windows-testsuite branch from 0250f61 to 6be0416 Compare June 27, 2019 09:06
The OpamGit repository backend supports using either Git-for-Windows (or
any other Git expecting native-style path naming) or Cygwin Git. Make
sure that both are tested.

Git's core.autocrlf is also set to the default installation value of
true (AppVeyor uses input in its image)
@dra27 dra27 force-pushed the windows-testsuite branch from 6be0416 to e7d4b8c Compare June 27, 2019 09:31
@dra27
Copy link
Member Author

dra27 commented Jun 27, 2019

Right, the part for building ocamlfind requires a fork of opam-repository, so I've moved that to another branch for now - but assuming that this passes CI, this branch is now ready for merge.

@AltGr AltGr merged commit 16ed21c into ocaml:master Jun 28, 2019
@rjbou rjbou added this to the 2.1.0 milestone Jun 28, 2019
@dra27 dra27 deleted the windows-testsuite branch June 28, 2019 14:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants