Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parallel Makefile failures "Corrupted compiled interface" #327

Closed
clarus opened this issue Jan 14, 2020 · 10 comments
Closed

Parallel Makefile failures "Corrupted compiled interface" #327

clarus opened this issue Jan 14, 2020 · 10 comments

Comments

@clarus
Copy link
Contributor

clarus commented Jan 14, 2020

Hi,

In the opam package of CompCert, there are sometimes build failures with the following message:

Error: Corrupted compiled interface

during the compilation of .mli files when installing the package with parallel compilation.

These errors appear around 5-10% of the time in the opam bench for Coq. It seems fine with OCaml >= 4.06 (no current errors, I do not remember if there were). Here are some examples of errors (the links may soon be outdated as new tests are being run):

Here is an extract of logs with errors after an opam install coq-compcert:

[...]
- OCAMLC   extraction/ConstpropOp.mli
- OCAMLC   extraction/Cminorgen.mli
- OCAMLC   extraction/CleanupLabels.mli
- OCAMLC   extraction/Asmgen.mli
- OCAMLOPT x86/AsmToJSON.ml
- OCAMLOPT lib/Camlcoq.ml
- OCAMLOPT extraction/AST.ml
- OCAMLC   backend/XTL.mli
- OCAMLC   extraction/Conventions1.mli
- File "backend/XTL.mli", line 1:
- Error: Corrupted compiled interface
- lib/Camlcoq.cmi
- make[2]: *** [Makefile.extr:117: backend/XTL.cmi] Error 2
- make[2]: *** Waiting for unfinished jobs....
- make[2]: Leaving directory '/home/bench/.opam/ocaml-base-compiler.4.02.3/.opam-switch/build/coq-compcert.3.3.0'
- make[1]: *** [Makefile:164: ccomp] Error 2
- make[1]: Leaving directory '/home/bench/.opam/ocaml-base-compiler.4.02.3/.opam-switch/build/coq-compcert.3.3.0'
- make: *** [Makefile:139: all] Error 2
[ERROR] The compilation of coq-compcert failed at "/home/bench/.opam/opam-init/hooks/sandbox.sh build make -j7".
#=== ERROR while compiling coq-compcert.3.3.0 =================================#
# context              2.0.1 | linux/x86_64 | ocaml-base-compiler.4.02.3 | file:///home/bench/run/opam-coq-archive/released
# path                 ~/.opam/ocaml-base-compiler.4.02.3/.opam-switch/build/coq-compcert.3.3.0
# command              ~/.opam/opam-init/hooks/sandbox.sh build make -j7
# exit-code            2
# env-file             ~/.opam/log/coq-compcert-3524-e4db2f.env
# output-file          ~/.opam/log/coq-compcert-3524-e4db2f.out
### output ###
# [...]
# OCAMLOPT extraction/AST.ml
# OCAMLC   backend/XTL.mli
# OCAMLC   extraction/Conventions1.mli
# File "backend/XTL.mli", line 1:
# Error: Corrupted compiled interface
# lib/Camlcoq.cmi
# make[2]: *** [Makefile.extr:117: backend/XTL.cmi] Error 2
# make[2]: *** Waiting for unfinished jobs....
# make[2]: Leaving directory '/home/bench/.opam/ocaml-base-compiler.4.02.3/.opam-switch/build/coq-compcert.3.3.0'
# make[1]: *** [Makefile:164: ccomp] Error 2
# make[1]: Leaving directory '/home/bench/.opam/ocaml-base-compiler.4.02.3/.opam-switch/build/coq-compcert.3.3.0'
# make: *** [Makefile:139: all] Error 2
<><> Error report <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
+- The following actions failed
| - build coq-compcert 3.3.0
+- 
- No changes have been performed
# Run eval $(opam env) to update the current shell environment
'opam install -y -v coq-compcert.3.3.0 coq.8.7.0' failed.

I do not know how important is it and if it happens to some users in practice. I think the parallel compilation of the CompCert package is nice to keep as this is a large package. In the opam bench we black-list such errors. Posting it there for the record.

@clarus
Copy link
Contributor Author

clarus commented Jan 14, 2020

@palmskog

@clarus
Copy link
Contributor Author

clarus commented Jan 14, 2020

CompCert version which are known to sometimes fail:

  • 2.7.1
  • 3.1.0
  • 3.2.0
  • 3.3.0
  • 3.5
  • 3.6

@palmskog
Copy link

Thanks for the nice summary. It should be noted that in the past, many Coq plugin projects had the same nondeterministic parallel build failures on OPAM, such as CoqHammer. The solution there was simply to disable parallel OPAM builds, since the plugins built quickly anyway, but as mentioned above this is not feasible here.

@bschommer
Copy link
Member

I never encountered this kind of error, so I'm not sure what the reason is, but I think this is more of a problem of the (inofficial) opam package.

@clarus
Copy link
Contributor Author

clarus commented Jan 14, 2020

OK, good to know. Here are the build commands of the 3.6 version in case you see something fishy (but I guess this is quite standard):

  ["./configure"
    "ia32-linux" {os = "linux"}
    "ia32-macosx" {os = "macos"}
    "ia32-cygwin" {os = "cygwin"}
    "-bindir" "%{bin}%"
    "-libdir" "%{lib}%/compcert"
    "-install-coqdev"
    "-clightgen"
    "-coqdevdir" "%{lib}%/coq/user-contrib/compcert"
    "-ignore-coq-version"
  ]
  [make "-j%{jobs}%"]

@xavierleroy
Copy link
Contributor

This matches a known issue with parallel builds in OCaml prior to 4.06. See ocaml/ocaml#7472 . It was fixed in OCaml 4.06 by making the creation of .cmi files atomic.

Personally, I would just stop building CompCert with old OCaml versions: you're doing 4.02.3, 4.04 and 4.05, if I'm not mistaken, and I don't see the point.

@palmskog
Copy link

palmskog commented Jan 14, 2020

@xavierleroy do you then think we should set "ocaml" {>= "4.06"} for all CompCert packages in the Coq OPAM repository where this is possible? Our usual policy is to allow (and test) all major OCaml versions supported by Coq (which apparently tends to be the version that is shipped in Debian when the Coq release is made).

@xavierleroy
Copy link
Contributor

do you then think we should set "ocaml" {>= "4.06"} for all CompCert packages in the Coq OPAM repository where this is possible?

That's one possibility. Another, if technically doable, would be to ignore %{jobs} or set it to 1 if the OCaml version is less than 4.06.

I realize I'm not sure what the main purpose of these OPAM packages is. If they are mainly for regression testing of Coq, parallel builds are important but testing old OCaml versions is not, in my opinion. If the packages are for end-users of Coq, supporting old OCaml versions when possible is a nice touch, but the end-users should tolerate slower, sequential builds. Your choice!

Our usual policy is to allow (and test) all major OCaml versions supported by Coq

The next release of CompCert will drop support for OCaml < 4.05, and perhaps we should drop 4.05 as well because of the present issue.

(which apparently tends to be the version that is shipped in Debian when the Coq release is made).

This pushes you back into really old versions of OCaml that are totally unsupported and contains a number of known bugs. Again, I really see no point in testing with OCaml 4.02.3, which dates back to 2014. It's like testing with Ubuntu 14.04...

@palmskog
Copy link

Thanks for the feedback, @clarus and I will investigate the best way to keep the packages building consistently, which may indeed be with some kind of version-dependent build parameter selection.

I realize I'm not sure what the main purpose of these OPAM packages is.

Other maintainers may have other views, but I see the Coq OPAM archive as a service to the Coq user community and to researchers in particular - regression testing of Coq itself or OCaml is not the concern. There are many Coq projects "stuck" on particular Coq versions, and if the cost is not too high to support them, e.g., by modifying some OPAM scripts as the need arises, this seems reasonable to do. Also, since many Coq projects tend to bitrot or even disappear, an OPAM package and a tarball that worked at one point is usually a big help for getting something to work again.

The next release of CompCert will drop support for OCaml < 4.05, and perhaps we should drop 4.05 as well because of the present issue.

I second raising the requirement to > 4.05.

This pushes you back into really old versions of OCaml that are totally unsupported and contains a number of known bugs.

Based on my observations, comparatively old OCaml versions (around 4.05) tend to be used quite a lot by Coq users, e.g., due to Linux distro packaging. The minimum required OCaml version for Coq appears to be constantly discussed among Coq developers and concerns other than the Coq OPAM archive's typically dominate. I'm sure the devs (Matthieu, Maxime, Emilio, ...) would be open to feedback on this, but us maintainers are downstream from them.

@clarus
Copy link
Contributor Author

clarus commented Feb 2, 2020

Closing as the compilation of the package is now sequential for OCaml <= 4.05, thanks to a flag from @palmskog :

build: [
  ["./configure" "ia32-linux" {os = "linux"}
  "ia32-macosx" {os = "macos"}
  "ia32-cygwin" {os = "cygwin"}
  "-bindir" "%{bin}%"
  "-libdir" "%{lib}%/compcert"
  "-install-coqdev"
  "-clightgen"
  "-coqdevdir" "%{lib}%/coq/user-contrib/compcert"
  "-ignore-coq-version"]
  [make "-j%{jobs}%" {ocaml:version >= "4.06"}]
]

No new failures occurred. Thanks for the answers. As a remark, CompCert 3.6 is not compatible with the freshly released Coq 8.11 so I fixed that in the package coq/opam#1133

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants