Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dune 1.11.0 fails to build on armv7 with odd relocation errors #2527

Closed
rwmjones opened this issue Aug 8, 2019 · 25 comments · Fixed by #4085 or ocaml/opam-repository#17967
Closed

Comments

@rwmjones
Copy link

rwmjones commented Aug 8, 2019

The log is here:

https://kojipkgs.fedoraproject.org//work/tasks/2862/36862862/build.log

The same architecture with 1.10:

https://kojipkgs.fedoraproject.org//packages/ocaml-dune/1.10.0/5.fc31/data/logs/armv7hl/build.log

It seems as if we need to pass -fPIC to the ocamlopt invocation (or at least that may not be the solution but I'd surely like to try that). However I can't work out how to do that with this very complex build/bootstrap system.

@ghost
Copy link

ghost commented Aug 8, 2019

The error seems to trigger after the bootstrap. You can add a dune file at the root with that contents to add -fPIC everywhere:

(env (_ (ocamlopt_flags :standard -fPIC)))

Is it the same OCaml version in both case BTW?

@rwmjones
Copy link
Author

rwmjones commented Aug 8, 2019

Yes. The builds were only a couple of days apart. I'll try the dune file idea, thanks.

@rwmjones
Copy link
Author

rwmjones commented Aug 8, 2019

Unfortunately that didn't fix it. The option has been added, but the error is similar so I'm not sure what's going on. Latest log: https://kojipkgs.fedoraproject.org//work/tasks/3158/36863158/build.log

@ghost
Copy link

ghost commented Aug 8, 2019

Is it possible to get the contents of the _boot/log file? Diffing the build commands between the successful and failing build might reveal something

@rwmjones
Copy link
Author

rwmjones commented Aug 8, 2019

It's difficult to compare them because the output is kind of mixed up (plus I can't run this directly on the hardware but need to use the Fedora build system):
http://oirase.annexia.org/tmp/boot_log_good.txt
http://oirase.annexia.org/tmp/boot_log_bad.txt
Edit: I should note this is without using -fPIC

@rwmjones
Copy link
Author

rwmjones commented Aug 8, 2019

I'm going to temporarily (cough) disable armv7 builds ...

@ghost
Copy link

ghost commented Aug 8, 2019

Thanks. I diffed the files passed through grep '^\$' | sort. Looking at the failures from the original log and the diff, i noticed that for the command that are failing we are passing -nodynlink that we were not passing in 1.10. That's likely to be the issue.

I remember this change. The idea is that for executables, unless the shared_object mode is requested there is no point in generating dynlinkable code. So we pass -nodynlink in order to produce slightly better code. Not sure why that's breaking the build on armv7 though...

I wrote a quick patch to disable this optimisation in 1.11: https://github.com/diml/dune/commit/15c04b09a8c06871635d5fd98c3a37089bbde6d9

Would you be able to try it to validate this hypothesis?

@nojb
Copy link
Collaborator

nojb commented Aug 9, 2019

Looking at the error

/usr/bin/ld: bin/main/.main_jbuilder.eobjs/native/build_info__Build_info_data.o: relocation R_ARM_THM_MOVW_ABS_NC against `caml_int_of_string' can not be used when making a shared object; recompile with -fPIC

makes me think that the problem is that the modules for executables are non-PIC mode but they are being linked with PIC libraries. In particular, the compiler uses movw/movt pair to load the address of global symbols when in -nodynlink mode:
https://github.com/ocaml/ocaml/blob/8c1107d910fb9e3ca07edfe922b56acaa69bdf74/asmcomp/arm/emit.mlp#L389-L391 which seems like it could be problematic for symbols coming from PIC libraries (but am not an expert, so take this with a grain of salt).

@rwmjones
Copy link
Author

rwmjones commented Aug 9, 2019

I'm testing the patch now.

With respect to the previous comment, note that Fedora compiles virtually every C file with -fPIC. This includes libasmrun.a which contains objects compiled with:

gcc -c -O2 -fno-strict-aliasing -fwrapv -O2 -g -pipe -Wall -Werror=format-security -Wp,-D_FORTIFY_SOURCE=2 -Wp,-D_GLIBCXX_ASSERTIONS -fexceptions -fstack-protector-strong -grecord-gcc-switches -specs=/usr/lib/rpm/redhat/redhat-hardened-cc1 -specs=/usr/lib/rpm/redhat/redhat-annobin-cc1 -march=armv7-a -mfpu=vfpv3-d16 -mtune=generic-armv7-a -mabi=aapcs-linux -mfloat-abi=hard -Wall -Werror -g -D_FILE_OFFSET_BITS=64 -D_REENTRANT -DCAML_NAME_SPACE  -DOCAML_STDLIB_DIR='"/usr/lib/ocaml"'  -DNATIVE_CODE -DTARGET_arm -DMODEL_armv7 -DSYS_linux_eabihf   -o startup_aux_n.o startup_aux.c

where /usr/lib/rpm/redhat/redhat-hardened-cc1 is:

*cc1_options:
+ %{!r:%{!fpie:%{!fPIE:%{!fpic:%{!fPIC:%{!fno-pic:-fPIE}}}}}}

The reason for this is of course hardening and in this case the wish to make every binary in the distribution use PIE.

@rwmjones
Copy link
Author

rwmjones commented Aug 9, 2019

No that doesn't seem to have worked. New build log: https://kojipkgs.fedoraproject.org//work/tasks/8832/36878832/build.log

@glondu
Copy link
Contributor

glondu commented Aug 9, 2019

I've just updated dune to 1.11.0 in Debian unstable, and it fails with a similar error:

https://buildd.debian.org/status/fetch.php?pkg=ocaml-dune&arch=armhf&ver=1.11.0-1&stamp=1565333162&raw=0

Note that it fails on armhf only, and the previous version (1.6.2) worked there. And we are still using OCaml 4.05.0 in unstable.

I've tried @diml's patch and it still fails.

@glondu
Copy link
Contributor

glondu commented Aug 9, 2019

About -nodynlink: when I compile a simple hello.ml file:

print_endline "Hello world!"

with ocamlopt -nodynlink hello.ml, I get similar errors that go away when I remove -nodynlink. So it might be that @diml's hypothesis is right, but the patch is wrong.

@nojb
Copy link
Collaborator

nojb commented Aug 9, 2019

I think @diml's patch is wrong, it should set dynlink = true instead of false. Could you try it ?

@rwmjones
Copy link
Author

rwmjones commented Aug 9, 2019

In case it isn't clear, on Fedora it also only fails on armv7, but succeeds on all other architectures. I will try to update diml's patch as suggested.

@glondu
Copy link
Contributor

glondu commented Aug 9, 2019

I think @diml's patch is wrong, it should set dynlink = true instead of false.

Of course...

Could you try it ?

It works! \o/

@rwmjones
Copy link
Author

rwmjones commented Aug 9, 2019

Yes can also confirm the updated patch works.

ocamlopt -nodynlink hello.ml

So this sounds as if it could be an upstream bug on armv7? Given that it affects both Fedora and Debian so it's not likely to be because of hardening flags ...

@glondu
Copy link
Contributor

glondu commented Aug 9, 2019

Given that it affects both Fedora and Debian so it's not likely to be because of hardening flags ...

We also do hardening in Debian. I don't know the details, though.

@nojb
Copy link
Collaborator

nojb commented Aug 9, 2019

Yes, looks like a compiler bug; see https://sourceware.org/ml/binutils/2009-04/msg00395.html where it is explained (I think) that one should not use MOVW/MOVT to load addresses of symbols from PIC libraries from non-PIC code.

@nojb
Copy link
Collaborator

nojb commented Aug 9, 2019

About -nodynlink: when I compile a simple hello.ml file:

print_endline "Hello world!"

with ocamlopt -nodynlink hello.ml, I get similar errors that go away when I remove -nodynlink.

@glondu could you open an issue on https://github.com/ocaml/ocaml with this bug?

@ghost
Copy link

ghost commented Aug 12, 2019

Oops, thatnks @nojb for spotting the mistake in my patch! Booleans are hard, especially after 🍷 haha.

What should we do about this then? Should we consider that it's a compiler bug and do nothing in Dune, or disable the optimisation at least in Dune 1.x?

@nojb
Copy link
Collaborator

nojb commented Aug 17, 2019

I may be wrong, but it feels like https://discuss.ocaml.org/t/program-segfaults-when-compiled-with-dune-1-11/4254 is also caused by the -nodynlink change.

@ghost
Copy link

ghost commented Aug 19, 2019

Alright, well let's disable this optimisation for now until we figure out the proper way to fix this. I pushed a commit to disable it in both 1.11 and master.

rgrinberg added a commit to rgrinberg/opam-repository that referenced this issue Aug 20, 2019
CHANGES:

- Remove the optimisation of passing `-nodynlink` for executalbes when
  not necessary. It seems to be breaking things (see ocaml/dune#2527, @diml)

- Fix invalid library names in `dune-package` files. Only public names should
  exist in such files. (ocaml/dune#2558, fix ocaml/dune#2425, @rgrinberg)
@xavierleroy
Copy link

In a nutshell: -nodynlink can be incompatible with PIE (position-independent executables), which are becoming the default in Linux distros. So, some work is needed on -nodynlink and the option should not be set by default. More details here: ocaml/ocaml#8867 (comment)

@rwmjones
Copy link
Author

rwmjones commented Sep 4, 2020

FYI we are building dune 2.7.0 on armv7 (and other arches) successfully at the moment, eg:
https://koji.fedoraproject.org/koji/buildinfo?buildID=1604265

@xavierleroy
Copy link

@rwmjones: yes, you should be fine if -nodynlink is not set by Dune.

Expect a problem with s390x (Z systems) when PIE becomes the default on this architecture, because ocamlopt currently generates PIE-incompatible code on this platform.

emillon added a commit to emillon/dune that referenced this issue Jan 5, 2021
Closes ocaml#4069

When `dynlink:false` is passed, the ARM backend emits MOVW/MOVT
instructions which have relocations incompatible with PIC code.
This is similar to ocaml#2527 (ocaml issue: ocaml/ocaml#8867).

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit to emillon/dune that referenced this issue Jan 5, 2021
Closes ocaml#4069

When `dynlink:false` is passed, the ARM backend emits MOVW/MOVT
instructions which have relocations incompatible with PIC code.
This is similar to ocaml#2527 (ocaml issue: ocaml/ocaml#8867).

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit that referenced this issue Jan 6, 2021
This is an optimization when PIC executables are not used, but this
optimization is disabled becauses it causes errors on arm32. Most
distributions are going in the direction of requiring PIC, and
`-nodynlink` might go away (see ocaml/ocaml#8867), so the supporting
code in dune (which is bypassed in most cases) can be removed.

Closes #4069
Closes #2527

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit that referenced this issue Jan 6, 2021
This is an optimization when PIC executables are not used, but this
optimization is disabled becauses it causes errors on arm32. Most
distributions are going in the direction of requiring PIC, and
`-nodynlink` might go away (see ocaml/ocaml#8867), so the supporting
code in dune (which is bypassed in most cases) can be removed.

Closes #4069
Closes #2527

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit that referenced this issue Jan 11, 2021
This is an optimization when PIC executables are not used, but this
optimization is disabled becauses it causes errors on arm32. Most
distributions are going in the direction of requiring PIC, and
`-nodynlink` might go away (see ocaml/ocaml#8867), so the supporting
code in dune (which is bypassed in most cases) can be removed.

Closes #4069
Closes #2527

Signed-off-by: Etienne Millon <me@emillon.org>
emillon added a commit that referenced this issue Jan 11, 2021
This is an optimization when PIC executables are not used, but this
optimization is disabled becauses it causes errors on arm32. Most
distributions are going in the direction of requiring PIC, and
`-nodynlink` might go away (see ocaml/ocaml#8867), so the supporting
code in dune (which is bypassed in most cases) can be removed.

Closes #4069
Closes #2527

Signed-off-by: Etienne Millon <me@emillon.org>
rgrinberg added a commit to rgrinberg/opam-repository that referenced this issue Jan 13, 2021
…ne-action-plugin, dune-private-libs and dune-glob (2.8.0)

CHANGES:

- `dune rules` accepts aliases and other non-path rules (ocaml/dune#4063, @mrmr1993)

- Action `(diff reference test_result)` now accept `reference` to be absent and
  in that case consider that the reference is empty. Then running `dune promote`
  will create the reference file. (ocaml/dune#3795, @bobot)

- Ignore special files (BLK, CHR, FIFO, SOCKET), (ocaml/dune#3570, fixes ocaml/dune#3124, ocaml/dune#3546,
  @ejgallego)

- Experimental: Simplify loading of additional files (data or code) at runtime
  in programs by introducing specific installation sites. In particular it allow
  to define plugins to be installed in these sites. (ocaml/dune#3104, ocaml/dune#3794, fixes ocaml/dune#1185,
  @bobot)

- Move all temporary files created by dune to run actions to a single directory
  and make sure that actions executed by dune also use this directory by setting
  `TMPDIR` (or `TEMP` on Windows). (ocaml/dune#3691, fixes ocaml/dune#3422, @rgrinberg)

- Fix bootstrap script with custom configuration. (ocaml/dune#3757, fixes ocaml/dune#3774, @marsam)

- Add the `executable` field to `inline_tests` to customize the compilation
  flags of the test runner executable (ocaml/dune#3747, fixes ocaml/dune#3679, @lubegasimon)

- Add `(enabled_if ...)` to `(copy_files ...)` (ocaml/dune#3756, @nojb)

- Make sure Dune cleans up the status line before exiting (ocaml/dune#3767,
  fixes ocaml/dune#3737, @alan-j-hu)

- Add `{gitlab,bitbucket}` as options for defining project sources with `source`
  stanza `(source (<host> user/repo))` in the `dune-project` file.  (ocaml/dune#3813,
  @rgrinberg)

- Fix generation of `META` and `dune-package` files when some targets (byte,
  native, dynlink) are disabled. Previously, dune would generate all archives
  for regardless of settings. (ocaml/dune#3829, ocaml/dune#4041, @rgrinberg)

- Do not run ocamldep to for single module executables & libraries. The
  dependency graph for such artifacts is trivial (ocaml/dune#3847, @rgrinberg)

- Fix cram tests inside vendored directories not being interpreted correctly.
  (ocaml/dune#3860, fixes ocaml/dune#3843, @rgrinberg)

- Add `package` field to private libraries. This allows such libraries to be
  installed and to be usable by other public libraries in the same project
  (ocaml/dune#3655, fixes ocaml/dune#1017, @rgrinberg)

- Fix the `%{make}` variable on Windows by only checking for a `gmake` binary
  on UNIX-like systems as a unrelated `gmake` binary might exist on Windows.
  (ocaml/dune#3853, @kit-ty-kate)

- Fix `$ dune install` modifying the build directory. This made the build
  directory unusable when `$ sudo dune install` modified permissions. (fix
  ocaml/dune#3857, @rgrinberg)

- Fix handling of aliases given on the command line (using the `@` and `@@`
  syntax) so as to correctly handle relative paths. (ocaml/dune#3874, fixes ocaml/dune#3850, @nojb)

- Allow link time code generation to be used in preprocessing executable. This
  makes it possible to use the build info module inside the preprocessor.
  (ocaml/dune#3848, fix ocaml/dune#3848, @rgrinberg)

- Correctly call `git ls-tree` so unicode files are not quoted, this fixes
  problems with `dune subst` in the presence of unicode files. Fixes ocaml/dune#3219
  (ocaml/dune#3879, @ejgallego)

- `dune subst` now accepts common command-line arguments such as
  `--debug-backtraces` (ocaml/dune#3878, @ejgallego)

- `dune describe` now also includes information about executables in addition to
  that of libraries. (ocaml/dune#3892, ocaml/dune#3895, @nojb)

- instrumentation backends can now receive arguments via `(instrumentation
  (backend <name> <args>))`. (ocaml/dune#3906, ocaml/dune#3932, @nojb)

- Tweak auto-formatting of `dune` files to improve readability. (ocaml/dune#3928, @nojb)

- Add a switch argument to opam when context is not default. (ocaml/dune#3951, @tmattio)

- Avoid pager when running `$ git diff` (ocaml/dune#3912, @AltGr)

- Add `(root_module ..)` field to libraries & executables. This makes it
  possible to use library dependencies shadowed by local modules (ocaml/dune#3825,
  @rgrinberg)

- Allow `(formatting ...)` field in `(env ...)` stanza to set per-directory
  formatting specification. (ocaml/dune#3942, @nojb)

- [coq] In `coq.theory`, `:standard` for the `flags` field now uses the
  flags set in `env` profile flags (ocaml/dune#3931 , @ejgallego @rgrinberg)

- [coq] Add `-q` flag to `:standard` `coqc` flags , fixes ocaml/dune#3924, (ocaml/dune#3931 , @ejgallego)

- Add support for Coq's native compute compilation mode (@ejgallego, ocaml/dune#3210)

- Add a `SUFFIX` directive in `.merlin` files for each dialect with no
  preprocessing, to let merlin know of additional file extensions (ocaml/dune#3977,
  @vouillon)

- Stop promoting `.merlin` files. Write per-stanza Merlin configurations in
  binary form. Add a new subcommand `dune ocaml-merlin` that Merlin can use to
  query the configuration files. The `allow_approximate_merlin` option is now
  useless and deprecated. Dune now conflicts with `merlin < 3.4.0` and
  `ocaml-lsp-server < 1.3.0` (ocaml/dune#3554, @voodoos)

- Configurator: fix a bug introduced in 2.6.0 where the configurator V1 API
  doesn't work at all when used outside of dune. (ocaml/dune#4046, @aalekseyev)

- Fix `libexec` and `libexec-private` variables. In cross-compilation settings,
  they now point to the file in the host context. (ocaml/dune#4058, fixes ocaml/dune#4057,
  @TheLortex)

- When running `$ dune subst`, use project metadata as a fallback when package
  metadata is missing. We also generate a warning when `(name ..)` is missing in
  `dune-project` files to avoid failures in production builds.

- Remove support for passing `-nodynlink` for executables. It was bypassed in
  most cases and not correct in other cases in particular on arm32.
  (ocaml/dune#4085, fixes ocaml/dune#4069, fixes ocaml/dune#2527, @emillon)

- Generate archive rules compatible with 4.12. Dune longer attempt to generate
  an archive file if it's unnecessary (ocaml/dune#3973, fixes ocaml/dune#3766, @rgrinberg)

- Fix generated Merlin configurations when multiple preprocessors are defined
  for different modules in the same folder. (ocaml/dune#4092, fixes ocaml/dune#2596, ocaml/dune#1212 and
  ocaml/dune#3409, @voodoos)

- Add the option `use_standard_c_and_cxx_flags` to `dune-project` that 1.
  disables the unconditional use of the `ocamlc_cflags` and `ocamlc_cppflags`
  from `ocamlc -config` in C compiler calls, these flags will be present in the
  `:standard` set instead; and 2. enables the detection of the C compiler family
  and populates the `:standard` set of flags with common default values when
  building CXX stubs. (ocaml/dune#3875, ocaml/dune#3802, fix ocaml/dune#3718 and ocaml/dune#3528, @voodoos)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants