@wilzbach
Contributor

Most Linux distributions now enable hardening by default (e.g. Debian, Ubuntu, Arch Linux), and the DMD test suite should work out of the box there.
-fPIC has been the default on Linux x64 since 2.072.2.

See also:

@dlang-bot
Contributor

dlang-bot commented Dec 11, 2017

Thanks for your pull request, @wilzbach!

Bugzilla references

Auto-close Bugzilla Description
18014 DMD test suite fails to link on Linux distros where PIC/PIE is enforced

@wilzbach
Contributor Author

Okay so regarding:

 ... runnable/test_cdvecfill.d      -O (-mcpu=avx -mcpu=avx2)
Test failed.  The logged output:
../src/dmd -conf= -m64 -Irunnable -O -fPIC  -odgenerated/runnable -ofgenerated/runnable/test_cdvecfill_0 runnable/test_cdvecfill.d
generated/runnable/test_cdvecfill_0
Expected code sequence for load!(ubyte, 16) not found.
  Expected: 0x50 0x66 0x0f 0x6e 0xc7 0x66 0x0f 0x60 0xc0 0x66 0x0f 0x61 0xc0 0x66 0x0f 0x70 0xc0 0x00 0x59 0xc3
  Actual: 0x55 0x48 0x8b 0xec 0x66 0x0f 0x6e 0xc7 0x66 0x0f 0x60 0xc0 0x66 0x0f 0x61 0xc0 0x66 0x0f 0x70 0xc0 0x00 0x5d 0xc3 0x00 0x55 0x48 0x8b 0xec 0x0f 0xb6 0x07 0x66 0x0f 0x6e 0xc0 0x66 0x0f 0x60 0xc0 0x66 0x0f 0x61 0xc0 0x66 0x0f 0x70 0xc0 0x00 0x5d 0xc3 0x00 0x00 0x55 0x48 0x8b 0xec 0x66 0x0f 0x6e 0xc7 0x66 0x0f 0x60 0xc0
core.exception.AssertError@runnable/test_cdvecfill.d(895): Assertion failure

-> I updated the test_cdvecfill.d file (this script is apparently only run on Linux x64)

    foreach (arch; [EnumMembers!Arch])
    {
-       auto args = [dmd, "-c", "-O", "-mcpu=" ~ arch.to!string, "test/runnable/test_cdvecfill.d"];
+       auto args = [dmd, "-c", "-fPIC", "-O", "-mcpu=" ~ arch.to!string, "test/runnable/test_cdvecfill.d"];
Contributor Author

This line in the update script isn't strictly necessary, because the default dmd.conf for Linux x86_64 already adds -fPIC as a default flag, but imho explicitness helps.

@wilzbach wilzbach force-pushed the fpic-dmd branch 2 times, most recently from bf89166 to 26fea93 Compare December 11, 2017 08:50
/* pop rcx */ 0x59,
/* pop rbp */ 0x5d,
/* ret */ 0xc3,
]),
Contributor

Why was this code change necessary?

Contributor Author

Apparently when emitting position-independent code, different registers are used. There's an update script included in this file which I used to update the test cases.
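
For illustration, the mismatch can be reproduced from the failure log alone. The sketch below copies the expected bytes and the first function of the actual -fPIC output, then checks whether one occurs contiguously inside the other. The helper name `contains` and the space-separated string representation are assumptions made for this example, and this is not how test_cdvecfill.d itself compares code. Reading the bytes, the vector instructions are identical; the expected sequence is wrapped in 0x50 ... 0x59 (push rax / pop rcx), while the -fPIC output is wrapped in 0x55 0x48 0x8b 0xec ... 0x5d (push rbp; mov rbp, rsp / pop rbp).

```shell
#!/usr/bin/env bash
# Byte sequences copied from the failure log for load!(ubyte, 16).
expected="0x50 0x66 0x0f 0x6e 0xc7 0x66 0x0f 0x60 0xc0 0x66 0x0f 0x61 0xc0 0x66 0x0f 0x70 0xc0 0x00 0x59 0xc3"
# First function of the actual -fPIC output, cut at its ret (0xc3).
actual="0x55 0x48 0x8b 0xec 0x66 0x0f 0x6e 0xc7 0x66 0x0f 0x60 0xc0 0x66 0x0f 0x61 0xc0 0x66 0x0f 0x70 0xc0 0x00 0x5d 0xc3"
# The instructions shared by both variants (everything between prologue and epilogue).
body="0x66 0x0f 0x6e 0xc7 0x66 0x0f 0x60 0xc0 0x66 0x0f 0x61 0xc0 0x66 0x0f 0x70 0xc0 0x00"

# contains HAYSTACK NEEDLE: succeeds if NEEDLE occurs contiguously in HAYSTACK.
# Hypothetical helper; padding with spaces avoids matching partial byte tokens.
contains() {
    case " $1 " in
        *" $2 "*) return 0 ;;
        *) return 1 ;;
    esac
}

if contains "$actual" "$expected"; then
    echo "expected sequence found"
else
    echo "expected sequence not found"
fi

if contains "$actual" "$body"; then
    echo "shared body found: only prologue/epilogue differ"
fi
```

This only compares the first function from the log; the remaining functions in the "Actual" dump would need the same treatment.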

Contributor

@JinShil Dec 12, 2017

Would it be prudent to keep the original test for Non-PIC and simply keep this PR as a simple addition? My motivation being to test both scenarios.

Contributor Author

Yes, I could use version(D_PIC), but this would make the update script more complex. I'm also not sure it's necessary: this script is currently only run on Linux x86_64, and since 2.072.2 dmd uses -fPIC by default for compiled programs on Linux (on Darwin, -fPIC is even always enabled). Anyhow, adding the old test cases back would only make sense if they are run as part of the test suite, and then, when not in version(PIE), we would have to detect whether the environment is hardened and skip those test cases.
In case this is ever needed, the original code is still saved in git, and it can easily be regenerated with Martin's fancy update script.
So I would only do this extra work if @MartinNowak or @WalterBright think it's essential.
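
As a rough sketch of what "detecting a hardened environment" could look like, the snippet below checks whether GCC was configured with `--enable-default-pie`, which shows up in the output of `gcc -v`. This is an assumption about how the check might be done, not something taken from this PR: the function name `is_default_pie` is hypothetical, and distributions that enforce PIE by other means (for example via spec files) would not be detected this way.

```shell
#!/usr/bin/env bash
# is_default_pie CONFIG: succeeds if CONFIG (the text printed by `gcc -v`)
# shows that GCC was configured with --enable-default-pie.
# Hypothetical helper; it only inspects the string it is given.
is_default_pie() {
    case "$1" in
        *--enable-default-pie*) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: only query gcc if it is installed (gcc -v writes to stderr).
if command -v gcc >/dev/null 2>&1; then
    if is_default_pie "$(gcc -v 2>&1)"; then
        echo "gcc links PIE by default: non-PIC test cases would need to be skipped"
    else
        echo "no default PIE detected in the gcc configuration"
    fi
fi
```

Keeping the string check separate from the `gcc -v` call makes the logic easy to exercise without a compiler installed.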

@JinShil
Contributor

JinShil commented Dec 12, 2017

Looks like this may address this issue: https://issues.dlang.org/show_bug.cgi?id=18013

@wilzbach
Contributor Author

Looks like this may address this issue: https://issues.dlang.org/show_bug.cgi?id=18013

Yes, it will. I'm an Arch Linux user myself, so I constantly bump into the -fPIC issues with D, but luckily with a rising number of merged PRs the problems have been drastically reduced. This is actually (hopefully) the last bit of a long series (dlang/phobos#5586, dlang/phobos#5750, dlang/druntime#1880, #7002, dlang/tools#264 ...) in my quest to build and test the main D repos on Arch Linux without needing to patch anything ;-)

JinShil previously approved these changes Dec 12, 2017
@PetarKirov
Member

Can you add this to the list of bugs fixed:
https://issues.dlang.org/show_bug.cgi?id=18014

My fix is a bit different, I'll push it soon so we can compare our approaches.

@PetarKirov
Member

PetarKirov commented Dec 12, 2017

I don't think your PR fixes all of the issues from 18014. When running:

docker run -it zombinedev/dmd-test-suite-docker --repos:dmd:pr/7420

I get these errors:

/usr/bin/ld: test_results/compilable/main.o: relocation R_X86_64_32 against `.rodata.str1.1' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: test_results/compilable/a.a(a_1_149.o): relocation R_X86_64_32 against symbol `_D28TypeInfo_S4tmpl__T4TmplTiZQi6__initZ' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: test_results/compilable/b.a(b_1_149.o): relocation R_X86_64_32 against symbol `_D28TypeInfo_S4tmpl__T4TmplTlZQi6__initZ' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
Error: linker exited with status 1
Makefile:164: recipe for target 'test_results/compilable/test6461.sh.out' failed
make[1]: *** [test_results/compilable/test6461.sh.out] Error 1
make[1]: *** Waiting for unfinished jobs....
 ... fail_compilation/diag6707.d    ()
 ... fail_compilation/fail14416.d   ()
 ... fail_compilation/ice8309.d     ()
 ... fail_compilation/ice10076.d    ()
 ... fail_compilation/fail233.d     -o- ()
/usr/bin/ld: test_results/compilable/test14894main.o: relocation R_X86_64_32 against symbol `_Dmain' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
collect2: error: ld returned 1 exit status
Error: linker exited with status 1
Makefile:164: recipe for target 'test_results/compilable/test14894.sh.out' failed
make[1]: *** [test_results/compilable/test14894.sh.out] Error 1

Though it gets much further than:

docker run -it zombinedev/dmd-test-suite-docker --repos:dmd:master

Which can't even build the test runner (d_do_test.d).
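
To pull the affected object files out of a log like the one above, a small filter along these lines may help. It is a sketch that assumes the GNU ld message format exactly as quoted in this comment; the function name `non_pic_objects` is made up for the example.

```shell
#!/usr/bin/env bash
# non_pic_objects: read a linker log on stdin and print, once each, every
# object (or archive member) that ld rejected with an R_X86_64_32 relocation.
# Hypothetical helper; the pattern mirrors the ld messages quoted above.
non_pic_objects() {
    sed -n 's|^[^ ]*ld: \(.*\): relocation R_X86_64_32 .*recompile with -fPIC$|\1|p' | sort -u
}

# Usage with two lines taken from the log above.
non_pic_objects <<'EOF'
/usr/bin/ld: test_results/compilable/test14894main.o: relocation R_X86_64_32 against symbol `_Dmain' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Nonrepresentable section on output
EOF
# prints: test_results/compilable/test14894main.o
```

Each printed path is an object that was compiled without -fPIC and then linked as position-independent.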

@PetarKirov
Member

I'll see if with your fix for 18013 my branch finally passes the test suite, and if that's the case I'll open an alternative PR.

@PetarKirov
Member

@wilzbach
Contributor Author

79.152% (-1.841%) compared to cd6b40f

Seems like we have to go with three containers :/

@wilzbach wilzbach force-pushed the fpic-dmd branch 2 times, most recently from ac30592 to 8ece1fa Compare January 17, 2018 02:53
@wilzbach
Contributor Author

image

Looks like we are good to go here :)
The Docker image, see e.g. #7579 (comment) for an example of the relocation errors you get.

The docker image is based on this: https://github.com/wilzbach/dlang-docker/blob/master/circleci/dlang.docker

Why 3 jobs and not 2?

Code coverage:

79.152% (-1.841%) compared to cd6b40f

Why is there now circleci: pic and circleci: no_pic appearing?

It's a feature of CircleCI workflows: a separate status notification is sent for each job.

https://circleci.com/docs/2.0/workflows/

Having the ci/circleci checkbox enabled will prevent the status from showing as completed in GitHub when using a workflow because CircleCI posts statuses to Github with a key that includes the job by name.

@wilzbach
Contributor Author

Ah I forgot to mention - the pic job is even faster (probably due to all dependencies being part of the Docker image):

image

https://circleci.com/workflow-run/cd281b04-466e-4db5-b44d-c6281a3b31db

no_pic:
  working_directory: ~/dmd
  docker:
    - image: circleci/node:4.8.2
Contributor Author

We could replace this with a D Docker image that doesn't require -fPIC, e.g. https://github.com/wilzbach/dlang-docker/blob/master/circleci/dlang.docker built for Ubuntu 14.04, but this should be more than fine for now (it's the existing status quo after all).

parallelism: 2
branches:
  ignore:
    - dmd-1.x
Contributor Author

This didn't work with the new workflow setting and AFAICT is only required for the respective branch anyways.

command: ./.circleci/run.sh coverage
name: Run testsuite with -cov
command: ./.circleci/run.sh all
name: Run all
Contributor Author

I reduced the commands to "all" so that it's easier to maintain both jobs.

endif
export DFLAGS
endif
REQUIRED_ARGS+=$(PIC_FLAG)
Contributor Author

After the other PRs that's all that's needed to fix the test suite for -fPIC :)

check-clean-git) echo "removed" ;;
codecov)
echo "removed - use 'all'"
;&
Contributor Author

The fall-through is required so that existing PRs don't need to be rebased but will call "all" automatically.
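
A minimal sketch of how the `;&` terminator behaves in a bash `case` statement (bash 4.0 or newer). Only the `codecov` branch mirrors the snippet under review; the function name `run_target` and the body of the `all` branch are placeholders for this example, not the contents of `.circleci/run.sh`.

```shell
#!/usr/bin/env bash
# run_target NAME: hypothetical dispatcher illustrating ";&" fall-through.
run_target() {
    case "$1" in
        codecov)
            echo "removed - use 'all'"
            ;&  # fall through: continue with the next branch without testing its pattern
        all)
            echo "running all"  # placeholder for the real "all" steps
            ;;
        *)
            echo "unknown target: $1" >&2
            return 1
            ;;
    esac
}

run_target codecov
# prints:
#   removed - use 'all'
#   running all
```

With `;;` in place of `;&`, the `codecov` branch would stop after printing the notice, so a CI config that still calls the old target would no longer run anything.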

Contributor

Probably a good idea to add that as a comment to the script.

Contributor Author

Yes! Done.

@JinShil
Contributor

JinShil commented Jan 17, 2018

Does the test suite pass? Even test/runnable/test17559.d?

@wilzbach
Contributor Author

Does the test suite pass? Even test/runnable/test17559.d?

Yes:

image

@JinShil
Contributor

JinShil commented Jan 17, 2018

This is looking great. Can you give a final summary? What's remaining before we can merge this?

@wilzbach
Contributor Author

Can you give a final summary?

I'll try to give an honest summary of the disadvantages I see, as you already know the advantages of automatically testing with -fPIC yourself.

  1. One more container run at CircleCi

Con:

  • This shouldn't be a huge problem. CircleCi is still one of our fastest CIs and we have four containers (=jobs) per organization available.
  • If this really becomes a bottleneck, we can always easily disable the pic stage

  2. Dependency on the D docker image

Con:

  • it's configured with Travis CRON + auto-deploy for new releases, so not much maintenance should be required
  • it uses the official CircleCi Dockerfile and just adds the D installer + a few packages on top
  • it's just 29 lines

  3. One more status update by CircleCi (no_pic and pic) = more visual noise

I haven't found a good solution for this :/

  4. The status check ci/circleci was still set to "enforced". As the newly deployed dlang-bot feature ("auto-merge should obey combined CI status", dlang-bot#69) lets the bot respect failing CIs properly, I set "ci/circleci" to non-enforced until we reach a decision here.

What's remaining before we can merge this?

This is ready from my side. CIs are happy -> I'm happy

Contributor

@JinShil left a comment

Nice work! 👍

@wilzbach
Contributor Author

The only thing that is still unclear to me is why the stack traces work on the Docker image.
It's failing on my local system and clearly I'm not the only one:

http://forum.dlang.org/post/ufpwobqmisamazigcaav@forum.dlang.org
https://issues.dlang.org/show_bug.cgi?id=18068

Maybe it's not as assumed due to PIE, but sth. else. glibc maybe? Arch Linux currently ships 2.26-10

@JinShil
Contributor

JinShil commented Jan 19, 2018

It's failing on my local system and clearly I'm not the only one:

Same here. That's why I asked.

Maybe it's not as assumed due to PIE, but sth. else. glibc maybe?

Evidence certainly points to something else. I use Arch Linux too. I don't think it should hold up this PR.

@wilzbach
Contributor Author

Evidence certainly points to something else. I use Arch Linux too. I don't think it should hold up this PR.

I agree, so let's move forward here?

@JinShil
Contributor

JinShil commented Jan 24, 2018

I agree, so let's move forward here?

Yep, just wanted to give others a few days' opportunity to chime in. I think they've had enough time. auto-merge toggling on...
