Version upgrades #38
Why does the docker build fail on the PR runner? 😕 The step `HDF5_DIR=/usr pip3 install -v --install-option="\'--hdf5 \${HDF5_DIR}\'" --install-option="\'--jobs=$(nproc) \'" --no-build-isolation tables` fails there with a pip error, but it doesn't fail locally, and it looks like it worked fine in PR 37? 🤔 |
After removing build caches and trying the build again, I was able to reproduce this locally, so something in pip/tables changed in the last few days and turned what used to be a working option into a non-working one. I'll work on resolving this. |
Failing now with exit code 143, which comes from the host, not the build, but the
|
Exit Code 143 in this case is probably a resource issue: actions/runner-images#6680 |
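For context on the number itself: shells report 128 + N when a process is killed by signal N, so 143 corresponds to signal 15 (SIGTERM), which fits the process being terminated from outside rather than failing on its own. A minimal sketch for decoding such a code follows; the function name `signal_for_exit_code` is made up for illustration and is not part of this repo.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of this repo): map a shell exit code to the
# signal that terminated the process, using the 128 + N convention.
signal_for_exit_code() {
    local code="$1"
    if [ "$code" -gt 128 ]; then
        # kill -l <number> prints the signal name for that signal number.
        kill -l "$((code - 128))"
    else
        # Codes of 128 or below are ordinary exit statuses, not signals.
        echo "none"
    fi
}

signal_for_exit_code 143   # prints TERM
```

This only identifies the signal; it does not say why the runner sent it, so the resource explanation above remains a hypothesis.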
Merging this PR is high priority. It looks like the GitHub runners are all failing. My suspicion (I'm inexperienced) is that the 2-core, 8 GB runner is getting overwhelmed. We will not have larger runners available quickly, though we should probably work on that. If we can confirm that this builds on several external platforms, does anyone disagree with merging the PR over the failed tests? I'd propose that we run tests, or confirm already-complete tests, on the following environments, then merge the PR:
@hellkite500 @arpita0911patel -- please briefly weigh in here. test sequence:
Git download command
|
I'm ok with merging for now. I built on apple arm locally and I'm working on testing an x86 cross build. Some additional validation as mentioned would be good. |
/test_docker_build/docker$ docker buildx build -f Dockerfile -t local/dmod_ngen_test_DOESTHISBUILDWORK .
Use the command below instead.
|
I was able to install docker-ce on Ubuntu.
I'm getting an error that is obviously due to disk space. What is the minimum disk space needed to run this build?
|
I built 4 times, and the build process got killed while running the line below.
|
This is getting strange. @benlee0423 I don't actually see an error in the build log, it just cuts off, but it looks like around the same place that @jameshalgren build was getting a I'm thinking this may be some environment setting that is hindering the compiler -- I'll keep investigating and see what I can find. |
@jameshalgren on a fresh build (I had to factory reset rancher, again...) I was able to hit what looks like the same error you did, but I noticed a
|
@hellkite500
|
We may not be able to cross compile from M1 macs, see docker/for-mac#6204 which references an upstream qemu bug/issue that I bet we are running into. Other references to the same issue: I imagine we are running into similar issues. We will need to test on a native x86 build. |
@jameshalgren on your native Mac build, how much RAM do you have allocated to your Rancher VM? |
16 GB. Retrying with 48 GB. |
As for the x86 issues, I was able to get an environment to test this build in, and I think I can generally reproduce the failure seen by @benlee0423. The problem there isn't an explicit failure, but it looks to me like a potential fork bomb is eating up all the resources and hanging/crashing the machine. It might be related to this issue documented on ngen. I'll dig into that a little more and see. Here is what I see when the build hangs and crashes the machine. But this is definitely the compiler spawning, not the testing process, so it may not be related to the mentioned issue at all... |
So I think I sorted this out....we asked make to do it, and make did it...from the make documentation (which is the build generator used)
and in the ngen build line, we use I'll push a fix momentarily that ensure |
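For readers following along: GNU make treats a bare `-j` (no number) as "no limit on parallel jobs", which would match the runaway compiler spawning described above. Below is a minimal sketch of one way to guard against that, assuming the job count comes from a variable that can end up empty. The names `make_jobs_flag` and `JOBS` are hypothetical and are not taken from the Dockerfile, and this is not necessarily the fix that was pushed.

```shell
#!/usr/bin/env bash
# Hypothetical helper: always emit an explicit job count for make, so a
# missing value never degrades into a bare -j (unlimited parallel jobs).
make_jobs_flag() {
    local jobs="${1:-}"
    case "$jobs" in
        # Empty, non-numeric, or zero input: fall back to a single job.
        ''|*[!0-9]*|0) jobs=1 ;;
    esac
    echo "-j${jobs}"
}

# Assumed usage in a build line: make $(make_jobs_flag "${JOBS:-}")
make_jobs_flag ""     # prints -j1
make_jobs_flag "4"    # prints -j4
```

Falling back to one job is the conservative choice here: a slow build is easier to diagnose than a machine that hangs.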
My local arm build and an AWS x86 build both successful for me now! |
Docker build is successful in AWS arm64.
Nice job @hellkite500 on getting this to work. |
Works as advertised.
This image also builds and runs successfully on arm64, so this should allow cross compiling of x86 and arm images.
These commits do a few things worth noting.
Note that in step 4, the tests that are built and run will automatically be adjusted based on the build flags/args, and if any test fails, the build will fail.
Closes #24 from a technical perspective: ngen parallel should work, but the realization in that issue may need to be debugged with correct library paths.
Closes #25
Can also close #20