-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Docker builds using GitHub Actions #274
Docker builds using GitHub Actions #274
Conversation
be0f032
to
c835914
Compare
- bump mpich version to 4
- after mpich bump, tried to bump python version - this did not work, resulted in segfault in mpi version
c835914
to
e2f5600
Compare
- previous commits included bumping to py39. - going back to py3.7.3 as bumping created segfault in mpi version - added info on build readme
e2f5600
to
eefba22
Compare
- adds pull_request to actions - use COPY instead of git clone, for testing branches - builds using dockerhub org instead of "samwelborn" or "openchemistry" - GH checkout action changed to recursively grab submodules
2d3fa48
to
1cc8150
Compare
It looks to me that the segmentation fault is occurring inside this constructor, which mainly converts the vector to a pybind11 array. I have two guesses for fixes:
@swelborn Can you try also upgrading the version of pybind11 when you upgrade the version of Python, and see if you still encounter the error? Upgrading pybind11 however may also cause some compile errors due to API changes, but we'll see what happens! |
@psavery We tried upgrading the pybind11 version I believe. |
I wonder if we need to acquire the GIL? Like adding this at the beginning of the function: py::gil_scoped_acquire acquire; But then again, I don't know how the GIL could be a problem only for the MPI version. |
@psavery @cjh1 I will open up a new issue on upgrading the python version. Here is a related github issue pybind/pybind11#1042. Seems like that is where the |
@swelborn Are the two failures expected? |
@cjh1 The reason for that is likely that I have set |
pin h5py, don't use environment_after.yml
139042d
to
b36cebc
Compare
Was failing with `ImportError: libarchive.so.13: cannot open shared object file: No such file or directory`
package does not exist
- install dev packages (gdb, vim) - add libarchive to fix mamba issue - move Big MPI to base build
Restructuring workflow
I made significant changes to this PR. It still provides the same functionality, but I think it is more coherent and avoiding of duplicated code. Here are the primary changes:
Probably the biggest change is ability to change dependencies easily, like: python-version: ['3.7', '3.8', '3.9', '3.10'] to python-version: ['3.7'] This was already helpful... I acquired gil and fixed #275, and used the git versioning to test each build with sfapi_client at NERSC, see here for the script I used. I will detail this in a separate PR after this is approved/merged. Note: the |
This PR adds a GitHub Actions docker build workflow to facilitate faster development cycles, avoid docker build timeouts on dockerhub, and prepare for the future (python/ubuntu versions will be out of support soon).
Instead of pip installing everything, the new Docker images contain a conda environment. This is in order to facilitate (potentially) adding non pip-installable packages to stempy.
On migrating to newer versions, I first bumped to mpich 4 prior to changing python. After bumping python 3.7.3 -> python 3.8 || python 3.9, a segfault occurs when running MPI counting. Backtrace is below. This does not seem to occur on the non-MPI version.
The new images have been tested (briefly), but need to undergo more rigorous tests before messing with the image people are actively using. Thus, a separate job within the workflow builds the original images (with minor modifications to the Dockerfile (for some reason the COPY command was not recursively copying pybind11). New versions will be under different tags.
The following images will be built:
along with base images that serve as a "cache":
Here is the bt: