OpenFOAM MPI in the Docker/Singularity container. #542

Closed
bnordgren opened this issue Dec 5, 2024 · 5 comments

@bnordgren

There are rumors of a problem with the containerization of WindNinja as it relates to the MPI libraries. I have been asked to help. Let's have a meeting, but prior to the meeting, I'd like to ask that @sathwikreddy56 or @dgh007786 define the problem more precisely and make available the Dockerfile that exhibits the problem. As we may well be pursuing solutions in our own separate environments, may I suggest that one of you make a development branch (call it, say, HPC or containerization) into which you load the exact source that is failing on the HPC cluster. We can all work off of that and merge changes back to master when we're done. To summarize, we need three things:

  1. A concise statement of the problem detailed enough that it can be replicated.
  2. The exact code exhibiting the problem (in the dev branch you make).
  3. A self-contained dataset that's small enough to download, but which causes the problem.

We can review these at the meeting, but I will be better prepared to ask questions that aren't stupid if I have the above.

@sathwikreddy56
Contributor

sathwikreddy56 commented Dec 5, 2024

Sure Bryce,

The issue we are facing is mainly with the momentum solver, which uses OpenFOAM.

  1. The main error is that the OpenFOAM installed in the Docker container tries to connect to the host installation instead of the one installed in the container, which mainly causes two errors when the path is not passed in properly. I have found that mounting the /home directory causes the issue, and I am currently testing a version of the container that mounts a test directory instead of /home. As a backup, I found the OpenFOAM 8 Docker implementation, which I am trying to align with our requirements for the WindNinja installation.

  2. The other issue is the use of multithreading in the application:
    As we know, WindNinja can use a multithreaded OpenMPI implementation for OpenFOAM. When we initialize the app to use more than one thread, this causes errors in OpenFOAM at the reconstructPar function. On further analysis, I found that this error is similar to (OpenFoam error when using more than one thread in Docker #497), which is caused when the host machine doesn't allow containers to use multiple cores. The online forums say this is due to some kind of lock implemented at the kernel level: when the Docker system calls the host kernel, the lock set at initialization can't get the exact CPU core it was locked to and fails. Their suggestion was to use the host MPI installation path instead of the container-installed MPI. I have tried that but wasn't successful in mitigating the problem. So, as a naive developer, I didn't want to upset OpenFOAM and used one thread, with which it ran fine. I am still trying to figure out alternative solutions to this problem.
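Before digging into the kernel-level lock described above, it may help to confirm how many cores the container process is actually allowed to use. The sketch below is only a precondition check, not a diagnosis of the reconstructPar failure. The `requested` value is a hypothetical stand-in for the thread count passed to WindNinja.

```shell
# Hypothetical thread count; in practice this would be the value given to WindNinja.
requested=4

# nproc reports the processing units available to this process;
# fall back to getconf where nproc is not installed.
available=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN)

# Cap the request at what the container can actually see.
if [ "$requested" -le "$available" ]; then
  usable=$requested
else
  usable=$available
fi

echo "requested=$requested available=$available usable=$usable"
```

If `available` comes back as 1 inside the container but is larger on the host, that would be consistent with the host restricting the container to a single core.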

These two issues are currently the major hurdles to running WindNinja momentum simulations. These simulations are also a high priority going forward, so I am working on both issues to find the root cause.

As for the code, the current master branch is a working implementation for diagnosing the issue.

I have a set of self-contained datasets ready for checking whether the container works properly with the multithreading-enabled OpenFOAM installation.

@bnordgren
Author

Meeting notes: 12/5/24

  • Problem 1: The build assumes that the FOAM_DIR environment variable persists from RUN statement to RUN statement, but it doesn't. OpenFOAM is built and installed inside the shell script build_deps_ubuntu_2004.sh, and it is installed into the WindNinja build two RUN statements later. So we need to re-source the /opt/openfoam8/etc/bashrc file at the start of that RUN statement for the environment variable to be accessible. Sathwik will rebuild the container image with this change, and we should see the files copied to the right place.
  • Problem 2: We're going to pursue this under OpenFoam error when using more than one thread in Docker #497
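The behavior behind Problem 1 can be reproduced outside Docker. Each RUN statement starts a fresh shell, so the sketch below models three RUN statements as three separate `sh -c` invocations. The temporary file is a stand-in for /opt/openfoam8/etc/bashrc, and the value it exports is illustrative only; the real file sets many more variables.

```shell
# Make sure nothing leaks in from the calling environment.
unset FOAM_DIR

# Stand-in for /opt/openfoam8/etc/bashrc (illustrative content only).
fake_bashrc=$(mktemp)
echo 'export FOAM_DIR=/opt/openfoam8' > "$fake_bashrc"

# "RUN" 1: source the file and read the variable in the same shell.
run1=$(sh -c '. "$1"; echo "${FOAM_DIR:-unset}"' sh "$fake_bashrc")

# "RUN" 2: a new shell; nothing exported in RUN 1 survives.
run2=$(sh -c 'echo "${FOAM_DIR:-unset}"')

# "RUN" 3: a new shell that re-sources the file first.
run3=$(sh -c '. "$1"; echo "${FOAM_DIR:-unset}"' sh "$fake_bashrc")

echo "RUN 1: $run1"
echo "RUN 2: $run2"
echo "RUN 3: $run3"

rm -f "$fake_bashrc"
```

In the Dockerfile, the equivalent would presumably be to start the relevant RUN statement by sourcing /opt/openfoam8/etc/bashrc in the same shell as the install step. Since Docker runs RUN statements with /bin/sh by default, and assuming the OpenFOAM bashrc expects bash, that step may need to be wrapped in an explicit `/bin/bash -c "..."` call.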

@bnordgren
Author

bnordgren commented Dec 6, 2024

Testing required before the next meeting, limited to problem 1:

Once these are accomplished, we are ready to test: Can Windninja start a run that uses OpenFOAM MPI without connecting to the host? This is the topic of #497.
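One quick check before starting such a run is to see where the relevant executables resolve inside the container. This is a minimal sketch: /opt/openfoam8 is the prefix named in the meeting notes, but the choice of `reconstructPar` and `mpirun` as the commands to check, and the helper name `resolves_under`, are assumptions made for illustration.

```shell
# Report whether a command resolves to a path under an expected prefix.
# Returns 0 if it does, 1 if it resolves elsewhere, 2 if it is not found.
resolves_under() {
  prefix=$1
  resolved=$(command -v "$2") || return 2
  case "$resolved" in
    "$prefix"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Check each tool and print where it was found.
for tool in reconstructPar mpirun; do
  resolves_under /opt/openfoam8 "$tool"
  case $? in
    0) echo "$tool: under /opt/openfoam8" ;;
    1) echo "$tool: resolves elsewhere ($(command -v "$tool"))" ;;
    *) echo "$tool: not found on PATH" ;;
  esac
done
```

A tool that resolves under a bind-mounted host directory, rather than under the container's own install prefix, would be worth a closer look.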

@bnordgren
Author

bnordgren commented Dec 6, 2024

Meeting notes 12/6/2024:

  • Sathwik tested using a windninja run, and the OpenFOAM MPI version did not attempt to contact the MPI installation on the host. The /home directory was not mounted during the test.
  • For the purposes of this ticket, any time we say "/home" was not mounted, we mean not explicitly mounted. However, Singularity mounts the home directory by default and does not require an explicit mount directive. So we assume that it was always mounted.
  • The difference between "working" and "not working", as initially reported by Sathwik, is likely that "working" meant his script directory had moved from his home directory to a project directory, which he then explicitly mounted. The scripts were identical.
  • We're agreed that the current iteration of the scripts, when running from a project directory instead of the home directory, does not cause the MPI process inside the container to try to contact the MPI processes on the host.
  • To check whether the initially reported problem still exists, or whether it was fixed by correctly installing OpenFOAM into the WindNinja installation, Sathwik is going to try moving the scripts back into the home directory and running them from there. Test successful.
  • We're agreed we can close this issue because the MPI processes inside the container are not attempting to contact the MPI installation on the host. Sathwik will commit the current state of his build directory with a comment that closes this issue.
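The distinction between an implicitly mounted home directory and an explicitly mounted project directory can be sketched as follows. The helper prints the extra `--bind` argument a script directory would need, assuming the home directory is mounted by default as described above. The directory paths, the image name `windninja.sif`, and the helper name `bind_args_for` are hypothetical, and the commands are only printed, not executed.

```shell
# Print the bind arguments (if any) needed to make a directory visible
# inside the container, assuming $HOME is mounted by default.
bind_args_for() {
  case "$1" in
    "$HOME"|"$HOME"/*) ;;                    # already visible via the home mount
    *) printf '%s %s\n' --bind "$1" ;;       # needs an explicit bind
  esac
}

# Hypothetical locations and image name, for illustration only.
home_scripts="$HOME/windninja_scripts"
project_scripts="/project/windninja_scripts"
image="windninja.sif"

for dir in "$home_scripts" "$project_scripts"; do
  # Show the command that would be run for each script location.
  echo "singularity exec $(bind_args_for "$dir") $image ls $dir"
done
```

Under these assumptions, the two test configurations from the notes differ only in whether the `--bind` argument is present, which matches the observation that the scripts themselves were identical.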

@sathwikreddy56
Contributor

The issue with OpenFOAM is resolved by a change in the Dockerfile. This is now reflected in the apptainer_test branch; after further testing, it will be pulled into master.
