OpenFOAM MPI in the Docker/Singularity container. #542

bnordgren · 2024-12-05T19:21:16Z

There are rumors of a problem with the containerization of WindNinja as it relates to the MPI libraries. I have been asked to help. Let's have a meeting, but prior to the meeting, I'd like to ask that @sathwikreddy56 or @dgh007786 define the problem a little more closely and make available the Dockerfile that exhibits the problem. As we may well be pursuing solutions in our own separate environments, may I suggest that one of you make a development branch (call it, say, HPC or containerization) into which you load the exact source that is failing on the HPC cluster. We can all work off of that and merge changes back to master when we're done. To summarize, we need three things:

A concise statement of the problem detailed enough that it can be replicated.
The exact code exhibiting the problem (in the dev branch you make).
A self-contained dataset that's small enough to download, but which causes the problem.

We can review these at the meeting, but I will be better prepared to ask questions that aren't stupid if I have the above.

sathwikreddy56 · 2024-12-05T19:45:03Z

Sure Bryce,

The issue we are facing is mainly with the momentum solver, which uses OpenFOAM.

The main error is that the OpenFOAM installed in the Docker container tries to connect to the host installation instead of the one installed in the container. This mainly causes two errors when the path is not passed in properly. I have found that mounting the /home directory causes the issue, and I am currently testing a version of the container that mounts a test directory instead of /home. As a backup, I also found the OpenFOAM 8 Docker implementation, which I am trying to align with our requirements for the WindNinja installation.
The other issue at hand is the use of multithreading in the application:
As we know, WindNinja can run OpenFOAM with a multithreaded OpenMPI implementation. When we initialize the app to use more than 1 thread, this causes errors in OpenFOAM at the reconstructPar function. On further analysis, I found that this error is similar to the one in "OpenFoam error when using more than one thread in Docker #497", which I found is caused when the host machine doesn't allow containers to use multiple cores. The online forums say that this is due to some kind of lock implemented at the kernel level: when the Docker system calls the host kernel, the lock set at initialization can't get the exact CPU core it was locked to, and it fails. Their suggestion was to use the host MPI installation path instead of the container-installed MPI. I have tried that but wasn't successful in mitigating the problem. So, as a naive developer, I didn't want to upset OpenFOAM and used 1 thread, with which it was running fine. I am still trying to figure out alternative solutions to this problem.

These 2 issues are currently the major hurdles that need to be addressed for running WindNinja momentum simulations. These simulations are also high priority for future work, so I am currently working on them to find the root cause.

Coming to the code part, the current master branch is a working implementation for diagnosing the issue.

I have a set of self-contained datasets ready for checking whether the container works properly with the multithreading-enabled OpenFOAM installation.

bnordgren · 2024-12-05T21:14:00Z

Meeting notes: 12/5/24

Problem 1: The build assumes that the FOAM_DIR environment variable persists from RUN statement to RUN statement, but it doesn't. OpenFOAM is built and installed inside the shell script build_deps_ubuntu_2004.sh. It is installed into the WindNinja build two RUN statements later, so we need to re-source the /opt/openfoam8/etc/bashrc file at the start of that RUN statement for the environment variable to be accessible. Sathwik will rebuild the container image with this change, and we should then see the files copied to the right place.
Problem 2: We're going to pursue this under "OpenFoam error when using more than one thread in Docker #497".
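
The behavior behind Problem 1 can be reproduced outside Docker, since each RUN statement executes in its own shell. The following is only a sketch: a temporary file stands in for /opt/openfoam8/etc/bashrc, and the value it exports is an assumption chosen so the example runs on any machine.

```shell
# Make sure nothing leaks in from the calling environment.
unset FOAM_DIR

# Stand-in for /opt/openfoam8/etc/bashrc (assumption: the real file is not needed here).
demo_rc=$(mktemp)
echo 'export FOAM_DIR=/opt/openfoam8' > "$demo_rc"

# "RUN" step 1: source the file. The variable lives only inside this shell.
sh -c '. "$1"' sh "$demo_rc"

# "RUN" step 2: a fresh shell, so the variable set in step 1 is gone.
without_source=$(sh -c 'echo "${FOAM_DIR:-unset}"')

# "RUN" step 2, fixed: re-source the file at the start of the same step.
with_source=$(sh -c '. "$1" && echo "${FOAM_DIR:-unset}"' sh "$demo_rc")

echo "without re-sourcing: $without_source"
echo "with re-sourcing:    $with_source"

rm -f "$demo_rc"
```

In the Dockerfile, the corresponding change would be to source the bashrc file and run the install commands within the same RUN statement, so that both happen in one shell.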

bnordgren · 2024-12-06T19:33:39Z

Testing required before the next meeting, limited to problem 1:

Did files from OpenFOAM's install get copied to the correct place in the WindNinja install inside the container?
Can you manually start a simple hello_world MPI program inside the container without connecting to the host? Use the following for testing:
- https://github.com/open-mpi/ompi/blob/8d711976aae4325b24187a93dff0ab3ab9e42d5b/examples/hello_c.c
- https://docs.open-mpi.org/en/v5.0.x/launching-apps/localhost.html
Can you submit a single-node hello_world job to the Slurm queue and have it execute on a compute node without trying to connect to the host?

Once these are accomplished, we are ready to test: Can WindNinja start a run that uses OpenFOAM MPI without connecting to the host? This is the topic of #497.
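
For the single-node Slurm check, a batch file along the following lines might serve as a starting point. This is a sketch under assumptions: the image name windninja.sif, the location /opt/hello_c of the compiled hello_c example inside the image, the file name hello_mpi.sbatch, and the resource limits are all hypothetical and need to be adapted to the actual cluster and image.

```shell
# Write a minimal single-node Slurm batch file (nothing is submitted here).
batch_file=hello_mpi.sbatch

cat > "$batch_file" <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello_mpi
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --time=00:05:00

# Start mpirun from inside the container so that only the MPI installed
# in the image is involved. Image name and binary path are hypothetical.
singularity exec windninja.sif mpirun -n 2 /opt/hello_c
EOF

echo "wrote $batch_file"
```

The file can then be submitted with `sbatch hello_mpi.sbatch`. Running the same `singularity exec` line interactively first covers the manual hello_world check in the list above.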

bnordgren · 2024-12-06T22:03:20Z

Meeting notes 12/6/2024:

Sathwik tested using a WindNinja run, and the OpenFOAM MPI version did not attempt to contact the MPI installation on the host. The /home directory was not mounted during the test.
For the purposes of this ticket, any time we say "/home" was not mounted, we mean not explicitly mounted. However, Singularity mounts the home directory by default and does not require an explicit mount directive, so we assume that it was always mounted.
The difference between "working" and "not working", as initially reported by Sathwik, is likely that "working" meant that his script directory moved from his home directory to a project directory, which he then explicitly mounted. The scripts were identical.
We're agreed that the current iteration of the scripts, when run from a project directory instead of the home directory, does not cause the MPI process inside the container to try to contact the MPI processes on the host.
To check whether the initially reported problem still exists, or whether it was fixed by correctly installing OpenFOAM into the WindNinja installation, Sathwik is going to try moving the scripts back into the home directory and running them from there. Test successful.
We're agreed we can close this issue because the MPI processes inside the container are not attempting to contact the MPI installation on the host. Sathwik will commit the current state of his build directory with a comment that closes this issue.

sathwikreddy56 · 2024-12-09T21:27:47Z

The issue with OpenFOAM is resolved by a change in the Dockerfile. This is now reflected in the apptainer_test branch; after further testing, it will be pulled into master.

bnordgren added the component:ninjafoam label Dec 5, 2024

nwagenbrenner assigned bnordgren, dgh007786 and sathwikreddy56 Dec 5, 2024

bnordgren mentioned this issue Dec 6, 2024 in "OpenFoam error when using more than one thread in Docker #497" (Open)

sathwikreddy56 closed this as completed Dec 9, 2024
