Charliecloud and shared cacheDir #3367
Comments
Tagging @phue who is the master of Charliecloud support in Nextflow
The [...] I agree that this is far from optimal in the context of a shared cache directory, because each user will potentially bind mount many directories, but I'm afraid this cannot be solved in Nextflow.
Hi @phue, thank you for your answer. I agree with you that the overlayfs support is problematic in Charliecloud. The main problem for me is working with SquashFS images, where you cannot mkdir within the image, or with plain-format images, where users can mess up the content (plus the huge number of files associated with a plain-format shared cache of images). For the [...] But this also means that if I want to bind the [...] Do you think that providing the option to enable or disable the [...]
That's exactly the point, the empty folder needs to exist in the filesystem tree. This is somewhat guaranteed
Yes, this could be done with an additional config parameter in the charliecloud scope; in fact, the implementations for Singularity and Docker already have such a parameter ([...]). But it should default to [...]
Hi @phue, I finally had time to write the mentioned modification. In addition to the [...] However, during my investigation, I encountered another problem: Nextflow tries to make concurrent pulls of images with Charliecloud and throws the following error:
This error is expected if different instances of Nextflow (or different users) try to pull images into the same storage directory at the same time, but not from the execution of a single pipeline. I have seen some code with a [...]
This is tricky, the [...] Edit: charliecloud [...]
OK, the function is clearer now. I understand that the case of multiple instances/users is difficult to handle, and for me the error is expected in that case. However, the error I get occurs when pulling multiple containers for a single process instance (for example, a DSL2 nf-core pipeline with multiple containers):
In this case Nextflow tries to pull [...]
As far as I understand, this is because Charliecloud locks the entire (top-level) storage directory when a container is pulled. Could you try if the [...]
Now I get the following error:
I see... looks like this would break it for a single Nextflow instance
@l-modolo please include the full error stack trace
Hi @pditommaso, Here is the corresponding `.nextflow.log` file
I have to say I have never seen this before. I don't know if it's a problem with the FileMutex implementation or with the use made of it by Charliecloud.
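To make the locking discussion above more concrete, here is a minimal Python sketch of serializing pulls into one shared storage directory with an advisory file lock. It is not Nextflow's FileMutex and not Charliecloud's own locking; the lock file name `.pull.lock`, the helper names, and the `pull_one` callable are all hypothetical, and advisory locks may behave differently on shared network filesystems.

```python
import fcntl
import os
import threading
from contextlib import contextmanager


@contextmanager
def storage_lock(storage_dir):
    """Hold an exclusive advisory lock for the whole storage directory."""
    # Hypothetical lock file name; any agreed-upon path inside the directory works.
    lock_path = os.path.join(storage_dir, ".pull.lock")
    with open(lock_path, "w") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is free
        try:
            yield
        finally:
            fcntl.flock(handle, fcntl.LOCK_UN)


def pull_all(storage_dir, images, pull_one):
    """Start one thread per image, but let only one pull run at a time."""

    def worker(image):
        with storage_lock(storage_dir):
            pull_one(image)  # assumption: caller supplies the actual pull logic

    threads = [threading.Thread(target=worker, args=(image,)) for image in images]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
```

Because each worker opens the lock file separately, the same mechanism would also serialize separate processes on one host, which is closer to the multi-user case mentioned earlier in the thread.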
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Bug report
Using Nextflow with charliecloud.enabled=true with a shared `cacheDir` on an HPC
Expected behavior and actual behavior
By default the `ch-run` command uses the result of `makeVolumes` as the default bind (`-b`) option, binding the full path to the pipeline working directory where only the root folder of this path is needed.
Steps to reproduce the problem
On my HPC, the `.command.run` runs something like:
which returns
the following `ch-run` command works correctly:
Program output
(Copy and paste here output produced by the failing execution. Please highlight it as a code block. Whenever possible upload the `.nextflow.log` file.)
Environment
Additional context
I would expect the `ch-run` command to run without the `-w` flag enabled. `-w` stands for "Mount image read-write (by default, the image is mounted read-only)". This is problematic because some artifacts owned by the last user of the image can be created in the shared image cacheDir.
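As a rough illustration of the behavior requested in this report, the following Python sketch builds a `ch-run` argument list that binds only the top-level directory of each path and adds `-w` only when explicitly asked. This is one reading of "only the root folder of this path is needed", not Nextflow's actual `makeVolumes` logic; the function names and every path in the usage note are hypothetical.

```python
from pathlib import PurePosixPath


def bind_roots(paths):
    """Reduce absolute paths to their top-level directories, keeping first-seen order."""
    roots = []
    for path in paths:
        parts = PurePosixPath(path).parts
        if len(parts) < 2 or parts[0] != "/":
            raise ValueError(f"expected an absolute path below /, got {path!r}")
        root = "/" + parts[1]
        if root not in roots:
            roots.append(root)
    return roots


def ch_run_command(image, paths, command, writable=False):
    """Assemble a ch-run argument list; the image stays read-only unless writable=True."""
    argv = ["ch-run"]
    if writable:
        argv.append("-w")  # mount the image read-write
    for root in bind_roots(paths):
        argv += ["-b", f"{root}:{root}"]  # bind SRC:DST at the same location
    argv += [image, "--"] + list(command)
    return argv
```

For example, `ch_run_command("/shared/cache/img.sqfs", ["/scratch/alice/work/ab/123"], ["bash", ".command.sh"])` produces a command with a single `-b /scratch:/scratch` bind and no `-w`.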