Incompatibility with charliecloud@0.34 #4463
Comments
I realized that this issue is tied to $CH_IMAGE_STORAGE. ch-run refuses to allow images from its storage directory by path, and accepts these only by name.
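For anyone who wants to check whether a given image path would hit this, here is a minimal Python sketch that tests whether a path lies inside the storage directory. It is not taken from Nextflow or Charliecloud: the function names are hypothetical, and the fallback location `/var/tmp/$USER.ch` is my understanding of Charliecloud's documented default, so treat it as an assumption.

```python
import os
from pathlib import Path


def storage_dir(env=None):
    """Return the storage directory from CH_IMAGE_STORAGE, or the assumed default."""
    env = os.environ if env is None else env
    explicit = env.get("CH_IMAGE_STORAGE")
    if explicit:
        return Path(explicit)
    # Assumption: Charliecloud falls back to /var/tmp/$USER.ch when the variable is unset.
    return Path("/var/tmp") / (env.get("USER", "unknown") + ".ch")


def is_inside_storage(image_path, env=None):
    """True if image_path resolves to the storage directory or anything below it."""
    storage = storage_dir(env).resolve()
    image = Path(image_path).resolve()
    return image == storage or storage in image.parents
```

If `is_inside_storage(path)` is true, passing that path to ch-run is expected to produce the error from this report, and the image would have to be referenced by name instead.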
Charliecloud project lead here. Thanks for this bug report! This behavior is by design. In general, the storage directory has never been a stable API, so folks should not depend on its structure. However, we've only been erroring on You can now run images in storage by specifying their name, e.g. instead of However,
An alternative to See also the embarrassingly old hpc/charliecloud#96.
Since charliecloud will soon provide
Unfortunately, no version of Charliecloud is fine with
I guess I should have said 'works, although it is not a great idea' :). I think using
The sole purpose of that The shared (multi-user) cache directory you are using is another can of worms though, I currently don't see how that can work properly with the rather recent git-based layer caching used in @nschan
I generally agree that breaking backwards compatibility could be justified in this case. I see the issues with the multi-user cache. In practice that cache is pretty much only used by me, but we will move away from a shared cache. Anyhow, the approach you suggested does not currently work (nextflow@23.10.1, charliecloud@0.36~pre+78aae5e), but I assume that is something for @reidpr?
When I try to run the container manually with
This is using quay.io/biocontainers/agat:1.1.0--pl5321hdfd78af_1 Edit: Another edit:
To sum up: I think the fact that relying on I think an alternative could be something along the lines of building writable, process-specific containers. This would hopefully solve most of the problems associated with running containers from a cache. I admit my understanding of containers is pretty basic, but as I read the current implementation all parallel processes that use the same container image run in the same container (irrespective Of course guess adding
“Stale file handle” is something about inode re-use. I know it often comes up in NFS. If it’s a Charliecloud bug we definitely want to fix it; is there a reproducer without Nextflow?
Processes on the same node that are in different containers cannot use certain system calls to communicate, and this has performance impact. That’s the motivation for Note that multiple containers can be running the same image. I’d be skeptical of a separate image per process. That’d be a lot of images on a wide computer. Another thing to consider is hpc/charliecloud#1408. This would be a supported way to create a container modified by a shell command (e.g., to add mount points). One could also I hope that helps.
I could not reproduce the "stale file handle" outside nextflow; it might also have other, filesystem-related causes.
I was referring to nextflow processes (tasks), which I think should not communicate with each other while running anyway. What is not really clear to me is what would happen if I have several (simultaneous) jobs running containers of the same image and for some reason input-specific files are created in a certain folder of the image (e.g. If that could happen I think this would argue in favour of using
Looks like I misunderstood. Thanks for clarifying.
Bug report
Running nextflow@23.10.0 together with charliecloud@0.34 yields an error:
error: can't run directory images from storage (hint: run by name)
Expected behavior and actual behavior
The expected behaviour is that nextflow is able to start the container.
This is related to a change in Charliecloud's behaviour (hpc/charliecloud#1505).
Steps to reproduce the problem
This problem should occur in any case where charliecloud > 0.33 is used as the container engine for nextflow. I do not think that this problem is specific to the nextflow version used.
Program output
Environment
Additional context