
How to add the --shm-size parameter at runtime #1821

Closed
kongxiangya opened this issue Mar 23, 2023 · 6 comments · Fixed by #1972

Comments

@kongxiangya

Shm errors in running deep learning models within a docker container

Error: Unexpected bus error encountered in worker. This might be caused by insufficient shared memory (shm).

Reason: the shm size of the container is too small.

When creating an analysis container with cwltool, I would like to be able to pass the --shm-size 4g option through configuration, equivalent to:

docker run --shm-size 4G

Can you add this configuration? Thank you

@tetron
Member

tetron commented Mar 23, 2023

This is interesting, I was not familiar with this option.

I'd support adding something like a ShmSize requirement to specify that /dev/shm is needed and should be a certain size, with the restriction that it can only be used by the CommandLineTool or subprocesses launched by it within the container (no communication with processes outside the container).

I would accept a pull request to do something like this. Adding a new extension is slightly involved but if you are interested in doing the work to implement it we can walk you through it.

@kongxiangya
Author

Thank you for your reply

Should this configuration work like --strict-memory-limit, in conjunction with a value in the CWL file? For example, add a --set-shm-size flag and read an shmSize value from the requirements section of the CWL file.

Or should it work more like --no-read-only, but take a parameter value? For example, add --set-shm-size 5g to apply this configuration when running the tool.

@tetron
Member

tetron commented Mar 23, 2023

It should be part of the requirements section of the CWL file. For example:

class: CommandLineTool
requirements:
  cwltool:ShmSize:
    shmSize: 4096
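For the cwltool: prefix in that example to resolve, the tool document would also declare the cwltool extension namespace at the top level (this is the URI cwltool uses for its extensions):

```yaml
$namespaces:
  cwltool: "http://commonwl.org/cwltool#"
```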

The gist of what you need to do is:

  1. Add the schema to cwltool/extensions-v1.2.yml
  2. Add it to the list of supportedProcessRequirements in cwltool/process.py
  3. In DockerCommandLineJob.create_runtime in cwltool/docker.py call self.builder.get_requirement("http://commonwl.org/cwltool#ShmSize") and use that to add --shm-size to the command line
  4. Add a test in tests/test_docker.py
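Step 3 might look roughly like the following standalone sketch. The helper name append_shm_size and the shape of the requirement dict are illustrative assumptions, not cwltool's actual API; in cwltool itself the dict would come from self.builder.get_requirement("http://commonwl.org/cwltool#ShmSize"):

```python
def append_shm_size(runtime, requirement):
    """Append --shm-size to a docker run command line.

    `runtime` is the list of docker arguments being built;
    `requirement` is the ShmSize requirement dict (or None if
    the tool did not request one).
    """
    if requirement is not None and "shmSize" in requirement:
        runtime.extend(["--shm-size", str(requirement["shmSize"])])
    return runtime

# Hypothetical usage:
runtime = ["docker", "run"]
append_shm_size(runtime, {"class": "cwltool:ShmSize", "shmSize": "4096m"})
# runtime is now ["docker", "run", "--shm-size", "4096m"]
```

When no requirement is present, the command line is left untouched, so Docker falls back to its default /dev/shm size.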

@mr-c
Member

mr-c commented Mar 23, 2023

Is there a Singularity/Apptainer equivalent to this? Likewise podman, udocker, ...

@kongxiangya
Author

Hi @tetron

Thanks for your answer

Due to limited capability and time, I patched it at the outermost layer to quickly meet my current needs. If other researchers have this requirement as well, you can consider whether to develop this configuration properly.

My changes are:

  1. argparser.py, add under arg_parser:

     parser.add_argument(
         "--set-shm-size",
         type=str,
         help="Set ShmSize, default 64m",
         default="64m",
         dest="set_shm_size",
     )

  2. context.py, add under RuntimeContext:

     self.set_shm_size = "64m"  # type: str

  3. docker.py, add under create_runtime:

     runtime.extend(["--shm-size", runtimeContext.set_shm_size])

Finally, add configuration at runtime:

cwltool_run --set-shm-size 5g
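Assembled outside cwltool, the argparse part of the patch above behaves like this (a standalone sketch; only the flag definition itself is taken from the patch):

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument(
    "--set-shm-size",
    type=str,
    help="Set ShmSize, default 64m",
    default="64m",
    dest="set_shm_size",
)

# With no flag given, the 64m default (Docker's own /dev/shm default) is kept:
assert parser.parse_args([]).set_shm_size == "64m"

# The invocation from the comment above:
args = parser.parse_args(["--set-shm-size", "5g"])
assert args.set_shm_size == "5g"
```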

@kongxiangya
Author

> Is there a Singularity/Apptainer equivalent to this? Likewise podman, udocker, ...

Hi @mr-c

Do you mean whether podman and udocker also have a --shm-size configuration?

Due to restrictions on our use of udocker, I only tested podman; the results show it also supports this configuration.
