v3.0.x: plm_slurm_module: adjust for new SLURM CLI options #6674

Merged
27 changes: 18 additions & 9 deletions orte/mca/plm/slurm/plm_slurm_module.c
@@ -9,7 +9,7 @@
* University of Stuttgart. All rights reserved.
* Copyright (c) 2004-2005 The Regents of the University of California.
* All rights reserved.
- * Copyright (c) 2006-2014 Cisco Systems, Inc. All rights reserved.
+ * Copyright (c) 2006-2019 Cisco Systems, Inc. All rights reserved
* Copyright (c) 2007-2015 Los Alamos National Security, LLC. All rights
* reserved.
* Copyright (c) 2014-2017 Intel, Inc. All rights reserved.
@@ -272,14 +272,6 @@ static void launch_daemons(int fd, short args, void *cbdata)
opal_argv_append(&argc, &argv, "--kill-on-bad-exit");
}

- /* ensure the orteds are not bound to a single processor,
- * just in case the TaskAffinity option is set by default.
- * This will *not* release the orteds from any cpu-set
- * constraint, but will ensure it doesn't get
- * bound to only one processor
- */
- opal_argv_append(&argc, &argv, "--cpu_bind=none");

#if SLURM_CRAY_ENV
/*
* If in a SLURM/Cray env. make sure that Cray PMI is not pulled in,
@@ -420,6 +412,23 @@ static void launch_daemons(int fd, short args, void *cbdata)
/* setup environment */
env = opal_argv_copy(orte_launch_environ);

+ /* ensure the orteds are not bound to a single processor,
+ * just in case the TaskAffinity option is set by default.
+ * This will *not* release the orteds from any cpu-set
+ * constraint, but will ensure it doesn't get
+ * bound to only one processor
+ *
+ * NOTE: We used to pass --cpu_bind=none on the command line. But
+ * SLURM 19 changed this to --cpu-bind. There is no easy way to
+ * test at run time which of these two parameters is used (see
+ * https://github.com/open-mpi/ompi/pull/6654). There was
+ * discussion of using --test-only to see which one works, but
+ * --test-only is only effective if you're not already inside a
+ * SLURM allocation. Instead, set the env var SLURM_CPU_BIND to
+ * "none", which should do the same thing as --cpu*bind=none.
+ */
+ opal_setenv("SLURM_CPU_BIND", "none", true, &env);

if (0 < opal_output_get_verbosity(orte_plm_base_framework.framework_output)) {
param = opal_argv_join(argv, ' ');
OPAL_OUTPUT_VERBOSE((1, orte_plm_base_framework.framework_output,
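For readers unfamiliar with the technique in the last hunk, here is a minimal standalone sketch of setting a variable in a private copy of the environment rather than adding a command-line flag. It is not the OPAL implementation: `env_set` and `env_get` are hypothetical helpers written for this illustration, and the sketch assumes the environment is a heap-allocated, NULL-terminated array of `NAME=VALUE` strings that would later be handed to the launched `srun` process.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (NOT OPAL's opal_setenv): set NAME=VALUE in a
 * heap-allocated, NULL-terminated environment array, overwriting any
 * existing entry for NAME. Returns 0 on success, -1 if allocation fails. */
static int env_set(char ***envp, const char *name, const char *value)
{
    assert(envp != NULL && name != NULL && value != NULL);

    size_t nlen = strlen(name);
    size_t len = nlen + strlen(value) + 2;  /* room for '=' and '\0' */
    char *entry = malloc(len);
    if (entry == NULL) {
        return -1;
    }
    snprintf(entry, len, "%s=%s", name, value);

    /* Replace an existing entry if NAME is already present. */
    size_t count = 0;
    if (*envp != NULL) {
        for (; (*envp)[count] != NULL; count++) {
            char *cur = (*envp)[count];
            if (strncmp(cur, name, nlen) == 0 && cur[nlen] == '=') {
                free(cur);
                (*envp)[count] = entry;
                return 0;
            }
        }
    }

    /* Otherwise grow the array by one slot and keep the NULL terminator. */
    char **grown = realloc(*envp, (count + 2) * sizeof(char *));
    if (grown == NULL) {
        free(entry);
        return -1;
    }
    grown[count] = entry;
    grown[count + 1] = NULL;
    *envp = grown;
    return 0;
}

/* Hypothetical helper: return the value stored for NAME, or NULL. */
static const char *env_get(char **env, const char *name)
{
    size_t nlen = strlen(name);
    for (size_t i = 0; env != NULL && env[i] != NULL; i++) {
        if (strncmp(env[i], name, nlen) == 0 && env[i][nlen] == '=') {
            return env[i] + nlen + 1;
        }
    }
    return NULL;
}
```

Calling `env_set(&env, "SLURM_CPU_BIND", "none")` on such a copy leaves the argument vector untouched, so the same launch code works whether the installed `srun` spells the flag `--cpu_bind` or `--cpu-bind`. The overwrite branch matters because a site default may already have placed a different `SLURM_CPU_BIND` value in the inherited environment.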