Only trigger job failed to start once
Trigger the "job failed to start" state only when the
first process reports a failure to start. This avoids a
"bounce" effect that causes the job object to be released
multiple times.

Signed-off-by: Ralph Castain <rhc@pmix.org>
(cherry picked from commit a386514)
rhc54 committed Feb 26, 2024
1 parent cd20e1f commit ec5477f
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion src/mca/errmgr/dvm/errmgr_dvm.c
@@ -488,14 +488,14 @@ static void proc_errors(int fd, short args, void *cbdata)
                 PRTE_FLAG_SET(jdata, PRTE_JOB_FLAG_ABORTED);
                 /* kill the job */
                 _terminate_job(jdata->nspace);
+                PRTE_ACTIVATE_JOB_STATE(jdata, PRTE_JOB_STATE_FAILED_TO_START);
             }
             /* if this was a daemon, report it */
             if (PMIX_CHECK_NSPACE(jdata->nspace, PRTE_PROC_MY_NAME->nspace)) {
                 /* output a message indicating we failed to launch a daemon */
                 pmix_show_help("help-errmgr-base.txt", "failed-daemon-launch",
                                true, prte_tool_basename);
             }
-            PRTE_ACTIVATE_JOB_STATE(jdata, PRTE_JOB_STATE_FAILED_TO_START);
             break;
 
         case PRTE_PROC_STATE_CALLED_ABORT:
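
The effect of the change can be sketched in isolation. The block below is a minimal illustration, not PRRTE code: `demo_job_t` and both `demo_*` functions are hypothetical stand-ins. It also assumes that the enclosing block, whose condition is not visible in the hunk, runs only for the first process that reports, and that handling the failed-to-start state drops one reference on the job.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a job object; NOT the PRRTE job type. */
typedef struct {
    bool aborted;     /* plays the role of PRTE_JOB_FLAG_ABORTED */
    int  refcount;    /* references still held on the job */
    int  activations; /* times the failed-to-start state was triggered */
} demo_job_t;

/* Hypothetical stand-in for activating the failed-to-start state.
 * Assumption: handling that state drops one reference on the job. */
static void demo_activate_failed_to_start(demo_job_t *job)
{
    /* a second activation would release an already released job */
    assert(job->refcount > 0);
    job->activations++;
    job->refcount--;
}

/* Called once for every process that reports a failed start. */
static void demo_proc_failed_to_start(demo_job_t *job)
{
    if (!job->aborted) {
        job->aborted = true;
        /* only the first reporting process reaches this call */
        demo_activate_failed_to_start(job);
    }
    /* later reporters fall through without re-triggering the state */
}
```

With a job that starts at `refcount == 1`, calling `demo_proc_failed_to_start()` three times leaves `activations == 1` and `refcount == 0`. If the activation call sat after the guarded block, as in the removed line, the second call would trip the assertion.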