Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bugfix/flux-nodes #484

Merged
merged 15 commits into from
Jun 10, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 2 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -26,12 +26,14 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
- Applying filters for `merlin detailed-status` will now log debug statements instead of warnings
- Modified the unit tests for the `merlin status` command to use pytest rather than unittest
- Added fixtures for `merlin status` tests that copy the workspace to a temporary directory so you can see exactly what's run in a test
- Batch block and workers now allow for variables to be used in node settings

### Fixed
- Bugfix for output of `merlin example openfoam_wf_singularity`
- A bug with the CHANGELOG detection test when the target branch isn't in the ci runner history
- Link to Merlin banner in readme
- Issue with escape sequences in ascii art (caught by python 3.12)
- Bug where Flux wasn't identifying total number of nodes on an allocation


## [1.12.1]
Expand Down
10 changes: 8 additions & 2 deletions merlin/spec/merlinspec.json
Original file line number Diff line number Diff line change
Expand Up @@ -221,7 +221,8 @@
"nodes": {
"anyOf": [
{"type": "null"},
{"type": "integer", "minimum": 1}
{"type": "integer", "minimum": 1},
{"type": "string", "pattern": "^\\$\\(\\w+\\)$"}
]
},
"batch": {
Expand Down Expand Up @@ -279,7 +280,12 @@
"launch_pre": {"type": "string", "minLength": 1},
"launch_args": {"type": "string", "minLength": 1},
"worker_launch": {"type": "string", "minLength": 1},
"nodes": {"type": "integer", "minimum": 1},
"nodes": {
"anyOf": [
{"type": "integer", "minimum": 1},
{"type": "string","pattern": "^\\$\\(\\w+\\)$"}
]
},
"walltime": {
"anyOf": [
{"type": "string", "minLength": 1},
Expand Down
8 changes: 8 additions & 0 deletions merlin/study/batch.py
Original file line number Diff line number Diff line change
Expand Up @@ -134,6 +134,14 @@ def get_node_count(default=1):
the environment cannot be determined.
:param returns: (int) The number of nodes to use.
"""

# If flux is the scheduler, we can get the size of the allocation with this
try:
get_size_proc = subprocess.run("flux getattr size", shell=True, capture_output=True, text=True)
bgunnar5 marked this conversation as resolved.
Show resolved Hide resolved
return int(get_size_proc.stdout)
except Exception:
pass

if "SLURM_JOB_NUM_NODES" in os.environ:
return int(os.environ["SLURM_JOB_NUM_NODES"])

Expand Down
5 changes: 5 additions & 0 deletions tests/integration/definitions.py
Original file line number Diff line number Diff line change
Expand Up @@ -295,6 +295,11 @@ def define_tests(): # pylint: disable=R0914,R0915
"conditions": [HasReturnCode(), HasRegex(r"default_worker", negate=True)],
"run type": "local",
},
"run-workers echo variable for worker nodes": {
"cmds": f"{workers_flux} {flux_native} --echo",
"conditions": [HasReturnCode(), HasRegex(r"-N 4")],
"run type": "local",
},
}
wf_format_tests = {
"local minimum_format": {
Expand Down
2 changes: 2 additions & 0 deletions tests/integration/test_specs/flux_par_native_test.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -14,6 +14,7 @@ env:
OUTPUT_PATH: ./studies
N_SAMPLES: 10
SCRIPTS: $(SPECROOT)/../../../merlin/examples/workflows/flux/scripts
WORKER_NODES: 4

study:
- description: Build the code
Expand Down Expand Up @@ -71,6 +72,7 @@ merlin:
simworkers:
args: -l INFO --concurrency 1 --prefetch-multiplier 1 -Ofair
steps: [runs, data]
nodes: $(WORKER_NODES)
samples:
column_labels: [V1, V2]
file: $(MERLIN_INFO)/samples.npy
Expand Down
Loading