Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for shared steps #116

Merged
merged 17 commits into from
Sep 20, 2023
Merged

Conversation

xylar
Copy link
Collaborator

@xylar xylar commented Sep 14, 2023

This merge adds support to the polaris framework for shared steps (steps that can belong to more than one task). Existing tasks and steps are modified to accommodate the changes to the framework but tasks and steps are not yet reorganized to take advantage of shared steps so that bit-for-bit comparison with main can still be made with this branch.

Changes to Step include:

  • switch the subdir attribute to be relative to the component work directory, not the task work directory
  • add an indir parameter to the constructor to easily construct a subdir that joins indir and name. If indir is the task work directory, this is the same as the old default behavior when subdir was not specified.

Changes to Component include:

  • adding a dictionary of steps (with work-directory paths relative to the component's work directory as keys)
  • add add_step() and remove_step() methods

Changes to Task include:

  • modify the add_step() method to perform error checking (e.g. make sure a different step with the same path hasn't already been added) and then add the step to the component and the task. This method also includes a new symlink parameter that can be used to make a local symlink in the task's work directory to a shared step if it resides outside the task.
  • add remove_step() methods

Other changes include:

  • listing steps with full work-directory paths when polaris list --verbose is called
  • removing tasks from step pickle files (since shared steps will not belong to a single task)
  • add a polaris_step_complete.log file once a step has run to indicate that it is complete (and should not be run again)
  • add --clean flags to polaris setup and polaris suite to remove the base work directory before setting up tasks. (This is sometimes useful when debugging because steps that have already run will now be skipped, rather than running again, which may not be what the developer wants.)
  • adding a resolution_to_subdir() function to the ocean framework for conveniently converting a float resolution to a string.

Checklist
and changes look as expected

  • Testing comment in the PR documents testing used to verify the changes

@xylar
Copy link
Collaborator Author

xylar commented Sep 14, 2023

Testing

I have run all test cases except 1 and 4 km RPE tests and made sure results are BFB with main.

@xylar xylar mentioned this pull request Sep 14, 2023
7 tasks
@xylar
Copy link
Collaborator Author

xylar commented Sep 14, 2023

I don't plan on updating the documentation here. Instead, I will do that in #117

@xylar xylar requested review from cbegeman and sbrus89 September 14, 2023 14:41
@xylar xylar added enhancement New feature or request clean-up framework Changes relating to the polaris framework as opposed to individual tests or analysis ocean Related to ocean tests or analysis sea ice Related to sea-ice tests or analysis labels Sep 14, 2023
This can be used to remove the work directory before setting up
tasks.
@xylar xylar force-pushed the support-shared-steps branch from 6d1b0b2 to 5de811f Compare September 14, 2023 15:32
Co-authored-by: Carolyn Begeman <cbegeman@lanl.gov>
@xylar
Copy link
Collaborator Author

xylar commented Sep 18, 2023

@cbegeman, thanks very much for the review. These are all great suggestions, which I will make tomorrow.

@xylar xylar force-pushed the support-shared-steps branch from ed43ab1 to ad7f08d Compare September 19, 2023 09:56
This is safer an more useful than removing the whole work directory
@xylar xylar force-pushed the support-shared-steps branch from febcc7e to 8c0fe60 Compare September 19, 2023 10:31
@xylar
Copy link
Collaborator Author

xylar commented Sep 19, 2023

@cbegeman, I believe I've addressed your 3 requested changes. Please have another look when you have time.

Copy link
Collaborator

@cbegeman cbegeman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xylar I looked over the changes and they look good! Thanks for addressing all my comments so quickly. Let me know if you'd like me to do any additional testing.

@xylar
Copy link
Collaborator Author

xylar commented Sep 19, 2023

@cbegeman, thanks for looking it over again. I think we're good for now. Let's see what @sbrus89 says.

polaris/io.py Outdated
Comment on lines 168 to 172
directory = os.path.dirname(os.path.abspath(link_name))
try:
os.makedirs(directory)
except OSError:
pass
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this raise an error?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this would be better?

Suggested change
directory = os.path.dirname(os.path.abspath(link_name))
try:
os.makedirs(directory)
except OSError:
pass
directory = os.path.dirname(os.path.abspath(link_name))
try:
os.makedirs(directory)
except FileExistsError:
pass

Something like this is the preferred way try to make a directory, but to do nothing if it already exists. It avoids a race condition that can happen if you first check if the directory exists and then create it.

Comment on lines -31 to 32
def __init__(self, task, name, subdir):
def __init__(self, component, name, subdir):
"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can steps be shared across components?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the current design is that steps belong one component only. You also cannot currently run tasks from different components in the same suite (or "custom" suite defined from a list of tasks), in part because the -p flag points to the directory where the model associated with the component exists.

Comment on lines 79 to +84
subdir : str, optional
the subdirectory for the step. The default is ``name``
the subdirectory for the step. If neither this nor ``indir``
are provided, the directory is the ``name``

indir : str, optional
the directory the step is in, to which ``name`` will be appended
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still al little fuzzy on the difference between subdir and indir could you explain it to me again?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subdir is the directory under the components work directory where the step exists. The indir is one level up from that, with the name of the step appended to it to get the subdir. Before shared steps, the default behavior was to take the task's subdirectory and append the name of the step to get the default subdir for the step. The easiest way to efficiently recreate this default behavior without steps having ot know about tasks is to have indir be the task's subdirectory. But there's no reason that it has to be, it could be any convenient directory that is one up from the step's subdirectory, as long as it makes sense for the step's name to be appended to it to get the subdir for the step. I don't think there are currently any examples where indir is anything other than the parent task's subdir.

@xylar
Copy link
Collaborator Author

xylar commented Sep 19, 2023

@sbrus89, thanks so much for your comments! I appreciate you taking a look. Let me know if I've addressed your comments or if you'd like to see more changes.

Copy link
Contributor

@sbrus89 sbrus89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@xylar, thanks for the explainations! I sucessfully ran the nightly, pr, and cosine_bell suites.

@xylar xylar merged commit 0989cef into E3SM-Project:main Sep 20, 2023
@xylar xylar deleted the support-shared-steps branch September 20, 2023 14:34
@xylar
Copy link
Collaborator Author

xylar commented Sep 20, 2023

Thanks so much!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
clean-up enhancement New feature or request framework Changes relating to the polaris framework as opposed to individual tests or analysis ocean Related to ocean tests or analysis sea ice Related to sea-ice tests or analysis
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants