Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CMake Plan: Pipeline dry-run and transformation-based plan creation #453

Merged
merged 5 commits into from
Dec 12, 2024

Conversation

reuterbal
Copy link
Collaborator

This is an alternative implementation suggestion to #444

The objective here would be to retain current functionality and implement support for transformations that change scheduler graph topology in subsequent PRs.

Please have a look and point out anything that looks odd to you. In particular, please consider the interface - e.g., is a plan argument acceptable and the naming ok?

With this, the loki-transform.py plan command is redirected to the loki-transform.py convert command, clearing the way for subsequent removal of the custom transformation entry point and further homogenisation in the CLI and CMake layer.

I will also add another commit to add some more documentation (and I expect the linter pass to fail due to the local import).

Copy link

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/453/index.html

@MichaelSt98
Copy link
Collaborator

I skimmed it briefly and I really like the implementation!

Would it make sense to have instead of process(self, transformation, plan=False) something like process(self, transformation, mode=<batch|normal|plan|...>)?

Copy link

codecov bot commented Nov 29, 2024

Codecov Report

Attention: Patch coverage is 90.29851% with 13 lines in your changes missing coverage. Please review.

Project coverage is 93.28%. Comparing base (ba7e230) to head (633d038).
Report is 13 commits behind head on main.

Files with missing lines Patch % Lines
loki/batch/transformation.py 82.92% 7 Missing ⚠️
loki/batch/scheduler.py 88.00% 3 Missing ⚠️
loki/transformations/build_system/file_write.py 88.23% 2 Missing ⚠️
loki/transformations/build_system/plan.py 97.22% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #453      +/-   ##
==========================================
- Coverage   93.28%   93.28%   -0.01%     
==========================================
  Files         220      221       +1     
  Lines       41202    41263      +61     
==========================================
+ Hits        38436    38492      +56     
- Misses       2766     2771       +5     
Flag Coverage Δ
lint_rules 96.39% <ø> (ø)
loki 93.24% <90.29%> (-0.01%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@reuterbal
Copy link
Collaborator Author

Would it make sense to have instead of process(self, transformation, plan=False) something like process(self, transformation, mode=<batch|normal|plan|...>)?

That's not a bad idea, however, mode as a name is also used to specify the "transformation mode", i.e., scc, scc-stack etc. Not sure if we would want to avoid that conflict?

Other than that, what would the various options be? I can think of:

  • plan: The dry-run mode from this PR
  • atomic (not sure that's a good name): The current mode of applying each transformation to every item before applying the next transformation
  • batch: Could be a future mode that applies to an item all transformations that share the same traversal mode before going to the next item, thus "batching" the processing. Out of scope for this PR, just to highlight a potential future mode.
  • default: Should then alias to whatever we think is the best default, e.g. atomic right now?

@MichaelSt98
Copy link
Collaborator

Would it make sense to have instead of process(self, transformation, plan=False) something like process(self, transformation, mode=<batch|normal|plan|...>)?

That's not a bad idea, however, mode as a name is also used to specify the "transformation mode", i.e., scc, scc-stack etc. Not sure if we would want to avoid that conflict?

Other than that, what would the various options be? I can think of:

  • plan: The dry-run mode from this PR
  • atomic (not sure that's a good name): The current mode of applying each transformation to every item before applying the next transformation
  • batch: Could be a future mode that applies to an item all transformations that share the same traversal mode before going to the next item, thus "batching" the processing. Out of scope for this PR, just to highlight a potential future mode.
  • default: Should then alias to whatever we think is the best default, e.g. atomic right now?

Naming things is always the hardest part ...
What about process_mode?
Yes, those options are what I was thinking about and I think atomic is a good name, however, we could also go with serial?
Moreover, we could think about calling transform_file/transform_routine or plan_file/plan_routine in dependence of mode/process_mode like calling <process_mode>_file/<process_mode>_routine, but this would probably be over-engineering right now...

@reuterbal reuterbal force-pushed the nabr-pipeline-plan branch 2 times, most recently from b49bba0 to 4060c21 Compare December 10, 2024 14:44
@reuterbal
Copy link
Collaborator Author

Thanks, I went for proc_strategy and list available options in an enum ProcessingStrategy. That seemed most appropriate to me while minimising accidental aliasing with other meanings.

Inside transformations I kept the boolean option because there it really is just the switch between "transformation" and "planning".

I fixed the linter warnings and added some documentation, so I think this should be ready for another look.

Copy link
Collaborator

@MichaelSt98 MichaelSt98 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Really nice!

loki/batch/transformation.py Outdated Show resolved Hide resolved
Co-authored-by: Michael Staneker <50531288+MichaelSt98@users.noreply.github.com>
Copy link
Collaborator

@mlange05 mlange05 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, very neat and a great step towards making this all a bit more accessible. GTG from me. :shipit:

@mlange05 mlange05 added the ready for merge This PR has been approved and is ready to be merged label Dec 11, 2024
@reuterbal reuterbal merged commit a26cb44 into main Dec 12, 2024
13 checks passed
@reuterbal reuterbal deleted the nabr-pipeline-plan branch December 12, 2024 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ready for merge This PR has been approved and is ready to be merged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants