Pipeline-plan duplicate/remove transformation changing dependencies #462

MichaelSt98 · 2024-12-18T15:53:48Z

Proof-of-concept transformations DuplicateKernel and RemoveKernel with both changing the dependencies for the scheduler.

This builds on top of #453

DuplicateKernel takes a kernel and creates a duplicated kernel with corresponding duplicated items and adds a call to this duplicated kernel.

RemoveKernel removes a kernel from the scheduler and also removes the call.

github-actions · 2024-12-18T15:56:30Z

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/462/index.html

codecov · 2024-12-18T16:53:06Z

Codecov Report

Attention: Patch coverage is 91.13924% with 28 lines in your changes missing coverage. Please review.

Project coverage is 93.27%. Comparing base (ad1e2bc) to head (3415e92).

Files with missing lines	Patch %	Lines
loki/batch/item.py	74.28%	18 Missing ⚠️
loki/transformations/dependency.py	92.68%	6 Missing ⚠️
loki/transformations/tests/test_dependency.py	98.62%	2 Missing ⚠️
loki/frontend/source.py	87.50%	1 Missing ⚠️
loki/sourcefile.py	87.50%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #462      +/-   ##
==========================================
- Coverage   93.29%   93.27%   -0.03%     
==========================================
  Files         221      223       +2     
  Lines       41360    41674     +314     
==========================================
+ Hits        38587    38871     +284     
- Misses       2773     2803      +30

Flag	Coverage Δ
lint_rules	`96.39% <ø> (ø)`
loki	`93.23% <91.13%> (-0.03%)`	⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

MichaelSt98 · 2024-12-19T08:04:37Z

@mlange05 and @reuterbal I would wait for some short feedback before I add more tests to improve code coverage

reuterbal

Many thanks for this, this is a very useful demonstration of this functionality!

Conceptually, there's a few gotchas and things that I would prefer to do differently, and I'll try to explain here:

Item creation should always go through the item_factory, and the introduced clone method shortcuts this. Instead, we should have a get_or_create_from_item method or similar on the ItemFactory, to make sure any new items automatically land in the cache.
You inject the dependency changes incurred from plan_data in the dependencies property. However, that property returns (and I would like to keep it that way) IR nodes that constitute the dependency, e.g., CallStatement, Import, etc. So, we have two options: (a) inject (and remove) relevant IR nodes during planning to fake these dependencies or (b) inject the plan dependencies at a point where we are already working with items. I think (b) is a little easier and I have explored doing this via create_dependency_items instead, which seems to work (more details below).
The need for a recursive clone can be inferred from not specifying an IR to the clone method. For contained subroutines in ProgramUnit we are already doing this, and I would prefer to do the same for source files instead of introducing a new rebuild-method on scoped nodes.
We need to separate plan and trafo in the tests - both need to be able to apply the relevant sgraph topology changes independently, because we run either plan or trafo mode in practice. Getting both to work, i.e., trafo after plan, might be rather tricky but I also don't see a need for that at the moment.

Admittedly, half the observations above I only figured out when starting to play with this directly and trying to apply the relevant changes. So, I've decided not to bother you with too many inline comments that I kept editing and changing and I will instead file a PR on top of yours against your PR branch. I hope this doesn't come across as dismissive, but I've underestimated how fiddly this gets...

reuterbal · 2024-12-19T11:04:34Z

loki/batch/item.py

@@ -137,8 +139,21 @@ def __init__(self, name, source, config=None):
        self.name = name
        self.source = source
        self.trafo_data = {}
+        self.plan_data = {}


Please add a description to the docstring above

reuterbal · 2024-12-19T11:07:50Z

loki/batch/item.py

+        if 'source' not in kwargs:
+            kwargs['source'] = self.source.clone() # self.source.clone()
+        if self.config is not None and 'config' not in kwargs:
+            kwargs['config'] = self.config


Suggested change

kwargs['config'] = self.config

kwargs['config'] = self.config.copy()

Otherwise you will incur aliasing problems. This should also be verified in a test, i.e., clone an item, modify the config in one and ensure the other item's config is unchanged.

reuterbal · 2024-12-19T11:24:59Z

loki/batch/item.py

@@ -24,6 +25,7 @@
 from loki.tools import as_tuple, flatten, CaseInsensitiveDict
 from loki.types import DerivedType

+# pylint: disable=too-many-lines


I would suggest moving the item factory into a separate file

reuterbal · 2024-12-19T11:29:19Z

loki/batch/item.py

        super().__init__(config)

+    def clone(self, **kwargs):


I think my preference would be to not have a clone method on Items. The batch processing mechanics is diligent to never replicate items to avoid accidental copies, and this would make it easy to break this behaviour.

Instead, all item creation should go through the ItemFactory and therefore it would be better to have there a method that allows creating items from other items with a new name, forcing the appropriate renaming as required.

reuterbal · 2024-12-19T11:31:52Z

loki/transformations/dependency.py

+        print(f"suffix: {self.suffix}")
+        print(f"module_suffix: {self.module_suffix}")


Suggested change

print(f"suffix: {self.suffix}")

print(f"module_suffix: {self.module_suffix}")

reuterbal · 2024-12-19T11:32:52Z

loki/transformations/dependency.py

+        self.module_suffix = duplicate_module_suffix or duplicate_suffix
+        print(f"suffix: {self.suffix}")
+        print(f"module_suffix: {self.module_suffix}")
+        self.kernels = tuple(kernel.lower() for kernel in as_tuple(kernels))


Is this the list of names that are to be duplicated? Can we find a more descriptive name for this?

reuterbal · 2024-12-19T11:38:50Z

loki/batch/item.py

@@ -959,6 +1014,53 @@ def __contains__(self, key):
        """
        return key in self.item_cache

+    def clone_procedure_item(self, item, suffix='', module_suffix=''):


The suffix and module_suffix arguments are unnecessarily restrictive to the current use case and are not required if we instead have a generic clone_item method in the ItemFactory. That gives us the requested new name of the cloned item, and thus also all this information because the naming pattern is
<module_name_or_empty>#<procedure_name>

The clone_procedure_item should instead be generalized as suggested below

reuterbal · 2024-12-20T12:48:08Z

Ah, I forgot to delete the inline comments. Disregard them or at least don't action on them, yet :)

MichaelSt98 · 2024-12-20T13:20:16Z

Many thanks for this, this is a very useful demonstration of this functionality!

Conceptually, there's a few gotchas and things that I would prefer to do differently, and I'll try to explain here:

Item creation should always go through the item_factory, and the introduced clone method shortcuts this. Instead, we should have a get_or_create_from_item method or similar on the ItemFactory, to make sure any new items automatically land in the cache.

You inject the dependency changes incurred from plan_data in the dependencies property. However, that property returns (and I would like to keep it that way) IR nodes that constitute the dependency, e.g., CallStatement, Import, etc. So, we have two options: (a) inject (and remove) relevant IR nodes during planning to fake these dependencies or (b) inject the plan dependencies at a point where we are already working with items. I think (b) is a little easier and I have explored doing this via create_dependency_items instead, which seems to work (more details below).

The need for a recursive clone can be inferred from not specifying an IR to the clone method. For contained subroutines in ProgramUnit we are already doing this, and I would prefer to do the same for source files instead of introducing a new rebuild-method on scoped nodes.

We need to separate plan and trafo in the tests - both need to be able to apply the relevant sgraph topology changes independently, because we run either plan or trafo mode in practice. Getting both to work, i.e., trafo after plan, might be rather tricky but I also don't see a need for that at the moment.

Admittedly, half the observations above I only figured out when starting to play with this directly and trying to apply the relevant changes. So, I've decided not to bother you with too many inline comments that I kept editing and changing and I will instead file a PR on top of yours against your PR branch. I hope this doesn't come across as dismissive, but I've underestimated how fiddly this gets...

That is perfectly fine for me and I wasn't expecting this PR to go in as it is but to serve as a first step/proof of concept.

MichaelSt98 force-pushed the nams-pipeline-plan-duplicate-remove branch from a9ad546 to afb5750 Compare December 18, 2024 16:14

MichaelSt98 requested a review from reuterbal December 18, 2024 16:15

MichaelSt98 requested a review from mlange05 December 19, 2024 08:03

Pipeline-plan duplicate/remove transformation changing dependencies

3415e92

MichaelSt98 force-pushed the nams-pipeline-plan-duplicate-remove branch from afb5750 to 3415e92 Compare December 19, 2024 09:13

reuterbal requested changes Dec 20, 2024

View reviewed changes

reuterbal mentioned this pull request Dec 20, 2024

Suggested changes for Pipeline planning of Duplicate/Remove #463

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Pipeline-plan duplicate/remove transformation changing dependencies #462

Pipeline-plan duplicate/remove transformation changing dependencies #462

MichaelSt98 commented Dec 18, 2024

github-actions bot commented Dec 18, 2024

codecov bot commented Dec 18, 2024 •

edited

Loading

MichaelSt98 commented Dec 19, 2024

reuterbal left a comment

reuterbal Dec 19, 2024

reuterbal Dec 19, 2024

reuterbal Dec 19, 2024

reuterbal Dec 19, 2024

reuterbal Dec 19, 2024

reuterbal Dec 19, 2024

reuterbal Dec 19, 2024

reuterbal commented Dec 20, 2024

MichaelSt98 commented Dec 20, 2024

	kwargs['config'] = self.config
	kwargs['config'] = self.config.copy()

		print(f"suffix: {self.suffix}")
		print(f"module_suffix: {self.module_suffix}")

Pipeline-plan duplicate/remove transformation changing dependencies #462

Are you sure you want to change the base?

Pipeline-plan duplicate/remove transformation changing dependencies #462

Conversation

MichaelSt98 commented Dec 18, 2024

github-actions bot commented Dec 18, 2024

codecov bot commented Dec 18, 2024 • edited Loading

Codecov Report

MichaelSt98 commented Dec 19, 2024

reuterbal left a comment

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal Dec 19, 2024

Choose a reason for hiding this comment

reuterbal commented Dec 20, 2024

MichaelSt98 commented Dec 20, 2024

codecov bot commented Dec 18, 2024 •

edited

Loading