Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DataOffload: Field offload of driver regions #457

Open
wants to merge 13 commits into
base: main
Choose a base branch
from

Conversation

mlange05
Copy link
Collaborator

@mlange05 mlange05 commented Dec 5, 2024

This PR refactors the current Field-API offload transformation and its utilities and adds two previously missing features:

  • The ability to detect aliased array pointer usage, so CALL F(obj%a(:, 1), obj%a(:, 2)) only declares and offloads obj_a once.
  • The ability to run FieldOffloadTransformation over compute regions in drivers without kernel calls

Moreover, it turns I've started moving things around a little and ended up putting most of the heavy lifting in the FieldPointerMap utility. To avoid future cyclic import and other issues, I've then moved this out to a pure helper sub-module loki.transformations.field_api and renamed the using transformations data_offload.field_offload and parallel.field_views.

In refactoring the existing transformation and utilities to achieve the above, I moved quite a few things around - sorry for the large diff. I'd be happy to split this up, if this will aid the review process.

In a little more detail, the other changes in this PR are:

  • Refactoring Field API offload tests to use a common fixture for state_type, where appropriate
  • Filtering duplicates at alls stages via dict.fromkeys instead of set, as this retains ordering and thus ensures reproducible orders of variables/utility calls at all stages
  • The FieldPointerMap not converts from field view array symbols to general data pointers (with block dimension) on-the-fly. This allows the data pointers to be given as a mere property on the pointer map object.
  • Similarly the FieldPointerMap now provides the lists of boilerplate calls for host-to-device copies and host-sync calls as properties and derives them from the three internally stored argument/array symbol lists.
  • Apply symbol substitution over all array symbols that match the pointer/array symbols stored in the field pointer map over an entire region, not just calls
  • Derive the array symbols for which to generate Field API offload semantics via dataflow analysis, so that we can run this over any marked pragma region

@mlange05 mlange05 requested review from wertysas and awnawab December 5, 2024 19:37
Copy link

github-actions bot commented Dec 5, 2024

Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/457/index.html

Copy link

codecov bot commented Dec 5, 2024

Codecov Report

Attention: Patch coverage is 99.06977% with 2 lines in your changes missing coverage. Please review.

Project coverage is 93.28%. Comparing base (e9760e8) to head (9d572ee).

Files with missing lines Patch % Lines
loki/transformations/data_offload/field_offload.py 98.68% 1 Missing ⚠️
loki/transformations/field_api.py 98.52% 1 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #457   +/-   ##
=======================================
  Coverage   93.28%   93.28%           
=======================================
  Files         220      221    +1     
  Lines       41224    41249   +25     
=======================================
+ Hits        38457    38481   +24     
- Misses       2767     2768    +1     
Flag Coverage Δ
lint_rules 96.39% <ø> (ø)
loki 93.24% <99.06%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@mlange05 mlange05 marked this pull request as ready for review December 6, 2024 03:25
Copy link
Contributor

@wertysas wertysas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's very nice to get these extensions of the field offload capabilities. It makes a lot of sense to separate the field utilities and offload transformations like this and I think it's very nice to have the offload map tracking the offload variables like this👍.

# nor does it submit to any jurisdiction.

"""
A set utility classes for dealing with FIELD API boilerplate in
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
A set utility classes for dealing with FIELD API boilerplate in
A set of utility classes for dealing with FIELD API boilerplate in

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is much cleaner now with the separation of the general f-api utilities from the offloading transformations.

"""
Returns a tuple of :any:`CallStatement` for host-to-device transfers on fields.
"""
READ_ONLY, READ_WRITE = FieldAPITransferType.READ_ONLY, FieldAPITransferType.READ_WRITE
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is neat

replace_kernel_args(driver, offload_map, self.offload_index)


def find_offload_variables(driver, region, field_group_types):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this can be split into two functions in the future. One more general utility in transformations/data_offload/offload.py that finds the inargs/inoutargs/outargs using dataflow analysis. And the logic of the second method could even be a helper function here or a method of the FieldOffloadTransformation.

In the meantime I think it would be good to have a docstring that specifies that this has to be run with dataflow analysis attached.

return inargs, inoutargs, outargs


def declare_device_ptrs(driver, deviceptrs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't realise this when I wrote this helper, but there is nothing field or device ptr specific about this function. I suggest we name it something else, perhaps add_variables, and add it as member function of the ProgramUnit base class.

change_map = {}
offload_idx_expr = driver.variable_map[offload_index]

args = tuple(chain(offload_map.inargs, offload_map.inoutargs, offload_map.outargs))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be useful to expose this chained tuple as a property, args, of the offload map.

return field_view.parent.get_derived_type_member(field_type_name)

@property
def dataptrs(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The same comment with the name applies here.

Helper class to map FIELD API pointers to intents and access descriptors.

This utility is used to store arrays passed to target kernel calls
and the corresponding device pointers added by the transformation.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
and the corresponding device pointers added by the transformation.
and easily access corresponding device pointers added by the transformation.

Comment on lines +82 to +85
return tuple(dict.fromkeys(
self.dataptr_from_array(a)
for a in chain(*(self.inargs, self.inoutargs, self.outargs))
))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return tuple(dict.fromkeys(
self.dataptr_from_array(a)
for a in chain(*(self.inargs, self.inoutargs, self.outargs))
))
return tuple(
self.dataptr_from_array(a)
for a in chain(self.inargs, self.inoutargs, self.outargs)
)

If there are no duplicates in self.inargs/inoutargs/outargs, then there shouldn't be any in the ones created either.

Comment on lines +39 to +41

The pointer/array variable pairs are exposed through the class
properties, based on the intent of the kernel argument.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
The pointer/array variable pairs are exposed through the class
properties, based on the intent of the kernel argument.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants