DataOffload: Field offload of driver regions #457
Sets are unordered, so using them to filter creates an arbitrary ordering, which in turn yields an unpredictable order of declarations.
This allows us to re-use the key pieces without being hooked into the `!$loki data` regions semantics.
Instead of subbing just on calls, we apply the remapping over the whole routine body.
Without tying the transformation to calls, explicit no-target skipping becomes virtually impossible; hence removing the test for it.
Documentation for this branch can be viewed at https://sites.ecmwf.int/docs/loki/457/index.html
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #457 +/- ##
=======================================
Coverage 93.28% 93.28%
=======================================
Files 220 221 +1
Lines 41224 41249 +25
=======================================
+ Hits 38457 38481 +24
- Misses 2767 2768 +1
It's very nice to get these extensions of the field offload capabilities. It makes a lot of sense to separate the field utilities and offload transformations like this and I think it's very nice to have the offload map tracking the offload variables like this👍.
# nor does it submit to any jurisdiction.

"""
A set utility classes for dealing with FIELD API boilerplate in
Suggested change:
- A set utility classes for dealing with FIELD API boilerplate in
+ A set of utility classes for dealing with FIELD API boilerplate in
This is much cleaner now with the separation of the general f-api utilities from the offloading transformations.
"""
Returns a tuple of :any:`CallStatement` for host-to-device transfers on fields.
"""
READ_ONLY, READ_WRITE = FieldAPITransferType.READ_ONLY, FieldAPITransferType.READ_WRITE
This is neat
replace_kernel_args(driver, offload_map, self.offload_index)


def find_offload_variables(driver, region, field_group_types):
Perhaps this can be split into two functions in the future: one more general utility in `transformations/data_offload/offload.py` that finds the inargs/inoutargs/outargs using dataflow analysis, while the logic of the second could even be a helper function here or a method of the `FieldOffloadTransformation`.
In the meantime I think it would be good to have a docstring that specifies that this has to be run with dataflow analysis attached.
return inargs, inoutargs, outargs


def declare_device_ptrs(driver, deviceptrs):
I didn't realise this when I wrote this helper, but there is nothing field or device ptr specific about this function. I suggest we name it something else, perhaps `add_variables`, and add it as a member function of the `ProgramUnit` base class.
change_map = {}
offload_idx_expr = driver.variable_map[offload_index]

args = tuple(chain(offload_map.inargs, offload_map.inoutargs, offload_map.outargs))
I think it would be useful to expose this chained tuple as a property, `args`, of the offload map.
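A minimal sketch of what such a property could look like. `OffloadMapSketch` is a hypothetical stand-in, not the class from this PR, and plain strings take the place of Loki array symbols:

```python
from dataclasses import dataclass
from itertools import chain


@dataclass(frozen=True)
class OffloadMapSketch:
    """Hypothetical stand-in for the offload map; strings replace Loki symbols."""
    inargs: tuple = ()
    inoutargs: tuple = ()
    outargs: tuple = ()

    @property
    def args(self):
        # Concatenate the three intent groups in a fixed order
        return tuple(chain(self.inargs, self.inoutargs, self.outargs))


offload_map = OffloadMapSketch(inargs=('a',), inoutargs=('b', 'c'), outargs=('d',))
```

Callers would then read `offload_map.args` instead of rebuilding the chain at each use site.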
return field_view.parent.get_derived_type_member(field_type_name)

@property
def dataptrs(self):
The same comment with the name applies here.
Helper class to map FIELD API pointers to intents and access descriptors.

This utility is used to store arrays passed to target kernel calls
and the corresponding device pointers added by the transformation.
Suggested change:
- and the corresponding device pointers added by the transformation.
+ and easily access corresponding device pointers added by the transformation.
return tuple(dict.fromkeys(
    self.dataptr_from_array(a)
    for a in chain(*(self.inargs, self.inoutargs, self.outargs))
))
Suggested change:
- return tuple(dict.fromkeys(
-     self.dataptr_from_array(a)
-     for a in chain(*(self.inargs, self.inoutargs, self.outargs))
- ))
+ return tuple(
+     self.dataptr_from_array(a)
+     for a in chain(self.inargs, self.inoutargs, self.outargs)
+ )
If there are no duplicates in self.inargs/inoutargs/outargs, then there shouldn't be any in the ones created either.
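One case worth checking against this is the `CALL F(obj%a(:, 1), obj%a(:, 2))` example from the PR description, where two distinct array arguments belong to the same field. A minimal sketch, assuming a hypothetical string-based `dataptr_from_array` that drops the subscript (the real method works on Loki expression nodes and may behave differently):

```python
from itertools import chain


def dataptr_from_array(array):
    # Hypothetical stand-in: 'obj%a(:, 1)' -> 'obj_a'
    name = array.split('(')[0]
    return name.replace('%', '_')


inargs = ('obj%a(:, 1)', 'obj%a(:, 2)')  # unique as arguments
inoutargs = ('obj%b(:)',)
outargs = ()

# Without de-duplication the shared pointer name appears twice
plain = tuple(dataptr_from_array(a) for a in chain(inargs, inoutargs, outargs))

# dict.fromkeys keeps the first occurrence and preserves order
unique = tuple(dict.fromkeys(
    dataptr_from_array(a) for a in chain(inargs, inoutargs, outargs)
))
```

Under that assumption the argument lists contain no duplicates, yet the derived pointer list does.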

The pointer/array variable pairs are exposed through the class
properties, based on the intent of the kernel argument.
Suggested change:
The pointer/array variable pairs are exposed through the class
properties, based on the intent of the kernel argument.
This PR refactors the current Field-API offload transformation and its utilities and adds two previously missing features:
- `CALL F(obj%a(:, 1), obj%a(:, 2))` only declares and offloads `obj_a` once.
- `FieldOffloadTransformation` over compute regions in drivers without kernel calls

Moreover, it turns I've started moving things around a little and ended up putting most of the heavy lifting in the `FieldPointerMap` utility. To avoid future cyclic import and other issues, I've then moved this out to a pure helper sub-module `loki.transformations.field_api` and renamed the using transformations `data_offload.field_offload` and `parallel.field_views`.

In refactoring the existing transformation and utilities to achieve the above, I moved quite a few things around - sorry for the large diff. I'd be happy to split this up, if this will aid the review process.

In a little more detail, the other changes in this PR are:
- `state_type`, where appropriate
- `dict.fromkeys` instead of `set`, as this retains ordering and thus ensures reproducible orders of variables/utility calls at all stages
- `FieldPointerMap` now converts from field view array symbols to general data pointers (with block dimension) on-the-fly. This allows the data pointers to be given as a mere property on the pointer map object.
- `FieldPointerMap` now provides the lists of boilerplate calls for host-to-device copies and host-sync calls as properties and derives them from the three internally stored argument/array symbol lists.
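
The ordering point can be illustrated in plain Python, independent of Loki. `dict.fromkeys` removes duplicates while keeping first-seen order (dictionaries preserve insertion order since Python 3.7), whereas iterating over a `set` of strings can give a different order from one run to the next because of hash randomization. The variable names below are made up for illustration:

```python
names = ['zfield', 'afield', 'zfield', 'mfield', 'afield']

# Order-preserving de-duplication: the first occurrence of each name wins
ordered_unique = tuple(dict.fromkeys(names))

# A set removes duplicates too, but its iteration order is not guaranteed
unordered_unique = set(names)
```

Both hold the same three names, but only `ordered_unique` is guaranteed to list them as `zfield`, `afield`, `mfield` on every run.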