-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Gpu datatest #553
base: main
Are you sure you want to change the base?
Gpu datatest #553
Conversation
cscs-ci run default |
cscs-ci run default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some renaming, for the numpy.ma parts you might pick changes from this PR
@@ -33,6 +33,7 @@ | |||
log = logging.getLogger(__name__) | |||
|
|||
|
|||
# TODO (Chia Rui): Convert all numpy computations to cupy |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
haven't you done this now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah. Thanks. I forgot to delete it.
@@ -49,6 +49,38 @@ def _compute_virtual_potential_temperatures_and_pressure_gradient( | |||
return z_theta_v_pr_ic_vp, theta_v_ic_wp, astype(z_th_ddz_exner_c_wp, vpfloat) | |||
|
|||
|
|||
@field_operator | |||
def _compute_only_virtual_potential_temperatures( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _compute_only_virtual_potential_temperatures( | |
def _compute_virtual_potential_temperatures( |
the only is kind of implicit...
|
||
|
||
@field_operator | ||
def _compute_only_pressure_gradient( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def _compute_only_pressure_gradient( | |
def _compute_pressure_gradient( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to remove the duplication I would have the field_operator
call the two new ones, like that
@field_operator
def _compute_virtual_potential_temperatures_and_pressure_gradient(...):
z_theta_v_pr_ic_vp, theta_v_ic_wp = _compute_only_virtual_potential_temperatures()
z_th_ddz_exner_c_wp = _compute_only_pressure_gradient(...)
return z_theta_v_pr_ic_vp, theta_v_ic_wp, astype(z_th_ddz_exner_wp, vpfloat)
This will also not harm the blueline as it is purely internal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh. Good suggestion. Thanks!
|
||
|
||
@gtx.program(grid_type=gtx.GridType.UNSTRUCTURED, backend=backend) | ||
def predictor_stencils_7_8_9_secondstep( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you could even call this one compute_pressure_gradient
@@ -69,6 +70,7 @@ class EntryType(IntEnum): | |||
@builder.builder | |||
def with_dimension(self, dim: Dimension, global_index: np.ndarray, owner_mask: np.ndarray): | |||
masked_global_index = ma.array(global_index, mask=owner_mask) | |||
masked_global_index = xp.asarray(masked_global_index) | |||
self._global_index[dim] = masked_global_index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess that won't work. I don't now what happens with the mask if you do that and we need it that mask later. But indeed cupy does not implement the numpy's masked_array interface. We would have to do a simple hand made implementation which should be enough for what we need. I'll open a fix for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. Thanks.
cscs-ci run default |
cscs-ci run default |
launch jenkins spack |
cscs-ci run default |
launch jenkins spack |
cscs-ci run default |
launch jenkins spack |
* implement a simple poor mans masked array as the numpy.ma interface is not implementd in cupy. * fix imports: use xp fix typo in dunder function * fix imports * fix missing device copy in serialbox_utils.py
* keep start and end indices on host, remove some functions * fix typo
cscs-ci run default |
launch jenkins spack |
Mandatory Tests Please make sure you run these tests via comment before you merge!
Optional Tests To run benchmarks you can use:
To run tests and benchmarks with the DaCe backend you can use:
In case your change might affect downstream icon-exclaim, please consider running
For more detailed information please look at CI in the EXCLAIM universe. |
log.info( | ||
f" MAXPRE VN: {prognostic_state[nnew].vn.ndarray.max():.15e} , MAXPRE W: {prognostic_state[nnew].w.ndarray.max():.15e}" | ||
) | ||
log.info( | ||
f" MAXPRE RHO: {prognostic_state[nnew].rho.ndarray.max():.15e} , MAXPRE THETA_V: {prognostic_state[nnew].theta_v.ndarray.max():.15e}" | ||
) | ||
log.info( | ||
f" AVEPRE VN: {prognostic_state[nnew].vn.ndarray.mean(axis=(0,1)):.15e} , AVEPRE W: {prognostic_state[nnew].w.ndarray.mean(axis=(0,1)):.15e}" | ||
) | ||
log.info( | ||
f" AVEPRE RHO: {prognostic_state[nnew].rho.ndarray.mean(axis=(0,1)):.15e} , AVEPRE THETA_V: {prognostic_state[nnew].theta_v.ndarray.mean(axis=(0,1)):.15e}" | ||
) | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
these are reductions and hurt performance even if logging level is set to higher than info. If you want to keep them, see
https://stackoverflow.com/questions/58592292/does-python-logging-incur-a-performance-hit-if-you-log-below-the-set-level
if logger.isEnabledFor(logging.DEBUG):
logger.debug('Message with %s, %s', expensive_func1(),
expensive_func2())
But then I would still demote them to DEBUG
(not INFO
).
Enable single-node GPU run for the Jablonowski-WIlliamson test on R2B4 grid and all datatests for the Timeloop.
serialbox_utils.py
.test_timeloop.py
andjablownski_williamson.py
anddriver/utils.py
predictor_stencils_7_8_9
to two stencils to avoid illegal memory access when running on gpu.exner_dyn_incr_lastsubstep
in classDiagnosticStateNonHydro
as the output argument for the stencilupdate_dynamical_exner_time_increment
because this stencil originally updates the input argumentexner_dyn_incr
.