
Gpu datatest #553

Open · wants to merge 23 commits into main

Conversation

@OngChia (Contributor) commented Sep 19, 2024

Enable single-node GPU runs for the Jablonowski-Williamson test on the R2B4 grid and all datatests for the Timeloop.

  1. Change np to xp in serialbox_utils.py (see the sketch after this list).
  2. Change asnumpy() to ndarray in test_timeloop.py, jablownski_williamson.py, and driver/utils.py.
  3. Split predictor_stencils_7_8_9 into two stencils to avoid illegal memory access when running on GPU.
  4. Add a new field exner_dyn_incr_lastsubstep to the class DiagnosticStateNonHydro as the output argument of the stencil update_dynamical_exner_time_increment, because this stencil originally updates its input argument exner_dyn_incr.
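
A minimal sketch of the np → xp pattern referred to in item 1 (the try/except import and the helper below are illustrative only, not the exact code in serialbox_utils.py):

# Illustrative: use cupy when available so arrays stay on the GPU,
# otherwise fall back to numpy on the CPU.
try:
    import cupy as xp
except ImportError:
    import numpy as xp

def to_device_array(raw_buffer):
    # Hypothetical helper: with cupy this keeps the data on the device,
    # with numpy it is a plain host array.
    return xp.asarray(raw_buffer)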

@OngChia commented Sep 19, 2024

cscs-ci run default

@OngChia commented Sep 19, 2024

cscs-ci run default

@halungge (Contributor) left a comment


Some renaming; for the numpy.ma parts you might pick up the changes from this PR.

@@ -33,6 +33,7 @@
log = logging.getLogger(__name__)


# TODO (Chia Rui): Convert all numpy computations to cupy
Contributor


haven't you done this now?

Contributor Author


Ah. Thanks. I forgot to delete it.

@@ -49,6 +49,38 @@ def _compute_virtual_potential_temperatures_and_pressure_gradient(
return z_theta_v_pr_ic_vp, theta_v_ic_wp, astype(z_th_ddz_exner_c_wp, vpfloat)


@field_operator
def _compute_only_virtual_potential_temperatures(
Contributor


Suggested change
- def _compute_only_virtual_potential_temperatures(
+ def _compute_virtual_potential_temperatures(

the "only" is kind of implicit...



@field_operator
def _compute_only_pressure_gradient(
Contributor


Suggested change
- def _compute_only_pressure_gradient(
+ def _compute_pressure_gradient(

Contributor


In order to remove the duplication I would have the field_operator call the two new ones, like this:

@field_operator
def _compute_virtual_potential_temperatures_and_pressure_gradient(...):
    z_theta_v_pr_ic_vp, theta_v_ic_wp = _compute_only_virtual_potential_temperatures(...)
    z_th_ddz_exner_c_wp = _compute_only_pressure_gradient(...)
    return z_theta_v_pr_ic_vp, theta_v_ic_wp, astype(z_th_ddz_exner_c_wp, vpfloat)

This will also not harm the blueline as it is purely internal.

Contributor Author


Ahh. Good suggestion. Thanks!



@gtx.program(grid_type=gtx.GridType.UNSTRUCTURED, backend=backend)
def predictor_stencils_7_8_9_secondstep(
Contributor


you could even call this one compute_pressure_gradient

@@ -69,6 +70,7 @@ class EntryType(IntEnum):
    @builder.builder
    def with_dimension(self, dim: Dimension, global_index: np.ndarray, owner_mask: np.ndarray):
        masked_global_index = ma.array(global_index, mask=owner_mask)
        masked_global_index = xp.asarray(masked_global_index)
        self._global_index[dim] = masked_global_index
Contributor


I guess that won't work. I don't know what happens with the mask if you do that, and we need that mask later. But indeed cupy does not implement numpy's masked_array interface. We would have to do a simple hand-made implementation, which should be enough for what we need. I'll open a fix for that.
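
A possible shape for such a hand-made replacement (purely illustrative; the class and method names are hypothetical, not the interface of the eventual fix):

try:
    import cupy as xp
except ImportError:
    import numpy as xp

class SimpleMaskedArray:
    # Poor man's stand-in for numpy.ma on device arrays.
    def __init__(self, data, mask):
        # Follows the numpy.ma convention: True in `mask` means "masked out".
        self.data = xp.asarray(data)
        self.mask = xp.asarray(mask)

    def compressed(self):
        # Return only the unmasked entries, like numpy.ma.MaskedArray.compressed().
        return self.data[~self.mask]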

Contributor Author


Okay. Thanks.

@OngChia commented Sep 19, 2024

cscs-ci run default

@OngChia commented Sep 24, 2024

cscs-ci run default

@OngChia commented Sep 24, 2024

launch jenkins spack

@OngChia commented Sep 24, 2024

cscs-ci run default

@OngChia commented Sep 24, 2024

launch jenkins spack

@OngChia commented Sep 24, 2024

cscs-ci run default

@OngChia commented Sep 24, 2024

launch jenkins spack

OngChia and others added 4 commits September 25, 2024 09:17
* implement a simple poor man's masked array as the numpy.ma interface is not implemented in cupy.

* fix imports: use xp
fix typo in dunder function

* fix imports

* fix missing device copy in serialbox_utils.py
* keep start and end indices on host,
remove some functions

* fix typo
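
The "keep start and end indices on host" commit presumably reflects that start/end indices are consumed as plain Python integers (for slicing and as bounds), so putting them on the device only forces transfers back; a hypothetical illustration:

try:
    import cupy as xp
except ImportError:
    import numpy as xp

field = xp.zeros((10, 65))       # bulk field data may live on the GPU
start_index, end_index = 2, 7    # scalar bounds stay as host ints

# Slicing with host integers needs no device round-trip, whereas a device
# scalar would first have to be copied back with int(...) before use.
view = field[start_index:end_index, :]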
@OngChia commented Sep 25, 2024

cscs-ci run default

@OngChia commented Sep 25, 2024

launch jenkins spack


Mandatory Tests

Please make sure you run these tests via comment before you merge!

  • cscs-ci run default
  • launch jenkins spack

Optional Tests

To run benchmarks you can use:

  • cscs-ci run benchmark

To run tests and benchmarks with the DaCe backend you can use:

  • cscs-ci run dace

In case your change might affect downstream icon-exclaim, please consider running

  • launch jenkins icon

For more detailed information please look at CI in the EXCLAIM universe.

Comment on lines +1647 to +1659
log.info(
f" MAXPRE VN: {prognostic_state[nnew].vn.ndarray.max():.15e} , MAXPRE W: {prognostic_state[nnew].w.ndarray.max():.15e}"
)
log.info(
f" MAXPRE RHO: {prognostic_state[nnew].rho.ndarray.max():.15e} , MAXPRE THETA_V: {prognostic_state[nnew].theta_v.ndarray.max():.15e}"
)
log.info(
f" AVEPRE VN: {prognostic_state[nnew].vn.ndarray.mean(axis=(0,1)):.15e} , AVEPRE W: {prognostic_state[nnew].w.ndarray.mean(axis=(0,1)):.15e}"
)
log.info(
f" AVEPRE RHO: {prognostic_state[nnew].rho.ndarray.mean(axis=(0,1)):.15e} , AVEPRE THETA_V: {prognostic_state[nnew].theta_v.ndarray.mean(axis=(0,1)):.15e}"
)

Contributor


These are reductions and hurt performance even if the logging level is set higher than INFO. If you want to keep them, see
https://stackoverflow.com/questions/58592292/does-python-logging-incur-a-performance-hit-if-you-log-below-the-set-level

if logger.isEnabledFor(logging.DEBUG):
    logger.debug('Message with %s, %s', expensive_func1(),
                                        expensive_func2())

But then I would still demote them to DEBUG (not INFO).
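
Applied to the quoted lines, the guarded and demoted version might look roughly like this (a sketch only; prognostic_state, nnew, and log are taken from the surrounding code in the diff):

# Guard the reductions so .max()/.mean() only run when DEBUG logging is active.
if log.isEnabledFor(logging.DEBUG):
    vn = prognostic_state[nnew].vn.ndarray
    w = prognostic_state[nnew].w.ndarray
    log.debug(f" MAXPRE VN: {vn.max():.15e} , MAXPRE W: {w.max():.15e}")
    log.debug(f" AVEPRE VN: {vn.mean(axis=(0, 1)):.15e} , AVEPRE W: {w.mean(axis=(0, 1)):.15e}")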
