forked from GridTools/gt4py
Merge main #1
Merged
…tests. (#1409)

* Add power unrolling functionality and respective unit tests.
* Define base and exponent variables for better readability in PowerUnrolling.
* Remove distinction between SymRef and FunCall in power unrolling.
* Optimize power unrolling to avoid multiple computations of FunCalls.
* Further improve power unrolling.
* Update wrt review and adapt expected results respectively.
* Add correct annotation.

---------

Co-authored-by: Sara Faghih-Naini <sara.faghihnaini@ecmwf.int>
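The idea behind power unrolling can be sketched as follows. This is a hypothetical illustration working on expression strings, not gt4py's actual pass: `x**n` for a small integer exponent is rewritten as repeated multiplication, with the base bound once so that a function-call base is not re-evaluated.

```python
def unroll_power(base_expr: str, exponent: int) -> str:
    """Rewrite `base_expr ** exponent` as repeated multiplication.

    The base is bound to a temporary so that a FunCall base is evaluated
    only once (hypothetical sketch, not the real PowerUnrolling pass).
    """
    if exponent == 0:
        return "1"
    # multiply the bound base `exponent` times
    product = " * ".join(["b"] * exponent)
    return f"(lambda b: {product})({base_expr})"
```

For example, `unroll_power("f(x)", 3)` yields `(lambda b: b * b * b)(f(x))`, so `f` is called only once.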
Found an incompatible tasklet representation while upgrading to dace v0.15.1. Array access inside a tasklet with a partial index subset worked in v0.14.1, although it was not valid. The fix consists of modifying the memlets to pass the full array shape to such tasklets, and of using all explicit indices inside the tasklet to access the array. This is the correct representation in a DaCe SDFG, as discussed with the DaCe developers.
* edits for BoundArgs with kwargs in correct order
Add an example illustrating the use of gt4py.cartesian and gt4py.next computations next to each other via gt4py.next storages. Refactor GTFieldInterface and clean up GTDimsInterface for next.
Bumping the dace version to 0.15.1 affects both cartesian and next gt4py:

* cartesian: removed try/except for dace backward compatibility
* next: re-enabled some tests that were broken on dace 0.14.4
* all: fixed and/or suppressed flake8 and mypy errors
…gs (#1418)

Fixes hidden bugs in `eve.datamodels` and `eve.extended_typing` to support Python 3.11.

Actual bug fixes:
- The previous fix to support the `typing.Any` implementation as a class (python/cpython@5a4973e) didn't work in 3.11.
- Partial concretization of generic datamodels replacing typevars was broken.
- Partial concretization of generic datamodels leaving some parameters as typevars was broken.

Other changes:
- Add Python 3.11 as a supported version.
- Remove dead code in comments.
- Fix some import styles to comply with our coding guidelines.
…d program (#1323)

After #1275 most of the error message given to the user when calling a field operator or program with invalid arguments was only available in verbose mode. This PR shows this information again.

```python
@field_operator
def foo(x: IField):
    return x

@field_operator
def testee(a: IField, b: IField, c: IField) -> IField:
    return foo(1)
```

```
gt4py.next.errors.exceptions.DSLError: Invalid argument types in call to `foo`.
E Invalid call to function of type `FieldOperatorType(definition=FunctionType(pos_only_args=[], pos_or_kw_args={'x': FieldType(dims=[Dimension(value='IDim', kind=<DimensionKind.HORIZONTAL: 'horizontal'>)], dtype=ScalarType(kind=<ScalarKind.INT32: 32>, shape=None))}, kw_only_args={}, returns=FieldType(dims=[Dimension(value='IDim', kind=<DimensionKind.HORIZONTAL: 'horizontal'>)], dtype=ScalarType(kind=<ScalarKind.INT32: 32>, shape=None))))`:
E - Expected argument `x` to be of type `Field[[IDim], int32]`, but got `int32`.
E File ".../gt4py_functional/tests/next_tests/integration_tests/feature_tests/ffront_tests/test_arg_call_interface.py", line 113
E     return foo(1)
```
* Add more debug info to DaCe (pass SourceLocation from past/foast to itir, and from itir to the SDFG): Preserve Location through Visitors
…nsion (#1422) Main purpose of this PR is to avoid the definition of shape symbols for array dimensions known at compile time. The local size of neighbor connectivity tables falls into this category. For each element in the origin dimension, the number of elements in the target dimension is defined by the attribute max_neighbors in the offset provider.
The lowering of the scan operator to SDFG uses a state machine to represent a loop. This PR replaces the state machine with the LoopRegion construct introduced in dace v0.15. The LoopRegion construct is not yet supported by dace transformations, but it will be in the future, and it could open new optimization opportunities (e.g. K-caching).
Replace deprecated constructor API dace.Memlet.simple() with dace.Memlet()
Introduces a mechanism in tests for having different allocators for the same (`None`) backend.

Fixes:
- The resulting buffer for scan is deduced from the buffer type of the arguments; if there are no arguments we fall back to numpy (maybe break). We need to find a mechanism for this corner case. Currently these tests are excluded with `pytest.mark.uses_scan_without_field_args` for cupy embedded execution.

Refactoring:
- make common.field and common.connectivity private
- rename next_tests.exclusion_matrices to definitions

TODOs for later:
- `broadcast` of a scalar ignores the broadcast

---------

Co-authored-by: Enrique González Paredes <enriqueg@cscs.ch>
Baseline contained a bug in the lowering of deref in the context of neighbor reduction. The data container should be statically allocated with size equal to the max_neighbors attribute in the offset provider.
- Update minimal version for pygments due to a conflict (failing daily min requirements CI)
- Many files touched due to a formatting change in black
- Fix a bug in the cartesian hypothesis setup
…1432) Add unit tests for `ConnectivityField.inverse_image()`.
- Adds a mesh with skip values
- Define `common.SKIP_VALUE = -1` instead of using `-1` explicitly
- Skip tests with that mesh in embedded (will come in a next PR)
This PR provides a bugfix for the case of neighbor reductions with a lambda function as the reduction operation and a connectivity table containing skip values. The lambda function should only accumulate the results for the valid neighbors. On the contrary, the baseline implementation was using the reduction identity value for the missing neighbors, resulting in invalid results. The fix consists of producing an array of boolean flags to determine whether each neighbor value is valid. If not valid, the call to the lambda function is bypassed.
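The fixed behavior can be sketched like this. A hypothetical helper, not the DaCe lowering itself; the `SKIP_VALUE = -1` convention matches the one mentioned elsewhere in this log:

```python
SKIP_VALUE = -1  # marker for a missing neighbor in the connectivity table

def reduce_neighbors(values, neighbor_table, combine, init):
    """Fold `combine` over valid neighbors only; entries equal to
    SKIP_VALUE bypass the call instead of contributing the identity."""
    result = []
    for neighbors in neighbor_table:
        acc = init
        for idx in neighbors:
            if idx == SKIP_VALUE:
                continue  # missing neighbor: do not call combine at all
            acc = combine(acc, values[idx])
        result.append(acc)
    return result
```

With a non-commutative or identity-sensitive `combine`, feeding the identity for missing neighbors (the baseline behavior) and skipping them entirely (the fix) give different results.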
…ions (no unrolling) (#1431)

The baseline dace backend forced unrolling of neighbor reductions in the ITIR pass in order to eliminate all lift expressions. This PR adds support for lowering lift expressions in neighbor reductions, thus avoiding the need to unroll reduce expressions. The result is a more compact SDFG, which leaves the option of unrolling neighbor reductions to the optimization backend.
Temporaries are implemented in DaCe backend as transient arrays. This PR adds extraction of temporaries and generation of corresponding transient arrays in the SDFG representation.
The cache was copied in `with_backend`, but backend is not part of the hash. Now the cache will be empty after `with_backend`.
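The shape of the fix can be sketched with a hypothetical dataclass (`CompiledOp` is illustrative, not the real decorator class):

```python
import dataclasses

@dataclasses.dataclass
class CompiledOp:
    """Hypothetical sketch: since the backend is not part of the cache
    key, a copy for a new backend must start with an empty cache."""
    backend: str
    _cache: dict = dataclasses.field(default_factory=dict)

    def with_backend(self, backend: str) -> "CompiledOp":
        # Do NOT copy self._cache: its entries were produced for the old
        # backend, and the key does not encode which backend they belong to.
        return CompiledOp(backend=backend, _cache={})
```

Copying the cache here would let the new backend silently reuse artifacts compiled for the old one.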
Small PR in preparation of the new ITIR type system. Currently the type of an `itir.Literal` is stored as a string, which blocks introducing a `type: ts.TypeSpecification` attribute in all `itir.Node`s. In order to keep the PR for the new type inference easy to review, this has been factored out.

```python
class Literal(Expr):
    value: str
    type: str

    @datamodels.validator("type")
    def _type_validator(self: datamodels.DataModelTP, attribute: datamodels.Attribute, value):
        if value not in TYPEBUILTINS:
            raise ValueError(f"'{value}' is not a valid builtin type.")
```

is changed to

```python
class Literal(Expr):
    value: str
    type: ts.ScalarType
```
Remove the cpp_backend_tests. They were from a time when we couldn't run gtfn from python.
…or (#1533)

Currently we have a mix of specifying the backend (which already comes with a `lift_mode` default) and, separately, the `lift_mode` fixture in some tests. The default was not overwritten in the roundtrip backend. Now we remove the separate `lift_mode` fixture and add extra backends with the lift_mode set.

Note: we don't run double_roundtrip with temporaries.

Longer term we should refactor all itir tests to use the ffront test infrastructure.

---------

Co-authored-by: Till Ehrengruber <t.ehrengruber@me.com>
## Description

The GTC `cuda` backend was made available a few years ago for AI2 team research. It has been kept updated, but a recent poll shows that it is not in use. Recent new features break the backend, and we propose here to hard deprecate it rather than keep spending time maintaining it. `GT4PY_GTC_ENABLE_CUDA=1` can be used to force the use of the backend, but it will warn that any features from February 2024 onward are not available/not tested.

Additionally, a mechanism to deprecate all GTC backends is now in use, using:

```python
@disabled(
    message="Disable message.",
    enabled_env_var="EnvVarToEnable",
)
```

## Requirements

- [x] All fixes and/or new features come with corresponding tests.

---------

Co-authored-by: Hannes Vogt <hannes@havogt.de>
## Description

### New:
- `ffront.stages.FieldOperatorDefinition` - all the data to start the toolchain from a field operator dsl definition
- `ffront.stages.FoastOperatorDefinition` - data after lowering from field operator dsl code
- `ffront.stages.FoastWithTypes` - program argument types in addition to the foast definition for creating a program AST
- `ffront.stages.FoastClosure` - program arguments in addition to the foast definition, ready to run the whole toolchain

### Changed:
- `decorator.Program.__post_init__`
  - implementation moved to `past_passes.linters` workflow steps
  - linting stage added to program transforms
- `decorator.FieldOperator.from_function` - implementation moved to workflow step in `ffront.func_to_foast`
- `decorator.FieldOperator.as_program` - implementation moved to workflow steps in `ffront.foast_to_past`
- `decorator.FieldOperator` data attributes
  - added: `definition_stage`
  - removed:
    - `.foast_node`: replaced with `.foast_stage.foast_node`
    - `.definition`: replaced with `.definition_stage.definition`
- `next.backend.Backend`
  - renamed: `.transformer` -> `.transforms_prog`
  - added: `.transforms_fop`, toolchain for starting from field operator
- `otf.recipes.FieldOpTransformWorkflow` - now has all the steps from DSL field operator to `ProgramCall` via `foast_to_past`, with additional steps to go to the field operator IteratorIR expression directly instead (not run by default). The latter `foast_to_itir` step is required during lowering of programs that call a field operator.
…n error (#1541) Suppress the deprecation error for cartesian "CUDA" backend on CSCS-CI.
Move the pytest `addopts` setting from the pytest config in pyproject.toml to an environment setting in tox.ini, to keep pytest CLI invocation clean during local development.

---------

Co-authored-by: DropD <rico.haeuselmann@gmail.com>
## Description

Recent work with a new user showed that a basic mistake can lead to a pretty gnarly stack trace instead of a clean error message. This fixes one of the most common: requesting a bad backend name.

## Requirements

- [x] All fixes and/or new features come with corresponding tests.
Extend the implementation of the `premap` field operation (previously named `remap`, conceptually equivalent to a Contravariant Functor's `contramap`) to support more efficient implementations of different use cases depending on the contents of the connectivity field.

### Added
- `gt4py.eve`: new typing aliases and minor utilities

### Changed
- `gt4py.next.common`:
  - new typing aliases
  - small refactoring of `Domain` to support creation of subdomains via slicing using the `.slice_at` attribute. The actual implementation comes from the now deleted `gt4py.next.embedded.nd_array_field._relative_ranges_to_domain()` function.
  - refactor `ConnectivityKind` to represent all known use cases
  - extend `CartesianConnectivity` to support translation and relocations
  - rename `remap` to `premap`
- `gt4py.next.embedded.nd_array_field`:
  - full refactoring of `premap()` (old `remap`) and add usage documentation
  - some renamings (`_hypercube()` -> `_hyperslice()`, `_compute_mask_ranges()` -> `_compute_mask_slices()`)

### Removed
- `gt4py.next.embedded.nd_array_field`: `_relative_ranges_to_domain()` function moved to a `Domain` attribute in `gt4py.next.common`

---------

Co-authored-by: Hannes Vogt <hannes@havogt.de>
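Conceptually, `premap` is precomposition: viewing a field as a function from positions to values, mapping over its *input* is the contravariant `contramap`. A minimal illustration with plain Python functions standing in for gt4py fields and connectivities (hypothetical, not the library API):

```python
def premap(field, connectivity):
    """Precompose a field with a position mapping: the new field at
    position p holds the old field's value at connectivity(p)."""
    return lambda pos: field(connectivity(pos))

# Example: a 1D "field" and a shift-by-one "connectivity".
values = lambda i: i * 10
shift_right = lambda i: i + 1
shifted = premap(values, shift_right)
```

Note the direction: `premap` transforms the field's *domain* (its input), which is why it is `contramap` rather than an ordinary `map` over the values.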
Additional: fix a generator in pretty_printer after mypy update showed the error. --------- Co-authored-by: Enrique González Paredes <enriqueg@cscs.ch>
## Changed:

Toolchain does not own input arguments to DSL programs anymore; instead the input data structure owns them and the toolchain can dispatch them to a subset of the steps.

## Toolchain migration:

Old:

```python
FieldopTransformWorkflow().replace(foast_inject_args=FopArgsInjector(*args, **kwargs))(fieldoperator_definition)
ProgramTransformWorkflow().replace(program_inject_args=ProgArgsInjector(*args, **kwargs))(program_definition)
```

New:

```python
FieldopTransformWorkflow()(InputWithArgs(fieldoperator_definition, args, kwargs))
ProgramTransformWorkflow()(InputWithArgs(program_definition, args, kwargs))
```

## Added:

- `otf.workflow`:
  - new workflow type: `NamedStepSequenceWithArgs` takes an `InputWithArgs` and dispatches `.args` and `.kwargs` to steps that set `take_args = True` in the field metadata
  - new data type `InputWithArgs` wraps a workflow stage and call args
- `backend`: Replace `*ArgsInjector` using the new `NamedStepSequenceWithArgs` infrastructure
Remove unnecessary unique ID, which makes code generation easier for ICON-DSL bindings.
Failure introduced after #1544, but was wrong before.
This allows pulling the image (from within the CSCS network) without authentication. Useful for debugging images. See https://gitlab.com/cscs-ci/ci-testing/containerised_ci_doc/-/blob/6f83631d9ad2c63edd63bb19b88706d108638c0f/clone_image.md#allow-anonymous-access-to-container-image
gridtools_cpp adds a new minimum boost version in the next release
…v variables (#1491)

This PR allows downstream applications to specify compiler optimization level and flags on a per-stencil basis.
Upgrade the DaCe package version to release v0.16.1. Include GT4Py-next code changes enabled by bugfixes included in the new DaCe package.

Package versions updated in requirement files, but keeping `nanobind==1.9.2` (GT4Py does not support v2.0.0) and `mypy==1.10.0` (v1.10.1 not available for pre-commit).

A couple of code changes in the GT4Py-cartesian DaCe backend:
- temporary config in a test case to force serializing all fields in the SDFG JSON representation, because the default config value was changed in the new dace release
- handle scalar parameters in stencil objects as scalars rather than symbols, to avoid undefined symbol issues when the parameter is not used
New type inference algorithm on ITIR unifying the type system with the one used in the frontend. Types are stored directly in the ITIR nodes. This replaces the constraint-based type inference, giving significant performance and usability improvements. Types of builtins are expressed using simple-to-write `TypeSynthesizer`s of the form:

```python
@_register_builtin_type_synthesizer
def power(base: ts.ScalarType, exponent: ts.ScalarType) -> ts.ScalarType:
    return base
```
Requires gridtools_cpp >= 2.3.4 for nanobind 2.x.

Additional changes: freezes numpy to 1.x until #1559.
Support NumPy 2.0 by collecting the functions required for creating storages in a single object, created at initialization time, from their actual location in NumPy v1 or v2 APIs.
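The pattern described above can be sketched as follows (a hypothetical helper, not gt4py's actual storage module; the NumPy module is passed in as a parameter so the sketch is testable with either major version):

```python
from types import SimpleNamespace

def make_array_utils(np_module):
    """Resolve version-dependent functions once, at initialization time,
    so the rest of the code never branches on the NumPy version."""
    major = int(np_module.__version__.split(".")[0])
    return SimpleNamespace(
        version=major,
        # `in1d` was deprecated in NumPy 2.0 in favor of `isin`; resolve
        # the right attribute once here instead of at every call site.
        isin=getattr(np_module, "isin", None) or np_module.in1d,
    )
```

Code using the returned namespace then calls `utils.isin(...)` without caring which NumPy API actually backs it.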
Updates itir.embedded to work with `itir.Program`s, i.e. `set_at` and `as_fieldop`. For programs to be able to run in embedded, the domain needs to be provided as second argument to `as_fieldop`. Introduces a `DimensionKind` to `itir.AxisLiteral` to be able to reconstruct the kind from the IR. This is needed now as the `set_at` assigns from field to field, which requires matching dimensions. However, previously the python program generated from IR would always construct horizontal dimensions (but the information would not be used). --------- Co-authored-by: Till Ehrengruber <t.ehrengruber@me.com>
…e inference (#1566) Since `cast_` is a grammar builtin whose return type is given by its second argument it is easy to forget inferring the types of the first argument and its children. This surfaced in icon4py and is fixed here.
When the new ITIR type inference was introduced this broke the code generation in icon4py. This PR adds a small assert and improves the docstring.
This PR adds the CI/CD configuration for GH200 nodes, while keeping the current configuration for x86_64+CUDA on PizDaint.

Some differences between GH200 (on Todi vCluster) and x86_64+CUDA (on PizDaint):
- CUDA v11.2.2 / CUDA ARCH=60 on PizDaint vs. CUDA v12.4.1 / CUDA ARCH=90 on GH200
- Support for Python 3.8, 3.9, 3.10, 3.11 on the x86_64 Ubuntu base image, while Python 3.8 is not supported on the ARM base image for GH200
- JAX dependency updated from v0.4.13 to v0.4.18 because this is the minimum version available on the ARM base image
- A compiler is allowed to choose whether `char` is signed or unsigned. The Python bindings in GT4Py cartesian rely on the signed representation, which was the default for the compiler on the x86_64 Ubuntu base image. This behavior is not the default on the ARM base image, so we have to enforce it with the flag `-fsigned-char`.
…compiler build (#1552)

Add one more environment variable, GT4PY_EXTRA_LINK_ARGS, for flexible support of various compiler builds (this contribution is not copyrightable as it is a trivial change).
In case a stencil contains an out-of-bounds access, the derivation of the dtype in ITIR embedded can fail with an obscure error like

```
ValueError: DType 'DType(scalar_type=<class 'numpy.object_'>, tensor_shape=())' not supported.
```

This PR adds an assert to catch this earlier and fail more gracefully.