forked from GridTools/gt4py
Merge main #1
Merged
…tests. (#1409)

* Add power unrolling functionality and respective unit tests.
* Define base and exponent variables for better readability in PowerUnrolling.
* Remove distinction between SymRef and FunCall in power unrolling.
* Optimize power unrolling to avoid multiple computations of FunCalls.
* Further improve power unrolling.
* Update wrt review and adapt expected results respectively.
* Add correct annotation.

---------

Co-authored-by: Sara Faghih-Naini <sara.faghihnaini@ecmwf.int>
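The idea behind power unrolling can be sketched as follows. This is a hypothetical illustration working on expression strings, not gt4py's actual pass: `x**n` for a small integer exponent is rewritten as repeated multiplication, with the base bound once so that a function-call base is not re-evaluated.

```python
def unroll_power(base_expr: str, exponent: int) -> str:
    """Rewrite `base_expr ** exponent` as repeated multiplication.

    The base is bound to a temporary so that a FunCall base is evaluated
    only once (hypothetical sketch, not the real PowerUnrolling pass).
    """
    if exponent == 0:
        return "1"
    # multiply the bound base `exponent` times
    product = " * ".join(["b"] * exponent)
    return f"(lambda b: {product})({base_expr})"
```

For example, `unroll_power("f(x)", 3)` yields `(lambda b: b * b * b)(f(x))`, so `f` is called only once.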
Found an incompatible tasklet representation while upgrading to dace v0.15.1. Array access inside a tasklet with a partial index subset worked in v0.14.1, although it was not valid. The fix consists of modifying the memlets to pass the full array shape to such tasklets, and of using all explicit indices inside the tasklet to access the array. This is the correct representation in a DaCe SDFG, as discussed with the DaCe developers.
* edits for BoundArgs with kwargs in correct order
Add an example illustrating the use of gt4py.cartesian and gt4py.next computations next to each other via gt4py.next storages. Refactor GTFieldInterface and clean up GTDimsInterface for next.
Bumping the dace version to 0.15.1 affects both cartesian and next gt4py:

* cartesian: removed try/except for dace backward compatibility
* next: re-enabled some tests that were broken on dace 0.14.4
* all: fixed and/or suppressed flake8 and mypy errors
…gs (#1418)

Fixes hidden bugs in `eve.datamodels` and `eve.extended_typing` to support Python 3.11.

Actual bug fixes:
- The previous fix to support the `typing.Any` implementation as a class (python/cpython@5a4973e) didn't work in 3.11.
- Partial concretization of generic datamodels replacing typevars was broken.
- Partial concretization of generic datamodels leaving some parameters as typevars was broken.

Other changes:
- Add Python 3.11 as a supported version.
- Remove dead code in comments.
- Fix some import styles to comply with our coding guidelines.
…d program (#1323)

After #1275 most of the error message given to the user when calling a field operator or program with invalid arguments was only available in verbose mode. This PR shows this information again.

```python
@field_operator
def foo(x: IField):
    return x

@field_operator
def testee(a: IField, b: IField, c: IField) -> IField:
    return foo(1)
```

```
gt4py.next.errors.exceptions.DSLError: Invalid argument types in call to `foo`.
E Invalid call to function of type `FieldOperatorType(definition=FunctionType(pos_only_args=[], pos_or_kw_args={'x': FieldType(dims=[Dimension(value='IDim', kind=<DimensionKind.HORIZONTAL: 'horizontal'>)], dtype=ScalarType(kind=<ScalarKind.INT32: 32>, shape=None))}, kw_only_args={}, returns=FieldType(dims=[Dimension(value='IDim', kind=<DimensionKind.HORIZONTAL: 'horizontal'>)], dtype=ScalarType(kind=<ScalarKind.INT32: 32>, shape=None))))`:
E - Expected argument `x` to be of type `Field[[IDim], int32]`, but got `int32`.
E File ".../gt4py_functional/tests/next_tests/integration_tests/feature_tests/ffront_tests/test_arg_call_interface.py", line 113
E     return foo(1)
```
* Add more debug info to DaCe (pass SourceLocation from past/foast to itir, and from itir to the SDFG): Preserve Location through Visitors
…nsion (#1422) Main purpose of this PR is to avoid the definition of shape symbols for array dimensions known at compile time. The local size of neighbor connectivity tables falls into this category. For each element in the origin dimension, the number of elements in the target dimension is defined by the attribute max_neighbors in the offset provider.
The lowering of the scan operator to SDFG uses a state machine to represent a loop. This PR replaces the state machine with the LoopRegion construct introduced in dace v0.15. The LoopRegion construct is not yet supported by dace transformations, but it will be in the future, and it could open new optimization opportunities (e.g. K-caching).
Replace deprecated constructor API dace.Memlet.simple() with dace.Memlet()
Introduces a mechanism in tests for having different allocators for the same (`None`) backend.

Fixes:
- The resulting buffer for scan is deduced from the buffer type of the arguments; if there are no arguments we fall back to numpy (maybe break). We need to find a mechanism for this corner case. Currently these tests are excluded with `pytest.mark.uses_scan_without_field_args` for cupy embedded execution.

Refactoring:
- make common.field and common.connectivity private
- rename next_tests.exclusion_matrices to definitions

TODOs for later:
- `broadcast` of a scalar ignores the broadcast

---------

Co-authored-by: Enrique González Paredes <enriqueg@cscs.ch>
Baseline contained a bug in the lowering of deref in the context of neighbor reduction. The data container should be statically allocated with size equal to the max_neighbors attribute in the offset provider.
- Update minimal version for pygments due to a conflict (failing daily min requirements CI)
- Many files touched due to a formatting change in black
- Fix a bug in the cartesian hypothesis setup
…1432) Add unit tests for `ConnectivityField.inverse_image()`.
- Adds a mesh with skip values
- Define `common.SKIP_VALUE = -1` instead of using `-1` explicitly
- Skip tests with that mesh in embedded (will come in a next PR)
This PR provides a bugfix for the case of neighbor reductions with a lambda function as the reduction operation and a connectivity table containing skip values. The lambda function should only accumulate the results for the valid neighbors. On the contrary, the baseline implementation was using the reduction identity value for the missing neighbors, resulting in invalid results. The fix consists of producing an array of boolean flags to determine whether each neighbor value is valid. If not valid, the call to the lambda function is bypassed.
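The fixed behavior can be sketched like this. A hypothetical helper, not the DaCe lowering itself; the `SKIP_VALUE = -1` convention matches the one mentioned elsewhere in this log:

```python
SKIP_VALUE = -1  # marker for a missing neighbor in the connectivity table

def reduce_neighbors(values, neighbor_table, combine, init):
    """Fold `combine` over valid neighbors only; entries equal to
    SKIP_VALUE bypass the call instead of contributing the identity."""
    result = []
    for neighbors in neighbor_table:
        acc = init
        for idx in neighbors:
            if idx == SKIP_VALUE:
                continue  # missing neighbor: do not call combine at all
            acc = combine(acc, values[idx])
        result.append(acc)
    return result
```

With a non-commutative or identity-sensitive `combine`, feeding the identity for missing neighbors (the baseline behavior) and skipping them entirely (the fix) give different results.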
…ions (no unrolling) (#1431)

The baseline dace backend forced unrolling of neighbor reductions in the ITIR pass in order to eliminate all lift expressions. This PR adds support for lowering lift expressions in neighbor reductions, thus avoiding the need to unroll reduce expressions. The result is a more compact SDFG, which leaves the option of unrolling neighbor reductions to the optimization backend.
Temporaries are implemented in DaCe backend as transient arrays. This PR adds extraction of temporaries and generation of corresponding transient arrays in the SDFG representation.
The cache was copied in `with_backend`, but backend is not part of the hash. Now the cache will be empty after `with_backend`.
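The shape of the fix can be sketched with a hypothetical dataclass (`CompiledOp` is illustrative, not the real decorator class):

```python
import dataclasses

@dataclasses.dataclass
class CompiledOp:
    """Hypothetical sketch: since the backend is not part of the cache
    key, a copy for a new backend must start with an empty cache."""
    backend: str
    _cache: dict = dataclasses.field(default_factory=dict)

    def with_backend(self, backend: str) -> "CompiledOp":
        # Do NOT copy self._cache: its entries were produced for the old
        # backend, and the key does not encode which backend they belong to.
        return CompiledOp(backend=backend, _cache={})
```

Copying the cache here would let the new backend silently reuse artifacts compiled for the old one.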
Small PR in preparation of the new ITIR type system. Currently the type of an `itir.Literal` is stored as a string, which blocks introducing a `type: ts.TypeSpecification` attribute in all `itir.Node`s. In order to keep the PR for the new type inference easy to review, this has been factored out.

```python
class Literal(Expr):
    value: str
    type: str

    @datamodels.validator("type")
    def _type_validator(self: datamodels.DataModelTP, attribute: datamodels.Attribute, value):
        if value not in TYPEBUILTINS:
            raise ValueError(f"'{value}' is not a valid builtin type.")
```

is changed to

```python
class Literal(Expr):
    value: str
    type: ts.ScalarType
```
Remove the cpp_backend_tests. They were from a time when we couldn't run gtfn from python.
…or (#1533)

Currently we have a mix of specifying the backend (which already comes with a `lift_mode` default) and, separately, the `lift_mode` fixture in some tests. The default was not overwritten in the roundtrip backend. Now we remove the separate `lift_mode` fixture and add extra backends with the lift_mode set.

Note: we don't run double_roundtrip with temporaries.

Longer term we should refactor all itir tests to use the ffront test infrastructure.

---------

Co-authored-by: Till Ehrengruber <t.ehrengruber@me.com>
## Description

The GTC `cuda` backend was made available a few years ago for AI2 team research. It has been kept updated, but a recent poll shows that it is not in use. Recent new features break the backend, and we propose here to hard deprecate it rather than keep spending time maintaining it. `GT4PY_GTC_ENABLE_CUDA=1` can be used to force the use of the backend, but it will warn that any features from February 2024 onward are not available/not tested.

Additionally, a mechanism to deprecate all GTC backends is now in use, using:

```python
@disabled(
    message="Disable message.",
    enabled_env_var="EnvVarToEnable",
)
```

## Requirements

- [x] All fixes and/or new features come with corresponding tests.

---------

Co-authored-by: Hannes Vogt <hannes@havogt.de>
## Description

### New:
- `ffront.stages.FieldOperatorDefinition` - all the data to start the toolchain from a field operator dsl definition
- `ffront.stages.FoastOperatorDefinition` - data after lowering from field operator dsl code
- `ffront.stages.FoastWithTypes` - program argument types in addition to the foast definition for creating a program AST
- `ffront.stages.FoastClosure` - program arguments in addition to the foast definition, ready to run the whole toolchain

### Changed:
- `decorator.Program.__post_init__`
  - implementation moved to `past_passes.linters` workflow steps
  - linting stage added to program transforms
- `decorator.FieldOperator.from_function` - implementation moved to workflow step in `ffront.func_to_foast`
- `decorator.FieldOperator.as_program` - implementation moved to workflow steps in `ffront.foast_to_past`
- `decorator.FieldOperator` data attributes
  - added: `definition_stage`
  - removed:
    - `.foast_node`: replaced with `.foast_stage.foast_node`
    - `.definition`: replaced with `.definition_stage.definition`
- `next.backend.Backend`
  - renamed: `.transformer` -> `.transforms_prog`
  - added: `.transforms_fop`, toolchain for starting from field operator
- `otf.recipes.FieldOpTransformWorkflow` - now has all the steps from DSL field operator to `ProgramCall` via `foast_to_past`, with additional steps to go to the field operator IteratorIR expression directly instead (not run by default). The latter `foast_to_itir` step is required during lowering of programs that call a field operator.
…n error (#1541) Suppress the deprecation error for cartesian "CUDA" backend on CSCS-CI.
Move the pytest `addopts` setting from the pytest config in pyproject.toml to an environment setting in tox.ini, to keep pytest CLI invocation clean during local development.

---------

Co-authored-by: DropD <rico.haeuselmann@gmail.com>
## Description

Recent work with a new user showed that a basic mistake can lead to a pretty gnarly stack trace instead of a clean error message. This fixes one of the most common: requesting a bad backend name.

## Requirements

- [x] All fixes and/or new features come with corresponding tests.
Extend the implementation of the `premap` field operation (previously named `remap`, conceptually equivalent to a Contravariant Functor's `contramap`) to support more efficient implementations of different use cases depending on the contents of the connectivity field.

### Added
- `gt4py.eve`: new typing aliases and minor utilities

### Changed
- `gt4py.next.common`:
  - new typing aliases
  - small refactoring of `Domain` to support creation of subdomains via slicing using the `.slice_at` attribute. The actual implementation comes from the now deleted `gt4py.next.embedded.nd_array_field._relative_ranges_to_domain()` function.
  - refactor `ConnectivityKind` to represent all known use cases
  - extend `CartesianConnectivity` to support translation and relocations
  - rename `remap` to `premap`
- `gt4py.next.embedded.nd_array_field`:
  - full refactoring of `premap()` (old `remap`) and add usage documentation
  - some renamings (`_hypercube()` -> `_hyperslice()`, `_compute_mask_ranges()` -> `_compute_mask_slices()`)

### Removed
- `gt4py.next.embedded.nd_array_field`: `_relative_ranges_to_domain()` function moved to a `Domain` attribute in `gt4py.next.common`

---------

Co-authored-by: Hannes Vogt <hannes@havogt.de>
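Conceptually, `premap` is precomposition: viewing a field as a function from positions to values, mapping over its *input* is the contravariant `contramap`. A minimal illustration with plain Python functions standing in for gt4py fields and connectivities (hypothetical, not the library API):

```python
def premap(field, connectivity):
    """Precompose a field with a position mapping: the new field at
    position p holds the old field's value at connectivity(p)."""
    return lambda pos: field(connectivity(pos))

# Example: a 1D "field" and a shift-by-one "connectivity".
values = lambda i: i * 10
shift_right = lambda i: i + 1
shifted = premap(values, shift_right)
```

Note the direction: `premap` transforms the field's *domain* (its input), which is why it is `contramap` rather than an ordinary `map` over the values.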
Additional: fix a generator in pretty_printer after mypy update showed the error. --------- Co-authored-by: Enrique González Paredes <enriqueg@cscs.ch>
## Changed:

Toolchain does not own input arguments to DSL programs anymore; instead the input data structure owns them and the toolchain can dispatch them to a subset of the steps.

## Toolchain migration:

Old:

```python
FieldopTransformWorkflow().replace(foast_inject_args=FopArgsInjector(*args, **kwargs))(fieldoperator_definition)
ProgramTransformWorkflow().replace(program_inject_args=ProgArgsInjector(*args, **kwargs))(program_definition)
```

New:

```python
FieldopTransformWorkflow()(InputWithArgs(fieldoperator_definition, args, kwargs))
ProgramTransformWorkflow()(InputWithArgs(program_definition, args, kwargs))
```

## Added:

- `otf.workflow`:
  - new workflow type: `NamedStepSequenceWithArgs` takes an `InputWithArgs` and dispatches `.args` and `.kwargs` to steps that set `take_args = True` in the field metadata
  - new data type `InputWithArgs` wraps a workflow stage and call args
- `backend`: Replace `*ArgsInjector` using the new `NamedStepSequenceWithArgs` infrastructure
Remove unnecessary unique ID, which makes code generation easier for ICON-DSL bindings.
Failure introduced after #1544, but was wrong before.
This allows pulling the image (from within the CSCS network) without authentication. Useful for debugging images. See https://gitlab.com/cscs-ci/ci-testing/containerised_ci_doc/-/blob/6f83631d9ad2c63edd63bb19b88706d108638c0f/clone_image.md#allow-anonymous-access-to-container-image
gridtools_cpp adds a new minimum boost version in the next release
…v variables (#1491)

This PR allows downstream applications to specify compiler optimization level and flags on a per-stencil basis.
Upgrade the DaCe package version to release v0.16.1. Include GT4Py-next code changes enabled by bugfixes included in the new DaCe package.

Package versions updated in requirement files, but keeping `nanobind==1.9.2` (GT4Py does not support v2.0.0) and `mypy==1.10.0` (v1.10.1 not available for pre-commit).

A couple of code changes in the GT4Py-cartesian DaCe backend:
- temporary config in a test case to force serializing all fields in the SDFG JSON representation, because the default config value was changed in the new dace release
- handle scalar parameters in stencil objects as scalars rather than symbols, to avoid undefined symbol issues when the parameter is not used
New type inference algorithm on ITIR unifying the type system with the one used in the frontend. Types are stored directly in the ITIR nodes. This replaces the constraint-based type inference, giving significant performance and usability improvements. Types of builtins are expressed using simple-to-write `TypeSynthesizer`s of the form:

```python
@_register_builtin_type_synthesizer
def power(base: ts.ScalarType, exponent: ts.ScalarType) -> ts.ScalarType:
    return base
```
Requires gridtools_cpp >= 2.3.4 for nanobind 2.x.

Additional changes: freezes numpy to 1.x until #1559.
Support NumPy 2.0 by collecting the functions required for creating storages in a single object, created at initialization time, from their actual location in NumPy v1 or v2 APIs.
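The pattern described above can be sketched as follows (a hypothetical helper, not gt4py's actual storage module; the NumPy module is passed in as a parameter so the sketch is testable with either major version):

```python
from types import SimpleNamespace

def make_array_utils(np_module):
    """Resolve version-dependent functions once, at initialization time,
    so the rest of the code never branches on the NumPy version."""
    major = int(np_module.__version__.split(".")[0])
    return SimpleNamespace(
        version=major,
        # `in1d` was deprecated in NumPy 2.0 in favor of `isin`; resolve
        # the right attribute once here instead of at every call site.
        isin=getattr(np_module, "isin", None) or np_module.in1d,
    )
```

Code using the returned namespace then calls `utils.isin(...)` without caring which NumPy API actually backs it.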
Updates itir.embedded to work with `itir.Program`s, i.e. `set_at` and `as_fieldop`. For programs to be able to run in embedded, the domain needs to be provided as second argument to `as_fieldop`. Introduces a `DimensionKind` to `itir.AxisLiteral` to be able to reconstruct the kind from the IR. This is needed now as the `set_at` assigns from field to field, which requires matching dimensions. However, previously the python program generated from IR would always construct horizontal dimensions (but the information would not be used). --------- Co-authored-by: Till Ehrengruber <t.ehrengruber@me.com>
…e inference (#1566) Since `cast_` is a grammar builtin whose return type is given by its second argument it is easy to forget inferring the types of the first argument and its children. This surfaced in icon4py and is fixed here.
When the new ITIR type inference was introduced this broke the code generation in icon4py. This PR adds a small assert and improves the docstring.
This PR adds the CI/CD configuration for GH200 nodes, while keeping the current configuration for x86_64+CUDA on PizDaint.

Some differences between GH200 (on Todi vCluster) and x86_64+CUDA (on PizDaint):
- CUDA v11.2.2 / CUDA ARCH=60 on PizDaint vs. CUDA v12.4.1 / CUDA ARCH=90 on GH200
- Support for Python 3.8, 3.9, 3.10, 3.11 on the x86_64 Ubuntu base image, while Python 3.8 is not supported on the ARM base image for GH200
- JAX dependency updated from v0.4.13 to v0.4.18 because this is the minimum version available on the ARM base image
- A compiler is allowed to choose whether `char` is signed or unsigned. The Python bindings in GT4Py cartesian rely on the signed representation, which was the default for the compiler on the x86_64 Ubuntu base image. This behavior is not the default on the ARM base image, so we have to enforce it with the flag `-fsigned-char`.
…compiler build (#1552)

Add one more environment variable, GT4PY_EXTRA_LINK_ARGS, for flexible support of various compiler builds (this contribution is not copyrightable as it is a trivial change).
In case a stencil contains an out-of-bounds access, the derivation of the dtype in ITIR embedded can fail with an obscure error like

```
ValueError: DType 'DType(scalar_type=<class 'numpy.object_'>, tensor_shape=())' not supported.
```

This PR adds an assert to catch this earlier and fail more gracefully.