Skip to content

Latest commit

 

History

History
745 lines (536 loc) · 31.3 KB

GTScript-Syntax-Discussion.md

File metadata and controls

745 lines (536 loc) · 31.3 KB

Syntax workshop

hackmd-github-sync-badge

tags: GTScript syntax workshop
  • URL: --
  • Next session: --

[TOC]

:::info

Issue description template

Change/Enhancement/Discussion: Short description

Author: name (institution)

Priority: ✔️ / ➖ / ❌

Description: Here goes the description of the issue ….

Proposals: Here goes the description of optional proposals (by the author or others) to discuss… :::

GTScript Syntax: Open Issues

Computation

Change: Change interval() to vertical()

Author: Johann Dahm (Vulcan), Enrique G. P. (CSCS)

Priority:

Description: The current gtscript language specifies vertical intervals using interval() with-blocks. This is intuitive when only vertical intervals are created, however with the addition of horizontal iteration spaces it may make sense to make this consistent and name one vertical() and the other horizontal().

# proposal 1
with computation(FORWARD):
    with vertical(...):
        field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

with computation(FORWARD):
    with vertical(0, -1):
        with horizontal(region[0:1, 0:2]):
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

# proposal 2
field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]
field_e = field_b[-2, 0, 0] + field_b[1, 0, 0]

with vertical(BACKWARD): 
    with interval(0, -1):
        field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

with vertical(BACKWARD): 
    with interval(0, -1):
        with horizontal(region[0:1, 0:2]):
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

# current
with computation():  
    with interval(0, -1):
        with parallel(region[0:1, 0:2]):
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]
    

Proposals:

Use two different keywords for parallel (vertical_stencil) and non-parallel (vertical_iteration) computations, and make interval() to accept slices of axes. So that:

with computation():  
    with interval(0, -1):
        with parallel(region[0:1, 0:2]):  # optionally
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

would be transformed to:

with vertical_stencil():  
    with interval(K[0:-1]):
        with interval(I[0:1], J[0:2]):  # optionally
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

and a FORWARD/BACKWARD computation:

with computation(FORWARD):  
    with interval(0, -1):
        with parallel(region[0:1, 0:2]):  # optionally
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

would be

with vertical_iteration(FORWARD):  
    with interval(K[0:-1]):
        with horizontal(interval(I[0:1], J[0:2]):  # optionally
            field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]

This could also help to explain some constraints inside non-sequential/stencil computations, like forbidden off-center writes, etc.

Decision [2020/09/29]: postpone for now

Enhancement: Definition of horizontal regions and named horizontal regions

Author: Johann Dahm (Vulcan), Hannes Vogt (CSCS), Enrique G. P. (CSCS)

Priority:

Description: discuss GDP about horizontal region definitions and optional syntax to use custom names in the definition of the regions.

Proposals:

GDP #2 was introduced to allow motifs found in the FV3 dynamical core.

Example usage:

def stencil(in_f: gtscript.Field[float], out_f: gtscript.Field[float]):
    from __splitters__ import i0, j0

    with computation(PARALLEL), interval(...):
        out_f = in_f + 1.0
        with parallel(region[i0, :], region[:, j0]):
            tmp_f = 2 * in_f
            out_f = tmp_f
            
stencil(in_f, out_f, splitters={"i0": 0, "j0": 1})

This GDP adds a feature allowing computation on horizontal region[...] relative to "splitters" which are integers passed at run-time (currently using a dict) in the call without separate stencils. The splitter values in the call are relative to the compute domain origin.

At the implementation level, this currently generates an if condition for those stages. We currently disable extent calculation on these stages to avoid false positive error checks at stencil call time.

  1. Add method of specifying stencils
  2. Change parallel() to horizontal() to make this consistent with proposal to change interval() to vertical().

Decision [2020/09/29]: postpone for now

Enhancement: Stencils calling stencils

Author: Johann Dahm (Vulcan)

Priority:

Description: Stencils should be allowed to call other stencils outside computations.

Proposals: Another stencil should be allowed to be called from outside computations, and inlined during compilation. Will need to ignore domain and origin arguments if they exist.

@gtscript.stencil(...)
def stencil(in_f: Field):
  another_stencil(...)
  with computation(...):
    ...

This may require a significant rewrite of the gtscript frontend parsing method.

Decision [2020/09/29]: create a GDP to properly define, discuss and implement this feature

Enhancement: For loops

Author: Johann Dahm (Vulcan)

Priority:

Description: Add Python-style for loops over iterables, ranges or other Iterable objects.

Proposals:

  1. For loop on a set of statements
@gtscript.stencil(...)
def stencil(in_f: gtscript.Field[float]):
    for computation(PARALLEL), interval(...):
        for n in range(num_n):
            ...

n becomes a parameter inside the block, and can be used in computations, but not as an index.

  1. For loop outside computations
@gtscript.stencil(...)
def stencil(field: Field[], *tracer: gtscript.Field[[I,J,K], Tuple[float, float, float]]):
    
    tracer = field[-1,1,1] * tracer[0,1,0]
    
    for n in range(num_n):
        for computation(PARALLEL), interval(...):
            in_f[0, 0, 0][n] = ...

This is a useful pattern for applying advection to a set of tracers.

Open Questions:

  • If allowing this, should num_n be only allowed to be a compile-time constant?
  • Type signature Tuple[float, float, float] seems limiting, could syntax like Tuple[float]*num_n be supported?

Decision: create two GDPs to properly define, discuss and implement both features and try to understand all the use cases first.


Grid and domain

Discussion: Definition of domain and extented domain concepts

Author: Enrique G. P. (CSCS), Hannes Vogt (CSCS)

Priority:

Description: discuss/review formal definitions of domain-related concepts. WIP

Proposals: ...

Discussion: Use of grid objects in cartesian grid definitions

Author: Till Ehrengruber (CSCS), Enrique G. P. (CSCS)

Priority:

Description: discuss if a explicit grid object should be declared as a parameter of the definition function even for cartesian grids. WIP

Proposals: ...

Discussion/Enhancement: Syntax to define objects in cartesian grid definitions

Author: Enrique G. P. (CSCS)

Priority:

Description: discuss syntax to define cartesian grids, optionally including axis splitter definitions. WIP

Proposals: ...

GTScript Syntax: Discussed Issues [2020/09/29]

Field indexing

Discussion/Enhancement: Support for accessing fields with absolute and indirect indices

Author: Rhea George (Vulcan)

Priority:

Description:

Discuss if support for indirect and absolute indices should be added and for which axes (only K or also I,J). Indirect indices means that run-time variables can be used as field indices. Absolute indices means that index values are not longer interpreted as relative offsets to the current iteration point but as an absolute reference to an element of the grid.

Examples/use cases:

  • Vulcan physics: ...
def ...

Proposals: it probably makes sense to restrict the discussion to the K axis, which is most common use case. Interesting points to discuss:

  • Review use-cases and discuss the currently viable alternatives
  • Implementation strategies for absolute/indirect access: complexity and performance penalty
  • Strategies in and consequences for the analysis toolchain

Decision: ❌ This proposal has been withdrawn.

Discussion/Enhancement: Support for vertical off-center writes

Author: Rhea George (Vulcan), Oliver Fuhrer (Vulcan)

Priority:

Description: some column based atmospheric science algorithms are intuitive to scientists to write in such a way that adjacent vertical layers are modified in conjunction. For example, if you take mass from the layer you are at and add it to the one below you, it is intuitive to write:

with computation(BACKWARD), interval(1, None):
        if mass[0, 0, 0] > 0:
             delta = x * y
             mass[0, 0, 0]  = mass[0, 0, 0]  - delta
             mass[0, 0, -1] = mass[0, 0, -1] + delta 

Without off-center writes, we need to write it something like

with computation(BACKWARD):
       with interval(-1, None):
               delta = x * y
               if mass > 0:
                     mass  = mass  - delta
        with interval(1, -1):
               delta = x * y
               if mass[0, 0, 1] > 0:
                     mass = mass + delta[0, 0, 1]
               if mass > 0:
                     mass  = mass  - delta
        with interval(0, 1):
               if mass[0, 0, 1] > 0:
                     mass = mass + delta[0, 0, 1]

This is not terrible, and makes sense with this small example, but it quickly can get confusing with more complex patterns. The special handling at the top and bottom can be easy to forget to do. Here is a sample we ported:

Python loop version

for j in range(js, je ):
    for k in range(1, nz - 1):
        for i in range(is_, ie):
            if qv[i, j, k] < 0 and qv[i, j, k - 1] > 0.0:
                dq = min(
                    -qv[i, j, k] * dp[i, j, k], qv[i, j, k - 1] * dp[i, j, k - 1]
                )
                qv[i, j, k - 1] -= dq / dp[i, j, k - 1]
                qv[i, j, k] += dq / dp[i, j, k]
            if qv[i, j, k] < 0.0:
                qv[i, j, k + 1] += qv[i, j, k] * dp[i, j, k] / dp[i, j, k + 1]
                qv[i, j, k] = 0.0

Stencil version avoiding offcenter writes-introduce fields upper_fix and lower_fix:

@utils.stencil()
def fix_water_vapor_down(qv: Field, dp: Field, upper_fix: Field, lower_fix: Field):
    with computation(PARALLEL):
        with interval(1, 2):
            if qv[0, 0, -1] < 0:
                qv = qv + qv[0, 0, -1] * dp[0, 0, -1] / dp
        with interval(0, 1):
            qv = qv if qv >= 0 else 0
    with computation(FORWARD), interval(1, -1):
        dq = qv[0, 0, -1] * dp[0, 0, -1]
        if lower_fix[0, 0, -1] != 0:
            qv = qv + lower_fix[0, 0, -1] / dp
        if (qv < 0) and (qv[0, 0, -1] > 0):
            dq = dq if dq < -qv * dp else -qv * dp
            upper_fix = dq
            qv = qv + dq / dp
        if qv < 0:
            lower_fix = qv * dp
            qv = 0
    with computation(PARALLEL), interval(0, -2):
        if upper_fix[0, 0, 1] != 0:
            qv = qv - upper_fix[0, 0, 1] / dp
    with computation(PARALLEL), interval(-1, None):
        if lower_fix[0, 0, -1] > 0:
            qv = qv + lower_fix / dp
        upper_fix = qv

In the stencil version, it's hard to tell what's going on. The examples can get more complex. I think that unrestricted off-center writes might create a lot of problems, but this adjacent vertical adjustments pattern comes up a few times in the FV3 dycore and physics.

Proposals: off-center writes could only be allowed in the vertical direction and only for sequential/non-parallel computations with any additional restrictions that'd need to be made.

Decision: off-center writes should only be allowed with the following restrictions:

  • in the vertical direction
  • in non parallel computations
  • relative offsets (in any direction)

Open question: check if all cases can be converted to the current parallel model with workarounds

Issue: GridTools/gt4py#200


Computation

Change: Remove PARALLEL keyword as iteration order

Author: Enrique G. P. (CSCS), Hannes Vogt (CSCS)
Priority:

Description: the PARALLEL keyword used as vertical iteration order in the computation definition is often misunderstood by users. We should find a cleaner way to specify stencils without a FORWARD or BACKWARD iteration order.

Proposals: remove PARALLEL and define computation() as:

def computation(iteration_order: Optional[IterationOrder] = None):
    pass

so it can be used with an empty computation():

with computation():  # Ok
    ...

Decision: it should be removed in the future (maybe with a transition period where is deprecated)

Issue: GridTools/gt4py#205

Change/Discussion: Specification order of vertical intervals

Author: Till Ehrengruber (CSCS), Enrique G. P. (CSCS)

Priority:

Description: currently vertical intervals can be specified in any order inside a computation(), but they will be executed in the order defined by the iteration_order parameter, which means that reading GTScript code from top to bottom does not always reflect the execution order.

with computation(BACKWARD):
  with interval(0, 1): # executed last
    ...
  with interval(-1, None): # executed first
    ...
  with interval(1, -1): # executed "inbetween"
    ...

Proposals:

  • Allow the user to specify in order of preference and reorder automatically (current behaviour in GT4Py)
    • ➕ Boundary cases can be grouped together. Might be closer to how you think and develop the underlying algorithm
  • Force the user to specify in order of execution
    • ➕ Comprehension of the execution order and hence the result of the computation is directly apparent
  • The specification order should match execution order when is a sequential computation, and can be any other order when is "parallel"

Issue: GridTools/gt4py#201

Discussion/Enhancement: Expose the iteration index (positional computations)

Author: Till Ehrengruber (CSCS), Enrique G. P. (CSCS)

Priority:

Description: Users may want to access the current iteration index in the computation and use it as a numerical value. Since a stencil program will normally contain several iteration loops with different bounds (as part of the extended computation domain), if GT4Py exposes the iteration index, it should be clearly defined how the exposed index maps to the computation domain, and possible what are the possible consequences/interactions with the stage merging step in the analysis phase.

Proposals: The syntax could be something like the following approach, but it must be always clear to the user which iteration loop is being uexposed:

with computation(), interval(...):
    field_a = field_b[-2, 0, 0] + field_b[1, 0, 0]
    
    I = index(I)
    field = field[idx, idx, idx]
    
    with axes(I, J, K) as i_index, j, k:  # option 2
        tmp = 3 * i_index * index(I)
        field_c = 3 * i_indextmp * (field_a[-1, -1, 0] - field_a[1, 1, 0])
        
    out = field_c[I-1]
    ...

    with axes(I, J, K) as i_index, j, k:  
        tmp = 3 * i_index
        field_c = 3 * i_indextmp * (field_a[-1, -1, 0] - 

Example: Cloud physics (TBD @tobias)

https://github.com/ofuhrer/HPC4WC/blob/55c9c9242ed6bc7810541725c2cfcf0dfa655aec/projects2020/group05/shalconv/kernels/stencils_part2.py#L101

https://github.com/ofuhrer/HPC4WC/blob/55c9c9242ed6bc7810541725c2cfcf0dfa655aec/projects2020/group05/shalconv/kernels/stencils_part2.py#L196

Should have the same restrictions as for loop variable (see discussion below).

Decision: Syntax: use index(AXES_NAME) to get the actual iteration index
Functionality:

  • only as a run time int value in an expression
  • (Not as an index in field)

GTScript Syntax: Discussed Issues [2020/08/29]

Data types

Discussion: Definition of supported data types in GTScript

Author: Enrique G. P. (CSCS)

Priority:

Description: there has not been discussion about the supported data types in GTScript stencils. This is not a big issue for Storage objects since they can deal with NumPy dtypes transparently, but a clear definition of which ones are supported would be helpful for the users and needed anyway for analysis and code generation tasks.

Proposal:

  • Promote the list of the currently supported data types in the IR definition as the standard data types for GTScript (bool, int8, int16, int32, int64, float32, float64) and define a list of data types which might be interesting to add (or remove) in the future.
  • (Optional) Interaction with Dusk/Dawn toolchain?

Decision: Approved. Interesting data types to be added or discussed in the future are unsigned and complex types with different precision.

Enhancement: Definition of typed scalars (externals and literals)

Author: Enrique G. P. (CSCS)

Priority:

Description: data types are currently specified (both for fields and scalar parameters) in the type annotations of the definition function:

@gtscript.stencil(...)
def my_stencil(field_a: Field[float], field_b: Field[float], value: float):
    pass

The content of each annotation sets argument's data type following these rules:

  • if the annotation is a Numpy dtype-like instance, it will be used as it is
  • if the annotation is a builtin numeric type (float, int) it will be first transformed to a NumPy dtype using NumPy rules (which means: float -> np.float64 and int most likely np.int64)
  • if the type annotation is a string, the actual data type will be the value of the string in the dtypes mapping provided as argument in the stencil decorator call. Example:
@gtscript.stencil(..., dtypes={"small_float": np.float32, "large_float": np.float64})
def my_stencil(field_a: Field["small_float"], field_b: Field["large_float"], value: np.float64):
    pass

Different approaches can be mixed freely in the same definition signature, and data type aliases are allowed, therefore the following example is also valid:

my_float_type = np.float64

@gtscript.stencil(..., dtypes={"my_float": np.float32})
def my_stencil(field_a: Field[my_float_type], field_a: Field["my_float"], value: float):
    pass

These options allow the user to write "generic" stencil definitions where the actual data types are choosen at compile time (the stencil decorator call). However, since there is no way to define a scalar literal (or a external constant) with a specific data type in GTScript, applying these rules to scalar values means that literals are always promoted to int64 or float64 data types in the current implementation. This hidden type promotion can lead to generated code with unneccessary type casting operations and numerical issues, since some operations might be executed at the wrong precision.

Proposal: extend the syntax to specify the data types of all the values used in GTScript code.

Example:

ALPHA = 4.5

@gtscript.stencil(backend=backend, externals={"ALPHA": np.float32(ALPHA)})
def my_stencil(...):
    pass
  • For literals, different non-exclusive strategies:
    1. Inline creation of typed literals (so it can be used inside expressions): float32(4.5)
    2. Provide a cast_to() operator for dtypes defined with strings: field_b = field_a + cast_to("small_float", 43.5)
    3. Type annotations for symbol definitions (it should also work with string annotations for types provided at compilation time): ALPHA: float32 = 4.5, ALPHA: "my_float_type" = 4.5
    4. Use dtypes argument of gt.stencil, with the builtins int and float as keys, to define the actual implementation data types of Python builtin dtypes (for compile-time definition of the actual dtypes, instead of hardcoding them in the definition code):
@gtscript.stencil(dtypes={int: np.int32, float: np.float32})
def my_stencil(...):
  field_a = field_a + 43.5 # => field_a = field_a + float32(43.5)
  1. Use user-defined type aliases as keywords for casting: small_float(43.5)

Decision: As a first step, we will implement the following options: 1, 3, 4, and later decide about 5.

Follow-up questions:

  • Where are type annotations required? Currently not required on functions, but when stencils can call other stencils this will not be parallel.

Field definitions

Discussion: Clarification of the Field concept

Author: Enrique G. P. (CSCS), Till Ehrengruber (CSCS), Hannes Vogt (CSCS)

Priority:

Description: before discussing specific syntatic proposal for the definition of fields in GTScript, it would be useful to clarify the concept of Field and its link with the domain/grid. The main open question is how to classify the dimensions of a field (domain dimensions, data dimensions, ...) and how to map them to the domain.

Proposal: The following table sketches a possible definition of a Field as a mapping from some set of elements in the domain to scalar or vector quantities, and the representation of a similar mapping concept in Python via type hints. Following this idea, the left-hand side of the mapping always represent domain dimensions, and the right-hand side data dimensions, and it is then clear that any high-dimensional field (dims > 3) includes additional data dimensions.

Meaning Math Python/GTScript
Mapping/function $f: X \rightarrow Y$ Callable[[X], Y] ## Mapping[X, Y]
Cartesian 3D $F: I \times J \times K \rightarrow \mathbb{R}$ Field[[I, J, K], float]
Cartesian 2D $F: I \times J \rightarrow \mathbb{Z}$ Field[[I, J], int]
Cartesian 3D (vector field) $F: I \times J \times K \rightarrow \mathbb{R}^3$ Field[[I, J, K], Tuple[float, float]]

This proposal also works for unstructured meshes:

Meaning Math Python/GTScript
Field on vertices $F: V \rightarrow \mathbb{R}$ Field[[Vertex], float]
Field on vertex->edge ("sparse") $F: V \times E^{adj} \rightarrow \mathbb{R} \ v \in V, e \in E^{adj} \iff e \text{ is adj. to } v$ Field[[Vertex, Adjacent[Edge]], float]
Alt: Field on vertex->edge ("local field") $F: V \rightarrow (G: E^{adj} \rightarrow \mathbb{R}) \ v \in V, e \in E^{adj} \iff e \text{ is adj. to } v$ Field[[Vertex], LocalField[[Edge], float]]

Enhancement: Enhanced definition of fields for cartesian fields

Author: Enrique G. P. (CSCS), Till Ehrengruber (CSCS), Hannes Vogt (CSCS)

Priority:

Description: discuss a syntax to support the definition of lower (1d, 2d) dimensional (and optionally higher dimensional) fields for cartesian grids.

Proposal: following the interpretation of the Field concept as a mapping, and since the Python type hint for functions and callables is Callable[[args], result], this is a possible way to define fields in a sufficiently flexible way, while still keeping readiblity (all keywords are assumed to live in the gtscript module). The general formula is: Field[[domain_elements], data_item], with specific adaptations to more complicated cases.

Examples:

  • 3d field on cell centers: Field[[I, J, K], float32]
  • 2d horizontal field on cell centers: Field[[I, J], float32]
  • 1d vertical field on cell centers: Field[[K], float32]
  • 3d field on J cell edges (both):
    • Field[[I, J[LEFT, RIGHT], K], float32]
    • Field[[I, J_INTERFACES, K], float32]
    • Field[[I, J ^ INTERFACES, K], float32]
    • Field[[I, J | INTERFACES, K], float32]
    • All of the previous, but with North, East, South, West
    • Others ...
  • 3d field on J left cell edges:
    • Field[[I, J[LEFT], K], float32]
    • Field[[I, J_LEFT, K], float32]
    • Field[[I, J ^ LEFT, K], float32]
    • Field[[I, J | LEFT, K], float32]
    • Field[[I, J_LEFT_INTERFACE, K], float32]
    • Field[[I, J_INTERFACES[LEFT], K], float32]
    • Avoid horizontal axes altogether and use offsets to reference neighbors (just something to ponder)
    def stencil(f_a : Field[[Edge, K], float32], f_b : Field[[Cell, K], float32])
      with location(Cell), ...:
        # since f_a is a field on edges [-1, 0, 0] refers to the left edge
        #  of the currently considered cell. [1, 0, 0] would be the right edge
        #  and [0, 0, 0] is invalid
        f_a[-1, 0, 0] = f_b[0, 0, 0]
    • Others ...

In the case of high dimensional fields with data dimensions, it should be first discussed if the sizes of the dimensions should be fixed (and thus the sizes are part of the type) or if they can be flexible (and then specified only at compile time or run time).

Examples:

  • 5d field => a 3d field of matrices (M x N) on cell centers:
    (should or could M, N be fixed values?)
    • Field[[I, J, K], Matrix[float32]
    • Field[[I, J, K], Tensor[2, float32]]
    • Field[[I, J, K], Tensor[[M,N], float32]
    • Field[[I, J, K], Matrix[[M, N], float32]]
    • Field[[I, J, K], Vector[N, float32]]
    • Allow both, fixed and dynamically sized: Matrix[float32], SMatrix[[M, N], float32]
    • Others ...
  • 4d field => a 3d field of N-dimensional row-vectors:
    • Field[[I, J, K], Vector[N, float32]]
    • Field[[I, J, K], Tensor[[N], float32]
    • Others ...
      Note: Matrix[1, N] has different meaning.

Decision: Accept trivial extension to field annotations (structured grids) and delay decision on edges and non-scalar data types until later.

Enhancement: Definition of fields for unstructured meshes

Author: Enrique G. P. (CSCS), Till Ehrengruber (CSCS)

Priority:

Description: discuss a syntax to support the definition of fields for unstructured meshes, taking into account sparse dimensions/neighbor chains.

Proposal: following the same proposal of the enhanced field definition for cartesian grids (in order to be compatible with it), a possible way to define fields for unstructured meshes would be:

Examples:

  • field on vertices: Field[[Vertex], float]
  • field on edges: Field[[Edge], float]
  • field on cells: Field[[Cell], float]

Sparse concept for neighborhood chains:

  • field on edges adjacent to cells (sparse concept):

    • Field[[Edge, Adjacent[Vertex]], float]
    • Field[[Cell, Neighbor[Edge]], float]
    • Field[[Cell >> Edge], float]
    • Field[[Cell ^ Edge], float]
    • Field[[Cell @ Edge], float]
    • Others ...
  • field on vertices adjacent to edges adjacent to cells (sparse concept):

    • Field[[Cell, Adjacent[Edge, Vertex]], float]
    • Field[[Cell, Neighbor[Edge, Vertex]], float]
    • Field[[Cell, Adjacent[Edge | Vertex]], float]
    • Field[[Cell, Neighbor[Edge | Vertex]], float]
    • Field[[Cell >> Edge >> Vertex], float]
    • Field[[Cell ^ Edge ^ Vertex], float]
    • Field[[Cell @ Edge @ Vertex], float]
    • Others ...

Local field concept for neighborhood chains:

  • field on edges adjacent to cells (local field concept): Field[[Cell], LocalField[[Edge], float]]
  • field on vertices adjacent to edges adjacent to cells (local field concept):
    • Field[[Cell], LocalField[[Edge, Vertex], float]]
    • Others ...

Decision: More examples and discussions are needed before taking a decision.

Field indexing

Enhancement: Enhance syntax for field indexing in cartesian grids

Author: Enrique G. P. (CSCS), Oliver Fuhrer (Vulcan)

Priority:

Description: index operations with non-3d fields can be quickly become ambiguous or ill-defined, and lack of direction/axes variables end up involving a lot of code duplication repeating exactly the same statements several times, one per axes

Proposal:

  • Support variables containing axes:
dir = I
field[dir + 3]

def gtscript_function(field, dir):
    return field[dir + 3] + 5.0
  • Indexing domain dimensions:
    (examples for field: Field[[I, K], float32] => [I-1, J=, K+1])

    • Map indices to dimensions according to the Field declaration: field[-1, 1]
      • ✔️ succint
      • ❌ error-prone
    • Always specify the 3 domain indices in order: field[-1, False, 1]
      • ✔️ not ambiguous
      • ➖ strict I, J, K ordering
      • ❌ verbose
    • Explicit axes for non-zero offsets in any order: field[K+1, I-1]
      • ✔️ succint
      • ✔️ readable
      • ➖ unusual
    • Others ...
  • Indexing data dimensions:
    (examples for field: Field[[I, K], Vector[2, float32]] => [I-1, J=, K+1])

    • Use __getitem__ of the indexed point: field[K+1 | I-1][1]
    • Use __getitem__ of the field: field[K+1 | I-1, 1]
    • Others ...

Decision:

  • Support variables containing axes: approved
  • Indexing domain dimensions: Explicit axes for non-zero offsets in any order (taking into account collisions with variables declarations and with the syntax for absolute indexing)

Enhancement: Definition of syntax to specify temporary fields dimensionality

Author: Oliver Fuhrer (Vulcan), Enrique G. P. (CSCS)

Priority:

Description: currently GT4Py infers the type of temporaries from the resulting type of the RHS of the assignement to the temporary. For large stencils with many input / output fields of different type (real, integer, boolean), it feels that we are loosing type checking by not being explicit about the type of temporaries.

Also, in operations mixing fields of diferent dimensionality/rank (e.g. 3d and 2d fields), it is not always clear the shape of the output result and it would likely need to be explicitly declared by the user. For example:

c = a[i, j] * b[k]

could imply c[i, j, k] or simply c[i, j].

Data dimensions: if it is allowed to define temporary fields with extra data dimensions, the ambiguities could be harder to solve automatically without explicit user annotations (examples missing).

Proposals:

  • Use type annotations in the declaration of temporary fields:
c: Field[I, J, K] = a[i, j] * b[k]
  • Define that all temporary fields always behave as 3d fields, and implement an analysis step to reduce the actual rank of the temporary in the generated code when this is possible.

Decision: as a first step, we will try to implement an analysis step to optimize the rank of temporaries without requiring the user to provide type annotations, but this decision could be reverted in the future if we find corner cases where it is not possible to select the right shape without additional information provided by the user.