Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove unnecessary array temporaries for arrays-of-arrays #15767

Merged
merged 4 commits into from
Jun 4, 2020

Conversation

mppf
Copy link
Member

@mppf mppf commented Jun 3, 2020

For https://github.com/Cray/chapel-private/issues/1002

PR #15239 inadvertently added array temporaries for arrays-of-arrays in
the chpl__eltTypeFromArrayRuntimeType function.

This lead to out-of-memory failures with performance testing of isx on
some systems.

This PR resolves the problem by adjusting
chpl__eltTypeFromArrayRuntimeType to use a similar strategy to
chpl__domainFromArrayRuntimeType to avoid allocating a temporary array
in case the element type has a runtime type (as with arrays of arrays).
This required extending PRIM_GET_RUNTIME_TYPE_FIELD to allow for the
case that the result should be treated as a type. (For history in this
area, see PR #11140).

In investigating this problem, I noticed that
chpl_incRefCountsForDomainsInArrayEltTypes /
chpl_decRefCountsForDomainsInArrayEltTypes also were allocating
temporary arrays, which is no longer necessary. I adjusted these to call
chpl__domainFromArrayRuntimeType. Once I did that,
chpl_decRefCountsForDomainsInArrayEltTypes stopped working correctly
and so I went further in adjusting it and related code to better match
array deinitialization. (For history in this area, see also commits
6f435f1 and
65a73e2).

Lastly, this PR adds a test that uses the --memMax functionality in
memory tracking to detect future cases where array-of-array
initialization/deinitialization creates sub-array sized temporaries.

Reviewed by @vasslitvinov - thanks!

  • isx out-of-memory failures are resolved
  • primers pass with valgrind and do not leak
  • full local testing

@mppf mppf changed the title Remove array temps elttype Remove unnecessary array temporaries for arrays-of-arrays Jun 3, 2020
@mppf mppf requested a review from vasslitvinov June 4, 2020 00:47
Copy link
Member

@vasslitvinov vasslitvinov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good. See my question.

return __primitive("static typeof", arr.eltType);
}

// does the element type have a runtime component?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to add a case when the element type is a domain?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It already is there, on line 826

@mppf mppf merged commit 4c4b222 into chapel-lang:master Jun 4, 2020
@mppf mppf deleted the remove-array-temps-elttype branch June 4, 2020 12:47
e-kayrakli added a commit that referenced this pull request Jun 22, 2020
Add performance annotations

- String regressions

  Caused by: #15758
  Likely fix: #15870

- Trivial leak

  Caused by: #15713
  Fixed by: #15871

- AoA Forall Intent Improvement

  Caused by: #15767

- Sampler compile time regression

  Likely caused by: #15800

[Reviewed by @ronawho]
mppf added a commit that referenced this pull request Jul 13, 2020
Improvements to runtime type support to prepare for array noinit

This PR makes improvements to runtime type support to prepare for
implementing `noinit` support for arrays in support of issue #15673.

When implementing `noinit` on an array type, it is important that the
compiler be able to create a different initialization pattern for an
array that is no-inited. However the way that arrays are
default-initialized is tied to the runtime type support in the compiler
in a way that makes that difficult. The following comment describes the
previous situation:

```
  // NOTE: the bodies of functions marked with runtime type init fn such as
  // chpl__buildDomainRuntimeType and chpl__buildArrayRuntimeType are replaced
  // by the compiler to just create a record storing the arguments. The body
  // is moved by the compiler to chpl__convertRuntimeTypeToValue.
  // The return type of chpl__build...RuntimeType is what tells the
  // compiler which runtime type it is creating.
```

To implement noinit support for arrays, we would like to add a `param
initElts` argument to the function that default-initializes an array
(`chpl__convertRuntimeTypeToValue`). However, because
`chpl__buildArrayRuntimeType` both constructs the runtime type (when
called `chpl__buildArrayRuntimeType`) and implements default
initialization (when called `chpl__convertRuntimeTypeToValue`), this was
not possible in a straightforward manner. In particular, the new `param
initElts` argument should not apply to constructing the runtime type;
only to creating a value of that type.

In order to overcome that obstacle, this PR changes the strategy taken by
the compiler to handle these runtime type support functions. Now
`chpl__convertRuntimeTypeToValue` functions are implemented directly in
module code (instead of being constructed by the compiler by cloning
`chpl__buildArrayRuntimeType` etc). This required some adjustment to
several functions handling runtime types in functionResolution.cpp. 

This PR adjusts function resolution to populate the fields of the runtime
type right away once the relevant array type is created by
`chpl__buildArrayRuntimeType` etc. It does this by calling a new
function, `adjustRuntimeTypeInitFn`, once a function marked with `pragma
"runtime type init fn"` is resolved. This new function generates the
runtime type record and adds it to the runtime types map. Additionally,
it populates the runtime type record with type and param fields that were
previously skipped in order to keep these details preserved. (They will
be removed in late resolution just like with any other record). As a
result, the primitive `PRIM_GET_RUNTIME_TYPE_FIELD` no longer needs an
argument indicating the field type. The argument indicating the field
type was problematic because it was most naturally computed by default
initialization (in several functions added in #11140 and #15239) but that
could have performance impact (see #15767). Now that
`PRIM_GET_RUNTIME_TYPE_FIELD` no longer needs the field type (since the
compiler knows it during resolution), there is no need for the
problematic code.

Since the runtime type record is created earlier, in
`adjustRuntimeTypeInitFn`, now `buildRuntimeTypeInitFns` just adjusts the
runtime type init functions to return the runtime type. It no longer
generates `chpl__convertRuntimeTypeToValue` by cloning e.g.
`chpl__buildArrayRuntimeType` as `chpl__convertRuntimeTypeToValue` is now
written in module code.

In order to get arrays-of-arrays to work correctly with these changes, it
was also necessary to change how default initialization for variables
with runtime types was handled. Now, default initialization for these
types is handled in `lowerPrimInit` which calls `lowerRuntimeTypeInit`.
At that time, the runtime type records exist but are not yet used. So,
the lowering makes use of `PRIM_GET_RUNTIME_TYPE_FIELD` to gather the
arguments to pass to `chpl__convertRuntimeTypeToValue`. (That, in turn,
relies upon the changes to the creation of runtime type records described
above; including in particular having the runtime type record exist and
have `param` fields at this point).

`insertRuntimeTypeTemps` is renamed to `populateRuntimeTypeMap` to better
reflect what it does. In addition, it now includes a check to skip work
in the event the relevant map entry is already present.

`replaceValuesWithRuntimeTypes` is renamed to
`replaceTypeFormalsWithRuntimeTypes` to better reflect what it does. So
too is `replaceReturnedValuesWithRuntimeTypes` renamed to
`replaceReturnedTypesWithRuntimeTypes`

`replaceRuntimeTypePrims`, now limited to just handling
`PRIM_GET_RUNTIME_TYPE_FIELD`, is moved to `removeRandomPrimitive` in
order to prevent an additional traversal over all calls in the AST.

To reduce the potential for confusion, this PR also renames the fields in
the runtime type records. For the runtime type record for a domain, this
PR renames the distribution field from`d` to `dist`. For the runtime type
record for a sparse domain, this PR renames the parent domain field from
`dom` to `parentDom`.

In order to resolve a problem encountered during development, this PR
adds `chpl_createBytesWithLiteral`, makes it a well-known function, and
calls it in `new_BytesSymbol`.

Here is a summary of the way runtime types are handled in the compiler
after this PR:
* `adjustRuntimeTypeInitFn` is called in `resolveFunction` for `pragma
  "runtime type init fn"` functions such as
  `chpl__buildArrayRuntimeType`. It generates the runtime type record,
  complete with param and type fields, and stores a link to that record
  in `runtimeTypeMap`.
* `lowerPrimInit` calls `lowerRuntimeTypeInit` to change default
  initialization of variables with runtime type into calls to
  `chpl__convertRuntimeTypeToValue`
* `populateRuntimeTypeMap` populates a map from types with runtime type
  to `chpl__convertValueToRuntimeType` functions. These functions are
  implemented in ChapelArray.
* `handleRuntimeTypes` runs late in resolution and takes these steps:
   * run `populateRuntimeTypeMap` a second time
   * `buildRuntimeTypeInitFns` converts `chpl__buildArrayRuntimeType`
     etc. into functions returning a runtime type record
   * `replaceTypeFormalsWithRuntimeTypes` changes `type` formals with
     runtime type into runtime type records
   * `replaceReturnedValuesWithRuntimeTypes` changes returned `type`
     values into runtime type records
   * `replaceRuntimeTypeVariableTypes` changes `type` variables into
     runtime type records
* `removeRandomPrimitive` (called in `pruneResolvedTree`) lowers
  `PRIM_GET_RUNTIME_TYPE_FIELD`.
 
Reviewed by @vasslitvinov - thanks!

- [x] primers pass with valgrind/verify and do not leak
- [x] full local testing
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants