Skip to content

[NFCI][SYCL] Rework of DeclBodyCreator/Array Handling in prep of non-Decomposition #2370

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 23 commits into from
Sep 2, 2020

Conversation

erichkeane
Copy link
Contributor

@erichkeane erichkeane commented Aug 25, 2020

The intent of this patch is simply to clean up how arrays work in the
SYCL visitors. The patch unifies the handling of array elements and
normal fields, such that they should be more maintainable.

This required a handful of changes to make integration headers work
a little better.

Work on this DID require quite a bit of a re-imagining of how the
DeclBodyCreator works. There are a number of small refactors to fix
things I found while working that I believe make it easier to maintain
this in the future.

However, the biggest change here is with the init-list-expr creation.
This patch creates them in advance, rather than waiting until we are
done with a struct/array/etc. This allows us to generate them as we go.

The side effect of this is that the InitExprs array now ONLY contains
collection InitListExprs, since it no longer has to assemble them on the
other side. However, now adding a field ends up being really easy,
since appending to the init-list is easier.

MemberExprBases also becomes easier to maintain, since they now only
get added when changing to/from a field or collection.

This is a WIP patch, just for information of Mariya/Elizabeth to see how
things are going.

The intent of this patch is simply to clean up how arrays work in the
SYCL visitors.  This patch already changes the visitor in a way that I
believe is optimal and has what is necessary to finish this.

Integration header generation also seems to be correct.

Work on this DID require quite a bit of a re-imagining of how the
DeclBodyCreator works. There are a number of small refactors to fix
things I found while working that I believe make it easier to maintain
this in the future.

However, the biggest change here is with the init-list-expr creation.
This patch creates them in advance, rather than waiting until we are
done with a struct/array/etc.  This allows us to generate them as we go.

The side effect of this is that the InitExprs array now ONLY contains
collection InitListExprs, since it no longer has to assemble them on the
other side.

TODO List:
There are a number of TODOs in the patch itself that should be worked
out at one point or another.  There are also currently 9 failing lit
tests.  However, I know that the following doesn't work:
- Base Class initialization lists
- Base Class MemberExprBases
- Array MemberExprBases
@erichkeane
Copy link
Contributor Author

Note, this is a draft PR, just enough so that you two can see what I chose to do, and question my design decisions :)

@erichkeane
Copy link
Contributor Author

Also just discovered that arrays of structs isn't working correctly, since it only has the FieldDecl, instead of its actual type for enter-struct. So there IS some work in the visitor left :)

@erichkeane
Copy link
Contributor Author

There, got arrays of structs and special-type base class initialization working right. Last 4 failures are:
Failed Tests (4):
Clang :: CodeGenSYCL/kernel-param-acc-array.cpp
Clang :: CodeGenSYCL/kernel-param-member-acc-array.cpp
Clang :: CodeGenSYCL/stream.cpp
Clang :: SemaSYCL/array-kernel-param.cpp

I'm sure I have a bunch of array problems with the MemberExprBases, since I haven't touched those yet. I know I also have a problem with the 'stream' type since I haven't touched it very much, but this is all I'm going to do for the evening.

I think this test was just wrong, it seemed to be checking that we were
constructing 2 arrays of accessors instead of accessors themselves. This
patch changes those to be just the accessors themselves, which are
inside an array of 2 accessors.
@erichkeane
Copy link
Contributor Author

erichkeane commented Aug 26, 2020

@Fznamznon and @elizabethandrews , can you pay particular attention to
77b12c7 (77b12c7 Fix Array Kernel Param Test)?

I'm reasonably sure that the test was wrong (and that we had the wrong behavior!), but I want to make sure since this is a change in behavior.

EDIT: This gets me down to the 3 CodeGenSYCL tests! Obviously cleanup of the patch is still necessary after this, but it seems like I'm getting closer :)

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not had the chance to look at your patch yet. I will look at it today

@erichkeane
Copy link
Contributor Author

I have not had the chance to look at your patch yet. I will look at it today

At the moment, it is an "FYI". I'm pretty close to getting it passing all the the CFE tests, but likely have a few hours of cleanup to do as well (to deal with/think about all the TODOs I have around). This is mostly to make sure this isn't a surprise to you if you'd like to take a look.

The three of us can have a quick call at one point if the two of you want an explaination of what I am doing. @premanandrao is welcome too :)

@erichkeane
Copy link
Contributor Author

erichkeane commented Aug 26, 2020

Alright, all the LIT tests pass for me, we'll see what the buildbots have to say :) Two of the tests for accessors in arrays seem to have changed, but I believe they are correct (and further, I think in at least 1 of them, it was wrong before, as it was accessing index2 of a 2 item array!). I hopefully improved the tests.

Whats left:
I think I've accomplished all the tasks I've listed, so I've still got to do:
1- Review all my TODOs/etc.
2- Fix any test failures.

@erichkeane
Copy link
Contributor Author

Hm... looks like I picked up 3 failures :/ @elizabethandrews or @Fznamznon Any chance you can help me take a look?

@Fznamznon
Copy link
Contributor

Hm... looks like I picked up 3 failures :/ @elizabethandrews or @Fznamznon Any chance you can help me take a look?

Hmm, all failing tests use stream. I'll take a look, but I will be offline in a couple of hours and do not guarantee that I will give something working before I go offline. The first thing I would do is compare llvm-ir for these tests before and after your changes. If your change is NFC change against stream handling it should be the same.

@Fznamznon
Copy link
Contributor

Hm... looks like I picked up 3 failures :/ @elizabethandrews or @Fznamznon Any chance you can help me take a look?

Hmm, all failing tests use stream. I'll take a look, but I will be offline in a couple of hours and do not guarantee that I will give something working before I go offline. The first thing I would do is compare llvm-ir for these tests before and after your changes. If your change is NFC change against stream handling it should be the same.

I know what happened. I see that __init call for stream class is generated before __init calls for accessors which are wrapped by stream. I think it breaks stream because __init of stream actually uses one of internal stream accessors, see

WIOffset = GlobalOffset[1].fetch_add(FlushBufferSize);
(GlobalOffset is accessor).
It would be great to re-implement stream class one time, so we don't have to handle it specially in compiler (see #2268).

@erichkeane
Copy link
Contributor Author

Hm... looks like I picked up 3 failures :/ @elizabethandrews or @Fznamznon Any chance you can help me take a look?

Hmm, all failing tests use stream. I'll take a look, but I will be offline in a couple of hours and do not guarantee that I will give something working before I go offline. The first thing I would do is compare llvm-ir for these tests before and after your changes. If your change is NFC change against stream handling it should be the same.

I know what happened. I see that __init call for stream class is generated before __init calls for accessors which are wrapped by stream. I think it breaks stream because __init of stream actually uses one of internal stream accessors, see

WIOffset = GlobalOffset[1].fetch_add(FlushBufferSize);

(GlobalOffset is accessor).
It would be great to re-implement stream class one time, so we don't have to handle it specially in compiler (see #2268).

Ah, you're awesome! Thanks!

I didn't think that would cause a problem, one thing that DID change was that the 'stream' body handling was moved to be before going down its members. This was to make it more consistent with the other special types/collections, but I didn't even think that this would matter.

This is at least easy to solve, and it is a relief that the failures are all stream releated :)

I'll get a new patch up to fix that later today as I look into some of the TODOs that I listed (of which, there are 5, 3 of which I think I should deal with, the other two are perhaps for you/Elizabeth).

QualType) {
return true;
}
// TODO: Does enter-union need to be worried when it is in an array?!
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it does not. addFieldInit handles cases where it is in an init.

BodyStmts.push_back(InitCall);
}
addBaseInit(BS, Ty, InitializationKind::CreateDefault(SourceLocation()));
// TODO: Needs MemberExprBases entry?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, the 'addBaseInit' handles that properly it seems.


// TODO: This name seems outdated now :)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I split the code that made this true into a separate location, so it is mostly true. Unless someone has a suggestion, I think this TODO is going away.


// TODO: This name seems outdated now :)
void addFieldInit(FieldDecl *FD, QualType Ty, MultiExprArg ParamRef) {
// TODO: Why is this by copy rather than 'forinit' or value init?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy ends up being more generic, the each have specific issues based on what is coming in. We could potentially optimize the init at one point by making this one of the other two, but it would have to be on a per-type basis.

CXXCastPath BasePath;
QualType DerivedTy(RD->getTypeForDecl(), 0);
QualType BaseTy = BS.getType();
// // TODO: Why is this here? Do we think this check could fail?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is actually necessary, it sets up casts correctly.

++StructDepth;
// TODO: Is this right?! I think this only needs to be incremented when we
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This IS wrong thanks to my refactor... Still working through some issues on it, but this is my last issue as far as I know.

Erich Keane added 2 commits August 27, 2020 08:30
It sorta worked before this, just by chance, but it would have broken in
a couple of other cases I can think of.  This makes sure that the
integration-header code is array-aware.
@erichkeane
Copy link
Contributor Author

Great, now I seem to have broken arrays :/ Probably my int-header issue, at least the array-kernel-param-run.cpp actually PRINTS stuff of value!

Erich Keane added 2 commits August 27, 2020 09:46
…sn't here."

Temporarily rever the int-header changes to see if this fixes the
validation machine issues I see.

This reverts commit 90ae330.

Conflicts:
	clang/lib/Sema/SemaSYCL.cpp
@Fznamznon
Copy link
Contributor

I have it fixed locally, but with as many problems as we've had with streams I'm trying to write a fairly comprehensive test here.

It can fix #1553

@Fznamznon
Copy link
Contributor

/summary:run

@erichkeane
Copy link
Contributor Author

I have it fixed locally, but with as many problems as we've had with streams I'm trying to write a fairly comprehensive test here.

It can fix #1553

Well, it at least fixes the AST part of that problem :) There is probably a number of other tests that could be written, but the coverage for how streams are generated into AST boht in/out of arrays and structs have been done.

@erichkeane
Copy link
Contributor Author

@Fznamznon : This is ready for review! The test mentioned above is in
7ed04d7

Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This overall looks better than it was before, but still requires some time to understand what is going on.
In general I'm fine with changes. There are a lot of code-style problems, but I'm not sure if I should go and create a comment for each one. In addition I'm not sure that stream testing should go together with this patch.
Please note that I ran comprehensive testing and I prefer waiting for results of it before merge, because it is significant refactoring and I want to make sure that we don't revert it after commit.

@Fznamznon
Copy link
Contributor

BTW, since it is a preparation for non-decomposition, I'd like to hear a feedback from @elizabethandrews as well.

@erichkeane
Copy link
Contributor Author

This overall looks better than it was before, but still requires some time to understand what is going on.
In general I'm fine with changes. There are a lot of code-style problems, but I'm not sure if I should go and create a comment for each one. In addition I'm not sure that stream testing should go together with this patch.
Please note that I ran comprehensive testing and I prefer waiting for results of it before merge, because it is significant refactoring and I want to make sure that we don't revert it after commit.

Unfortunately, this ended up being one of those patches that required changing a lot of things in order to set everything right. I'd offered above to go through some of the changes with you, @elizabethandrews and @premanandrao if you'd like!

However, mostly, the DeclBodyCreator was switched to creating the InitListExpr upon ENTERING a collection, rather than trying to set it up afterwards. Array init-list-exprs didn't like being updated as we went, and was difficult to keep track of (hence the massive amount of hackery around the MemberBaseExprs).

Everything else falls out of that :) Part of THAT was unifying the order of operations in a few places (making sure handleSyclStream happened before its elements).

@Fznamznon
Copy link
Contributor

I'd offered above to go through some of the changes with you, @elizabethandrews and @premanandrao if you'd like!

I don't mind, that would be helpful.

@erichkeane
Copy link
Contributor Author

Ok, I should now have the streams test with the additional accessor init calls, plus fixed all the wrong function names that I added. I didn't see any variable issues/etc, but please let me know what else you see!

Copy link
Contributor

@Fznamznon Fznamznon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to leave nits that are outside of your changes as is.

@Fznamznon
Copy link
Contributor

/summary:run

Fznamznon
Fznamznon previously approved these changes Aug 31, 2020
Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks ok to me except for few nits and some questions

// Contains a count of how many containers we're in. This is used by the
// pointer-struct-wrapping code to ensure that we don't try to wrap
// non-top-level pointers.
uint64_t ContainerDepth = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is called StructDepth in other classes. I think we should keep it consistent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works. I originally thought we would care about being in an array (array of pointers), but it doesn't seem we care otherwise.

Copy link
Contributor

@elizabethandrews elizabethandrews left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Thanks for this!

@erichkeane
Copy link
Contributor Author

@pvchupin @bader The jenkins/summary bot seems stuck, it keeps timing out at ~20 hours on a number of patches, so I think it is broken and this patch can go in anyway. Thoughts?

Comment on lines +5 to +9
template <typename Name, typename Func>
__attribute__((sycl_kernel)) void kernel(const Func &kernelFunc) {
kernelFunc();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can use single_task from included header instead.

@@ -0,0 +1,1273 @@
// RUN: %clang_cc1 -S -I %S/Inputs -fsycl -fsycl-is-device -triple spir64 -ast-dump %s | FileCheck %s

#include <sycl.hpp>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like other tests use following pattern:

#include "Inputs/header.hpp"

instead of

// RUN: %clang_cc1 -I %S/Inputs ...
#include "header.hpp"

I suggest we refactor our SYCL LIT tests to follow the patter w/o additional compiler options.
I can create a separate PR, if there are no objections.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a problem with it. Getting them all the same would be nice. In this case, I ended up copy/pasting the cc1 invocation/include from another file, so there's definitely more like it that have the same problem.

@bader
Copy link
Contributor

bader commented Sep 2, 2020

@pvchupin @bader The jenkins/summary bot seems stuck, it keeps timing out at ~20 hours on a number of patches, so I think it is broken and this patch can go in anyway. Thoughts?

I'm okay, considering that this is supposed to be non-functional change.

@erichkeane
Copy link
Contributor Author

FYI, looks like the tests finally passed!

@bader bader merged commit 7e1b4b5 into intel:sycl Sep 2, 2020
Chenyang-L pushed a commit that referenced this pull request Feb 18, 2025
[L0] Check that program is in exe state in urProgramGetGlobalVariablePointer
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants