-
Notifications
You must be signed in to change notification settings - Fork 787
Autogenerate boilerplate code for expressions from a single tool #3264
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This reverts commit d389a79.
Oh, @aardappel do you mean still having a separate generation script but having it be written in C++ instead of Python? I'd be fine with that. |
@tlively I suggested that would be an improvement, yes. But still, not having it at all would be even better. |
Thinking more on this, I agree we should do as much as possible in C++ itself. The best I can think of is a combination of autogeneration, as in this PR, plus C macros, for later work, to avoid autogeneration except where actually necessary. That is, I agree it's not good to use autogeneration where C macros can solve things reasonably well. I do not see a way to replace some amount of core autogeneration with either C macros or with C++ templates (at least not without really weird template tricks, that I don't think would be maintainable). It's possible I'm missing something, so I'd be interested to hear concrete ideas! To summarize how I see things: There are places where you just want to "do X for each class", and in that case, C macros are fine. But autogeneration is better when you need to do something different based on the properties of each field of the class. (Again, I know there are ways to do it with macros or templates - but I can't think of a good way that I'd like to maintain.) I also want to stress the simplicity of the autogeneration here. This is orders of magnitude less powerful and less complex than tblgen! Maybe that wasn't clear earlier, but I think it makes a difference. I'll give some examples of what I mean. Note that these examples also represent future work: I've explored further in side branches, and have not had to add significant complexity to the system. And I've run out of new things to autogenerate, so at least for all initial goals, I think this will be very simple. (Speculative later goals may require more complexity, but we can assess them later - maybe those are bad ideas.) Here's what I mean by simplicity, using an example of the output for comparing expressions, here a LocalSet: case Expression::LocalSetId: {
auto* castLeft = left->cast<LocalSet>();
auto* castRight = right->cast<LocalSet>();
if (castLeft->index != castRight->index) {
return false;
}
leftStack.push_back(castLeft->value);
rightStack.push_back(castRight->value);
break;
} It casts the two expressions being compared, then it compares their index, then it pushes the children to the stack. This is pretty much the C++ you would write by hand. So compiler errors, debugging, etc. are really great, all in C++. Most of the time that's all you need to care about, as a binaryen developer. In the rare cases when you need to add or modify an instruction, you need to edit a few lines of Python, something like this: class LocalSet(Expression):
# The index of the local.
index = Index()
# The value to assign to that local.
value = Child() Again, this is really simple! I don't think this is harmed by being in Python. The even-more-rare case is that you want to write a new code autogeneration rule. This will be very very rare! It's entirely possible that after creating the first batch, very few will ever be written. But if you do, then here's a key fragment of the generator for the above code: for key, field in fields.items():
if is_a(field, Child):
operations.append(f'leftStack.push_back(castLeft->{key});')
operations.append(f'rightStack.push_back(castRight->{key});')
elif ... You iterate the fields, then do a sequence of |
To me that's not simple, for reasons mentioned above. I'd much rather deal with even moderately complex C++ templates, since they have the huge advantage of being directly part of the program I am working on. But anyway, this sounds like it is coming down to personal engineering sensibilities. Have we tried what these crazy templates and macros look like? Would be good to have them for comparison, even if just for a single method, and a few node types. I could imagine something like:
Where Of course, there's variadic templates, but I sense they wouldn't be welcome here :) |
Actually I am realizing that I am more arguing for what I think is "right", than any practical need to not have this codegen step. I am sure I will handle it just fine if I ever need to use it, so feel free to forge ahead. |
Edit: Oh, found it :)
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for joining the party late.
For ExpressionDefinitionRenderer
, I'm not sure writing python code for a new instruction is significantly easier than writing similar code in wasm.h / wasm.cpp, but I also understand you intend to use this data structure to generate other parts of code, such as ExpressionComparisonRenderer
.
But I am also a little unsure about approaches like ExpressionComparisonRenderer
too, especially about embedding actual code logic in python code. I also wonder what other parts of the codebase you want to rewrite in this way? I understand why you would prefer it over the existing ExpressionAnalyzer.cpp though.
When I heard about code auto generation I was thinking more in line of.... cases like when you want to add a single SIMD instruction, running some script adds bunch of skeleton template with WASM_UNREACHABLE
in 20 different places so that it is easier to track which part is missing or something, and not the autogeneration of actual programming logic, but maybe I miunderstood what people talked about.
This is just a random vague idea and I'm not sure if it would work, but can we make something akin to ChildIterator
for other fields too, like, FieldIterator<Address>
or something? AFAIK C++ does not offer an easy way for reflection, but I guess we can specialize each Expression
class for each specialized type of FieldIterator
. Then maybe we can compare two expressions in the same manner using ChildIterator
and FieldIterator
..? (Names need some special treatment, but that wouldn't be too hard..?) This kind of efforts wouldn't be worth it if we only intend to use it for comparison, but just in case you plan to use this for other parts too.
I haven't managed to read all the code in detail, but while reading it I just made some random nitpicky comments, and then I also realized I was supposed talk about the high level stuff 😅 But anyway I'll post them...
def is_subclass_of(x, y): | ||
"""Whether x is a subclass of y.""" | ||
|
||
return y in getattr(x, '__bases__', {}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same with issubclass
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could be, yes. I experimented with a few ideas here and preferred to add an indirection layer over raw python, while doing so. But yes, it seems we can just use python stuff.
def is_instance(x, cls): | ||
"""Whether x is an instance of cls.""" | ||
|
||
return x.__class__ == cls |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this the same with isinstance
?
// must be exactly identical. | ||
if (iter != rightNames.end()) { | ||
left = iter->second; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand this existing code very well.. Can you elaborate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We want (block $x)
to compare the same as (block $y)
, that is, to see things as equal if they are isomorphic, ignoring the concrete name choices. So we keep a map of names on the left to names on the right, and then when we see (br $x)
vs (br $y)
, we check if $x
means "the same" as $y
(which in those blocks it would).
I've experimented in side branches with rewriting the following:
Other things in the possible future:
Overall I agree more can be done than I thought using C++ templates and C macros. However, some things just can't, like the core walking and the C and JS APIs, so this will still save work. There is also the performance aspect. I agree this is less important, but it's still a nice benefit of the autogeneration approach. The benefit it gives is that a ChildIterator or FieldIterator uses a vector to store the fields etc., instead of just writing out the actual operations ( It's possible I haven't fully understood your Iterator idea, though, I'm not sure what you meant by specializing there? |
There's a pretty big bar for a change like this, and it looks like some of us are (understandably) skeptical, so I opened #3294 with another option for discussion. |
Where do we have something akin to
I think this logic is not that ugly to be written as is now, and I don't think there are a lot of churn there, given that we only add something there not every time we add a new instruction but only when we add a new instruction category? In theory they can be also written using things like
These also we need handwritten routines only for control flow instructions, and I believe others can be treated similarly to #3290.
Yeah I'm not very sure about this one can be done fully without a script. Overall I'm not very opinionated and if I'm the only person who's little skeptical about this approach I'm happy to go with what most people like. I was mostly just wondering if there would be a cleaner approach than printing code for actual logic (not repetitive boilerplate) from Python code. |
We have the binaryen/src/ir/ExpressionAnalyzer.cpp Line 99 in 8cd3ad2
I'm not sure what you mean? Overall my goal is to avoid writing large amounts of boilerplate like we need to now. I agree there are incremental things like #3290, and I think some of your suggestions are along those lines, and make sense. But also, fundamentally I want us to have a single "place" where we define fields, so that we can derive everything from there. That would avoid bugs and save time. And, I want us to do that without becoming slower (the |
FWIW, I do still prefer the Python script solution over the pure-macro solution as well. I think it will be more readable overall. |
I tried something like this with macros hoping that we can use this as a single source of truth and use macros to generate wasm.h and even C/JS API, but this looks very uglier ;( (I'm NOT suggesting we do this 😂) wasm.h: ...
#define EXPRESSION(id) \
class id : public SpecificExpression<Expression::id##Id> { \
public:
#define FIELD_ALLOC(id, field) \
id(MixedArena& allocator) : field(allocator) {}
#define NO_FIELD_ALLOC(id) \
id() {} \
id(MixedArena& allocator) : id(allocator) {}
#define FIELD(type, name) type name;
#define METHOD(method) method
#define EXPRESSION_END }
#include "wasm-delegations.h"
#undef EXPRESSION
#undef NEED_ALLOC
#undef FIELD
#undef METHOD
... wasm-delegations.h: EXPRESSION(Nop)
NO_FIELD_ALLOC(Nop)
EXPRESSION_END
EXPRESSION(Block)
FIELD_ALLOC(Block, list)
FIELD(Name, name)
FIELD(ExpressionList, list)
METHOD(void finalize(type type_);)
EXPRESSION_END
... Anyway I think I'm OK with Python code generation, but I'm still wondering if we can get away with not generating the logic part of the code if we use some struct ImmediatesBase { // copied from ExpressionAnalyzer.cpp
SmallVector<Name, 1> scopeNames;
SmallVector<Name, 1> nonScopeNames;
SmallVector<int32_t, 3> ints;
SmallVector<Literal, 1> literals;
SmallVector<Type, 1> types;
SmallVector<Index, 1> indexes;
SmallVector<Address, 2> addresses;
operator=(const Immediates& other) { ... }
};
template<class T>
struct Immediates {};
// I think the below can be generated from the same Python data structure
template<> struct Immediates<Nop> : public ImmediateBase {
Immediates(Nop* curr) {}
};
template<> struct Immedates<Call> : public ImmediateBase {
Immediates(Call* curr) {
nonScopeNames.push_back(curr->target);
ints.push_back(curr-isReturn);
}
};
... And then comparing or copying can be handled within a single method in a uniform way within ~30 lines of code... hopefully? This involves I can be mistaken about this approach, and I haven't actually coded and tested this myself yet, so I'm not totally sure if this will work or will be reasonably fast or anything though. |
I think a reasonable compromise here might be to not aim for a single source of truth, but to have two: wasm.h which is handwritten, and a univeral delegation thing - either a C macro as in #3294 or a C++ template if we can find one - to implement things with. The benefit of two places is that wasm.h is still readable (I agree with @aheejin that it's hard to find any nice way to autogenerate wasm.h! At least that isn't a separate script), and also it avoids a separate script (with the downsides of depending on python more in the build system, etc. - some downstream user of binaryen will be annoyed by adding it, I'm sure...). And two places isn't that bad. I do think we shouldn't compromise on performance, though. @aheejin in your Overall I think I am changing my mind in favor of #3294 . It's not as nice as the python, but as mentioned above 2 places as the source of truth seems acceptable, and speed is ok. Overall very few people (maybe only me?) will write the C macros themselves. More commonly people will add or modify an instruction, and it's as easy to add class NewInst(Expression):
field = Name() as it is to add (in addition to wasm.h) case Expression::NewInstId:
DELEGATE_FIELD_NAME(NewInst, field); |
I'm not very concerned about using Python itself; we are already doing it with gen-s-parser.py and it works great. My concern was mostly about generating not only the boilerplate but logic itself from Python, and this concern applies in the same way to #3294, so it is not really related to Python vs. macro stuff. But maybe it is not really avoidable, if we want to avoid using vectors. |
Closing in favor of #3294. That one just uses C macros, which avoids the need for python, but still removes almost all boilerplate. Less risky/controversial. |
This proposes a new way of developing Binaryen: Instead of hand-writing the
boilerplate for expressions - like the logic for walking, comparing, etc. - we
have a single declaration of each expression, and a script that generates
the code.
The goal is to reduce the hand-written boilerplate code, mainly to make it
easier to work on Binaryen - reduce the time to add new instructions, or
change them - and to reduce the risk for bugs in those places. A minor
goal is to also improve the speed of the code.
This PR implements the declaration and emitting in Python. This can't
be done in C++ itself - in theory macros or templates can get close, but
we want a lot more power here than macros provide even in other
languages (AFAIK - but there is lisp...). Fundamentally we want the power to
process the declarations of expressions and to generate something
pretty arbitrary from that, using a full programming language. A possible
future use of this, for example, is to automatically reorder our instructions
in some optimal way, so that it's a single comparison to check if something
is control flow structure or not; or to reorder (reorderable) fields on an
expression; and more complex things are possible.
Example of what this PR does: We declare
call
once,This reads as: Call is an expression. It has a field "operands" which is
a list of child expressions. It has a field "target" which is a name (the
called function). And it has a field "isReturn" which is a bool that has
an initial value.
The declarations are all ported from the existing header, with initial
values and comments and so forth all copied over, so this should be
NFC (hopefully!)
We then run the tool and it autogenerates the C++ header:
And also it autogenerates C++ code to compare calls:
That code shows how this can prevent bugs: the tool won't forget a field
or anything like that.
This PR is just a proof of concept for discussion. It emits just the two things
just shown: the main header declaration of the class, and the logic to
compare things. If we like this, the goal will be to automate all the other
boilerplate as well - walking, hashing, even the C and JS APIs.
Reading the Python, it's not perfect: f-strings would be nice (
f'{foo}'
willreplace
{foo}
with the local varfoo
), but you need to escape curly braces,and we are emitting C++ here... so I avoid f-strings in some places. In
theory JavaScript could be better here, as the formatting there doesn't have
the issue. I don't have a strong feeling myself between the two. Benefits
of python include that it's already used in Binaryen, and we can import and reuse
a little code from
scripts/shared
, but it's not major.This PR passes the test suite as well as fuzzing and emscripten tests, so it
looks stable. As mentioned earlier speed is not a major goal here, but still,
the comparison code is faster as it avoids the stack of immediates we had
before (
perf stat
reports 9% fewer instructions are run, but the wall timeis not much changed, likely because my test was memory bandwidth bound).
This integrates with the existing CMake build system, running the tool
automatically at the start. It's pretty fast, and it only writes out the output if
there are changes, so it doesn't impact build times that I can tell.
Process-wise, the generated C++ is all checked in. It's also pretty
readable (the tool runs clang-format on it for you). So it should be ok in
terms of debugging, for example - no weird stack trace issues like with
macros. You generally don't need to care about how those files were
generated, unless you are adding a new instruction (and then hopefully
it saves you a lot of time!).
Very interested to hear feedback here! Overall, I think I've convinced myself
at least that this is a good idea. While the script code for a single thing is
less readable than the emitted C++ code for it, you do write that script code
once - and not once per instruction. So it's one slightly worse task rather than
many slightly better ones.
One thing I am not happy with in the current code is how
Method()
s aredeclared. Open to suggestions there.