How are macros passed names of declarations to produce? #2093
Comments
This feels off to me as well. I personally would expect, in that case, to be passing something which is purely syntactic, like a string or a symbol. Is there not a sort of option 3 inherent here, which is that the argument is a resolved ast node, from a restricted grammar? That is, I can imagine saying that arguments to macros must be one of:
Yeah, I think this is basically my option 1, I just didn't spell out the special support for other literals. The key question is whether an identifier is resolved or not. If it is, then it's weird to use identifiers as a way to pass in the name of a declaration you'll create because at the point in time that the macro is executed, the name isn't resolved. If it's not resolved, then it breaks the other use cases we have where interpolating an argument expression into generated code should avoid capture.
This seems right to me. Note that allowing unresolved identifiers also seems error-prone to me. It feels to me to be part of the contract of the macro: either "this macro expects to receive a pointer to an identifier which is in scope, and please give me a static error if I typo it" or, "this macro expects to receive a string which will be used to generate a specific new name". Having a macro with the first contract silently accept an unresolved identifier and then just go off the rails feels maybe bad?
Here's a strawman solution: We could support something like "generated declarations". This is a new kind of declaration syntax that statically specifies the name of the declaration, but delegates generating the implementation to a macro. We'd have syntax for the various kinds of declarations: class, function, member, etc. Something like (complete strawman syntax):
class MyWidget = SomeClassMacro();
class C {
  void foo() = @SomeMethodMacro();
}
Here, the important part is that now you aren't passing the name to the macro as an identifier expression argument. Instead, it's written explicitly using a declaration syntax. It's clear to the macro implementation and the human reader that this application creates a declaration with that name instead of being passed one. The macro implementation implicitly knows to create a declaration with that name from the generated code the macro returns. (If the macro also needs to know the name for its own purposes, like creating constructors, it can be provided through an API.) For cases where a macro needs to do some more interesting computation to generate a name, then it can still produce declarations with arbitrary names. If that process needs some kind of parameter, it gets passed to the macro as a string literal. So a macro that produced a class with a given name, but capitalized the name, would do:
@LoudClass("someClass")
library;
What this means then is that arguments passed to macros are either:
I'm not too experienced in this area, but this sounds like a job for
/// Here we can just make an empty class that the macro can normally augment.
@SomeClassMacro()
class MyWidget { }
class C {
  /// Here we declare the method external and the macro can take over.
  @SomeMethodMacro()
  external void foo();
}
I don't think we need to endorse this pattern, but I do think we should support it, if only because support for it will naturally fall out of the general support for We do have a concrete need for the latter - consider for instance mockito:
This needs to be able to resolve Once we add that support, somebody can just as easily use it to take a non-existent Identifier, and generate it (by just grabbing the
We should have both of these, there are good use cases for both. If you want an arbitrary piece of "syntax" you should express that by accepting a parameter of type
It is a reference to something though. It is just a reference to a generated thing, and it will even be clickable through to the declaration (it is no different than any other Identifier pointing at any other generated declaration). The fact that it was generated directly as a result of this macro is a bit interesting (and possibly unexpected) I agree, but I don't think we should go out of our way to block it, unless it causes implementation or specification problems. Macro authors can also choose to use a Symbol or String instead, it's up to them.
Identifiers are not resolved, they are resolvable, and only after phase one. This is a key aspect of how they work. An Identifier passed to a macro constructor is no different than any other Identifier. In phase one macros can only read its name, and thus they could define a new class (or member) with that name. In phase 2 and beyond they can attempt to resolve it (note that we only allow resolving type identifiers today, attempting to resolve a different kind of identifier will fail). If by the time macros are done running any identifiers in the library cannot be resolved, then that is an error, and should be reported as such (there is nothing special about identifiers in macro applications here).
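To make the phase rule above concrete, here is a minimal sketch in plain Dart. It is a toy model under stated assumptions, not the macro API: `Phase`, `Identifier`, `ToyLibrary`, and `resolve` are hypothetical names invented for illustration, and the "library" is just a set of top-level names.

```dart
/// Toy model only: none of these types are the real macro API.
enum Phase { types, declarations, definitions }

class Identifier {
  final String name;
  const Identifier(this.name);
}

class ToyLibrary {
  /// Names currently declared in the library's top-level scope.
  final Set<String> topLevelNames = {};

  /// Resolution is only offered after phase one, and fails for unknown names.
  String resolve(Identifier id, Phase phase) {
    if (phase == Phase.types) {
      throw StateError('Cannot resolve "${id.name}" in phase one');
    }
    if (!topLevelNames.contains(id.name)) {
      throw StateError('Unresolved identifier "${id.name}"');
    }
    return 'top-level declaration ${id.name}';
  }
}

void main() {
  final library = ToyLibrary();
  const argument = Identifier('Foo');

  // Phase one: the macro may only read the name, and may declare a type with it.
  library.topLevelNames.add(argument.name);

  // Phase two and later: the same identifier can now be resolved.
  print(library.resolve(argument, Phase.declarations));
}
```

In this model, an identifier that never gets a matching top-level name stays unresolvable, which corresponds to the ordinary "unresolved identifier" error described above.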
I don't think we get support for this "for free". Right now, the semantics of arguments passed to macro invocations are underspecified and pinning that down will mean either explicitly figuring out how to define valid semantics for this use case, or ruling it out.
I don't think we can support both use cases. Consider:
@MyMacro(foo)
library;
There is no To support case 1 where all arguments are understood to be eventually-resolvable real code, the answer must be "yes". To support case 2 where you can treat arguments as pure syntax, the answer must be "no". I think we have to pick.
Our use cases may have to meet the language in the middle. We don't have to exactly support Mockito's current API and way of doing things, we just need some usable way for Mockito to use macros. We could potentially have Mockito look something like:
@Mock
class MockThisType implements ThisType {}
@Mock
class MockThatType implements ThatType {}
Or maybe:
class MockThisType = @Mock;
@Mock
class MockThatType = @Mock;
(In this latter example, Supporting a lot of use cases is definitely important, but we have to balance that against not contorting the language into something we'll struggle to maintain and users will struggle to understand. Just because we can technically implement something doesn't mean it's a good design. JavaScript specified
My point is that it's not up to them. If the declaration they are generating doesn't end up in the top level scope, then if they try to accept an identifier, they'll end up with an unresolved name compile error after macros run.
Sure, I get how the system works. I think it's entirely reasonable (and necessary) to say that code in a library that has macro applications may have unresolved names before the macro applications have run. But it seems like a big stretch to say that arguments passed to macros themselves may contain unresolved names. Once a piece of code is an actual input to a macro which may introspect over it as a value, then it seems very sketchy to me to have that value possibly depend on the output of other macros or even the macro that it was itself passed to. Even if it's technically possible for us to specify and implement this, I don't think it's a particularly usable feature. One of our goals with macros is that users can read code and for the most part understand what it means. But with what we're talking about here, an identifier passed to a macro could mean very different things, entirely up to the macro's discretion. It seems like a base level of usability is that someone reading a macro application should know which things are inputs and which are outputs. Consider:
@MyMacro(Foo, Bar)
library;
This macro could look up It feels like we've reinvented dynamic scoping, but worse.
Sorry yes I misread option 2 there, we should go with option 1 imo. So not actually "pure syntax", but actually a valid piece of code, resolved where it was written (not where it is interpolated). To the macro author it is an opaque chunk of Code that they can pass to other Code objects. If you want pure syntax, resolved wherever it is interpolated into a Code object, just accept a String.
I was digging around and we do actually have the scope specified for Code objects here. This needs to move (or be re-iterated) in other sections I think. It says:
I think this should also be tweaked to say the scope is the scope of the macro annotation itself - so if it is on a member of a class you could reference static members unqualified. In other words, the scope is the same as if it was a normal annotation.
I don't see this feature (type literals and identifiers as macro constructor parameters) as being something we will regret or something that will be difficult to implement. We already have to support So while I agree we can't support everything, I think this feature has a good cost/benefit tradeoff.
The macro authors do know what they expect to generate. If they aren't going to generate something into the top level scope matching the identifier they are given, then yes they cannot accept an Identifier as the name for that thing. The identifier won't resolve to the correct place. Also if the identifier given is a private name that wouldn't work (the name they generate would be private to the augmentation, and not visible to the library). Those are all good reasons for them not to accept an
What is the distinction between arguments passed in the constructor, versus arguments passed to a method of that macro? To run a macro we pass it unresolved Identifier objects as well.
I think it is concretely very useful to be able to reflect over the types you are given in a macro constructor. This is a common practice in code generators. If it isn't feasible to implement then the implementation teams can push back, but I see no reason to believe it would be any less feasible than providing an Identifier instance to a phase 1 macro.
People can write any manner of bad, unreadable, unusable code and/or APIs. That is and always will be the case. I don't think we should block known useful patterns just because you could technically abuse them in a way users might not like. If you do that, people just won't use your package.
It seems like there are two different types of "arguments" you're talking about here:
class ClassA { int get temp => 0; }
/// There are (at least) two values here the [Mock] macro can see: `"my test class"` and [ClassA].
@Mock(debugName: "my test class")
class ClassB implements ClassA { }
@Mock() // to show what happens with no arguments
class ClassC implements ClassA { }
// ----- generates -----
class ClassB implements ClassA {
  int get temp => 0;
  String toString() => "Mocked instance of my test class"; // name is overridden
}
class ClassC implements ClassA {
  int get temp => 0;
  String toString() => "Mocked instance of ClassA";
}
I think that's an important distinction to make because of the mental model users have when working with macros and code generation. Conceptually, a macro -- as currently proposed -- is a class that modifies a declaration. Slapping a macro on a class, library, function, or variable declaration can modify or add to that item. The macro needs to see what it's modifying or adding to ("augmenting"). But sometimes, the macro needs some more info not found in the source code. In my above example, it's the name I want So it's fair to distinguish between whether identifiers in the macro constructor are resolved versus whether identifiers in the source code are resolved. Since the source code is being modified by the macro, it makes sense for it to be unable to compile until the macro is run. But the macro constructor itself is an instruction telling the macro how to augment the declaration. It wouldn't make much sense for that to contain unresolved code, as that would make the macro's purpose itself ambiguous. If you want some string of text to be injected into the code, it would be natural to use a string literal, which we use for that purpose anyway (the only difference is I/O -- macros output code and
I think we really reduce the original question ( Those objects are really at the core of what allows you to introspect on types (after https://dart-review.googlesource.com/c/sdk/+/231327 lands which is imminent). If we allow them, then there is really no difference between The difference (I believe) is just whether we provide you direct access to those identifiers, when that is what you really want/need. Giving you that access allows you to potentially generate the thing that identifier refers to, but that is really just a side effect, and not something we need to endorse as a pattern.
No identifiers are resolved, you can only resolve them through a separate API, which is only provided to you in the appropriate phase. This is why it is safe to pass an Identifier which resolves to a generated declaration to a macro constructor, and why I see no need to make a distinction. It is no different than putting an identifier which resolves to a generated type on a declaration (i.e., in a type annotation).
See #2094 for my attempt at resolving this (the question of how macros should actually be passed names is still up to them ultimately, but taking an |
The use cases we're talking about for identifier arguments to macros that I know of are:
Do I have that right? If so, here's where I'm at:
If I'm reading the comments right, I think we're in agreement on 2 and 4. I think we can reach agreement on 1 if you're OK with restricting the resolution API to fail on cases like I mention. I believe we all want to ensure that unrelated macro evaluation order isn't user visible, so this is just plugging an unintended hole. So is it just 3 where we have significant disagreement?
The different use-cases for resolved identifiers and unresolved syntax are distinct. This is why when I suggested 'Code' parameters on the prototype repo jakemac53/macro_prototype#26 I suggested that macros could accept regular parameters, Tagged strings could be used to pass syntatic code blocks that highlight properly in the IDE, but are clearly not intended to be resolved in the surrounding context, but rather in the generated context i.e. As far as ordering, I would assume that both of the So I guess my suggestion is to treat the two use-cases separately and make them distinct from each other so that users understand the difference. Macros can introspect on resolvable code (not in the first phase of course), accept basic types / parameters, and finally accept tagged syntax strings, which could just be treated as strings for now, but eventually should support syntax highlighting for syntax prefixed strings. Macro authors could then parse syntax strings into an AST if they need to do more complex manipulation for translation / caching etc, and it would be nice if they could take a raw string from that parsed AST and try to resolve it in later phases, but that would be a stretch goal. |
Some sort of special tagged string which is understood by the IDE to be Dart syntax seems reasonable - I don't think we want it to be literally a We don't have plans at this point to make |
#2094) Attempt to close #2093, and #2092. Related to #2012.
- Adds `Identifier`, `List`, and `Map` as valid parameter types for macro constructors (and thus valid arguments for macro applications).
- List and Map are allowed to have type arguments that are any of the supported types. This allows for `List<Identifier>`, etc.
- Specify the scope for identifiers in macro application arguments better (both bare and in code objects).
- Some other unrelated cleanup (can remove if desired).
  - Fixed up some old links
  - Removed the section on `Fragment` (you can just use `Code` for this now).
See the PR #2094 which closed this issue. The tldr; is: We now allow This pattern should likely be discouraged, because it won't work for private names, and it also might be unexpected/confusing, but we decided it would be more complex to block it than just allow it, but discourage its use. |
Also:
Jake explained to me offline that we don't have to worry about this scenario. Macros can resolve the identifiers passed as arguments, but only in phase 2 or later. So in the phase where top level declarations can be added, no identifier arguments can be resolved. That should ensure that unrelated macro application order is still hidden from users.
Some macros might want to create a declaration whose name is chosen by the caller. The approach we're currently leaning towards is that the macro takes the name as an argument, like:
Here, the `@CreateClass` macro generates a new class whose name is `Foo`, based on the argument passed to the macro. The argument is syntactically just an identifier expression.
When are macro arguments resolved?
That raises the question of what that `Foo` expression means. In most cases, the macro argument expression is treated as an actual expression that when interpolated into generated code produces the result of evaluating that expression. For example:
Here, this contrived macro generates a function body that looks like:
In the example above, it generates this for `f()`:
Where `prefix1` is an import prefix of an import to this same library so that the expression reliably refers to the actual top-level `foo` declaration. This way it doesn't inadvertently instead evaluate to a reference to the local variable `foo` in the body generated by the macro.
In other words, identifiers in macro argument expressions are interpreted as resolved references to actual declarations and not simply meaningless syntactic identifiers that are resolved after they get inserted into generated code.
Macro arguments for names
That interpretation doesn't naturally make sense for the first example. In that example there is no `Foo` declaration that the macro argument can resolve to. It doesn't exist.
The current thinking is that that's OK. The implementation of `@CreateClass` will be careful to access the name of that identifier argument and insert that as bare syntax into the class declaration it's generating. Then the macro generates a class with that name. And now after the macro has run, the identifier expression being passed to `@CreateClass` does exist. Because `@CreateClass` created it. Once the macro is done, now the identifier can be resolved and things like go to definition in an IDE can take you from that macro argument to the generated class.
This feels honestly pretty sketchy to me. I understand that it's consistent with other places in hand-authored code where users can refer to identifiers that won't exist until macros have run. But this feels different because the identifier that refers to a non-existent declaration is passed to the very macro that creates it.
We're passing an argument that isn't meaningful until after the macro receiving it has run. It looks like we're passing a reference to the thing being declared to the macro but we're actually passing a reference to the thing it will declare, which the macro then conveniently conjures into being.
Macro arguments for non-top level names
Let's consider a different case:
Here, the `@GenerateMethod` macro takes an identifier argument. It creates an instance method with that name and declares it in the class. Does this work? Unfortunately, no.
Because the model we have is that identifier arguments to macros are resolved expressions. We may need to defer resolving them until after macros have run (so that the declaration they resolve to exists) but the assumption is that eventually they can be resolved. But that's not the case here. Even after the macro runs, there will be no top level declaration named `foo` that the macro argument expression can resolve to. There's only an instance method, but that name isn't in scope. So after all the macros run, this program will have a compile time error in the macro application because the `foo` argument refers to an unknown declaration.
This implies that you can't pass a name to a macro as an identifier expression unless it produces a top level declaration with that exact name. You'd have to instead do something like:
Or maybe:
An analogous problem is when a macro does produce a top level declaration but the name isn't verbatim identical to the identifier being passed. If the macro was to modify the name in any way (for example, functional_widget capitalizes the name), then using an identifier expression no longer works.
This seems like a footgun to me.
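The footgun can be sketched with a toy check in plain Dart. This is an illustration under simplifying assumptions, not compiler behavior: `argumentResolves` and `capitalize` are hypothetical helpers, and "resolution" is reduced to a lookup in the set of top-level names that exist after macros have run.

```dart
/// Toy check only; the real compiler's scoping rules are far richer.
/// Returns whether an identifier argument finds a top-level declaration.
bool argumentResolves(String argumentName, Set<String> topLevelNames) =>
    topLevelNames.contains(argumentName);

/// Upper-cases the first character, as a name-modifying macro might.
String capitalize(String name) =>
    name.isEmpty ? name : name[0].toUpperCase() + name.substring(1);

void main() {
  // Case 1: the macro declares a top-level class with exactly the given name.
  print(argumentResolves('Foo', {'Foo'})); // true

  // Case 2: the macro declares an instance method `foo` inside class `C`.
  // Only `C` is in the top-level scope, so the argument stays unresolved.
  print(argumentResolves('foo', {'C'})); // false

  // Case 3: the macro capitalizes the name before declaring it.
  print(argumentResolves('someClass', {capitalize('someClass')})); // false
}
```

Only the first case survives; in the other two a string literal argument would sidestep the lookup entirely.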
Two interpretations
Overall, I feel like we are on a very unstable foundation with regard to what the arguments passed to a macro mean. I can see two straightforward models:
1. An argument is a thunk, a deferred wrapped expression that can be injected into generated code and will evaluate to the same thing that the expression would if you were to execute it eagerly where it appears at the macro application site. This is consistent with how metadata annotation arguments work. They are expressions evaluated right there. (Not quite the same, though, since macro arguments don't necessarily have to be const expressions).
   Since macros work at the metalevel, we can't actually eagerly evaluate the expression to a value before passing it to the macro. Instead we pass an object representing the code for an expression that will produce that value if inserted into code where an expression is expected.
2. An argument is pure syntax. It's more or less just an AST node. When it gets interpolated back into generated code, it means whatever it would mean at that location. A macro can introspect over it and do whatever it likes with it (subject to how transparent we want the Code API to be). For example, if it's an identifier, the macro could decide that that identifier means a class name, an instance method name to define, a function to call, or just an arbitrary piece of string to print.
When using identifier expressions as arguments to specify the name of a generated declaration, it feels to me like we are mixing these two together in a way that gives me the heebie-jeebies. It might technically work if a macro is carefully authored such that a given macro argument does behave as it would under both interpretations, but that feels like a brittle boundary.
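As a rough illustration of how the two interpretations diverge, the sketch below renders the same identifier argument into a generated function body both ways. It is plain string manipulation, not the macro `Code` API: `interpolateAsResolved`, `interpolateAsSyntax`, and `generateBody` are hypothetical names, and `prefix1` simply mirrors the import prefix mentioned earlier.

```dart
/// Interpretation 1 (thunk): the argument keeps pointing at the declaration
/// visible at the macro application site, here via a library prefix.
String interpolateAsResolved(String name) => 'prefix1.$name';

/// Interpretation 2 (pure syntax): the argument is pasted in as-is and means
/// whatever that name means at the insertion point.
String interpolateAsSyntax(String name) => name;

/// A generated body that happens to declare its own local `foo`.
String generateBody(String argumentCode) => '''
{
  var foo = "local";
  return $argumentCode;
}''';

void main() {
  // Refers to the top-level `foo`, despite the local variable.
  print(generateBody(interpolateAsResolved('foo')));

  // Captured by the local `foo` declared inside the generated body.
  print(generateBody(interpolateAsSyntax('foo')));
}
```

A macro that uses an identifier argument as the name of a new declaration is effectively using the second rendering, while the rest of the system assumes the first.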
Thoughts, @jakemac53 @srawlins @leafpetersen (or anyone else who wants to chime in)?