Have a way to talk about lvalues in macros that need them #214
As I come back to this idea only a few days later, I'm not immediately sure what an "assignment protocol" such as
Etc. Or, to be precise, I can see how it would be nice in itself to expose this thing from the compiler out into user space (or macro author space) as the But I don't immediately see how it gains us anything compared to just using the assignment operator directly in quasis. Maybe it wasn't obvious before I mapped this out that they'd come down to the same thing? Or maybe I'm missing some big advantage that I used to see? I notice that I am confused. Even
I guess what this means is that, even if we turn out to like the idea of an assignment protocol, it's no longer a blocker for #122, #152 and #203. Which I guess is a good thing. |
Oh!
Yes, the assignment protocol is needed. And no, it wasn't obvious, but to a trained macro author's eye it might well be. Consider again this small non-usage of the assignment protocol:
Now think about what happens when I term this the "Single Evaluation Rule". With it should come a kind of "allergy" that makes a seasoned macro author spot multiple unquotes of the same thing in a macro, and flag them as very likely bugs. (The exception being, of course, if the macro is control-flowy enough to want to multiply evaluate things.) The usual fix to uphold SER is to unquote once and store in a temp variable. That doesn't work with assignments, because we only get an rvalue, which is less than what we need to store something in that location. Hence the need for the assignment protocol. |
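To make the Single Evaluation Rule concrete, here is a minimal sketch in Python rather than 007. The names `CountingIndex`, `increment_naive` and `increment_once` are made up for illustration; the two functions stand in for what a macro expansion might look like when the same argument is unquoted twice versus once.

```python
class CountingIndex:
    """Hypothetical index expression with a visible side effect:
    each evaluation returns the next index (0, 1, 2, ...)."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.calls - 1


def increment_naive(array, index_expr):
    # Models an expansion that unquotes the index expression twice:
    # once for reading and once for writing.
    array[index_expr()] = array[index_expr()] + 1


def increment_once(array, index_expr):
    # Models the usual fix: evaluate once and keep the result in a temporary.
    i = index_expr()
    array[i] = array[i] + 1
```

The temporary works here only because the sketch already knows the target is an array element and can store the index. A macro that receives an arbitrary expression gets just an rvalue back, which is the gap the comment above points at.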
I feel like it's a good idea to draw a parallel with Common Lisp's Note: it's not totally related to the first message, but resonates with that
(defvar *a* 0) ; CL doesn't have toplevel lexical variables, a dynamic one will do.
;; "set q"uoted. This form is more recent (at least for lisp).
(setq *a* 2)
;; This is the "old way"
(setf (symbol-value '*a*) 2)
;; (the oldest way, just for reference purposes)
(set '*a* 2)
;; As CL is a lisp-2, we'll also desugar a defun:
(defun a () 1)
;; is the same as:
(setf (symbol-function 'a) (lambda () 1))
That's because CL (Lisp-2) uses "slots" for symbols. There's a slot for the value, one for the function.
(defvar *xs* (cons 1 nil))
(setf (car *xs*) 2) ; *xs* is now (2)
Defining them is pretty easy:
(defvar *user* (list "John" "Doe"))
(defun (setf name) (new-name list)
  (setf (car list) new-name))
(setf (name *user*) "Jane")
(princ (car *user*)) ; prints Jane
Older versions had
(defsetf car rplaca) ; "replace car" (not exactly correct
Which means we can reveal a trick I mentioned earlier:
(defsetf symbol-value set) ; easily reimplemented
If more control is needed,
I'm not gonna describe It seems there's an even more complex |
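For readers who don't speak Lisp, the short-form `defsetf` idea above (pair an accessor with the function that writes to the same place) can be modeled roughly like this in Python. This is only an analogy under loud assumptions: real `setf` is a macro that works on the form at compile time, whereas this sketch dispatches at run time, and `SETTERS`, `defsetf` and `setf` here are hypothetical Python names, not bindings to any Lisp system.

```python
# Hypothetical registry: accessor function -> function that writes the same place.
SETTERS = {}


def defsetf(accessor, setter):
    """Register `setter` as the way to assign through `accessor`."""
    SETTERS[accessor] = setter


def setf(accessor, target, value):
    """Assign `value` to the place that `accessor(target)` would read."""
    try:
        setter = SETTERS[accessor]
    except KeyError:
        raise TypeError(f"no setter registered for {accessor.__name__}") from None
    setter(target, value)
    return value  # like setf, hand back the assigned value


def car(cell):
    return cell[0]


def rplaca(cell, value):
    cell[0] = value


defsetf(car, rplaca)
```

With that in place, `setf(car, user, "Jane")` on a list `user` replaces its first element, much like the `(setf (car *xs*) 2)` line above.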
Yes, |
Making a mess of things is not that hard:
If someone wants a location to leak out into user code, all they have to do is throw it across the fence. I guess we could look into preventing the assigning of But I think I have a much more important reason it has to be
An example:
Though there's only one call to I don't know if I've mentioned it, but the idea behind |
I'm really eager to see this one happen, but I'm also filled with a certain unease. My driving question is "What code gets generated for this to work?" That is, once the macro and the quasi have done their work, what's the resulting code that's been injected? Preferably, the answer to that question should still make sense if we also imagine 007 running with a backend that doesn't interpret the AST directly, but that instead runs some kind of byte code or machine code. |
Oh! Wait! Postulate a built-in macro
Into code like this:
(Of course that The way
If any of these assumptions fails, an informative error will be thrown at compile time. This is a vastly better idea than the (First I thought of the The macro itself is interesting in its own right, as it is a legitimate use case for modifying (or rather, deriving something new from) a macro argument by pattern-matching its inside. Best of all, it's an excellent representative of a macro because it allows you to write code that feels right, and then under the hood it turns it into code that just works. This is what macros are all about. (Never mind that the problem we're solving was introduced by having macros in the first place.) Here are the use cases from the OP, instead expressed as
Next up, assignment metaops:
And lastly, the dotty assignment op:
|
I see the point of the macro, but since |
But... The thing that's a prevalent risk is interpolating the same Qfragment more than once. When we have "mutating macros", like the use cases in this issue, we're pretty much guaranteed to interpolate twice: once for reading, once for assigning. But the Single Evaluation Rule is more a rule of thumb than an iron-clad thing. See #278 for a counterexample. |
Of course, this issue no longer prescribes an assignment protocol as such. With |
I'm less enthused by I'm suddenly thinking there's a risk that I'm thinking the fundamental mechanism for talking about locations should be something like Parts of what makes this hard to think about is that on a bytecode level, there's usually no first-class support for lvalues/locations/slots. Maybe starting in that end would make things fall in place? |
Any "value access" instruction which typically returns an rvalue (whether it be a lexical access or some kind of indexed access on a container) factors into first getting hold of the value's underlying Location and then calling .read() on it. Each lexical and indexed access opcode could have a corresponding one that gives back a Location. Calling .write() on a location is what the assignment operator/statement code-generates to. (This is a weak argument for it being a statement form, IMHO. See #279.) Whichever solution we arrive at, I would expect it to get the Location once, and then use it for reading and writing as needed. |
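A rough model of that factoring, written in Python rather than as 007 bytecode. The class names `LexicalLocation` and `IndexLocation`, the frame-as-dict representation, and the helper `add_assign` are assumptions made for the sketch; only `.read()` and `.write()` come from the comment above.

```python
from typing import Any, Dict, List


class LexicalLocation:
    """Hypothetical location of a variable stored in a frame dict."""

    def __init__(self, frame: Dict[str, Any], name: str) -> None:
        self.frame = frame
        self.name = name

    def read(self) -> Any:
        return self.frame[self.name]

    def write(self, value: Any) -> None:
        self.frame[self.name] = value


class IndexLocation:
    """Hypothetical location of one element of a container."""

    def __init__(self, container: List[Any], index: int) -> None:
        self.container = container
        self.index = index

    def read(self) -> Any:
        return self.container[self.index]

    def write(self, value: Any) -> None:
        self.container[self.index] = value


def add_assign(location, delta):
    # The location is obtained once by the caller; here it is used
    # for one read and one write, as the comment above suggests.
    new_value = location.read() + delta
    location.write(new_value)
    return new_value


# Usage: the index expression has a side effect, but runs only once.
evaluations = []


def pick_index():
    evaluations.append("pick_index")
    return 1


xs = [10, 20, 30]
result = add_assign(IndexLocation(xs, pick_index()), 5)
```

An ordinary rvalue access would then just be "get the Location, call `.read()`", and a backend would be free to optimize the Location object away whenever it doesn't escape.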
I just thought of a simple solution to this: place
This is so nice and simple. Sometimes I'm shocked by my ability to let red herrings obscure the view of, um, much better herrings.
|
I think I forgot to mention at the time, but the hard-won gist that currently represents our best guess at how macro hygiene will actually be implemented — and see especially the appendix — contains Namely, those variable lookups that are "dislodged" because they need to find something inside the |
I think in the end usage of |
It's possibly worse than that. If we want to have a fair chance at targeting backends other than a dedicated VM, then Hm. Maybe that can be detected in a late-bound fashion? Why does this feel similar in spirit to #388? |
As I start implementing
|
All through this issue I've pretty consistently used Speaking of which, I think I nowadays also favor I wouldn't be super-averse to calling the whole thing |
I keep coming back to this issue. It's an important one to macro authors. There are two related things a macro author might want access to:
I've been dealing with C++ a bit lately, for the first time in years. I've come to the conclusion that We might also consider whether we want to call it I also think there are insights to be had just by studying how C++ handles references. Everything from optimizations to common pitfalls — all of it might apply to and inform our use cases. |
I just came across a page that says FORTRAN can be so fast because it lacks pointers/references. A typical case of "weaker is better". On the other hand, this issue is open (and long) because there's a real need here. Macros need to talk about lvalues, and (seemingly inevitably) they need to be reified so that we can choose not to accidentally dereference them multiple times. |
(Off-topic: will I see you at Riga? There could be some on-site bikeshedding if so) The FORTRAN thing is only due to one thing: no aliasing is guaranteed. C has since then gained a keyword, |
@vendethiel Yes, I heard you're coming too. 😄 It's a few months ago, but we talked about me going already. I have a talk there about 007. #481 (comment) Yes, "no aliasing" makes a lot of sense. I'm still hoping that most cases of lvalues in 007 will be possible to optimize away at compile time. Maybe they'll jump a function boundary or two, but that's all. We shall see. |
Above discussion is already pretty good, and makes most of the main points I wanted to make. We're in a bind here, for the following reason:
All this time, I've been wanting to go "screw it, let's just implement @vendethiel's Design-wise, we're trapped between a rock and a hard place. Looking at the One thing that this comment fails to report is the (absolutely frightening) realization that if you take a location on Tackling this issue sometimes feels like grabbing hold of a small fish in the water, only to realize that it's not a fish, it's the largest mammal on Earth, and now neither letting go nor catching your prey is really an option. |
Heh. Also, this comment correctly points out that #410 hygiene could be implemented using locations/lvalues. Not first-class ones, since they are only needed in a layer beneath the code itself, so to speak. Similar to how some compilers implement assignment to lexical variables by compiling those variables to a 1-element vector (or equivalent). (Since it's all lexical, we know exactly which variables we will need to do that to!) What I realized the other day (because the ROADMAP says so) is that the hygiene/lvalues equivalence cuts both ways: whether we provide first-class lvalues or not, hygienic first-class code objects (produced by macro arguments and (Edit: 😱) |
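The "compile assigned-to lexical variables to a 1-element vector" trick mentioned above can be sketched by hand in Python. This is a hand-written model of what such a compiler might emit, not anything 007 does today; `make_counter` and the inner function names are hypothetical.

```python
def make_counter():
    # The variable `count` is assigned to from an inner function, so a
    # compiler using this trick would store it in a 1-element list ("box")
    # instead of a plain slot. Closures then share the box, not a copy.
    count = [0]

    def increment():
        count[0] = count[0] + 1  # write through the box
        return count[0]

    def current():
        return count[0]  # read through the box

    return increment, current
```

Since the analysis is purely lexical, the compiler knows ahead of time which variables need a box; variables that are never assigned from a nested scope can stay unboxed.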
I'm not sure it's that bad; that is, I don't automatically agree with October-me. Consider:
This macro, when called, would generate code that provides write access to an array (labeled with the variable name Specifically, the It's possible that if we introduced some form of "access path", we'd also be in trouble. Depends how it's done. |
Somewhat in the manner of zombies, my old long-running issue threads have a way of lying dormant for long stretches of time, and then getting up and assaulting me in inconvenient ways. Around the 20-minute mark in this talk, Walid Taha introduces a very simple inlining macro mechanism in ML, with this example code:
And then immediately turns around and says
He instead evolves the added feature to look like this:
And says:
|
Having read this blog post about Oh, uhm, HN discussion. |
So, I was thinking about this. I almost feel ready to open a PR about it.
We're in a nice place with this one, because we already have several "waiting clients": #122, #152 and #203. In fact, let's use those as our example.
This protocol centers around objects of type Location; such an object allows us to save values into variables, array elements, object properties, dictionary values, etc. It has three methods:
loc.assign(value): stores the value in the location.
loc.modify(sub (old) { return ... }): reads the old value in the location, runs the supplied function on it, and stores back the new computed value while also returning it.
loc.postModify(sub (old) { return ... }): reads the old value in the location, runs the supplied function on it, and stores back the new computed value but returning the old value.
The main client of the .assign method is the 007 internals; this will be the canonical way to do assignments.
The .modify method figures in all our three use cases; see below. It's very likely that a macro user who wants to develop similar types of macros will want to go for this one.
The .postModify method is for postfix:<++> and postfix:<-->; hence the name. I know of no other uses; of course it's mostly a convenience since you could always do this yourself with .assign if you wanted. I pondered calling .modify .preModify for symmetry, but that would underplay its central importance in the API. Also, .postModify is the strange one, in an asymmetric way.
Now, let's implement prefix:<++> and postfix:<++>:
Next up, assignment metaops:
And lastly, the dotty assignment op:
Note that lvalue(expr) is something that happens at macro time. It returns an opaque object that you're not supposed to be that interested in, but that can go through the same kind of programification as Qnodes, and come out the other end as a Location. I believe this is the first time we allow something other than Qnodes and None through the programification tunnel; maybe it's the start of an exciting trend.
Anyway, the reason I decided to have lvalue(expr) act at macro time is that it limits somewhat the amount of crazy you can do with this API (which is something that concerned me). In order to use it for anything nontrivial, you basically have to start at macro time, so you have to be a macro writer. Hopefully, that'll sell less of program analyzability down the river.
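The three-method Location protocol described in this post can be modeled in a few lines of Python. This is a sketch under assumptions, not the 007 implementation: the getter/setter constructor, the helper `element_location`, and the Python spelling `post_modify` (for `.postModify`) are all made up here; only the behavior of `assign`, `modify` and `postModify` follows the description above.

```python
class Location:
    """Hypothetical model of the proposed Location type."""

    def __init__(self, getter, setter):
        # Assumption: a location is represented as a getter/setter pair.
        self._get = getter
        self._set = setter

    def assign(self, value):
        """Store the value in the location."""
        self._set(value)

    def modify(self, fn):
        """Read the old value, apply fn, store the result, return the NEW value."""
        new_value = fn(self._get())
        self._set(new_value)
        return new_value

    def post_modify(self, fn):
        """Read the old value, apply fn, store the result, return the OLD value."""
        old_value = self._get()
        self._set(fn(old_value))
        return old_value


def element_location(array, index):
    """Hypothetical helper: a Location for array[index], with index fixed up front."""
    def getter():
        return array[index]

    def setter(value):
        array[index] = value

    return Location(getter, setter)
```

In this model, prefix `++` would correspond to `loc.modify(lambda old: old + 1)` and postfix `++` to `loc.post_modify(lambda old: old + 1)`; the target expression is resolved to a location once, which is what keeps the Single Evaluation Rule intact.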