Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add the scope of definition to its constructor #997

Closed
xsebek opened this issue Jan 10, 2023 · 7 comments · Fixed by #1928
Closed

Add the scope of definition to its constructor #997

xsebek opened this issue Jan 10, 2023 · 7 comments · Fixed by #1928
Labels
C-Low Hanging Fruit Ideal issue for new contributors. L-Type system Issues related to the Swarm language type system. S-Nice to have The bug fix or feature would be nice but doesn't currently have much negative impact. T-Editors Involves editor support. Z-Developer Experience This issue seeks to make life easier for developers writing Scenarios or other Swarm code.

Comments

@xsebek
Copy link
Member

xsebek commented Jan 10, 2023

Currently, the types of let and def differ:

data Term
  = SLet Bool Var (Maybe Polytype) Syntax Syntax
  | SDef Bool Var (Maybe Polytype) Syntax
  | -- ...

The last argument of let is code in which the variable can be used, while def:

...binds a variable to a value in subsequent commands.

It would be useful to unify let/def parameters to simplify usage analysis and enable editor features like:

  • showing where the variable was defined OnHover
  • jump to the definition

As is, the code providing those features would have to search the AST for the definition.

Solution

Add Syntax parameter to SDef and initialise it with TConst Noop.

Then do a second pass on the AST and put the scope into SDef:

((a; (def b = move end; c)); d); e ---> a; (def b = move end; c; d; e)

Notes

We restrict the definitions to be used only at the top level, so at worst this can happen:

(def x = 3 end; def y = move end); build {y}  // code provided by @byorgey
@xsebek xsebek added C-Low Hanging Fruit Ideal issue for new contributors. S-Nice to have The bug fix or feature would be nice but doesn't currently have much negative impact. L-Type system Issues related to the Swarm language type system. Z-Developer Experience This issue seeks to make life easier for developers writing Scenarios or other Swarm code. T-Editors Involves editor support. labels Jan 10, 2023
@byorgey
Copy link
Member

byorgey commented Jan 10, 2023

I think I am against making this change. def really is different from let: let defines a name in a local scope and def defines a name in a global scope. For example, the name defined by a let can never be in scope outside the expression containing the let. On the other hand, names defined by def can be in scope (1) in another file, after executing run/import; (2) in subsequent expressions entered at the REPL, after executing a def at the REPL. So including an extra Syntax node in SDef and saying it represents code in which the variable can be used is going to be a lie: it is always possible to use the defined name in more code besides just the code that is to the right of the def in the same expression.

As for showing or jumping to a definition site, I don't think it is that hard. For locally-bound variables, we can keep a Map Var SrcLoc as a function parameter / Reader monad context as we recurse through a term. For globally-bound variables (i.e. def), we just have to keep a separate Map Var SrcLoc in a State monad context. And note that refactoring the code as suggested wouldn't actually make this any easier or get rid of the need for the State monad context, because of the issues mentioned above: variables created by def scope past the end of the expression in which they are defined, so we need to return a Map containing them at the end of processing the expression anyway, which we can't do if we are just passing around a Reader context. This is true even if we're only implementing jump-to-definition, if we want it to work across files with import (which we definitely do).

@xsebek
Copy link
Member Author

xsebek commented Jan 10, 2023

Good points @byorgey. 👍

Just to clarify, my issue is that except for a few shallowly nested defs, they already do kind of “have their scope nested inside”.
Except its not in the Def, but the parent Bind.

To me it seems like a waste to not use the Def directly and get more sensible structure. After all the “future uses” of the def will only be in REPL or once we have import in the file that imports it. So it will make sense for its AST.

@xsebek
Copy link
Member Author

xsebek commented Jan 10, 2023

we need to return a Map containing them at the end of processing the expression anyway

@byorgey what about redefinition though?

(move; def m = make end); m "board"; def m = move end

In this example the result of the processing pipeline that the OnHover uses will look something like this:

Bind
  (Bind
    move
    (Def m make))
  (Bind
    (m "board")
    (Def m move))

If we straighten the Binds which I arranged in this silly way in the example, we get:

Bind
  move
  (Bind
    (Def m make)
    (Bind
      (m "board")
      (Def m move)))

Here the extra Bind is useless and we could move the right parameter to Def:

Bind
  move
  (Def m make
    (Bind
      (m "board")
      (Def m move noop)))

TLDR The result in context is that m is move and the only way to find where the used m came from (m = make) is to go through everything.

I would prefer just recursing down the tree, not doing an in-order traversal to find if any redefinition occurred.

@xsebek
Copy link
Member Author

xsebek commented Jan 10, 2023

Btw. this would be less performant than in-order traversal unless we keep the pipeline result for an unchanged file.

I.e. if the developer stops writing and starts hovering over stuff to figure it out.

@byorgey
Copy link
Member

byorgey commented Jan 10, 2023

I still don't really see the benefit. In theory this might make dealing with name scoping slightly easier (though I am not even sure that's true, because of the need to return information about defined names back to the outside after processing a term). But on the other hand, doing this refactoring would be a ton of work and would introduce some weird special cases we have to deal with. For example, the parser would become more complex, due to the need to "cap off" definitions with dummy terms while parsing them, and then fix them up by reassociating them later. The typechecker and pretty-printer would need refactoring. And I don't even want to think about the complexity of the changes that would be needed for the evaluation engine; dealing with binds and definitions is already fiddly enough. This might be worth it if it could drastically simplify evaluation but I don't think that is the case. It could also lead to some code duplication since essentially now every def would also be acting like a special bind.

I don't understand your point about redefinition. Of course the Map returned at the very end would map m to the last definition, but we would not use that Map to answer queries about specific occurrences of names within the term. That Map would only be used for e.g. answering queries about names contained in some other file that imported this one. To answer queries about names contained within the term, we could do one of two things: (1) we could traverse through the term from left-right while keeping track of a Map for definitions as a current State context and a Map for local variables as a Reader context; once we get to the place with the name we want to query, we just look it up in the current Maps. This is not hard, it works perfectly well even for multiple defs nested on the left side of a bind, and reassociating binds does not really make it any easier. (2) If we don't want to repeatedly traverse the term every time we want to answer an OnHover query, we could do a single traversal and return a Map LocVar LocVar, i.e. a map from variable occurrences to corresponding binding sites. This would solve your issue with m being redefined because different occurrences of m would be in the Map separately. I don't know whether the LSP framework lets us do that (i.e. do a single traversal of the term and then answer queries repeatedly without recomputing every time until the text is edited again) but if so this could work well.

Also, performance is not really an issue here. Remember that we run the entire parsing-typechecking-requirements checking-elaboration pipeline on every keypress at the REPL! As long as we are not doing anything that is O(n^2) or O(2^n) it should be totally fine.

@xsebek
Copy link
Member Author

xsebek commented Jan 11, 2023

Well considering all this it would definitely not be a Low Hanging Fruit. 😄

Thanks for the detailed explanations, they will soon come handy. 👍

@byorgey
Copy link
Member

byorgey commented Jun 14, 2024

Reopening this since it's actually a great idea, I just didn't know how to accomplish it back at the beginning of 2023 --- and I think I have actually accomplished it as part of #1928, so hopefully this will be closed again shortly, but this time actually completed. 😄

@byorgey byorgey reopened this Jun 14, 2024
@mergify mergify bot closed this as completed in 23b5398 Jun 19, 2024
@mergify mergify bot closed this as completed in #1928 Jun 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
C-Low Hanging Fruit Ideal issue for new contributors. L-Type system Issues related to the Swarm language type system. S-Nice to have The bug fix or feature would be nice but doesn't currently have much negative impact. T-Editors Involves editor support. Z-Developer Experience This issue seeks to make life easier for developers writing Scenarios or other Swarm code.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants