Reset gensym for predictable names and better query caching #3515
With Guido, we found that this change breaks query caching: 8efd8f6
The reason is that this causes the gensym state for the desugaring/typing environment to change across invocations. So, if you write a definition in the IDE, then retract it, and feed it again with, say, one more line at the end, all the internal unique names of the definition change.
So, in this PR what I do instead is to reset the gensym before parsing each declaration, and reset it again before desugaring each decl, and reset it again before typechecking each decl. This results in stable unique indexes for Tm_names and results in better caching. It also produces names with smaller indexes, starting at 0.
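To illustrate why the reset gives stable indexes, here is a minimal OCaml sketch. It is not the F* implementation: `reset_gensym`, `next_id`, `process_decl`, and `indexes_of_last` are hypothetical names, and a declaration is modelled only by the number of binders it opens.

```ocaml
(* Hypothetical, simplified gensym: a global counter with a reset. *)
let ctr = ref 0
let reset_gensym () = ctr := 0
let next_id () = let n = !ctr in incr ctr; n

(* Stand-in for one phase over one declaration: opening [binders]
   binders draws that many fresh indexes. With [reset], the counter
   is reset first, as this PR does before each decl. *)
let process_decl ~reset binders =
  if reset then reset_gensym ();
  List.init binders (fun _ -> next_id ())

(* Process declarations in order (each given by its binder count)
   and return the indexes drawn by the last one. *)
let indexes_of_last ~reset decls =
  reset_gensym ();
  List.fold_left (fun _ b -> process_decl ~reset b) [] decls
```

In this model, with `~reset:true` the last declaration draws the same indexes no matter how many names were consumed before it, which is the property that lets cached queries be reused. With `~reset:false` its indexes shift whenever the preceding work changes.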
There is one catch: although the gensym is mainly used to generate unique names when opening binders, it is also used to produce unique names for anonymous top-level definitions.
E.g., when a user writes
let _ = 0
let _ = 1
The gensym is used to turn this into
let _0 = 0
let _1 = 1
Resetting the gensym conflicts with this. So, when generating unique names for top-level definitions, my branch repeatedly takes names from the gensym until it finds the smallest next name that is not yet used in the top-level environment.
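The retry loop can be sketched roughly as follows, again in OCaml and under simplifying assumptions: the top-level environment is modelled as a plain set of used names, and `fresh_toplevel_name` and `name_anonymous_decls` are hypothetical helpers, not functions from the F* code base.

```ocaml
module StringSet = Set.Make (String)

(* Hypothetical, simplified gensym: a global counter with a reset. *)
let ctr = ref 0
let reset_gensym () = ctr := 0
let next_id () = let n = !ctr in incr ctr; n

(* Keep drawing from the gensym until the candidate name is not
   already bound in the (modelled) top-level environment [used]. *)
let rec fresh_toplevel_name used =
  let candidate = "_" ^ string_of_int (next_id ()) in
  if StringSet.mem candidate used then fresh_toplevel_name used
  else candidate

(* Name [n] anonymous top-level definitions, resetting the gensym
   before each one as the PR does before each decl. *)
let name_anonymous_decls n =
  let rec go used acc k =
    if k = 0 then List.rev acc
    else begin
      reset_gensym ();
      let name = fresh_toplevel_name used in
      go (StringSet.add name used) (name :: acc) (k - 1)
    end
  in
  go StringSet.empty [] n
```

Even though the counter restarts at 0 for every declaration, the loop skips names that are already taken, so successive anonymous definitions still get distinct names.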
Fwiw, another possibility could have been to use two separate gensyms: one for top-level names that is only reset at module boundaries, and another for local names that is reset at each decl.
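For comparison, that alternative could look something like the sketch below. This is only an illustration of the idea, not something implemented in this PR; `make_gensym`, `start_module`, and `start_decl` are made-up names.

```ocaml
(* A gensym modelled as a private counter; returns (fresh, reset). *)
let make_gensym () =
  let ctr = ref 0 in
  let fresh () = let n = !ctr in incr ctr; n in
  let reset () = ctr := 0 in
  (fresh, reset)

(* One counter for top-level names, one for local names. *)
let (fresh_toplevel, reset_toplevel) = make_gensym ()
let (fresh_local, reset_local) = make_gensym ()

(* Top-level names are reset only at module boundaries ... *)
let start_module () = reset_toplevel (); reset_local ()

(* ... while local names are reset before every declaration. *)
let start_decl () = reset_local ()
```

With this split, no retry loop is needed for top-level names, at the cost of maintaining two counters.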