Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove the nOrig from all over the compiler #162

Merged
merged 1 commit into from
Jan 28, 2022

Conversation

krux02
Copy link
Contributor

@krux02 krux02 commented Jan 16, 2022

nOrig is pretty much only used for macros.callsite which is
deprecated/redundant since Nim has untyped as macro arguments. Since
then it has huge negative value on both developers as well as the
computer because all of this copyTree that is happening to preserve it.

Before this change, callsite returned an AST like it got spit out by
the parser. But overload resolution does modify the ast. arguments get
converted into nkArgList for varargs for example. To actually preserve
the unchanged AST, there is this nOrig parameter all over the
compiler. The idea is, befere a change is applied to the ast, deep copy
the tree and store it as nOrig and pass that down along n. But it
this strict AST preservation isn't tested, so I wouldn't assume that it
actually work.

Now I am passing down the node n instead of nOrig to callsite. So
when the macro expects typed arguments, the corresponding children in
the call node are also sem checked. Varargs are grouped to
nkArgsList.

So callsite in myMacro(a,b) is now often
(nnkCall (ident "mymacro") (nnkArgsList (ident "a") (ident "b")))
instead of (nnkCall (ident "mymacro") (ident "a") (ident "b")) which
causes indices to be wrong and the macro crashes or generates an
invalid ast.

The one thing to preserve callsite for, that can't be done with simply
reading the argument (which should have been done for years now), is to
get line information of the actual macro invocation. But for this it
might be better to just pass the the first child of call, the symbol
of the macro down to callsite.

@saem
Copy link
Collaborator

saem commented Jan 17, 2022

I didn't realize that that was the prime motivating factor for nOrig, that's really good to know in terms of history alone.

I thought that untyped is not entirely a replacement for macros.callsite, as in you can't get quite the same information, such as a macros' instantiation itself. Relatedly, this hints at some insight as to how incremental compilation should work a bit more.

I'm a little nervous about removing it, but removing copies of the AST also means it's potentially easier to introduce a data oriented design AST and staged / incremental compilation. I guess the next thing to figure out is what's breaking and whether we can live with that in the meantime -- I'll take a peek at that in a bit.

@krux02
Copy link
Contributor Author

krux02 commented Jan 17, 2022

I didn't realize that that was the prime motivating factor for nOrig, that's really good to know in terms of history alone.

I thought that untyped is not entirely a replacement for macros.callsite, as in you can't get quite the same information, such as a macros' instantiation itself.

Well you pointed it out, line information from where the macro is called is lost when callsite is removed. Therefore I didn't remove it. But there are simpler ways to preserve line information from the callsite other than to prehandedly deep copy every ast node before applying transformations on it.

In this PR I changed callsite to just return the transformed ast (before macro invocation), this will preserve callsite lineinformation, but breaks other usages of callsite.

Relatedly, this hints at some insight as to how incremental compilation should work a bit more.

Good to know.

I'm a little nervous about removing it, but removing copies of the AST also means it's potentially easier to introduce a data oriented design AST and staged / incremental compilation. I guess the next thing to figure out is what's breaking and whether we can live with that in the meantime -- I'll take a peek at that in a bit.

Well macros.callsite breaks, and I probably introduced a bug because I am not 100% perfect.

Note

Aparently I broke unittest, it uses callsite. What is bothering me here is, callsite is marked as deprecated for years now and why isn't CI glowing like a christmas tree to point to all these usages of deprecated features? The compiler is aparently hiding these warning messages, otherwise somebody would have fixed it by now. And what is getting me uncomfortable even more is, this is just the tip of the iceberg, there are probably thousands of warnigs all over the compiler and standard library, and the compiler is hiding these messages. Something needs to be done about it. It can't be that something gets marked as deprecated and years later its usages are still not touched at all in the compiler.

@saem
Copy link
Collaborator

saem commented Jan 17, 2022

Ouch, ok I didn't think we'd hit such a deep issue on this PR. Yeah, need to figure out how these warnings are being suppressed and see what we want to actually deprecate. 😅

@krux02
Copy link
Contributor Author

krux02 commented Jan 21, 2022

Regarding this PR, there is a serious compiler bug triggered in unittest when callsite is replaced to just use the argument of the macros directly (even though the ast is identical). I am currently refactoring unittest to use fewer underspecified and problematic features of nim.

@haxscramper
Copy link
Collaborator

there are probably thousands of warnigs all over the compiler and standard library, and the compiler is hiding these messages. Something needs to be done about it. It can't be that something gets marked as deprecated and years later its usages are still not touched at all in the compiler.

Now every single message goes through the cli_reporter.reportHook, and only then gets filtered out, so we can check whether a specific message has been actually suppressed, where it was generated, write it out forcefully in some log file (that can be later shown in CI annotations for example) and so on.

@krux02 krux02 force-pushed the no-more-norig branch 5 times, most recently from 96b9c7e to df0bc4c Compare January 25, 2022 20:30
@krux02
Copy link
Contributor Author

krux02 commented Jan 26, 2022

The failing test doesn't fail on my computer. What do I do now?

@krux02 krux02 force-pushed the no-more-norig branch 2 times, most recently from 8867cc7 to 14affcc Compare January 26, 2022 20:39
Copy link
Contributor

@alaviss alaviss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Something that I worry about is that removing callsite means it becomes impossible for macro to create sane line information (that doesn't point to the newNimNode statements). Though I suppose the compiler should be responsible for this instead of users putting hacks in to get something together (we should note this down in an issue or something).

Overall this looks fine, except for a few nits below:

Comment on lines 3 to 6
when not defined(netbsd):
# Ref: https://github.com/nim-lang/Nim/issues/15452 - NetBSD doesn't define an `ip` protocol
doAssert getProtoByName("ip") == 0

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this removed at all?

Copy link
Contributor Author

@krux02 krux02 Jan 27, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test fails at my manjaro machine, and it fails on my Arch linux machine. Aparently it also causes issues at BSD and who knows where else. Maybe it is better to not test for the wrapper that the backend actually supports an API? After all getProtoByName is just a wrapper, not some actual implementation. If the protocol name isn't supported or doesn't return that name, it doesn't mean the wrapper is wrong, does it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds fine to me. Would you mind splitting this into its own PR since it doesn't have much to do with this change? That way I can get it merged first.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I can do it. But I also want you to know that it really costs a lot of energy from me to manage many different pull requests that have dependencies on each other.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you think about making a PR that only updates the tests, but all of them that are required for this PR to work? The tests should have been updated or even partially deleted long time ago. They don't specifically test the callsite() feature, they just use it for historic reasons, nothing else. They don't represent current Nim anymore at all.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds fine, though I think leaving them in here is a bit better for people reading this in the future, as this change have an effect on callsite()'s output and you get to figure out why was callsite purged from tests and what changed all at once.

I'd say that only this one test should be splitted out due to being completely irrelevant to this change.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mild preference for this PR. Just my two cents.

Comment on lines +39 to +40
let arg1 = args[0]
let arg2 = args[1]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why alias instead of a simple n[1] -> args[0] replacement?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually I think it would be best to have arg1 and arg2 directly in the argument list to the macro. For some reasons that I don't really want to dig in, I didn't work out. The test is really old and has many more problems than this. I didn't plan to really refactor this test.

@alaviss
Copy link
Contributor

alaviss commented Jan 27, 2022

Actually there is something that should be described by this PR/commit message: What will happen to callsite with this change?

@krux02
Copy link
Contributor Author

krux02 commented Jan 27, 2022

@alaviss yes I can do that, but shouldn't this be logged in a changelog as well? Or are the commit messages our changelog now?

@saem
Copy link
Collaborator

saem commented Jan 27, 2022

@alaviss yes I can do that, but shouldn't this be logged in a changelog as well? Or are the commit messages our changelog now?

I think we should go with derive a quick and dirty change log from commit messages. Then later for folks interested in better change logs we at least have a nice starting point to clean up into something better. It's one of the intended uses of the commit messages.

Copy link
Collaborator

@saem saem left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So one thing I'm confused about is how callsite got impacted, does it work the same way?

If it does then why do we need to change the tests? If it doesn't, then why not remove it?

@krux02
Copy link
Contributor Author

krux02 commented Jan 27, 2022

@saem Before this change, callsite returned an AST like it got spit out by the parser. But overload resolution does modify the ast. arguments get converted into nkArgList for varargs for example. To actually preserve the unchanged AST, there is this nOrig parameter all over the compiler. The idea is, befere a change is applied to the ast, deep copy the tree and store it as nOrig and pass that down along n. But it this strict AST preservation isn't tested, so I wouldn't assume that it actually work.

Now I am passing down the node n instead of nOrig to callsite. So when the macro expects typed arguments, the corresponding children in the call node are also sem checked. Varargs are grouped to nkArgsList.

So callsite in myMacro(a,b,c) is now often (nnkCall (ident "mymacro") (nnkArgsList (ident "a") (ident "b") (ident "c"))) instead of (nnkCall (ident "mymacro") (ident "a") (ident "b") (ident "c")) which causes indices to be wrong and the macro crashes or generates an invalid ast.

The one thing to preserve callsite for, that can't be done with simply reading the argument (which should have been done for years now), is to get line information of the actual macro invocation. But for this it might be better to just pass the the first child of call, the symbol of the macro down to callsite.

@saem
Copy link
Collaborator

saem commented Jan 28, 2022

@krux02 that was an amazing explanation, I learned a bunch. I totally zoned on the sutbtle item about how overload resolution mutates the ast. Anyhow, it was so good, I hope you don't mind, but I added it to the PR description. Don't worry, I took the time to reformat it to fit into the 72 char line width.

@saem
Copy link
Collaborator

saem commented Jan 28, 2022

@krux02 can you shove it into the commit (straight copy is fine) and then kickoff a bors, it's g2g.

nOrig is pretty much only used for macros.callsite which is
deprecated/redundant since Nim has untyped as macro arguments. Since
then it has huge negative value on both developers as well as the
computer because all of this copyTree that is happening to preserve it.

Before this change, callsite returned an AST like it got spit out by
the parser. But overload resolution does modify the ast. arguments get
converted into nkArgList for varargs for example. To actually preserve
the unchanged AST, there is this nOrig parameter all over the
compiler. The idea is, befere a change is applied to the ast, deep copy
the tree and store it as nOrig and pass that down along n. But it
this strict AST preservation isn't tested, so I wouldn't assume that it
actually work.

Now I am passing down the node n instead of nOrig to callsite. So
when the macro expects typed arguments, the corresponding children in
the call node are also sem checked. Varargs are grouped to
nkArgsList.

So callsite in myMacro(a,b) is now often
(nnkCall (ident "mymacro") (nnkArgsList (ident "a") (ident "b")))
instead of (nnkCall (ident "mymacro") (ident "a") (ident "b")) which
causes indices to be wrong and the macro crashes or generates an
invalid ast.

The one thing to preserve callsite for, that can't be done with simply
reading the argument (which should have been done for years now), is to
get line information of the actual macro invocation. But for this it
might be better to just pass the the first child of call, the symbol
of the macro down to callsite.
Copy link
Contributor

@alaviss alaviss left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

bors r+

@bors
Copy link
Contributor

bors bot commented Jan 28, 2022

Build succeeded:

@bors bors bot merged commit 138a842 into nim-works:devel Jan 28, 2022
@krux02 krux02 deleted the no-more-norig branch January 28, 2022 18:32
@haxscramper haxscramper added this to the Sem phase refactoring milestone Nov 21, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants