Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

beginning incremental compilation for casual review #14009

Closed
wants to merge 77 commits into from

Conversation

disruptek
Copy link
Contributor

No description provided.

@cooldome
Copy link
Member

cooldome commented Apr 18, 2020

Please don't combine renaming and refactoring and new functionality in one PR.
It makes review process a pain and PR review will likely face delays and unnecessary discussions.

Keep these two separate please.
+20 on incremental compilation

@disruptek
Copy link
Contributor Author

I'm not sure on what you're referring to, but, of course you are correct. It just wasn't feasible here. The review is going to be ridiculous; we talked about it and there's no point in splitting hairs.

@disruptek
Copy link
Contributor Author

Sorry, I should have added that this will likely get split into multiple PRs by hand as I go through and minimize it, etc. It has just become unwieldy to keep branching on every little rethink; hence the branch name -- ic2. 😁

compiler/ast.nim Outdated Show resolved Hide resolved
compiler/ast.nim Outdated Show resolved Hide resolved
compiler/ccgexprs.nim Outdated Show resolved Hide resolved
compiler/ast.nim Outdated Show resolved Hide resolved
@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

  • introduction of setRope creates a massive diff by itself (and makes reviewing or rebasing other PRs hard), why not instead add:
    proc `r=`*(a: var TLoc; roap: Rope) {.tags: [TreeWrite].} =...
    which avoids changing all foo.r = expr

testes Show resolved Hide resolved
@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

  • the more you can isolate (while keeping code DRY) inside compiler/ic/ the better, right now there's a massive surface area with existing code that makes reviewing hard, increases chances of regressions and rebase conflicts for your PR (or other PR if they rebase to yours); for example instead of inlining IC specific code inside existing procs, you could instead refactor to call a helper proc inside compiler/ic/, thus reducing surface with existing code.

  • I feel like a compiler server approach is an alternative approach worth exploring for incremental compilation. Better performance, much easier (and shorter) implementation, no need for serialization/deserialization, everything is in memory.

@disruptek
Copy link
Contributor Author

You're not wrong; that's why I moved everything into ic yesterday.

A lot of the surface will be reduced as more of the AST mutation is eliminated; the goal is for the frontend to be immutable. So, all the location stuff will likely go away. I wanna aim for like 3-5 calls in the API, but that's probably my naivety talking.

The plan is to make an intermediate AST format that is used specifically by the backend. It will basically constitute a format trivially translated to backend modules; it can be geared to a much simpler set of semantics. We will rehydrate the frontend AST for macro passes (if necessary) but otherwise leave it cached. The backend AST could be cached, too, but it seems unlikely that it will ever be expensive to evaluate, and less complexity equals fewer bugs.

I think a compiler server is pretty easy once we get the AST locked down, but it's more likely that we'll want less in memory and not more. As the abstractions grow, we want to be planning for less memory consumption so we continue to work on embedded platforms.

I'm interested in pursuing a couple avenues; the easiest is probably to have new objects enter the environment via the cache, but a proper network repl would be cool, too.

First, though, I have to make it actually compile stuff. I'm still learning how it fits together, and I'm slow. Then, I have to make it all much simpler.

Thanks for taking a look.

@timotheecour
Copy link
Member

timotheecour commented Apr 25, 2020

  • > The plan is to make an intermediate AST format that is used specifically by the backend

So basically introducing an IR after semantic phase. This could be a good idea; it'd simplify backend code (js, c, cpp, objc), ease integration with https://github.com/arnetheduck/nlvm, reduce bugs where abstract Ast nodes wrongly end up in cgen (causing codegen errors), allow a future wasm backend without emscripten etc. It'd be a simplified AST. Maybe a cheap way to achieve similar goals could be to define TNodeBackendKind, a subset of TNodeKind.

That being said, I'm not sure this would help with IC, which needs to handle non-backend AST.

  • > I think a compiler server is pretty easy once we get the AST locked down, but it's more likely that we'll want less in memory and not more. As the abstractions grow, we want to be planning for less memory consumption so we continue to work on embedded platforms.

I'm not sure I buy that. My use case for IC is on a development machine (lots of RAM although I doubt it should consume much more than what nim compiler already does; or later, even hosted on cloud). It can still target embedded platforms, that's orthogonal (eg like developping for ios from a mac).

I'm actually super interested in the "compiler server" approach, and this could potentially drive down compile times (for partial recompilations) to interactive speed, enable true REPL and such. But I don't want to derail your PR on this topic.

@disruptek
Copy link
Contributor Author

I'll let @Araq comment on his plans for a new IR, but I understand they are orthogonal to this effort. I have a feeling that he wants more of the super-efficient packed AST we talked about -- that could model more of the frontend. Maybe we'd continue to use the backend AST in his new design; it'll be a good place to experiment with some very simple "raw memory" exploits.

Lots of memory is great, but we already hit limits with the compiler, so it's no longer debatable. I think memory mapping the cache is a simple middle ground that should lift all boats. The new AST will give us an opportunity to decouple more of the backend code and perform compilation in smaller chunks, so I'm hopeful that it's win/win there, too.

Early days -- everyone's an optimist!

@Araq
Copy link
Member

Araq commented Apr 26, 2020

I'm actually super interested in the "compiler server" approach, and this could potentially drive down compile times (for partial recompilations) to interactive speed, enable true REPL and such. But I don't want to derail your PR on this topic.

That's effectively what I already tried with nimsuggest and nimsuggest can leak memory in a way we have never been able to track down. The current mutability soup is not sustainable in the long run and we need to experiment with alternative IRs, I see no way around it. The existing AST / compiler architecture is over ten years old and I think I learned some things in the meantime.

@Clyybber
Copy link
Contributor

Continues in a different branch now :)

@Clyybber Clyybber closed this May 17, 2020
@Araq
Copy link
Member

Araq commented May 18, 2020

Ping @disruptek, is this true?

@disruptek
Copy link
Contributor Author

Yeah, I squashed and rebased recently.

@timotheecour
Copy link
Member

@disruptek well you can simply git push --force which avoids having to open a new PR and loose all the comments

@disruptek disruptek deleted the ic2 branch September 12, 2020 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants