Rustc Driver Chapter #76

Merged 8 commits on Mar 12, 2018

Conversation

Michael-F-Bryan
Contributor

I've started working on a rustc-driver chapter.

It's still very early days and I've written it as someone with experience instrumenting rustc from the outside so I don't know if it'll flow with the other chapters, so feedback, comments, and criticism are most welcome! 😁

The basic points covered are:

  • What is it? glue code/the compiler's main() function
  • What phases does it go through?
  • At which points in the compilation process can I inspect the compiler's state, and what can I see

I also want to add something like @nrc's https://github.com/nrc/stupid-stats (which is awesome btw, thanks @nrc!) as an appendix for future explorers. If it's fine with him, I might end up copying most of it across and update it to take into account any changes to the compiler since the tutorial was written.

(fixes #74)

@mark-i-m mark-i-m mentioned this pull request Mar 8, 2018
15 tasks
@nikomatsakis
Contributor

Thanks for doing this!

Member

@mark-i-m left a comment

Thanks! A few minor nits...

stuff). The `rustc_driver` crate also provides external users with a method
for running code at particular times during the compilation process, allowing
third parties to effectively use `rustc`'s internals as a library for
analysing a crate.
Member

It should probably be noted somewhere that doing so is completely unstable.

Member

But also that compiler devs should avoid making breaking changes where possible, since it is an API.

Member

"analysing a crate" - or emulating the compiler - that is if you want to compile some code and do so in-process, then you use the driver API (e.g., the RLS)

Contributor Author

> It should probably be noted somewhere that doing so is completely unstable.

That's a good point. I'll add a big warning saying we're using compiler internals and, similar to nightly, the internal APIs are always going to be unstable.

`ParseSess` | struct | This struct contains information about a parsing session | [The parser](the-parser.html) | [src/libsyntax/parse/mod.rs](https://github.com/rust-lang/rust/blob/master/src/libsyntax/parse/mod.rs)
`Session` | struct | The data associated with a compilation session | [the Parser](the-parser.html), [The Rustc Driver](rustc-driver.html) | [src/librustc/session/mod.html](https://github.com/rust-lang/rust/blob/master/src/librustc/session/mod.rs)
Member

Nit: s/the Parser/The parser/

(for consistency elsewhere)

Contributor Author

I think that's actually a typo from when I started the original The parser chapter. It's a title, so both words should be capitalized.

2. *Configure and Expand:* Resolve `#[cfg]` attributes and expand macros
3. *Run Analysis Passes:* Run the resolution, typechecking, region checking
and other miscellaneous analysis passes on the crate
4. *Translate to LLVM:* Turn the analysed program into executable code
Member

Hmm... Should it be

  1. Turn the analysed program into LLVM IR
  2. Run LLVM

or are these actually 1 step in the code?

Contributor Author

I'd imagine they are two discrete steps, but from rustc-driver's perspective they all happen at the same time. From what I can tell there are no callbacks between the after_analysis phase and the end of compilation, so for all intents and purposes they both happen in the same "LLVM codegen" phase.

I should mention that explicitly though.

Member

Yeah, it is one phase - we never actually generate LLVM IR (unless you specify that as the output) we use LLVM as a library where the IR is implicit (at least the last time I hacked on the back end we did).
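
The point made in this thread, that an outside observer gets hooks after the early phases but sees code generation as a single opaque step, can be sketched with a toy model. This is a minimal sketch under stated assumptions: `Phase`, `has_hook_after`, and `run_phases` are hypothetical names invented for illustration and are not part of `rustc_driver`; the four phases simply mirror the list under review.

```rust
// Toy model only: `Phase` and `run_phases` are hypothetical, not rustc_driver items.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Phase {
    ParseInput,
    ConfigureAndExpand,
    RunAnalysisPasses,
    TranslateToLlvm,
}

impl Phase {
    /// Whether this toy driver lets an observer run once the phase finishes.
    /// Code generation is treated as one opaque step with no hook after it
    /// (an assumption of the model, following the discussion above).
    fn has_hook_after(self) -> bool {
        self != Phase::TranslateToLlvm
    }
}

/// Runs the phases in order, calling `hook` after each phase that exposes one.
/// Returns every phase that was run, whether or not it had a hook.
fn run_phases(mut hook: impl FnMut(Phase)) -> Vec<Phase> {
    let order = [
        Phase::ParseInput,
        Phase::ConfigureAndExpand,
        Phase::RunAnalysisPasses,
        Phase::TranslateToLlvm,
    ];
    let mut ran = Vec::new();
    for &phase in &order {
        // A real driver would do the phase's work here.
        ran.push(phase);
        if phase.has_hook_after() {
            hook(phase);
        }
    }
    ran
}
```

Passing a closure that records each phase it is called with shows the observer seeing three phases while the driver runs four, which is the "same phase from the driver's perspective" behaviour described above.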

@@ -0,0 +1,396 @@
# Appendix A: A tutorial on creating a drop-in replacement for rustc

Member

I haven't read @nrc's post, but if this is a copy of it, then we should probably add a citation with a link to the original.

Contributor Author

@Michael-F-Bryan, Mar 10, 2018

Good point! It's just a copy-paste at the moment so we should mention @nrc is the original author. It also means I'll be able to find the link to the original document when I want to refer to it later on.

s: &Session,
i: &Input,
odir: &Option<Path>,
ofile: &Option<Path>)
Member

I think this function needs to be updated to work on the most recent compiler:

     fn late_callback(
         &mut self,
+        t: &TransCrate,
         m: &getopts::Matches,
         s: &Session,
+        c: &CrateStore,
         i: &Input,
         odir: &Option<PathBuf>,
         ofile: &Option<PathBuf>,
     ) -> Compilation {
+        self.default_calls.late_callback(t, m, s, c, i, odir, ofile);
         Compilation::Continue
     }
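
The shape of the suggested change, a wrapper that forwards `late_callback` to the stored default callbacks and then returns `Compilation::Continue`, can be sketched in isolation. This is a simplified stand-in, not the real driver interface: the `Callbacks` trait, `DefaultCalls`, `StatsCalls`, and the reduced argument list are all assumptions made so the example compiles on its own.

```rust
use std::path::PathBuf;

// Hypothetical stand-ins; the real rustc_driver types take more arguments.
#[allow(dead_code)]
#[derive(Debug, PartialEq)]
enum Compilation {
    Stop,
    Continue,
}

trait Callbacks {
    fn late_callback(
        &mut self,
        input: &str,
        odir: &Option<PathBuf>,
        ofile: &Option<PathBuf>,
    ) -> Compilation;
}

/// Plays the role of the compiler's default callbacks: it records each input.
#[derive(Default)]
struct DefaultCalls {
    seen_inputs: Vec<String>,
}

impl Callbacks for DefaultCalls {
    fn late_callback(
        &mut self,
        input: &str,
        _odir: &Option<PathBuf>,
        _ofile: &Option<PathBuf>,
    ) -> Compilation {
        self.seen_inputs.push(input.to_string());
        Compilation::Continue
    }
}

/// A wrapper that does its own bookkeeping, then delegates to the defaults.
#[derive(Default)]
struct StatsCalls {
    default_calls: DefaultCalls,
    late_calls: usize,
}

impl Callbacks for StatsCalls {
    fn late_callback(
        &mut self,
        input: &str,
        odir: &Option<PathBuf>,
        ofile: &Option<PathBuf>,
    ) -> Compilation {
        self.late_calls += 1;
        // Forward every argument so the default behaviour still runs.
        self.default_calls.late_callback(input, odir, ofile);
        Compilation::Continue
    }
}
```

Because the wrapper forwards every argument, a signature change in the trait (such as the added parameters in the diff above) has to be mirrored in both the wrapper's signature and the forwarding call.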

Member

Made a PR nrc/stupid-stats#8

@mark-i-m
Member

Also, it looks like a rebase is needed?

@nrc
Member

nrc commented Mar 11, 2018

> If it's fine with him

It is fine with me :-) You should check it still compiles and works - the driver is unstable, so it may have changed since I last checked stupid-stats.

@@ -7,6 +7,10 @@ compiler.
Item | Kind | Short description | Chapter | Declaration
----------------|----------|-----------------------------|--------------------|-------------------
`CodeMap` | struct | The CodeMap maps the AST nodes to their source code | [The parser](the-parser.html) | [src/libsyntax/codemap.rs](https://github.com/rust-lang/rust/blob/master/src/libsyntax/codemap.rs)
`CompileState` | struct | State that is passed to a callback at each compiler pass | [The Rustc Driver](rustc-driver.html) | [src/librustc_driver/driver.rs](https://github.com/rust-lang/rust/blob/master/src/librustc_driver/driver.rs)
`ast::Crate` | struct | Syntax-level representation of a parsed crate | | [src/librustc/hir/mod.rs](https://github.com/rust-lang/rust/blob/master/src/libsyntax/ast.rs)
`hir::Crate` | struct | Top-level data structure representing the crate being compiled | | [src/librustc/hir/mod.rs](https://github.com/rust-lang/rust/blob/master/src/librustc/hir/mod.rs)
Member

"Top-level" isn't very helpful here - the HIR is basically a compiler-internal version of the AST, i.e., tools and other users should not use it (but can use the AST). It does not match the source text as closely as the AST, being more designed for compiler use, but still fits the 'textbook' definition of an AST.

Contributor Author

Would it make sense to say "more compiler friendly" form of the AST? I believe the HIR is a desugared, more useful form of AST where we've broken things out into their categories (items, trait impls, etc).
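
The idea of a "desugared, more compiler friendly" tree can be illustrated with a toy lowering pass. Everything here is hypothetical: `AstExpr`, `HirExpr`, and `lower` are invented for the sketch, and the particular rewrite (a `while` loop into a `loop` with a conditional `break`) is chosen only to show the idea; it is not a claim about what rustc's own lowering does for that construct.

```rust
// Hypothetical toy types; the real `ast` and `hir` data structures are far richer.
#[derive(Debug, PartialEq)]
enum AstExpr {
    Lit(i64),
    /// `while cond { body }`, kept close to how it was written in the source.
    While { cond: Box<AstExpr>, body: Vec<AstExpr> },
}

/// The lowered form has no `While` variant: later passes only handle `Loop`.
#[derive(Debug, PartialEq)]
enum HirExpr {
    Lit(i64),
    Loop(Vec<HirExpr>),
    If {
        cond: Box<HirExpr>,
        then: Vec<HirExpr>,
        otherwise: Vec<HirExpr>,
    },
    Break,
}

/// Lowers the source-like tree into the smaller, desugared one.
fn lower(expr: &AstExpr) -> HirExpr {
    match expr {
        AstExpr::Lit(n) => HirExpr::Lit(*n),
        // `while c { b }` becomes `loop { if c { b } else { break } }`.
        AstExpr::While { cond, body } => HirExpr::Loop(vec![HirExpr::If {
            cond: Box::new(lower(cond)),
            then: body.iter().map(lower).collect(),
            otherwise: vec![HirExpr::Break],
        }]),
    }
}
```

The lowered tree no longer mirrors the source text, which is the trade-off described above: it is easier for later passes to consume, but less suitable for tools that need to map back to what the user wrote.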

From `rustc_driver`'s perspective, the main phases of the compiler are:

1. *Parse Input:* Initial crate parsing
2. *Configure and Expand:* Resolve `#[cfg]` attributes and expand macros
Member

Name resolution is part of this phase

Contributor Author

Ooh, I didn't know that! I thought name resolution happened some time after after_hir_lowering and before after_analysis gets called.

My dyna-bindgen tool was doing analysis at the after_hir_lowering stage and I couldn't figure out how to get the fully qualified name of a function argument's type. I thought I needed to wait until later and use the TyCtxt to make queries.


1. *Parse Input:* Initial crate parsing
2. *Configure and Expand:* Resolve `#[cfg]` attributes and expand macros
3. *Run Analysis Passes:* Run the resolution, typechecking, region checking
Member

Not sure what 'resolution' means here, but name resolution now happens earlier. I think 'borrow-checking' is more common than 'region-checking'

Member

I think trait resolution happens here, though...

Member

yes, correct

Contributor Author

Haha, I'm not sure either! The rustdocs just say "resolution" so I thought I'd fudge it 😛

@mark-i-m
Member

@nrc

> It is fine with me :-) You should check it still compiles and works - the driver is unstable, so it may have changed since I last checked stupid-stats.

Thanks! I made a PR to make it compile and run on a recent compiler: nrc/stupid-stats#8

@Michael-F-Bryan Could you update the contents of the PR accordingly please :)

@Michael-F-Bryan
Contributor Author

@mark-i-m I've rebased and updated the stupid stats appendix. Would you be able to have another look?

If you know anyone who's hacked on the driver recently it may be useful to CC them in. There's probably a lot of stuff I've missed.

@mark-i-m
Member

@Michael-F-Bryan Thanks!

I personally don't know of anyone else to CC... The git blame of librustc_driver/lib.rs seems to show quite a lot of people :)

I think we can probably merge this and let others raise issues later for additions.

@mark-i-m merged commit 9874533 into rust-lang:master on Mar 12, 2018
@Michael-F-Bryan deleted the rustc-driver branch on March 13, 2018 00:35

Successfully merging this pull request may close these issues.

"Navigating the compiler from the outside"
4 participants