-
Notifications
You must be signed in to change notification settings - Fork 12.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support borrow, scope and move visualization through RLS #42733
Support borrow, scope and move visualization through RLS #42733
Conversation
Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @nikomatsakis (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. Please see the contribution instructions for more information. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies if I missed anything. I've had a ton of merge conflicts over the past few months, so please, be as brutal as possible with your review.
@@ -343,7 +345,43 @@ impl<'a, 'tcx> GatherLoanCtxt<'a, 'tcx> { | |||
// Create the loan record (if needed). | |||
let loan = match restr { | |||
RestrictionResult::Safe => { | |||
// No restrictions---no loan record necessary | |||
let loan_scope = match *loan_region { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I should have moved this and the below match
expression into its own function. I'll take care of that.
tcx.borrowck(body_owner_def_id); | ||
pub fn check_crate<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, save_analysis: bool) -> Option<HashMap<DefId, AnalysisResult<'a, 'tcx>>> { | ||
// FIXME: nashenas88 support this through TyCtxt | ||
if save_analysis { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like this at all. The main reason I didn't modify how tcx.borrowck
works is that I didn't want to tie the output to the non-mir implementation. I could do the rls-data
transformations early, and have those types supported in tcx.borrowck
, but I'm also unsure how you guys would feel about an external library defining the output for such a low level api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I think that we want to be moving towards a world where save-analysis doesn't "alter" anything in the normal control-flow, but just performs queries and uses the results.
To that end, I think we should probably just keep check_crate
as it is, but have save analysis just repeat the tcx.borrowck(body_owner_def_id)
calls. The results are memoized anyway, so there woudln't be any extra computational cost to speak of. Then we don't have to modify driver either.
self.span | ||
} | ||
|
||
pub fn span_in_def(&self, def_id: DefId, tcx: TyCtxt) -> Option<Span> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any ideas for a better name? This is to generate the span (if it exists) for the "scope" of the value pointed to by the DefId
. Should I add comments adding more details?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely a comment would be good -- with some examples. I would think span_of_def
or span_for_def
would be a better name, to start. Is this the right place for this helper, anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason the helper is here is that gen_scope
and kill_scope
are private to Loan
. I had previously made public getters for them, though this seemed cleaner because it exposed less of Loan
's internals. I didn't like having this logic here though since it was so specific to the visualization logic, but it also felt like something the Loan
should be responsible for given the use of the private fields.
} | ||
} | ||
|
||
fn get_unexpanded_span(input_span: Span, wrapping_span: Span) -> Option<Span> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sometimes the input span is being defined within a macro, but I want it in context of the wrapping_span
. Is this name sufficient or should I change the name / add comments explaining this more clearly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer to a question like this is always "add a comment". =) That would not have been obvious to me. I think an example of some Rust code where this is relevant would be even better.
@@ -43,7 +43,8 @@ extern crate core; // for NonZero | |||
|
|||
pub use borrowck::check_crate; | |||
pub use borrowck::build_borrowck_dataflow_data_for_fn; | |||
pub use borrowck::{AnalysisData, BorrowckCtxt, ElaborateDrops}; | |||
pub use borrowck::{AnalysisData, AnalysisResult, BorrowckCtxt, ElaborateDrops, Loan, SafeLoan}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is tied to the comments I made regarding the use of tcx.borrowck
. In order to support the MIR version of borrowck, should I hide these again and instead expose the rls-data
structs?
fn process_borrow_analysis(&mut self, def_id: DefId, mut analysis_result: AnalysisResult<'l, 'tcx>) { | ||
let mut safe_loans = { | ||
let mut safe_loans = vec![]; | ||
mem::swap(&mut safe_loans, &mut analysis_result.safe_loans); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'm doing the mem::swap
and into_iter
calls to avoid cloning all of the data (all scopes, borrows and moves for the entire crate). Is that necessary? I'm not sure if I'm just being overly paranoid about the memory usage getting too high.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The downside to this approach is that the data can't be used by subsequent passes.
.kill_scope(&bccx) | ||
.span(&self.tcx.hir) | ||
.map(|span| { | ||
let expanded_span = match mov.map(|m| &m.span) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here I'm using the kill_scope
of an assignment to determine the "scope" of a value. I shorten it to the end of the soonest related move
if it's sooner to reflect what the compiler is doing internally for the non-mir case. This was something I clarified with @pnkfelix. Post-MIR borrowck complete, this will most likely have to be moved to within the old borrowck.
assignments.into_iter() | ||
.filter_map(|a| { | ||
let id = ::id_from_node_id(a.assignee_id, &self.save_ctxt); | ||
let mov = move_map.get(&id); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
move_map
holds references to MoveData
(&MoveData
) so the get
call here returns an Option<&&MoveData>
. Is there a way that can be simplified?
debug!("loan_scope = {:?}", loan_scope); | ||
|
||
let borrow_scope = region::CodeExtent::Misc(borrow_id); | ||
let loan_scope = self.compute_gen_scope(borrow_scope, loan_scope); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would this be enough to show the span when the loan is being held (this is for temporary loans, I believe)? It seems correct when visualizing, but I'm definitely no expert on the internals of the borrow checker.
☔ The latest upstream changes (presumably #42676) made this pull request unmergeable. Please resolve the merge conflicts. |
cc me |
ping r? @nikomatsakis although @nrc if you'd like to assign yourself as reviewer the title at least seems like that would be appropriate! |
loan_scope: loan_scope, | ||
span: borrow_span, | ||
}; | ||
self.safe_loans.push(safe_loan); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also like to not run through this section unless we're running save analysis. However, I don't see a clear, clean way to notify this section of code about that. Any recommendations on that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wouldn't worry too much about that. Save-analysis isn't really the direction we want things to go anyway -- that is, having a distinct "save-analysis mode". I think we hope that eventually the RLS will just be polling the results of regular queries.
I think @nikomatsakis would be better to give feedback on the borrow checker stuff. I'll take a look at the save-analysis end. @Nashenas88 sorry review is taking a while here, Niko has been at a conference and we've both been travelling. |
@nrc no worries and no rush. I've been very busy the past couple of weeks myself. Just let me know when I should do a rebase so I don't lose track of all the comments. |
@@ -289,6 +291,30 @@ impl<'a, 'tcx> GatherLoanCtxt<'a, 'tcx> { | |||
} | |||
} | |||
|
|||
fn loan_scope_from_region(&self, loan_region: ty::Region<'tcx>) -> Option<region::CodeExtent> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A comment here would be nice. What is the "loan scope" etc? :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this make sense? Given a region for a loan, returns the corresponding loan_scope that could represent the gen_scope and/or kill_scope of a loan.
It's the exact same code that used to be in lines 351-374 in the base file. Based on the usage, it made sense to me that this would represent the combination of the gen_scope and kill_scope for what I called a SafeLoan (do you think there's a better name?). For normal loans, it's a possible value for the gen_scope and/or kill_scope. (See compute_gen_scope
and compute_kill_scope
).
}; | ||
|
||
Scope { | ||
ref_id: ::id_from_node_id(assignment.assignee_id, &self.save_ctxt), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I need to fix these. Working on the visualization integration, I ran into obvious problems here. id_from_node_id
looks for local DefId
s before converting to rls-data
's Id. The problem is that these NodeId
s (and the ones below) are referring to Local nodes, which have no DefId
. If the type of this object isn't in the same crate, these all get converted to NULL Ids. I didn't run into this earlier because I was testing with a single file crate (sorry about that).
@nrc does it make sense to create a new type of Id
that's used only so these BorrowData
types can reference each other?
@nikomatsakis is there anything else here that I need to update (besides the merge conflicts)? |
@Nashenas88 I'll make another pass; sorry for delay, been traveling a lot in the time period since you opened this PR =) |
No worries. I know you and the team are always working really hard, so thank you. When you do get a chance, could you give your thoughts to the questions I posted in a few areas? Some were big concerns that might warant closing this PR and doing a major redesign. |
OK, sorry again. I keep running out of time to cover this PR each time I do a daily sweep, so this time I'm going to do it first. Let me start with some high-level thoughts: One reason I've been a bit wary is that you are modifying the old borrowck, but we expect some major modifications to be coming down the pike:
That said, I'd hate to block this work on either of those things. I wouldn't expect the other parts of this work (e.g., the RLS changes) to be much affected by either of those changes, though naturally the region work would require generalizing how you display regions to be more flexible. At this point, let me stop and ask you a question. =) I've been wanting to find someone to mentor through the non-lexical lifetimes implementation: specifically, I want to start prototyping the new region calculation on MIR (independent of the borrowck half; the two are readily separable). The idea would be to, given an appropriate rustc flag, do the region calculation on MIR and have some assertions that check the results. This would let us unit test the region calculation. Then, once the MIR borrowck has landed and stabilized, we could connect the two. While talking to @pnkfelix, it occurred to us that you might be an ideal candidate for this work, if it interests you! So let me know. Presuming we were to do that, we could also use the results of said inference to inform the RLS visualization. (In other words, we could target the RLS borrow visualization at NLL, and maybe aim to "debut" them simultaneously.) Also, I noticed that some of the changes I see in this PR (e.g., tracking the "safe loans") are also unnecessary in the newer formulation of borrowck that I describe in the NLL RFC. In that formulation, we track all loans, but we just change the ones we report errors for. Anyway, those are my "up-front" thoughts, let me go through the code in a bit more detail, and please accept my renewed apologies for being slow. I will endeavor to be more timely to respond from this point on -- feel free to ping me on IRC or Gitter (or privmsg) also! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, didn't have time to finish-up, here are some thoughts so far. Not quite sure about how to balance queries vs the map etc.
@@ -37,11 +37,12 @@ mod move_error; | |||
|
|||
pub fn gather_loans_in_fn<'a, 'tcx>(bccx: &BorrowckCtxt<'a, 'tcx>, | |||
body: hir::BodyId) | |||
-> (Vec<Loan<'tcx>>, move_data::MoveData<'tcx>) { | |||
-> (Vec<SafeLoan>, Vec<Loan<'tcx>>, move_data::MoveData<'tcx>) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I feel like the right fix here might be removing the notion of SafeLoan
altogether, and adjusting the follow-on code, per the strategy I describe in the NLL RFC.
tcx.borrowck(body_owner_def_id); | ||
pub fn check_crate<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, save_analysis: bool) -> Option<HashMap<DefId, AnalysisResult<'a, 'tcx>>> { | ||
// FIXME: nashenas88 support this through TyCtxt | ||
if save_analysis { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, I think that we want to be moving towards a world where save-analysis doesn't "alter" anything in the normal control-flow, but just performs queries and uses the results.
To that end, I think we should probably just keep check_crate
as it is, but have save analysis just repeat the tcx.borrowck(body_owner_def_id)
calls. The results are memoized anyway, so there woudln't be any extra computational cost to speak of. Then we don't have to modify driver either.
pub all_loans: Vec<Loan<'tcx>>, | ||
pub loans: DataFlowContext<'a, 'tcx, LoanDataFlowOperator>, | ||
pub move_data: move_data::FlowedMoveData<'a, 'tcx>, | ||
} | ||
|
||
fn borrowck<'a, 'tcx>(tcx: TyCtxt<'a, 'tcx, 'tcx>, owner_def_id: DefId) { | ||
pub struct AnalysisResult<'a, 'tcx: 'a> { | ||
pub bccx: BorrowckCtxt<'a, 'tcx>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not crazy about keeping the BorrowckCtxt
around either. This feels to me like a "temporary" data structure, not one that is meant to be preserved. =) But I guess I'll have to read on to see what you're using it for exactly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I remember bringing this up in IRC at some point (not BorrowckCtxt
specifically, but using internals of the borrow checker), but I can't recall exactly who I was speaking with. The issue was that this is used to calculate some spans for assignments, but I don't know which ones to compute it for until later, and the worry was that computing it early could just be unnecessary computation for when someone isn't using the save context.
This might should change entirely with the NLL approach though. For reference, it's used in dump_visitor.rs
, fn process_assignment
line 222.
self.span | ||
} | ||
|
||
pub fn span_in_def(&self, def_id: DefId, tcx: TyCtxt) -> Option<Span> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Definitely a comment would be good -- with some examples. I would think span_of_def
or span_for_def
would be a better name, to start. Is this the right place for this helper, anyway?
} | ||
} | ||
|
||
fn get_unexpanded_span(input_span: Span, wrapping_span: Span) -> Option<Span> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The answer to a question like this is always "add a comment". =) That would not have been obvious to me. I think an example of some Rust code where this is relevant would be even better.
@@ -269,6 +372,15 @@ impl<'tcx> LoanPath<'tcx> { | |||
LoanPath { kind: kind, ty: ty } | |||
} | |||
|
|||
pub fn ref_node_id<'a>(&self) -> ast::NodeId { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs a comment too -- I guess it returns the "node-id of the base variable or upvar of the path"? I feel like there probably exists a helper like this...but maybe not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this well over 6 months ago, so during that time someone might have written a helper for this. Back then I only saw other usages of this in match
statements. The purpose here is to get the NodeId
for the value that a Loan
is being created for. I should have come back to this because I never really learned what Upvar, Downcast and Extend actually do.
@@ -177,7 +180,7 @@ pub fn compile_input(sess: &Session, | |||
&arena, | |||
&arenas, | |||
&crate_name, | |||
|tcx, analysis, incremental_hashes_map, result| { | |||
|tcx, analysis, incremental_hashes_map, borrow_analysis_map, result| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I am unkeen on threading this around. I think we should just modify the save-analysis code to repeat the borrowck
queries -- though this implies that the results will be stored in the memoization maps.
I chatted a bit with @eddyb on IRC. He suggested a setup like the following -- well, i'm extrapolating a bit, but I think this is what he suggested. I'd also be happy with this setup, though it may be something we want to adjust in the future, as it has the downside (same as your current branch) of running the borrowck twice.
Do you know how to add queries and so forth? You have to modify |
I'd suggest, if possible, to skip some of the top-level borrowck steps (not have flags propagated down from there), especially those that produce errors, so that you only keep the information you want and not do needless work. For example, I don't think you need to check loans (unless you want even fancier error rendering?) or do anything about moves. |
I am absolutely interested in working with you guys on the non-lexical lifetimes implementation! I'm hoping that my current work doesn't imply that I have a very solid foundation on the type theory behind all this (I don't). If there's anything I should be comfortable with ahead of time, let me know so I can try to catch up in my spare time. Thank you for this, it really made my day! I'll follow up with the rest of the questions later tonight. |
@nikomatsakis I like the multiple queries idea. I'm also familiar with the maps in @eddyb so last year @nikomatsakis actually brought up the idea of visualizing borrow errors (a little more involved with arrows pointing you to points of interest), though it's not part of my current design. I'd rather that wait until the switch to the MIR implementation since I imagine it'd be touching a lot more code and would be harder to port over. Could you clarify what you mean by "do[ing] anything with moves"? Currently I do export moves for visualization. |
Given the above comments regarding the mentorship opportunity, should this be closed while we regroup? I can implement the suggestions currently, but it will be hard to keep this open and comments valid especially with all the conflicts coming up so far (current count is 5 files). |
Personally, I think that it makes more sense to focus on getting the NLL inference up-and-going, and let MIR borrowck (and RLS for that matter) stabilize a bit, and then revisit the borrow visualization after that point. @nrc may disagree. The downside of landing anything now is that we'll have to maintain and "forward port" it, so there is some cost over time. |
I'll close this for now. I'll revisit this when NLL is farther along. |
ping-- this may be worth revisiting? |
I was actually making plans yesterday to start back on this work :). I have next week off and part of that free time I'm going to use to jumpstart back on visualizations. |
This PR is not ready to merge yet. I'm looking to get more detailed feedback with all of the code for context. This is also dependent on
rls-data
being updated in crates.io (so I can fix the relative paths in theCargo.toml
files) and on the code review inrls-analysis
, mostly in that theHashMap
used to store theBorrowData
structs might change to aVec
.