Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Print all paths if no argument is given to twiggy paths #63

Merged
merged 1 commit into from
May 18, 2018

Conversation

data-pup
Copy link
Member

@data-pup data-pup commented May 14, 2018

This is some initial work to try and solve #57. If no argument is given, then all paths should be printed. I was thinking about this, and thought that it might be nice to have this descend the paths rather than printing a whole list of ascending call paths.

I'm not totally sure if that is a good idea or not, and thought I would open up a PR for some thoughts. I have a little bit of refinement work to do on this, right now type[n] items are printed out, which I don't think should be included in the output if I understand this correctly.

So, questions I have at the moment:

  • Does descending the call paths make sense if no argument is given?
  • Should type items be included in the output? If not, is there an easy way to identify these?
  • How should these be sorted? Name, depth, size? That might be worth adding a new issue for.
  • If the ascending/descending choice is exposed as an option flag, what should that be called?

@data-pup data-pup requested a review from fitzgen May 14, 2018 23:55
@data-pup data-pup changed the title Added some initial work on the twiggy paths command to fix #57 Print all paths if no argument is given to twiggy paths May 14, 2018
@data-pup
Copy link
Member Author

Oh, some disclaimers: The successors tree thing does not really make sense, and I'll need to take that out. Was trying to see if that was a way around the issue of type items showing up. The tests I added don't pass right now, but I wanted to open this up & get some thoughts before I push further in this direction :)

@fitzgen
Copy link
Member

fitzgen commented May 15, 2018

I was thinking about this, and thought that it might be nice to have this descend the paths rather than printing a whole list of ascending call paths.

Not sure I follow, but I expect looking at the code will clear this up.

I have a little bit of refinement work to do on this, right now type[n] items are printed out, which I don't think should be included in the output if I understand this correctly.

The thing is that "call graph" is a slight misnomer, and maybe we should fix it; I'd like to hear your take on it. If we were really working with a call graph then the only vertices we would have would be functions. But we want (and right now have) vertices for anything taking up space in the binary. This includes data segments and custom sections, as well as each type entry. Instead of having edges only where one function calls another function, we also have edges from X to Y whenever X requires that Y also be present in the binary. For example, if you have a function in the binary, it requires that its type entry also exist, so we will make an edge from the function to the type entry.

I'm not 100% sure what to call this graph. "Retaining graph", because X->Y means that X retains Y in the binary? It also is vaguely reminiscent of GC heap graph tooling, which is where a lot of twiggy's inspiration comes from...

Does descending the call paths make sense if no argument is given?

Need to read the code before I can properly answer.

Should type items be included in the output? If not, is there an easy way to identify these?

Yes, I think they should be for now. I'd like to avoid any special casing kinds of items, and instead implement a generic way to summarize items in the graph and condense things.

How should these be sorted? Name, depth, size? That might be worth adding a new issue for.

Definitely by shallow size. Optionally by retained size if a flag is set.

If the ascending/descending choice is exposed as an option flag, what should that be called?

Again, need to read the code to understand this better.

@data-pup
Copy link
Member Author

data-pup commented May 15, 2018

The thing is that "call graph" is a slight misnomer, and maybe we should fix it; I'd like to hear your
take on it. If we were really working with a call graph then the only vertices we would have would be
functions. But we want (and right now have) vertices for anything taking up space in the binary.
This includes data segments and custom sections, as well as each type entry.

Ahh, that's really good to know. I think that helps a lot.

Yes, I think [type items] should be [included] for now. I'd like to avoid any special casing kinds of items,
and instead implement a generic way to summarize items in the graph and condense things.

🙏 Awesome. There's definitely some cruff in this commit that comes from experimenting with ways to keep those out of the output, so that is good to know.

P.S. Everything in the code regarding successors is almost certainly redundant as far as I can tell, that'll get removed. Your "call graph" misnomer point I think answers everything I was dealing with on that front.

Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good start!

The descending stuff seems like a fine option to have (still think default should be ascending) but we do need to invert the arrows when doing it -- see inline comment.

Thanks!

ir/ir.rs Outdated
@@ -214,6 +227,37 @@ impl Items {
);
}

/// Force computation of successors.
pub fn compute_successors(&mut self) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already have this available, so we don't need to add this. The neighbors are the successors :)

┊ ┊ ⬑ func[3]
┊ ┊ ⬑ bark
┊ ┊ ⬑ func[1]
5 ┊ 2.20% ┊ ⬑ calledTwice
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If descending, we would want to invert the arrows too, so that X->Y still means something like "X calls/uses/requires/etc Y"

@fitzgen
Copy link
Member

fitzgen commented May 15, 2018

As far as the name for the flag, I think "--descending" is fine 👍

@data-pup
Copy link
Member Author

Awesome! So, in the case that someone just runs a plain twiggy paths $THING with no function names, should that also default to ascending? We could fill the Paths.items member with all of the leaf nodes that are Code items in that case. Otherwise, we could descend down to max_depth etc., starting from the roots :)

I'll start working on this, I'll merge more changes in and let you know when this is ready for further review.

@fitzgen
Copy link
Member

fitzgen commented May 15, 2018

So, in the case that someone just runs a plain twiggy paths $THING with no function names, should that also default to ascending?

Yes, default to the current ascending behavior.

We could fill the Paths.items member with all of the leaf nodes that are Code items in that case.

So do an actual graph traversal such that each item is only ever printed once in the output, you're saying? We could do this, but then we would also have to start emitting the size information for every item along a path. I actually like the current behavior of a list of each item in the graph, and then each item has the tree of callers fanning out below it. I'm OK with duplicates across each tree because the way I use the tool is to look at the biggest functions, and then say "huh, who is calling this function?" And inspect just that call tree before moving to the next item and inspecting its call tree, etc.

Or maybe I am misunderstanding something you were saying?

@data-pup data-pup force-pushed the default-paths-argument branch from db58de7 to 1cbd59f Compare May 17, 2018 01:27
… paths if

no arguments are given. In adittion, there is now a descending flag that
can be used to move down rather than up the retaining paths.
@data-pup data-pup force-pushed the default-paths-argument branch from 1cbd59f to 631f701 Compare May 17, 2018 01:30
let mut paths = Paths {
items: Vec::with_capacity(opts.functions().len()),
opts: opts.clone(),
let functions: Vec<ir::Id> = match opts.functions().is_empty() {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed this to a Vecir::Id rather than a BTreeSet so that the items could be sorted by size :)

@data-pup
Copy link
Member Author

I think I got this working now :) Pardon the lack of clarity in the previous message. Thanks for explaining the purpose of the command a little more, that helped me get a better understanding of what to do. Let me know if there are any details I should change!

@data-pup data-pup self-assigned this May 17, 2018
Copy link
Member

@fitzgen fitzgen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome! Thanks for these tests too, great stuff :)

@data-pup
Copy link
Member Author

Yay! No problem, it's always fun to have an excuse to write wasm by hand :)

@data-pup data-pup deleted the default-paths-argument branch February 28, 2019 19:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants