Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IRGraphVisitor should increment refcount of visited items #3776

Merged
merged 4 commits into from
Mar 28, 2019

Conversation

steven-johnson
Copy link
Contributor

Currently, visiting temporarily-created items has the risk of the memory for the temporary item being re-used during the visit; if that happens, subsequent temporary items with the same memory will be skipped.

To check for possible slowdown in compile times, I prebuilt the tests with make build_tests and then ran time nice -n10 make test_correctness >/dev/null 2>&1; on current master I get

real	7m45.323s
user	11m51.051s
sys	0m24.147s

while with this branch I get

real    7m46.999s
user    11m53.953s
sys     0m23.739s

Currently, visiting temporarily-created items has the risk of the memory for the temporary item being re-used during the visit; if that happens, subsequent temporary items with the same memory will be skipped.
Copy link
Member

@zvookin zvookin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Andrew should get a chance to look this over too, but it looks just like what I'd expect.

@abadams
Copy link
Member

abadams commented Mar 28, 2019

Might be safer and no slower to have the IRGraphMutator maintain a set<IntrusivePtr<IRNode>> instead of manually managing ref counts. The potential downside relative to this approach is that it makes the contained type not trivially-copyable, and maybe the set implementation can't do something clever under that constraint. Still, it's simpler than writing constructors, so it's worth trying.

@steven-johnson
Copy link
Contributor Author

IRGraphMutator maintain a set<IntrusivePtr>

But in that case, set might need to internally do additional inc/dec on node contents if if rebalances (etc) the internal tree; with this approach, we know that the inc happens exactly once (upon insertion) and the dec happens exactly once (upon visitor destruction). (Unless maybe the set implementation can get away with only moves and never copies?)

@abadams
Copy link
Member

abadams commented Mar 28, 2019

I believe it's very likely that set implementations work with moves as much as possible. I think they're usually trees (because you can iterate in-order and insertion is cheap), so they may never need to move or copy elements.

@steven-johnson
Copy link
Contributor Author

k, I'll try it locally with IntrusivePtr and see how performance compares.

@steven-johnson
Copy link
Contributor Author

(semi-related: it doesn't look like we rely on traversal order of this set anywhere; I wonder if we could get a performance win with unordered_set instead)

@steven-johnson
Copy link
Contributor Author

Using IRHandle instead shows timing within the noise factor of the previous and is indeed simpler. PTAL.

@steven-johnson
Copy link
Contributor Author

(If this looks good, I'd like to land it soon, to try a downstream Halide integrate today.)

@steven-johnson steven-johnson merged commit 5b5a2da into master Mar 28, 2019
@steven-johnson steven-johnson deleted the srj-gv branch March 28, 2019 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants