Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid recursion in InstNamer::CollectNamesInBlock #4706

Merged
merged 9 commits into from
Dec 20, 2024

Conversation

danakj
Copy link
Contributor

@danakj danakj commented Dec 18, 2024

Use a deque to maintain the set of instructions to be walked over. so that the loop can append more instructions (with their related scope) during iteration without requiring recursion.

@danakj
Copy link
Contributor Author

danakj commented Dec 18, 2024

I used a std::deque here cuz it makes sense at a high level, but maybe there's some more optimal data structure in the llvm project?

Use a deque to maintain the set of instructions to be walked over. so
that the loop can append more instructions (with their related scope)
during iteration without requiring recursion.
@danakj danakj force-pushed the inst-namer-recursion branch from 15fca00 to 5627762 Compare December 18, 2024 17:32
@@ -82,20 +82,20 @@ fn G() {
// CHECK:STDOUT: %v.var: ref %array_type = var v
// CHECK:STDOUT: %v: ref %array_type = bind_name v, %v.var
// CHECK:STDOUT: %F.ref.loc14_34: %F.type = name_ref F, file.%F.decl [template = constants.%F]
// CHECK:STDOUT: %.loc14_42.2: ref %tuple.type.2 = splice_block %.loc14_42.1 {
Copy link
Contributor Author

@danakj danakj Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have not figured out how these suffix numbers come to be, but the non-recursive walk is slightly differently ordered, in that it queues the recursion step instead of starting it immediately. Not sure if this is good/bad. I modified it to insert the recursive stuff right after the current instruction (at the front of the queue) instead of at the end. That is still slightly different as it would be queued after the current one instead of completing before it. And the numbering is still adjusted a bit, and I am not sure why yet...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(If you would like me to figure out why they are different here let me know and I will keep looking, and try to avoid that change, it's not clear to me if this matters or what)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this looks like an improvement since the indices increase as code progresses. I suspect this is the break; after SpliceBlock in action, if that helps.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. To preserve it we'd need to re-queue the current SpliceBlock instruction first before queueing the block id, in such a way that we'd then name the current SpliceBlock instruction after it, without also redoing the block id again infinitely.

But since you like the change I will leave it.

toolchain/sem_ir/inst_namer.cpp Outdated Show resolved Hide resolved
Comment on lines 368 to 384
std::deque<std::pair<ScopeId, InstId>> insts;

// Appends a scope and instructions to walk. Avoids recursion while allowing
// the loop to below append more instructions during iteration.
auto append_block_id = [&](ScopeId scope_id, InstBlockId block_id) {
if (block_id.is_valid()) {
for (auto inst_id : sem_ir_->inst_blocks().Get(block_id)) {
insts.emplace_back(scope_id, inst_id);
}
}
};
auto append_block_insts = [&](ScopeId scope_id,
llvm::ArrayRef<InstId> inst_ids) {
for (auto inst_id : inst_ids) {
insts.emplace_back(scope_id, inst_id);
}
};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Regarding the use of deque, had you considered instead using SmallVector? I think that just requires adding in reverse order, such as:

Suggested change
std::deque<std::pair<ScopeId, InstId>> insts;
// Appends a scope and instructions to walk. Avoids recursion while allowing
// the loop to below append more instructions during iteration.
auto append_block_id = [&](ScopeId scope_id, InstBlockId block_id) {
if (block_id.is_valid()) {
for (auto inst_id : sem_ir_->inst_blocks().Get(block_id)) {
insts.emplace_back(scope_id, inst_id);
}
}
};
auto append_block_insts = [&](ScopeId scope_id,
llvm::ArrayRef<InstId> inst_ids) {
for (auto inst_id : inst_ids) {
insts.emplace_back(scope_id, inst_id);
}
};
llvm::SmallVector<std::pair<ScopeId, InstId>> insts;
// Appends a scope and instructions to walk. Avoids recursion while allowing
// the loop to below append more instructions during iteration.
auto append_block_id = [&](ScopeId scope_id, InstBlockId block_id) {
if (block_id.is_valid()) {
for (auto inst_id : llvm::reverse(sem_ir_->inst_blocks().Get(block_id))) {
insts.emplace_back(scope_id, inst_id);
}
}
};
auto append_block_insts = [&](ScopeId scope_id,
llvm::ArrayRef<InstId> inst_ids) {
for (auto inst_id : llvm::reverse(inst_ids)) {
insts.emplace_back(scope_id, inst_id);
}
};

Copy link
Contributor

@jonmeow jonmeow Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: my understanding is that it'd probably be better to switch to push_back instead of emplace_back too (#4651 (review), #4651 (comment))

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Switching from deque to a vector will switch us from FIFO to LIFO traversal -- that is, from breadth-first to depth-first -- which should keep the instruction numbering more predictable and closer to how it currently is. (We'll still have some differences because we're switching from in-order to pre-order traversal so we'll number later instructions in the parent block before we number the child block.)

Copy link
Contributor

@jonmeow jonmeow Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note, this'd switched from emplace_back/pop_front to emplace_front/pop_front (using llvm::reverse similar to what I'd suggested here). Although that means push_back/pop_back won't change traversal, it also means the deque's FIFO feature isn't used, making a vector more appealing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: my understanding is that it'd probably be better to switch to push_back instead of emplace_back too (#4651 (review), #4651 (comment))

Ah, I didn't think that conclusion was so general, and pair is the poster child of emplace, but I would personally be happy to never write emplace again. :)

Originally I was pushing to the back, and popping from the front, but I changed that, so yeah a vector makes a lot more sense now, thanks for noticing that.

toolchain/sem_ir/inst_namer.cpp Outdated Show resolved Hide resolved
@@ -82,20 +82,20 @@ fn G() {
// CHECK:STDOUT: %v.var: ref %array_type = var v
// CHECK:STDOUT: %v: ref %array_type = bind_name v, %v.var
// CHECK:STDOUT: %F.ref.loc14_34: %F.type = name_ref F, file.%F.decl [template = constants.%F]
// CHECK:STDOUT: %.loc14_42.2: ref %tuple.type.2 = splice_block %.loc14_42.1 {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TBH this looks like an improvement since the indices increase as code progresses. I suspect this is the break; after SpliceBlock in action, if that helps.

Co-authored-by: jonmeow <jperkins@google.com>
@danakj danakj requested a review from jonmeow December 20, 2024 15:39
@danakj
Copy link
Contributor Author

danakj commented Dec 20, 2024

PTAL :)

Copy link
Contributor

@jonmeow jonmeow left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@danakj danakj enabled auto-merge December 20, 2024 16:06
@danakj danakj added this pull request to the merge queue Dec 20, 2024
Merged via the queue into carbon-language:trunk with commit 6cb660f Dec 20, 2024
8 checks passed
@danakj danakj deleted the inst-namer-recursion branch December 20, 2024 16:22
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants