Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Iterator invalidation: detach() during traverse() #104

Closed
sunjay opened this issue Feb 17, 2019 · 2 comments · Fixed by #437
Closed

Iterator invalidation: detach() during traverse() #104

sunjay opened this issue Feb 17, 2019 · 2 comments · Fixed by #437
Assignees

Comments

@sunjay
Copy link
Contributor

sunjay commented Feb 17, 2019

Rust typically protects you from iterator invalidation by not allowing you to modify a collection while you are iterating over it. When you use constructs like RefCell, the Rust compiler won't complain, but these checks are still performed when the program is running.

I found out the hard way today that there is a case where you can cause an iterator over nodes to become invalidated and instead of panicking, it will fail silently and just stop iterating.

In Rust pseudo-code, here's what I was doing:

let root = comrak::parse_document(...);

for node in root.traverse() {
    if !want_to_modify_node(node) {
        continue;
    }

    for child in nodechildren() {
        if !want_to_modify_child(node, child) {
            continue;
        }

        child.detach();
    }
}

By detaching the child during traverse, I inadvertantly invalidated the iterator. This didn't result in any memory unsafety. Instead, the iterator just never continues past the node whose child I detached.

This wasn't immediately obvious to me because I assumed that since I was deleting a child, the original node could still be iterated further.

Furthermore, the only reason I figured this out is probably because I am quite experienced in other languages where iterator invalidation is an issue. I imagine a newer programmer might not have come to the same conclusion.

If it isn't possible to panic in these situations with a helpful error message, we should at least add some documentation to all of the iterator methods that can be invalided by a detach.

An even better solution would be to make it so you can detach and still continue traversing the rest of the tree. I'm specifically asking to be able to detach a node that hasn't been traversed yet, I don't expect detaching a parent during traverse() to ever work.


A workaround for this (as with any case of iterator invalidation) is to store the nodes you want to detach and detach them all at once after you're done traversing.

@kivikakk
Copy link
Owner

I don't have a repro handy, but I wonder if replacing None here with a panic! causes your example to panic?

comrak/src/arena_tree.rs

Lines 357 to 361 in c911d0f

// `node.parent()` here can only be `None`
// if the tree has been modified during iteration,
// but silently stoping iteration
// seems a more sensible behavior than panicking.
None => None,

If so, the fix might be relatively easy. (I think panicking is better than silently exiting the iterator.)

@sunjay
Copy link
Contributor Author

sunjay commented Feb 24, 2019

but silently stoping iteration seems a more sensible behavior than panicking.

Heh. Well that didn't work out. That is a very good comment though. I love that the assumption was documented so fixing it is very easy!

I won't have time to work on this, so someone else is free to take it on! :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants