fix O(n^2) perf bug for std::io::fs::walk_dir #13720

Merged
merged 1 commit into rust-lang:master
Apr 24, 2014

Conversation

aturon
Member

@aturon aturon commented Apr 24, 2014

The walk_dir iterator was simulating a queue using a vector (in particular, using shift),
leading to O(n^2) performance. Since the order was not well-specified (see issue #13411),
the simplest fix is to use the vector as a stack (and thus yield a depth-first traversal).
This patch does exactly that, and adds a test checking for depth-first behavior.

Note that the underlying readdir function does not specify any particular order, nor
does the system call it uses.

Closes #13411.
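
To illustrate the difference described above, here is a minimal sketch in present-day Rust, not the code from this patch. It assumes a hypothetical in-memory `Tree` in place of the real filesystem, and uses `Vec::remove(0)` as a stand-in for the old `shift` method; the names `Tree`, `walk_queue`, `walk_stack`, and `sample_tree` are invented for the example.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory stand-in for a directory tree: maps a directory
/// to its immediate children. This is an assumption for illustration, not
/// the `std::io::fs` API.
type Tree = HashMap<&'static str, Vec<&'static str>>;

/// Queue simulated with a vector. Taking the first element moves every
/// remaining element down by one slot, so each step costs O(len) and the
/// whole walk can degrade to O(n^2).
fn walk_queue(tree: &Tree, root: &'static str) -> Vec<&'static str> {
    let mut pending = vec![root];
    let mut visited = Vec::new();
    while !pending.is_empty() {
        let dir = pending.remove(0); // stand-in for the old `shift`
        visited.push(dir);
        if let Some(children) = tree.get(dir) {
            pending.extend(children.iter().copied());
        }
    }
    visited
}

/// The same vector used as a stack. `pop` is O(1), and the walk becomes
/// depth-first while still visiting each parent before its children.
fn walk_stack(tree: &Tree, root: &'static str) -> Vec<&'static str> {
    let mut pending = vec![root];
    let mut visited = Vec::new();
    while let Some(dir) = pending.pop() {
        visited.push(dir);
        if let Some(children) = tree.get(dir) {
            pending.extend(children.iter().copied());
        }
    }
    visited
}

/// Small example tree: `a` contains `a/b` and `a/c`; `a/b` contains `a/b/d`.
fn sample_tree() -> Tree {
    let mut tree = Tree::new();
    tree.insert("a", vec!["a/b", "a/c"]);
    tree.insert("a/b", vec!["a/b/d"]);
    tree
}
```

On the sample tree, `walk_queue` visits `a`, `a/b`, `a/c`, `a/b/d` (breadth-first), while `walk_stack` visits `a`, `a/c`, `a/b`, `a/b/d` (depth-first). Both orders are top-down; only the cost of taking the next entry differs.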

@alexcrichton
Member

We don't currently have a pressing reason to define a particular ordering of yielded directories from this function, so I'm OK leaving it unspecified. If we want to be conservative, perhaps the documentation could indicate that the order of traversal is not specified and is subject to change?

The `walk_dir` iterator was simulating a queue using a vector (in particular, using `shift`),
leading to O(n^2) performance. Since the order was not well-specified (see issue rust-lang#13411),
the simplest fix is to use the vector as a stack (and thus yield a depth-first traversal).
This patch does exactly that.  It leaves the order as originally specified -- "some top-down
order" -- and adds a test to ensure a top-down traversal.

Note that the underlying `readdir` function does not specify any particular order, nor
does the system call it uses.

Closes rust-lang#13411.
@aturon
Member Author

aturon commented Apr 24, 2014

@alexcrichton Updated the patch to revert the spec comment, and changed the test to only ensure some top-down order, rather than depth-first.

r?
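
One way such a "some top-down order" check could look is sketched below in present-day Rust. This is an assumption about the general shape of the check, not the test added by the patch; the helper name `is_top_down` is hypothetical. It accepts any order in which a directory is yielded before the entries beneath it, so both breadth-first and depth-first traversals pass.

```rust
use std::collections::HashSet;
use std::path::Path;

/// Returns true if every path in `order` appears after its parent,
/// whenever that parent is itself part of the walk. Hypothetical helper
/// for illustration only.
fn is_top_down(order: &[&str]) -> bool {
    // All paths that the walk yields at some point.
    let all: HashSet<&Path> = order.iter().map(|p| Path::new(*p)).collect();
    // Paths yielded so far.
    let mut seen: HashSet<&Path> = HashSet::new();
    for p in order {
        let path = Path::new(*p);
        if let Some(parent) = path.parent() {
            // A parent that belongs to the walk must already have been seen.
            if all.contains(parent) && !seen.contains(parent) {
                return false;
            }
        }
        seen.insert(path);
    }
    true
}
```

For example, `is_top_down(&["a", "a/c", "a/b", "a/b/d"])` and `is_top_down(&["a", "a/b", "a/c", "a/b/d"])` both hold, whereas an order that yields `a/b/d` before `a/b` is rejected.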

bors added a commit that referenced this pull request Apr 24, 2014
@bors bors closed this Apr 24, 2014
@bors bors merged commit b536d2b into rust-lang:master Apr 24, 2014
@aturon aturon deleted the walk_dir-perf branch April 24, 2014 22:23
Successfully merging this pull request may close these issues.

std::io::fs::walk_dir has O(n^2) behaviour