Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

perf: optimize walk iterator #3

Merged
merged 1 commit into from
Jun 2, 2018
Merged

perf: optimize walk iterator #3

merged 1 commit into from
Jun 2, 2018

Conversation

haltcase
Copy link
Owner

@haltcase haltcase commented Jun 1, 2018

Closes #2

This should already eliminate unnecessary directory traversals but before merging I'd like to have a clear before & after. Reviews would be appreciated in the meantime.

@haltcase haltcase added the type: performance Opportunity for performance improvements. label Jun 1, 2018
@haltcase
Copy link
Owner Author

haltcase commented Jun 1, 2018

I ran a comparison using this script:

https://gist.github.com/citycide/8884d5bb845c782e71fd5fa72783afa7

I also dropped a counter into src/glob.nim to keep track of the number of file system items we walk over.

It creates a file structure with a bunch of nested directories & files (using some procs I pulled from our test file), and then uses the glob pattern *.nim which is a shallow match of any files with a .nim extension in the current directory. This means we can't possibly match files in sub-directories, so we shouldn't even enter them.

branch time taken fs items walked
Unoptimized (master) 0.035885 410
Optimized (perf-walk) 0.001273 15

🎉

@data-man
Copy link

data-man commented Jun 1, 2018

For *nix you can use ftw. I'm sure it will be faster.

@data-man
Copy link

data-man commented Jun 1, 2018

There is cool Rust library: walkdir.
For the ideas. :)

@haltcase
Copy link
Owner Author

haltcase commented Jun 2, 2018

@data-man would that really be a perf gain over the stdlib walkDir implementation? Why wouldn't ftw be used there, and should it?

@data-man
Copy link

data-man commented Jun 2, 2018

@citycide
I don't know.
I'm study Nim for less than a year, and *nix a bit longer. I'm former Windows user. :)

Maybe, ftw isn't available on all kernels, libs, etc..

@haltcase
Copy link
Owner Author

haltcase commented Jun 2, 2018

Got it — I'll look more into ftw but I'll merge this as is. Can always do more down the road 😄

@haltcase haltcase changed the title perf: optimize walk iterator [WIP] perf: optimize walk iterator Jun 2, 2018
@haltcase haltcase merged commit 5232594 into master Jun 2, 2018
@haltcase haltcase deleted the perf-walk branch June 2, 2018 18:06
@data-man
Copy link

data-man commented Jun 6, 2018

BTW maybe pathMatches from PR Add os.pathMatches proc will be useful for you.

@haltcase
Copy link
Owner Author

haltcase commented Jun 6, 2018

@data-man not sure that API would be useful in glob because Windows' PathMatchSpec only supports a very limited subset of the features involved here, I think just * and ? wildcard matching. The goal of this library though is to provide all the posix features on all platforms, including extended features like {a,b} groups, [ab] character classes, and ?()/@() etc pattern matching.

@data-man
Copy link

data-man commented Jun 6, 2018

@citycide

Maybe something like useNativeMatch can be faster. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: performance Opportunity for performance improvements.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants