iter_gitworktree() too slow #540

Closed
mih opened this issue Nov 22, 2023 · 1 comment · Fixed by #544

Comments

@mih
Member

mih commented Nov 22, 2023

See #539 (comment) for the benchmark.

@mih
Member Author

mih commented Nov 23, 2023

I looked at the runtime impact of individual code components, using my usual 36k-file test case (which also contains a few thousand untracked files).

Baseline with the code of f2a6fe is ~300ms.

Making _lsfiles_line2props() a noop: ~180ms.

If _lsfiles_line2props() is just regex matching: ~220ms.

Adding back the line PurePosixPath(props.group('fname')) brings the runtime back to almost the baseline.

It seems that _get_item() has little to no runtime impact. I made it yield a constant item (rather than creating individual ones) -- same performance.
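For anyone who wants to reproduce this kind of breakdown outside the code base, here is a minimal micro-benchmark sketch. It is not the code from f2a6fe: the regex, the synthetic input lines (modeled on the `<mode> <sha> <stage>\t<fname>` layout of `git ls-files --stage`), and all function names are assumptions made for illustration. It only separates the cost of regex matching from the cost of constructing a `PurePosixPath` per line.

```python
import re
import timeit
from pathlib import PurePosixPath

# Hypothetical pattern for lines shaped like `git ls-files --stage` output:
# "<mode> <gitsha> <stage>\t<fname>". Not the pattern used in the real code.
LSFILES_RE = re.compile(
    r'(?P<mode>[0-9]+) (?P<gitsha>\S+) (?P<stage>[0-9]+)\t(?P<fname>.*)$'
)


def make_lines(n=36_000):
    """Generate synthetic input lines (stand-in for a real 36k-file worktree)."""
    return [
        f'100644 {i:040x} 0\tdir{i % 100}/file{i}.dat'
        for i in range(n)
    ]


def parse_noop(line):
    """Do nothing: measures the pure iteration overhead."""
    return None


def parse_regex_only(line):
    """Regex matching only, no path object construction."""
    return LSFILES_RE.match(line)


def parse_regex_and_path(line):
    """Regex matching plus building a PurePosixPath from the file name."""
    props = LSFILES_RE.match(line)
    return PurePosixPath(props.group('fname'))


def bench(func, lines, repeat=5):
    """Return the best-of-`repeat` wall time (seconds) for one pass over `lines`."""
    return min(
        timeit.repeat(lambda: [func(line) for line in lines],
                      number=1, repeat=repeat)
    )


if __name__ == '__main__':
    lines = make_lines()
    for func in (parse_noop, parse_regex_only, parse_regex_and_path):
        print(f'{func.__name__}: {bench(func, lines) * 1000:.1f} ms')
```

Absolute numbers will differ from the ones above, since this leaves out the `git` subprocess and the rest of the iterator. Only the relative differences between the three variants are meaningful.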

mih added a commit that referenced this issue Nov 23, 2023
In #540 I found ~10% of the runtime to be attributable to the
regex-matching. This patch replaces the regex with two split()
calls. In local benchmarks I see a ~10% speedup.

One major other source of slow-down was/is the construction of
`PurePosixPath` objects. It seems this is being taken care of by
python/cpython#101362 and will be resolved
eventually.

Closes #540
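As a rough illustration of the approach named in the commit message, a regex can be replaced by two `split()` calls along these lines. This is a sketch, not the actual patch from #544: the function name, the returned dict keys, and the assumed input layout (`<mode> <gitsha> <stage>\t<fname>`, as printed by `git ls-files --stage`) are hypothetical, and lines without a TAB (for example plain untracked paths) are not handled.

```python
def line2props_split(line):
    """Parse one stage-format line using two split() calls instead of a regex.

    Assumed (hypothetical) input layout: "<mode> <gitsha> <stage>\\t<fname>".
    """
    # First split: separate the property block from the file name at the
    # first TAB only, so a TAB inside the file name is preserved.
    props, fname = line.split('\t', maxsplit=1)
    # Second split: the property block holds three space-separated fields.
    mode, gitsha, stage = props.split(' ')
    return {'mode': mode, 'gitsha': gitsha, 'stage': stage, 'fname': fname}


if __name__ == '__main__':
    print(line2props_split('100644 0123abcd 0\tsome dir/file.dat'))
```

Splitting at the first TAB before touching the spaces matters here, because file names may themselves contain spaces.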
mih closed this as completed in #544 Nov 26, 2023