Avoid quadratic complexity when splitting output into lines #9586
(rust-highfive has picked a reviewer for you, use r? to override) |
Thanks for this! Can you describe how you saw this arise as well? I'm curious why this hasn't been caught before or if it's only a niche-ish situation where the perf matters. Additionally would you be up for separating out this structure to a separate thing which can be unit-tested? I fear that the internals of the management of lines here has grown with complications to the point that it's probably best to have unit tests to ensure we don't accidentally regress things. |
I encountered this while debugging an issue in rustc, more than once in fact. Each time I had logging enabled which generated very long lines. The end result was cargo stuck using 100% CPU without making any visible progress. I would rather not take on the job of refactoring this to the point where it can be unit tested.
Hm ok, well unfortunately I'm not personally confident enough in this code to sign off on it. I would be much more confident with some unit tests showing what's happening. I'd be more ok landing this if it were a pressing problem, but if it's mostly just debugging rustc then, while I think this is important to fix, it's best to fix it in a way that doesn't run the risk of regressing other use cases in Cargo.
I introduced an intermediate refactoring, so that there is only a single place that drains the data buffer. With that in place, the optimization itself is straightforward. Could you take another look? If you consider that too risky still, that's fair of course. |
Ok yeah this looks more understandable, thanks! I left a few small comments inline.
When searching for newlines in process output, keep track of which part of the buffer has already been examined to avoid processing the same data again and again.
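The idea in the commit message can be sketched roughly as follows. This is a minimal illustration, not Cargo's actual code: `LineSplitter`, `push`, and the `scanned` field are hypothetical names chosen for the example. The buffer remembers how many leading bytes are already known to contain no newline, so a very long line arriving in many small chunks is scanned once overall rather than from the start on every read.

```rust
/// Hypothetical helper for illustration only; not taken from Cargo.
struct LineSplitter {
    /// Bytes received so far that do not yet form a complete line.
    buf: Vec<u8>,
    /// Number of leading bytes of `buf` already known to contain no b'\n'.
    scanned: usize,
}

impl LineSplitter {
    fn new() -> Self {
        LineSplitter { buf: Vec::new(), scanned: 0 }
    }

    /// Appends `chunk` and returns every complete line now available,
    /// without the trailing newline.
    fn push(&mut self, chunk: &[u8]) -> Vec<Vec<u8>> {
        self.buf.extend_from_slice(chunk);
        let mut lines = Vec::new();
        // Start of the line currently being assembled.
        let mut start = 0;
        // Resume the search where the previous call stopped instead of
        // rescanning bytes that were already examined.
        let mut pos = self.scanned;
        while let Some(off) = self.buf[pos..].iter().position(|&b| b == b'\n') {
            let end = pos + off;
            lines.push(self.buf[start..end].to_vec());
            start = end + 1;
            pos = start;
        }
        // The single place where consumed data is drained from the buffer.
        self.buf.drain(..start);
        // Whatever remains has been fully examined and holds no newline.
        self.scanned = self.buf.len();
        lines
    }
}
```

Without the `scanned` offset, each call would search the whole buffer again, so a line of n bytes delivered in small chunks would cost on the order of n² byte comparisons. With it, each byte is examined once.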
@bors: r+ |
📌 Commit 8df771f has been approved by |
☀️ Test successful - checks-actions |
Update cargo

This also updates `opener` used in bootstrap (to try to keep dependencies unified).

18 commits in 44456677b5d1d82fe981c955dc5c67734b31f340..9233aa06c801801cff75df65df718d70905a235e
2021-06-12 18:00:01 +0000 to 2021-06-22 21:32:55 +0000
- Detect incorrectly named cargo.toml (rust-lang/cargo#9607)
- Unify weak and namespaced features. (rust-lang/cargo#9574)
- Change `rustc-cdylib-link-arg` error to a warning. (rust-lang/cargo#9563)
- Updates to future-incompatible reporting. (rust-lang/cargo#9606)
- Add a compatibility notice for diesel and the new resolver. (rust-lang/cargo#9602)
- Don't allow config env to modify vars set by cargo (rust-lang/cargo#9579)
- Disambiguate is_symlink. (rust-lang/cargo#9604)
- Update opener requirement from 0.4 to 0.5 (rust-lang/cargo#9583)
- Avoid quadratic complexity when splitting output into lines (rust-lang/cargo#9586)
- Bump to 0.56.0, update changelog (rust-lang/cargo#9597)
- Fix dep-info files including non-local build script paths. (rust-lang/cargo#9596)
- Relax doc collision error. (rust-lang/cargo#9595)
- Handle "jobs = 0" case in cargo config files (rust-lang/cargo#9584)
- Enhancements to testsuite error output. (rust-lang/cargo#9589)
- Fix typo (rust-lang/cargo#9590)
- Enable support for fix --edition for 2021. (rust-lang/cargo#9588)
- Add more details for installing git repository errors (rust-lang/cargo#9582)
- More information for links conflicting (rust-lang/cargo#9568)