-
Notifications
You must be signed in to change notification settings - Fork 12.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid calling replace in normalizeSlashes when it would do nothing #44100
Conversation
On Windows, there will probably be a negligible slowdown, iterating over the pre-slash prefix of each unnormalized path (though we might come out ahead if paths are normalized more than once). On *nix, this saves work - 1.8s -> 0.4s in the project I'm investigating.
@typescript-bot perf test this |
@amcasey Here they are:Comparison Report - master..44100
System
Hosts
Scenarios
Developer Information: |
Hey @amcasey, something went wrong when publishing results. (You can check the log here). |
This isn't exactly a "noop", instead we're now scanning the entire string twice (once for the export function normalizeSlashes(path: string): string {
let index = path.indexOf(altDirectorySeparator);
if (index !== -1) {
let newPath = path.slice(0, index) + "/";
let start = index + 1;
index = path.indexOf(altDirectorySeparator, start);
while (index !== -1) {
newPath += path.slice(start, index) + "/";
start = index + 1;
index = path.indexOf(altDirectorySeparator, start);
}
return newPath + path.slice(start);
}
return path;
} |
Hmm. Seems like Daniel's might be more efficient?
I'm generating a bunch of random paths with random character separators, and this is just quick and dirty. I can look at improving my microbenchmark sample size after I get some dinner. |
I wonder how well it would work with the single static buffer case. const normalizeSlashesSegmentBuffer: string[] = [];
/**
* Normalize path separators, converting `\` into `/`.
*/
export function normalizeSlashes(path: string): string {
normalizeSlashesSegmentBuffer.length = 0;
let lastSliceStart = 0;
for (let i = 0; i < path.length; i++) {
const c = path.charCodeAt(i);
if (c === CharacterCodes.backslash) {
normalizeSlashesSegmentBuffer.push(path.slice(lastSliceStart, i));
lastSliceStart = i + 1;
}
}
if (normalizeSlashesSegmentBuffer.length > 0) {
normalizeSlashesSegmentBuffer.push(path.slice(lastSliceStart));
return normalizeSlashesSegmentBuffer.join(directorySeparator);
}
return path;
} |
The shared array case seems to be slower, but I also modified @DanielRosenwasser's code to use String.prototype.concat without using an array and that seems to be fairly efficient:
function normalizeSlashes(path: string): string {
let lastSliceStart = 0;
let newPath: string | undefined;
for (let i = 0; i < path.length; i++) {
const c = path.charCodeAt(i);
if (c === CharacterCodes.backslash) {
const segment = path.slice(lastSliceStart, i);
lastSliceStart = i + 1;
newPath = newPath === undefined ? segment : newPath.concat(directorySeparator, segment);
}
}
return newPath === undefined ? path : newPath.concat(directorySeparator, path.slice(lastSliceStart));
} |
Interesting - I guess I would just check on how that works in the original code sample in case the resulting string representations end up making a difference later on. |
I meant that it was a no-op in the case where the slashes are already normalized. Obviously, it's wasted work (though not much - a traversal up to the first slash) in the case where normalization is required.
We're only going to scan the entire string for
My first attempt was to unroll it. It made things much worse (tripled the time in the real-world sample). Even if we write one that shows improved performance on a micro-benchmark, my concern is that any replacement happening in JS, rather than native code, will produce a lot of temporary strings. If you settle on a winner based on microbenchmarks, I'm happy to drop it into the real repro to see what difference it makes. |
@typescript-bot perf test this |
Heya @amcasey, I've started to run the perf test suite on this PR at 0c797a5. You can monitor the build here. Update: The results are in! |
@amcasey Here they are:Comparison Report - master..44100
System
Hosts
Scenarios
Developer Information: |
@amcasey Hope you don't mind, I'm going to borrow this PR to run the perf test pipeline to make sure it is working... |
It didn't work, so I'm going to create a different PR to test things out on... |
Building typescript on windows, I see 73549 calls to Edit: my ts was out of date and had compilation errors. Updated. |
Which did you test? The one at the bottom of #44100 (comment) was the run that performed the best in my microbenchmarks on windows. |
@rbuckton, when you benchmarked, were 95% of your inputs already normalized? I realize numbers are the final arbiter in any perf debate, but I'm having trouble imagining how anything in JS-space could be faster than a native indexOf (which also, I would think, produces no garbage). |
One thing to be aware of when running micro benchmarks on string concatenation: on v8, concatenation using the |
@dmichon-msft Good point. I also think it's easy to miss GC costs in a micro-benchmark. |
@DanielRosenwasser I think #44101 is your preferred variant. Is that correct? |
@amcasey: This is the version that seemed to be winning in my microbenchmarks: function normalizeSlashes(path: string): string {
let lastSliceStart = 0;
let newPath: string | undefined;
for (let i = 0; i < path.length; i++) {
const c = path.charCodeAt(i);
if (c === CharacterCodes.backslash) {
const segment = path.slice(lastSliceStart, i);
lastSliceStart = i + 1;
newPath = newPath === undefined ? segment : newPath.concat(directorySeparator, segment);
}
}
return newPath === undefined ? path : newPath.concat(directorySeparator, path.slice(lastSliceStart));
} |
@rbuckton Single run on linux with node12 PR version
Micro-benchmark winner
|
pprof isn't cooperating, but here are the server-reported times for node16 (single run): PR:
Micro-benchmark winner:
|
@amcasey queue a regular perf test to ensure there isn't a regression in the benchmarks? |
@dmichon-msft Sure, but there won't be because they're all linux tests without backslashes. |
@typescript-bot perf test this |
Heya @amcasey, I've started to run the perf test suite on this PR at aaaf09f. You can monitor the build here. Update: The results are in! |
@amcasey Here they are:Comparison Report - master..44100
System
Hosts
Scenarios
Developer Information: |
It doesn't seem to have a significant impact on existing benchmarks, so I'm happy with whatever version seems to improve your specific test case. |
On Windows, there will probably be a negligible slowdown, iterating over the pre-slash prefix of each unnormalized path (though we might come out ahead if paths are normalized more than once).
On *nix, this saves work - 1.8s -> 0.4s in the project I'm investigating.