Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve HTTPHeaders description performance #3063

Merged
merged 8 commits into from
Jan 20, 2025
Merged

Conversation

supersonicbyte
Copy link
Contributor

Motivation:

As outlined in this issue by @weissi, the current performance of calling description on HTTPHeaders is undermined by the dynamism of Array.

Modifications:

The proposed solution replaces the implementation of description to iterate over the items and print them out manually. This provides a faster solution since we bypass the cost of calling into description of an Array.

Result:

A more performant implementation of HTTPHeaders description.

Motivation:

As outlined in this [issue](apple#2930) by @weissi, the current performance of calling `description`
on `HTTPHeaders` is undermined by the dynamism of Array.

Modifications:

The proposed solution replaces the implementation of `description` to iterate over the items and print them out manually. This provides a faster solution since we bypass the cost of calling into `description` of an Array.

Result:

A more performant implementation of `HTTPHeaders` description.
@Lukasa
Copy link
Contributor

Lukasa commented Jan 15, 2025

I think we need a benchmark here to get an idea of whether this is actually faster.

@supersonicbyte
Copy link
Contributor Author

Sure, I'm happy to add benchmarks to confirm this. @weissi Would you mind sharing with me how did you generate the images from the original issue? I tried to reach out to you on X/Twitter. Thanks.

@weissi
Copy link
Member

weissi commented Jan 16, 2025

Sure, I'm happy to add benchmarks to confirm this. @weissi Would you mind sharing with me how did you generate the images from the original issue? I tried to reach out to you on X/Twitter. Thanks.

Oh sorry, I'm mostly on Bluesky these days.

If you're on a Mac, the easiest thing is probably to write a quick harness that does

@_optimise(none) @inline(never)
func blackhole(_ string: String) { precondition(string.utf8.count >= 0) }

@inline(never)
func GO() {
    let httpHeaders = ["some": "example", "headers": "small", "and": "looooo...ng", ...]
    for _ in 0..<100_000 {
        let str = "\(httpHeaders)"
        precondition(str.utf8.count > 100)
   }
}

GO()

then compile that in release mode and run it with time ./my-harness.

To see the Flame Graphs on Mac, using Instruments is easiest.

@weissi
Copy link
Member

weissi commented Jan 16, 2025

And ideally, you'd add the benchmark to the NIOPerformanceTester target too.

@supersonicbyte
Copy link
Contributor Author

Thank you @weissi!

I did some tests and here is what I found.

Consider the following harness:

@_optimize(none) @inline(never)
func go() {
    let headers = Array(repeating: ("Test", "Test"), count: 100)

    for _ in 1...1_000 {
        let str = headers.description
        precondition(str.utf8.count > 100)
    }
}

go()

After compiling it with :

swiftc -O -o array_description array_description.swift

and running time the output is:

./array_description  0.22s user 0.00s system 98% cpu 0.232 total

Now, let's consider the harness with which replaces the call to description with the proposed optimization:

@_optimize(none) @inline(never)
func go() {
    let headers = Array(repeating: ("Test", "Test"), count: 100)

    for _ in 0...1_000 {
        let str = headers.lazy.map {
                "\($0.0): \($0.1)"
            }
            .joined(separator: "; ")
        precondition(str.utf8.count > 100)
    }
}

go()

We can copmpile it with the same flags and run time. The output is:

./array_optimized_description  0.02s user 0.00s system 93% cpu 0.028 total

We can see that we have a roughly 8x increase in speed.

I also added a benchmark to the NIOPerfomanceTester target too.

Last, but not least, the flame graphs 🔥.

image

image

@weissi
Copy link
Member

weissi commented Jan 17, 2025

[...]

Super cool, thank you!!

@Lukasa Lukasa added the 🔨 semver/patch No public API change. label Jan 17, 2025
Copy link
Contributor

@Lukasa Lukasa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shiny, this is fantastic. Thanks so much @supersonicbyte! ✨ I'll kick off CI and see what it thinks.

@Lukasa Lukasa enabled auto-merge (squash) January 17, 2025 17:56
@Lukasa
Copy link
Contributor

Lukasa commented Jan 20, 2025

Ok, so somewhat unsurprisingly we've improved our allocation count numbers, which causes the integration tests to fail (so that we can revise the thresholds downwards). If you're on macOS, the solution for updating these is to open the CI jobs one-by-one, copy the output, and then run pbpaste | ./dev/alloc-limits-from-test-output --json > IntegrationTests/tests_04_performance/Thresholds/<matching-threshold>.json. On other platforms the command will be roughly similar except that you'll need to handle the pasting a bit differently.

auto-merge was automatically disabled January 20, 2025 09:45

Head branch was pushed to by a user without write access

@supersonicbyte
Copy link
Contributor Author

Ok, so somewhat unsurprisingly we've improved our allocation count numbers, which causes the integration tests to fail (so that we can revise the thresholds downwards). If you're on macOS, the solution for updating these is to open the CI jobs one-by-one, copy the output, and then run pbpaste | ./dev/alloc-limits-from-test-output --json > IntegrationTests/tests_04_performance/Thresholds/<matching-threshold>.json. On other platforms the command will be roughly similar except that you'll need to handle the pasting a bit differently.

Hey @Lukasa, I don't believe that is the case actually. I have run the suggested command but it didn't generate any diff from current main. Then I dig in deeper and found out actually that the reason of the failing is test_07_headers_work.sh. It expected the old output of the HTTPHeaders description. I updated the test. Can you kick off the CI again, please? I think we should be good now.

@Lukasa
Copy link
Contributor

Lukasa commented Jan 20, 2025

Oh nice, good catch, thanks!

@Lukasa Lukasa enabled auto-merge (squash) January 20, 2025 16:30
@Lukasa Lukasa merged commit bf41ad1 into apple:main Jan 20, 2025
34 of 35 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🔨 semver/patch No public API change.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants