Skip to content
This repository was archived by the owner on Dec 18, 2018. It is now read-only.

add header ordering and short circuits to known headers #1135

Merged
merged 2 commits into from
Oct 24, 2016

Conversation

nathana1
Copy link
Contributor

I revisited the 'header trim' we explored before RTM (#795) and it still seemed to be about a 3% performance win for techempower plaintext.

To avoid having to decide which headers to remove, this change allows us to put the most common headers first and can short circuit instead of having to check every single known header. This results in almost the same perf as the header trim. In million reqs/sec:

Baseline Header Trim Short Circuit
5.18 5.42 5.39

var pUI = (uint*)pUB;
var pUS = (ushort*)pUB;
switch (keyLength)
{{{Each(loop.Headers.Where(h => h.PrimaryHeader).GroupBy(x => x.Name.Length), byLength => $@"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this doing anything different than AppendNonPrimaryHeaders? Couldn't loop.Headers just be reordered to have put the PrimaryHeaders first?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GroupBy(x => x.Name.Length) breaks that ordering

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer it if the template could be parameterized with the list of headers as the argument instead of copy/pasted.

{Each(loop.Headers.OrderBy(h => !h.PrimaryHeader), header => $@"
if ({header.TestBit()})
{{
_headers._{header.Identifier} = default(StringValues);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a micro benchmark showing this is faster than _headers = default(HeaderReferences);? Is there any number of set headers where this is no longer the case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just calling ClearFast 100M times with current implementation takes about 3s on my dev box. With this change and techempower headers set it takes about .5s. However if I set all the bits it takes 8s. 12 bits seems to be near the tipover point, with check >12 bits the max time is about 3.5s. FrameHeaders.BitCount starts to show up as fairly significant in profiling the microbenchmark. Not sure if we can do something to skip it sometimes or if it isn't significant in more real world scenarios.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The HeaderReferences is a pretty big struct - does the = default() change improve after this tweak in coreclr? dotnet/coreclr#7198

Alas its not currently part of 1.1.0; was going to be added dotnet/coreclr#7341 but..?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't tested that change yet, but moving to 1.1.0 looks like a fairly consistent 1.5-2% win. Since JIT_Memset was showing up in the profiles for ClearFast that could be a win for the current implementation.

@halter73
Copy link
Member

Normally I would worried about over fitting to the plaintext benchmark, but this doesn't seem to make performance worse when there are more headers. I'm curious about whether ClearFast doesn't become slower with more headers though.

@nathana1 nathana1 force-pushed the nathana1/known-header-short-circuit branch from 1852cec to 26229fa Compare October 6, 2016 22:28
@nathana1
Copy link
Contributor Author

nathana1 commented Oct 6, 2016

New commit rebases against dev, refactors Append to have template string and adds a bailout to ClearFast if there's too many (12) headers.

}


if (((_bits & 1048576L) != 0))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This indentation looks a bit off.

@halter73
Copy link
Member

LGTM. @CesarBS @pakrym Mind taking a quick look?

@nathana1 can you rebase again? Thanks.

@@ -4601,6 +5097,8 @@ public unsafe void Append(byte[] keyBytes, int keyOffset, int keyLength, string
}
}
break;

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove these linebreaks

@pakrym
Copy link
Contributor

pakrym commented Oct 19, 2016

:shipit:

refactor known header changes

- create template string from Append switch
- add a check to ClearFast to bail out if too many headers are set
@nathana1 nathana1 force-pushed the nathana1/known-header-short-circuit branch from 26229fa to 11ed34f Compare October 19, 2016 23:11
@halter73 halter73 merged commit 610601c into dev Oct 24, 2016
@halter73 halter73 deleted the nathana1/known-header-short-circuit branch October 26, 2016 00:23
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants