This repository was archived by the owner on Nov 20, 2018. It is now read-only.

Added custom DateTimeFormatter #716

Merged
pakrym merged 4 commits into aspnet:dev from the custom-datetime-formatter branch on Oct 5, 2016

Conversation

khellang
Contributor

@khellang khellang commented Sep 30, 2016

Closes #715

Benchmark Results

About 60% faster, with roughly 85% fewer allocations 😁

image

@khellang khellang force-pushed the custom-datetime-formatter branch 11 times, most recently from e20df05 to 248d389 Compare September 30, 2016 15:42
@@ -321,7 +321,7 @@ private void SetDate(string parameter, DateTimeOffset? date)
             else
             {
                 // Must always be quoted
-                var dateString = string.Format(CultureInfo.InvariantCulture, "\"{0}\"", HttpRuleParser.DateToString(date.Value));
+                var dateString = string.Format(CultureInfo.InvariantCulture, "\"{0}\"", HeaderUtilities.FormatDate(date.Value));
Member

Why isn't the quoting part of the formatting?

Contributor Author

@khellang khellang Sep 30, 2016

I think this might be a special requirement for Content-Disposition? @Tratcher?

Member

Correct, the date only needs to be quoted in some contexts, like ContentDispositionHeaderValue (CDHV).

Contributor

Maybe it's worth adding a quoted argument to HeaderUtilities.FormatDate?

{
    var offset = 0;
    char* target = stackalloc char[Rfc1123DateLength];
    var universalDateTime = dateTime.ToUniversalTime();
Member

It may be possible to use dateTime.UtcDateTime to save the extra allocation that ToUniversalTime() does, but I would test perf for both in case DateTimeOffset is better for this scenario than DateTime

Member

Doesn't ToUniversalTime() return a struct?

Contributor Author

@khellang khellang Sep 30, 2016

Yeah, ToUniversalTime just returns another DateTimeOffset using UtcDateTime; http://referencesource.microsoft.com/#mscorlib/system/datetimeoffset.cs,683

Do we really need to convert if the input is already UTC?

if (dateTime.Offset != TimeSpan.Zero) // negative offsets need converting too
    dateTime = dateTime.ToUniversalTime();
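
A minimal sketch of the point under discussion, assuming only the BCL (UtcConversionSketch and ToUtc are hypothetical names, not part of this PR): DateTimeOffset.UtcDateTime returns the UTC DateTime for positive and negative offsets alike, so an offset check is not needed for correctness.

```csharp
using System;

public static class UtcConversionSketch
{
    // Hypothetical helper: returns the UTC DateTime for any offset,
    // positive, negative, or zero, without a conditional.
    public static DateTime ToUtc(DateTimeOffset value)
    {
        return value.UtcDateTime;
    }
}
```

For example, 15:42 at +02:00 and 15:42 at -05:00 map to UTC hours 13 and 20 respectively, which is why a check that only handles positive offsets would miss the second case.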


namespace Microsoft.Net.Http.Headers
{
    internal static class DateTimeFormatter
Member

I'm unsure of the level of optimization that's really required for this code, but a few off the cuff thoughts:

  1. My intuition says there's gotta be a better way of copying the bytes than a foreach
  2. It might be beneficial to 'pack' all of these bytes into an array + offset for locality
  3. We should look into the inlinability of all of this (most related to 1)

Contributor Author

@khellang khellang Sep 30, 2016

What about now?

@hallatore did some testing with Buffer.Memcpy earlier today. It didn't change much. He also verified that stuff was inlined nicely.

I tested changing the foreach to use Buffer.Memcpy, but the results were identical.

Ref memcpy: https://gist.github.com/hallatore/a10ffd0d69a508d09b74a4f720c0511a

Member

You could also consider vectorizing this where the lengths are known, using (int *) or (long *). I think that what's there is probably fine, and if this is important it will come up again.

private static readonly byte[] SepBytes = UTF8.GetBytes(GetMonthName(9));
private static readonly byte[] OctBytes = UTF8.GetBytes(GetMonthName(10));
private static readonly byte[] NovBytes = UTF8.GetBytes(GetMonthName(11));
private static readonly byte[] DecBytes = UTF8.GetBytes(GetMonthName(12));
Contributor

Why do you store these as bytes and then cast back to chars?

Contributor Author

Uuh. That's a good question. I've been back and forth between char* and byte* a couple of times because of missing APIs for different targets. I'll revert back to char[].

{
    switch (dayOfWeek)
    {
        case DayOfWeek.Sunday: return Format(SunBytes, target, offset);
Contributor

Store these as an array and use indexing instead of a switch.

Contributor Author

You want these in an array of arrays? 😄

Contributor

Why not an array of strings?

Contributor Author

@khellang khellang Sep 30, 2016

Uhhh, yes, of course. I guess we could just reuse DateTimeFormatInfo.AbbreviatedDayNames and DateTimeFormatInfo.AbbreviatedMonthNames.

Member

Did you try measuring packing all of these into a single array? Then your switch just returns an offset into the array.

For instance your format could be like:

M O N , T U E , W E D , ...

Then you can just cast to (int *) and blit 4 bytes at once, including the comma.
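
A minimal sketch of the packing idea, under stated assumptions: it uses a single packed string and string.CopyTo on a char[] as a safe stand-in for the pointer cast suggested above, so it copies four chars per entry instead of blitting four bytes. PackedDayNames, Packed, and EntryLength are hypothetical names, not code from this PR.

```csharp
using System;

public static class PackedDayNames
{
    // All abbreviated day names packed into one string, four chars per
    // entry (three letters plus the trailing comma), in DayOfWeek order.
    private const string Packed = "Sun,Mon,Tue,Wed,Thu,Fri,Sat,";
    private const int EntryLength = 4;

    // Copies the entry for dayOfWeek into target at offset and returns
    // the offset just past the copied entry.
    public static int FormatDayOfWeek(DayOfWeek dayOfWeek, char[] target, int offset)
    {
        Packed.CopyTo((int)dayOfWeek * EntryLength, target, offset, EntryLength);
        return offset + EntryLength;
    }
}
```

The lookup becomes a multiplication by the entry length, with no switch and no per-name array. Whether this beats the indexed string array would need to be measured.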

{
    switch (month)
    {
        case 1: return Format(JanBytes, target, offset);
Contributor

Indexing.

@khellang
Contributor Author

khellang commented Sep 30, 2016

That got rid of quite a bit of code 😝


private static unsafe int FormatDayOfWeek(DayOfWeek dayOfWeek, char* target, int offset)
{
    return Format(FormatInfo.AbbreviatedDayNames[(int) dayOfWeek], target, offset);
Member

Could we cache this array, and similarly for the month names?
See http://referencesource.microsoft.com/#mscorlib/system/globalization/datetimeformatinfo.cs,1393 for what it does on every get.

Contributor Author

Yeah, using those regressed perf substantially:

image

Caching the arrays did help, but it's still worse than it was:

image

Contributor

Where did the additional allocations come from?

Contributor Author

Probably a string enumerator. Let me get rid of that foreach loop 😉

Contributor Author

@khellang khellang Sep 30, 2016

Aaaand we're back... Much better ✨

image

private static readonly string[] MonthNames = FormatInfo.AbbreviatedMonthNames;

// The format is "ddd, dd MMM yyyy HH:mm:ss GMT".
private const int Rfc1123DateLength = 29;
Contributor

Nitpicking, but you could have done Rfc1123DateLength = "ddd, dd MMM yyyy HH:mm:ss GMT".Length

@khellang
Contributor Author

khellang commented Oct 3, 2016

Alright. Here's the current state of this PR:

image

About 60% faster, with roughly 85% fewer allocations.

There are other, faster alternatives, but I'm not sure how far you'd like to stretch; https://gist.github.com/aL3891/2644c97e55f2f6c67b1ade15228031b8

@khellang khellang force-pushed the custom-datetime-formatter branch 2 times, most recently from 5ba9c76 to df88a1c Compare October 3, 2016 13:34
@BrennanConroy
Member

Where did the extra 3 bytes come from?

@khellang
Contributor Author

khellang commented Oct 3, 2016

@BrennanConroy It varies a bit from run to run. Not really sure why. I think it might average out the bytes allocated for the statics on the first op. @mattwarren?

EDIT: Yeah, looks like it takes the total amount of bytes allocated and divides it by the total number of operations. I think the number of ops can vary for each run. https://github.com/PerfDotNet/BenchmarkDotNet/blob/b35d523dfc859cc0f94897be124e675b79f74845/src/BenchmarkDotNet.Diagnostics.Windows/MemoryDiagnoser.cs#L160
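
A small sketch of the averaging described above, with made-up numbers: if a one-time allocation is spread over every operation in a run, the reported per-operation figure shifts as the operation count changes. AllocationAverageSketch and BytesPerOp are hypothetical names, and integer division is used here only for simplicity. This is not BenchmarkDotNet's code.

```csharp
public static class AllocationAverageSketch
{
    // Total bytes (one-time allocation plus the steady per-operation
    // cost) divided by the number of operations in the run.
    public static long BytesPerOp(long oneTimeBytes, long perOpBytes, long operations)
    {
        return (oneTimeBytes + perOpBytes * operations) / operations;
    }
}
```

With a hypothetical 3000 bytes of one-time allocation and 80 bytes per operation, a run of 1,000 operations averages 83 bytes per operation, while a run of 100,000 operations averages 80.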

@pakrym
Contributor

pakrym commented Oct 3, 2016

@khellang we have multiple PRs with a similar string-construction use case. Can you please try using dotnet/extensions#157 to format the date time, so we have a common pattern in both of them?

@khellang
Contributor Author

khellang commented Oct 3, 2016

@pakrym I already have it sitting here; khellang@6d9345f. I ran a quick benchmark and it looks like it regressed perf slightly (not sure why), but I think it's nice for consistency. Want me to include it in this PR?

@khellang
Contributor Author

khellang commented Oct 3, 2016

The allocations are still the same, though. That was the main point of this PR (and related issue) before things turned into a perf contest 😛

@pakrym
Contributor

pakrym commented Oct 3, 2016

@khellang that would be awesome.

@khellang khellang force-pushed the custom-datetime-formatter branch from 0c806c2 to 7e454c9 Compare October 4, 2016 10:53
@khellang
Contributor Author

khellang commented Oct 4, 2016

🆙📅 Moved to InplaceStringBuilder and added a quoted argument.
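
A rough sketch of what a formatter with a quoted argument might look like, under stated assumptions: it writes into a plain char[] instead of InplaceStringBuilder (so it allocates one more object than the merged code presumably does), and every name here (Rfc1123Sketch, Append, AppendNumber) is hypothetical. It illustrates the fixed-length layout "ddd, dd MMM yyyy HH:mm:ss GMT". It is not the code from this PR.

```csharp
using System;

public static class Rfc1123Sketch
{
    private static readonly string[] DayNames =
        { "Sun", "Mon", "Tue", "Wed", "Thu", "Fri", "Sat" };
    private static readonly string[] MonthNames =
        { "Jan", "Feb", "Mar", "Apr", "May", "Jun", "Jul", "Aug", "Sep", "Oct", "Nov", "Dec" };

    // Length of "ddd, dd MMM yyyy HH:mm:ss GMT".
    private const int Rfc1123DateLength = 29;

    public static string FormatDate(DateTimeOffset dateTime, bool quoted = false)
    {
        var utc = dateTime.UtcDateTime;
        // Two extra chars for the surrounding quotes when requested.
        var buffer = new char[quoted ? Rfc1123DateLength + 2 : Rfc1123DateLength];
        var offset = 0;

        if (quoted) buffer[offset++] = '"';
        offset = Append(buffer, offset, DayNames[(int)utc.DayOfWeek]);
        offset = Append(buffer, offset, ", ");
        offset = AppendNumber(buffer, offset, utc.Day, 2);
        buffer[offset++] = ' ';
        offset = Append(buffer, offset, MonthNames[utc.Month - 1]);
        buffer[offset++] = ' ';
        offset = AppendNumber(buffer, offset, utc.Year, 4);
        buffer[offset++] = ' ';
        offset = AppendNumber(buffer, offset, utc.Hour, 2);
        buffer[offset++] = ':';
        offset = AppendNumber(buffer, offset, utc.Minute, 2);
        buffer[offset++] = ':';
        offset = AppendNumber(buffer, offset, utc.Second, 2);
        offset = Append(buffer, offset, " GMT");
        if (quoted) buffer[offset++] = '"';

        return new string(buffer, 0, offset);
    }

    // Copies value into buffer at offset and returns the new offset.
    private static int Append(char[] buffer, int offset, string value)
    {
        value.CopyTo(0, buffer, offset, value.Length);
        return offset + value.Length;
    }

    // Writes value as exactly `digits` zero-padded decimal digits.
    private static int AppendNumber(char[] buffer, int offset, int value, int digits)
    {
        for (var i = digits - 1; i >= 0; i--)
        {
            buffer[offset + i] = (char)('0' + value % 10);
            value /= 10;
        }
        return offset + digits;
    }
}
```

Because every field has a fixed width, the buffer size is known up front, which is what makes a preallocated builder a natural fit. The unquoted output should match DateTime.ToString("r") on the UTC value.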

@khellang khellang force-pushed the custom-datetime-formatter branch from 7e454c9 to bd26df3 Compare October 4, 2016 11:02
{
    var universal = dateTime.UtcDateTime;

    var length = quoted ? QuotedRfc1123DateLength : Rfc1123DateLength;
Member

QuotedRfc1123DateLength is only used once, so you might as well put the +2 here.

@@ -24,6 +24,7 @@
"frameworks": {
  "netstandard1.1": {
    "dependencies": {
      "Microsoft.Extensions.Primitives": "1.1.0-*",
Member

This goes under the root dependencies, it's not framework specific.

Contributor Author

What about System.Buffers? That's not framework specific either?

Member

Move them all

Contributor Author

Shouldn't NETStandard.Library be moved to be framework-specific? Like khellang@b5f8a8d?

Member

Why?

Contributor Author

Doesn't it pull down lots of stuff you don't need when you target net451?

Member

Sure, once

Contributor Author

But does it hurt to move it under netstandard? Is this what you're recommending library authors do?

@pakrym pakrym merged commit 063d6ec into aspnet:dev Oct 5, 2016
@pakrym
Contributor

pakrym commented Oct 5, 2016

Thank you @khellang !

9 participants