JackStouffer commented May 19, 2017

Here's a summary of the changes:

  • Replaces format with text for the error messages. format is slower because it has to parse the format string, and all that is needed here is concatenation.
  • The string being passed must be all ASCII characters to be valid, so there is no need to convert it to a dstring.
  • There's no need to check whether the day and month strings are all digits if you're going to convert them anyway. Leave validation to the conversion functions. This eliminates some loops.
  • Changed the conversions to target int rather than ubyte and short, as those types take slower paths inside std.conv.parse than int does.
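
To make the first and last points concrete, here is a minimal sketch (not code from the PR). The helper names messageViaFormat and messageViaText are made up for illustration; the point is only that both calls build the same message, and that converting to int gives the same value as converting to ubyte for a valid field.

```d
import std.conv : text, to;
import std.format : format;

// Hypothetical helper: builds the error message the old way.
// format has to interpret the "%s" specifier at run time.
string messageViaFormat(string input)
{
    return format("Invalid ISO String: %s", input);
}

// Hypothetical helper: builds the same message by plain concatenation.
string messageViaText(string input)
{
    return text("Invalid ISO String: ", input);
}

void main()
{
    // Both approaches produce an identical message.
    assert(messageViaFormat("2010-07-04") == messageViaText("2010-07-04"));

    // For a valid two-digit field, int and ubyte conversions agree on the value.
    assert(to!int("04") == to!ubyte("04"));
}
```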

Here are the results:

$ ldc2 -O -release test.d && ./test
old	4 secs, 541 ms, 59 μs, and 1 hnsec
new	1 sec, 410 ms, 713 μs, and 4 hnsecs

Results with the second commit:

old	4 secs, 456 ms, 712 μs, and 1 hnsec
new	1 sec, 105 ms, 830 μs, and 8 hnsecs
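
For anyone who wants the ratios rather than the raw timings, a small sketch along these lines works them out from the figures above. The speedup function is a hypothetical helper, not part of the benchmark; with these numbers it comes out to roughly 3.2x for the first commit and roughly 4.0x for the second.

```d
import core.time : Duration, dur;
import std.stdio : writefln;

// Hypothetical helper: ratio of two durations, compared at hnsec resolution.
double speedup(Duration before, Duration after)
{
    return cast(double) before.total!"hnsecs" / after.total!"hnsecs";
}

void main()
{
    // Timings copied from the results above.
    auto old1 = dur!"seconds"(4) + dur!"msecs"(541) + dur!"usecs"(59) + dur!"hnsecs"(1);
    auto new1 = dur!"seconds"(1) + dur!"msecs"(410) + dur!"usecs"(713) + dur!"hnsecs"(4);
    auto old2 = dur!"seconds"(4) + dur!"msecs"(456) + dur!"usecs"(712) + dur!"hnsecs"(1);
    auto new2 = dur!"seconds"(1) + dur!"msecs"(105) + dur!"usecs"(830) + dur!"hnsecs"(8);

    writefln("first commit:  %.2fx", speedup(old1, new1));
    writefln("second commit: %.2fx", speedup(old2, new2));
}
```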

Benchmark Code

import std.stdio;
import std.algorithm;
import std.conv;
import std.ascii;
import std.range;
import std.traits;
import std.string;
import std.datetime;
import std.utf;

enum testCount = 20_000_000;

Date fromISOString1(S)(in S isoString) @safe pure
    if (isSomeString!S)
{
    import std.algorithm.searching : all, startsWith;
    import std.ascii : isDigit;
    import std.conv : to;
    import std.exception : enforce;
    import std.format : format;
    import std.string : strip;

    auto dstr = to!dstring(strip(isoString));

    enforce(dstr.length >= 8, new DateTimeException(format("Invalid ISO String: %s", isoString)));

    auto day = dstr[$-2 .. $];
    auto month = dstr[$-4 .. $-2];
    auto year = dstr[0 .. $-4];

    enforce(all!isDigit(day), new DateTimeException(format("Invalid ISO String: %s", isoString)));
    enforce(all!isDigit(month), new DateTimeException(format("Invalid ISO String: %s", isoString)));

    if (year.length > 4)
    {
        enforce(year.startsWith('-', '+'),
                new DateTimeException(format("Invalid ISO String: %s", isoString)));
        enforce(all!isDigit(year[1..$]),
                new DateTimeException(format("Invalid ISO String: %s", isoString)));
    }
    else
        enforce(all!isDigit(year), new DateTimeException(format("Invalid ISO String: %s", isoString)));

    return Date(to!short(year), to!ubyte(month), to!ubyte(day));
}

Date fromISOString2(S)(in S isoString) @safe pure
    if (isSomeString!S)
{
    import std.algorithm.searching : all, startsWith;
    import std.ascii : isDigit;
    import std.conv : to, text, ConvException;
    import std.exception : enforce;
    import std.string : strip;

    auto str = isoString.strip;

    enforce!DateTimeException(str.length >= 8, text("Invalid ISO String: ", isoString));

    int day, month, year;
    auto yearStr = str[0 .. $ - 4];

    try
    {
        // using conversion to uint plus cast because it checks for +/-
        // for us quickly while throwing ConvException
        day = cast(int) to!uint(str[$ - 2 .. $]);
        month = cast(int) to!uint(str[$ - 4 .. $ - 2]);

        if (yearStr.length > 4)
        {
            enforce!DateTimeException(yearStr.startsWith('-', '+'),
                    text("Invalid ISO String: ", isoString));
            year = to!int(yearStr);
        }
        else
        {
            year = cast(int) to!uint(yearStr);
        }
    }
    catch (ConvException)
    {
        throw new DateTimeException(text("Invalid ISO String: ", isoString));
    }

    return Date(year, month, day);
}

void main()
{
    auto result = to!Duration(benchmark!(() => fromISOString1("20100704"))(testCount)[0]);
    auto result2 = to!Duration(benchmark!(() => fromISOString2("20100704"))(testCount)[0]);

    writeln("old", "\t", result);
    writeln("new", "\t", result2);
}

Ping @jmdavis

    try
    {
        day = to!int(str[$ - 2 .. $]);
        month = to!int(str[$ - 4 .. $ - 2]);
Contributor

Relying on to!int throwing is too permissive. to!int accepts a sign, but it should be rejected in the day and month parts.
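
A quick sketch of the problem, assuming std.conv behaves as described in this comment. The acceptedByToInt helper is hypothetical and exists only for this illustration.

```d
import std.conv : ConvException, to;

// Hypothetical helper: reports whether to!int accepts the given field.
bool acceptedByToInt(string field)
{
    try
    {
        cast(void) to!int(field);
        return true;
    }
    catch (ConvException)
    {
        return false;
    }
}

void main()
{
    // to!int treats a leading '+' or '-' as part of the number...
    assert(to!int("+4") == 4);
    assert(to!int("-4") == -4);

    // ...so a signed day or month field slips through instead of being rejected.
    assert(acceptedByToInt("+4"));
    assert(acceptedByToInt("-4"));

    // Non-numeric input is still rejected.
    assert(!acceptedByToInt("ab"));
}
```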

Contributor Author

Bah!

Back to the drawing board.

Contributor

Judging from the documentation, to!uint should work. Maybe add a test that checks against to!uint learning new tricks, like underscores.
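
Something like the following could serve as that guard. It is only a sketch: the rejectedByToUint helper is hypothetical, and the set of inputs is my guess at what is worth pinning down, not a list taken from the PR.

```d
import std.conv : ConvException, to;

// Hypothetical helper: true if to!uint refuses the input with a ConvException.
bool rejectedByToUint(string field)
{
    try
    {
        cast(void) to!uint(field);
        return false;
    }
    catch (ConvException)
    {
        return true;
    }
}

void main()
{
    // Signs must not be accepted in the day and month fields.
    assert(rejectedByToUint("+4"));
    assert(rejectedByToUint("-4"));

    // Guard against to!uint ever learning to accept digit separators.
    assert(rejectedByToUint("1_0"));

    // Empty input is not a number either.
    assert(rejectedByToUint(""));

    // Plain digits still convert.
    assert(to!uint("04") == 4);
}
```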

@JackStouffer
Contributor Author

@aG0aep6G I fixed the issue and managed to squeeze more performance out of it. I've updated the original comment.

    if (yearStr.length > 4)
    {
        enforce!DateTimeException(yearStr.startsWith('-', '+'),
                text("Invalid ISO String: ", isoString));
Member

This can technically now throw a UTFException, because str is no longer guaranteed to be a dstring. So, you should either throw a representation call in before startsWith or just explicitly check the first element (since it's guaranteed to exist due to the previous checks for length).
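
The second option could look roughly like this. hasLeadingSign is a hypothetical helper for illustration, not the code that ended up in the PR; it indexes the first code unit directly, so nothing is decoded.

```d
import std.traits : isSomeString;

// Hypothetical helper: checks the first code unit for a sign without decoding.
// The length check makes it safe to call on an empty string as well.
bool hasLeadingSign(S)(in S yearStr) @safe pure nothrow @nogc
if (isSomeString!S)
{
    return yearStr.length > 0 && (yearStr[0] == '-' || yearStr[0] == '+');
}

void main()
{
    assert(hasLeadingSign("+12345"));
    assert(hasLeadingSign("-12345"w)); // works for wstring and dstring too
    assert(!hasLeadingSign("12345"));
    assert(!hasLeadingSign(""));
}
```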

    // using conversion to uint plus cast because it checks for +/-
    // for us quickly while throwing ConvException
    day = cast(int) to!uint(str[$ - 2 .. $]);
    month = cast(int) to!uint(str[$ - 4 .. $ - 2]);
Member

All of these to! calls (and the ones below) now autodecode, because str is no longer guaranteed to be a dstring, and so they risk throwing a UTFException. And I don't think that representation will work in this case. Try byCodeUnit, but I don't know if the bug preventing std.conv working with byCodeUnit was fixed yet. If not, given the clear speed-up, we can temporarily add a catch for UTFException so that we can rethrow a DateTimeException, but ultimately, none of this code should need to be doing anything with auto-decoding (at least outside of the error messages, which unfortunately do risk it with text, but I don't know what to do about that).
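
A rough sketch of the temporary fallback mentioned here, in case it is needed. parseField is a hypothetical helper, and whether the UTFException branch can actually be reached depends on whether the conversion decodes its input in the Phobos version at hand.

```d
import std.conv : ConvException, to;
import std.datetime.date : DateTimeException;
import std.utf : UTFException;

// Hypothetical helper: converts one numeric field and reports every failure
// as a DateTimeException, whatever the underlying cause was.
int parseField(string field)
{
    try
    {
        return cast(int) to!uint(field);
    }
    catch (ConvException)
    {
        throw new DateTimeException("Invalid ISO String: " ~ field);
    }
    catch (UTFException)
    {
        // Assumption: only reachable if the conversion auto-decodes the input.
        throw new DateTimeException("Invalid ISO String: " ~ field);
    }
}

void main()
{
    assert(parseField("04") == 4);

    bool threw = false;
    try
        parseField("+4");
    catch (DateTimeException)
        threw = true;
    assert(threw);
}
```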

@JackStouffer
Contributor Author

@jmdavis I already fixed std.conv.to so that it does not auto-decode. See #5014

> Try byCodeUnit, but I don't know if the bug preventing std.conv working with byCodeUnit was fixed yet

I also fixed that, see #4390

@jmdavis
Member

jmdavis commented May 20, 2017

> I also fixed that, see #4390

Yay! Thanks.

JackStouffer changed the title from "Optimized std.datetime.date.fromISOString" to "Optimized std.datetime.date.Date.fromISOString" on May 22, 2017
@JackStouffer
Contributor Author

Ping @jmdavis: since you approved the other PR, can you take a look at this one?
