Skip to content

Conversation

@barcharcraz
Copy link
Contributor

[format.string.general]/2, [format.string.std]/2,3

  • _Parse_arg_id implements [format.string.general]/2 parsing
  • _Parse_align parses alignments as specified in [format.string.std]
  • _Parse_width parses widths as specified in [format.string.std]
    • note that it doesn't decide what to actually do with them
  • most parsing functions that can be are constexpr, at least in some paths, in libfmt
    pretty much all parsing functions are constexpr on all paths, but the standard does not
    require much of anything in to be constexpr, so it's more of a coding / debugging aid
    than anything else. In particular all the number parsing isn't constexpr because it calls out to
    from_chars.

Callbacks for parsing functions are specified in concepts _Parse_spec_callbacks (for [format.string.std]/1) and _Parse_arg_id_callbacks (for the arg-id field of the grammar in [format.string.general]/1

This PR has a sprinkling of all over, but future PRs should be more focused (I hope).

Tests included, however they are not complete.

@barcharcraz barcharcraz requested a review from a team August 26, 2020 00:51
@StephanTLavavej StephanTLavavej added the cxx20 C++20 feature label Aug 26, 2020
Copy link
Contributor

@miscco miscco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This already looks quite good, I only came to basic_format_parse_context until my brain exploded.

stl/inc/format Outdated
} else if (*_Begin == '{') {
++_Begin;
if(_Begin != _End) {
_Begin = _Parse_arg_id(_Begin, _End, _Width_adapter(_Callbacks));
Copy link
Contributor

@miscco miscco Aug 26, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should move this down. If *_Begin == '}' then _Parse_arg_id will early return and we will fail directly after this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand the issue, if begin == } then _Parse_arg_id will parse the ID as an auto-id and call the right handler for that, as it should

@miscco miscco mentioned this pull request Aug 27, 2020
Copy link
Contributor

@CaseyCarter CaseyCarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall: I'm concerned that there isn't test coverage here for all of the library components we're adding to <format>. Ideally we'd add components and tests at the same time, but // TODO: test coverage comments or something would be fine for this interim branch.

auto s0 = ""sv;
auto s2 = "*<"sv;
auto s3 = "*>"sv;
auto s4 = "*^"sv;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing test cases for "}", "{", and "{<".

@barcharcraz
Copy link
Contributor Author

barcharcraz commented Sep 1, 2020

This already looks quite good, I only came to basic_format_parse_context until my brain exploded.

well that's good since after basic_format_parse_context the parsing bits end and we go into skeleton code that really just looks like the standard

edit: actually I'd encourage you to speak up about bits you find confusing, since I found a lot of stuff confusing and want to add appropriate comments to make it easier for the next guy.

Copy link
Contributor

@CaseyCarter CaseyCarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few "outdated but unresolved" comments in my earlier review as well.

auto s5 = L"*\x343E"sv;
test_parse_helper(parse_align_fn, s5, false, view_typ::npos, {_Align::_None, L"*"sv});
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we checking the end position of the parse for any of the above cases?

Copy link
Contributor

@CaseyCarter CaseyCarter Sep 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you mean "why are we not checking", Casey. You were concerned about the parse potentially running off the end of the input.

Again, I'm happy to merge as-is and clean up later.

Copy link
Contributor

@CaseyCarter CaseyCarter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of small things that can wait til later if you like.

stl/inc/format Outdated
}
for (;;) {
switch (*_Align_pt) {
case L'<':
Copy link
Contributor

@CaseyCarter CaseyCarter Sep 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

comparing to L'{' here solves the problem where we narrow wide characters and mess up on wide characters where the least significant byte happens to be narrow '{'.

Switching on *_Align_pt instead of static_cast<char>(*_Align_pt) solved that problem. It's a different problem that comparing to L'<' assumes that < in every narrow encoding has the same value as L'<' and/or u8'<'.

@barcharcraz barcharcraz merged commit b10d526 into microsoft:feature/format Sep 22, 2020
@StephanTLavavej StephanTLavavej added the format C++20/23 format label Feb 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cxx20 C++20 feature format C++20/23 format

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants