Skip to content

Conversation

@CaseyCarter
Copy link
Contributor

Refactors all format functionality with knowledge of character encodings into a new class _Fmt_codec. _Fmt_codec internally caches the _Cvtvec structure that other functions previously passed around, and makes use of a new "pure C ABI function" __std_get_cvt to retrieve character conversion info for the active codepage.

The ABI function reuses the __std_code_page and __std_win_error types from <xfilesystem_abi.h>. These could be extracted into a more general __msvc_win_abi.hpp header, but I think <xfilesystem_abi.h> is lightweight enough to just include directly for now.

Drive-by:

  • BUG: _Parse_precision was not skipping the trailing } in a dynamic precision.
  • Improvement: test_parse_helper now verifies that the parse consumes the entire input when passed an expected length of npos.

@CaseyCarter CaseyCarter added cxx20 C++20 feature format C++20/23 format labels Apr 14, 2021
@CaseyCarter CaseyCarter requested a review from a team as a code owner April 14, 2021 03:06
Copy link
Contributor

@miscco miscco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The fact that you cooked that up in a few days makes my imposter syndrome super happy

Copy link
Contributor

@barcharcraz barcharcraz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good to me, although the hack to hook into format.cpp to change the codepage needs to change.

I do observe that the ucrt uses the _setmbcp and _getmbcp functions, which seem to set a global (not great).

@CaseyCarter
Copy link
Contributor Author

Looks pretty good to me, although the hack to hook into format.cpp to change the codepage needs to change.

No lie: this looked even jankier to me today than it did yesterday. I won't mourn its passing.

Refactors all `format` functionality with knowledge of character encodings into a new class `_Fmt_codec`. `_Fmt_codec` internally caches the `_Cvtvec` structure that other functions previously passed around, and makes use of a new "pure C ABI function" `__std_get_cvt` to retrieve character conversion info for the active codepage.

The ABI function reuses the `__std_code_page` and `__std_win_error` types from `<xfilesystem_abi.h>`. These could be extracted into a more general `__msvc_win_abi.hpp` header, but I think `<xfilesystem_abi.h>` is lightweight enough to just include directly for now.

Drive-by:
* BUG: `_Parse_precision` was not skipping the trailing `}` in a dynamic precision.
* Improvement: `test_parse_helper` now verifies that the parse consumes the entire input when passed an expected length of `npos`.
@StephanTLavavej StephanTLavavej changed the base branch from feature/format to main April 15, 2021 08:08
@StephanTLavavej
Copy link
Member

Now that #1821 has merged, I've force-pushed and retargeted this to main. I inspected the history carefully - acp had accumulated 3 commits beyond feature/format. Because we took feature/format, added more commits to it in merging_format, and then squashed everything into main (squashing is awesome for linear history and I wouldn't give it up for anything, but it means that situations like this need extra care), I determined that the easiest and safest thing to do was to start from main and cherry-pick acp's 3 unique commits. This cherry-picked 99.9% cleanly, with the only conflict being this PR adding inclusion of <xfilesystem_abi.h> next to the removal of <variant>, which was trivial to resolve.

Apologies for the force-push! 😸

@CaseyCarter CaseyCarter changed the title format assumes strings are encoded in the active code page <format> assumes strings are encoded in the active code page Apr 15, 2021
... fix 2 bugs exposed by that coverage, and address review comments. Drive-by: Remove the `_FORMAT_CODEPAGE` override from the `formatting_utf8` test which is unnecessary since we compile it with `/utf-8`.
@StephanTLavavej StephanTLavavej merged commit ccc5aaa into microsoft:main Apr 20, 2021
@StephanTLavavej
Copy link
Member

Thanks for fixing this bug! 😻 🎉

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cxx20 C++20 feature format C++20/23 format

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants