@vitaut
Contributor

@vitaut vitaut commented Apr 17, 2021

Fix code bloat in formatting functions by "type erasing" the output iterator as implemented in {fmt} and suggested in the standard:

For a given type charT, implementations are encouraged to provide a single instantiation of basic_format_context for appending to basic_string, vector, or any other container with contiguous storage by wrapping those in temporary objects with a uniform interface (such as a span) and polymorphic reallocation.

Previously a ton of formatting code was instantiated for every output iterator type passed to format_to and format_to_n. format and formatted_size also contributed to this via iterators used internally.
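
To make the idea concrete, here is a minimal sketch of output-iterator type erasure. It is not the code from this PR or from {fmt}: every name in it (erased_buffer, iterator_buffer, write_greeting, greet_to) is hypothetical, and write_greeting merely stands in for the real formatting code. The point is that only the small adapter is instantiated per iterator type, while the routine doing the actual work is compiled once.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <list>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical names throughout; an illustration, not the STL implementation.
// Type-erased buffer: the formatting code only ever sees this interface.
class erased_buffer {
public:
    void push_back(char c) {
        if (size_ == capacity_) {
            grow(); // the only virtual call, taken when the storage is full
        }
        assert(size_ < capacity_);
        data_[size_++] = c;
    }

protected:
    erased_buffer(char* data, std::size_t capacity) : data_(data), capacity_(capacity) {}
    ~erased_buffer() = default;
    virtual void grow() = 0;

    char* data_;
    std::size_t size_ = 0;
    std::size_t capacity_;
};

// The only iterator-dependent code: an adapter that flushes a small
// contiguous staging area to the wrapped output iterator.
template <class OutputIt>
class iterator_buffer final : public erased_buffer {
public:
    explicit iterator_buffer(OutputIt out)
        : erased_buffer(storage_, sizeof(storage_)), out_(std::move(out)) {}

    // Flushes any pending characters and hands the iterator back.
    OutputIt out() {
        flush();
        return std::move(out_);
    }

private:
    void grow() override { flush(); }

    void flush() {
        for (std::size_t i = 0; i < size_; ++i) {
            *out_++ = data_[i]; // copy first...
        }
        size_ = 0; // ...then mark the staging area as empty
    }

    char storage_[64];
    OutputIt out_;
};

// Non-template stand-in for the formatting code: compiled once, regardless
// of how many iterator types callers use.
inline void write_greeting(erased_buffer& buf, std::string_view name) {
    for (char c : std::string_view("Hello, ")) buf.push_back(c);
    for (char c : name) buf.push_back(c);
    buf.push_back('!');
}

// Thin template wrapper, analogous in spirit to format_to.
template <class OutputIt>
OutputIt greet_to(OutputIt out, std::string_view name) {
    iterator_buffer<OutputIt> buf(std::move(out));
    write_greeting(buf, name);
    return buf.out();
}
```

Calling greet_to with std::back_inserter of a std::string and of a std::list<char> instantiates two small iterator_buffer adapters, but both calls share the single write_greeting function.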

For example, the binary size increase when using a new iterator type dropped from ~30k (#1835) to just ~1k:

17.04.2021  10:51           371 200 format-one.exe
17.04.2021  10:51           372 224 format-two.exe

In addition to improving binary code size, this can potentially benefit compile times (fewer template instantiations) and performance, provided that back_insert_iterator<buffer<Char>> is handled properly in the implementation (not part of this PR). In particular, formatted_size and format_to_n are now easier to optimize because they write to a contiguous buffer instead of doing character-by-character output. I haven't done any compile-time or performance measurements, though.
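
As a rough illustration of why a contiguous buffer helps format_to_n-style output, the sketch below stages characters in a small array and copies them to the destination in chunks, while a separate traits object tracks the total count and the limit. This is an assumption-laden sketch, not the PR's code: fixed_traits, write_result and write_n are hypothetical names, and the input text stands in for formatted output.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical sketch; names are illustrative only.
// Tracks how many characters were produced in total and enforces the limit.
class fixed_traits {
public:
    explicit fixed_traits(std::size_t limit) : limit_(limit) {}

    std::size_t count() const { return count_; }

    // Records `size` produced characters and returns how many of them
    // still fit under the limit.
    std::size_t admit(std::size_t size) {
        const std::size_t room = count_ < limit_ ? limit_ - count_ : 0;
        count_ += size;
        return std::min(size, room);
    }

private:
    std::size_t count_ = 0;
    std::size_t limit_;
};

struct write_result {
    char* out;        // one past the last character written
    std::size_t size; // untruncated size of the output
};

// Writes at most n characters of text to out and reports the full size,
// similar in spirit to what format_to_n returns.
inline write_result write_n(char* out, std::size_t n, std::string_view text) {
    fixed_traits traits(n);
    char staging[16];
    std::size_t used = 0;

    auto flush = [&] {
        const std::size_t fits = traits.admit(used);
        out = std::copy_n(staging, fits, out); // bulk copy of the chunk
        used = 0;                              // reset only after copying
    };

    for (char c : text) {
        if (used == sizeof(staging)) {
            flush();
        }
        staging[used++] = c;
    }
    flush();
    return {out, traits.count()};
}
```

Truncation is decided once per flushed chunk rather than once per character, which is the kind of optimization a contiguous buffer makes straightforward.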

Fixes most of #1835, except for vformat_to, which will be addressed separately.

@vitaut vitaut requested a review from a team as a code owner April 17, 2021 18:08
@ghost

ghost commented Apr 17, 2021

CLA assistant check
All CLA requirements met.

Contributor

@miscco miscco left a comment

Sorry for all the nitpicks

@vitaut
Contributor Author

vitaut commented Apr 17, 2021

Where can I find full CI test logs? I only see

Failed Tests (2):
1:   std :: tests/P0645R10_text_formatting_formatting:13
1:   std :: tests/P0645R10_text_formatting_formatting:02

but not the actual failed assertions in https://dev.azure.com/vclibs/STL/_build/results?buildId=7725&view=logs&jobId=2be719c3-1698-53ec-1fd0-37013441db10&j=2be719c3-1698-53ec-1fd0-37013441db10&t=82ffaba7-a56f-5ceb-461f-4e0c10a69218 which is not very useful.

@MattStephanson
Contributor

Where can I find full CI test logs?

On the main results page https://dev.azure.com/vclibs/STL/_build/results?buildId=7725&view=results, choose the "Tests" tab near the top. Expand the shard and click on the failing file (almost all are just called "test.cpp"), and the details will come up.

@StephanTLavavej StephanTLavavej added the format (C++20/23 format) and performance (Must go faster) labels Apr 18, 2021
@vitaut
Contributor Author

vitaut commented Apr 18, 2021

Fixed a few issues with move-only iterators.

Contributor

@miscco miscco left a comment

Sorry, I did not look at nodiscard.

Also, I think we should always add noexcept as a clear and easy-to-follow guideline.

Contributor

@statementreply statementreply left a comment

FYI: IIRC force-pushing is discouraged by the maintainers of this project.

@vitaut
Contributor Author

vitaut commented Apr 18, 2021

FYI: IIRC force-pushing is discouraged by the maintainers of this project.

Good to know, will make separate commits from now on.

@StephanTLavavej
Member

Thanks - yeah, the issue with force-pushing is that it's difficult in GitHub's UI to incrementally review changes between force-pushes.

Member

@StephanTLavavej StephanTLavavej left a comment

Thanks, this looks good to me - my comments are all about various conventions (I tried to avoid unimportant nitpicks though). Found no issues with the actual optimization. 🎉

@vitaut vitaut requested a review from StephanTLavavej April 19, 2021 14:15
@vitaut
Contributor Author

vitaut commented Apr 19, 2021

I believe that all review comments should be addressed now.

@vitaut
Contributor Author

vitaut commented Apr 19, 2021

Build failure looks unrelated: https://dev.azure.com/vclibs/STL/_build/results?buildId=7750&view=results, possibly transient. Could someone with an account trigger a rerun?

@CaseyCarter CaseyCarter linked an issue Apr 19, 2021 that may be closed by this pull request
@vitaut vitaut requested a review from CaseyCarter April 19, 2021 17:29
@StephanTLavavej
Member

I'm happy with merging this now. We can get it into Preview 3 if we move very quickly, I think.

@vitaut
Contributor Author

vitaut commented Apr 20, 2021

I'm happy with merging this now. We can get it into Preview 3 if we move very quickly, I think.

Great, thanks for the quick review.


class _Fmt_fixed_buffer_traits {
private:
    ptrdiff_t _Count_ = 0;
Contributor

Urgh, this is backwards. Conventionally, the members should be _Count and _Limit, and the arguments passed should be _Count_ and _Limit_.

Contributor

Note that there are member functions _Count and _Limit in this class.


void _Flush() {
    auto _Size = this->_Size();
    this->_Clear();
Contributor

This is correct, but super subtle and requires the user to check the implementation of _Clear to verify that all is fine.

Can we move the this->_Clear() call below the _Copy_unchecked call so that it is obvious that we do not delete what we want to copy?

template <class _Ty>
class _Fmt_buffer {
private:
    _Ty* _Ptr_ = nullptr;
Contributor

Sorry, but again this should be _Ptr, _Size and _Capacity for the members and _Size_ for arguments

@miscco
Contributor

miscco commented Apr 20, 2021

I would like to note that the naming of arguments and members is switched. For the sake of the author's sanity, we might want to clean that up in a follow-up PR?

@StephanTLavavej
Member

@miscco Yes, please feel free to submit a followup PR after we land this (the deadline is urgent so if the PR is correct I want to land this now).

@vitaut
Contributor Author

vitaut commented Apr 20, 2021

When do you plan to merge it? I would like to submit a small follow-up PR.

@SuperWig
Contributor

SuperWig commented Apr 20, 2021

It's essentially "merged" at this point judging by that move from "final" to "ready to merge".

They batch-merge PRs due to a semi-manual process with their internal repo.

You could probably submit that PR now so other contributors can review it and then rebase once this is actually merged?

Edit: minor derp, make the pr on your fork?

@barcharcraz
Contributor

When do you plan to merge it? I would like to submit a small follow-up PR.

You may submit a follow-up; however, please don't change ABI in it. It's likely any follow-ups will miss the initial release of the feature.

@StephanTLavavej
Member

I'm preparing to merge this into the internal repo now (when everything is green, we simultaneously merge). Fortunately, there were no source code merge conflicts except for trivial ones with #1803 (which I will resolve when merging this on GitHub). We should be able to merge this in one day, barring catastrophe.

@StephanTLavavej
Member

@vitaut I need to resolve a merge conflict with #1803 which we just merged, but you didn't grant permission for microsoft/STL maintainers to push to your branch.

@vitaut
Contributor Author

vitaut commented Apr 22, 2021

I need to resolve a merge conflict with #1803 which we just merged, but you didn't grant permission for microsoft/STL maintainers to push to your branch.

I gave you write access to fmtlib/STL. Please let me know if I should add someone else as well.

@StephanTLavavej
Member

Thanks, pushed! For future PRs, it should be sufficient to leave the "Allow edits and access to secrets by maintainers" checkbox checked when creating the PR:

[Image: pr_screenshot]

For an already-existing PR, this option is at the bottom of the sidebar:

[Image: pr_sidebar]

As the ❔ tooltip explains, this will automatically grant everyone with write access to microsoft/STL (which is just the microsoft/vclibs team, not literally every MS employee) write access to that specific branch of your fork. (The tooltip goes on to explain that malicious maintainers would be able to push malicious commits to dig up stored secrets and gain access to other branches of the fork - but in addition to us being non-evil, the STL doesn't have workflows like that, and in no event would unrelated repos be at risk.)

In addition to merge conflict resolutions like this one, we typically use this push access to perform trivial cleanups to save a bit of time, or fix failing tests, etc.

@vitaut
Contributor Author

vitaut commented Apr 22, 2021

For future PRs, it should be sufficient to leave the "Allow edits and access to secrets by maintainers" checkbox checked when creating the PR

Gotcha, thanks!

@StephanTLavavej StephanTLavavej merged commit 044bfa5 into microsoft:main Apr 22, 2021
@StephanTLavavej
Member

Thanks for optimizing this codegen - there are a lot of users who are greatly concerned about binary size and this will make a big difference. I'm going to record this alongside <format> in the Changelog for VS 2022 17.0 Preview 2, but we're going to begin the backport process to 16.10 Preview 3 today and will update the Changelog when they land there.

Also, congratulations on your first microsoft/STL commit! 🚀 😺 🎉

@vitaut
Contributor Author

vitaut commented Apr 23, 2021

Thanks. BTW, I have a small follow-up PR (#1874) that completes the code size optimization. It might be worth integrating it before std::format ships anywhere, to prevent any potential breakage in the future.

Labels

format C++20/23 format performance Must go faster

Development

Successfully merging this pull request may close these issues.

<format>: Code bloat when using different iterator types

10 participants