Reduce cost of cursor invalidation #15500

lhecker · 2023-06-02T00:55:32Z

Performance of printing enwik8.txt at the following block sizes:
4KiB (printf): 53MB/s -> 58MB/s
128KiB (cat): 170MB/s -> 235MB/s

This commit is imperfect. Support for more than one rendering
engine was "hacked" into Renderer and is not quite correct.
As such, this commit cannot fix cursor invalidation correctly either,
and while some bugs are fixed (engines may see highly inconsistent
TextBuffer and Cursor states), it introduces others (an error in the
first engine may result in the second engine not executing).
Neither of those is good, and the underlying issue remains to be fixed.

Validation Steps Performed

Seems ok? ✅

zadjii-msft · 2023-07-05T15:02:54Z

ah this needs merges into it before it's reviewable, doesn't it

zadjii-msft

10-40% throughput improvement in conhost with this? Yea this is worth it IMO.

Only holding off ✅ because the WaitForPaintCompletionAndDisable thing scares me

zadjii-msft · 2024-03-26T15:24:55Z

src/cascadia/TerminalControl/ControlCore.cpp

-
-        if (_renderer)
-        {
-            _renderer->TriggerTeardown();


no longer need to _pThread->WaitForPaintCompletionAndDisable(INFINITE);?

Sort of. The destructor blocks until the renderer is fully shut down, which is similar to WaitForPaintCompletionAndDisable but simpler and faster.

I'll re-add the explicit destructor calls here just to be sure nothing regresses. We use a lot of plain/unsafe pointers after all.
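As an illustration of a destructor that blocks until a background paint loop has fully stopped, here is a minimal sketch. `RenderThreadSketch` and its members are hypothetical names and do not reflect the actual Renderer or RenderThread classes; the sketch only assumes that teardown means "signal the loop, then join the thread".

```cpp
#include <atomic>
#include <thread>

// Hypothetical stand-in for a render thread owner; not the project's class.
class RenderThreadSketch
{
public:
    explicit RenderThreadSketch(std::atomic<bool>& loopExited) :
        _thread{ [this, &loopExited] {
            while (!_stop.load())
            {
                std::this_thread::yield(); // stands in for painting a frame
            }
            loopExited.store(true); // the loop has finished its last iteration
        } }
    {
    }

    ~RenderThreadSketch()
    {
        _stop.store(true); // ask the paint loop to exit
        _thread.join(); // block until it has actually exited
    }

    RenderThreadSketch(const RenderThreadSketch&) = delete;
    RenderThreadSketch& operator=(const RenderThreadSketch&) = delete;

private:
    // Declared before _thread so that it is initialized before the thread starts.
    std::atomic<bool> _stop{ false };
    std::thread _thread;
};
```

Once the destructor returns, no paint can still be in flight, so plain pointers handed to the loop are no longer in use.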

src/cascadia/TerminalControl/HwndTerminal.cpp

zadjii-msft · 2024-03-26T16:01:03Z

src/renderer/base/renderer.cpp

@@ -64,44 +64,90 @@ Renderer::~Renderer()
 // - HRESULT S_OK, GDI error, Safe Math error, or state/argument errors.
 [[nodiscard]] HRESULT Renderer::PaintFrame()
 {


much more legible w/o whitespace:

src/renderer/base/renderer.cpp

lhecker · 2024-03-26T16:53:37Z

> 10-40% throughput improvement in conhost with this? Yea this is worth it IMO.

FYI the next best area to optimize would be AdaptDispatch::_DoLineFeed which would bring up to +32%. Its cost is a combination of overly zealous invalidation (currently required by ConPTY, however), our current hyperlink implementation via hashmaps, and a lot of smaller flaws.

For non-English text the cost distribution is vastly different, and there we would get the biggest benefit via async rendering with buffer snapshots (+34%) and improving IsGlyphFullWidth (binary tree --> trie; +27%).

j4james · 2024-03-29T18:37:59Z

src/buffer/out/LineRendition.hpp

+constexpr til::rect ScreenToBufferLine(const til::rect& line, const LineRendition lineRendition)
+{
+    // Use shift right to quickly divide the Left and Right by 2 for double width lines.
+    const auto scale = lineRendition == LineRendition::SingleWidth ? 0 : 1;
+    return { line.left >> scale, line.top, line.right >> scale, line.bottom };
+}
+


I think the right hand side may be off by one when you're dealing with exclusive coordinates with an odd value. For example, if the right screen coordinate is 7 exclusive (6 inclusive), that should map to a buffer coordinate of 4 exclusive (3 inclusive). A simple right shift works for inclusive coordinates (6 >> 1 = 3), but not for exclusive coordinates (7 >> 1 = 3, but we want 4).
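The arithmetic in the comment above can be checked with a small sketch. `Rect` is a hypothetical stand-in for til::rect (exclusive right and bottom), and the "covering" variant is only one possible way to round the exclusive right edge up; it is an assumption for illustration, not code from this PR.

```cpp
// Hypothetical stand-in for til::rect: left/top inclusive, right/bottom exclusive.
struct Rect
{
    int left, top, right, bottom;
};

// Truncating variant, mirroring the diff above: both edges are shifted right.
constexpr Rect ScreenToBufferLineTruncating(const Rect& line, const bool doubleWidth)
{
    const int scale = doubleWidth ? 1 : 0;
    return { line.left >> scale, line.top, line.right >> scale, line.bottom };
}

// Covering variant (an assumption): the exclusive right edge is rounded up,
// so a buffer column that is only half covered on screen is still included.
constexpr Rect ScreenToBufferLineCovering(const Rect& line, const bool doubleWidth)
{
    const int scale = doubleWidth ? 1 : 0;
    return { line.left >> scale, line.top, (line.right + scale) >> scale, line.bottom };
}

// Right edge 7 exclusive on a double-width line: truncation gives 3, covering gives 4.
static_assert(ScreenToBufferLineTruncating({ 0, 0, 7, 1 }, true).right == 3, "truncates");
static_assert(ScreenToBufferLineCovering({ 0, 0, 7, 1 }, true).right == 4, "rounds up");
// Even edges and single-width lines are unaffected by the rounding.
static_assert(ScreenToBufferLineCovering({ 0, 0, 6, 1 }, true).right == 3, "even edge");
static_assert(ScreenToBufferLineCovering({ 0, 0, 7, 1 }, false).right == 7, "single width");
```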

Hmm... I think the current approach is valid as well, in a certain way. Currently, these two related functions round down the width of the buffer:

TextBuffer::GetLineWidth

ROW::GetReadableColumnCount

If this function were to round up the right coordinate, then passing a viewport sized til::rect will end up having a different size than the size reported by the above two functions.

I do prefer your suggestion, but do we need to change other code first before we can round exclusive coordinates up here?

Oh, I guess this means we've had this inconsistency for a while right? Because the inclusive_rect variant of ScreenToBufferLine may have reported an inclusive .right of 59 while the above two functions reported a max. width of 59. Hmm... I'm not sure how to best resolve this.

I think they're expected to be inconsistent. GetLineWidth tells you how many buffer columns you can fit on the screen, so it has to round down if only half of the last column will fit. But ScreenToBufferLine is used to determine how much of the buffer is required to cover a given screen area. If you round down, you won't get enough buffer content, and the last screen cell won't be updated.

Hmm, yeah I see what you mean. However, I'm also a little squeamish about what you said. This function isn't used anywhere yet and so I'll remove it for now and we can revisit it later. 🙂

Idle thoughts: Personally speaking, I would prefer if we could consistently use til::rect with its exclusive coordinates everywhere, even in areas of our code where inclusive coordinates would be a better fit, just so that everything works consistently. I wonder if this issue would go away with such a change as well... Probably not really, but it may still avoid some potential for confusion.

src/buffer/out/LineRendition.hpp

DHowett · 2024-04-02T17:37:22Z

src/renderer/base/renderer.cpp

        }
    }

+    FOREACH_ENGINE(pEngine)
+    {
+        RETURN_IF_FAILED(pEngine->Present());


one big functional change here is that we now Present under lock. We did not do that before - theoretically it allowed us to yeet bits at the GPU (or whatever) while the console continued working. It was built on the idea that the slow operation would be finalization.

Is this no longer required? Is there a risk to making this locking change?

The _pData->LockConsole(); call (and unlock) is inside an extra scope. So this will still run without the lock being held.
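The scoping described in this reply can be sketched as follows. `ConsoleLockSketch`, `ScopedConsoleLock` and `PaintFrameSketch` are hypothetical names; the lock here only records whether it is held, so the sketch shows the control flow and not the real locking primitive.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the console lock; it only tracks whether it is held.
struct ConsoleLockSketch
{
    bool held = false;
    void Lock() { held = true; }
    void Unlock() { held = false; }
};

// RAII helper: locks on construction, unlocks when the enclosing scope ends.
struct ScopedConsoleLock
{
    explicit ScopedConsoleLock(ConsoleLockSketch& lock) : _lock{ lock } { _lock.Lock(); }
    ~ScopedConsoleLock() { _lock.Unlock(); }
    ScopedConsoleLock(const ScopedConsoleLock&) = delete;
    ScopedConsoleLock& operator=(const ScopedConsoleLock&) = delete;

private:
    ConsoleLockSketch& _lock;
};

// Returns the phases in order, each tagged with the lock state it ran under.
inline std::vector<std::string> PaintFrameSketch(ConsoleLockSketch& lock)
{
    std::vector<std::string> phases;
    {
        // Extra scope: the lock is held only while the frame is painted.
        const ScopedConsoleLock guard{ lock };
        phases.push_back(lock.held ? "paint: locked" : "paint: unlocked");
    } // guard is destroyed here and releases the lock

    // The present step runs after the scope has closed, without the lock.
    phases.push_back(lock.held ? "present: locked" : "present: unlocked");
    return phases;
}
```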

DHowett · 2024-04-02T17:38:27Z

src/renderer/base/renderer.cpp

    {
-        LOG_IF_FAILED(pEngine->PaintCursor(cursorInfo.value()));
+        LOG_IF_FAILED(pEngine->PaintCursor(_currentCursorOptions.value()));


nit: can use *_currentCursorOptions (i think it skips a check where .value() makes a check? but we just checked)

The compiler will be able to optimize redundant trivially inlinable branches away if there isn't a call in between that it can't inspect (the compiler must assume that memory may have mutated at any time during an external call). In this case it'll be able to optimize away the check for sure (just checked it), however this still isn't the case for Debug builds and so tl;dr: Yeah 100% agreed.
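For reference, a minimal sketch of the two access styles on `std::optional`. `CursorOptionsSketch` is a hypothetical type, not the renderer's cursor options struct. `value()` checks for emptiness again and throws `std::bad_optional_access` if the optional is empty, while `operator*` and `operator->` perform no check and are only valid after an explicit test.

```cpp
#include <optional>

// Hypothetical stand-in for the cursor options struct.
struct CursorOptionsSketch
{
    int x;
    int y;
};

// Checked access: value() repeats the emptiness check that was just done.
constexpr int CheckedX(const std::optional<CursorOptionsSketch>& opts)
{
    return opts.has_value() ? opts.value().x : -1;
}

// Unchecked access: operator-> skips the check, which is safe only because
// the condition has already established that the optional holds a value.
constexpr int UncheckedX(const std::optional<CursorOptionsSketch>& opts)
{
    return opts ? opts->x : -1;
}

static_assert(CheckedX(std::optional<CursorOptionsSketch>{ CursorOptionsSketch{ 3, 4 } }) == 3, "checked");
static_assert(UncheckedX(std::optional<CursorOptionsSketch>{ CursorOptionsSketch{ 3, 4 } }) == 3, "unchecked");
static_assert(UncheckedX(std::nullopt) == -1, "empty");
```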

DHowett · 2024-04-02T17:38:56Z

src/renderer/base/renderer.cpp

+
+        if (_currentCursorOptions)
+        {
+            _currentCursorOptions->coordCursor += delta;


wat? why weren't we doing this before

_currentCursorOptions is only a member to smuggle state up outside a function call and back into an entirely unrelated one at a later time. It's never used beyond that. I.e. it wasn't preserved between render passes.

src/renderer/base/renderer.cpp

Regressed in #15500, incorrectly fixed in #17332, exposed by #17583. My ineptitude on full display. If this isn't the last cursor invalidation bug I'm going to cry.

Closes #17615

## Validation Steps Performed

* cmd.exe
* a directory with 6 files
* 80x24 viewport
* run `cls`
* run `dir` twice

lhecker added Product-Conhost For issues in the Console codebase Area-Performance Performance-related issue labels Jun 2, 2023

lhecker force-pushed the dev/lhecker/vt-perf3 branch 2 times, most recently from aaba650 to de72cb6 Compare June 30, 2023 15:01

Base automatically changed from dev/lhecker/vt-perf3 to main July 5, 2023 19:26

zadjii-msft mentioned this pull request Sep 5, 2023

Very slow rendering of colored text #4129

Closed

Reduce cost of cursor invalidation

7aa3731

lhecker force-pushed the dev/lhecker/vt-perf4 branch from 3cb78a4 to 7aa3731 Compare February 21, 2024 22:33

lhecker marked this pull request as ready for review February 21, 2024 22:34

zadjii-msft added this to the Terminal v1.21 milestone Feb 27, 2024

zadjii-msft self-assigned this Mar 25, 2024

zadjii-msft reviewed Mar 26, 2024

zadjii-msft removed their assignment Mar 26, 2024

Merge remote-tracking branch 'origin/main' into dev/lhecker/vt-perf4

4da71ff

Address feedback

0600f08

j4james reviewed Mar 29, 2024

lhecker added 2 commits March 30, 2024 02:21

Address first half of feedback

ec7ceb8

Address remaining feedback

7cae3c5

DHowett reviewed Apr 2, 2024

Address feedback

47ef6e7

DHowett approved these changes Apr 2, 2024

DHowett enabled auto-merge April 2, 2024 18:46

carlos-zamora approved these changes Apr 10, 2024

DHowett added this pull request to the merge queue Apr 10, 2024

Merged via the queue into main with commit 20b0bed Apr 10, 2024
20 checks passed

DHowett deleted the dev/lhecker/vt-perf4 branch April 10, 2024 19:27

lhecker commented Apr 16, 2024

src/renderer/base/renderer.cpp

This was referenced Apr 28, 2024

Issues with cursor invalidation in GDI #17150

Closed

DCS sequences split across multiple "packets" can be corrupted by conpty #17117

Closed

j4james mentioned this pull request May 9, 2024

Cursor invalidation failing when line renditions are used #17226

Closed

lhecker mentioned this pull request May 16, 2024

VT renderer works incorrectly when cursor is moved between writes #17270

Closed

lhecker mentioned this pull request Jul 25, 2024

Fix cursor invalidation, again #17617

Merged
