Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace PolyTextOutW with ExtTextOutW #10478

Merged
1 commit merged into from
Jun 22, 2021

Conversation

alabuzhev
Copy link
Contributor

@alabuzhev alabuzhev commented Jun 21, 2021

Replace PolyTextOutW with ExtTextOutW to allow substitution of missing
glyphs from other fonts.

Why not let Windows substitute the glyphs that are missing in the
current font? Currently the GDI renderer of conhost/OpenConsole uses
PolyTextOutW for drawing. PolyTextOutW doesn't try to substitute
any missing glyphs, so the only way to see, say, Hiragana is to change
the whole font to something like MS Gothic (which is eye-bleeding, to
be honest).

A trivial replace PolyTextOutW -> ExtTextOutW does the trick.

Switch to PolyTextOutW happened in Windows 7 along with introduction
of conhost.exe. Substitution worked in previous Windows versions, where
internal NT interfaces were used.

Before the change:

image

After the change:

image

Closes #10472

Implementation of microsoft#10472.
ExtTextOutW allows substitution of missing glyphs from other fonts.
@ghost ghost added Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Impact-Compatibility Sure don't work like it used to. Impact-Correctness It be wrong. Issue-Task It's a feature request, but it doesn't really need a major design. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase labels Jun 21, 2021
@DHowett
Copy link
Member

DHowett commented Jun 22, 2021

@msftbot make sure @miniksa signs off on this

@ghost ghost added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jun 22, 2021
@ghost
Copy link

ghost commented Jun 22, 2021

Hello @DHowett!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

  • I'll only merge this pull request if it's approved by @miniksa

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

@DHowett
Copy link
Member

DHowett commented Jun 22, 2021

Notes for @miniksa:

Performance

I ran side-by-side WPR traces for PolyTextOut and ExtTextOutW with chafa and big.txt -- things that either require full screen invalidations or a lot of scrolling and region updates. big.txt would have been a "nothing but net" case for PolyTextOut, had we been blitting the entire buffer every time, as it requires rendering the entire screen in one color.

Poly

image

Ext

image

It's interesting that we got nearly a thousand more samples in PaintBufferOutput for ExtTextOut in a similar time period (10.5 seconds - this captures 5 seconds of the tail end of a chafa animation plus three invocations of time cat big.txt in WSL.)

Anyway: PolyTextOutW accounted for 99.2% of the samples under _FlushBufferLines whereas ExtTextOutW accounted for a whopping 99.3%.

Measurements taken on x64. Other architectures may have higher or lower system call overhead.

Performance: I'm not worried about the impact on performance.

Implementation Details

ExtTextOutW (user mode) eventually calls GreExtTextOutWLocked (kernel mode) once per invocation.

PolyTextOutW (user mode) eventually calls GreExtTextOutWLocked (kernel mode) once per line. 😁

There's one critical difference: The user mode implementation of ExtTextOutW performs batching and can submit the entire batch to the kernel in one system call. It effectively decays to PolyTextOutW.

@DHowett DHowett removed the AutoMerge Marked for automatic merge by the bot when requirements are met label Jun 22, 2021
@DHowett
Copy link
Member

DHowett commented Jun 22, 2021

@msftbot make sure @miniksa signs off

@ghost ghost added the AutoMerge Marked for automatic merge by the bot when requirements are met label Jun 22, 2021
@ghost
Copy link

ghost commented Jun 22, 2021

Hello @DHowett!

Because you've given me some instructions on how to help merge this pull request, I'll be modifying my merge approach. Here's how I understand your requirements for merging this pull request:

  • I'll only merge this pull request if it's approved by @miniksa

If this doesn't seem right to you, you can tell me to cancel these instructions and use the auto-merge policy that has been configured for this repository. Try telling me "forget everything I just told you".

@DHowett
Copy link
Member

DHowett commented Jun 22, 2021

sorry, had to re-trigger the bot 😄

Thanks for submitting this, @alabuzhev!

Copy link
Member

@miniksa miniksa left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@DHowett answered everything I was worried about. Good enough then. Let's do it!

@ghost ghost merged commit b7fc0f2 into microsoft:main Jun 22, 2021
@alabuzhev
Copy link
Contributor Author

Implementation Details

ExtTextOutW (user mode) eventually calls GreExtTextOutWLocked (kernel mode) once per invocation.

PolyTextOutW (user mode) eventually calls GreExtTextOutWLocked (kernel mode) once per line. 😁

@DHowett since you've already looked, is it really by design that PolyTextOutW doesn't substitute? I can't find such limitation in MSDN, it only says "To draw a single string of text, the application should call the ExtTextOut function", implying that the only difference in behaviour is single / multiline usage. Maybe it's also an accidental consequence of some changeset from decades ago? And if it's intentional it would probably make sense to reflect that explicitly on that page.

@DHowett
Copy link
Member

DHowett commented Jun 22, 2021

I legitimately cannot figure out what's different about ETO when we call it versus when PolyTextOut calls it. Everything tells me that this should have just worked. . .

@HBelusca
Copy link
Contributor

HBelusca commented Jul 8, 2021

@ MS people: Can you ask the GDI team why the PolyTextOut doesn't do this glyph substitution, while the ExtTextOut function does? And whether it would be possible to correct PolyTextOut to do so?

ghost pushed a commit that referenced this pull request Jul 8, 2021
Currently, when the user changes the console codepage (manually or via a script) the GDI engine tries to find and set the "best possible" font. The "best possible" here is charset-wise, it doesn't mean that the font is actually better or more eye-candy for the user.

Example:
- Open cmd
- Set the font to Consolas
- Enter `chcp 932`
- Suddenly, a wild MS Gothic appears!

This kind of makes sense (*"if I'm changing the codepage I probably want to see the national characters"*) but it doesn't happen anywhere else - all other apps just substitute the missing glyphs.

After #10472 / #10478 this magic should finally work here as well. So, do we still need to change the whole font? Terminal doesn't do that after all.

## Validation Steps Performed
Download [932.cmd.txt](https://github.com/microsoft/terminal/files/6697577/932.cmd.txt),
rename to 932.cmd, run it, check if the output is still readable.

Closes #10497
@miniksa
Copy link
Member

miniksa commented Jul 8, 2021

@ MS people: Can you ask the GDI team why the PolyTextOut doesn't do this glyph substitution, while the ExtTextOut function does? And whether it would be possible to correct PolyTextOut to do so?

Yes, @HBelusca. I have done so this afternoon.

This is what I got from the lead who currently owns GDI:

The way it works is that GDI's ExtTextOutW will receive text calls from application, then ExtTextOutW instead of directly proceed to rendering, it sends the data to Uniscribe for language processing first. After the language processing (what you call "substitution"), Uniscribe will break down the original ExtTextOutW call to one or more ExtTextOutW calls, but this time with the ETO_IGNORELANGUAGE flag. ExtTextOutW sees this flag, it will not callback into Uniscribe again, but will directly proceed to rendering.

When I look at the PolyTextOut code, I do not see it call into Uniscribe, so the language processing ("substituion") doesn't happen."

Uniscribe is one of the older Microsoft implementations from the Windows XP era for processing complex scripts and can be found here: https://docs.microsoft.com/windows/win32/intl/uniscribe

Further on looking at the code with them:
ExtTextOutW appears to call LpkExtTextOut which is the Uniscribe code. I think that's the bit that @DHowett missed when he was looking into the source and couldn't tell what was up.

And then finally:

There are two possible explanations for this -> 1) oversight in the original implementation; we forgot to implement the feature in PolyTextOut. 2) PolyTextOut is used in some performance sensitive area and we left it unchanged to avoid the overhead caused by the language processing. Maybe there are other explanations but this stuff is really before my time.

When asked about whether they would add it to PolyTextOut, the answer was that they probably wouldn't at this point.

However, it was also elaborated...

If you look at ExtTextOutW more carefully, it has batching behavior too.
It batches the call until we think it is necessary to do the user mode to kernel transition.
In this day an age where we often times have to display different languages, ExtTextOutW is strongly recommended.

And when I asked if we should update the Remarks sections for these functions on docs.microsoft to add this additional information:

Yes we probably should make a remark on this.

So I've volunteered to reach out to my contacts on the content publishing team and get ExtTextOutA, ExtTextOutW, PolyTextOutA, and PolyTextOutW updated with some additional remarks that clarify the difference in font fallback/Uniscribe support, the recommendation that ExtTextOutW should be used today, and that it will do batching internally before making the kernel transition to help alleviate that worry versus PolyTextOut.

@alabuzhev alabuzhev deleted the 10472_missing_glyph_substitution branch July 9, 2021 00:49
@HBelusca
Copy link
Contributor

HBelusca commented Jul 9, 2021

Interesting, thanks @miniksa . But as I see in the documentation for ExtTextOut, its behaviour can be controlled with its 4th parameter (the options flags). What's furthermore interesting is that for PolyTextOut, this is controlled by the POLYTEXTW uiFlags member. It is interesting to observe that this uiFlags appears to take only a subset of the options of ExtTextOut (so, in principle, it could be made to support more of these too).

As you told, and as the documentation suggests, ExtTextOut by default supports the language processing unless ETO_IGNORELANGUAGE is specified, while in PolyTextOut/POLYTEXTW there is no mention of this flag... this would be obvious indeed if the default behaviour of the latter is to not perform such language processing, maybe, as you say, for performance reasons.
Thus I am wondering, whether it couldn't be possible, to add some sort of ETO_ENABLELANGUAGE for PolyTextOut in order actually to make it activate language processing (in order to remain backward-compatible with any application that would expect no processing by default). Also I'm curious to know why it couldn't be possible to enable GDI batching for PolyTextOut, if it also could improve performance there... 🤔

@miniksa
Copy link
Member

miniksa commented Jul 9, 2021

It's possible. Don't get us wrong there. It's just that I don't think it's likely to happen. Significantly less likely to happen than me just updating the documentation to explain the way the world is and make recommendations on which path to use.

If I had to hazard a guess (and this is a guess based on my personal experience, please don't misinterpret it as some declaration)...

  • GDI has been generally superseded by the DirectX family of graphics technologies
  • GDI is a kernel-mode based technology which we generally don't favor anymore primarily due to security
  • GDI is old so its behavior and quirks are relied on by a large swath of existing software so any change is risky
  • GDI is old... so none of the original owners or decision makers are around to inform changes anymore
  • The team that owns GDI likely owns a handful of other technologies and areas that are more relevant to modern development
  • This particular issue is new or hasn't been popular enough, on a broad developer market scale, for it to rank higher than the other ongoing work

I would guess that a combination of all of these factors go into the unlikelihood that PolyTextOut will be updated at this point.

I will have no problem in looking these sorts of things up for you now and in the future. And I also have no problem updating the official public documentation to help better inform you and others.

@miniksa
Copy link
Member

miniksa commented Jul 12, 2021

MicrosoftDocs/sdk-api#837

@ghost
Copy link

ghost commented Jul 14, 2021

🎉Windows Terminal Preview v1.10.1933.0 has been released which incorporates this pull request.:tada:

Handy links:

This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues AutoMerge Marked for automatic merge by the bot when requirements are met Impact-Compatibility Sure don't work like it used to. Impact-Correctness It be wrong. Issue-Task It's a feature request, but it doesn't really need a major design. Priority-2 A description (P2) Product-Conhost For issues in the Console codebase
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Missing glyph substitution
5 participants