Add a DxRenderer based on a glyph atlas #10461

Closed
lhecker opened this issue Jun 19, 2021 · 33 comments
Labels
Area-AtlasEngine Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Product-Terminal The new Windows Terminal. Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release.

Comments

@lhecker
Member

lhecker commented Jun 19, 2021

Description of the new feature/enhancement

While the initial layout and rasterization of glyphs is computationally expensive, composition using a classic texture atlas and a glyph-lookup texture is extremely fast and, unless new glyphs appear on the screen, can be done in a single render pass. Such an implementation would provide us with a high-framerate, low-latency renderer.

Proposed technical implementation details

An initial implementation that renders only plain, colored, single-glyph-per-code-point text should be fairly straightforward. Guidance for this can be found in many sources across the web (even WikiBooks!). Once an initial implementation has been drafted, additional features could be added incrementally over time.
The renderer should be an optional feature that users can toggle on if they want to.
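
As a rough illustration of the idea (not the actual renderer), here is a minimal Python sketch of an atlas plus lookup grid. The names `GlyphAtlas`, `slot_for` and `build_lookup` are hypothetical, fixed-size slots are an assumption, and the "rasterization" is only counted, never performed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

Slot = Tuple[int, int]  # (column, row) of a glyph inside the atlas texture


@dataclass
class GlyphAtlas:
    """Toy atlas: maps each character to a fixed-size slot in one texture."""
    columns: int                                   # slots per atlas row (assumed layout)
    slots: Dict[str, Slot] = field(default_factory=dict)
    rasterized: int = 0                            # counts the expensive step

    def slot_for(self, ch: str) -> Slot:
        """Return the slot of `ch`, rasterizing it on first use only."""
        if ch not in self.slots:
            index = len(self.slots)
            self.slots[ch] = (index % self.columns, index // self.columns)
            self.rasterized += 1                   # a real renderer would draw the glyph here
        return self.slots[ch]


def build_lookup(atlas: GlyphAtlas, rows: List[str]) -> List[List[Slot]]:
    """Build the per-cell lookup grid that a shader could sample in one pass."""
    return [[atlas.slot_for(ch) for ch in row] for row in rows]


atlas = GlyphAtlas(columns=16)
frame1 = build_lookup(atlas, ["hello", "world"])
after_first = atlas.rasterized      # 7 distinct characters were rasterized
frame2 = build_lookup(atlas, ["hello", "world"])
```

Rendering the same text a second time produces the same lookup grid without touching the rasterizer again, which is the property the proposal relies on.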

Further experience can be gained from the alacritty project.

Alternative solutions

DirectWrite uses a glyph atlas internally, so we could continue to rely on it and just optimize our render pass instead.
For instance, a hybrid approach would be feasible: render everything but glyphs using a shader.

@lhecker lhecker added Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Area-Performance Performance-related issue Product-Terminal The new Windows Terminal. labels Jun 19, 2021
@lhecker lhecker added this to the Terminal Backlog milestone Jun 19, 2021
@ghost ghost added the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Jun 19, 2021
@lhecker
Member Author

lhecker commented Jun 19, 2021

Feedback welcome! 🤗
(I'll keep the issue description updated with any feedback.)

@skyline75489
Collaborator

Sure. Why not?

We have a software renderer option for people who face hardware rendering issues. Adding a “performance” renderer for people who want the most performance (and who use only English and want none of the Unicode features) sounds reasonable. Also, Alacritty has done this before, if I’m not mistaken, so there’s at least one popular example.

I humbly ask for basic CJK support for the initial implementation, because, well, I live in a cave where people use CJK as their primary languages.

@cedric-h

cedric-h commented Jun 19, 2021

I don't think ClearType in particular has been done before.

[Image: Alacritty frame captured in RenderDoc]

Here's a picture I took of Alacritty in RenderDoc. All of the glyphs are rendered in two draw calls, one for the background and one for the foreground. I believe this is done to facilitate ClearType.

EDIT: Confirmation from Alacritty contributor that ClearType is in use: alacritty/alacritty#2645 (comment)

@cedric-h

Sure. Why not?

We have a software renderer option for people who face hardware rendering issues. Adding a “performance” renderer for people who want the most performance (and who use only English and want none of the Unicode features) sounds reasonable. Also, Alacritty has done this before, if I’m not mistaken, so there’s at least one popular example.

I humbly ask for basic CJK support for the initial implementation, because, well, I live in a cave where people use CJK as their primary languages.

The obvious answer to "Why not?" is that the benchmarking experiments done here and here suggest that the naive string parsing is a bottleneck long before rendering comes into play.

With regard to supporting multiple languages, that should not be a significant limitation. It's simply a matter of adding more glyphs to your atlas; see this example created in response to the controversy in this issue.

@mdtauk

mdtauk commented Jun 19, 2021

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

@cedric-h

cedric-h commented Jun 19, 2021

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

You render your glyph atlas very rarely (whenever a never-before-rendered character is encountered) compared to how often the actual grid of glyphs is rendered (every frame). Larger font sizes also mean fewer cells in your grid, so larger font sizes will generally be more performant in a terminal emulator that uses this technique to render.
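
To make the cell-count argument concrete, here is a small back-of-the-envelope sketch. The window size and cell dimensions are made-up numbers chosen for illustration, and `grid_cells` is a hypothetical helper.

```python
def grid_cells(width_px: int, height_px: int, cell_w: int, cell_h: int) -> int:
    """Number of whole cells that fit into a window of the given pixel size."""
    return (width_px // cell_w) * (height_px // cell_h)


# Assumed example: a 1920x1080 window with a small and a large monospace font.
small_font = grid_cells(1920, 1080, cell_w=8, cell_h=16)    # 240 columns x 67 rows
large_font = grid_cells(1920, 1080, cell_w=16, cell_h=32)   # 120 columns x 33 rows
```

Doubling the cell size in both dimensions cuts the per-frame cell count to roughly a quarter, while the number of distinct glyphs that have to be rasterized into the atlas stays the same.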

@mdtauk

mdtauk commented Jun 19, 2021

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

You render your glyph atlas very rarely compared to how often the actual grid of glyphs is rendered (every frame) and larger font sizes mean fewer cells in your grid, so generally larger font sizes will always be more performant in a terminal emulator.

Ah so the Atlas is for the window display itself, not as a texture source off-screen for use when rendering?

@cedric-h

I may be very simplistic here, but does this mean larger font sizes will have a performance impact?

You render your glyph atlas very rarely compared to how often the actual grid of glyphs is rendered (every frame) and larger font sizes mean fewer cells in your grid, so generally larger font sizes will always be more performant in a terminal emulator.

Ah so the Atlas is for the window display itself, not as a texture source off-screen for use when rendering?

No, that is the opposite of what I intended to convey.

@lhecker
Member Author

lhecker commented Jun 19, 2021

@skyline75489
I believe most of CJK is exactly the "single-glyph-per-char" case I mentioned and should be simple to render.

@cedric-h
Nice find! As far as I can see they only use ClearType for rendering the alpha texture of the glyph, which we can do as well. That doesn't mean the final composition is proper ClearType though (since that requires knowledge of the background color among others). They draw in two passes because they draw background and text separately. But it'd be absolutely awesome if I'm wrong! Can you point me to a comment explaining how they implement ClearType during the final render? I can't find anything in the code nor issues about it unfortunately.

The obvious answer to "Why not?" is because [...]

Please don't be inflammatory, alright? 🤗
I'm sure you know full well that "Why not?" is an idiom.

@skyline75489
Collaborator

skyline75489 commented Jun 19, 2021

@lhecker For Chinese, I believe most characters belong to the “single-glyph-per-char” category. But there are other issues for Japanese and Hangul demonstrated in #3546, namely IVS (Ideographic Variation Sequences) for Japanese and complex composition in Hangul.

I believe IVS also exists in Traditional Chinese (#8731), but I don’t know that side of the story well, meaning that I don’t know how important IVS is in a Traditional Chinese environment.
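
For readers unfamiliar with IVS: a variation sequence is a base character followed by a variation selector, and the pair selects a single glyph. The sketch below shows one possible way to keep such pairs together when deriving atlas keys. The function names are hypothetical, and it only handles the two Unicode variation-selector blocks, nothing else that can form a cluster.

```python
from typing import List


def is_variation_selector(ch: str) -> bool:
    """True for VS1-VS16 (U+FE00..U+FE0F) and VS17-VS256 (U+E0100..U+E01EF)."""
    cp = ord(ch)
    return 0xFE00 <= cp <= 0xFE0F or 0xE0100 <= cp <= 0xE01EF


def split_clusters(text: str) -> List[str]:
    """Attach each variation selector to the character that precedes it."""
    clusters: List[str] = []
    for ch in text:
        if clusters and is_variation_selector(ch):
            clusters[-1] += ch      # same glyph slot as the base character
        else:
            clusters.append(ch)
    return clusters


# U+845B followed by VS17, then a plain U+57CE.
clusters = split_clusters("\u845b\U000e0100\u57ce")
```

Using the whole cluster as the atlas key, rather than a single code point, lets the variant and the default form occupy different slots.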

@skyline75489
Collaborator

skyline75489 commented Jun 19, 2021

I don’t know Hangul, but it seems to be even harder because it requires NFKC composition. See #3578 (comment)
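
As a small illustration of what that composition step does, the sketch below normalizes three conjoining jamo into one precomposed syllable using Python's standard `unicodedata` module. Whether a renderer should normalize before the atlas lookup is an assumption here, and `atlas_key` is a hypothetical helper.

```python
import unicodedata


def atlas_key(cluster: str) -> str:
    """Normalize a cluster so equivalent spellings share one atlas entry."""
    return unicodedata.normalize("NFKC", cluster)


# Leading consonant U+1112, vowel U+1161 and trailing consonant U+11AB.
jamo = "\u1112\u1161\u11ab"
syllable = atlas_key(jamo)          # composes to the single code point U+D55C
```

Three code points in the buffer end up as one glyph, which is why a code-point-to-glyph mapping alone is not enough for Hangul.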

Man, finding those comments brings back memories of the good old days 😅

@mmozeiko

mmozeiko commented Jun 19, 2021

FreeType is dual-licensed under the BSD-like FTL license and the GPL. It is not GPL-only.

The shader from Alacritty does FreeType-style ClearType rendering using dual-source blending, which both OpenGL and D3D11 support. The fragment shader outputs three blend factors, one each for the r/g/b channels. I'm not sure whether they are doing that in the correct color space (linear/sRGB), though; I haven't looked at their code closely enough to understand that completely.

That said, there is no reason to do this kind of blending. The renderer that draws the text knows exactly what both the background and foreground colors are, so it can do whatever style of blending and color space it wants directly in the shader and just output the final ClearType color.
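
A minimal sketch of that last point, written in Python rather than shader code: with both colors known, the final pixel can be computed as a per-channel interpolation driven by the subpixel coverage. Blending in linear space is an assumption made for this example, `blend_subpixel` is a hypothetical name, and this is not the shader discussed in this thread.

```python
from typing import Tuple

Color = Tuple[float, float, float]  # r, g, b in the range 0.0 to 1.0


def srgb_to_linear(c: float) -> float:
    """Standard sRGB decoding of one channel."""
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c: float) -> float:
    """Standard sRGB encoding of one channel."""
    return c * 12.92 if c <= 0.0031308 else 1.055 * c ** (1 / 2.4) - 0.055


def blend_subpixel(bg: Color, fg: Color, coverage: Color) -> Color:
    """Mix fg over bg with a separate coverage value per color channel."""
    out = []
    for b, f, a in zip(bg, fg, coverage):
        lb, lf = srgb_to_linear(b), srgb_to_linear(f)
        out.append(linear_to_srgb(lb + (lf - lb) * a))
    return (out[0], out[1], out[2])


# White text on black where only the red subpixel is covered.
edge = blend_subpixel((0.0, 0.0, 0.0), (1.0, 1.0, 1.0), (1.0, 0.0, 0.0))
```

Because the result is a finished color, the output needs no special blend state; the trade-off is that the shader must be given the background color of every cell.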

@lhecker
Member Author

lhecker commented Jun 19, 2021

@mmozeiko I'm currently considering implementing at least the initial version, as mentioned in the issue.
As you probably know, I cannot legally look at your (unlicensed) shader source code and haven't done so far.
If you ever feel like it, you can accelerate the development of the new renderer by contributing any small demo application that contains your shader and optionally its DirectWrite logic. This would help me skip the experimental phase and I could focus on integrating it into the existing render interface. Of course, please don't feel pressured to do so. 🙂

edit: I have been told that this message "sounds so nice it's borderline sarcastic". 😄
But if you look at my other comments on my profile you'll notice that I almost always write like this!
I'm simply considering this a reset from the mood of the very heated previous issue.

@mmozeiko

No problem.
I donate my shader to public domain, feel free to use it or relicense it however you wish: https://gist.github.com/mmozeiko/c7cd68ba0733a0d9e4f0a97691a50d39

@lhecker
Member Author

lhecker commented Jun 20, 2021

@skyline75489 Yeah, I personally agree that it's hard to reason about what's possible and what isn't when Unicode comes into play.

But to be fair monospace-only support simplifies most things. For instance our console buffer currently is N characters wide, and should be N glyphs wide (and not just N graphemes either!). Solving this issue seems awfully similar to solving the layout issue for a glyph atlas. If we know what forms entire glyphs I believe there aren't many issues anymore you could possibly have.
In other words: A glyph atlas is entirely capable of drawing almost all of Unicode!

Now the nice thing about using DirectWrite for entire runs of glyphs (and not just single glyphs) is that we can trust it to render correctly no matter what we call it with. Fixing our Unicode support in the buffer and in the non-DirectWrite rendering code is far simpler than actually figuring out what forms a glyph, and could be done "relatively" soon. This would give us full Unicode rendering support, even if the width of a buffer row doesn't match its glyph count (for instance, Hebrew would look correct, but you could only fit, let's say, 20 glyphs on an 80-cell-wide buffer row). A glyph atlas forces us to solve that problem up front and in its entirety.

The list of fundamentally unsolved glyph atlas issues that Alacritty has for instance shouldn't be taken lightly.
Many of those issues, again fundamentally, don't exist if you use DirectWrite for entire glyph runs.
That's certainly one of the many reasons it was called as hard as a dissertation before. Skia's implementation certainly is...

That said I'm convinced we should make the glyph atlas renderer separate from the current DxRenderer.
While it slightly increases the maintenance burden, both approaches are fundamentally different. Additionally there are some concerns which I'll be mentioning later.


With all the above being said, it doesn’t hurt anyone if we or the enthusiasts from the gaming industry take a shot at this specific project.

Yep! And I'm entirely on it!
It did hurt though, to be an actual human with feelings and stuff, who might get offended if people are rude. 😐
Sarcasm aside: It is on us, for not taking the technical advice at face value, no matter the delivery!
This is one of the reasons we've since reopened the gates for @cmuratori et al. again, after some time of self-reflection.

@skyline75489
Collaborator

Thanks @lhecker for the explanation. I learned a lot and I guess I still have a lot to learn.

@DHowett
Member

DHowett commented Jun 21, 2021

I was clearly mistaken as to how hard this work would be. I'm glad, and I appreciate being corrected.

Terminal cannot turn away valuable performance work simply on ideological grounds.

Anyway-
I want to establish some ground rules:

  • This renderer should be behind a til::feature (see src/features.xml); you can decide whether it is compiled-in or compiled-out by default. These are not toggleable at runtime, but it will give us the ability to make sure the code doesn't go out in Stable or even Preview until it's ready for people to turn it on.
    • (I want to use til::feature for more "out in the open" development, rather than having long-running feature branches)
  • We need to determine how to expose this switch to the user, and the cost of parallel development on both renderers.
  • Before we begin in earnest bringing an atlas renderer into the codebase, I'd like to see some progress on reducing the renderer's dependency on the global console lock to be as small as possible. That will help even the venerable GDI renderer.

Fair?

@DHowett DHowett removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Jul 2, 2021
@Tyriar
Member

Tyriar commented Jul 4, 2021

If you want a reference for this, the WebGL renderer in xterm.js uses a texture atlas as well. Some thoughts from implementing this:

@hfhchan

hfhchan commented Aug 27, 2021

Addendum: I think the people implementing this already know the following points, and there are valid reasons for not supporting the following scenarios in a quick-path implementation. But I have included them here because I found some of the points mentioned in previous comments to be insensitive to the requirements of East Asian scripts. East Asian scripts are typically not even considered complex.

This is done by calling the GetTextComplexity API) because of the existence of “locale based” letter forms(locl), which depends on the font being used and the current locale. This feature is used in languages like Turkish & Polish & etc. Some might think “this is just nonsense. No one really needs this”.

locl support is required for correctly rendering CJK characters when using pan-CJK fonts. Windows does not ship with any fonts that support proper glyphs for Chinese Traditional (Hong Kong), only for Chinese Traditional (Taiwan). The currently most popular way to get fonts that adhere to the Hong Kong government reference orthography is by installing Source Han Sans, which relies on the locl tag to deliver the correct glyphs.

But to be fair monospace-only support simplifies most things. For instance our console buffer currently is N characters wide, and should be N glyphs wide (and not just N graphemes either!). Solving this issue seems awfully similar to solving the layout issue for a glyph atlas. If we know what forms entire glyphs I believe there aren't many issues anymore you could possibly have.

CJK characters are supposed to render as full-width characters, i.e. take up double the space used by a single ASCII character. You should never assume that N characters wide will be N glyphs wide. For certain Latin-based scripts such as Vietnamese and Chinese pinyin, you need to support combining characters which have no precomposed character equivalents.

Support for IVS with CJK characters is also necessary for Chinese locales, not just the Japanese locale. The Macao Special Administrative Region has already registered an IVD collection including glyphs which need to be supported for Chinese (Traditional) use for Macao users. Other regions have IVD collections in the works.
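
A rough sketch of why "N characters" and "N cells" diverge, using only Python's standard `unicodedata` module. The `cell_width` helper is hypothetical and deliberately simplified: it ignores ambiguous-width characters, zero-width joiners and emoji, and it is not how Windows Terminal measures text.

```python
import unicodedata


def cell_width(text: str) -> int:
    """Estimate how many terminal cells `text` occupies."""
    width = 0
    for ch in text:
        if unicodedata.combining(ch):
            continue                # combining marks share the previous cell
        wide = unicodedata.east_asian_width(ch) in ("W", "F")
        width += 2 if wide else 1
    return width


ascii_cells = cell_width("abc")             # 3 characters, 3 cells
cjk_cells = cell_width("\u4e2d\u6587")      # 2 characters, 4 cells
pinyin_cells = cell_width("\u00ea\u0304")   # e-circumflex + combining macron: 2 characters, 1 cell
```

The three strings have 3, 2 and 2 code points but occupy 3, 4 and 1 cells, so neither the code-point count nor the glyph count can stand in for the column count.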

@ndwork

ndwork commented Nov 7, 2021

Any progress here? It’s been four months since the issue was opened and two since the last meaningful comment.

@lhecker

@lhecker
Member Author

lhecker commented Nov 7, 2021

@ndwork It's something that already exists and is being tested internally... 🙂
image

As I've mentioned in the other issue, I'm aiming for the 1.13 Preview release. I hope you can understand that we had to work on our pre-existing roadmap first over the last few months. But as of about 2 weeks ago I've been working on this most of the time. Once something that's stable, tested and usable has landed in Windows Terminal I'll make sure to let everyone know in this issue.

@rbeesley
Contributor

rbeesley commented Jan 4, 2022

I think I'm seeing this problem. I checked and I only have 1.12 Preview, so I can't see if this fixes it for me.

I was running this crt.hlsl experimental pixelShader while looking for anything else I might have missed in the Ansi-Color.cmd tool (hopefully part of a future release, but the attached PR would let you replicate this). When running plaid.def through this tool, I saw a 10% CPU increase. I thought maybe the shader was computationally expensive for some reason, so I also tried grid.hlsl, which does nothing more than render a grid of squares across the terminal; it too showed the same CPU overhead. None of the other definition files caused this issue, but it is notable that plaid.def uses the block element █ (U+2588) and the shade elements ░ (U+2591), ▒ (U+2592), and ▓ (U+2593), and shows a tight grid of 1024 of these characters (4 each for a 16x16 combination of colors).

So I think what is happening is that rendering these 4 glyphs is very expensive for Terminal, and that cost is paid for each frame of the shader? Is this something where I should just wait until 1.13 Preview and file a performance bug if I'm still seeing a CPU hit? It seems like #10362 is the same problem I ran into, but maybe adding the shader just amplifies the problem for me significantly?

@zadjii-msft
Member

@rbeesley you may be more specifically hitting #6974

@rbeesley
Contributor

rbeesley commented Jan 4, 2022

@zadjii-msft, reading through that bug report, it does seem like it fits. I'll put more information there.

@zadjii-msft zadjii-msft added the Resolution-Fix-Committed Fix is checked in, but it might be 3-4 weeks until a release. label Jan 28, 2022
@lhecker
Member Author

lhecker commented Feb 3, 2022

Windows Terminal Preview v1.13.10336.0 was released today, which features a new rendering engine. While it doesn't solve all of the performance issues reported here yet, it's a significant improvement nonetheless.

You can enable it this way:

  • Open settings
  • Select any profile (including "Defaults")
  • Select "Advanced" at the bottom
  • Select "Enable experimental text rendering engine"

image

The performance should be about the same in the worst case (regular black/white text), but significantly better for highly colored text (text that exceeds 20 distinct colors on a screen). Additionally, this engine won't be limited to 60 FPS anymore.
I'm also currently writing a blog post detailing why this issue occurs in Direct2D.

If you find any issues or have any feedback for this new text renderer, please let us know in #9999. 🙂

14 participants