Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

10x shader compilation time regression from r132 to r133 (Windows) #22758

Closed
toji opened this issue Oct 29, 2021 · 19 comments
Closed

10x shader compilation time regression from r132 to r133 (Windows) #22758

toji opened this issue Oct 29, 2021 · 19 comments

Comments

@toji
Copy link
Contributor

toji commented Oct 29, 2021

Describe the bug

This appears to be the core issue behind google/model-viewer#2915

It seems that something about a recent materials change has caused shader compilation on my PC to take ~10x(!) longer than it used to. If I were to guess at why I'd say that it's probably the DirectX shader compiler hitting a pathological case that wasn't there before and trying to do an insane loop unroll or something of that nature.

To Reproduce

Steps to reproduce the behavior:

  1. git checkout r132
  2. Open examples/webgl_animation_keyframes.html
  3. Observe load time
  4. git checkout r133
  5. Refresh examples/webgl_animation_keyframes.html
  6. Observe (much longer) load time

With r132 I get the following recording from the performance tab in Chrome dev tools (though it's worth noting the same regression is also observable when using Firefox):

Screenshot 2021-10-29 113147

You can see that the largest block of work on the main thread takes ~500ms. It bottoms out at a bunch of getProgramInfoLog calls and a couple of texImage2D calls that all take about 20-50ms.

With r133 the recording looks significantly different. There are multiple calls to getProgramInfoLog that take ~150ms each and one particularly long one that takes 3+ seconds! In total, the block of work that represents to model's upload to the GPU takes nearly 5 seconds to complete, a 10x increase between versions (and consistent with what was seeing.)

Screenshot 2021-10-29 113654

Platform:

  • Device: Desktop, Intel Core i7-10700K, Nvidia GTX 1660 Super
  • OS: Windows 11
  • Browser: Chrome Canary 97.0.4685.2, Firefox Nightly 95.0a1
  • Three.js version: r133
@elalish
Copy link
Contributor

elalish commented Oct 29, 2021

Indeed, I'm doing a roll-back of model-viewer v1.9 right now because of this regression. Thanks @toji for the detailed repro!

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 29, 2021

It seems this is a duplicate of #22631.

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 29, 2021

Would be great if you could bump: https://bugs.chromium.org/p/chromium/issues/detail?id=1256340

@elalish
Copy link
Contributor

elalish commented Oct 29, 2021

I don't believe that's true, as r134 has the same behavior as r133 here. Unless that issue hasn't been completely fixed yet?

Edit, now I see it was closed not as fixed.

@toji
Copy link
Contributor Author

toji commented Oct 29, 2021

Is it? While the end result seems similar that one seemed to focus on stalls when uploading textures, while this issue is pretty clearly showing that stalls are coming from program compilation.

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 29, 2021

I don't believe that's true, as r134 has the same behavior as r133 here. Unless that issue hasn't been completely fixed yet?

No, it isn't. The issue has been closed because it needs to be fixed on the browser side.

The only option to fix this in three would be to revert the usage of SRGB8_ALPHA8 and accept wrong colors with sRGB encoded textures due to wrong texture filtering (on all platforms), see #22483.

While the end result seems similar that one seemed to focus on stalls when uploading textures, while this issue is pretty clearly showing that stalls are coming from program compilation.

If the regression disappears by using WebGL1Renderer, SRGB8_ALPHA8 is the problem. I'm not sure why getProgramInfoLog() shows an increase in execution time though...

@toji
Copy link
Contributor Author

toji commented Oct 29, 2021

Ah, I see that this IS a very direct duplicate of #22700, and in that thread they do point out that removing sRGBEncoding resolves the issue.

I'm not sure why that would be showing up in getProgramInfoLog() either. Does setting sRGBEncoding cause any shader changes?

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 29, 2021

Does setting sRGBEncoding cause any shader changes?

When loading an asset via GLTFLoader then no. The encoding is defined before the first shader compilation happens so there should not be an overhead in that regard.

@elalish
Copy link
Contributor

elalish commented Oct 29, 2021

Yeah, and we remove the getProgramInfoLog() calls by setting checkShaderErrors = false, in which case it just seem to be the next gl call that gets hammered. Really wish Chrome's profiler somehow made it clearer where time was actually being spent.

@toji
Copy link
Contributor Author

toji commented Oct 29, 2021

Confirmed this against webgl_animation_keyframes.html. If I go into r133's code and change WebGLTextures.js:169 from

if ( glType === _gl.UNSIGNED_BYTE ) internalFormat = ( encoding === sRGBEncoding ) ? _gl.SRGB8_ALPHA8 : _gl.RGBA8;

to simply

if ( glType === _gl.UNSIGNED_BYTE ) internalFormat = _gl.RGBA8;

Then load times are instantly better (though the scene is predictably washed out.) Which... is straight up baffling to me, honestly. 😵

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 29, 2021

Disabling SRGB8_ALPHA8 in the engine looks like so: #22759

Changing WebGLTextures is one part of the solution. We also have to ensure that the engine does not inject the inline GLSL decode and PMREMGenerator does not use SRGB8_ALPHA8 as well.

@toji
Copy link
Contributor Author

toji commented Oct 29, 2021

Hold on, think I've found something: Do you have any sense of how difficult it would be to make use of texStorage2D in this scenario? (WebGL2 only) I have an early indication that may be a workaround for the performance hit here. (And should be a best practice all around.)

@toji
Copy link
Contributor Author

toji commented Oct 29, 2021

Left a comment on #22631 with more of my findings.

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 30, 2021

Do you have any sense of how difficult it would be to make use of texStorage2D in this scenario?

Related issue: #21874

One reason I did not focus on the issue yet is #21874 (comment) and the fact that is requires a bit of refactoring in WebGLTextures.

When I remember correctly you must know in advance the number of mips for a texture whereas with texImage2D() you don't have to. If we want to use texStorage2D(), I guess we need the result of the following computation before making the WebGL call.

Math.log2( Math.max( width, height, depth ) )

@Mugen87
Copy link
Collaborator

Mugen87 commented Oct 30, 2021

TBH, I always fear refactorings in WebGLTextures a bit because of the many code paths and the respective complexity.

For a quick fix of this issue I would rather merge #22759 and plan more time for the introduction of texStorage2D().

@hybridherbst
Copy link
Contributor

The file in #22785 (closed as duplicate of this issue) exhibits slowdowns way beyond the 10x mark (going from "loads immediately" to "loads never and makes the entire browser unresponsive").

@mrdoob
Copy link
Owner

mrdoob commented Nov 5, 2021

Probably best to wait a few weeks on https://bugs.chromium.org/p/chromium/issues/detail?id=1256340 but sounds like the best solution for this would be to disable SRGB8_ALPHA8 (#22759) and try again in a few months.

@Mugen87
Copy link
Collaborator

Mugen87 commented Nov 5, 2021

sounds like the best solution for this would be to disable SRGB8_ALPHA8 (#22759) and try again in a few months.

That sounds like the best solution to me, too!

@Mugen87
Copy link
Collaborator

Mugen87 commented Nov 10, 2021

Since #22759 has been merged, the performance regression should disappear with r134.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants