
improve morphAttributes performance #26692

Closed · CITIZENDOT opened this issue Sep 4, 2023 · 16 comments

@CITIZENDOT (Contributor) commented Sep 4, 2023

Description

Currently, morphAttributes are handled inefficiently: before uploading the DataTexture to the GPU, the renderer iterates over every element in the morphAttribute's ArrayBuffer and copies it into the buffer that is used to create the DataTexture. This does not happen with BufferGeometry.attributes.

for ( let i = 0; i < morphTargetsCount; i ++ ) {
	const morphTarget = morphTargets[ i ];
	const morphNormal = morphNormals[ i ];
	const morphColor = morphColors[ i ];
	const offset = width * height * 4 * i;
	for ( let j = 0; j < morphTarget.count; j ++ ) {
		const stride = j * vertexDataStride;
		if ( hasMorphPosition === true ) {
			morph.fromBufferAttribute( morphTarget, j );
			buffer[ offset + stride + 0 ] = morph.x;
			buffer[ offset + stride + 1 ] = morph.y;
			buffer[ offset + stride + 2 ] = morph.z;
			buffer[ offset + stride + 3 ] = 0;
		}
		// ... normal and color writes omitted ...
	}
}

This is a performance profile for a geometry with only 2 morphAttributes. uploadTexture takes about 0.66 ms, whereas update takes 16 ms, because it iterates over every single element in the ArrayBuffer before uploading to the GPU.

[Image: performance profile]

Solution

I'm not sure if this is the only way, but if the morphAttributes were parsed up front into a form that the GPU/WebGL can consume directly, this overhead could be avoided.
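As a rough illustration of that idea (not existing three.js API: packMorphTarget is a hypothetical helper, it handles positions only and ignores the normal/color stride of the real code), the data could be packed once into the RGBA float layout the data texture expects and then reused for later uploads:

// Hypothetical helper: pack a position morph attribute once into the
// width * height * 4 RGBA float layout, so later uploads can reuse the
// packed array instead of repeating the per-element copy shown above.
function packMorphTarget( morphTarget, width, height ) {
	const packed = new Float32Array( width * height * 4 );
	for ( let j = 0; j < morphTarget.count; j ++ ) {
		packed[ j * 4 + 0 ] = morphTarget.getX( j );
		packed[ j * 4 + 1 ] = morphTarget.getY( j );
		packed[ j * 4 + 2 ] = morphTarget.getZ( j );
		packed[ j * 4 + 3 ] = 0;
	}
	return packed; // cache and reuse on subsequent uploads
}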

Alternatives

NA

Additional context

No response

@Mugen87 (Collaborator) commented Sep 4, 2023

There is no other way to transfer and arrange the attribute data into the data texture, so the initial overhead is inevitable. However, using a data texture enables many more active morph targets (compared to the previous attribute-based approach, which only allowed 8).

@donmccurdy (Collaborator)

To add a bit of context from a prior discussion elsewhere – the use case here involves animation based on sequences of morph targets. The mesh has 30-50K vertices, and a morph target for each frame of animation. No more than 2-3 morph targets are active at any one time. Similar to the models found at http://www.ro.me/tech/, but with much higher polygon counts.

Perhaps this is a situation where our previous approach to morph targets performed better? @CITIZENDOT would you be willing to compare performance in r132 or lower, before #22293?

I think an even better solution for this use case would be something like Vertex Animation Textures (VAT), but that is more advanced as there are currently few tools or tutorials for using the technique in three.js.
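For reference, a minimal sketch of the VAT idea (not a three.js built-in; `vatTexture`, the `vertexIndex` attribute and the baked `data` array are assumptions, and `geometry` stands for the mesh's BufferGeometry): bake per-frame vertex positions into a float DataTexture, one texel per vertex and one row per frame, then displace vertices in a custom vertex shader.

import * as THREE from 'three';

const vertexCount = geometry.attributes.position.count;
const frameCount = 60;

// One RGBA texel per vertex per frame; filled offline with baked positions.
const data = new Float32Array( vertexCount * frameCount * 4 );
const vatTexture = new THREE.DataTexture( data, vertexCount, frameCount, THREE.RGBAFormat, THREE.FloatType );
vatTexture.needsUpdate = true;

// Per-vertex index used to look up the texel column in the vertex shader.
const index = new Float32Array( vertexCount );
for ( let i = 0; i < vertexCount; i ++ ) index[ i ] = i;
geometry.setAttribute( 'vertexIndex', new THREE.BufferAttribute( index, 1 ) );

const material = new THREE.ShaderMaterial( {
	uniforms: { vat: { value: vatTexture }, frame: { value: 0 } },
	vertexShader: `
		attribute float vertexIndex;
		uniform sampler2D vat;
		uniform float frame;
		void main() {
			vec2 vatUv = vec2( ( vertexIndex + 0.5 ) / ${ vertexCount }.0, ( frame + 0.5 ) / ${ frameCount }.0 );
			vec3 animated = texture2D( vat, vatUv ).xyz;
			gl_Position = projectionMatrix * modelViewMatrix * vec4( animated, 1.0 );
		}
	`,
	fragmentShader: 'void main() { gl_FragColor = vec4( 1.0 ); }'
} );

// Advance the animation by updating the frame uniform, e.g. every few frames:
// material.uniforms.frame.value = ( material.uniforms.frame.value + 1 ) % frameCount;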

@CITIZENDOT (Contributor, Author)

Sure, I'll compare the performance with the old one. But the baseline I was looking for is geometry.attributes, i.e. whether morphAttributes are as efficient as BufferGeometry.attributes.

@donmccurdy (Collaborator)

Testing with a WebGL 1 renderer should have the same effect.

@CITIZENDOT (Contributor, Author)

I should add more context here. I'm trying to update morphAttributes without creating a new BufferGeometry instance.

I made this change in my local three.js repo (will open a PR soon). I'm updating morphAttributes every 5 frames, and while doing so I found this time-consuming update function. Every 5 frames (that is, on every morphAttribute upload), this overhead is incurred.

@CITIZENDOT (Contributor, Author) commented Sep 5, 2023

Testing with a WebGL 1 renderer should have the same effect.

Tested with WebGL1Renderer. The performance has improved considerably. Not sure about the 8 morphTarget limitation; I tried with 2, 8, 25 and 60 morphTargets and all of them played without any issue. The update function takes ~1 ms when using 25 morphTargets (which is great!)

@CITIZENDOT (Contributor, Author) commented Sep 5, 2023

One more minor correction I see here:

const buffer = new Float32Array( width * height * 4 * morphTargetsCount );

A morphAttribute might be an Int8Array or Int16Array as well, which are 4x or 2x smaller than a Float32Array respectively. We should probably use the constructor from the morphAttribute instead, unless WebGL only accepts Float32Array for morphAttributes.
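As a rough illustration of that suggestion (not how three.js actually allocates the buffer), it would look roughly like this; the follow-up comment below explains why the float allocation is kept when the attributes are normalized:

// Reuse the typed-array constructor of the source morph attribute instead of
// always allocating a Float32Array. This assumes the data texture's format and
// type would be adjusted to match, which is the open question discussed below.
const ArrayType = morphTargets[ 0 ].array.constructor; // e.g. Int16Array
const buffer = new ArrayType( width * height * 4 * morphTargetsCount );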

@donmccurdy (Collaborator)

Tested with WebGL1Renderer. The performance has improved considerably.

By "improved considerably" ... do you mean the performance is acceptable? Or still not great? Creating a DataTexture in the render loop will always have some overhead during setup, which is required to support unlimited morph targets. I'm not sure we can bring that overhead down so much that you can upload >1 million vertices per second (my estimate - feel free to correct!), this is neither how morph targets are typically used, nor what we've optimized the implementation for.

If the vertex attributes used in WebGL1 are working better, that's great! If not, I think preparing a data texture in advance and animating with a custom shader might be the better approach.

Not sure about the 8 morphTarget limitation; I tried with 2, 8, 25 and 60 morphTargets and all of them played without any issue.

The limitation here is 4-8 active morph targets, where active means a target has weight >0. The situation here, where each morph target is played sequentially, is less common.
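A small illustration of what "active" means here (`mesh` is an assumed THREE.Mesh with morph targets):

// Only targets with a non-zero influence count toward the 4-8 limit of the
// old attribute-based path, no matter how many targets the geometry stores.
mesh.morphTargetInfluences.fill( 0 );
mesh.morphTargetInfluences[ 12 ] = 0.7; // two active targets out of, say, 60
mesh.morphTargetInfluences[ 13 ] = 0.3;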

We should probably use the constructor from the morphAttribute instead, unless WebGL only accepts Float32Array for morphAttributes.

My guess would be that your morph attributes are normalized? (If you can share a way to reproduce the issue, or the models, that helps us avoid guessing.) If so, they are being used in the shader as floats anyway, and the conversion needs to be made.
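For context, a small sketch of what a normalized (quantized) morph attribute looks like; the concrete values are made up:

import * as THREE from 'three';

// Signed 16-bit data with normalized = true, e.g. as produced by quantizing
// tools: the GPU/shader reads it back as floats in [-1, 1], so a float staging
// buffer is still what ends up in the data texture.
const quantized = new THREE.BufferAttribute( new Int16Array( [ 32767, 0, -32768 ] ), 3, true );
console.log( quantized.normalized ); // true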

@CITIZENDOT (Contributor, Author)

By "improved considerably" ... do you mean the performance is acceptable?

It is great! I meant the change in performance is very noticeable.

The limitation here is 4-8 active morph targets, where active means a target has weight >0. The situation here, where each morph target is played sequentially, is less common.

Oh yes, we probably need 2-3 active morph targets at most. Would it be worth keeping the old behaviour as an option? Supporting unlimited morphTargets with this overhead is not strictly superior to supporting 8 attributes without it.

I'm preparing a CodeSandbox and will share it in an hour. Thank you so much for your time.

@Mugen87 (Collaborator) commented Sep 5, 2023

The problem is that constantly changing morph target data is not a common use case, IMO. Morph targets, like animation clips, are usually authored once in a DCC tool and then exported. I'm not sure it's worth retaining the "old" code path and making it configurable, since the plan is to delete it as soon as WebGL 1 support is dropped.

@CITIZENDOT (Contributor, Author) commented Sep 5, 2023

https://codesandbox.io/s/great-lucy-97kg6t

It's not fully reproducible because I couldn't upload the model files (file size limitation on CodeSandbox), but the source contains pretty much what I'm trying to do. (I added some comments explaining why I'm doing certain things.)

My guess would be that your morph attributes are normalized?

No, the morphAttributes are obtained from a meshopt-compressed GLB, so I think meshopt quantized them.

@CITIZENDOT (Contributor, Author)

VAT fits my use case perfectly (both in terms of performance and in being able to update it whenever I wish). Should we close this?

@donmccurdy (Collaborator)

No, the morphAttributes are obtained from a meshopt-compressed GLB, so I think meshopt quantized them...

Could be either, in that case — I believe gltfpack uses non-normalized vertex attributes for meshopt compression, and gltf-transform uses normalized attributes.


@Mugen87 I'm OK with whatever you prefer here. If the old implementation is more code than we'd want to keep around after WebGL 1 support is gone, then that is fine. I've wondered whether — for use cases that don't update morph targets — there's a practical difference in the runtime performance when sampling from a texture, but I haven't seen issues like that reported.

@CITIZENDOT (Contributor, Author)

gltfpack uses non-normalized vertex attributes for meshopt compression

That is true, I checked this.

@Mugen87 (Collaborator) commented Sep 5, 2023

Sampling from textures is in general a fast operation. Moreover, the morph target code uses texelFetch(), which is even faster than texture2D() since it bypasses all filtering (it just reads the raw texel value). The performance problem presented in this issue is related to the recurring texture prepare-and-upload operations, which the implementation is not designed for.

Since the OP uses just 2-3 active morph targets with frequently changing data, I would recommend not using morph targets at all, but rather custom buffer attributes (with DynamicDrawUsage) and a ShaderMaterial, or a material enhancement via onBeforeCompile().
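A minimal sketch of that suggestion (attribute and variable names are illustrative; `geometry` is the mesh's BufferGeometry): a dynamic custom attribute rewritten in place every few frames and applied in the vertex shader via onBeforeCompile().

import * as THREE from 'three';

// Dynamic per-vertex displacement, rewritten every few frames.
const displacement = new THREE.BufferAttribute( new Float32Array( geometry.attributes.position.count * 3 ), 3 );
displacement.setUsage( THREE.DynamicDrawUsage );
geometry.setAttribute( 'displacement', displacement );

const material = new THREE.MeshStandardMaterial();
material.onBeforeCompile = ( shader ) => {
	shader.vertexShader = shader.vertexShader
		.replace( '#include <common>', 'attribute vec3 displacement;\n#include <common>' )
		.replace( '#include <begin_vertex>', '#include <begin_vertex>\ntransformed += displacement;' );
};

// Each update (e.g. every 5 frames): write new values and flag the attribute.
// displacement.array.set( newValues );
// displacement.needsUpdate = true;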

@donmccurdy (Collaborator)

I'd lean toward a data texture or video texture and a custom shader I think, but in any case yes, morph targets are probably not the most efficient approach in this situation. I'll close this for now then, and we can decide on the WebGL 1 code another time.
