gltfpack: Remove duplicated frames from animation #766

Open
kzhsw opened this issue Sep 12, 2024 · 2 comments

kzhsw commented Sep 12, 2024

Motivation:
Currently gltfpack optimizes animation with a fixed-fps mode, which causes duplicated frames for slow animations, or animations that have a long duration.
Taking this model as an example, optimizing it with gltfpack -i untitled.glb -o untitled_gltfpack.glb not only makes the model larger (from ~72k to ~90k), but also increases the keyframe count from 4492 to 5615. In some cases this could hurt storage and memory.
untitled.zip
untitled_gltfpack.zip

Proposal:
Remove duplicated frames from animation. Here "duplicated frames" means frames that contribute nothing or very little to the animation. If a frame can be interpolated from the frame before it and the frame after it, then it's duplicated.
Here are example frames with LINEAR interpolation:

input 1 2 3
output (1,1,1) (2,2,2) (3,3,3)

Here the frame with input 2 can be linearly interpolated from the frame before it and the frame after it, so it's duplicated.
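The check described above can be sketched roughly as follows. This is a minimal illustration, not the glTF-Transform or gltfpack implementation: the names `prune_linear_keyframes`, `lerp` and the `tolerance` default are hypothetical, values are treated as plain vectors (rotations would need quaternion-aware handling), and times are assumed to be strictly increasing.

```python
from typing import List, Sequence, Tuple

Vec = Tuple[float, ...]


def lerp(a: Vec, b: Vec, t: float) -> Vec:
    """Component-wise linear interpolation between a and b."""
    return tuple(x + (y - x) * t for x, y in zip(a, b))


def prune_linear_keyframes(
    times: Sequence[float],
    values: Sequence[Vec],
    tolerance: float = 1e-6,  # hypothetical default, not taken from any tool
) -> List[int]:
    """Return the indices of keyframes to keep for a LINEAR track.

    A frame is dropped when it matches, within `tolerance`, the value
    interpolated between the last kept frame and the next frame.
    The first and last frames are always kept.
    """
    n = len(times)
    if n <= 2:
        return list(range(n))
    kept = [0]
    for i in range(1, n - 1):
        prev, nxt = kept[-1], i + 1
        t = (times[i] - times[prev]) / (times[nxt] - times[prev])
        predicted = lerp(values[prev], values[nxt], t)
        error = max(abs(p - v) for p, v in zip(predicted, values[i]))
        if error > tolerance:
            kept.append(i)
    kept.append(n - 1)
    return kept


# The example from the proposal: the middle frame is redundant.
kept = prune_linear_keyframes([1, 2, 3], [(1, 1, 1), (2, 2, 2), (3, 3, 3)])
```

Because this sketch is greedy and only compares against the next frame, frames dropped earlier are not re-checked, so the error over a long dropped run is not strictly bounded by the tolerance.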

glTF-Transform has implemented this algorithm here with docs here, and here is an implementation in C for reference.

A command option is needed to switch on this feature, and another optional one for configuring tolerance.

Alternatives:
Two passes: first run gltfpack, then run glTF-Transform.

Owner

zeux commented Sep 12, 2024

I agree this can be beneficial. However, this also may not be. gltfpack uses the fixed rate resampling strategy to ensure that the time can be shared across all tracks. The extra redundancy in the data can be a problem, but it's partially mitigated by compression (-cc). I think I also had a viewer implementation strategy in mind that preserves some of this efficiency at runtime (this avoids the need to binary search keyframe times), but that is probably not very relevant without a dedicated extension.
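To illustrate why fixed-rate sampling can avoid a binary search at runtime, here is a small sketch. It is only an assumption about what such a viewer strategy might look like; the function names and the idea of passing `start`, `fps` and `count` explicitly are hypothetical and not part of gltfpack or glTF.

```python
import bisect
import math
from typing import Sequence


def sample_index_uniform(t: float, start: float, fps: float, count: int) -> int:
    """Index of the keyframe at or before time t on a fixed-rate track.

    With uniform spacing the index is a direct computation, and every
    track resampled at the same rate can share one time array.
    """
    i = int(math.floor((t - start) * fps))
    return max(0, min(count - 1, i))


def sample_index_sparse(t: float, times: Sequence[float]) -> int:
    """Index of the keyframe at or before time t on an arbitrary track.

    Pruned tracks have irregular spacing, so the lookup needs a search.
    """
    i = bisect.bisect_right(times, t) - 1
    return max(0, min(len(times) - 1, i))


uniform_times = [0.0, 0.1, 0.2, 0.3]
i_uniform = sample_index_uniform(0.25, start=0.0, fps=10.0, count=4)
i_sparse = sample_index_sparse(0.25, uniform_times)
```

Both lookups agree on a uniform track; only the sparse one still works once frames have been removed.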

Another potential problem I've seen with systems like this in the past is that the error introduced during tolerance based resampling accumulates through a hierarchy, which can lead to extra small movement of bones that are not supposed to move; the common example is feet not being firmly planted on the ground.
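The accumulation effect can be illustrated with a toy planar bone chain. This is a simplified sketch under stated assumptions (2D, equal bone lengths, the same small angular error on every joint); `chain_end_position` and `end_error` are hypothetical helper names, not part of any tool.

```python
import math
from typing import Sequence, Tuple


def chain_end_position(
    angles: Sequence[float], bone_length: float = 1.0
) -> Tuple[float, float]:
    """End position of a planar chain where each joint angle is
    relative to its parent, as in a skeleton hierarchy."""
    x = y = 0.0
    total = 0.0
    for a in angles:
        total += a  # child rotations compose with all parent rotations
        x += bone_length * math.cos(total)
        y += bone_length * math.sin(total)
    return x, y


def end_error(depth: int, per_joint_error: float, bone_length: float = 1.0) -> float:
    """Distance the chain tip moves when every joint is off by
    `per_joint_error` radians, relative to a straight chain."""
    exact = chain_end_position([0.0] * depth, bone_length)
    perturbed = chain_end_position([per_joint_error] * depth, bone_length)
    return math.dist(exact, perturbed)


shallow = end_error(depth=1, per_joint_error=0.001)
deep = end_error(depth=8, per_joint_error=0.001)
```

In this toy setup the tip error grows faster than linearly with chain depth, which is why a tolerance that looks safe for a single track can still produce visible movement at the end of a deep chain such as a foot.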

For animations with just a single track this is probably almost always a good idea, although I think compression efficiency is reduced on data with gaps so this would need to be verified.

Ideally this should be tested on some real world animation data to see how many keyframes this saves and how often this leads to splitting the time tracks, to determine if this is broadly beneficial, and validate the hierarchical error to see how low the tolerance can be driven while avoiding that on deep bone chains.

zeux added the gltfpack label Sep 12, 2024
Owner

zeux commented Sep 16, 2024

Alright, a couple notes since I looked into this a little bit.

  1. In general, the optimization can certainly be effective depending on the input asset. I tested this a little on some assets, and sometimes a fair number of keyframes are redundant (dots indicate needed keyframes).

[image]

  2. I am worried about the hierarchical accumulation issue. In fact, glTF-Transform with default tolerance shows this on https://github.com/mrdoob/three.js/blob/dev/examples/models/gltf/Soldier.glb (you'd need to look at the left foot touching the ground fairly closely; after resampling it exhibits slight jitter). It is easier to see with a larger tolerance; to be fair, gltfpack has this issue with -cc as well unless you bump the default settings for animation rotation up. Because of this, I suspect this needs to be opt-in via a separate argument.

  3. Due to a slight floating point wobble that is unavoidable if each track is pruned independently, in some cases the kept keyframes in a hierarchy are slightly offset from each other on the time axis. This may exacerbate (2) on top of the quantization, and will also result in worse sharing between time tracks.

[image]

  4. Having said that, sharing time tracks is still essential - on many assets many tracks end up sharing the kept frame subsets, and we'd only need a single input for any track with the same mask. Also, sometimes only a couple of frames end up being dropped from an otherwise long track; a small threshold will be necessary to maintain efficiency for almost-incompressible tracks (e.g. must remove X% of frames) - see the first two tracks on the image.

[image]

  5. gltfpack today already has a mode that triggers a 2-keyframe track export; this is necessary for constant tracks to maintain animation time start/end. On some animation tracks I am seeing all intermediate keyframes between the first and last one become redundant, but the constant track mode doesn't activate because the first and last keyframes are different. This is interesting because this would be a subset of this optimization that can be enabled unconditionally. I would also need to test how many constant tracks there usually are and what overhead they carry, as maybe all constant tracks can become 2-keyframe tracks with minimal cost...

  6. As I noted before, compression makes the improvements here harder to justify. For example, here are animation stats from a couple of scenes:

junkrat without compression: current 54944 bytes, pruned 44520 bytes
junkrat with compression (-c): current 24333 bytes, pruned 26954 bytes (a little worse)
junkrat with compression (-cc): current 12286 bytes, pruned 19565 bytes (worse)
junkrat with compression (-cc -ar 14): current 14635 bytes, pruned 20409 bytes (worse)

rubyrose without compression: current 118580 bytes, pruned 75064 bytes
rubyrose with compression (-c): current 54782 bytes, pruned 47603 bytes (better)
rubyrose with compression (-cc): current 28355 bytes, pruned 35699 bytes (worse)
rubyrose with compression (-cc -ar 14): current 37263 bytes, pruned 38590 bytes (a little worse)
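
For a quick comparison, the relative size change for each configuration can be computed from the byte counts listed above. This is only a convenience sketch; `relative_change` is a hypothetical helper, and the numbers are copied from the stats in this comment.

```python
# (scene, flags) -> (current bytes, pruned bytes), copied from the stats above
measurements = {
    ("junkrat", "no compression"): (54944, 44520),
    ("junkrat", "-c"): (24333, 26954),
    ("junkrat", "-cc"): (12286, 19565),
    ("junkrat", "-cc -ar 14"): (14635, 20409),
    ("rubyrose", "no compression"): (118580, 75064),
    ("rubyrose", "-c"): (54782, 47603),
    ("rubyrose", "-cc"): (28355, 35699),
    ("rubyrose", "-cc -ar 14"): (37263, 38590),
}


def relative_change(current: int, pruned: int) -> float:
    """Size change in percent; negative means pruning made the data smaller."""
    return (pruned - current) / current * 100.0


for (scene, flags), (current, pruned) in measurements.items():
    print(f"{scene:9s} {flags:15s} {relative_change(current, pruned):+6.1f}%")
```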

That said, this is with limited input sharing between tracks (only consecutive tracks share inputs) so these can be improved a little bit perhaps. But these results make me think this optimization needs to be more limited to account for the extra cost of the inputs.
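
The mask-sharing and threshold ideas from the notes above could be sketched like this. It is an illustration under assumptions, not gltfpack's behavior: `group_tracks_by_mask` and `min_removed_fraction` are hypothetical names, and all tracks are assumed to have been resampled to the same number of frames first.

```python
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple


def group_tracks_by_mask(
    masks: Dict[str, Sequence[bool]],
    min_removed_fraction: float = 0.1,  # hypothetical "must remove X% of frames"
) -> Dict[Tuple[bool, ...], List[str]]:
    """Group tracks whose kept-frame masks are identical so that each
    group can share one time input.

    Tracks that would drop fewer than `min_removed_fraction` of their
    frames fall back to the full mask, so they keep sharing the
    fixed-rate time track instead of getting a nearly identical new one.
    """
    groups: Dict[Tuple[bool, ...], List[str]] = defaultdict(list)
    for name, mask in masks.items():
        removed = list(mask).count(False) / len(mask)
        if removed < min_removed_fraction:
            mask = [True] * len(mask)
        groups[tuple(mask)].append(name)
    return dict(groups)


sparse = [True] + [False] * 8 + [True]
almost_full = [True] * 5 + [False] + [True] * 4
masks = {
    "hips": sparse,
    "spine": sparse,
    "hand": almost_full,
    "foot": [True] * 10,
}
groups = group_tracks_by_mask(masks, min_removed_fraction=0.2)
```

Here the two sparse tracks share one pruned input, while the almost-incompressible track stays on the full-rate input together with the unpruned one, so only two time inputs are needed.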
