gltfpack: Deduplicate mesh geometry #782

Merged
merged 8 commits into from
Oct 7, 2024

@zeux zeux commented Oct 5, 2024

Source glTF files sometimes attach identical meshes, with duplicate geometry, to different
nodes. gltfpack will now deduplicate these meshes early in the processing pipeline,
merging the node/instance lists and discarding the redundant geometry. Deduplication
is based on a 128-bit geometry hash which should, in practice, be collision-free.

This early deduplication can only happen when the meshes have the same materials.
Different materials require, at a minimum, different mesh objects in the glTF file; also,
different material properties may result in different processing being applied.
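
The early merge rule above can be sketched roughly as follows. This is a minimal Python illustration, not gltfpack's C++ code: the `Mesh` class, its fields, and `dedup_meshes` are hypothetical names, and the sketch ignores the other fields (such as morph targets or material variants) that a real implementation has to consider.

```python
from dataclasses import dataclass, field


@dataclass
class Mesh:
    geometry_hash: bytes          # 128-bit digest of the geometry (hypothetical field)
    material: str                 # material identity (hypothetical field)
    nodes: list = field(default_factory=list)  # nodes/instances using this mesh


def dedup_meshes(meshes):
    """Merge meshes that share both the geometry hash and the material."""
    seen = {}
    result = []
    for mesh in meshes:
        key = (mesh.geometry_hash, mesh.material)
        target = seen.get(key)
        if target is None:
            # First mesh with this geometry and material: keep it
            seen[key] = mesh
            result.append(mesh)
        else:
            # Duplicate: move its nodes to the kept mesh and drop the geometry
            target.nodes.extend(mesh.nodes)
    return result
```

Meshes with equal hashes but different materials stay separate in this sketch, matching the restriction described above.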

By itself, this change might increase draw call counts: previously, duplicate meshes in
the source scene could be detached from their nodes unless -kn was specified. Now, we
merge the node lists, and the heuristics for when to detach meshes from nodes play it
safe to avoid increasing the file size. However, this can already be mitigated using -mm
(which will merge geometry of mesh instances, increasing the file size back but reducing
draw call count) or -mi (which will use instancing in these cases), and when -kn is used,
which is common in practice, this is a non-issue.

When the meshes have different materials, we run them through the processing pipeline
but recompute the hashes after processing and use them to share the entire accessor
structure between primitives. This only impacts the output file size (removing duplicate
accessors and binary data); however we also prevent detaching of potentially-duplicate
meshes from their nodes to avoid breaking the deduplication due to the world-space
transform.

The cost of hash computation is ~2% of processing time for geometry-heavy scenes, so
this should not noticeably impact processing times.

Since it is late in the release cycle and this change is fairly large, a temporary command line
option (-mdd) is provided to disable deduplication. If you are reading this and decide to
use this option, note that the intent is to remove it and make deduplication not optional; to
avoid your pipelines breaking in the future, please open a discussion thread describing
your use case.

To make it possible to deduplicate meshes without a pairwise comparison
of the entire mesh contents, we can precompute a 128-bit hash
(using MurmurHash3, chosen for the simplicity of implementation and good
collision properties); this change does that without the deduplication
code for now.
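
As a rough illustration of precomputing a 128-bit geometry hash, here is a minimal Python sketch. It is not the gltfpack implementation: it uses BLAKE2b truncated to 16 bytes as a stand-in for MurmurHash3 (which the Python standard library does not provide), and `geometry_hash` and its argument layout are assumptions made for the example.

```python
import hashlib
import struct


def geometry_hash(streams, indices):
    """Return a 16-byte (128-bit) digest of the vertex streams and indices.

    streams: list of (name, flat list of floats) pairs (hypothetical layout)
    indices: list of non-negative integers
    """
    h = hashlib.blake2b(digest_size=16)  # stand-in for MurmurHash3 (assumption)
    for name, values in streams:
        # Feed the stream name and element count before the data itself
        h.update(name.encode("utf-8"))
        h.update(struct.pack("<I", len(values)))
        h.update(struct.pack(f"<{len(values)}f", *values))
    h.update(struct.pack("<I", len(indices)))
    h.update(struct.pack(f"<{len(indices)}I", *indices))
    return h.digest()
```

Two meshes can then be compared by their 16-byte digests instead of by their full contents.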

Calculating the hashes is reasonably fast: it takes ~2% of processing
time for geometry-heavy scenes.

Now that we have computed the geometry hash, we simply need to compare a
few other fields to make sure we can merge nodes/instances between two
meshes. This processing is only done on meshes that have not been
detached from their nodes; while we could dutifully compare morph target
and material variant information, it's easier not to.

Since we do not expose mesh identity via any command line options, it
should be safe to do this processing unconditionally. It may increase
draw call count by disabling node detaching and subsequent merging, but
that can be reactivated using the `-mm` flag.

Since deduplication relies on material equality, files with duplicate
materials and duplicate meshes can be merged more effectively if
materials are deduplicated first.

There are no other order dependencies for mergeMeshMaterials - it does
not rely on texture merging for example, which is independent - so this
should be safe.
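
The ordering argument can be illustrated with a small Python sketch. The helper names `merge_materials` and `count_unique_meshes` and the data layout are hypothetical, chosen only for this example; they are not gltfpack functions.

```python
def merge_materials(materials):
    """Map each material index to the first index with identical properties."""
    first_seen = {}
    remap = []
    for index, props in enumerate(materials):
        key = tuple(sorted(props.items()))
        remap.append(first_seen.setdefault(key, index))
    return remap


def count_unique_meshes(meshes, remap=None):
    """Count distinct (geometry hash, material) pairs, optionally after remapping."""
    keys = set()
    for geometry_hash, material in meshes:
        if remap is not None:
            material = remap[material]
        keys.add((geometry_hash, material))
    return len(keys)
```

With two identical materials at indices 0 and 1 and two meshes that share a geometry hash but reference different indices, the count is 2 without the remap and 1 with it, which is why running material merging first helps.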
@zeux zeux marked this pull request as draft October 5, 2024 16:53
@zeux zeux changed the title gltfpack: Deduplicate meshes with identical geometry gltfpack: Deduplicate mesh geometry Oct 5, 2024
When a mesh has duplicate geometry, detaching it from its node will
break the data sharing because of the world-space transform. When -mm is
specified, this is fair game: it's a request to trade off draw call
count for data size, but by default it would be more consistent to leave
these as is.

This can increase the number of used nodes and mesh objects, but not
beyond what we already use with `-kn`, and the extra cost generally is
paid off by the savings in deduplicated data.
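
To illustrate why detaching breaks sharing, here is a minimal Python sketch. It is not gltfpack code: `hash_positions` and `detach` are hypothetical helpers, BLAKE2b stands in for the geometry hash, and the node transform is reduced to a plain translation.

```python
import hashlib
import struct


def hash_positions(positions):
    """Return a 128-bit digest of a list of (x, y, z) positions."""
    h = hashlib.blake2b(digest_size=16)  # stand-in hash (assumption)
    for x, y, z in positions:
        h.update(struct.pack("<3f", x, y, z))
    return h.digest()


def detach(positions, translation):
    """Bake a node translation into the vertex positions (world space)."""
    tx, ty, tz = translation
    return [(x + tx, y + ty, z + tz) for x, y, z in positions]
```

Two instances of the same triangle hash identically in local space, but once each is detached with its own node translation, the baked positions (and so the hashes) differ and the data can no longer be shared.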
@zeux zeux marked this pull request as ready for review October 5, 2024 18:29
writeMeshIndices should take indices separately from the mesh as it's a
low-level function; the rest of the code has been moved to
writeMeshGeometry, which simplifies the outer flow and prepares it for
more changes.

When meshes have different materials, we cannot deduplicate them early
in the processing flow. Depending on the material structure, we might
end up with a different final vertex stream / index buffer, but since
the processing is deterministic, they will often end up being the same.

We can thus reuse the geometry hash to cache primitive accessors and
data: if we have seen the exact same geometry before, we can reuse the
entire JSON blob (that contains accessor indices that point to buffer
views) instead of emitting new ones.
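
The caching idea above might look roughly like this in a simplified Python sketch. The function `write_primitives` and its data layout are hypothetical: real accessors carry many more fields, and a single list entry stands in here for the accessors and binary data of one primitive.

```python
import json


def write_primitives(meshes):
    """meshes: list of (geometry_hash, vertex_data) pairs, already processed."""
    cache = {}       # geometry hash -> serialized primitive JSON
    accessors = []   # stand-in for emitted accessors and their binary data
    primitives = []
    for geometry_hash, vertex_data in meshes:
        blob = cache.get(geometry_hash)
        if blob is None:
            # Unseen geometry: emit new data and remember the JSON blob
            accessors.append(vertex_data)
            blob = json.dumps({"attributes": {"POSITION": len(accessors) - 1}})
            cache[geometry_hash] = blob
        # Seen geometry reuses the cached blob, so no new data is written
        primitives.append(blob)
    return primitives, accessors
```

Every mesh still gets its own primitive entry (so it can keep its own material), but primitives with identical processed geometry point at the same accessor indices.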

Technically, it's also possible to reuse individual streams, which is
more general, however this is more complicated to implement because
accessors with the same data may have a different setup, and is less
often useful.

For now this means that we redundantly recompute the mesh hashes.

Instead of computing a hash after mesh processing for every mesh, we now
only do this when the mesh has been tagged as possibly-duplicate. This
limits the cost of the hash update to cases where we are likely to benefit
from it, and makes it easier to disable deduplication entirely for
testing.

While it is possible that two meshes did not have duplicate geometry
before processing and are processed to have it (e.g., two identical meshes
with a redundant vertex stream on one but not the other that gets
removed), this is fairly unlikely in practice.
…tion

In theory, deduplication should not result in any issues, as any changes
only concern the internal file structure, and it should be beneficial in
most cases (and when it penalizes draw call counts, existing options
provide enough control). Just in case, we can now disable this
feature entirely using -mdd.

The goal is to remove this option in the future; it's just here in case
anyone reports issues in the wild.

Not running dedupMeshes means no mesh has the geometry_duplicate flag set,
so no dedup-related code activates.
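
A minimal Python sketch of this gating, under stated assumptions: meshes are plain dictionaries, `tag_duplicates` stands in for the tagging done by the early dedup pass, `rehash_tagged` stands in for the post-processing hash update, and BLAKE2b replaces the actual hash. Only the `geometry_duplicate` flag name comes from the description above; everything else is hypothetical.

```python
import hashlib


def tag_duplicates(meshes):
    """Flag every mesh whose geometry hash occurs more than once."""
    counts = {}
    for mesh in meshes:
        counts[mesh["hash"]] = counts.get(mesh["hash"], 0) + 1
    for mesh in meshes:
        mesh["geometry_duplicate"] = counts[mesh["hash"]] > 1


def rehash_tagged(meshes):
    """Recompute hashes only for flagged meshes; return how many were rehashed."""
    rehashed = 0
    for mesh in meshes:
        if mesh.get("geometry_duplicate"):
            mesh["hash"] = hashlib.blake2b(mesh["data"], digest_size=16).digest()
            rehashed += 1
    return rehashed
```

If the tagging pass is skipped, no mesh carries the flag, so `rehash_tagged` does no work and nothing downstream sees a duplicate.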
@zeux zeux merged commit 2df0a25 into master Oct 7, 2024
12 checks passed
@zeux zeux deleted the gltf-dedup branch October 7, 2024 15:33