Updated skinning example and text #64

Merged
merged 6 commits into from
Nov 19, 2021

Conversation

javagl
Contributor

@javagl javagl commented Nov 13, 2021

The skinning section has several issues, and has been a source of confusion for quite a while.

Some inconsistencies between the textual description in the tutorial, and the actual data in the (inlined, embedded) SimpleSkin example had been pointed out here:

I recently added a SimpleSkin example to the sample models repo, via KhronosGroup/glTF-Sample-Models#243 . This was supposed to be a sensible, warning-free model, and should also be used here in the tutorial.

This PR attempts to update the skinning section based on this latest sample model.

  • The (inlined, embedded) example has been updated using that SimpleSkin model
  • The geometry data in the text has been updated to match the geometry data of the model. Originally, the geometry data in the text was a rectangle (0,0)-(1,2). Now it is a rectangle (-0.5, 0)-(0.5, 2). This should also match the diagrams, which usually display the origin at the center of the bottom edge of the rectangle.
  • The inverseBindMatrices caused some confusion: Depending on the geometry data, it should either include a translation of -0.5 along the x-axis or not. In the actual model, it does not contain this translation, and this should now match the description.
  • The role of the inverseBindMatrices has been aligned with the wording of the spec. The text no longer talks about the global transform of the node. It now only says that they are "...used to bring coordinates being skinned into the same space as each joint.".
  • The latter is related to Update and simplify the skinning section #54 : There was a globalTransformOfNodeThatTheMeshIsAttachedTo^-1 involved in the computations, but according to the latest spec (and some discussion in related issues), this should not be required, and it has been removed with this PR.

If somebody could have a look at the current state, and tell me whether I overlooked something (or introduced another error...), that would be great. Blatantly pinging people who have been active in the skinning-related discussions:

@lexaknyazev @emackey @donmccurdy @cx20 @scurest


@emackey : You had opened #44 , and I'm not entirely sure whether I understood that correctly, and whether it is addressed with this PR. (Note that the diagrams have not been changed - only the underlying data and the textual descriptions).

Specifically, the sample model places a translation on node index 1, which is actually joint 0, the parent joint. It places a rotation but not a translation on node 2, joint 1, which is the only child joint.

My understanding was that the parent node should have the translation. This causes the child to be translated (and the child then contains a rotation, which will affect the vertices). Specifically:

  • Should the translation be at the child node? (That doesn't seem right...)
  • Does this affect the inverseBindMatrices? (I wondered whether this might mean that the inverse bind matrix of the parent node should actually be the identity matrix, and only the inverse bind matrix of the child node should contain the -1 translation in y-direction. I could just "try out whether this works in most viewers", but there has been so much back and forth that I'm increasingly hesitant to do this. Maybe someone with a deeper understanding can make a definite statement here...)

@scurest

scurest commented Nov 13, 2021

I checked that the accessor contents as reported in the text match the contents as calculated from the data URIs. They do.


The first two vertices (at the bottom of the geometry) will not be influenced by joints. Their weights are set to (1.0, 0.0, 0.0, 0.0), because the weights for each vertex have to sum up to 1.0.

Yes they will. Every vert is influenced by joints. The first two are influenced completely by the first joint, just like the last two are influenced completely by the second joint.


The vert shader also needs to be updated. Since the skin matrix now transforms directly into world space, the model matrix should not be applied.

 void main(void)
 {
     mat4 skinMat =
         a_weight.x * u_jointMat[int(a_joint.x)] +
         a_weight.y * u_jointMat[int(a_joint.y)] +
         a_weight.z * u_jointMat[int(a_joint.z)] +
         a_weight.w * u_jointMat[int(a_joint.w)];
-    vec4 pos = u_modelViewMatrix * skinMat * vec4(a_position,1.0);
-    gl_Position = u_projectionMatrix * pos;
+    vec4 worldPosition = skinMat * vec4(a_position,1.0);
+    vec4 cameraPosition = u_viewMatrix * worldPosition;
+    gl_Position = u_projectionMatrix * cameraPosition;
 }

And this line too

The skin matrix is then used to transform the original position of the vertex before it is transformed with the model-view-matrix.

It is also probably worth explicitly pointing out that unlike unskinned meshes, the node where a skinned mesh is instantiated does not affect the calculation.
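For readers who want to follow the arithmetic outside of a shader, here is a minimal pure-Python sketch of the same blend. The joint matrices and the sample vertex are made up for illustration and are not taken from the SimpleSkin model. The point is only that the skin matrix alone takes the vertex to world space, with no model matrix involved.

```python
# Minimal pure-Python sketch of linear blend skinning, mirroring the shader above.
# Matrices are 4x4 nested lists in row-major order (an illustrative choice).

def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def translation(x, y, z):
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m

def skin_matrix(joints, weights, joint_matrices):
    """Weighted sum of the joint matrices referenced by one vertex."""
    result = [[0.0] * 4 for _ in range(4)]
    for joint, weight in zip(joints, weights):
        m = joint_matrices[joint]
        for r in range(4):
            for c in range(4):
                result[r][c] += weight * m[r][c]
    return result

def transform_point(m, p):
    """Apply a 4x4 matrix to a 3D point (w = 1)."""
    v = (p[0], p[1], p[2], 1.0)
    return tuple(sum(m[r][c] * v[c] for c in range(4)) for r in range(3))

# Hypothetical joint matrices: joint 0 stays put, joint 1 shifts by +1 along x.
joint_matrices = [identity(), translation(1.0, 0.0, 0.0)]

# A vertex influenced half by joint 0 and half by joint 1.
# Note that no model matrix of the mesh node appears anywhere.
world_position = transform_point(
    skin_matrix((0, 1, 0, 0), (0.5, 0.5, 0.0, 0.0), joint_matrices),
    (0.5, 1.0, 0.0),
)
print(world_position)  # (1.0, 1.0, 0.0)
```

The vertex ends up shifted by half of joint 1's translation, which is what the equal 0.5 weights would suggest.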


There's also this line about inverse binds that still talks about the global transforms

This means that each matrix is the inverse of the global transform of the respective joint, in its initial configuration.

@javagl
Contributor Author

javagl commented Nov 14, 2021

Thanks @scurest for this feedback!

The first two vertices (at the bottom of the geometry) will not be influenced by joints. ...

Yes they will. Every vert is influenced by joints. The first two are influenced completely by the first joint, just like the last two are influenced completely by the second joint.

This was an attempt to address the fact that the JOINTS_0 input only contains zeros in the first two rows. This is also related to what emackey pointed out in #44 , saying

Also, separate from the above, at some point we made the decision to zero out the first two lines of JOINTS_0 [...], and some of the wording needs to be updated to reflect what happened there.


Specifically, the part where the JOINTS_0 is explained currently says this:

Vertex 0:  0, 0, 0, 0,
Vertex 1:  0, 0, 0, 0,
Vertex 2:  0, 1, 0, 0,
Vertex 3:  0, 1, 0, 0,
Vertex 4:  0, 1, 0, 0,
Vertex 5:  0, 1, 0, 0,
Vertex 6:  0, 1, 0, 0,
Vertex 7:  0, 1, 0, 0,
Vertex 8:  0, 1, 0, 0,
Vertex 9:  0, 1, 0, 0,

This means that every vertex should be influenced by joint 0 and joint 1, except for the two vertices at the bottom.

And given your feedback, this is at least very ambiguous. It should at least be extended by saying

...except for the two vertices at the bottom, which are only influenced by joint 0.

Do you agree? (The wording for the WEIGHTS part could then be updated analogously)


EDIT: I'll integrate the point about the model matrix ASAP

@scurest

scurest commented Nov 14, 2021

This means that every vertex should be influenced by joint 0 and joint 1, except for the two vertices at the bottom, which are only influenced by joint 0.

Vertex 8 and 9 are also only influenced by joint 1.
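One way to check such statements is to combine the joint indices with the weights and list the joints that have a non-zero weight per vertex. The sketch below uses the JOINTS_0 rows quoted in this thread. Only the weights of the first two vertices are stated here, so the remaining weight rows are assumptions chosen to be consistent with the remarks above.

```python
# JOINTS_0 rows as listed in this thread.
JOINTS = [(0, 0, 0, 0)] * 2 + [(0, 1, 0, 0)] * 8

# WEIGHTS_0 rows: only the first two are stated in the thread, the rest are assumed.
WEIGHTS = (
    [(1.0, 0.0, 0.0, 0.0)] * 2      # stated in the tutorial text
    + [(0.75, 0.25, 0.0, 0.0)] * 2  # assumed
    + [(0.5, 0.5, 0.0, 0.0)] * 2    # assumed
    + [(0.25, 0.75, 0.0, 0.0)] * 2  # assumed
    + [(0.0, 1.0, 0.0, 0.0)] * 2    # assumed, matches the remark about vertices 8 and 9
)

def effective_joints(joints, weights):
    """Return the sorted joint indices that have a non-zero weight for one vertex."""
    return sorted({j for j, w in zip(joints, weights) if w != 0.0})

influences = [effective_joints(j, w) for j, w in zip(JOINTS, WEIGHTS)]
for index, joints in enumerate(influences):
    print(f"Vertex {index}: joints {joints}")
```

Under these assumed weights, vertices 0 and 1 depend only on joint 0, vertices 8 and 9 only on joint 1, and the vertices in between on both.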

@emackey
Member

emackey commented Nov 15, 2021

* Should the translation be at the child node? (That doesn't seem right...)

As you know, bones often form chains, with the child bone beginning where its parent bone ended.

screenshot

When these chains occur, each child will need a non-animated translation moving its local origin away from its parents' origin by the length of the parent bone, paired with an animated rotation of the child bone's movement.

If the child has no translation then it shares its origin with the parent, and the bones "overlap" as seen in this diagram:

diagram of wrong location

I don't think we want the child and its parent sharing their origins like this. It does work, but it's not the usual way bones are laid out, and it's not what the diagrams in this PR show. The diagrams in this PR show a child bone that has been translated "up" from its parent by the length of the parent bone, and animates its rotation from there.

Does this affect the inverseBindMatrices?

Yes, and I think your comments about the parent's inverseBindMatrix being identity are correct once the translation has moved. The diagrams show the parent sitting at the model's origin.

I could just "try out whether this works in most viewers"

Try importing it into Blender. Beware that Blender will make an incorrect assumption about which axis the bone "points" along, partly due to the bones being colocated, I suspect.

Blender screenshot

Hard to see without animation, but both bones are in the same location here. One of them rotates when animated.

@javagl
Contributor Author

javagl commented Nov 15, 2021

I'm a bit embarrassed that I have to ask, because it sounds like something very basic, but I'm really confused right now:

When these chains occur, each child will need a non-animated translation moving its local origin away from its parents' origin by the length of the parent bone,

Why should the translation in the child cause its origin to be moved? My understanding was that the translation in the parent defines where the origin of the child will be. More generally, my way of thinking about this was that everything that is attached to a node (being a camera, a mesh, or another node) is translated with the translation of that node, and that a node cannot "translate itself" by specifying a translation...

Is there any conceptual difference in this regard between nodes of a skeleton and "normal" nodes?

@emackey
Member

emackey commented Nov 15, 2021

Why should the translation in the child cause its origin to be moved? My understanding was that the translation in the parent defines where the origin of the child will be.

It does feel like we're having a basic disconnect, and I worry it's my inability to express a basic concept that we both know. The child does of course inherit all of the parent's transforms, and add its own as well. Each node's mesh (or skin vertices) acts as a would-be child of not just its parent's node's transforms, but the child node's transforms as well.

For example, try putting a mesh with the glTF avocado at these two nodes, and remove the rest of the skinning info. If you have a parent/child relationship where the child has only a rotation, no translate/scale, then the two avocados will be at the same location, intersecting each other at the rotation angle. The child avocado needs a translation to avoid touching its parent. To lay a chain of avocados end-to-end, each additional child must translate itself away from its parent, by exactly the length of the parent avocado. I'm getting hungry now.

@javagl
Contributor Author

javagl commented Nov 15, 2021

And I'm afraid that this disconnect shows that I've been missing some point for quite a while. But the skinning in general has caused so many questions, discussion and confusion that ... I hope that it's OK to carve out some of these things here again. Thanks to everybody who is patient enough with me. Maybe that will help me to write it up in a form that even I understand, and - considering what a low bar that is - will eventually also help others to better understand what's going on there.


If the child has no translation then it shares its origin with the parent, and the bones "overlap" as seen in this diagram:

Why? The parent has a translation of (0,1,0). This means that the origin of the parent will be at (0,0,0), and the origin of the child will be at (0,1,0)

O   Child node at (0,1,0) 
|
|
|   Translation stored in PARENT: (0,1,0)
|
|
O   Parent node at (0,0,0) 

Am I missing something here?


I asked earlier:

Should the translation be at the child node? (That doesn't seem right...)

and part of your response was what I already quoted:

When these chains occur, each child will need a non-animated translation moving its local origin away from its parents' origin by the length of the parent bone, ...

But I now have to ask one more specific question:

  • Assume the configuration from the SimpleSkin example
  • Assume that the length of the parent bone was 2.0, and the length of the child bone was 1.0.

Which node should then receive which translation? I assume that

  • The parent node should have a translation of (0,2,0)
  • The child node should have a translation of (0,1,0) (and the rotation)

Is this correct?
(I'm asking because that sounds different from what your quote seems to suggest...)


Specifically, I'm still wondering about whether a translation has to be stored in the child node. Intuitively, it would make sense to have a translation there as well, but only in addition to the one stored in the parent.

(BTW: I'll have to take a closer look at the maths, and play that through with the inverseBindMatrices in mind....)

But looking at the image from the tutorial

I now have to assume that this is suggesting the wrong thing intuitively, or is technically wrong in the detail: The text in the third panel says "translating about 1.0 along the y-axis, and rotating about 45 degrees around the z-axis". This was supposed to convey the idea

  • first translating with (0,1,0) - namely the translation of the parent node
  • then rotating about 45 degrees - namely, the rotation of the child node

Coincidentally, the visual result would be the same if this was done with a T*R*S where both the rotation and the translation were stored in the child node.

Now, similar to the question above: If the parent bone had a length of 2.0, then the order of transformations would be

  • The transform from the T*R*S of the child node:
    • A rotation about 45 degrees
    • (Then a translation about (0,1,0), probably?)
  • Then the transform from the parent node:
    • A translation about (0,2.0,0)

@emackey
Member

emackey commented Nov 15, 2021

Unfortunately I have to sign off for tonight, but:

Your diagram is 100% correct. Everything it says and shows about joint 1 is fine. The only problem is joint node 0. It's shown where it should be, at the origin, so even there, the diagram is fine, it's the model that's incorrect.

Imagine that you put a triangle on "joint node 0" there, with a translate as part of that node. Would the triangle somehow appear at the origin, in spite of the translate? The translate on that node affects the node's mesh, and any skinned vertices assigned to that node.

Try rotating the parent, joint node 0. What point does it rotate about: The bottom of the diagram, or the middle?

I'll try to craft a more useful answer tomorrow :)

@emackey
Member

emackey commented Nov 16, 2021

Hi @javagl, I crafted a small example, based on SimpleSkin. I've removed all the skinning stuff and just placed a mesh on the joint nodes themselves, in the shape of a disc with a long tail, similar to how joints are portrayed in your diagrams.

SimpleJoints.zip

This is using the node hierarchy & transforms from SimpleSkin, only a new mesh has been added to indicate location. When the animation plays, we can see the discs overlap.

screenshot1

Try editing the file to move the transform from the parent to the child node, such that the child has both translation & rotation. That's how to get this arrangement:

screenshot2

I think the disconnect is here:

The parent has a translation of (0,1,0). This means that the origin of the parent will be at (0,0,0)

No. When we talk about the "origin" (or location, or position/orientation) of any node, we always talk about the post-transform location/rot/scale, not the pre-transform location. The pre-transform location typically isn't a useful thing to show or mention, as all root nodes have the identity matrix there. So we show the post-transform: For example in Blender, if you move the default cube around, it shows the cube node's position at the center of the relocated cube, even though it has no parent. Likewise in your own diagram, the child joint is shown post-transform, incorporating the child's own rotation, even though that rotation wasn't inherited. So when your parent node has that (0,1,0) transform on it, it moves the node away from the origin, and it's no longer correct to portray it as if it were on the origin.
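The difference between the two layouts can also be checked numerically. The following Python sketch computes the post-transform origin of each node as the translation column of its global matrix, with the global matrix of a child being the parent's global matrix times the child's local matrix. The helper names are made up for this illustration, and the 45 degree rotation is just the value mentioned for the tutorial diagram.

```python
import math

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y], [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]

def rotation_z(degrees):
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0], [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]

def origin(m):
    """Post-transform origin of a node: the translation column of its global matrix."""
    return (m[0][3], m[1][3], m[2][3])

def chain_origins(parent_local, child_local):
    parent_global = parent_local                       # the parent is a root node here
    child_global = matmul(parent_global, child_local)  # global = parent global * local
    return origin(parent_global), origin(child_global)

# Layout A: translation on the parent, rotation only on the child.
layout_a = chain_origins(translation(0.0, 1.0, 0.0), rotation_z(45.0))

# Layout B: parent untransformed, child carries translation and rotation (T * R).
layout_b = chain_origins(
    translation(0.0, 0.0, 0.0),
    matmul(translation(0.0, 1.0, 0.0), rotation_z(45.0)),
)

print("A:", layout_a)  # both origins coincide at (0, 1, 0)
print("B:", layout_b)  # parent at the origin, child at (0, 1, 0)
```

In layout A the two joints share one origin, which corresponds to the overlapping discs. In layout B the parent sits at the origin and the child one unit above it, which is the arrangement the diagrams show.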

Assume that the length of the parent bone was 2.0

Ah, well, this is where things get... interesting. It took me quite some time to wrap my head around where glTF actually stores its bone length, which turns out to be nowhere. Bone length is critical for artists when rigging, as it graphically portrays how much influence the bone is intended to have, and helps represent a skeleton of an organic creature. But when it comes to actually running the skinning vertex shader, which I learned how to write by reading your own tutorials, you'll find that the bone length is completely absent. Indeed glTF does not even specify which axis (X, Y, or Z) the bone's length might be aligned to, or provide any place to store its length. This is because bone "length" is merely used to initialize the set of vertex bone (joint) weights, and this initialization happens prior to glTF export. The joint weights are all that's needed to describe the bone's influence over skinned vertices, and at runtime all the implementation needs is that weight influence and the proper transformation.

This does mean that re-importing glTF's bones back into a graphical tool like Blender is fraught with some guesswork (thanks again @scurest for making it work as well as it possibly can!).

So there you have it: A parent bone with its head directly on the origin, as shown on your diagram, cannot have any translation at all. Its position (3-axis), orientation (3-axis), and scale (3-axis) are all known, but its "direction" (single-axis) and "length" (single-axis) are unspecified and irrelevant at runtime. Its influence has already been encoded into vertex weights, so each vertex knows exactly what percentage of the bone's 3-axis transformation to apply.

@javagl
Contributor Author

javagl commented Nov 16, 2021

Thanks for that detailed response, and the example. I really appreciate that.

I've kept thinking about this (and maybe things like this shouldn't keep me awake until 5am, but they do). And the crucial source of misunderstanding was probably indeed related to the idea of "bones":

Ah, well, this is where things get... interesting. It took me quite some time to wrap my head around where glTF actually stores its bone length, which turns out to be nowhere.
...
Indeed glTF does not even specify which axis (X, Y, or Z) the bone's length might be aligned to, or provide any place to store its length.

It's hard to explain, in hindsight, why I came to this idea, but I somehow thought that "the translation is the bone". This makes the question about its axis or length obsolete, and maybe that felt nicely simple and convenient ... or maybe that idea was an undesired side-effect of drawing these diagrams - who knows.


Given your explanations, I think that my understanding has been corrected. It does not yet feel as "deep" and "settled" as I'd like it to be, but I'll probably play around with the SimpleSkin example, and apply some different translations to the parent and child nodes, and see whether the observed results match my expectations. This could include an example where not only the rotation, but also the translation of the child is animated - for example, from (0,1,0) to (0,2,0) and back again.

But this would be beyond what would be covered in the tutorial.


To bring this PR into a mergeable state:

  • Based on what you said, Update SimpleSkin to place the parent bone at the origin. glTF-Sample-Models#335 is probably the right thing to do. I have not yet tried out the actual model, but it technically matches the SimpleJoints example, in that it only moves the translation to the child, and updates the IBMs. And... the fact that the first IBM now is the identity matrix is consistent with my guess in the first post here.
    In that PR, you mentioned

    Of course, I started from the sample-model default branch here,

    but I don't have any pending contributions in glTF-Sample-Models, so that other PR could probably be merged. (I had copied the inlined example for the tutorial from the sample model, because these models should match)

  • I'll either wait for the merge of the other PR, or manually update the inlined model for the tutorial, based on your input (translation goes into child, IBM matrices are updated). To my understanding, the diagrams may not need to be updated, but I'll consider clarifying the wording a bit (for the third panel in the diagram that I posted above).

  • I'll integrate the parts of feedback from scurest ASAP


I'll do another "ping" when this is done (and this might cause some long-standing issues to be closed), and then address some of the other (hopefully simpler) issues afterwards.

@javagl
Contributor Author

javagl commented Nov 17, 2021

A detail. This is not an issue of the tutorial, but raises the question of how things should be worded appropriately here:

Yes they will. Every vert is influenced by joints. The first two are influenced completely by the first joint, just like the last two are influenced completely by the second joint.

The joint information in the text and example file is currently

Vertex 0:  0, 0, 0, 0,
Vertex 1:  0, 0, 0, 0,
Vertex 2:  0, 1, 0, 0,
Vertex 3:  0, 1, 0, 0,
Vertex 4:  0, 1, 0, 0,
Vertex 5:  0, 1, 0, 0,
Vertex 6:  0, 1, 0, 0,
Vertex 7:  0, 1, 0, 0,
Vertex 8:  0, 1, 0, 0,
Vertex 9:  0, 1, 0, 0,

There has been some back and forth: I think that originally, the data was

Vertex 0:  0, 1, 0, 0,
Vertex 1:  0, 1, 0, 0,
...

and that should be fine, because the corresponding weights for joint 1 are 0.0. But changing that causes two validation warnings of the form:

ACCESSOR_JOINTS_USED_ZERO_WEIGHT

Joints accessor element at index 1 (component index 1) is used with zero weight but has non-zero value (1).

For the last vertices, the same warning could be issued. In fact, when changing the last two vertices to

...
Vertex 8:  1, 1, 0, 0,
Vertex 9:  1, 1, 0, 0,

Then it does cause a warning. The caveat here is that we cannot differentiate between "joint 0" and "an unused value". However, I have to say that this feels a bit inconsistent....
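The asymmetry can be reproduced with a small check. The actual validator logic is not shown in this thread, so the rule below is only inferred from the warning text (a joint index that is non-zero while its weight is zero), and the function name is hypothetical. It is a rough Python sketch, not the validator's implementation.

```python
def joints_used_with_zero_weight(joints_rows, weights_rows):
    """Return (vertex, component, joint) triples where a non-zero joint index has zero weight.

    Assumed rule, inferred from the ACCESSOR_JOINTS_USED_ZERO_WEIGHT message text.
    """
    findings = []
    for vertex, (joints, weights) in enumerate(zip(joints_rows, weights_rows)):
        for component, (joint, weight) in enumerate(zip(joints, weights)):
            if weight == 0.0 and joint != 0:
                findings.append((vertex, component, joint))
    return findings

# Bottom vertex with the original data: joint 1 is referenced with weight 0.0.
bottom_original = joints_used_with_zero_weight([(0, 1, 0, 0)], [(1.0, 0.0, 0.0, 0.0)])

# Top vertex as it is now: joint 0 is referenced with weight 0.0, but index 0
# cannot be told apart from an unused slot, so nothing is reported.
top_current = joints_used_with_zero_weight([(0, 1, 0, 0)], [(0.0, 1.0, 0.0, 0.0)])

# Top vertex changed to (1, 1, 0, 0): the first component is now reported.
top_changed = joints_used_with_zero_weight([(1, 1, 0, 0)], [(0.0, 1.0, 0.0, 0.0)])

print(bottom_original)
print(top_current)
print(top_changed)
```

Under this assumed rule, an unused slot and "joint 0 with zero weight" look identical, which is the inconsistency described above.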

@javagl
Contributor Author

javagl commented Nov 17, 2021

@scurest

There's also this line about inverse binds that still talks about the global transforms

This means that each matrix is the inverse of the global transform of the respective joint, in its initial configuration.

Is this wrong, though? It does not refer to the global transform of the node that the skin is attached to, but to the global transform of the joint node - and based on the changes recently discussed (and e.g. the inverse bind matrix of node 0 becoming the identity matrix), this seems to be correct...

Based on the fixed state that will be part of the
sample models, with the translation being on the
child node, and the inverse bind matrices being
updated accordingly.
Updated inlined snippets and wording for translation
being at the parent node.
Updated wording for inverse bind matrices
Tried to update wording for joint influences.
@javagl
Contributor Author

javagl commented Nov 17, 2021

I have integrated the updates from the feedback, except for the aforementioned difficulty of phrasing the joint influences. I hope that saying

joints [0,1,0,0] means that the vertex MAY be influenced by joint 0 and 1

is clear enough. The goal is to leave open which joints actually WILL influence the vertices, because that depends on the 'weights', which are explained directly below that.

@scurest

scurest commented Nov 17, 2021

Is this wrong, though?

It's certainly not true in general.

@javagl
Contributor Author

javagl commented Nov 18, 2021

@scurest

(About ... "This means that each matrix is the inverse of the global transform of the respective joint, in its initial configuration.")

It's certainly not true in general.

Considering that the joint matrix is computed as

jointMatrix(j) =
  globalTransformOfJointNode(j) *
  inverseBindMatrixForJoint(j);

I'd assume that the IBM has to be the inverse of the global transform of the joint node in its initial state.

But now I have to wonder: Under which conditions should this computation of the joint matrix not yield the identity matrix for an untransformed skeleton? (And what should that result be or mean?)

@scurest

scurest commented Nov 18, 2021

  • When the mesh is not stored in global space. E.g. if the mesh is in skeleton-root space, the inverse binds would be inverse(joint-to-skeleton-root).
  • When the "initial configuration" does not match the bind pose.
  • When arbitrary transforms have been baked into the inverse binds, like rotations to do an axis conversion (CesiumMan does this I think), or scalings like the "dequantization" transforms for KHR_mesh_quantization, etc.
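The first case can be illustrated with a small Python sketch that uses translations only, so that the inverses can be written down by hand. The placement of the skeleton root at (5, 0, 0) and all names are hypothetical. The joint matrix is computed as in the formula quoted earlier, as the global transform of the joint node times the inverse bind matrix.

```python
def identity():
    return [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]

def translation(x, y, z):
    m = identity()
    m[0][3], m[1][3], m[2][3] = x, y, z
    return m

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)] for r in range(4)]

def joint_matrix(joint_global, inverse_bind):
    # jointMatrix(j) = globalTransformOfJointNode(j) * inverseBindMatrixForJoint(j)
    return matmul(joint_global, inverse_bind)

skeleton_root_global = translation(5.0, 0.0, 0.0)  # hypothetical skeleton root placement
joint_local = translation(0.0, 1.0, 0.0)           # joint relative to the skeleton root
joint_global = matmul(skeleton_root_global, joint_local)

# Mesh stored in global space: the IBM is the inverse of the joint's global transform.
ibm_global_space = translation(-5.0, -1.0, 0.0)
# Mesh stored in skeleton-root space: the IBM is inverse(joint-to-skeleton-root).
ibm_root_space = translation(0.0, -1.0, 0.0)

bind_pose_is_identity = joint_matrix(joint_global, ibm_global_space) == identity()
root_space_result = joint_matrix(joint_global, ibm_root_space)

print(bind_pose_is_identity)                      # True
print(root_space_result == skeleton_root_global)  # True
```

In the second case the joint matrix of the untransformed skeleton is not the identity. It is the transform that carries the mesh from skeleton-root space into global space, so the inverse bind matrix is not the inverse of the joint's global transform.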

@javagl
Contributor Author

javagl commented Nov 18, 2021

OK, if I understood this correctly, then this refers exactly to the point that is currently mentioned via

Note: Vertex skinning in other contexts often involves a matrix that is called "Bind Shape Matrix". This matrix is supposed to transform the geometry of the skinned mesh into the coordinate space of the joints. In glTF, this matrix is omitted, and it is assumed that this transform is either premultiplied with the mesh data, or postmultiplied to the inverse bind matrices.

with the emphasized part being the case where the IBM is exactly not the inverse of the global joint transform. That sounds reasonable. (Sorry, I'm a bit slow sometimes, but want to be sure to get it right this time...). I'll update the part that caused this discussion accordingly (it might just consist of removing that statement, but I'll re-read it in context).

@emackey
Member

emackey commented Nov 18, 2021

the case where the IBM is exactly not the inverse of the global joint transform

Sounds like this topic is resolved, but I'll offer another point of view more on the content-creation side of things. When a mesh & a skeleton are first created, the artist might create them one at a time, and then bind them together. Bone influence can be initialized by size and proximity to vertices, but you don't want a bone in one leg claiming influence over vertices in an adjacent leg. So the mesh at that point often has its various limbs and appendages spread apart and/or axis-aligned, such that newly-crafted bones won't be granted undue influence over unrelated appendages on the creature. This is sometimes called the "bind pose," and it can look quite unnatural, because the goal of this pose is to control bone influence, not final appearance. The default pose of the creature is intended to look more realistic, and may show the legs closer together, or arms in a more natural arrangement, not spread wide or axis-aligned like the bind pose was.

So, the default (no animations) pose of a glTF model can potentially be quite different from the un-inverted bind matrices (or inverted IBMs), due to authoring techniques such as this.

@javagl
Contributor Author

javagl commented Nov 18, 2021

Thanks, I wasn't aware of this (although one occasionally sees characters posing like the Vitruvian Man, for example).

I should probably familiarize myself more with the content creation side. Until now, everything that I know about vertex skinning is derived from reading specifications and trying to implement it for glTF (and, to some extent, condensing it into the overview and the tutorial).

Hand-crafting something like the SimpleSkin example seemed sensible and helpful on a certain level, and helped a lot for gaining a certain understanding. But I have no idea how a real skinned mesh is created e.g. in Blender, and that might be a perspective that could allow describing some things (e.g. in the tutorial) in a form that is easier to digest, and might help answering some of the questions that I now had to ask here in this issue. (So thanks again everybody for your patience).

Co-authored-by: Ed Mackey <elm19087@gmail.com>
@emackey
Member

emackey commented Nov 19, 2021

Is this still a draft? It looks in pretty good shape now.

@javagl javagl marked this pull request as ready for review November 19, 2021 18:42
@javagl
Contributor Author

javagl commented Nov 19, 2021

@emackey I had added your wording suggestion, and just updated the diagrams in KhronosGroup/glTF-Sample-Models#335 , so I think these could both be merged now.

Member

@emackey emackey left a comment

Looks good to me. I didn't mean to rush this, I was just eager to put a green checkmark on it. Please merge whenever you're comfortable with the result here, thanks!

Successfully merging this pull request may close these issues.

  • Cannot replicate Skin glTF-Tutorial data contents
  • SimpleSkin sample model doesn't match tutorial
3 participants