Optimize performance and memory usage in dynamics computations #280

flferretti · 2024-11-06T15:56:01Z

Enhance efficiency by normalizing quaternions, reducing memory footprint, and speeding up various dynamics computations. Replace jnp.float32 with float for consistency. Avoid unnecessary calls to .names_to_idxs where possible.

Note

See profiler traces at https://gist.github.com/flferretti/a20c7cd171c25dd784eef361246bd8cd

Tip

js.model.generalized_free_floating_jacobian_derivative used jax.vmap over js.link.jacobian_derivative, but a lot of quantities were unnecessarily computed multiple times. For this reason, I moved the major computation to js.model and left only the link extraction part in js.link. Some adjustments regarding the output velocity representation were necessary in order to avoid to call js.link.velocity inside the vmapped function _compute_row

Function	New Compile Time	Original Compile Time	New Run Time	Original Run Time
`js.model.generalized_free_floating_jacobian_derivative`	643 ms	748 ms	479 μs	6.07 ms
`js.model.free_floating_mass_matrix`	423 ms	456 ms	470 μs	502 μs
`js.model.forward_kinematics`	246 ms	255 ms	406 μs	445 μs
`js.model.generalized_free_floating_jacobian`	311 ms	368 ms	455 μs	473 μs
`js.references.JaxSimModelReferences.build`	98.3 ms	139 ms	558 μs	498 μs
`js.contact.collidable_point_kinematics`	326 ms	332 ms	420 μs	432 μs
`data.base_orientation`	14.3 ms	19.1 ms	10.2 μs	14.6 μs
`model.kin_dyn_parameters.joint_transforms_and_motion_subspaces`	236 ms	241 ms	53.3 μs	60.1 μs
`references.link_forces`	12.7 ms	38.2 ms	20.9 μs	22.1 μs

📚 Documentation preview 📚: https://jaxsim--280.org.readthedocs.build//280/

diegoferigo

Leaving a preliminary comment before going more in detail with a review.

Do you have any benchmark supporting the need to switch from names to indices? Using indices makes the code much more fragile and, if the indexing logic changes, there will be countless of places to update. The name-to-index helpers provide a centralized location to store this logic.

A very descriptive example of what I just said is the joint name to index map. ~~There is a tricky +- 1 index, that at first sight this PR lacks.~~ I'm not that supportive of gaining minor performance benefits if the price to pay is less maintainable and more error prone code. However, I'll be waiting for some benchmark, we can assess the situation more clearly with data.

Edit: I double checked the indexing returned by jaxsim.api.joint.names_to_idxs and it actually starts from 0. It's the low-level indexing used by RBDAs that start from 1.

flferretti · 2024-11-06T16:17:55Z

Leaving a preliminary comment before going more in detail with a review.

Do you have any benchmark supporting the need to switch from names to indices? Using indices makes the code much more fragile and, if the indexing logic changes, there will be countless of places to update. The name-to-index helpers provide a centralized location to store this logic.

A very descriptive example of what I just said is the joint name to index map. There is a tricky +- 1 index, that at first sight this PR lacks. I'm not that supportive of gaining minor performance benefits if the price to pay is less maintainable and more error prone code. However, I'll be waiting for some benchmark, we can assess the situation more clearly with data.

I haven't removed the usage of *_names, but simply avoided to call js.*.names_to_idxs when *_names is not passed to the call. In that case, we explicitly need all of the * (joint/link/frame), so we can just use model.number_of_*

flferretti · 2024-11-06T16:24:12Z

I'm doing the benchmarks using Perfetto Profiler (see https://jax.readthedocs.io/en/latest/profiling.html). I'll post them ASAP

diegoferigo · 2024-11-06T16:24:15Z

Leaving a preliminary comment before going more in detail with a review.
Do you have any benchmark supporting the need to switch from names to indices? Using indices makes the code much more fragile and, if the indexing logic changes, there will be countless of places to update. The name-to-index helpers provide a centralized location to store this logic.
A very descriptive example of what I just said is the joint name to index map. There is a tricky +- 1 index, that at first sight this PR lacks. I'm not that supportive of gaining minor performance benefits if the price to pay is less maintainable and more error prone code. However, I'll be waiting for some benchmark, we can assess the situation more clearly with data.

I haven't removed the usage of *_names, but simply avoided to call js.*.names_to_idxs when *_names is not passed to the call. In that case, we explicitly need all of the * (joint/link/frame), so we can just use model.number_of_*

Yes you introduced the knowledge that you can use arange in that case. But as said, this change spreads around this piece of knowledge that might change in the future (what if we start supporting free-floating joints to unify the base-related section of RBDAs and the corresponding base joint get a zero index?). If there are no relevant performance consideration in place, why making the code more hard to read and error prone for future refactorings? (this is also a general comment, not necessarily related to just this change).

flferretti · 2024-11-07T12:19:39Z

But as said, this change spreads around this piece of knowledge that might change in the future (what if we start supporting free-floating joints to unify the base-related section of RBDAs and the corresponding base joint get a zero index?).

I believe that the most intuitive behavior for the final user is that

js.link.names_to_indexes(joint_names=model.link_names(), **kwargs)

and

jnp.arange(model.number_of_links())

should return the same result. If there will be any need to change this logic in the future, we will propagate the change over the whole library

diegoferigo · 2024-11-07T14:11:13Z

But as said, this change spreads around this piece of knowledge that might change in the future (what if we start supporting free-floating joints to unify the base-related section of RBDAs and the corresponding base joint get a zero index?).

I believe that the most intuitive behavior for the final user is that
js.link.names_to_indexes(joint_names=model.link_names(), **kwargs)
and
jnp.arange(model.number_of_links())
should return the same result. If there will be any need to change this logic in the future, we will propagate the change over the whole library

My point is not on the user side since users are not supposed to know anything about indices (there are the names to index helpers for this reason). My point was about the developers and the logic within JaxSim functions.

I'm not strongly against this change, my whole point is that in case of changes on the indices used internally, there will be the need to change multiple locations. I checked the tests, and we already have checks that ensure that the current indexing does not get broken:

jaxsim/tests/test_api_joint.py

Lines 22 to 24 in bd59060

    
           assert js.joint.names_to_idxs( 
        
               model=model, joint_names=model.joint_names() 
        
           ) == pytest.approx(jnp.arange(model.number_of_joints()))

jaxsim/tests/test_api_link.py

Lines 27 to 29 in bd59060

    
           assert js.link.names_to_idxs( 
        
               model=model, link_names=model.link_names() 
        
           ) == pytest.approx(jnp.arange(model.number_of_links()))

Can you please add a comment to both files that links to our discussion in this PR in case that check starts to fail?

diegoferigo · 2024-11-07T14:26:21Z

src/jaxsim/math/adjoint.py

+        return jnp.vstack(
+            [
+                jnp.block([A_R_B.T, -A_R_B.T @ A_X_B[0:3, 3:6] @ A_R_B.T]),
+                jnp.block([jnp.zeros(shape=(3, 3)), A_R_B.T]),
+            ]
        )


I just realized that all these methods are correct only if they operate on transforms for 6D velocites, not those for 6D forces. Let's keep this in mind. We might also want to add an additional argument to turn on logic for 6D forces. I think that the term adjoint only corresponds to the tangent of the original space ($\text{SE}(3)$ in this case), and it might mean that the considered adjoint is the transform for 6D velocities. Not sure if the real terminology for the trasform for 6D forces is dual adjoint. cc @traversaro

Murray Lee Sastry seems just to write of "adjoint transpose". Indeed Park calls it "dual adjoint" in https://mae.eng.uci.edu/~bobrow/papers_files/DynamicsParkBobrow.pdf , but if you want to avoid confusion in the reader you can also use more user friendly terms like "velocity transform" and "force transform" or something similar.

Thanks for the pointers! If we want to adapt the logic to also support 6D force transforms, probably we can add an additional is_dual=False argument, with a proper docstring its meaning.

flferretti · 2024-11-07T15:24:10Z

Can you please add a comment to both files that links to our discussion in this PR in case that check starts to fail?

Done in 85ab07c. Thanks!

flferretti · 2024-11-07T16:38:41Z

Traces available at https://gist.github.com/flferretti/a20c7cd171c25dd784eef361246bd8cd.

In order to view them, please run:

git clone https://gist.github.com/a20c7cd171c25dd784eef361246bd8cd.git
cd a20c7cd171c25dd784eef361246bd8cd
python -m http.server

C.C. @xela-95 @diegoferigo

diegoferigo · 2024-11-08T11:49:03Z

Traces available at https://gist.github.com/flferretti/a20c7cd171c25dd784eef361246bd8cd.

In order to view them, please run:
git clone https://gist.github.com/a20c7cd171c25dd784eef361246bd8cd.git
cd a20c7cd171c25dd784eef361246bd8cd
python -m http.server
C.C. @xela-95 @diegoferigo

Can you please summarize them, if you have any specific improvement in mind?

flferretti · 2024-11-08T12:15:11Z

Can you please summarize them, if you have any specific improvement in mind?

I made the gist just for reference and in case you were interested into the specific improvements. For a general benchmark, you can take a look at the PR description

diegoferigo

Great, thanks @flferretti! The improvement of $\dot{J}$ was much needed.

src/jaxsim/api/data.py

Co-authored-by: Diego Ferigo <diego.ferigo@iit.it>

xela-95

Thanks @flferretti for the performance improvements!! I checked your changes and they look good to me!

flferretti added 5 commits November 6, 2024 15:59

Always normalize base quaternion

b1b514e

Reduce memory footprint in _compute_next_state

7bbb9e1

Speed up adjoint inverse computation

eb64a8f

Use float instead of jnp.float32

405444d

Speed up joint_transforms_and_motion_subspaces

9d2209a

flferretti self-assigned this Nov 6, 2024

diegoferigo reviewed Nov 6, 2024

View reviewed changes

flferretti force-pushed the feature/speedup branch 2 times, most recently from fc5f82d to 3a2fc47 Compare November 7, 2024 10:04

Avoid the usage of .names_to_idxs when possible

1bf8efa

flferretti force-pushed the feature/speedup branch 2 times, most recently from c5da3b4 to b4c9e2b Compare November 7, 2024 11:11

Remove extra call to js.contact.collidable_point_kinematics

94eb5f4

flferretti force-pushed the feature/speedup branch from b4c9e2b to bab95c1 Compare November 7, 2024 12:14

Speedup js.model.generalized_free_floating_jacobian_derivative

061f97c

flferretti force-pushed the feature/speedup branch from bab95c1 to d783c16 Compare November 7, 2024 12:15

flferretti requested a review from xela-95 November 7, 2024 12:15

flferretti marked this pull request as ready for review November 7, 2024 12:16

Speedup computation of Baumgarte term

e0894c1

flferretti force-pushed the feature/speedup branch from d783c16 to e0894c1 Compare November 7, 2024 13:24

Reduce memory footprint of _compute_ineq_constraint_matrix

a1c3c60

diegoferigo reviewed Nov 7, 2024

View reviewed changes

Add xref to discussion about indexing

85ab07c

diegoferigo approved these changes Nov 8, 2024

View reviewed changes

src/jaxsim/api/data.py Outdated Show resolved Hide resolved

Fix AD for base orientation norm

62df0c2

Co-authored-by: Diego Ferigo <diego.ferigo@iit.it>

flferretti force-pushed the feature/speedup branch from 03edbf2 to 62df0c2 Compare November 8, 2024 12:48

flferretti merged commit 98f1220 into main Nov 8, 2024
24 checks passed

flferretti deleted the feature/speedup branch November 8, 2024 15:57

xela-95 approved these changes Nov 8, 2024

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize performance and memory usage in dynamics computations #280

Optimize performance and memory usage in dynamics computations #280

flferretti commented Nov 6, 2024 •

edited

Loading

diegoferigo left a comment •

edited

Loading

flferretti commented Nov 6, 2024

flferretti commented Nov 6, 2024

diegoferigo commented Nov 6, 2024

flferretti commented Nov 7, 2024

diegoferigo commented Nov 7, 2024 •

edited

Loading

diegoferigo Nov 7, 2024 •

edited

Loading

traversaro Nov 7, 2024

diegoferigo Nov 8, 2024

flferretti commented Nov 7, 2024

flferretti commented Nov 7, 2024

diegoferigo commented Nov 8, 2024

flferretti commented Nov 8, 2024

diegoferigo left a comment

xela-95 left a comment

Optimize performance and memory usage in dynamics computations #280

Optimize performance and memory usage in dynamics computations #280

Conversation

flferretti commented Nov 6, 2024 • edited Loading

diegoferigo left a comment • edited Loading

Choose a reason for hiding this comment

flferretti commented Nov 6, 2024

flferretti commented Nov 6, 2024

diegoferigo commented Nov 6, 2024

flferretti commented Nov 7, 2024

diegoferigo commented Nov 7, 2024 • edited Loading

diegoferigo Nov 7, 2024 • edited Loading

Choose a reason for hiding this comment

traversaro Nov 7, 2024

Choose a reason for hiding this comment

diegoferigo Nov 8, 2024

Choose a reason for hiding this comment

flferretti commented Nov 7, 2024

flferretti commented Nov 7, 2024

diegoferigo commented Nov 8, 2024

flferretti commented Nov 8, 2024

diegoferigo left a comment

Choose a reason for hiding this comment

xela-95 left a comment

Choose a reason for hiding this comment

flferretti commented Nov 6, 2024 •

edited

Loading

diegoferigo left a comment •

edited

Loading

diegoferigo commented Nov 7, 2024 •

edited

Loading

diegoferigo Nov 7, 2024 •

edited

Loading