[Feature] broadcast pointwise ops for tensor/tensordict mixed inputs #1166
Merged
This was referenced Jan 7, 2025
facebook-github-bot added the CLA Signed label Jan 7, 2025
vmoens added a commit that referenced this pull request Jan 7, 2025
ghstack-source-id: 9d1446630ed08238e0a62a879222aeb6e161c425
Pull Request resolved: #1166

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 47.2280μs | 21.0463μs | 47.5143 KOps/s | 49.9637 KOps/s | |
test_plain_set_stack_nested | 67.2760μs | 21.1947μs | 47.1817 KOps/s | 48.9615 KOps/s | |
test_plain_set_nested_inplace | 0.1317ms | 22.8563μs | 43.7516 KOps/s | 45.4554 KOps/s | |
test_plain_set_stack_nested_inplace | 76.1920μs | 22.7722μs | 43.9132 KOps/s | 45.3432 KOps/s | |
test_items | 43.4620μs | 4.1095μs | 243.3412 KOps/s | 238.9702 KOps/s | |
test_items_nested | 0.6945ms | 0.4038ms | 2.4763 KOps/s | 2.4352 KOps/s | |
test_items_nested_locked | 0.6023ms | 0.4040ms | 2.4751 KOps/s | 2.4335 KOps/s | |
test_items_nested_leaf | 0.1478ms | 77.0581μs | 12.9772 KOps/s | 12.9764 KOps/s | |
test_items_stack_nested | 0.8413ms | 0.4067ms | 2.4586 KOps/s | 2.4203 KOps/s | |
test_items_stack_nested_leaf | 0.1510ms | 79.9144μs | 12.5134 KOps/s | 12.2530 KOps/s | |
test_items_stack_nested_locked | 0.5537ms | 0.4081ms | 2.4503 KOps/s | 2.4299 KOps/s | |
test_keys | 21.1300μs | 3.4856μs | 286.8914 KOps/s | 283.0684 KOps/s | |
test_keys_nested | 0.2721ms | 0.1654ms | 6.0466 KOps/s | 5.8665 KOps/s | |
test_keys_nested_locked | 1.8037ms | 0.1721ms | 5.8108 KOps/s | 5.7410 KOps/s | |
test_keys_nested_leaf | 0.2345ms | 0.1445ms | 6.9208 KOps/s | 6.8734 KOps/s | |
test_keys_stack_nested | 0.3263ms | 0.1654ms | 6.0447 KOps/s | 5.9918 KOps/s | |
test_keys_stack_nested_leaf | 0.2327ms | 0.1439ms | 6.9469 KOps/s | 6.8814 KOps/s | |
test_keys_stack_nested_locked | 0.2678ms | 0.1710ms | 5.8492 KOps/s | 5.8043 KOps/s | |
test_values | 5.9212μs | 1.0505μs | 951.9314 KOps/s | 951.2131 KOps/s | |
test_values_nested | 0.1330ms | 63.5118μs | 15.7451 KOps/s | 15.5950 KOps/s | |
test_values_nested_locked | 0.1243ms | 63.9641μs | 15.6338 KOps/s | 14.8120 KOps/s | |
test_values_nested_leaf | 0.1608ms | 73.0617μs | 13.6871 KOps/s | 13.8100 KOps/s | |
test_values_stack_nested | 0.1236ms | 65.5509μs | 15.2553 KOps/s | 15.7673 KOps/s | |
test_values_stack_nested_leaf | 0.1281ms | 74.7077μs | 13.3855 KOps/s | 13.6825 KOps/s | |
test_values_stack_nested_locked | 0.1162ms | 64.6818μs | 15.4603 KOps/s | 15.6584 KOps/s | |
test_membership | 19.6470μs | 0.8840μs | 1.1312 MOps/s | 1.1059 MOps/s | |
test_membership_nested | 29.4260μs | 2.9489μs | 339.1105 KOps/s | 344.5831 KOps/s | |
test_membership_nested_leaf | 30.4270μs | 2.9948μs | 333.9176 KOps/s | 334.1849 KOps/s | |
test_membership_stacked_nested | 25.0670μs | 2.9422μs | 339.8770 KOps/s | 333.4512 KOps/s | |
test_membership_stacked_nested_leaf | 33.9730μs | 2.9996μs | 333.3747 KOps/s | 346.7772 KOps/s | |
test_membership_nested_last | 28.4540μs | 4.4998μs | 222.2345 KOps/s | 227.7478 KOps/s | |
test_membership_nested_leaf_last | 29.2540μs | 4.5492μs | 219.8168 KOps/s | 224.3809 KOps/s | |
test_membership_stacked_nested_last | 42.3800μs | 4.4453μs | 224.9572 KOps/s | 227.8845 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.8200μs | 4.4572μs | 224.3566 KOps/s | 225.0590 KOps/s | |
test_nested_getleaf | 49.9030μs | 11.1796μs | 89.4486 KOps/s | 92.0904 KOps/s | |
test_nested_get | 50.8450μs | 10.5874μs | 94.4518 KOps/s | 95.5312 KOps/s | |
test_stacked_getleaf | 50.8550μs | 11.1554μs | 89.6430 KOps/s | 91.4396 KOps/s | |
test_stacked_get | 51.4070μs | 10.5717μs | 94.5923 KOps/s | 96.6658 KOps/s | |
test_nested_getitemleaf | 52.7690μs | 11.5951μs | 86.2434 KOps/s | 90.2691 KOps/s | |
test_nested_getitem | 51.9880μs | 10.7994μs | 92.5975 KOps/s | 94.2462 KOps/s | |
test_stacked_getitemleaf | 49.9940μs | 11.6280μs | 85.9997 KOps/s | 90.1395 KOps/s | |
test_stacked_getitem | 50.0430μs | 11.5727μs | 86.4106 KOps/s | 94.0139 KOps/s | |
test_lock_nested | 2.2180ms | 0.4625ms | 2.1623 KOps/s | 2.1261 KOps/s | |
test_lock_stack_nested | 0.5810ms | 0.4359ms | 2.2943 KOps/s | 2.2592 KOps/s | |
test_unlock_nested | 1.3990ms | 0.3817ms | 2.6198 KOps/s | 2.5807 KOps/s | |
test_unlock_stack_nested | 0.6761ms | 0.3529ms | 2.8336 KOps/s | 2.8326 KOps/s | |
test_flatten_speed | 0.2020ms | 0.1015ms | 9.8502 KOps/s | 9.7812 KOps/s | |
test_unflatten_speed | 0.9163ms | 0.5416ms | 1.8465 KOps/s | 1.8475 KOps/s | |
test_common_ops | 1.6291ms | 0.8038ms | 1.2440 KOps/s | 1.3051 KOps/s | |
test_creation | 0.1226ms | 2.7541μs | 363.0956 KOps/s | 395.8291 KOps/s | |
test_creation_empty | 43.8820μs | 12.3112μs | 81.2272 KOps/s | 100.1901 KOps/s | |
test_creation_nested_1 | 55.6140μs | 15.0348μs | 66.5122 KOps/s | 77.3335 KOps/s | |
test_creation_nested_2 | 64.6210μs | 19.9042μs | 50.2407 KOps/s | 56.8570 KOps/s | |
test_clone | 51.7370μs | 13.7297μs | 72.8350 KOps/s | 71.9938 KOps/s | |
test_getitem[int] | 1.1249ms | 12.8592μs | 77.7654 KOps/s | 76.5643 KOps/s | |
test_getitem[slice_int] | 0.1549ms | 25.5141μs | 39.1940 KOps/s | 40.0437 KOps/s | |
test_getitem[range] | 0.1874ms | 48.8755μs | 20.4602 KOps/s | 19.8601 KOps/s | |
test_getitem[tuple] | 0.1753ms | 20.4314μs | 48.9442 KOps/s | 48.0661 KOps/s | |
test_getitem[list] | 0.1991ms | 44.7157μs | 22.3635 KOps/s | 21.9816 KOps/s | |
test_setitem_dim[int] | 48.3110μs | 24.6790μs | 40.5202 KOps/s | 38.1125 KOps/s | |
test_setitem_dim[slice_int] | 97.0020μs | 50.7209μs | 19.7158 KOps/s | 19.3656 KOps/s | |
test_setitem_dim[range] | 0.1213ms | 73.2890μs | 13.6446 KOps/s | 13.3924 KOps/s | |
test_setitem_dim[tuple] | 81.8430μs | 39.7639μs | 25.1484 KOps/s | 24.3531 KOps/s | |
test_setitem | 65.9630μs | 20.4498μs | 48.9001 KOps/s | 50.0865 KOps/s | |
test_set | 0.3961ms | 20.2179μs | 49.4612 KOps/s | 52.1361 KOps/s | |
test_set_shared | 2.1581ms | 0.1712ms | 5.8402 KOps/s | 5.5960 KOps/s | |
test_update | 0.3711ms | 23.3425μs | 42.8403 KOps/s | 46.7775 KOps/s | |
test_update_nested | 0.3657ms | 33.7708μs | 29.6114 KOps/s | 31.3871 KOps/s | |
test_update__nested | 0.7612ms | 33.5288μs | 29.8251 KOps/s | 29.1374 KOps/s | |
test_set_nested | 0.3813ms | 22.5018μs | 44.4410 KOps/s | 46.0089 KOps/s | |
test_set_nested_new | 0.3791ms | 27.2467μs | 36.7017 KOps/s | 37.6014 KOps/s | |
test_select | 0.3982ms | 44.4787μs | 22.4827 KOps/s | 23.5550 KOps/s | |
test_select_nested | 0.1216ms | 64.1126μs | 15.5976 KOps/s | 15.4363 KOps/s | |
test_exclude_nested | 0.1629ms | 83.4754μs | 11.9796 KOps/s | 12.1454 KOps/s | |
test_empty[True] | 0.5229ms | 0.4136ms | 2.4177 KOps/s | 2.3526 KOps/s | |
test_empty[False] | 13.2147μs | 1.3903μs | 719.2908 KOps/s | 699.2205 KOps/s | |
test_unbind_speed | 0.4012ms | 0.2730ms | 3.6632 KOps/s | 3.5981 KOps/s | |
test_unbind_speed_stack0 | 0.3904ms | 0.2732ms | 3.6606 KOps/s | 3.6622 KOps/s | |
test_unbind_speed_stack1 | 0.1185s | 0.8520ms | 1.1737 KOps/s | 1.3027 KOps/s | |
test_split | 1.7352ms | 1.6074ms | 622.1405 Ops/s | 550.4681 Ops/s | |
test_chunk | 0.1213s | 2.0246ms | 493.9230 Ops/s | 554.8626 Ops/s | |
test_consolidate_njt[False-None] | 8.9948ms | 8.2553ms | 121.1341 Ops/s | 116.7531 Ops/s | |
test_creation[device0] | 4.5713ms | 94.5646μs | 10.5748 KOps/s | 10.7168 KOps/s | |
test_creation_from_tensor | 0.4410ms | 98.7902μs | 10.1225 KOps/s | 10.5411 KOps/s | |
test_add_one[memmap_tensor0] | 0.1199ms | 4.9682μs | 201.2802 KOps/s | 199.7852 KOps/s | |
test_contiguous[memmap_tensor0] | 8.1250μs | 0.5214μs | 1.9181 MOps/s | 1.9466 MOps/s | |
test_stack[memmap_tensor0] | 0.1307ms | 3.5664μs | 280.3985 KOps/s | 292.7340 KOps/s | |
test_memmaptd_index | 1.1395ms | 0.2352ms | 4.2508 KOps/s | 4.1335 KOps/s | |
test_memmaptd_index_astensor | 0.8136ms | 0.3250ms | 3.0772 KOps/s | 3.0032 KOps/s | |
test_memmaptd_index_op | 1.0806ms | 0.6049ms | 1.6531 KOps/s | 1.7263 KOps/s | |
test_serialize_model | 0.1258s | 0.1202s | 8.3182 Ops/s | 8.2356 Ops/s | |
test_serialize_model_pickle | 0.4339s | 0.3988s | 2.5075 Ops/s | 2.4975 Ops/s | |
test_serialize_weights | 0.1286s | 0.1215s | 8.2328 Ops/s | 7.1621 Ops/s | |
test_serialize_weights_returnearly | 0.2823s | 0.1832s | 5.4588 Ops/s | 6.4306 Ops/s | |
test_serialize_weights_pickle | 0.4832s | 0.4069s | 2.4574 Ops/s | 2.3460 Ops/s | |
test_serialize_weights_filesystem | 0.1595s | 0.1466s | 6.8209 Ops/s | 6.8354 Ops/s | |
test_serialize_model_filesystem | 0.1637s | 0.1541s | 6.4903 Ops/s | 6.5987 Ops/s | |
test_reshape_pytree | 85.7610μs | 26.5151μs | 37.7144 KOps/s | 37.4850 KOps/s | |
test_reshape_td | 78.3570μs | 33.0780μs | 30.2316 KOps/s | 29.7203 KOps/s | |
test_view_pytree | 0.1054ms | 26.9401μs | 37.1194 KOps/s | 36.9710 KOps/s | |
test_view_td | 85.4600μs | 37.8907μs | 26.3917 KOps/s | 25.1669 KOps/s | |
test_unbind_pytree | 65.0620μs | 29.7801μs | 33.5795 KOps/s | 32.8652 KOps/s | |
test_unbind_td | 0.3509ms | 40.6281μs | 24.6135 KOps/s | 24.6339 KOps/s | |
test_split_pytree | 72.5460μs | 29.6169μs | 33.7645 KOps/s | 33.5474 KOps/s | |
test_split_td | 0.6560ms | 45.7118μs | 21.8762 KOps/s | 22.1434 KOps/s | |
test_add_pytree | 83.4560μs | 35.3287μs | 28.3056 KOps/s | 27.9565 KOps/s | |
test_add_td | 0.1369ms | 61.0805μs | 16.3718 KOps/s | 18.4130 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2119ms | 64.1445μs | 15.5898 KOps/s | 15.8017 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.5456ms | 0.1786ms | 5.6006 KOps/s | 5.7980 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1066ms | 45.4638μs | 21.9955 KOps/s | 21.2423 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2624ms | 0.1184ms | 8.4488 KOps/s | 8.3823 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 73.3470μs | 25.9896μs | 38.4770 KOps/s | 36.8485 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1205ms | 59.2236μs | 16.8852 KOps/s | 16.6983 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3556ms | 78.3428μs | 12.7644 KOps/s | 12.5487 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1415ms | 68.0511μs | 14.6948 KOps/s | 14.4212 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1923ms | 0.1054ms | 9.4845 KOps/s | 9.4167 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4486ms | 0.2215ms | 4.5137 KOps/s | 4.5650 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1344ms | 44.6744μs | 22.3842 KOps/s | 21.6216 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4815ms | 67.4723μs | 14.8209 KOps/s | 15.3332 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1842ms | 0.1023ms | 9.7799 KOps/s | 9.7106 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4375ms | 0.2011ms | 4.9725 KOps/s | 4.9811 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4689ms | 0.2406ms | 4.1555 KOps/s | 4.2379 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2002ms | 0.1042ms | 9.5924 KOps/s | 9.3541 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1526ms | 68.5874μs | 14.5799 KOps/s | 16.8855 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1168ms | 47.1648μs | 21.2022 KOps/s | 21.8986 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.2527ms | 0.1584ms | 6.3114 KOps/s | 6.3034 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1899ms | 0.1062ms | 9.4156 KOps/s | 9.6242 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 79.0000μs | 21.0505μs | 47.5047 KOps/s | 46.0732 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1337ms | 66.2283μs | 15.0993 KOps/s | 15.0454 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1506ms | 83.3610μs | 11.9960 KOps/s | 12.0967 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1305ms | 69.0873μs | 14.4744 KOps/s | 14.3643 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3097ms | 0.2054ms | 4.8684 KOps/s | 4.8120 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.6271ms | 1.3423ms | 744.9959 Ops/s | 756.9032 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3349ms | 0.2021ms | 4.9489 KOps/s | 4.8556 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 0.9792ms | 0.7759ms | 1.2888 KOps/s | 1.2958 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5410ms | 0.4499ms | 2.2227 KOps/s | 2.2030 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.5396ms | 2.7415ms | 364.7630 Ops/s | 378.2797 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 90.1490μs | 34.5361μs | 28.9552 KOps/s | 26.6725 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6850ms | 32.9796μs | 30.3218 KOps/s | 28.9533 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1088ms | 28.6509μs | 34.9030 KOps/s | 32.6979 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1045ms | 23.4919μs | 42.5679 KOps/s | 40.3790 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 78.9380μs | 29.3240μs | 34.1017 KOps/s | 31.9058 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1036ms | 23.4003μs | 42.7345 KOps/s | 42.1286 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1324ms | 50.2069μs | 19.9176 KOps/s | 18.7380 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5812ms | 20.7127μs | 48.2796 KOps/s | 49.3451 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1006ms | 42.4556μs | 23.5540 KOps/s | 22.0801 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 99.1060μs | 18.6670μs | 53.5705 KOps/s | 51.9637 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1219ms | 43.4751μs | 23.0017 KOps/s | 21.7617 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 98.8370μs | 18.4206μs | 54.2872 KOps/s | 52.3899 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1587ms | 51.3842μs | 19.4612 KOps/s | 18.2631 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 1.0636ms | 20.0797μs | 49.8016 KOps/s | 50.3002 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 98.0640μs | 43.9402μs | 22.7582 KOps/s | 21.4946 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 57.1570μs | 18.5616μs | 53.8747 KOps/s | 52.8079 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1225ms | 44.1620μs | 22.6439 KOps/s | 21.5927 KOps/s | |
test_compile_indexing[int-pytree-eager] | 57.2570μs | 18.5050μs | 54.0395 KOps/s | 52.9247 KOps/s | |
test_mod_add[eager] | 0.1268ms | 33.7419μs | 29.6367 KOps/s | 29.4539 KOps/s | |
test_mod_add[compile] | 0.1420ms | 47.1806μs | 21.1951 KOps/s | 19.9226 KOps/s | |
test_mod_add[compile-overhead] | 0.1028ms | 46.4634μs | 21.5223 KOps/s | 19.7515 KOps/s | |
test_mod_wrap[eager] | 0.3644ms | 0.2279ms | 4.3882 KOps/s | 4.3363 KOps/s | |
test_mod_wrap[compile] | 0.3012ms | 0.2090ms | 4.7858 KOps/s | 4.8101 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3941ms | 0.2088ms | 4.7893 KOps/s | 4.8451 KOps/s | |
test_mod_wrap_and_backward[eager] | 18.1627ms | 12.3833ms | 80.7538 Ops/s | 82.3454 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.0254ms | 12.8747ms | 77.6716 Ops/s | 85.5255 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 13.5814ms | 12.0057ms | 83.2935 Ops/s | 85.5481 Ops/s | |
test_seq_add[eager] | 0.2254ms | 0.1141ms | 8.7621 KOps/s | 8.4732 KOps/s | |
test_seq_add[compile] | 0.1710ms | 62.0005μs | 16.1289 KOps/s | 15.5482 KOps/s | |
test_seq_add[compile-overhead] | 0.1315ms | 60.6624μs | 16.4847 KOps/s | 16.0373 KOps/s | |
test_seq_wrap[eager] | 0.7984ms | 0.4491ms | 2.2269 KOps/s | 2.2234 KOps/s | |
test_seq_wrap[compile] | 0.4060ms | 0.2350ms | 4.2554 KOps/s | 4.2670 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4449ms | 0.2304ms | 4.3399 KOps/s | 4.3555 KOps/s | |
test_func_call_runtime[False-eager] | 0.9938ms | 0.5540ms | 1.8050 KOps/s | 1.8035 KOps/s | |
test_func_call_runtime[False-compile] | 0.5507ms | 0.4370ms | 2.2882 KOps/s | 2.3221 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8455ms | 0.4435ms | 2.2549 KOps/s | 2.3176 KOps/s | |
test_func_call_runtime[True-eager] | 1.3102ms | 0.7728ms | 1.2940 KOps/s | 1.2896 KOps/s | |
test_func_call_runtime[True-compile] | 0.6567ms | 0.4822ms | 2.0740 KOps/s | 2.1007 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.6614ms | 0.4808ms | 2.0800 KOps/s | 2.1202 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.0068ms | 0.5528ms | 1.8088 KOps/s | 1.8305 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.6197ms | 0.4411ms | 2.2672 KOps/s | 2.3194 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.6284ms | 0.4389ms | 2.2782 KOps/s | 2.3288 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4757ms | 0.9231ms | 1.0833 KOps/s | 1.0941 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.6844ms | 0.5041ms | 1.9838 KOps/s | 2.0042 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6691ms | 0.5084ms | 1.9671 KOps/s | 1.9935 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.8381ms | 1.9645ms | 509.0265 Ops/s | 513.1605 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7546ms | 0.5250ms | 1.9049 KOps/s | 1.8831 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 1.0866ms | 0.5432ms | 1.8408 KOps/s | 1.8869 KOps/s | |
test_distributed | 0.4017ms | 0.1259ms | 7.9428 KOps/s | 7.6339 KOps/s | |
test_tdmodule | 44.3030μs | 26.4544μs | 37.8009 KOps/s | 39.4531 KOps/s | |
test_tdmodule_dispatch | 78.1160μs | 48.6631μs | 20.5495 KOps/s | 21.7760 KOps/s | |
test_tdseq | 60.1530μs | 29.5336μs | 33.8597 KOps/s | 34.6601 KOps/s | |
test_tdseq_dispatch | 88.7960μs | 54.6123μs | 18.3109 KOps/s | 19.0228 KOps/s | |
test_instantiation_functorch | 2.3641ms | 1.5359ms | 651.0947 Ops/s | 634.2536 Ops/s | |
test_exec_functorch | 0.3368ms | 0.1789ms | 5.5887 KOps/s | 5.3855 KOps/s | |
test_exec_functional_call | 0.3021ms | 0.1756ms | 5.6933 KOps/s | 5.8182 KOps/s | |
test_exec_td_decorator | 0.5451ms | 0.2380ms | 4.2017 KOps/s | 4.3249 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8908ms | 0.6732ms | 1.4853 KOps/s | 1.5062 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1819ms | 0.6745ms | 1.4826 KOps/s | 1.5067 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8562ms | 0.5403ms | 1.8507 KOps/s | 1.8772 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8561ms | 0.5416ms | 1.8463 KOps/s | 1.8719 KOps/s | |
test_to_module_speed[True] | 2.6468ms | 1.3638ms | 733.2409 Ops/s | 738.8994 Ops/s | |
test_to_module_speed[False] | 2.0798ms | 1.3304ms | 751.6407 Ops/s | 725.9635 Ops/s | |
test_tc_init | 99.9270μs | 45.6227μs | 21.9189 KOps/s | 22.8320 KOps/s | |
test_tc_init_nested | 0.1944ms | 94.0133μs | 10.6368 KOps/s | 11.3242 KOps/s | |
test_tc_first_layer_tensor | 27.1710μs | 1.5548μs | 643.1902 KOps/s | 652.9202 KOps/s | |
test_tc_first_layer_nontensor | 27.6210μs | 4.7010μs | 212.7194 KOps/s | 212.2733 KOps/s | |
test_tc_second_layer_tensor | 24.9870μs | 2.8723μs | 348.1473 KOps/s | 350.8432 KOps/s | |
test_tc_second_layer_nontensor | 52.4080μs | 6.0868μs | 164.2907 KOps/s | 165.7808 KOps/s | |
test_unbind | 0.2542s | 14.2526ms | 70.1629 Ops/s | 62.7140 Ops/s | |
test_full_like | 12.9306ms | 9.7799ms | 102.2505 Ops/s | 110.0829 Ops/s | |
test_zeros_like | 4.2041ms | 3.4926ms | 286.3164 Ops/s | 290.7328 Ops/s | |
test_ones_like | 5.1076ms | 4.2276ms | 236.5388 Ops/s | 149.7090 Ops/s | |
test_clone | 7.2973ms | 6.4086ms | 156.0411 Ops/s | 111.2004 Ops/s | |
test_squeeze | 0.1000ms | 12.1392μs | 82.3780 KOps/s | 81.8939 KOps/s | |
test_unsqueeze | 0.1729ms | 92.7443μs | 10.7823 KOps/s | 10.8544 KOps/s | |
test_split | 0.3985ms | 0.1929ms | 5.1845 KOps/s | 5.1101 KOps/s | |
test_permute | 0.4027ms | 0.2094ms | 4.7760 KOps/s | 4.7332 KOps/s | |
test_stack | 34.9362ms | 28.6046ms | 34.9594 Ops/s | 34.8049 Ops/s | |
test_cat | 32.4270ms | 28.2399ms | 35.4109 Ops/s | 35.0316 Ops/s |
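
The Change column is empty in these tables. As a rough aid for reading them, the sketch below computes a relative difference between the Ops and Ops on Repo HEAD columns for a row. The helper names `parse_ops` and `relative_change` are hypothetical. Treating the column as the percent change of Ops relative to Ops on Repo HEAD is an assumption, not something the benchmark report states.

```python
import re

# Multipliers for the throughput units that appear in the tables.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '47.5143 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD).

    Assumption: columns are Name | Max | Mean | Ops | Ops on Repo HEAD | Change.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops, head = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, 100.0 * (ops - head) / head


# Two rows copied from the table above.
rows = [
    "test_plain_set_nested | 47.2280μs | 21.0463μs | 47.5143 KOps/s | 49.9637 KOps/s | |",
    "test_split | 1.7352ms | 1.6074ms | 622.1405 Ops/s | 550.4681 Ops/s | |",
]

for row in rows:
    name, pct = relative_change(row)
    print(f"{name}: {pct:+.2f}%")
```

A positive value means the Ops column is higher than the Repo HEAD column for that row. Differences of a few percent in microbenchmarks like these can come from run-to-run noise, so single rows are best read with caution.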

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 28.4610μs | 11.3055μs | 88.4529 KOps/s | 78.5192 KOps/s | |
test_plain_set_stack_nested | 36.6100μs | 11.4933μs | 87.0075 KOps/s | 77.1846 KOps/s | |
test_plain_set_nested_inplace | 44.7900μs | 12.3651μs | 80.8725 KOps/s | 72.2716 KOps/s | |
test_plain_set_stack_nested_inplace | 38.4110μs | 12.3154μs | 81.1992 KOps/s | 72.0625 KOps/s | |
test_items | 26.0600μs | 2.8635μs | 349.2191 KOps/s | 343.0273 KOps/s | |
test_items_nested | 0.4022ms | 0.3577ms | 2.7955 KOps/s | 2.7840 KOps/s | |
test_items_nested_locked | 0.4225ms | 0.3540ms | 2.8250 KOps/s | 2.7651 KOps/s | |
test_items_nested_leaf | 79.5010μs | 57.9045μs | 17.2698 KOps/s | 17.3530 KOps/s | |
test_items_stack_nested | 0.3974ms | 0.3613ms | 2.7679 KOps/s | 2.7580 KOps/s | |
test_items_stack_nested_leaf | 93.1020μs | 59.9991μs | 16.6669 KOps/s | 16.7316 KOps/s | |
test_items_stack_nested_locked | 0.4031ms | 0.3590ms | 2.7859 KOps/s | 2.7419 KOps/s | |
test_keys | 24.6300μs | 3.4399μs | 290.7019 KOps/s | 291.5825 KOps/s | |
test_keys_nested | 0.1119ms | 80.7549μs | 12.3831 KOps/s | 12.0827 KOps/s | |
test_keys_nested_locked | 0.8195ms | 86.1642μs | 11.6057 KOps/s | 11.3455 KOps/s | |
test_keys_nested_leaf | 2.7085ms | 72.1271μs | 13.8644 KOps/s | 13.7198 KOps/s | |
test_keys_stack_nested | 0.1159ms | 82.2833μs | 12.1531 KOps/s | 12.0061 KOps/s | |
test_keys_stack_nested_leaf | 0.1183ms | 72.8543μs | 13.7260 KOps/s | 13.4575 KOps/s | |
test_keys_stack_nested_locked | 0.1168ms | 88.1604μs | 11.3430 KOps/s | 11.2839 KOps/s | |
test_values | 3.7490μs | 0.8430μs | 1.1863 MOps/s | 1.1781 MOps/s | |
test_values_nested | 60.2110μs | 34.5108μs | 28.9764 KOps/s | 29.0648 KOps/s | |
test_values_nested_locked | 61.9110μs | 35.9280μs | 27.8334 KOps/s | 27.6388 KOps/s | |
test_values_nested_leaf | 74.1310μs | 39.1078μs | 25.5704 KOps/s | 25.3626 KOps/s | |
test_values_stack_nested | 61.7410μs | 34.9548μs | 28.6084 KOps/s | 28.5087 KOps/s | |
test_values_stack_nested_leaf | 73.3610μs | 39.6028μs | 25.2507 KOps/s | 25.5833 KOps/s | |
test_values_stack_nested_locked | 67.7810μs | 36.5744μs | 27.3416 KOps/s | 27.5046 KOps/s | |
test_membership | 1.7230μs | 0.5019μs | 1.9922 MOps/s | 1.9639 MOps/s | |
test_membership_nested | 31.4600μs | 2.0136μs | 496.6285 KOps/s | 511.0686 KOps/s | |
test_membership_nested_leaf | 15.4355μs | 1.9560μs | 511.2483 KOps/s | 513.5656 KOps/s | |
test_membership_stacked_nested | 27.0810μs | 2.0290μs | 492.8630 KOps/s | 478.5833 KOps/s | |
test_membership_stacked_nested_leaf | 23.7510μs | 2.0432μs | 489.4320 KOps/s | 485.8433 KOps/s | |
test_membership_nested_last | 27.2800μs | 3.0604μs | 326.7584 KOps/s | 329.4252 KOps/s | |
test_membership_nested_leaf_last | 27.6500μs | 3.0155μs | 331.6180 KOps/s | 324.5486 KOps/s | |
test_membership_stacked_nested_last | 33.1410μs | 3.0805μs | 324.6221 KOps/s | 147.4585 KOps/s | |
test_membership_stacked_nested_leaf_last | 36.8010μs | 3.0594μs | 326.8654 KOps/s | 146.3000 KOps/s | |
test_nested_getleaf | 37.2310μs | 6.0682μs | 164.7926 KOps/s | 165.6221 KOps/s | |
test_nested_get | 29.9000μs | 5.8083μs | 172.1679 KOps/s | 174.0449 KOps/s | |
test_stacked_getleaf | 36.9410μs | 6.1066μs | 163.7585 KOps/s | 165.2708 KOps/s | |
test_stacked_get | 35.8410μs | 5.8069μs | 172.2083 KOps/s | 174.0533 KOps/s | |
test_nested_getitemleaf | 37.2410μs | 6.1717μs | 162.0306 KOps/s | 159.5021 KOps/s | |
test_nested_getitem | 38.3910μs | 5.9658μs | 167.6211 KOps/s | 170.6297 KOps/s | |
test_stacked_getitemleaf | 36.0810μs | 6.2178μs | 160.8296 KOps/s | 161.5666 KOps/s | |
test_stacked_getitem | 35.8510μs | 5.9115μs | 169.1625 KOps/s | 169.4064 KOps/s | |
test_lock_nested | 2.4897ms | 0.3669ms | 2.7254 KOps/s | 2.7169 KOps/s | |
test_lock_stack_nested | 0.3852ms | 0.3396ms | 2.9444 KOps/s | 2.9781 KOps/s | |
test_unlock_nested | 0.6449ms | 0.3070ms | 3.2574 KOps/s | 3.2733 KOps/s | |
test_unlock_stack_nested | 0.3193ms | 0.2789ms | 3.5849 KOps/s | 3.6409 KOps/s | |
test_flatten_speed | 0.1185ms | 74.7354μs | 13.3805 KOps/s | 13.3075 KOps/s | |
test_unflatten_speed | 0.3592ms | 0.3139ms | 3.1861 KOps/s | 3.1495 KOps/s | |
test_common_ops | 1.5438ms | 0.5581ms | 1.7917 KOps/s | 1.6011 KOps/s | |
test_creation | 0.1049ms | 1.7155μs | 582.9296 KOps/s | 585.1059 KOps/s | |
test_creation_empty | 35.8410μs | 6.3273μs | 158.0456 KOps/s | 105.7940 KOps/s | |
test_creation_nested_1 | 31.7310μs | 8.0371μs | 124.4231 KOps/s | 89.0497 KOps/s | |
test_creation_nested_2 | 45.4110μs | 10.6900μs | 93.5454 KOps/s | 72.0865 KOps/s | |
test_clone | 0.1339ms | 10.3465μs | 96.6508 KOps/s | 96.3950 KOps/s | |
test_getitem[int] | 1.8925ms | 10.4462μs | 95.7285 KOps/s | 94.5667 KOps/s | |
test_getitem[slice_int] | 0.1132ms | 20.1614μs | 49.5998 KOps/s | 48.8534 KOps/s | |
test_getitem[range] | 0.1273ms | 35.7358μs | 27.9832 KOps/s | 28.5043 KOps/s | |
test_getitem[tuple] | 0.1095ms | 17.8346μs | 56.0708 KOps/s | 56.8849 KOps/s | |
test_getitem[list] | 0.1258ms | 31.0987μs | 32.1557 KOps/s | 31.9050 KOps/s | |
test_setitem_dim[int] | 39.5410μs | 19.4747μs | 51.3488 KOps/s | 58.3895 KOps/s | |
test_setitem_dim[slice_int] | 60.7110μs | 39.4593μs | 25.3425 KOps/s | 27.7685 KOps/s | |
test_setitem_dim[range] | 83.3420μs | 53.1100μs | 18.8288 KOps/s | 19.9883 KOps/s | |
test_setitem_dim[tuple] | 53.6610μs | 32.3456μs | 30.9161 KOps/s | 33.4253 KOps/s | |
test_setitem | 0.1292ms | 14.5082μs | 68.9267 KOps/s | 67.6088 KOps/s | |
test_set | 0.1264ms | 13.1353μs | 76.1305 KOps/s | 67.5619 KOps/s | |
test_set_shared | 1.5529ms | 0.1502ms | 6.6581 KOps/s | 6.6609 KOps/s | |
test_update | 0.5479ms | 14.9460μs | 66.9075 KOps/s | 54.1541 KOps/s | |
test_update_nested | 0.1214ms | 20.1452μs | 49.6396 KOps/s | 42.6881 KOps/s | |
test_update__nested | 0.5557ms | 24.2966μs | 41.1581 KOps/s | 41.5786 KOps/s | |
test_set_nested | 0.1217ms | 14.2145μs | 70.3508 KOps/s | 62.4676 KOps/s | |
test_set_nested_new | 0.1252ms | 16.4815μs | 60.6739 KOps/s | 54.4067 KOps/s | |
test_select | 0.1626ms | 27.7532μs | 36.0319 KOps/s | 33.1189 KOps/s | |
test_select_nested | 67.9810μs | 43.1336μs | 23.1838 KOps/s | 22.8462 KOps/s | |
test_exclude_nested | 92.5610μs | 60.9812μs | 16.3985 KOps/s | 16.2323 KOps/s | |
test_empty[True] | 0.3180ms | 0.2846ms | 3.5137 KOps/s | 3.4351 KOps/s | |
test_empty[False] | 3.9140μs | 0.8302μs | 1.2045 MOps/s | 1.2157 MOps/s | |
test_to | 84.2220μs | 55.1751μs | 18.1241 KOps/s | 17.9590 KOps/s | |
test_to_nonblocking | 94.8210μs | 49.2319μs | 20.3121 KOps/s | 21.3437 KOps/s | |
test_unbind_speed | 1.6725ms | 0.2327ms | 4.2968 KOps/s | 4.3701 KOps/s | |
test_unbind_speed_stack0 | 0.2897ms | 0.2350ms | 4.2548 KOps/s | 4.3059 KOps/s | |
test_unbind_speed_stack1 | 93.4147ms | 0.6653ms | 1.5031 KOps/s | 1.5238 KOps/s | |
test_split | 94.8030ms | 1.5818ms | 632.1982 Ops/s | 588.0183 Ops/s | |
test_chunk | 94.9267ms | 1.5839ms | 631.3413 Ops/s | 701.3201 Ops/s | |
test_consolidate[False-None] | 97.4774ms | 2.8930ms | 345.6674 Ops/s | 341.1321 Ops/s | |
test_consolidate[default-None] | 1.7385ms | 1.6287ms | 613.9828 Ops/s | 612.8844 Ops/s | |
test_consolidate[reduce-overhead-None] | 2.0543ms | 1.6567ms | 603.5997 Ops/s | 596.4995 Ops/s | |
test_consolidate_njt[False-None] | 6.6218ms | 6.1975ms | 161.3543 Ops/s | 161.9332 Ops/s | |
test_to[False-False-None] | 2.0846ms | 1.6707ms | 598.5464 Ops/s | 584.6755 Ops/s | |
test_to[True-False-None] | 1.6664ms | 1.2325ms | 811.3674 Ops/s | 811.7336 Ops/s | |
test_to[within-False-None] | 4.3362ms | 3.9379ms | 253.9437 Ops/s | 250.9985 Ops/s | |
test_to[True-default-None] | 5.4126ms | 5.0060ms | 199.7614 Ops/s | 201.4812 Ops/s | |
test_to_njt[False-False-None] | 7.1342ms | 6.7495ms | 148.1597 Ops/s | 149.1523 Ops/s | |
test_to_njt[True-False-None] | 5.6006ms | 5.1813ms | 193.0029 Ops/s | 193.7095 Ops/s | |
test_to_njt[within-False-None] | 11.8562ms | 11.4682ms | 87.1979 Ops/s | 87.9750 Ops/s | |
test_creation[device0] | 0.5437ms | 78.6331μs | 12.7173 KOps/s | 12.8023 KOps/s | |
test_creation_from_tensor | 0.6700ms | 81.7413μs | 12.2337 KOps/s | 12.2106 KOps/s | |
test_add_one[memmap_tensor0] | 0.3183ms | 6.4353μs | 155.3920 KOps/s | 156.6005 KOps/s | |
test_contiguous[memmap_tensor0] | 20.2948μs | 0.3952μs | 2.5306 MOps/s | 2.4889 MOps/s | |
test_stack[memmap_tensor0] | 21.9110μs | 4.1377μs | 241.6785 KOps/s | 241.0583 KOps/s | |
test_memmaptd_index | 1.6827ms | 0.2371ms | 4.2171 KOps/s | 4.1992 KOps/s | |
test_memmaptd_index_astensor | 0.5696ms | 0.2959ms | 3.3793 KOps/s | 3.3601 KOps/s | |
test_memmaptd_index_op | 0.9905ms | 0.5235ms | 1.9104 KOps/s | 1.7387 KOps/s | |
test_serialize_model | 0.1320s | 0.1310s | 7.6319 Ops/s | 7.6805 Ops/s | |
test_serialize_model_pickle | 1.3661s | 1.2149s | 0.8231 Ops/s | 0.8240 Ops/s | |
test_serialize_weights | 0.4169s | 0.1713s | 5.8365 Ops/s | 7.7004 Ops/s | |
test_serialize_weights_returnearly | 0.3353s | 53.5928ms | 18.6592 Ops/s | 11.2767 Ops/s | |
test_serialize_weights_pickle | 1.3779s | 1.2236s | 0.8172 Ops/s | 0.8376 Ops/s | |
test_reshape_pytree | 65.5710μs | 21.7255μs | 46.0288 KOps/s | 46.3895 KOps/s | |
test_reshape_td | 47.4510μs | 25.9695μs | 38.5067 KOps/s | 38.4272 KOps/s | |
test_view_pytree | 48.7810μs | 21.7494μs | 45.9784 KOps/s | 46.8612 KOps/s | |
test_view_td | 56.2010μs | 28.8519μs | 34.6598 KOps/s | 32.2523 KOps/s | |
test_unbind_pytree | 55.1910μs | 27.5688μs | 36.2729 KOps/s | 36.6336 KOps/s | |
test_unbind_td | 0.7599ms | 35.6447μs | 28.0547 KOps/s | 28.4816 KOps/s | |
test_split_pytree | 58.2510μs | 29.4295μs | 33.9795 KOps/s | 34.2073 KOps/s | |
test_split_td | 0.9588ms | 37.3500μs | 26.7737 KOps/s | 26.4918 KOps/s | |
test_add_pytree | 64.2210μs | 33.3374μs | 29.9963 KOps/s | 30.5967 KOps/s | |
test_add_td | 85.1010μs | 44.5908μs | 22.4261 KOps/s | 21.1410 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1689ms | 0.1216ms | 8.2228 KOps/s | 8.2794 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2184ms | 0.1287ms | 7.7723 KOps/s | 7.8883 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1390ms | 96.6842μs | 10.3430 KOps/s | 10.4600 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 2.2948ms | 0.1451ms | 6.8908 KOps/s | 6.9369 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 62.8410μs | 23.8847μs | 41.8678 KOps/s | 47.2592 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 52.5210μs | 28.7803μs | 34.7460 KOps/s | 34.5092 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3892ms | 63.3215μs | 15.7924 KOps/s | 15.6874 KOps/s | |
test_compile_copy_nested[pytree-eager] | 83.4810μs | 48.8205μs | 20.4832 KOps/s | 20.2041 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1829ms | 0.1414ms | 7.0742 KOps/s | 7.1401 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3118ms | 0.2142ms | 4.6692 KOps/s | 4.7505 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1493ms | 0.1001ms | 9.9919 KOps/s | 10.4791 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1151ms | 55.2482μs | 18.1001 KOps/s | 19.3240 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2735ms | 0.1342ms | 7.4500 KOps/s | 7.4865 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4979ms | 0.4605ms | 2.1716 KOps/s | 2.1576 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3710ms | 0.2578ms | 3.8784 KOps/s | 3.9053 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.1893ms | 0.1422ms | 7.0301 KOps/s | 7.1303 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1485ms | 68.3549μs | 14.6295 KOps/s | 15.9619 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1376ms | 98.9682μs | 10.1043 KOps/s | 10.3925 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4575ms | 0.3936ms | 2.5404 KOps/s | 2.5162 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1712ms | 0.1342ms | 7.4508 KOps/s | 7.5820 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 46.6710μs | 19.4782μs | 51.3394 KOps/s | 58.8694 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 58.8810μs | 31.0452μs | 32.2111 KOps/s | 32.3612 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2198ms | 70.6344μs | 14.1574 KOps/s | 14.4122 KOps/s | |
test_compile_copy_flat[pytree-eager] | 81.3110μs | 51.9152μs | 19.2622 KOps/s | 19.5329 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.5901ms | 0.3833ms | 2.6093 KOps/s | 2.2918 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.6451ms | 2.5521ms | 391.8343 Ops/s | 386.9012 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5714ms | 0.4311ms | 2.3197 KOps/s | 2.3203 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 2.7349ms | 2.5585ms | 390.8599 Ops/s | 384.8160 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1687ms | 0.1142ms | 8.7552 KOps/s | 9.0499 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5791ms | 78.3966μs | 12.7557 KOps/s | 12.8341 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2136ms | 0.1071ms | 9.3342 KOps/s | 9.7249 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.5082ms | 68.7782μs | 14.5395 KOps/s | 14.8178 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.5607ms | 0.1101ms | 9.0855 KOps/s | 9.6647 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.4706ms | 68.3683μs | 14.6267 KOps/s | 14.1395 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.5180ms | 0.1030ms | 9.7113 KOps/s | 10.2092 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1428ms | 15.8363μs | 63.1460 KOps/s | 60.7160 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.3531ms | 95.0011μs | 10.5262 KOps/s | 10.7237 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 41.3410μs | 15.4608μs | 64.6797 KOps/s | 63.2509 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1396ms | 98.6098μs | 10.1410 KOps/s | 10.5923 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 50.0710μs | 15.4365μs | 64.7816 KOps/s | 64.2667 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1401ms | 98.6363μs | 10.1383 KOps/s | 10.0931 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5705ms | 16.3988μs | 60.9800 KOps/s | 58.5174 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1389ms | 95.3209μs | 10.4909 KOps/s | 10.5853 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 51.4400μs | 15.2819μs | 65.4367 KOps/s | 64.5751 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1479ms | 95.0216μs | 10.5239 KOps/s | 10.6002 KOps/s | |
test_compile_indexing[int-pytree-eager] | 44.8610μs | 15.3757μs | 65.0378 KOps/s | 64.4444 KOps/s | |
test_mod_add[eager] | 0.1684ms | 35.1677μs | 28.4352 KOps/s | 26.3182 KOps/s | |
test_mod_add[compile] | 0.2132ms | 81.6481μs | 12.2477 KOps/s | 12.8050 KOps/s | |
test_mod_add[compile-overhead] | 0.3185ms | 0.1697ms | 5.8910 KOps/s | 5.7977 KOps/s | |
test_mod_wrap[eager] | 0.3205ms | 0.2388ms | 4.1882 KOps/s | 4.0474 KOps/s | |
test_mod_wrap[compile] | 0.3483ms | 0.2795ms | 3.5779 KOps/s | 3.4559 KOps/s | |
test_mod_wrap[compile-overhead] | 7.1909ms | 3.6595ms | 273.2603 Ops/s | 274.5701 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6099ms | 1.3477ms | 742.0096 Ops/s | 724.4654 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.3452ms | 1.2255ms | 816.0079 Ops/s | 799.9640 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3437ms | 0.9103ms | 1.0986 KOps/s | 1.0770 KOps/s | |
test_seq_add[eager] | 0.2098ms | 0.1062ms | 9.4150 KOps/s | 8.8469 KOps/s | |
test_seq_add[compile] | 0.1836ms | 85.3461μs | 11.7170 KOps/s | 11.6339 KOps/s | |
test_seq_add[compile-overhead] | 0.2349ms | 0.1317ms | 7.5933 KOps/s | 7.9350 KOps/s | |
test_seq_wrap[eager] | 0.5209ms | 0.4112ms | 2.4319 KOps/s | 2.4195 KOps/s | |
test_seq_wrap[compile] | 0.4052ms | 0.2928ms | 3.4150 KOps/s | 3.4236 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3050ms | 0.2187ms | 4.5721 KOps/s | 4.5785 KOps/s | |
test_func_call_runtime[False-eager] | 0.7837ms | 0.7138ms | 1.4009 KOps/s | 1.3911 KOps/s | |
test_func_call_runtime[False-compile] | 0.8063ms | 0.7179ms | 1.3929 KOps/s | 1.3970 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4048ms | 0.3526ms | 2.8359 KOps/s | 2.8504 KOps/s | |
test_func_call_runtime[True-eager] | 0.9561ms | 0.8805ms | 1.1357 KOps/s | 1.1346 KOps/s | |
test_func_call_runtime[True-compile] | 0.8135ms | 0.7367ms | 1.3574 KOps/s | 1.3538 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4468ms | 0.3747ms | 2.6690 KOps/s | 2.6843 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7687ms | 0.7150ms | 1.3985 KOps/s | 1.4007 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8367ms | 0.7216ms | 1.3858 KOps/s | 1.3925 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.3960ms | 0.3562ms | 2.8072 KOps/s | 2.8370 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0663ms | 0.9767ms | 1.0238 KOps/s | 1.0076 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8037ms | 0.7644ms | 1.3083 KOps/s | 1.3083 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4470ms | 0.3987ms | 2.5081 KOps/s | 2.4996 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4867ms | 2.0437ms | 489.3026 Ops/s | 482.8714 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.8365ms | 0.7822ms | 1.2784 KOps/s | 1.2793 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.4524ms | 0.4029ms | 2.4818 KOps/s | 2.4782 KOps/s | |
test_distributed | 3.1187ms | 0.2071ms | 4.8277 KOps/s | 7.8828 KOps/s | |
test_tdmodule | 0.3577ms | 18.9214μs | 52.8503 KOps/s | 51.4240 KOps/s | |
test_tdmodule_dispatch | 55.2110μs | 32.8150μs | 30.4739 KOps/s | 28.1731 KOps/s | |
test_tdseq | 27.5000μs | 18.9159μs | 52.8654 KOps/s | 47.4550 KOps/s | |
test_tdseq_dispatch | 62.8410μs | 35.2202μs | 28.3928 KOps/s | 25.6071 KOps/s | |
test_instantiation_functorch | 1.6245ms | 1.5125ms | 661.1401 Ops/s | 669.1552 Ops/s | |
test_exec_functorch | 0.1809ms | 0.1391ms | 7.1872 KOps/s | 7.1912 KOps/s | |
test_exec_functional_call | 0.1734ms | 0.1335ms | 7.4919 KOps/s | 7.6330 KOps/s | |
test_exec_td_decorator | 0.4056ms | 0.1815ms | 5.5093 KOps/s | 5.5332 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.7601ms | 0.6697ms | 1.4931 KOps/s | 1.4824 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8448ms | 0.6711ms | 1.4902 KOps/s | 1.4802 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7030ms | 0.5836ms | 1.7136 KOps/s | 1.7081 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7018ms | 0.5818ms | 1.7188 KOps/s | 1.7034 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.2270ms | 18.8864ms | 52.9482 Ops/s | 52.6929 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.0696ms | 18.9503ms | 52.7697 Ops/s | 52.5462 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 18.9485ms | 18.8230ms | 53.1264 Ops/s | 53.0754 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 18.8633ms | 18.7853ms | 53.2331 Ops/s | 52.9574 Ops/s | |
test_to_module_speed[True] | 1.0687ms | 0.9605ms | 1.0412 KOps/s | 1.0509 KOps/s | |
test_to_module_speed[False] | 1.3370ms | 0.9553ms | 1.0468 KOps/s | 1.0759 KOps/s | |
test_tc_init | 64.1310μs | 32.3728μs | 30.8901 KOps/s | 29.0539 KOps/s | |
test_tc_init_nested | 0.1137ms | 66.1733μs | 15.1118 KOps/s | 14.2634 KOps/s | |
test_tc_first_layer_tensor | 5.0843μs | 0.7028μs | 1.4228 MOps/s | 1.4667 MOps/s | |
test_tc_first_layer_nontensor | 23.8200μs | 2.2436μs | 445.7060 KOps/s | 454.9922 KOps/s | |
test_tc_second_layer_tensor | 9.5533μs | 1.4446μs | 692.2125 KOps/s | 710.6785 KOps/s | |
test_tc_second_layer_nontensor | 0.1388ms | 2.9962μs | 333.7574 KOps/s | 337.1868 KOps/s | |
test_unbind | 0.2217s | 10.1712ms | 98.3170 Ops/s | 146.2863 Ops/s | |
test_full_like | 9.8135ms | 9.1927ms | 108.7818 Ops/s | 107.6308 Ops/s | |
test_zeros_like | 4.8998ms | 4.3290ms | 231.0023 Ops/s | 234.6626 Ops/s | |
test_ones_like | 4.6579ms | 4.3359ms | 230.6325 Ops/s | 230.7104 Ops/s | |
test_clone | 6.7027ms | 6.4046ms | 156.1377 Ops/s | 109.5115 Ops/s | |
test_squeeze | 57.8410μs | 9.5536μs | 104.6730 KOps/s | 110.5279 KOps/s | |
test_unsqueeze | 5.0006ms | 71.4757μs | 13.9908 KOps/s | 14.9839 KOps/s | |
test_split | 0.2526ms | 0.1566ms | 6.3858 KOps/s | 6.6421 KOps/s | |
test_permute | 0.2174ms | 0.1751ms | 5.7120 KOps/s | 5.8879 KOps/s | |
test_stack | 51.2982ms | 50.7511ms | 19.7040 Ops/s | 19.5978 Ops/s | |
test_cat | 51.0116ms | 50.6818ms | 19.7309 Ops/s | 19.7718 Ops/s |
The following behavior is deprecated as part of this PR:
td0 = TensorDict(batch_size=[3, 4])
td1 = TensorDict(batch_size=[3, 4, 2])
td0 == td1
Previously,
This change is necessary because the following is the expected behaviour:
td0 = TensorDict(a=torch.randn(3, 4), batch_size=[3, 4])
td1 = TensorDict(a=torch.randn(4), batch_size=[4])
td0 == td1  # works because td1 is broadcast to (3, 4)
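The two comparisons above differ in how their batch sizes line up. The following is a minimal sketch of the standard right-aligned broadcasting rule used by NumPy and PyTorch, applied to the batch sizes from the examples. The helper `broadcast_batch_shapes` is hypothetical and is not part of tensordict; that tensordict applies exactly this rule in every case is an assumption, not something this PR page confirms.

```python
from itertools import zip_longest


def broadcast_batch_shapes(shape_a, shape_b):
    """Hypothetical helper: broadcast two shapes, aligning trailing dims."""
    result = []
    # Walk both shapes from the last dimension; a missing dimension counts as 1.
    for dim_a, dim_b in zip_longest(
        reversed(shape_a), reversed(shape_b), fillvalue=1
    ):
        if dim_a == dim_b or dim_b == 1:
            result.append(dim_a)
        elif dim_a == 1:
            result.append(dim_b)
        else:
            raise ValueError(
                f"shapes {tuple(shape_a)} and {tuple(shape_b)} do not broadcast"
            )
    return tuple(reversed(result))


# Expected case from the PR comment: batch sizes [3, 4] and [4].
new_style = broadcast_batch_shapes((3, 4), (4,))

# Deprecated case: batch sizes [3, 4] and [3, 4, 2] clash on the trailing dim.
try:
    broadcast_batch_shapes((3, 4), (3, 4, 2))
    deprecated_case_broadcasts = True
except ValueError:
    deprecated_case_broadcasts = False
```

Under this rule, `[4]` is padded on the left to match `[3, 4]`, whereas `[3, 4]` against `[3, 4, 2]` compares a trailing 4 with a trailing 2 and fails, which is consistent with the first comparison being the one that is deprecated.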
vmoens
added a commit
that referenced
this pull request
Jan 8, 2025
ghstack-source-id: bbefbb1a2e9841847c618bb9cf49160ff1a5c36a Pull Request resolved: #1166
Labels
BC-breaking
CLA Signed
Stack from ghstack (oldest at bottom):