[Feature] torch.export and onnx compatibility #991
Merged
facebook-github-bot added the CLA Signed label on Sep 16, 2024
This was referenced Sep 16, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 53.5000μs | 20.0270μs | 49.9326 KOps/s | 50.3315 KOps/s | |
test_plain_set_stack_nested | 40.9770μs | 20.0123μs | 49.9694 KOps/s | 49.8753 KOps/s | |
test_plain_set_nested_inplace | 59.1800μs | 21.6906μs | 46.1029 KOps/s | 46.4534 KOps/s | |
test_plain_set_stack_nested_inplace | 76.1660μs | 21.8794μs | 45.7051 KOps/s | 46.8221 KOps/s | |
test_items | 15.7590μs | 4.1863μs | 238.8730 KOps/s | 231.3909 KOps/s | |
test_items_nested | 0.7628ms | 0.3553ms | 2.8149 KOps/s | 2.7864 KOps/s | |
test_items_nested_locked | 0.6518ms | 0.3591ms | 2.7850 KOps/s | 2.7742 KOps/s | |
test_items_nested_leaf | 0.1290ms | 69.2831μs | 14.4335 KOps/s | 14.5852 KOps/s | |
test_items_stack_nested | 0.6761ms | 0.3618ms | 2.7638 KOps/s | 2.7680 KOps/s | |
test_items_stack_nested_leaf | 0.1253ms | 71.0421μs | 14.0762 KOps/s | 14.0234 KOps/s | |
test_items_stack_nested_locked | 0.6852ms | 0.3612ms | 2.7688 KOps/s | 2.7548 KOps/s | |
test_keys | 30.7580μs | 3.5175μs | 284.2914 KOps/s | 278.6451 KOps/s | |
test_keys_nested | 0.2027ms | 99.9338μs | 10.0066 KOps/s | 10.2073 KOps/s | |
test_keys_nested_locked | 0.8084ms | 0.1059ms | 9.4384 KOps/s | 9.5028 KOps/s | |
test_keys_nested_leaf | 0.1454ms | 82.4196μs | 12.1330 KOps/s | 11.7037 KOps/s | |
test_keys_stack_nested | 0.1758ms | 99.5402μs | 10.0462 KOps/s | 10.1548 KOps/s | |
test_keys_stack_nested_leaf | 0.1441ms | 82.2764μs | 12.1542 KOps/s | 11.8336 KOps/s | |
test_keys_stack_nested_locked | 0.1969ms | 0.1047ms | 9.5507 KOps/s | 9.4298 KOps/s | |
test_values | 9.0950μs | 1.0707μs | 933.9812 KOps/s | 913.9490 KOps/s | |
test_values_nested | 0.1755ms | 72.3335μs | 13.8249 KOps/s | 13.9727 KOps/s | |
test_values_nested_locked | 0.1211ms | 71.9932μs | 13.8902 KOps/s | 13.8653 KOps/s | |
test_values_nested_leaf | 0.1190ms | 60.9661μs | 16.4026 KOps/s | 16.1964 KOps/s | |
test_values_stack_nested | 0.1347ms | 73.5834μs | 13.5900 KOps/s | 13.7656 KOps/s | |
test_values_stack_nested_leaf | 0.1121ms | 60.8482μs | 16.4343 KOps/s | 16.4843 KOps/s | |
test_values_stack_nested_locked | 0.1287ms | 73.5267μs | 13.6005 KOps/s | 13.3668 KOps/s | |
test_membership | 2.7792μs | 0.6949μs | 1.4391 MOps/s | 1.4175 MOps/s | |
test_membership_nested | 23.7950μs | 2.6945μs | 371.1325 KOps/s | 372.1803 KOps/s | |
test_membership_nested_leaf | 44.2630μs | 2.6849μs | 372.4592 KOps/s | 348.6172 KOps/s | |
test_membership_stacked_nested | 18.2150μs | 2.7247μs | 367.0167 KOps/s | 374.5875 KOps/s | |
test_membership_stacked_nested_leaf | 41.9780μs | 2.7332μs | 365.8676 KOps/s | 368.3568 KOps/s | |
test_membership_nested_last | 25.4370μs | 3.9758μs | 251.5220 KOps/s | 252.2308 KOps/s | |
test_membership_nested_leaf_last | 29.6550μs | 3.9498μs | 253.1773 KOps/s | 256.6277 KOps/s | |
test_membership_stacked_nested_last | 24.5960μs | 4.5718μs | 218.7313 KOps/s | 255.2595 KOps/s | |
test_membership_stacked_nested_leaf_last | 53.4310μs | 4.6053μs | 217.1404 KOps/s | 256.9286 KOps/s | |
test_nested_getleaf | 46.3160μs | 10.7576μs | 92.9576 KOps/s | 96.2351 KOps/s | |
test_nested_get | 56.2050μs | 10.1367μs | 98.6511 KOps/s | 100.4158 KOps/s | |
test_stacked_getleaf | 49.5330μs | 10.7084μs | 93.3849 KOps/s | 95.2779 KOps/s | |
test_stacked_get | 51.9670μs | 10.2445μs | 97.6130 KOps/s | 99.3600 KOps/s | |
test_nested_getitemleaf | 54.6920μs | 11.1085μs | 90.0209 KOps/s | 91.4451 KOps/s | |
test_nested_getitem | 51.4660μs | 10.4329μs | 95.8506 KOps/s | 98.4986 KOps/s | |
test_stacked_getitemleaf | 37.4800μs | 11.1175μs | 89.9480 KOps/s | 90.5625 KOps/s | |
test_stacked_getitem | 61.4150μs | 10.2694μs | 97.3762 KOps/s | 98.7129 KOps/s | |
test_lock_nested | 86.5067ms | 0.5597ms | 1.7867 KOps/s | 2.0719 KOps/s | |
test_lock_stack_nested | 0.8824ms | 0.4443ms | 2.2509 KOps/s | 2.1672 KOps/s | |
test_unlock_nested | 91.2891ms | 0.4852ms | 2.0609 KOps/s | 2.3904 KOps/s | |
test_unlock_stack_nested | 0.7402ms | 0.3645ms | 2.7436 KOps/s | 2.6700 KOps/s | |
test_flatten_speed | 0.1645ms | 86.7254μs | 11.5306 KOps/s | 11.3644 KOps/s | |
test_unflatten_speed | 0.8444ms | 0.4632ms | 2.1590 KOps/s | 2.1919 KOps/s | |
test_common_ops | 3.7379ms | 1.0703ms | 934.2917 Ops/s | 892.8694 Ops/s | |
test_creation | 31.5960μs | 2.0404μs | 490.1062 KOps/s | 462.1679 KOps/s | |
test_creation_empty | 60.7330μs | 17.2881μs | 57.8432 KOps/s | 58.2744 KOps/s | |
test_creation_nested_1 | 57.3770μs | 20.0546μs | 49.8640 KOps/s | 48.1107 KOps/s | |
test_creation_nested_2 | 81.4120μs | 24.2788μs | 41.1881 KOps/s | 40.5068 KOps/s | |
test_clone | 0.1970ms | 17.0227μs | 58.7451 KOps/s | 60.0009 KOps/s | |
test_getitem[int] | 0.8823ms | 16.6224μs | 60.1600 KOps/s | 60.0495 KOps/s | |
test_getitem[slice_int] | 0.1296ms | 30.8158μs | 32.4509 KOps/s | 33.7901 KOps/s | |
test_getitem[range] | 0.1651ms | 56.8843μs | 17.5795 KOps/s | 17.0890 KOps/s | |
test_getitem[tuple] | 0.1878ms | 25.2537μs | 39.5981 KOps/s | 40.6235 KOps/s | |
test_getitem[list] | 0.2557ms | 52.7036μs | 18.9740 KOps/s | 18.2810 KOps/s | |
test_setitem_dim[int] | 55.6040μs | 32.4122μs | 30.8526 KOps/s | 32.2152 KOps/s | |
test_setitem_dim[slice_int] | 0.1303ms | 60.3778μs | 16.5624 KOps/s | 16.5967 KOps/s | |
test_setitem_dim[range] | 0.1370ms | 83.7550μs | 11.9396 KOps/s | 11.9318 KOps/s | |
test_setitem_dim[tuple] | 88.8860μs | 48.5941μs | 20.5786 KOps/s | 21.2139 KOps/s | |
test_setitem | 75.9320μs | 28.4753μs | 35.1181 KOps/s | 34.5569 KOps/s | |
test_set | 94.2060μs | 28.1090μs | 35.5758 KOps/s | 35.9518 KOps/s | |
test_set_shared | 2.9269ms | 0.2118ms | 4.7206 KOps/s | 4.7160 KOps/s | |
test_update | 0.1410ms | 34.0316μs | 29.3845 KOps/s | 28.8483 KOps/s | |
test_update_nested | 0.1982ms | 44.7098μs | 22.3665 KOps/s | 22.3177 KOps/s | |
test_update__nested | 77.1740μs | 33.2666μs | 30.0602 KOps/s | 29.7119 KOps/s | |
test_set_nested | 0.1158ms | 30.5778μs | 32.7035 KOps/s | 32.4535 KOps/s | |
test_set_nested_new | 0.1675ms | 35.0746μs | 28.5107 KOps/s | 27.7294 KOps/s | |
test_select | 0.1310ms | 52.4116μs | 19.0797 KOps/s | 18.7768 KOps/s | |
test_select_nested | 0.1299ms | 60.1066μs | 16.6371 KOps/s | 17.0168 KOps/s | |
test_exclude_nested | 0.1455ms | 74.4988μs | 13.4230 KOps/s | 13.3363 KOps/s | |
test_empty[True] | 0.4573ms | 0.3157ms | 3.1672 KOps/s | 3.1940 KOps/s | |
test_empty[False] | 10.0632μs | 1.2004μs | 833.0321 KOps/s | 811.9339 KOps/s | |
test_unbind_speed | 0.4486ms | 0.2975ms | 3.3610 KOps/s | 3.3279 KOps/s | |
test_unbind_speed_stack0 | 0.3956ms | 0.2895ms | 3.4540 KOps/s | 3.4097 KOps/s | |
test_unbind_speed_stack1 | 98.4096ms | 0.8017ms | 1.2473 KOps/s | 1.3420 KOps/s | |
test_split | 92.6644ms | 2.2092ms | 452.6428 Ops/s | 455.3096 Ops/s | |
test_chunk | 3.2352ms | 2.0236ms | 494.1765 Ops/s | 458.4457 Ops/s | |
test_creation[device0] | 0.2295ms | 0.1157ms | 8.6435 KOps/s | 8.3425 KOps/s | |
test_creation_from_tensor | 3.1945ms | 0.1159ms | 8.6275 KOps/s | 8.5507 KOps/s | |
test_add_one[memmap_tensor0] | 0.3484ms | 7.3237μs | 136.5436 KOps/s | 142.8680 KOps/s | |
test_contiguous[memmap_tensor0] | 28.5730μs | 1.9277μs | 518.7409 KOps/s | 502.3206 KOps/s | |
test_stack[memmap_tensor0] | 54.3920μs | 5.5374μs | 180.5893 KOps/s | 177.3932 KOps/s | |
test_memmaptd_index | 1.2375ms | 0.4001ms | 2.4996 KOps/s | 2.5708 KOps/s | |
test_memmaptd_index_astensor | 0.7627ms | 0.4807ms | 2.0802 KOps/s | 2.1474 KOps/s | |
test_memmaptd_index_op | 1.4637ms | 0.9933ms | 1.0068 KOps/s | 1.0275 KOps/s | |
test_serialize_model | 0.2175s | 0.1314s | 7.6091 Ops/s | 8.4033 Ops/s | |
test_serialize_model_pickle | 0.4649s | 0.3955s | 2.5286 Ops/s | 2.4905 Ops/s | |
test_serialize_weights | 0.1257s | 0.1169s | 8.5565 Ops/s | 7.2941 Ops/s | |
test_serialize_weights_returnearly | 0.1719s | 0.1590s | 6.2910 Ops/s | 6.1573 Ops/s | |
test_serialize_weights_pickle | 0.6025s | 0.4249s | 2.3537 Ops/s | 2.1899 Ops/s | |
test_serialize_weights_filesystem | 0.2338s | 0.1557s | 6.4210 Ops/s | 6.9030 Ops/s | |
test_serialize_model_filesystem | 0.1562s | 0.1466s | 6.8191 Ops/s | 5.7571 Ops/s | |
test_reshape_pytree | 84.0780μs | 38.2765μs | 26.1257 KOps/s | 26.1273 KOps/s | |
test_reshape_td | 0.1034ms | 45.3401μs | 22.0555 KOps/s | 22.1134 KOps/s | |
test_view_pytree | 92.5630μs | 37.6438μs | 26.5648 KOps/s | 26.7467 KOps/s | |
test_view_td | 0.1042ms | 51.9202μs | 19.2603 KOps/s | 19.3032 KOps/s | |
test_unbind_pytree | 82.9060μs | 35.2898μs | 28.3368 KOps/s | 27.6940 KOps/s | |
test_unbind_td | 0.3199ms | 44.0870μs | 22.6824 KOps/s | 22.0782 KOps/s | |
test_split_pytree | 85.6800μs | 37.4715μs | 26.6870 KOps/s | 26.8687 KOps/s | |
test_split_td | 0.5214ms | 59.7166μs | 16.7458 KOps/s | 17.6112 KOps/s | |
test_add_pytree | 98.9050μs | 43.5987μs | 22.9365 KOps/s | 23.5686 KOps/s | |
test_add_td | 0.1762ms | 76.4429μs | 13.0817 KOps/s | 12.5997 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2231ms | 56.8378μs | 17.5939 KOps/s | 17.7058 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3136ms | 0.1763ms | 5.6735 KOps/s | 5.6968 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1370ms | 55.4243μs | 18.0426 KOps/s | 17.8919 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2729ms | 0.1383ms | 7.2325 KOps/s | 7.3191 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 48.2410μs | 21.2249μs | 47.1144 KOps/s | 46.6351 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1841ms | 66.4408μs | 15.0510 KOps/s | 15.1571 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1288ms | 74.2630μs | 13.4657 KOps/s | 13.3779 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1147ms | 67.0376μs | 14.9170 KOps/s | 14.8675 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2560ms | 0.1721ms | 5.8123 KOps/s | 5.8159 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2706ms | 0.1869ms | 5.3500 KOps/s | 5.3821 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1112ms | 46.6167μs | 21.4515 KOps/s | 21.0374 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1516ms | 68.0439μs | 14.6964 KOps/s | 15.0705 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3593ms | 0.1759ms | 5.6846 KOps/s | 5.7623 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.4718ms | 0.2825ms | 3.5393 KOps/s | 3.6060 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.2889ms | 0.1991ms | 5.0218 KOps/s | 4.9617 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3764ms | 0.1731ms | 5.7763 KOps/s | 5.7255 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1123ms | 61.1688μs | 16.3482 KOps/s | 16.2519 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 97.3310μs | 45.8669μs | 21.8022 KOps/s | 20.8953 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4198ms | 0.2294ms | 4.3596 KOps/s | 4.3222 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2808ms | 0.1745ms | 5.7301 KOps/s | 5.6824 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2738ms | 0.1037ms | 9.6410 KOps/s | 9.7598 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1235ms | 57.3870μs | 17.4256 KOps/s | 17.1312 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1663ms | 75.1391μs | 13.3086 KOps/s | 12.9856 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1506ms | 68.1921μs | 14.6644 KOps/s | 14.6457 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3651ms | 0.1914ms | 5.2252 KOps/s | 5.0641 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.1171ms | 1.6260ms | 614.9934 Ops/s | 613.2288 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2977ms | 0.1881ms | 5.3172 KOps/s | 5.0118 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.7213ms | 1.0667ms | 937.4997 Ops/s | 923.7373 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5341ms | 0.4121ms | 2.4268 KOps/s | 2.3958 KOps/s | |
test_compile_assign_and_add_stack[eager] | 5.8792ms | 3.7303ms | 268.0718 Ops/s | 277.2219 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 83.5060μs | 33.5779μs | 29.7815 KOps/s | 29.8112 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6727ms | 48.4769μs | 20.6284 KOps/s | 20.8211 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 91.4910μs | 29.3611μs | 34.0586 KOps/s | 34.4776 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 83.1850μs | 29.1070μs | 34.3560 KOps/s | 34.7207 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1005ms | 29.0259μs | 34.4520 KOps/s | 35.0314 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 85.9110μs | 28.7247μs | 34.8133 KOps/s | 34.3810 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2440ms | 73.2402μs | 13.6537 KOps/s | 13.8663 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5378ms | 28.4201μs | 35.1863 KOps/s | 35.8351 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1273ms | 66.5784μs | 15.0199 KOps/s | 14.8823 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 80.2200μs | 23.0931μs | 43.3030 KOps/s | 42.7301 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1860ms | 67.4268μs | 14.8309 KOps/s | 14.9917 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 75.9520μs | 23.0080μs | 43.4632 KOps/s | 43.5305 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1398ms | 71.1691μs | 14.0510 KOps/s | 13.7582 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9767ms | 27.9445μs | 35.7853 KOps/s | 36.3398 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1276ms | 66.2001μs | 15.1057 KOps/s | 14.9194 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 68.5790μs | 22.3643μs | 44.7141 KOps/s | 43.1113 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1672ms | 66.4543μs | 15.0479 KOps/s | 14.9477 KOps/s | |
test_compile_indexing[int-pytree-eager] | 82.7670μs | 22.6958μs | 44.0611 KOps/s | 43.3036 KOps/s | |
test_mod_add[eager] | 0.1215ms | 24.5286μs | 40.7688 KOps/s | 42.3595 KOps/s | |
test_mod_add[compile] | 0.1113ms | 39.1095μs | 25.5692 KOps/s | 27.6964 KOps/s | |
test_mod_add[compile-overhead] | 0.1031ms | 38.0533μs | 26.2789 KOps/s | 27.4023 KOps/s | |
test_mod_wrap[eager] | 0.4267ms | 0.2020ms | 4.9505 KOps/s | 4.9436 KOps/s | |
test_mod_wrap[compile] | 0.4541ms | 0.2268ms | 4.4097 KOps/s | 4.4040 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3066ms | 0.2239ms | 4.4664 KOps/s | 4.4132 KOps/s | |
test_mod_wrap_and_backward[eager] | 12.3678ms | 10.6438ms | 93.9512 Ops/s | 87.2255 Ops/s | |
test_mod_wrap_and_backward[compile] | 12.1259ms | 10.8964ms | 91.7736 Ops/s | 80.7164 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 12.0500ms | 10.8419ms | 92.2348 Ops/s | 78.3017 Ops/s | |
test_seq_add[eager] | 0.2033ms | 88.2239μs | 11.3348 KOps/s | 11.5466 KOps/s | |
test_seq_add[compile] | 0.1274ms | 63.0001μs | 15.8730 KOps/s | 16.2177 KOps/s | |
test_seq_add[compile-overhead] | 0.1175ms | 62.0104μs | 16.1263 KOps/s | 16.2071 KOps/s | |
test_seq_wrap[eager] | 0.6465ms | 0.3737ms | 2.6758 KOps/s | 2.6616 KOps/s | |
test_seq_wrap[compile] | 0.5051ms | 0.2624ms | 3.8105 KOps/s | 3.7638 KOps/s | |
test_seq_wrap[compile-overhead] | 0.5104ms | 0.2637ms | 3.7915 KOps/s | 3.7794 KOps/s | |
test_func_call_runtime[False-eager] | 0.7864ms | 0.5176ms | 1.9319 KOps/s | 1.9836 KOps/s | |
test_func_call_runtime[False-compile] | 0.9333ms | 0.4898ms | 2.0416 KOps/s | 2.0414 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6660ms | 0.4879ms | 2.0497 KOps/s | 2.0319 KOps/s | |
test_func_call_runtime[True-eager] | 1.1893ms | 0.7318ms | 1.3664 KOps/s | 1.3936 KOps/s | |
test_func_call_runtime[True-compile] | 0.9074ms | 0.5014ms | 1.9945 KOps/s | 1.9751 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8071ms | 0.4995ms | 2.0021 KOps/s | 1.9898 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7150ms | 0.5174ms | 1.9328 KOps/s | 1.9903 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8142ms | 0.5064ms | 1.9746 KOps/s | 2.0234 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9258ms | 0.4914ms | 2.0348 KOps/s | 2.0336 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.4459ms | 0.8565ms | 1.1675 KOps/s | 1.1710 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.8475ms | 0.7272ms | 1.3752 KOps/s | 1.3731 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.4012ms | 0.7375ms | 1.3559 KOps/s | 1.3605 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4036ms | 1.8221ms | 548.8167 Ops/s | 540.0278 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 3.0303ms | 1.8905ms | 528.9665 Ops/s | 523.7056 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6655ms | 1.8857ms | 530.3204 Ops/s | 524.5807 Ops/s | |
test_distributed | 0.2668ms | 0.1235ms | 8.0983 KOps/s | 7.8322 KOps/s | |
test_tdmodule | 34.0250μs | 17.3854μs | 57.5195 KOps/s | 58.1015 KOps/s | |
test_tdmodule_dispatch | 65.3640μs | 34.0685μs | 29.3527 KOps/s | 28.0191 KOps/s | |
test_tdseq | 48.5420μs | 19.8723μs | 50.3213 KOps/s | 49.2960 KOps/s | |
test_tdseq_dispatch | 84.5430μs | 39.1531μs | 25.5408 KOps/s | 24.5903 KOps/s | |
test_instantiation_functorch | 2.0270ms | 1.5434ms | 647.9335 Ops/s | 629.0759 Ops/s | |
test_instantiation_td | 1.9145ms | 1.1341ms | 881.7715 Ops/s | 868.4491 Ops/s | |
test_exec_functorch | 0.3340ms | 0.1758ms | 5.6878 KOps/s | 5.5629 KOps/s | |
test_exec_functional_call | 0.3392ms | 0.1714ms | 5.8353 KOps/s | 5.9026 KOps/s | |
test_exec_td | 0.3254ms | 0.1673ms | 5.9759 KOps/s | 5.9855 KOps/s | |
test_exec_td_decorator | 0.9307ms | 0.2166ms | 4.6164 KOps/s | 4.5776 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.9298ms | 0.6278ms | 1.5929 KOps/s | 1.5756 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8613ms | 0.6267ms | 1.5955 KOps/s | 1.5805 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.8066ms | 0.4837ms | 2.0672 KOps/s | 2.0504 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7737ms | 0.4857ms | 2.0587 KOps/s | 2.0330 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3534ms | 0.5998ms | 1.6672 KOps/s | 1.5793 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9034ms | 0.6053ms | 1.6520 KOps/s | 1.6352 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6789ms | 0.4965ms | 2.0141 KOps/s | 1.9955 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6961ms | 0.4964ms | 2.0145 KOps/s | 1.9821 KOps/s | |
test_to_module_speed[True] | 1.8736ms | 1.2815ms | 780.3559 Ops/s | 775.7036 Ops/s | |
test_to_module_speed[False] | 1.4908ms | 1.2627ms | 791.9441 Ops/s | 792.9546 Ops/s | |
test_tc_init | 0.1078ms | 41.4043μs | 24.1521 KOps/s | 22.5425 KOps/s | |
test_tc_init_nested | 0.1415ms | 81.5599μs | 12.2609 KOps/s | 11.2806 KOps/s | |
test_tc_first_layer_tensor | 17.4430μs | 1.5228μs | 656.6838 KOps/s | 658.0449 KOps/s | |
test_tc_first_layer_nontensor | 20.9690μs | 4.6953μs | 212.9787 KOps/s | 211.6949 KOps/s | |
test_tc_second_layer_tensor | 30.5480μs | 2.8492μs | 350.9771 KOps/s | 349.8067 KOps/s | |
test_tc_second_layer_nontensor | 40.6770μs | 6.0129μs | 166.3085 KOps/s | 164.8618 KOps/s | |
test_unbind | 0.4890s | 15.2346ms | 65.6401 Ops/s | 75.3549 Ops/s | |
test_full_like | 9.3031ms | 7.4348ms | 134.5031 Ops/s | 128.7461 Ops/s | |
test_zeros_like | 13.4871ms | 6.8376ms | 146.2503 Ops/s | 337.0434 Ops/s | |
test_ones_like | 15.1644ms | 8.1021ms | 123.4247 Ops/s | 150.4293 Ops/s | |
test_clone | 14.4438ms | 9.1353ms | 109.4654 Ops/s | 108.9289 Ops/s | |
test_squeeze | 70.2830μs | 12.4154μs | 80.5451 KOps/s | 79.6245 KOps/s | |
test_unsqueeze | 0.3487ms | 91.6932μs | 10.9059 KOps/s | 10.5484 KOps/s | |
test_split | 0.3899ms | 0.1942ms | 5.1489 KOps/s | 5.0908 KOps/s | |
test_permute | 0.3231ms | 0.2177ms | 4.5936 KOps/s | 4.5058 KOps/s | |
test_stack | 32.7349ms | 25.5789ms | 39.0947 Ops/s | 38.9143 Ops/s | |
test_cat | 29.6561ms | 25.3201ms | 39.4943 Ops/s | 38.9932 Ops/s |
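
The Change column in the table above is empty, but the Ops figures appear to be the reciprocal of the Mean time (for example, a mean of 20.0270μs corresponds to roughly 49.93 KOps/s). The sketch below shows one way to check that relationship and to derive a relative change between the two Ops columns. The helper names (`parse_time`, `parse_ops`, `relative_change`) are hypothetical, and defining the change as `(Ops - Ops on Repo HEAD) / Ops on Repo HEAD` is an assumption, not something the benchmark report states.

```python
# Hypothetical helpers for reading the benchmark table; not part of the PR.

# Seconds per unit; both the Greek mu (U+03BC) and the micro sign (U+00B5) are accepted.
TIME_UNITS = {"ms": 1e-3, "\u03bcs": 1e-6, "\u00b5s": 1e-6, "s": 1.0}
OPS_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_time(text: str) -> float:
    """Convert a cell such as '20.0270μs' or '0.3553ms' to seconds."""
    text = text.strip()
    # Check the two-character suffixes before the bare 's' suffix.
    for unit in ("ms", "\u03bcs", "\u00b5s", "s"):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * TIME_UNITS[unit]
    raise ValueError(f"unrecognised time cell: {text!r}")


def parse_ops(text: str) -> float:
    """Convert a cell such as '49.9326 KOps/s' to operations per second."""
    value, unit = text.strip().split()
    return float(value) * OPS_UNITS[unit]


def relative_change(ops: str, ops_head: str) -> float:
    """Percentage change of this PR's Ops relative to Repo HEAD (assumed definition)."""
    new, old = parse_ops(ops), parse_ops(ops_head)
    return (new - old) / old * 100.0


if __name__ == "__main__":
    # Values copied from the test_plain_set_nested row.
    mean = parse_time("20.0270\u03bcs")
    print(f"1 / mean = {1.0 / mean:.1f} ops/s")
    print(f"change   = {relative_change('49.9326 KOps/s', '50.3315 KOps/s'):+.2f}%")
```

Under this definition a negative value means the PR branch measured slower than Repo HEAD for that test. Differences of a percent or two are likely within run-to-run noise for microbenchmarks like these.
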
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1393ms | 13.5180μs | 73.9756 KOps/s | 67.2163 KOps/s | |
test_plain_set_stack_nested | 37.9510μs | 13.8008μs | 72.4597 KOps/s | 66.8901 KOps/s | |
test_plain_set_nested_inplace | 43.9710μs | 14.7247μs | 67.9133 KOps/s | 62.4165 KOps/s | |
test_plain_set_stack_nested_inplace | 43.2010μs | 14.4941μs | 68.9935 KOps/s | 62.0576 KOps/s | |
test_items | 28.1110μs | 2.8884μs | 346.2116 KOps/s | 342.0922 KOps/s | |
test_items_nested | 0.4354ms | 0.3281ms | 3.0480 KOps/s | 3.0880 KOps/s | |
test_items_nested_locked | 0.3751ms | 0.3286ms | 3.0434 KOps/s | 3.0799 KOps/s | |
test_items_nested_leaf | 0.1871ms | 55.7903μs | 17.9243 KOps/s | 17.9807 KOps/s | |
test_items_stack_nested | 0.3703ms | 0.3270ms | 3.0586 KOps/s | 3.0631 KOps/s | |
test_items_stack_nested_leaf | 86.4320μs | 56.9259μs | 17.5667 KOps/s | 17.6720 KOps/s | |
test_items_stack_nested_locked | 0.3808ms | 0.3258ms | 3.0696 KOps/s | 3.0101 KOps/s | |
test_keys | 34.6600μs | 3.4648μs | 288.6209 KOps/s | 292.9958 KOps/s | |
test_keys_nested | 88.6620μs | 56.6134μs | 17.6637 KOps/s | 17.7099 KOps/s | |
test_keys_nested_locked | 2.6752ms | 62.9049μs | 15.8970 KOps/s | 16.0232 KOps/s | |
test_keys_nested_leaf | 78.9420μs | 47.5474μs | 21.0316 KOps/s | 20.8404 KOps/s | |
test_keys_stack_nested | 91.3020μs | 56.3944μs | 17.7322 KOps/s | 17.7619 KOps/s | |
test_keys_stack_nested_leaf | 84.7220μs | 48.6672μs | 20.5477 KOps/s | 20.7964 KOps/s | |
test_keys_stack_nested_locked | 94.5720μs | 61.3430μs | 16.3018 KOps/s | 16.2902 KOps/s | |
test_values | 5.7017μs | 0.8447μs | 1.1838 MOps/s | 1.1792 MOps/s | |
test_values_nested | 72.4320μs | 41.2846μs | 24.2221 KOps/s | 24.5713 KOps/s | |
test_values_nested_locked | 69.5520μs | 43.2580μs | 23.1171 KOps/s | 23.4335 KOps/s | |
test_values_nested_leaf | 57.2510μs | 35.6947μs | 28.0154 KOps/s | 28.3293 KOps/s | |
test_values_stack_nested | 71.5410μs | 42.0704μs | 23.7697 KOps/s | 23.9692 KOps/s | |
test_values_stack_nested_leaf | 65.5320μs | 35.9301μs | 27.8318 KOps/s | 27.9520 KOps/s | |
test_values_stack_nested_locked | 74.8620μs | 43.8617μs | 22.7989 KOps/s | 22.9419 KOps/s | |
test_membership | 1.8886μs | 0.5043μs | 1.9829 MOps/s | 1.9849 MOps/s | |
test_membership_nested | 17.2355μs | 1.9013μs | 525.9510 KOps/s | 531.7707 KOps/s | |
test_membership_nested_leaf | 15.2150μs | 1.9013μs | 525.9694 KOps/s | 539.1182 KOps/s | |
test_membership_stacked_nested | 28.4210μs | 1.9653μs | 508.8253 KOps/s | 530.1217 KOps/s | |
test_membership_stacked_nested_leaf | 30.7510μs | 1.9758μs | 506.1239 KOps/s | 521.2209 KOps/s | |
test_membership_nested_last | 31.7210μs | 2.8056μs | 356.4357 KOps/s | 363.4395 KOps/s | |
test_membership_nested_leaf_last | 28.7710μs | 2.8433μs | 351.7084 KOps/s | 363.0943 KOps/s | |
test_membership_stacked_nested_last | 28.6000μs | 3.4614μs | 288.8995 KOps/s | 127.2262 KOps/s | |
test_membership_stacked_nested_leaf_last | 31.4610μs | 3.4287μs | 291.6559 KOps/s | 128.7964 KOps/s | |
test_nested_getleaf | 48.6710μs | 6.1026μs | 163.8644 KOps/s | 163.5518 KOps/s | |
test_nested_get | 32.3810μs | 5.6830μs | 175.9624 KOps/s | 174.1617 KOps/s | |
test_stacked_getleaf | 42.5710μs | 6.0154μs | 166.2404 KOps/s | 164.0631 KOps/s | |
test_stacked_get | 32.5410μs | 5.6268μs | 177.7196 KOps/s | 175.3282 KOps/s | |
test_nested_getitemleaf | 0.1589ms | 6.1691μs | 162.0974 KOps/s | 163.9320 KOps/s | |
test_nested_getitem | 33.7310μs | 5.7626μs | 173.5336 KOps/s | 173.9801 KOps/s | |
test_stacked_getitemleaf | 33.4400μs | 6.1116μs | 163.6222 KOps/s | 163.2538 KOps/s | |
test_stacked_getitem | 37.6010μs | 5.6822μs | 175.9886 KOps/s | 175.1809 KOps/s | |
test_lock_nested | 6.9217ms | 0.4172ms | 2.3972 KOps/s | 2.3564 KOps/s | |
test_lock_stack_nested | 0.5073ms | 0.3734ms | 2.6781 KOps/s | 2.6631 KOps/s | |
test_unlock_nested | 0.7516ms | 0.3504ms | 2.8539 KOps/s | 2.7758 KOps/s | |
test_unlock_stack_nested | 0.3935ms | 0.3121ms | 3.2042 KOps/s | 3.1767 KOps/s | |
test_flatten_speed | 0.1495ms | 69.0903μs | 14.4738 KOps/s | 14.4189 KOps/s | |
test_unflatten_speed | 0.3738ms | 0.2799ms | 3.5728 KOps/s | 3.5092 KOps/s | |
test_common_ops | 1.5568ms | 1.2287ms | 813.8364 Ops/s | 782.1285 Ops/s | |
test_creation | 25.7110μs | 1.4976μs | 667.7428 KOps/s | 670.8931 KOps/s | |
test_creation_empty | 39.8510μs | 14.7538μs | 67.7792 KOps/s | 56.5462 KOps/s | |
test_creation_nested_1 | 55.8920μs | 16.4712μs | 60.7120 KOps/s | 51.1122 KOps/s | |
test_creation_nested_2 | 49.7610μs | 19.3424μs | 51.6998 KOps/s | 45.0154 KOps/s | |
test_clone | 0.2204ms | 29.0868μs | 34.3799 KOps/s | 34.4339 KOps/s | |
test_getitem[int] | 1.3674ms | 15.3232μs | 65.2604 KOps/s | 62.2933 KOps/s | |
test_getitem[slice_int] | 0.1149ms | 26.4742μs | 37.7726 KOps/s | 36.1952 KOps/s | |
test_getitem[range] | 0.2456ms | 0.1070ms | 9.3450 KOps/s | 9.3357 KOps/s | |
test_getitem[tuple] | 0.1236ms | 22.8903μs | 43.6867 KOps/s | 41.6764 KOps/s | |
test_getitem[list] | 0.2716ms | 95.8522μs | 10.4327 KOps/s | 10.3534 KOps/s | |
test_setitem_dim[int] | 0.2530ms | 43.9832μs | 22.7360 KOps/s | 23.1586 KOps/s | |
test_setitem_dim[slice_int] | 0.1079ms | 65.6925μs | 15.2224 KOps/s | 15.1717 KOps/s | |
test_setitem_dim[range] | 0.2602ms | 0.1255ms | 7.9686 KOps/s | 7.9271 KOps/s | |
test_setitem_dim[tuple] | 85.2720μs | 59.5068μs | 16.8048 KOps/s | 16.7087 KOps/s | |
test_setitem | 84.6820μs | 41.5515μs | 24.0665 KOps/s | 23.4767 KOps/s | |
test_set | 0.1900ms | 40.5606μs | 24.6545 KOps/s | 24.0223 KOps/s | |
test_set_shared | 0.3496ms | 50.8902μs | 19.6501 KOps/s | 19.8010 KOps/s | |
test_update | 0.2102ms | 48.0956μs | 20.7919 KOps/s | 19.5204 KOps/s | |
test_update_nested | 0.1904ms | 54.7485μs | 18.2653 KOps/s | 17.0974 KOps/s | |
test_update__nested | 0.1995ms | 58.3312μs | 17.1435 KOps/s | 16.7183 KOps/s | |
test_set_nested | 77.1320μs | 42.4820μs | 23.5394 KOps/s | 22.6056 KOps/s | |
test_set_nested_new | 0.2078ms | 45.9360μs | 21.7694 KOps/s | 20.5323 KOps/s | |
test_select | 0.2125ms | 59.1227μs | 16.9140 KOps/s | 15.7335 KOps/s | |
test_select_nested | 69.3220μs | 42.2216μs | 23.6846 KOps/s | 23.8664 KOps/s | |
test_exclude_nested | 87.0020μs | 59.8335μs | 16.7131 KOps/s | 17.0264 KOps/s | |
test_empty[True] | 0.3167ms | 0.2439ms | 4.1001 KOps/s | 3.9636 KOps/s | |
test_empty[False] | 2.9421μs | 0.7433μs | 1.3453 MOps/s | 1.3442 MOps/s | |
test_to | 0.1360ms | 25.1610μs | 39.7440 KOps/s | 38.8730 KOps/s | |
test_to_nonblocking | 0.2063ms | 24.0589μs | 41.5646 KOps/s | 40.9607 KOps/s | |
test_unbind_speed | 1.4011ms | 0.2737ms | 3.6533 KOps/s | 3.5543 KOps/s | |
test_unbind_speed_stack0 | 0.2999ms | 0.2685ms | 3.7245 KOps/s | 3.6310 KOps/s | |
test_unbind_speed_stack1 | 92.6087ms | 0.6897ms | 1.4500 KOps/s | 1.4331 KOps/s | |
test_split | 93.6391ms | 2.1135ms | 473.1586 Ops/s | 455.0897 Ops/s | |
test_chunk | 95.5422ms | 2.1332ms | 468.7746 Ops/s | 453.4320 Ops/s | |
test_creation[device0] | 0.3406ms | 0.1256ms | 7.9628 KOps/s | 7.7945 KOps/s | |
test_creation_from_tensor | 0.3501ms | 0.1277ms | 7.8320 KOps/s | 7.7619 KOps/s | |
test_add_one[memmap_tensor0] | 0.1366ms | 8.3007μs | 120.4724 KOps/s | 116.9092 KOps/s | |
test_contiguous[memmap_tensor0] | 22.8300μs | 2.2056μs | 453.3922 KOps/s | 447.5495 KOps/s | |
test_stack[memmap_tensor0] | 0.1233ms | 6.4294μs | 155.5352 KOps/s | 151.9702 KOps/s | |
test_memmaptd_index | 1.0495ms | 0.4184ms | 2.3899 KOps/s | 2.2990 KOps/s | |
test_memmaptd_index_astensor | 0.7448ms | 0.4757ms | 2.1023 KOps/s | 2.0446 KOps/s | |
test_memmaptd_index_op | 1.3956ms | 0.9834ms | 1.0169 KOps/s | 925.1552 Ops/s | |
test_serialize_model | 0.1312s | 0.1295s | 7.7199 Ops/s | 7.7219 Ops/s | |
test_serialize_model_pickle | 1.3481s | 1.2131s | 0.8243 Ops/s | 0.8197 Ops/s | |
test_serialize_weights | 0.1310s | 0.1295s | 7.7228 Ops/s | 6.9967 Ops/s | |
test_serialize_weights_returnearly | 0.2428s | 62.5006ms | 15.9998 Ops/s | 17.8109 Ops/s | |
test_serialize_weights_pickle | 1.3460s | 1.2115s | 0.8254 Ops/s | 0.8248 Ops/s | |
test_reshape_pytree | 88.8420μs | 35.6505μs | 28.0501 KOps/s | 27.5950 KOps/s | |
test_reshape_td | 0.1714ms | 43.4796μs | 22.9993 KOps/s | 21.7794 KOps/s | |
test_view_pytree | 0.1763ms | 35.6087μs | 28.0830 KOps/s | 27.3166 KOps/s | |
test_view_td | 79.0720μs | 47.1724μs | 21.1988 KOps/s | 20.6030 KOps/s | |
test_unbind_pytree | 89.0220μs | 34.6308μs | 28.8760 KOps/s | 28.3870 KOps/s | |
test_unbind_td | 0.5006ms | 42.7519μs | 23.3907 KOps/s | 22.6176 KOps/s | |
test_split_pytree | 0.1769ms | 46.0043μs | 21.7371 KOps/s | 20.1250 KOps/s | |
test_split_td | 0.6701ms | 58.8897μs | 16.9809 KOps/s | 17.8946 KOps/s | |
test_add_pytree | 0.2360ms | 61.5525μs | 16.2463 KOps/s | 17.7127 KOps/s | |
test_add_td | 0.2731ms | 97.0176μs | 10.3074 KOps/s | 10.2045 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4166ms | 0.2100ms | 4.7630 KOps/s | 4.7026 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2808ms | 0.1514ms | 6.6046 KOps/s | 6.5868 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2935ms | 0.1461ms | 6.8456 KOps/s | 6.7985 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3340ms | 0.1850ms | 5.4048 KOps/s | 5.5026 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1444ms | 21.1451μs | 47.2924 KOps/s | 44.4921 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 85.1820μs | 44.5189μs | 22.4624 KOps/s | 22.6078 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2227ms | 64.5306μs | 15.4965 KOps/s | 15.5897 KOps/s | |
test_compile_copy_nested[pytree-eager] | 89.8020μs | 50.0569μs | 19.9773 KOps/s | 20.3018 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4206ms | 0.3199ms | 3.1263 KOps/s | 3.1307 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3540ms | 0.2106ms | 4.7484 KOps/s | 4.8707 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2302ms | 0.1292ms | 7.7398 KOps/s | 7.4052 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2084ms | 61.2883μs | 16.3163 KOps/s | 15.7440 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4800ms | 0.3229ms | 3.0974 KOps/s | 3.0876 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8183ms | 0.6332ms | 1.5793 KOps/s | 1.5765 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3823ms | 0.2487ms | 4.0216 KOps/s | 3.9920 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3853ms | 0.3221ms | 3.1042 KOps/s | 3.0315 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1180ms | 71.6154μs | 13.9635 KOps/s | 13.4969 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2773ms | 0.1292ms | 7.7400 KOps/s | 7.2850 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6991ms | 0.5321ms | 1.8794 KOps/s | 1.8755 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4630ms | 0.3208ms | 3.1167 KOps/s | 3.1030 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1031ms | 17.9203μs | 55.8026 KOps/s | 50.8272 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 78.6620μs | 27.3252μs | 36.5962 KOps/s | 37.5049 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.2443ms | 70.3737μs | 14.2099 KOps/s | 14.1603 KOps/s | |
test_compile_copy_flat[pytree-eager] | 91.9620μs | 51.2604μs | 19.5082 KOps/s | 19.4659 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.3481ms | 0.8110ms | 1.2330 KOps/s | 1.1403 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.5407ms | 3.1284ms | 319.6547 Ops/s | 319.5217 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2898ms | 0.8103ms | 1.2341 KOps/s | 1.1448 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4604ms | 3.2118ms | 311.3516 Ops/s | 312.2448 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2611ms | 0.1139ms | 8.7834 KOps/s | 8.8277 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1996ms | 64.0646μs | 15.6093 KOps/s | 15.7103 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2406ms | 0.1029ms | 9.7137 KOps/s | 9.5862 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1938ms | 42.6331μs | 23.4560 KOps/s | 23.3071 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2573ms | 0.1047ms | 9.5473 KOps/s | 9.5385 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2337ms | 42.5867μs | 23.4815 KOps/s | 23.4033 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1885ms | 0.1379ms | 7.2530 KOps/s | 7.2839 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1747ms | 24.5228μs | 40.7784 KOps/s | 38.7773 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2884ms | 0.1325ms | 7.5473 KOps/s | 7.5810 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1627ms | 20.6746μs | 48.3685 KOps/s | 47.7311 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2941ms | 0.1329ms | 7.5233 KOps/s | 7.5294 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.2100ms | 20.8335μs | 47.9997 KOps/s | 48.3866 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2924ms | 0.1383ms | 7.2331 KOps/s | 7.2269 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4891ms | 24.2426μs | 41.2497 KOps/s | 39.3580 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2550ms | 0.1331ms | 7.5117 KOps/s | 7.5298 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.2094ms | 23.2299μs | 43.0480 KOps/s | 48.4464 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2562ms | 0.1317ms | 7.5903 KOps/s | 7.5477 KOps/s | |
test_compile_indexing[int-pytree-eager] | 60.1710μs | 20.0403μs | 49.8995 KOps/s | 48.2208 KOps/s | |
test_mod_add[eager] | 0.1737ms | 31.2899μs | 31.9592 KOps/s | 29.8687 KOps/s | |
test_mod_add[compile] | 0.1914ms | 69.1623μs | 14.4587 KOps/s | 14.1144 KOps/s | |
test_mod_add[compile-overhead] | 0.2615ms | 0.1362ms | 7.3408 KOps/s | 5.9002 KOps/s | |
test_mod_wrap[eager] | 0.4118ms | 0.2410ms | 4.1494 KOps/s | 4.2029 KOps/s | |
test_mod_wrap[compile] | 0.6896ms | 0.2953ms | 3.3862 KOps/s | 3.3393 KOps/s | |
test_mod_wrap[compile-overhead] | 7.4766ms | 4.0331ms | 247.9459 Ops/s | 249.6978 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5536ms | 1.3212ms | 756.8795 Ops/s | 716.8600 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5792ms | 1.3197ms | 757.7387 Ops/s | 703.8302 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.3341ms | 0.8986ms | 1.1128 KOps/s | 988.0595 Ops/s | |
test_seq_add[eager] | 0.4852ms | 96.6432μs | 10.3473 KOps/s | 9.8457 KOps/s | |
test_seq_add[compile] | 0.4647ms | 80.4341μs | 12.4325 KOps/s | 12.2483 KOps/s | |
test_seq_add[compile-overhead] | 0.1505ms | 0.1149ms | 8.7007 KOps/s | 8.5337 KOps/s | |
test_seq_wrap[eager] | 0.7661ms | 0.3709ms | 2.6959 KOps/s | 2.5803 KOps/s | |
test_seq_wrap[compile] | 0.7060ms | 0.3130ms | 3.1950 KOps/s | 3.1497 KOps/s | |
test_seq_wrap[compile-overhead] | 0.6099ms | 0.2184ms | 4.5780 KOps/s | 4.5156 KOps/s | |
test_func_call_runtime[False-eager] | 1.1249ms | 0.7222ms | 1.3847 KOps/s | 1.3988 KOps/s | |
test_func_call_runtime[False-compile] | 1.1844ms | 0.7920ms | 1.2625 KOps/s | 1.2619 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5119ms | 0.3608ms | 2.7717 KOps/s | 2.7527 KOps/s | |
test_func_call_runtime[True-eager] | 1.2890ms | 0.8867ms | 1.1278 KOps/s | 1.1387 KOps/s | |
test_func_call_runtime[True-compile] | 1.2357ms | 0.8285ms | 1.2070 KOps/s | 1.2062 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5363ms | 0.3948ms | 2.5331 KOps/s | 2.5129 KOps/s | |
test_func_call_cm_runtime[False-eager] | 1.1336ms | 0.7297ms | 1.3705 KOps/s | 1.4136 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.2045ms | 0.7950ms | 1.2578 KOps/s | 1.2686 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4782ms | 0.3629ms | 2.7557 KOps/s | 2.7368 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1609ms | 1.0163ms | 983.9434 Ops/s | 1.0165 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0056ms | 0.8515ms | 1.1745 KOps/s | 1.1710 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.5527ms | 0.4188ms | 2.3877 KOps/s | 2.3545 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.4515ms | 1.9901ms | 502.4933 Ops/s | 496.0132 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 1.0158ms | 0.8638ms | 1.1576 KOps/s | 1.1528 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5874ms | 0.4251ms | 2.3525 KOps/s | 2.3368 KOps/s | |
test_distributed | 2.8889ms | 0.2116ms | 4.7252 KOps/s | 8.8615 KOps/s | |
test_tdmodule | 33.1310μs | 14.0156μs | 71.3492 KOps/s | 61.0636 KOps/s | |
test_tdmodule_dispatch | 49.8410μs | 27.6418μs | 36.1771 KOps/s | 32.1850 KOps/s | |
test_tdseq | 35.1200μs | 14.8581μs | 67.3035 KOps/s | 57.4817 KOps/s | |
test_tdseq_dispatch | 58.2210μs | 30.3281μs | 32.9727 KOps/s | 27.8655 KOps/s | |
test_instantiation_functorch | 2.3705ms | 1.8515ms | 540.1103 Ops/s | 530.6703 Ops/s | |
test_instantiation_td | 1.7906ms | 1.1975ms | 835.0951 Ops/s | 824.4090 Ops/s | |
test_exec_functorch | 0.3345ms | 0.2081ms | 4.8050 KOps/s | 4.8705 KOps/s | |
test_exec_functional_call | 0.5913ms | 0.2057ms | 4.8611 KOps/s | 4.9613 KOps/s | |
test_exec_td | 0.2759ms | 0.2103ms | 4.7558 KOps/s | 4.5750 KOps/s | |
test_exec_td_decorator | 0.6472ms | 0.2547ms | 3.9260 KOps/s | 3.9398 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0710ms | 0.6694ms | 1.4938 KOps/s | 1.4415 KOps/s | |
test_vmap_mlp_speed[True-False] | 1.0771ms | 0.6682ms | 1.4966 KOps/s | 1.4009 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.9681ms | 0.5612ms | 1.7818 KOps/s | 1.7720 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.9679ms | 0.5667ms | 1.7646 KOps/s | 1.7772 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.3855ms | 0.6542ms | 1.5286 KOps/s | 1.5101 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8213ms | 0.6554ms | 1.5258 KOps/s | 1.5154 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7568ms | 0.5922ms | 1.6886 KOps/s | 1.7349 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7825ms | 0.6093ms | 1.6413 KOps/s | 1.7295 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.5552ms | 8.1074ms | 123.3442 Ops/s | 121.7504 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.2191ms | 8.0821ms | 123.7297 Ops/s | 121.9495 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.3083ms | 7.9268ms | 126.1539 Ops/s | 125.1864 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.2952ms | 7.9240ms | 126.1982 Ops/s | 125.1301 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.1490ms | 18.9490ms | 52.7733 Ops/s | 52.5887 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.7359ms | 19.0008ms | 52.6293 Ops/s | 52.4453 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5790ms | 18.8859ms | 52.9497 Ops/s | 53.0164 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.1916ms | 18.8553ms | 53.0354 Ops/s | 52.8592 Ops/s | |
test_to_module_speed[True] | 1.4631ms | 0.9617ms | 1.0398 KOps/s | 1.0613 KOps/s | |
test_to_module_speed[False] | 1.3224ms | 0.9333ms | 1.0715 KOps/s | 1.0889 KOps/s | |
test_tc_init | 0.4290ms | 34.1066μs | 29.3198 KOps/s | 26.9044 KOps/s | |
test_tc_init_nested | 0.2480ms | 68.6502μs | 14.5666 KOps/s | 13.7687 KOps/s | |
test_tc_first_layer_tensor | 3.3987μs | 0.6827μs | 1.4648 MOps/s | 1.5027 MOps/s | |
test_tc_first_layer_nontensor | 23.6810μs | 2.2414μs | 446.1480 KOps/s | 447.3710 KOps/s | |
test_tc_second_layer_tensor | 97.0648μs | 1.3543μs | 738.3664 KOps/s | 738.8531 KOps/s | |
test_tc_second_layer_nontensor | 0.3870ms | 2.9456μs | 339.4935 KOps/s | 341.5296 KOps/s | |
test_unbind | 0.1962s | 12.0421ms | 83.0423 Ops/s | 92.1236 Ops/s | |
test_full_like | 0.7906ms | 0.5755ms | 1.7375 KOps/s | 1.7379 KOps/s | |
test_zeros_like | 0.3535ms | 0.1979ms | 5.0527 KOps/s | 5.0476 KOps/s | |
test_ones_like | 0.5913ms | 0.1978ms | 5.0544 KOps/s | 5.0536 KOps/s | |
test_clone | 0.7150ms | 0.4145ms | 2.4127 KOps/s | 2.4173 KOps/s | |
test_squeeze | 36.3210μs | 9.8736μs | 101.2800 KOps/s | 102.2895 KOps/s | |
test_unsqueeze | 0.4569ms | 75.1560μs | 13.3057 KOps/s | 13.2662 KOps/s | |
test_split | 0.5328ms | 0.1580ms | 6.3305 KOps/s | 6.3320 KOps/s | |
test_permute | 0.3087ms | 0.1790ms | 5.5869 KOps/s | 5.3172 KOps/s | |
test_stack | 1.3834ms | 0.8468ms | 1.1809 KOps/s | 1.1793 KOps/s | |
test_cat | 1.3752ms | 1.2311ms | 812.2609 Ops/s | 811.7012 Ops/s |
vmoens added a commit that referenced this pull request on Sep 17, 2024
ghstack-source-id: d312fc1dee177275a73482210c1ecfbe73b04f9e
Pull Request resolved: #991
Labels: CLA Signed, enhancement
Stack from ghstack (oldest at bottom):
selected_out_keys arg in TDS constructor #993
inplace arg in TDM constructor #992