[Feature] torch.export and onnx compatibility #991

Merged (8 commits) on Sep 17, 2024


[ghstack-poisoned]
@facebook-github-bot added the "CLA Signed" label Sep 16, 2024
@vmoens added the "enhancement" label Sep 16, 2024

github-actions bot commented Sep 16, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}11$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 53.5000μs 20.0270μs 49.9326 KOps/s 50.3315 KOps/s $\color{#d91a1a}-0.79\%$
test_plain_set_stack_nested 40.9770μs 20.0123μs 49.9694 KOps/s 49.8753 KOps/s $\color{#35bf28}+0.19\%$
test_plain_set_nested_inplace 59.1800μs 21.6906μs 46.1029 KOps/s 46.4534 KOps/s $\color{#d91a1a}-0.75\%$
test_plain_set_stack_nested_inplace 76.1660μs 21.8794μs 45.7051 KOps/s 46.8221 KOps/s $\color{#d91a1a}-2.39\%$
test_items 15.7590μs 4.1863μs 238.8730 KOps/s 231.3909 KOps/s $\color{#35bf28}+3.23\%$
test_items_nested 0.7628ms 0.3553ms 2.8149 KOps/s 2.7864 KOps/s $\color{#35bf28}+1.02\%$
test_items_nested_locked 0.6518ms 0.3591ms 2.7850 KOps/s 2.7742 KOps/s $\color{#35bf28}+0.39\%$
test_items_nested_leaf 0.1290ms 69.2831μs 14.4335 KOps/s 14.5852 KOps/s $\color{#d91a1a}-1.04\%$
test_items_stack_nested 0.6761ms 0.3618ms 2.7638 KOps/s 2.7680 KOps/s $\color{#d91a1a}-0.15\%$
test_items_stack_nested_leaf 0.1253ms 71.0421μs 14.0762 KOps/s 14.0234 KOps/s $\color{#35bf28}+0.38\%$
test_items_stack_nested_locked 0.6852ms 0.3612ms 2.7688 KOps/s 2.7548 KOps/s $\color{#35bf28}+0.51\%$
test_keys 30.7580μs 3.5175μs 284.2914 KOps/s 278.6451 KOps/s $\color{#35bf28}+2.03\%$
test_keys_nested 0.2027ms 99.9338μs 10.0066 KOps/s 10.2073 KOps/s $\color{#d91a1a}-1.97\%$
test_keys_nested_locked 0.8084ms 0.1059ms 9.4384 KOps/s 9.5028 KOps/s $\color{#d91a1a}-0.68\%$
test_keys_nested_leaf 0.1454ms 82.4196μs 12.1330 KOps/s 11.7037 KOps/s $\color{#35bf28}+3.67\%$
test_keys_stack_nested 0.1758ms 99.5402μs 10.0462 KOps/s 10.1548 KOps/s $\color{#d91a1a}-1.07\%$
test_keys_stack_nested_leaf 0.1441ms 82.2764μs 12.1542 KOps/s 11.8336 KOps/s $\color{#35bf28}+2.71\%$
test_keys_stack_nested_locked 0.1969ms 0.1047ms 9.5507 KOps/s 9.4298 KOps/s $\color{#35bf28}+1.28\%$
test_values 9.0950μs 1.0707μs 933.9812 KOps/s 913.9490 KOps/s $\color{#35bf28}+2.19\%$
test_values_nested 0.1755ms 72.3335μs 13.8249 KOps/s 13.9727 KOps/s $\color{#d91a1a}-1.06\%$
test_values_nested_locked 0.1211ms 71.9932μs 13.8902 KOps/s 13.8653 KOps/s $\color{#35bf28}+0.18\%$
test_values_nested_leaf 0.1190ms 60.9661μs 16.4026 KOps/s 16.1964 KOps/s $\color{#35bf28}+1.27\%$
test_values_stack_nested 0.1347ms 73.5834μs 13.5900 KOps/s 13.7656 KOps/s $\color{#d91a1a}-1.28\%$
test_values_stack_nested_leaf 0.1121ms 60.8482μs 16.4343 KOps/s 16.4843 KOps/s $\color{#d91a1a}-0.30\%$
test_values_stack_nested_locked 0.1287ms 73.5267μs 13.6005 KOps/s 13.3668 KOps/s $\color{#35bf28}+1.75\%$
test_membership 2.7792μs 0.6949μs 1.4391 MOps/s 1.4175 MOps/s $\color{#35bf28}+1.52\%$
test_membership_nested 23.7950μs 2.6945μs 371.1325 KOps/s 372.1803 KOps/s $\color{#d91a1a}-0.28\%$
test_membership_nested_leaf 44.2630μs 2.6849μs 372.4592 KOps/s 348.6172 KOps/s $\textbf{\color{#35bf28}+6.84\%}$
test_membership_stacked_nested 18.2150μs 2.7247μs 367.0167 KOps/s 374.5875 KOps/s $\color{#d91a1a}-2.02\%$
test_membership_stacked_nested_leaf 41.9780μs 2.7332μs 365.8676 KOps/s 368.3568 KOps/s $\color{#d91a1a}-0.68\%$
test_membership_nested_last 25.4370μs 3.9758μs 251.5220 KOps/s 252.2308 KOps/s $\color{#d91a1a}-0.28\%$
test_membership_nested_leaf_last 29.6550μs 3.9498μs 253.1773 KOps/s 256.6277 KOps/s $\color{#d91a1a}-1.34\%$
test_membership_stacked_nested_last 24.5960μs 4.5718μs 218.7313 KOps/s 255.2595 KOps/s $\textbf{\color{#d91a1a}-14.31\%}$
test_membership_stacked_nested_leaf_last 53.4310μs 4.6053μs 217.1404 KOps/s 256.9286 KOps/s $\textbf{\color{#d91a1a}-15.49\%}$
test_nested_getleaf 46.3160μs 10.7576μs 92.9576 KOps/s 96.2351 KOps/s $\color{#d91a1a}-3.41\%$
test_nested_get 56.2050μs 10.1367μs 98.6511 KOps/s 100.4158 KOps/s $\color{#d91a1a}-1.76\%$
test_stacked_getleaf 49.5330μs 10.7084μs 93.3849 KOps/s 95.2779 KOps/s $\color{#d91a1a}-1.99\%$
test_stacked_get 51.9670μs 10.2445μs 97.6130 KOps/s 99.3600 KOps/s $\color{#d91a1a}-1.76\%$
test_nested_getitemleaf 54.6920μs 11.1085μs 90.0209 KOps/s 91.4451 KOps/s $\color{#d91a1a}-1.56\%$
test_nested_getitem 51.4660μs 10.4329μs 95.8506 KOps/s 98.4986 KOps/s $\color{#d91a1a}-2.69\%$
test_stacked_getitemleaf 37.4800μs 11.1175μs 89.9480 KOps/s 90.5625 KOps/s $\color{#d91a1a}-0.68\%$
test_stacked_getitem 61.4150μs 10.2694μs 97.3762 KOps/s 98.7129 KOps/s $\color{#d91a1a}-1.35\%$
test_lock_nested 86.5067ms 0.5597ms 1.7867 KOps/s 2.0719 KOps/s $\textbf{\color{#d91a1a}-13.77\%}$
test_lock_stack_nested 0.8824ms 0.4443ms 2.2509 KOps/s 2.1672 KOps/s $\color{#35bf28}+3.86\%$
test_unlock_nested 91.2891ms 0.4852ms 2.0609 KOps/s 2.3904 KOps/s $\textbf{\color{#d91a1a}-13.78\%}$
test_unlock_stack_nested 0.7402ms 0.3645ms 2.7436 KOps/s 2.6700 KOps/s $\color{#35bf28}+2.75\%$
test_flatten_speed 0.1645ms 86.7254μs 11.5306 KOps/s 11.3644 KOps/s $\color{#35bf28}+1.46\%$
test_unflatten_speed 0.8444ms 0.4632ms 2.1590 KOps/s 2.1919 KOps/s $\color{#d91a1a}-1.50\%$
test_common_ops 3.7379ms 1.0703ms 934.2917 Ops/s 892.8694 Ops/s $\color{#35bf28}+4.64\%$
test_creation 31.5960μs 2.0404μs 490.1062 KOps/s 462.1679 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_creation_empty 60.7330μs 17.2881μs 57.8432 KOps/s 58.2744 KOps/s $\color{#d91a1a}-0.74\%$
test_creation_nested_1 57.3770μs 20.0546μs 49.8640 KOps/s 48.1107 KOps/s $\color{#35bf28}+3.64\%$
test_creation_nested_2 81.4120μs 24.2788μs 41.1881 KOps/s 40.5068 KOps/s $\color{#35bf28}+1.68\%$
test_clone 0.1970ms 17.0227μs 58.7451 KOps/s 60.0009 KOps/s $\color{#d91a1a}-2.09\%$
test_getitem[int] 0.8823ms 16.6224μs 60.1600 KOps/s 60.0495 KOps/s $\color{#35bf28}+0.18\%$
test_getitem[slice_int] 0.1296ms 30.8158μs 32.4509 KOps/s 33.7901 KOps/s $\color{#d91a1a}-3.96\%$
test_getitem[range] 0.1651ms 56.8843μs 17.5795 KOps/s 17.0890 KOps/s $\color{#35bf28}+2.87\%$
test_getitem[tuple] 0.1878ms 25.2537μs 39.5981 KOps/s 40.6235 KOps/s $\color{#d91a1a}-2.52\%$
test_getitem[list] 0.2557ms 52.7036μs 18.9740 KOps/s 18.2810 KOps/s $\color{#35bf28}+3.79\%$
test_setitem_dim[int] 55.6040μs 32.4122μs 30.8526 KOps/s 32.2152 KOps/s $\color{#d91a1a}-4.23\%$
test_setitem_dim[slice_int] 0.1303ms 60.3778μs 16.5624 KOps/s 16.5967 KOps/s $\color{#d91a1a}-0.21\%$
test_setitem_dim[range] 0.1370ms 83.7550μs 11.9396 KOps/s 11.9318 KOps/s $\color{#35bf28}+0.07\%$
test_setitem_dim[tuple] 88.8860μs 48.5941μs 20.5786 KOps/s 21.2139 KOps/s $\color{#d91a1a}-2.99\%$
test_setitem 75.9320μs 28.4753μs 35.1181 KOps/s 34.5569 KOps/s $\color{#35bf28}+1.62\%$
test_set 94.2060μs 28.1090μs 35.5758 KOps/s 35.9518 KOps/s $\color{#d91a1a}-1.05\%$
test_set_shared 2.9269ms 0.2118ms 4.7206 KOps/s 4.7160 KOps/s $\color{#35bf28}+0.10\%$
test_update 0.1410ms 34.0316μs 29.3845 KOps/s 28.8483 KOps/s $\color{#35bf28}+1.86\%$
test_update_nested 0.1982ms 44.7098μs 22.3665 KOps/s 22.3177 KOps/s $\color{#35bf28}+0.22\%$
test_update__nested 77.1740μs 33.2666μs 30.0602 KOps/s 29.7119 KOps/s $\color{#35bf28}+1.17\%$
test_set_nested 0.1158ms 30.5778μs 32.7035 KOps/s 32.4535 KOps/s $\color{#35bf28}+0.77\%$
test_set_nested_new 0.1675ms 35.0746μs 28.5107 KOps/s 27.7294 KOps/s $\color{#35bf28}+2.82\%$
test_select 0.1310ms 52.4116μs 19.0797 KOps/s 18.7768 KOps/s $\color{#35bf28}+1.61\%$
test_select_nested 0.1299ms 60.1066μs 16.6371 KOps/s 17.0168 KOps/s $\color{#d91a1a}-2.23\%$
test_exclude_nested 0.1455ms 74.4988μs 13.4230 KOps/s 13.3363 KOps/s $\color{#35bf28}+0.65\%$
test_empty[True] 0.4573ms 0.3157ms 3.1672 KOps/s 3.1940 KOps/s $\color{#d91a1a}-0.84\%$
test_empty[False] 10.0632μs 1.2004μs 833.0321 KOps/s 811.9339 KOps/s $\color{#35bf28}+2.60\%$
test_unbind_speed 0.4486ms 0.2975ms 3.3610 KOps/s 3.3279 KOps/s $\color{#35bf28}+1.00\%$
test_unbind_speed_stack0 0.3956ms 0.2895ms 3.4540 KOps/s 3.4097 KOps/s $\color{#35bf28}+1.30\%$
test_unbind_speed_stack1 98.4096ms 0.8017ms 1.2473 KOps/s 1.3420 KOps/s $\textbf{\color{#d91a1a}-7.05\%}$
test_split 92.6644ms 2.2092ms 452.6428 Ops/s 455.3096 Ops/s $\color{#d91a1a}-0.59\%$
test_chunk 3.2352ms 2.0236ms 494.1765 Ops/s 458.4457 Ops/s $\textbf{\color{#35bf28}+7.79\%}$
test_creation[device0] 0.2295ms 0.1157ms 8.6435 KOps/s 8.3425 KOps/s $\color{#35bf28}+3.61\%$
test_creation_from_tensor 3.1945ms 0.1159ms 8.6275 KOps/s 8.5507 KOps/s $\color{#35bf28}+0.90\%$
test_add_one[memmap_tensor0] 0.3484ms 7.3237μs 136.5436 KOps/s 142.8680 KOps/s $\color{#d91a1a}-4.43\%$
test_contiguous[memmap_tensor0] 28.5730μs 1.9277μs 518.7409 KOps/s 502.3206 KOps/s $\color{#35bf28}+3.27\%$
test_stack[memmap_tensor0] 54.3920μs 5.5374μs 180.5893 KOps/s 177.3932 KOps/s $\color{#35bf28}+1.80\%$
test_memmaptd_index 1.2375ms 0.4001ms 2.4996 KOps/s 2.5708 KOps/s $\color{#d91a1a}-2.77\%$
test_memmaptd_index_astensor 0.7627ms 0.4807ms 2.0802 KOps/s 2.1474 KOps/s $\color{#d91a1a}-3.13\%$
test_memmaptd_index_op 1.4637ms 0.9933ms 1.0068 KOps/s 1.0275 KOps/s $\color{#d91a1a}-2.01\%$
test_serialize_model 0.2175s 0.1314s 7.6091 Ops/s 8.4033 Ops/s $\textbf{\color{#d91a1a}-9.45\%}$
test_serialize_model_pickle 0.4649s 0.3955s 2.5286 Ops/s 2.4905 Ops/s $\color{#35bf28}+1.53\%$
test_serialize_weights 0.1257s 0.1169s 8.5565 Ops/s 7.2941 Ops/s $\textbf{\color{#35bf28}+17.31\%}$
test_serialize_weights_returnearly 0.1719s 0.1590s 6.2910 Ops/s 6.1573 Ops/s $\color{#35bf28}+2.17\%$
test_serialize_weights_pickle 0.6025s 0.4249s 2.3537 Ops/s 2.1899 Ops/s $\textbf{\color{#35bf28}+7.48\%}$
test_serialize_weights_filesystem 0.2338s 0.1557s 6.4210 Ops/s 6.9030 Ops/s $\textbf{\color{#d91a1a}-6.98\%}$
test_serialize_model_filesystem 0.1562s 0.1466s 6.8191 Ops/s 5.7571 Ops/s $\textbf{\color{#35bf28}+18.45\%}$
test_reshape_pytree 84.0780μs 38.2765μs 26.1257 KOps/s 26.1273 KOps/s $-0.01\%$
test_reshape_td 0.1034ms 45.3401μs 22.0555 KOps/s 22.1134 KOps/s $\color{#d91a1a}-0.26\%$
test_view_pytree 92.5630μs 37.6438μs 26.5648 KOps/s 26.7467 KOps/s $\color{#d91a1a}-0.68\%$
test_view_td 0.1042ms 51.9202μs 19.2603 KOps/s 19.3032 KOps/s $\color{#d91a1a}-0.22\%$
test_unbind_pytree 82.9060μs 35.2898μs 28.3368 KOps/s 27.6940 KOps/s $\color{#35bf28}+2.32\%$
test_unbind_td 0.3199ms 44.0870μs 22.6824 KOps/s 22.0782 KOps/s $\color{#35bf28}+2.74\%$
test_split_pytree 85.6800μs 37.4715μs 26.6870 KOps/s 26.8687 KOps/s $\color{#d91a1a}-0.68\%$
test_split_td 0.5214ms 59.7166μs 16.7458 KOps/s 17.6112 KOps/s $\color{#d91a1a}-4.91\%$
test_add_pytree 98.9050μs 43.5987μs 22.9365 KOps/s 23.5686 KOps/s $\color{#d91a1a}-2.68\%$
test_add_td 0.1762ms 76.4429μs 13.0817 KOps/s 12.5997 KOps/s $\color{#35bf28}+3.82\%$
test_compile_add_one_nested[tensordict-compile] 0.2231ms 56.8378μs 17.5939 KOps/s 17.7058 KOps/s $\color{#d91a1a}-0.63\%$
test_compile_add_one_nested[tensordict-eager] 0.3136ms 0.1763ms 5.6735 KOps/s 5.6968 KOps/s $\color{#d91a1a}-0.41\%$
test_compile_add_one_nested[pytree-compile] 0.1370ms 55.4243μs 18.0426 KOps/s 17.8919 KOps/s $\color{#35bf28}+0.84\%$
test_compile_add_one_nested[pytree-eager] 0.2729ms 0.1383ms 7.2325 KOps/s 7.3191 KOps/s $\color{#d91a1a}-1.18\%$
test_compile_copy_nested[tensordict-compile] 48.2410μs 21.2249μs 47.1144 KOps/s 46.6351 KOps/s $\color{#35bf28}+1.03\%$
test_compile_copy_nested[tensordict-eager] 0.1841ms 66.4408μs 15.0510 KOps/s 15.1571 KOps/s $\color{#d91a1a}-0.70\%$
test_compile_copy_nested[pytree-compile] 0.1288ms 74.2630μs 13.4657 KOps/s 13.3779 KOps/s $\color{#35bf28}+0.66\%$
test_compile_copy_nested[pytree-eager] 0.1147ms 67.0376μs 14.9170 KOps/s 14.8675 KOps/s $\color{#35bf28}+0.33\%$
test_compile_add_one_flat[tensordict-compile] 0.2560ms 0.1721ms 5.8123 KOps/s 5.8159 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_one_flat[tensordict-eager] 0.2706ms 0.1869ms 5.3500 KOps/s 5.3821 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_add_one_flat[tensorclass-compile] 0.1112ms 46.6167μs 21.4515 KOps/s 21.0374 KOps/s $\color{#35bf28}+1.97\%$
test_compile_add_one_flat[tensorclass-eager] 0.1516ms 68.0439μs 14.6964 KOps/s 15.0705 KOps/s $\color{#d91a1a}-2.48\%$
test_compile_add_one_flat[pytree-compile] 0.3593ms 0.1759ms 5.6846 KOps/s 5.7623 KOps/s $\color{#d91a1a}-1.35\%$
test_compile_add_one_flat[pytree-eager] 0.4718ms 0.2825ms 3.5393 KOps/s 3.6060 KOps/s $\color{#d91a1a}-1.85\%$
test_compile_add_self_flat[tensordict-eager] 0.2889ms 0.1991ms 5.0218 KOps/s 4.9617 KOps/s $\color{#35bf28}+1.21\%$
test_compile_add_self_flat[tensordict-compile] 0.3764ms 0.1731ms 5.7763 KOps/s 5.7255 KOps/s $\color{#35bf28}+0.89\%$
test_compile_add_self_flat[tensorclass-eager] 0.1123ms 61.1688μs 16.3482 KOps/s 16.2519 KOps/s $\color{#35bf28}+0.59\%$
test_compile_add_self_flat[tensorclass-compile] 97.3310μs 45.8669μs 21.8022 KOps/s 20.8953 KOps/s $\color{#35bf28}+4.34\%$
test_compile_add_self_flat[pytree-eager] 0.4198ms 0.2294ms 4.3596 KOps/s 4.3222 KOps/s $\color{#35bf28}+0.87\%$
test_compile_add_self_flat[pytree-compile] 0.2808ms 0.1745ms 5.7301 KOps/s 5.6824 KOps/s $\color{#35bf28}+0.84\%$
test_compile_copy_flat[tensordict-compile] 0.2738ms 0.1037ms 9.6410 KOps/s 9.7598 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_copy_flat[tensordict-eager] 0.1235ms 57.3870μs 17.4256 KOps/s 17.1312 KOps/s $\color{#35bf28}+1.72\%$
test_compile_copy_flat[pytree-compile] 0.1663ms 75.1391μs 13.3086 KOps/s 12.9856 KOps/s $\color{#35bf28}+2.49\%$
test_compile_copy_flat[pytree-eager] 0.1506ms 68.1921μs 14.6644 KOps/s 14.6457 KOps/s $\color{#35bf28}+0.13\%$
test_compile_assign_and_add[tensordict-compile] 0.3651ms 0.1914ms 5.2252 KOps/s 5.0641 KOps/s $\color{#35bf28}+3.18\%$
test_compile_assign_and_add[tensordict-eager] 2.1171ms 1.6260ms 614.9934 Ops/s 613.2288 Ops/s $\color{#35bf28}+0.29\%$
test_compile_assign_and_add[pytree-compile] 0.2977ms 0.1881ms 5.3172 KOps/s 5.0118 KOps/s $\textbf{\color{#35bf28}+6.09\%}$
test_compile_assign_and_add[pytree-eager] 1.7213ms 1.0667ms 937.4997 Ops/s 923.7373 Ops/s $\color{#35bf28}+1.49\%$
test_compile_assign_and_add_stack[compile] 0.5341ms 0.4121ms 2.4268 KOps/s 2.3958 KOps/s $\color{#35bf28}+1.30\%$
test_compile_assign_and_add_stack[eager] 5.8792ms 3.7303ms 268.0718 Ops/s 277.2219 Ops/s $\color{#d91a1a}-3.30\%$
test_compile_indexing[tensor-tensordict-compile] 83.5060μs 33.5779μs 29.7815 KOps/s 29.8112 KOps/s $\color{#d91a1a}-0.10\%$
test_compile_indexing[tensor-tensordict-eager] 0.6727ms 48.4769μs 20.6284 KOps/s 20.8211 KOps/s $\color{#d91a1a}-0.93\%$
test_compile_indexing[tensor-tensorclass-compile] 91.4910μs 29.3611μs 34.0586 KOps/s 34.4776 KOps/s $\color{#d91a1a}-1.22\%$
test_compile_indexing[tensor-tensorclass-eager] 83.1850μs 29.1070μs 34.3560 KOps/s 34.7207 KOps/s $\color{#d91a1a}-1.05\%$
test_compile_indexing[tensor-pytree-compile] 0.1005ms 29.0259μs 34.4520 KOps/s 35.0314 KOps/s $\color{#d91a1a}-1.65\%$
test_compile_indexing[tensor-pytree-eager] 85.9110μs 28.7247μs 34.8133 KOps/s 34.3810 KOps/s $\color{#35bf28}+1.26\%$
test_compile_indexing[slice-tensordict-compile] 0.2440ms 73.2402μs 13.6537 KOps/s 13.8663 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_indexing[slice-tensordict-eager] 0.5378ms 28.4201μs 35.1863 KOps/s 35.8351 KOps/s $\color{#d91a1a}-1.81\%$
test_compile_indexing[slice-tensorclass-compile] 0.1273ms 66.5784μs 15.0199 KOps/s 14.8823 KOps/s $\color{#35bf28}+0.92\%$
test_compile_indexing[slice-tensorclass-eager] 80.2200μs 23.0931μs 43.3030 KOps/s 42.7301 KOps/s $\color{#35bf28}+1.34\%$
test_compile_indexing[slice-pytree-compile] 0.1860ms 67.4268μs 14.8309 KOps/s 14.9917 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_indexing[slice-pytree-eager] 75.9520μs 23.0080μs 43.4632 KOps/s 43.5305 KOps/s $\color{#d91a1a}-0.15\%$
test_compile_indexing[int-tensordict-compile] 0.1398ms 71.1691μs 14.0510 KOps/s 13.7582 KOps/s $\color{#35bf28}+2.13\%$
test_compile_indexing[int-tensordict-eager] 0.9767ms 27.9445μs 35.7853 KOps/s 36.3398 KOps/s $\color{#d91a1a}-1.53\%$
test_compile_indexing[int-tensorclass-compile] 0.1276ms 66.2001μs 15.1057 KOps/s 14.9194 KOps/s $\color{#35bf28}+1.25\%$
test_compile_indexing[int-tensorclass-eager] 68.5790μs 22.3643μs 44.7141 KOps/s 43.1113 KOps/s $\color{#35bf28}+3.72\%$
test_compile_indexing[int-pytree-compile] 0.1672ms 66.4543μs 15.0479 KOps/s 14.9477 KOps/s $\color{#35bf28}+0.67\%$
test_compile_indexing[int-pytree-eager] 82.7670μs 22.6958μs 44.0611 KOps/s 43.3036 KOps/s $\color{#35bf28}+1.75\%$
test_mod_add[eager] 0.1215ms 24.5286μs 40.7688 KOps/s 42.3595 KOps/s $\color{#d91a1a}-3.76\%$
test_mod_add[compile] 0.1113ms 39.1095μs 25.5692 KOps/s 27.6964 KOps/s $\textbf{\color{#d91a1a}-7.68\%}$
test_mod_add[compile-overhead] 0.1031ms 38.0533μs 26.2789 KOps/s 27.4023 KOps/s $\color{#d91a1a}-4.10\%$
test_mod_wrap[eager] 0.4267ms 0.2020ms 4.9505 KOps/s 4.9436 KOps/s $\color{#35bf28}+0.14\%$
test_mod_wrap[compile] 0.4541ms 0.2268ms 4.4097 KOps/s 4.4040 KOps/s $\color{#35bf28}+0.13\%$
test_mod_wrap[compile-overhead] 0.3066ms 0.2239ms 4.4664 KOps/s 4.4132 KOps/s $\color{#35bf28}+1.21\%$
test_mod_wrap_and_backward[eager] 12.3678ms 10.6438ms 93.9512 Ops/s 87.2255 Ops/s $\textbf{\color{#35bf28}+7.71\%}$
test_mod_wrap_and_backward[compile] 12.1259ms 10.8964ms 91.7736 Ops/s 80.7164 Ops/s $\textbf{\color{#35bf28}+13.70\%}$
test_mod_wrap_and_backward[compile-overhead] 12.0500ms 10.8419ms 92.2348 Ops/s 78.3017 Ops/s $\textbf{\color{#35bf28}+17.79\%}$
test_seq_add[eager] 0.2033ms 88.2239μs 11.3348 KOps/s 11.5466 KOps/s $\color{#d91a1a}-1.83\%$
test_seq_add[compile] 0.1274ms 63.0001μs 15.8730 KOps/s 16.2177 KOps/s $\color{#d91a1a}-2.13\%$
test_seq_add[compile-overhead] 0.1175ms 62.0104μs 16.1263 KOps/s 16.2071 KOps/s $\color{#d91a1a}-0.50\%$
test_seq_wrap[eager] 0.6465ms 0.3737ms 2.6758 KOps/s 2.6616 KOps/s $\color{#35bf28}+0.53\%$
test_seq_wrap[compile] 0.5051ms 0.2624ms 3.8105 KOps/s 3.7638 KOps/s $\color{#35bf28}+1.24\%$
test_seq_wrap[compile-overhead] 0.5104ms 0.2637ms 3.7915 KOps/s 3.7794 KOps/s $\color{#35bf28}+0.32\%$
test_func_call_runtime[False-eager] 0.7864ms 0.5176ms 1.9319 KOps/s 1.9836 KOps/s $\color{#d91a1a}-2.61\%$
test_func_call_runtime[False-compile] 0.9333ms 0.4898ms 2.0416 KOps/s 2.0414 KOps/s $+0.01\%$
test_func_call_runtime[False-compile-overhead] 0.6660ms 0.4879ms 2.0497 KOps/s 2.0319 KOps/s $\color{#35bf28}+0.88\%$
test_func_call_runtime[True-eager] 1.1893ms 0.7318ms 1.3664 KOps/s 1.3936 KOps/s $\color{#d91a1a}-1.95\%$
test_func_call_runtime[True-compile] 0.9074ms 0.5014ms 1.9945 KOps/s 1.9751 KOps/s $\color{#35bf28}+0.98\%$
test_func_call_runtime[True-compile-overhead] 0.8071ms 0.4995ms 2.0021 KOps/s 1.9898 KOps/s $\color{#35bf28}+0.62\%$
test_func_call_cm_runtime[False-eager] 0.7150ms 0.5174ms 1.9328 KOps/s 1.9903 KOps/s $\color{#d91a1a}-2.89\%$
test_func_call_cm_runtime[False-compile] 0.8142ms 0.5064ms 1.9746 KOps/s 2.0234 KOps/s $\color{#d91a1a}-2.41\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9258ms 0.4914ms 2.0348 KOps/s 2.0336 KOps/s $\color{#35bf28}+0.06\%$
test_func_call_cm_runtime[True-eager] 1.4459ms 0.8565ms 1.1675 KOps/s 1.1710 KOps/s $\color{#d91a1a}-0.30\%$
test_func_call_cm_runtime[True-compile] 0.8475ms 0.7272ms 1.3752 KOps/s 1.3731 KOps/s $\color{#35bf28}+0.15\%$
test_func_call_cm_runtime[True-compile-overhead] 1.4012ms 0.7375ms 1.3559 KOps/s 1.3605 KOps/s $\color{#d91a1a}-0.34\%$
test_vmap_func_call_cm_runtime[eager] 2.4036ms 1.8221ms 548.8167 Ops/s 540.0278 Ops/s $\color{#35bf28}+1.63\%$
test_vmap_func_call_cm_runtime[compile] 3.0303ms 1.8905ms 528.9665 Ops/s 523.7056 Ops/s $\color{#35bf28}+1.00\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.6655ms 1.8857ms 530.3204 Ops/s 524.5807 Ops/s $\color{#35bf28}+1.09\%$
test_distributed 0.2668ms 0.1235ms 8.0983 KOps/s 7.8322 KOps/s $\color{#35bf28}+3.40\%$
test_tdmodule 34.0250μs 17.3854μs 57.5195 KOps/s 58.1015 KOps/s $\color{#d91a1a}-1.00\%$
test_tdmodule_dispatch 65.3640μs 34.0685μs 29.3527 KOps/s 28.0191 KOps/s $\color{#35bf28}+4.76\%$
test_tdseq 48.5420μs 19.8723μs 50.3213 KOps/s 49.2960 KOps/s $\color{#35bf28}+2.08\%$
test_tdseq_dispatch 84.5430μs 39.1531μs 25.5408 KOps/s 24.5903 KOps/s $\color{#35bf28}+3.87\%$
test_instantiation_functorch 2.0270ms 1.5434ms 647.9335 Ops/s 629.0759 Ops/s $\color{#35bf28}+3.00\%$
test_instantiation_td 1.9145ms 1.1341ms 881.7715 Ops/s 868.4491 Ops/s $\color{#35bf28}+1.53\%$
test_exec_functorch 0.3340ms 0.1758ms 5.6878 KOps/s 5.5629 KOps/s $\color{#35bf28}+2.25\%$
test_exec_functional_call 0.3392ms 0.1714ms 5.8353 KOps/s 5.9026 KOps/s $\color{#d91a1a}-1.14\%$
test_exec_td 0.3254ms 0.1673ms 5.9759 KOps/s 5.9855 KOps/s $\color{#d91a1a}-0.16\%$
test_exec_td_decorator 0.9307ms 0.2166ms 4.6164 KOps/s 4.5776 KOps/s $\color{#35bf28}+0.85\%$
test_vmap_mlp_speed[True-True] 0.9298ms 0.6278ms 1.5929 KOps/s 1.5756 KOps/s $\color{#35bf28}+1.10\%$
test_vmap_mlp_speed[True-False] 0.8613ms 0.6267ms 1.5955 KOps/s 1.5805 KOps/s $\color{#35bf28}+0.95\%$
test_vmap_mlp_speed[False-True] 0.8066ms 0.4837ms 2.0672 KOps/s 2.0504 KOps/s $\color{#35bf28}+0.82\%$
test_vmap_mlp_speed[False-False] 0.7737ms 0.4857ms 2.0587 KOps/s 2.0330 KOps/s $\color{#35bf28}+1.26\%$
test_vmap_mlp_speed_decorator[True-True] 1.3534ms 0.5998ms 1.6672 KOps/s 1.5793 KOps/s $\textbf{\color{#35bf28}+5.56\%}$
test_vmap_mlp_speed_decorator[True-False] 0.9034ms 0.6053ms 1.6520 KOps/s 1.6352 KOps/s $\color{#35bf28}+1.03\%$
test_vmap_mlp_speed_decorator[False-True] 0.6789ms 0.4965ms 2.0141 KOps/s 1.9955 KOps/s $\color{#35bf28}+0.93\%$
test_vmap_mlp_speed_decorator[False-False] 0.6961ms 0.4964ms 2.0145 KOps/s 1.9821 KOps/s $\color{#35bf28}+1.64\%$
test_to_module_speed[True] 1.8736ms 1.2815ms 780.3559 Ops/s 775.7036 Ops/s $\color{#35bf28}+0.60\%$
test_to_module_speed[False] 1.4908ms 1.2627ms 791.9441 Ops/s 792.9546 Ops/s $\color{#d91a1a}-0.13\%$
test_tc_init 0.1078ms 41.4043μs 24.1521 KOps/s 22.5425 KOps/s $\textbf{\color{#35bf28}+7.14\%}$
test_tc_init_nested 0.1415ms 81.5599μs 12.2609 KOps/s 11.2806 KOps/s $\textbf{\color{#35bf28}+8.69\%}$
test_tc_first_layer_tensor 17.4430μs 1.5228μs 656.6838 KOps/s 658.0449 KOps/s $\color{#d91a1a}-0.21\%$
test_tc_first_layer_nontensor 20.9690μs 4.6953μs 212.9787 KOps/s 211.6949 KOps/s $\color{#35bf28}+0.61\%$
test_tc_second_layer_tensor 30.5480μs 2.8492μs 350.9771 KOps/s 349.8067 KOps/s $\color{#35bf28}+0.33\%$
test_tc_second_layer_nontensor 40.6770μs 6.0129μs 166.3085 KOps/s 164.8618 KOps/s $\color{#35bf28}+0.88\%$
test_unbind 0.4890s 15.2346ms 65.6401 Ops/s 75.3549 Ops/s $\textbf{\color{#d91a1a}-12.89\%}$
test_full_like 9.3031ms 7.4348ms 134.5031 Ops/s 128.7461 Ops/s $\color{#35bf28}+4.47\%$
test_zeros_like 13.4871ms 6.8376ms 146.2503 Ops/s 337.0434 Ops/s $\textbf{\color{#d91a1a}-56.61\%}$
test_ones_like 15.1644ms 8.1021ms 123.4247 Ops/s 150.4293 Ops/s $\textbf{\color{#d91a1a}-17.95\%}$
test_clone 14.4438ms 9.1353ms 109.4654 Ops/s 108.9289 Ops/s $\color{#35bf28}+0.49\%$
test_squeeze 70.2830μs 12.4154μs 80.5451 KOps/s 79.6245 KOps/s $\color{#35bf28}+1.16\%$
test_unsqueeze 0.3487ms 91.6932μs 10.9059 KOps/s 10.5484 KOps/s $\color{#35bf28}+3.39\%$
test_split 0.3899ms 0.1942ms 5.1489 KOps/s 5.0908 KOps/s $\color{#35bf28}+1.14\%$
test_permute 0.3231ms 0.2177ms 4.5936 KOps/s 4.5058 KOps/s $\color{#35bf28}+1.95\%$
test_stack 32.7349ms 25.5789ms 39.0947 Ops/s 38.9143 Ops/s $\color{#35bf28}+0.46\%$
test_cat 29.6561ms 25.3201ms 39.4943 Ops/s 38.9932 Ops/s $\color{#35bf28}+1.29\%$
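
For readers checking the numbers: the Change column appears to be the relative difference between the Ops column (this PR) and the Ops on Repo HEAD column, after both are converted to the same unit. The sketch below is not the benchmark bot's code; the helper names (`to_ops_per_second`, `relative_change_percent`) are hypothetical, and the formula is inferred from the table rows rather than taken from the CI source.

```python
# Multipliers for the throughput units that appear in the table.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def to_ops_per_second(cell: str) -> float:
    """Convert a table cell such as '49.9326 KOps/s' to plain ops/s."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change_percent(ops: str, ops_on_head: str) -> float:
    """Relative change of `ops` versus the `ops_on_head` baseline, in percent.

    Assumption: this is how the Change column is derived; it matches the
    rows tried below but is inferred from the data, not from the bot.
    """
    current = to_ops_per_second(ops)
    baseline = to_ops_per_second(ops_on_head)
    return (current - baseline) / baseline * 100.0


if __name__ == "__main__":
    # (Ops, Ops on Repo HEAD) pairs copied from the CPU table above.
    rows = {
        "test_plain_set_nested": ("49.9326 KOps/s", "50.3315 KOps/s"),
        "test_zeros_like": ("146.2503 Ops/s", "337.0434 Ops/s"),
        "test_serialize_model_filesystem": ("6.8191 Ops/s", "5.7571 Ops/s"),
    }
    for name, (ops, head) in rows.items():
        print(f"{name}: {relative_change_percent(ops, head):+.2f}%")
```

Rounded to two decimals, these three rows come out at -0.79%, -56.61% and +18.45%, matching the table. Converting units first matters for rows where the two columns differ in scale (for example KOps/s against Ops/s).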


github-actions bot commented Sep 16, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}37$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1393ms 13.5180μs 73.9756 KOps/s 67.2163 KOps/s $\textbf{\color{#35bf28}+10.06\%}$
test_plain_set_stack_nested 37.9510μs 13.8008μs 72.4597 KOps/s 66.8901 KOps/s $\textbf{\color{#35bf28}+8.33\%}$
test_plain_set_nested_inplace 43.9710μs 14.7247μs 67.9133 KOps/s 62.4165 KOps/s $\textbf{\color{#35bf28}+8.81\%}$
test_plain_set_stack_nested_inplace 43.2010μs 14.4941μs 68.9935 KOps/s 62.0576 KOps/s $\textbf{\color{#35bf28}+11.18\%}$
test_items 28.1110μs 2.8884μs 346.2116 KOps/s 342.0922 KOps/s $\color{#35bf28}+1.20\%$
test_items_nested 0.4354ms 0.3281ms 3.0480 KOps/s 3.0880 KOps/s $\color{#d91a1a}-1.30\%$
test_items_nested_locked 0.3751ms 0.3286ms 3.0434 KOps/s 3.0799 KOps/s $\color{#d91a1a}-1.19\%$
test_items_nested_leaf 0.1871ms 55.7903μs 17.9243 KOps/s 17.9807 KOps/s $\color{#d91a1a}-0.31\%$
test_items_stack_nested 0.3703ms 0.3270ms 3.0586 KOps/s 3.0631 KOps/s $\color{#d91a1a}-0.15\%$
test_items_stack_nested_leaf 86.4320μs 56.9259μs 17.5667 KOps/s 17.6720 KOps/s $\color{#d91a1a}-0.60\%$
test_items_stack_nested_locked 0.3808ms 0.3258ms 3.0696 KOps/s 3.0101 KOps/s $\color{#35bf28}+1.98\%$
test_keys 34.6600μs 3.4648μs 288.6209 KOps/s 292.9958 KOps/s $\color{#d91a1a}-1.49\%$
test_keys_nested 88.6620μs 56.6134μs 17.6637 KOps/s 17.7099 KOps/s $\color{#d91a1a}-0.26\%$
test_keys_nested_locked 2.6752ms 62.9049μs 15.8970 KOps/s 16.0232 KOps/s $\color{#d91a1a}-0.79\%$
test_keys_nested_leaf 78.9420μs 47.5474μs 21.0316 KOps/s 20.8404 KOps/s $\color{#35bf28}+0.92\%$
test_keys_stack_nested 91.3020μs 56.3944μs 17.7322 KOps/s 17.7619 KOps/s $\color{#d91a1a}-0.17\%$
test_keys_stack_nested_leaf 84.7220μs 48.6672μs 20.5477 KOps/s 20.7964 KOps/s $\color{#d91a1a}-1.20\%$
test_keys_stack_nested_locked 94.5720μs 61.3430μs 16.3018 KOps/s 16.2902 KOps/s $\color{#35bf28}+0.07\%$
test_values 5.7017μs 0.8447μs 1.1838 MOps/s 1.1792 MOps/s $\color{#35bf28}+0.39\%$
test_values_nested 72.4320μs 41.2846μs 24.2221 KOps/s 24.5713 KOps/s $\color{#d91a1a}-1.42\%$
test_values_nested_locked 69.5520μs 43.2580μs 23.1171 KOps/s 23.4335 KOps/s $\color{#d91a1a}-1.35\%$
test_values_nested_leaf 57.2510μs 35.6947μs 28.0154 KOps/s 28.3293 KOps/s $\color{#d91a1a}-1.11\%$
test_values_stack_nested 71.5410μs 42.0704μs 23.7697 KOps/s 23.9692 KOps/s $\color{#d91a1a}-0.83\%$
test_values_stack_nested_leaf 65.5320μs 35.9301μs 27.8318 KOps/s 27.9520 KOps/s $\color{#d91a1a}-0.43\%$
test_values_stack_nested_locked 74.8620μs 43.8617μs 22.7989 KOps/s 22.9419 KOps/s $\color{#d91a1a}-0.62\%$
test_membership 1.8886μs 0.5043μs 1.9829 MOps/s 1.9849 MOps/s $\color{#d91a1a}-0.10\%$
test_membership_nested 17.2355μs 1.9013μs 525.9510 KOps/s 531.7707 KOps/s $\color{#d91a1a}-1.09\%$
test_membership_nested_leaf 15.2150μs 1.9013μs 525.9694 KOps/s 539.1182 KOps/s $\color{#d91a1a}-2.44\%$
test_membership_stacked_nested 28.4210μs 1.9653μs 508.8253 KOps/s 530.1217 KOps/s $\color{#d91a1a}-4.02\%$
test_membership_stacked_nested_leaf 30.7510μs 1.9758μs 506.1239 KOps/s 521.2209 KOps/s $\color{#d91a1a}-2.90\%$
test_membership_nested_last 31.7210μs 2.8056μs 356.4357 KOps/s 363.4395 KOps/s $\color{#d91a1a}-1.93\%$
test_membership_nested_leaf_last 28.7710μs 2.8433μs 351.7084 KOps/s 363.0943 KOps/s $\color{#d91a1a}-3.14\%$
test_membership_stacked_nested_last 28.6000μs 3.4614μs 288.8995 KOps/s 127.2262 KOps/s $\textbf{\color{#35bf28}+127.08\%}$
test_membership_stacked_nested_leaf_last 31.4610μs 3.4287μs 291.6559 KOps/s 128.7964 KOps/s $\textbf{\color{#35bf28}+126.45\%}$
test_nested_getleaf 48.6710μs 6.1026μs 163.8644 KOps/s 163.5518 KOps/s $\color{#35bf28}+0.19\%$
test_nested_get 32.3810μs 5.6830μs 175.9624 KOps/s 174.1617 KOps/s $\color{#35bf28}+1.03\%$
test_stacked_getleaf 42.5710μs 6.0154μs 166.2404 KOps/s 164.0631 KOps/s $\color{#35bf28}+1.33\%$
test_stacked_get 32.5410μs 5.6268μs 177.7196 KOps/s 175.3282 KOps/s $\color{#35bf28}+1.36\%$
test_nested_getitemleaf 0.1589ms 6.1691μs 162.0974 KOps/s 163.9320 KOps/s $\color{#d91a1a}-1.12\%$
test_nested_getitem 33.7310μs 5.7626μs 173.5336 KOps/s 173.9801 KOps/s $\color{#d91a1a}-0.26\%$
test_stacked_getitemleaf 33.4400μs 6.1116μs 163.6222 KOps/s 163.2538 KOps/s $\color{#35bf28}+0.23\%$
test_stacked_getitem 37.6010μs 5.6822μs 175.9886 KOps/s 175.1809 KOps/s $\color{#35bf28}+0.46\%$
test_lock_nested 6.9217ms 0.4172ms 2.3972 KOps/s 2.3564 KOps/s $\color{#35bf28}+1.73\%$
test_lock_stack_nested 0.5073ms 0.3734ms 2.6781 KOps/s 2.6631 KOps/s $\color{#35bf28}+0.56\%$
test_unlock_nested 0.7516ms 0.3504ms 2.8539 KOps/s 2.7758 KOps/s $\color{#35bf28}+2.81\%$
test_unlock_stack_nested 0.3935ms 0.3121ms 3.2042 KOps/s 3.1767 KOps/s $\color{#35bf28}+0.87\%$
test_flatten_speed 0.1495ms 69.0903μs 14.4738 KOps/s 14.4189 KOps/s $\color{#35bf28}+0.38\%$
test_unflatten_speed 0.3738ms 0.2799ms 3.5728 KOps/s 3.5092 KOps/s $\color{#35bf28}+1.81\%$
test_common_ops 1.5568ms 1.2287ms 813.8364 Ops/s 782.1285 Ops/s $\color{#35bf28}+4.05\%$
test_creation 25.7110μs 1.4976μs 667.7428 KOps/s 670.8931 KOps/s $\color{#d91a1a}-0.47\%$
test_creation_empty 39.8510μs 14.7538μs 67.7792 KOps/s 56.5462 KOps/s $\textbf{\color{#35bf28}+19.87\%}$
test_creation_nested_1 55.8920μs 16.4712μs 60.7120 KOps/s 51.1122 KOps/s $\textbf{\color{#35bf28}+18.78\%}$
test_creation_nested_2 49.7610μs 19.3424μs 51.6998 KOps/s 45.0154 KOps/s $\textbf{\color{#35bf28}+14.85\%}$
test_clone 0.2204ms 29.0868μs 34.3799 KOps/s 34.4339 KOps/s $\color{#d91a1a}-0.16\%$
test_getitem[int] 1.3674ms 15.3232μs 65.2604 KOps/s 62.2933 KOps/s $\color{#35bf28}+4.76\%$
test_getitem[slice_int] 0.1149ms 26.4742μs 37.7726 KOps/s 36.1952 KOps/s $\color{#35bf28}+4.36\%$
test_getitem[range] 0.2456ms 0.1070ms 9.3450 KOps/s 9.3357 KOps/s $\color{#35bf28}+0.10\%$
test_getitem[tuple] 0.1236ms 22.8903μs 43.6867 KOps/s 41.6764 KOps/s $\color{#35bf28}+4.82\%$
test_getitem[list] 0.2716ms 95.8522μs 10.4327 KOps/s 10.3534 KOps/s $\color{#35bf28}+0.77\%$
test_setitem_dim[int] 0.2530ms 43.9832μs 22.7360 KOps/s 23.1586 KOps/s $\color{#d91a1a}-1.82\%$
test_setitem_dim[slice_int] 0.1079ms 65.6925μs 15.2224 KOps/s 15.1717 KOps/s $\color{#35bf28}+0.33\%$
test_setitem_dim[range] 0.2602ms 0.1255ms 7.9686 KOps/s 7.9271 KOps/s $\color{#35bf28}+0.52\%$
test_setitem_dim[tuple] 85.2720μs 59.5068μs 16.8048 KOps/s 16.7087 KOps/s $\color{#35bf28}+0.58\%$
test_setitem 84.6820μs 41.5515μs 24.0665 KOps/s 23.4767 KOps/s $\color{#35bf28}+2.51\%$
test_set 0.1900ms 40.5606μs 24.6545 KOps/s 24.0223 KOps/s $\color{#35bf28}+2.63\%$
test_set_shared 0.3496ms 50.8902μs 19.6501 KOps/s 19.8010 KOps/s $\color{#d91a1a}-0.76\%$
test_update 0.2102ms 48.0956μs 20.7919 KOps/s 19.5204 KOps/s $\textbf{\color{#35bf28}+6.51\%}$
test_update_nested 0.1904ms 54.7485μs 18.2653 KOps/s 17.0974 KOps/s $\textbf{\color{#35bf28}+6.83\%}$
test_update__nested 0.1995ms 58.3312μs 17.1435 KOps/s 16.7183 KOps/s $\color{#35bf28}+2.54\%$
test_set_nested 77.1320μs 42.4820μs 23.5394 KOps/s 22.6056 KOps/s $\color{#35bf28}+4.13\%$
test_set_nested_new 0.2078ms 45.9360μs 21.7694 KOps/s 20.5323 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_select 0.2125ms 59.1227μs 16.9140 KOps/s 15.7335 KOps/s $\textbf{\color{#35bf28}+7.50\%}$
test_select_nested 69.3220μs 42.2216μs 23.6846 KOps/s 23.8664 KOps/s $\color{#d91a1a}-0.76\%$
test_exclude_nested 87.0020μs 59.8335μs 16.7131 KOps/s 17.0264 KOps/s $\color{#d91a1a}-1.84\%$
test_empty[True] 0.3167ms 0.2439ms 4.1001 KOps/s 3.9636 KOps/s $\color{#35bf28}+3.44\%$
test_empty[False] 2.9421μs 0.7433μs 1.3453 MOps/s 1.3442 MOps/s $\color{#35bf28}+0.08\%$
test_to 0.1360ms 25.1610μs 39.7440 KOps/s 38.8730 KOps/s $\color{#35bf28}+2.24\%$
test_to_nonblocking 0.2063ms 24.0589μs 41.5646 KOps/s 40.9607 KOps/s $\color{#35bf28}+1.47\%$
test_unbind_speed 1.4011ms 0.2737ms 3.6533 KOps/s 3.5543 KOps/s $\color{#35bf28}+2.79\%$
test_unbind_speed_stack0 0.2999ms 0.2685ms 3.7245 KOps/s 3.6310 KOps/s $\color{#35bf28}+2.57\%$
test_unbind_speed_stack1 92.6087ms 0.6897ms 1.4500 KOps/s 1.4331 KOps/s $\color{#35bf28}+1.18\%$
test_split 93.6391ms 2.1135ms 473.1586 Ops/s 455.0897 Ops/s $\color{#35bf28}+3.97\%$
test_chunk 95.5422ms 2.1332ms 468.7746 Ops/s 453.4320 Ops/s $\color{#35bf28}+3.38\%$
test_creation[device0] 0.3406ms 0.1256ms 7.9628 KOps/s 7.7945 KOps/s $\color{#35bf28}+2.16\%$
test_creation_from_tensor 0.3501ms 0.1277ms 7.8320 KOps/s 7.7619 KOps/s $\color{#35bf28}+0.90\%$
test_add_one[memmap_tensor0] 0.1366ms 8.3007μs 120.4724 KOps/s 116.9092 KOps/s $\color{#35bf28}+3.05\%$
test_contiguous[memmap_tensor0] 22.8300μs 2.2056μs 453.3922 KOps/s 447.5495 KOps/s $\color{#35bf28}+1.31\%$
test_stack[memmap_tensor0] 0.1233ms 6.4294μs 155.5352 KOps/s 151.9702 KOps/s $\color{#35bf28}+2.35\%$
test_memmaptd_index 1.0495ms 0.4184ms 2.3899 KOps/s 2.2990 KOps/s $\color{#35bf28}+3.95\%$
test_memmaptd_index_astensor 0.7448ms 0.4757ms 2.1023 KOps/s 2.0446 KOps/s $\color{#35bf28}+2.82\%$
test_memmaptd_index_op 1.3956ms 0.9834ms 1.0169 KOps/s 925.1552 Ops/s $\textbf{\color{#35bf28}+9.91\%}$
test_serialize_model 0.1312s 0.1295s 7.7199 Ops/s 7.7219 Ops/s $\color{#d91a1a}-0.03\%$
test_serialize_model_pickle 1.3481s 1.2131s 0.8243 Ops/s 0.8197 Ops/s $\color{#35bf28}+0.57\%$
test_serialize_weights 0.1310s 0.1295s 7.7228 Ops/s 6.9967 Ops/s $\textbf{\color{#35bf28}+10.38\%}$
test_serialize_weights_returnearly 0.2428s 62.5006ms 15.9998 Ops/s 17.8109 Ops/s $\textbf{\color{#d91a1a}-10.17\%}$
test_serialize_weights_pickle 1.3460s 1.2115s 0.8254 Ops/s 0.8248 Ops/s $\color{#35bf28}+0.07\%$
test_reshape_pytree 88.8420μs 35.6505μs 28.0501 KOps/s 27.5950 KOps/s $\color{#35bf28}+1.65\%$
test_reshape_td 0.1714ms 43.4796μs 22.9993 KOps/s 21.7794 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_view_pytree 0.1763ms 35.6087μs 28.0830 KOps/s 27.3166 KOps/s $\color{#35bf28}+2.81\%$
test_view_td 79.0720μs 47.1724μs 21.1988 KOps/s 20.6030 KOps/s $\color{#35bf28}+2.89\%$
test_unbind_pytree 89.0220μs 34.6308μs 28.8760 KOps/s 28.3870 KOps/s $\color{#35bf28}+1.72\%$
test_unbind_td 0.5006ms 42.7519μs 23.3907 KOps/s 22.6176 KOps/s $\color{#35bf28}+3.42\%$
test_split_pytree 0.1769ms 46.0043μs 21.7371 KOps/s 20.1250 KOps/s $\textbf{\color{#35bf28}+8.01\%}$
test_split_td 0.6701ms 58.8897μs 16.9809 KOps/s 17.8946 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_add_pytree 0.2360ms 61.5525μs 16.2463 KOps/s 17.7127 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_add_td 0.2731ms 97.0176μs 10.3074 KOps/s 10.2045 KOps/s $\color{#35bf28}+1.01\%$
test_compile_add_one_nested[tensordict-compile] 0.4166ms 0.2100ms 4.7630 KOps/s 4.7026 KOps/s $\color{#35bf28}+1.29\%$
test_compile_add_one_nested[tensordict-eager] 0.2808ms 0.1514ms 6.6046 KOps/s 6.5868 KOps/s $\color{#35bf28}+0.27\%$
test_compile_add_one_nested[pytree-compile] 0.2935ms 0.1461ms 6.8456 KOps/s 6.7985 KOps/s $\color{#35bf28}+0.69\%$
test_compile_add_one_nested[pytree-eager] 0.3340ms 0.1850ms 5.4048 KOps/s 5.5026 KOps/s $\color{#d91a1a}-1.78\%$
test_compile_copy_nested[tensordict-compile] 0.1444ms 21.1451μs 47.2924 KOps/s 44.4921 KOps/s $\textbf{\color{#35bf28}+6.29\%}$
test_compile_copy_nested[tensordict-eager] 85.1820μs 44.5189μs 22.4624 KOps/s 22.6078 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_copy_nested[pytree-compile] 0.2227ms 64.5306μs 15.4965 KOps/s 15.5897 KOps/s $\color{#d91a1a}-0.60\%$
test_compile_copy_nested[pytree-eager] 89.8020μs 50.0569μs 19.9773 KOps/s 20.3018 KOps/s $\color{#d91a1a}-1.60\%$
test_compile_add_one_flat[tensordict-compile] 0.4206ms 0.3199ms 3.1263 KOps/s 3.1307 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[tensordict-eager] 0.3540ms 0.2106ms 4.7484 KOps/s 4.8707 KOps/s $\color{#d91a1a}-2.51\%$
test_compile_add_one_flat[tensorclass-compile] 0.2302ms 0.1292ms 7.7398 KOps/s 7.4052 KOps/s $\color{#35bf28}+4.52\%$
test_compile_add_one_flat[tensorclass-eager] 0.2084ms 61.2883μs 16.3163 KOps/s 15.7440 KOps/s $\color{#35bf28}+3.64\%$
test_compile_add_one_flat[pytree-compile] 0.4800ms 0.3229ms 3.0974 KOps/s 3.0876 KOps/s $\color{#35bf28}+0.32\%$
test_compile_add_one_flat[pytree-eager] 0.8183ms 0.6332ms 1.5793 KOps/s 1.5765 KOps/s $\color{#35bf28}+0.18\%$
test_compile_add_self_flat[tensordict-eager] 0.3823ms 0.2487ms 4.0216 KOps/s 3.9920 KOps/s $\color{#35bf28}+0.74\%$
test_compile_add_self_flat[tensordict-compile] 0.3853ms 0.3221ms 3.1042 KOps/s 3.0315 KOps/s $\color{#35bf28}+2.40\%$
test_compile_add_self_flat[tensorclass-eager] 0.1180ms 71.6154μs 13.9635 KOps/s 13.4969 KOps/s $\color{#35bf28}+3.46\%$
test_compile_add_self_flat[tensorclass-compile] 0.2773ms 0.1292ms 7.7400 KOps/s 7.2850 KOps/s $\textbf{\color{#35bf28}+6.25\%}$
test_compile_add_self_flat[pytree-eager] 0.6991ms 0.5321ms 1.8794 KOps/s 1.8755 KOps/s $\color{#35bf28}+0.21\%$
test_compile_add_self_flat[pytree-compile] 0.4630ms 0.3208ms 3.1167 KOps/s 3.1030 KOps/s $\color{#35bf28}+0.44\%$
test_compile_copy_flat[tensordict-compile] 0.1031ms 17.9203μs 55.8026 KOps/s 50.8272 KOps/s $\textbf{\color{#35bf28}+9.79\%}$
test_compile_copy_flat[tensordict-eager] 78.6620μs 27.3252μs 36.5962 KOps/s 37.5049 KOps/s $\color{#d91a1a}-2.42\%$
test_compile_copy_flat[pytree-compile] 0.2443ms 70.3737μs 14.2099 KOps/s 14.1603 KOps/s $\color{#35bf28}+0.35\%$
test_compile_copy_flat[pytree-eager] 91.9620μs 51.2604μs 19.5082 KOps/s 19.4659 KOps/s $\color{#35bf28}+0.22\%$
test_compile_assign_and_add[tensordict-compile] 2.3481ms 0.8110ms 1.2330 KOps/s 1.1403 KOps/s $\textbf{\color{#35bf28}+8.13\%}$
test_compile_assign_and_add[tensordict-eager] 3.5407ms 3.1284ms 319.6547 Ops/s 319.5217 Ops/s $\color{#35bf28}+0.04\%$
test_compile_assign_and_add[pytree-compile] 2.2898ms 0.8103ms 1.2341 KOps/s 1.1448 KOps/s $\textbf{\color{#35bf28}+7.80\%}$
test_compile_assign_and_add[pytree-eager] 3.4604ms 3.2118ms 311.3516 Ops/s 312.2448 Ops/s $\color{#d91a1a}-0.29\%$
test_compile_indexing[tensor-tensordict-compile] 0.2611ms 0.1139ms 8.7834 KOps/s 8.8277 KOps/s $\color{#d91a1a}-0.50\%$
test_compile_indexing[tensor-tensordict-eager] 0.1996ms 64.0646μs 15.6093 KOps/s 15.7103 KOps/s $\color{#d91a1a}-0.64\%$
test_compile_indexing[tensor-tensorclass-compile] 0.2406ms 0.1029ms 9.7137 KOps/s 9.5862 KOps/s $\color{#35bf28}+1.33\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1938ms 42.6331μs 23.4560 KOps/s 23.3071 KOps/s $\color{#35bf28}+0.64\%$
test_compile_indexing[tensor-pytree-compile] 0.2573ms 0.1047ms 9.5473 KOps/s 9.5385 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[tensor-pytree-eager] 0.2337ms 42.5867μs 23.4815 KOps/s 23.4033 KOps/s $\color{#35bf28}+0.33\%$
test_compile_indexing[slice-tensordict-compile] 0.1885ms 0.1379ms 7.2530 KOps/s 7.2839 KOps/s $\color{#d91a1a}-0.42\%$
test_compile_indexing[slice-tensordict-eager] 0.1747ms 24.5228μs 40.7784 KOps/s 38.7773 KOps/s $\textbf{\color{#35bf28}+5.16\%}$
test_compile_indexing[slice-tensorclass-compile] 0.2884ms 0.1325ms 7.5473 KOps/s 7.5810 KOps/s $\color{#d91a1a}-0.44\%$
test_compile_indexing[slice-tensorclass-eager] 0.1627ms 20.6746μs 48.3685 KOps/s 47.7311 KOps/s $\color{#35bf28}+1.34\%$
test_compile_indexing[slice-pytree-compile] 0.2941ms 0.1329ms 7.5233 KOps/s 7.5294 KOps/s $\color{#d91a1a}-0.08\%$
test_compile_indexing[slice-pytree-eager] 0.2100ms 20.8335μs 47.9997 KOps/s 48.3866 KOps/s $\color{#d91a1a}-0.80\%$
test_compile_indexing[int-tensordict-compile] 0.2924ms 0.1383ms 7.2331 KOps/s 7.2269 KOps/s $\color{#35bf28}+0.09\%$
test_compile_indexing[int-tensordict-eager] 0.4891ms 24.2426μs 41.2497 KOps/s 39.3580 KOps/s $\color{#35bf28}+4.81\%$
test_compile_indexing[int-tensorclass-compile] 0.2550ms 0.1331ms 7.5117 KOps/s 7.5298 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_indexing[int-tensorclass-eager] 0.2094ms 23.2299μs 43.0480 KOps/s 48.4464 KOps/s $\textbf{\color{#d91a1a}-11.14\%}$
test_compile_indexing[int-pytree-compile] 0.2562ms 0.1317ms 7.5903 KOps/s 7.5477 KOps/s $\color{#35bf28}+0.56\%$
test_compile_indexing[int-pytree-eager] 60.1710μs 20.0403μs 49.8995 KOps/s 48.2208 KOps/s $\color{#35bf28}+3.48\%$
test_mod_add[eager] 0.1737ms 31.2899μs 31.9592 KOps/s 29.8687 KOps/s $\textbf{\color{#35bf28}+7.00\%}$
test_mod_add[compile] 0.1914ms 69.1623μs 14.4587 KOps/s 14.1144 KOps/s $\color{#35bf28}+2.44\%$
test_mod_add[compile-overhead] 0.2615ms 0.1362ms 7.3408 KOps/s 5.9002 KOps/s $\textbf{\color{#35bf28}+24.42\%}$
test_mod_wrap[eager] 0.4118ms 0.2410ms 4.1494 KOps/s 4.2029 KOps/s $\color{#d91a1a}-1.27\%$
test_mod_wrap[compile] 0.6896ms 0.2953ms 3.3862 KOps/s 3.3393 KOps/s $\color{#35bf28}+1.40\%$
test_mod_wrap[compile-overhead] 7.4766ms 4.0331ms 247.9459 Ops/s 249.6978 Ops/s $\color{#d91a1a}-0.70\%$
test_mod_wrap_and_backward[eager] 1.5536ms 1.3212ms 756.8795 Ops/s 716.8600 Ops/s $\textbf{\color{#35bf28}+5.58\%}$
test_mod_wrap_and_backward[compile] 1.5792ms 1.3197ms 757.7387 Ops/s 703.8302 Ops/s $\textbf{\color{#35bf28}+7.66\%}$
test_mod_wrap_and_backward[compile-overhead] 1.3341ms 0.8986ms 1.1128 KOps/s 988.0595 Ops/s $\textbf{\color{#35bf28}+12.63\%}$
test_seq_add[eager] 0.4852ms 96.6432μs 10.3473 KOps/s 9.8457 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_seq_add[compile] 0.4647ms 80.4341μs 12.4325 KOps/s 12.2483 KOps/s $\color{#35bf28}+1.50\%$
test_seq_add[compile-overhead] 0.1505ms 0.1149ms 8.7007 KOps/s 8.5337 KOps/s $\color{#35bf28}+1.96\%$
test_seq_wrap[eager] 0.7661ms 0.3709ms 2.6959 KOps/s 2.5803 KOps/s $\color{#35bf28}+4.48\%$
test_seq_wrap[compile] 0.7060ms 0.3130ms 3.1950 KOps/s 3.1497 KOps/s $\color{#35bf28}+1.44\%$
test_seq_wrap[compile-overhead] 0.6099ms 0.2184ms 4.5780 KOps/s 4.5156 KOps/s $\color{#35bf28}+1.38\%$
test_func_call_runtime[False-eager] 1.1249ms 0.7222ms 1.3847 KOps/s 1.3988 KOps/s $\color{#d91a1a}-1.01\%$
test_func_call_runtime[False-compile] 1.1844ms 0.7920ms 1.2625 KOps/s 1.2619 KOps/s $\color{#35bf28}+0.05\%$
test_func_call_runtime[False-compile-overhead] 0.5119ms 0.3608ms 2.7717 KOps/s 2.7527 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_runtime[True-eager] 1.2890ms 0.8867ms 1.1278 KOps/s 1.1387 KOps/s $\color{#d91a1a}-0.95\%$
test_func_call_runtime[True-compile] 1.2357ms 0.8285ms 1.2070 KOps/s 1.2062 KOps/s $\color{#35bf28}+0.07\%$
test_func_call_runtime[True-compile-overhead] 0.5363ms 0.3948ms 2.5331 KOps/s 2.5129 KOps/s $\color{#35bf28}+0.80\%$
test_func_call_cm_runtime[False-eager] 1.1336ms 0.7297ms 1.3705 KOps/s 1.4136 KOps/s $\color{#d91a1a}-3.05\%$
test_func_call_cm_runtime[False-compile] 1.2045ms 0.7950ms 1.2578 KOps/s 1.2686 KOps/s $\color{#d91a1a}-0.85\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4782ms 0.3629ms 2.7557 KOps/s 2.7368 KOps/s $\color{#35bf28}+0.69\%$
test_func_call_cm_runtime[True-eager] 1.1609ms 1.0163ms 983.9434 Ops/s 1.0165 KOps/s $\color{#d91a1a}-3.20\%$
test_func_call_cm_runtime[True-compile] 1.0056ms 0.8515ms 1.1745 KOps/s 1.1710 KOps/s $\color{#35bf28}+0.29\%$
test_func_call_cm_runtime[True-compile-overhead] 0.5527ms 0.4188ms 2.3877 KOps/s 2.3545 KOps/s $\color{#35bf28}+1.41\%$
test_vmap_func_call_cm_runtime[eager] 2.4515ms 1.9901ms 502.4933 Ops/s 496.0132 Ops/s $\color{#35bf28}+1.31\%$
test_vmap_func_call_cm_runtime[compile] 1.0158ms 0.8638ms 1.1576 KOps/s 1.1528 KOps/s $\color{#35bf28}+0.42\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5874ms 0.4251ms 2.3525 KOps/s 2.3368 KOps/s $\color{#35bf28}+0.67\%$
test_distributed 2.8889ms 0.2116ms 4.7252 KOps/s 8.8615 KOps/s $\textbf{\color{#d91a1a}-46.68\%}$
test_tdmodule 33.1310μs 14.0156μs 71.3492 KOps/s 61.0636 KOps/s $\textbf{\color{#35bf28}+16.84\%}$
test_tdmodule_dispatch 49.8410μs 27.6418μs 36.1771 KOps/s 32.1850 KOps/s $\textbf{\color{#35bf28}+12.40\%}$
test_tdseq 35.1200μs 14.8581μs 67.3035 KOps/s 57.4817 KOps/s $\textbf{\color{#35bf28}+17.09\%}$
test_tdseq_dispatch 58.2210μs 30.3281μs 32.9727 KOps/s 27.8655 KOps/s $\textbf{\color{#35bf28}+18.33\%}$
test_instantiation_functorch 2.3705ms 1.8515ms 540.1103 Ops/s 530.6703 Ops/s $\color{#35bf28}+1.78\%$
test_instantiation_td 1.7906ms 1.1975ms 835.0951 Ops/s 824.4090 Ops/s $\color{#35bf28}+1.30\%$
test_exec_functorch 0.3345ms 0.2081ms 4.8050 KOps/s 4.8705 KOps/s $\color{#d91a1a}-1.35\%$
test_exec_functional_call 0.5913ms 0.2057ms 4.8611 KOps/s 4.9613 KOps/s $\color{#d91a1a}-2.02\%$
test_exec_td 0.2759ms 0.2103ms 4.7558 KOps/s 4.5750 KOps/s $\color{#35bf28}+3.95\%$
test_exec_td_decorator 0.6472ms 0.2547ms 3.9260 KOps/s 3.9398 KOps/s $\color{#d91a1a}-0.35\%$
test_vmap_mlp_speed[True-True] 1.0710ms 0.6694ms 1.4938 KOps/s 1.4415 KOps/s $\color{#35bf28}+3.63\%$
test_vmap_mlp_speed[True-False] 1.0771ms 0.6682ms 1.4966 KOps/s 1.4009 KOps/s $\textbf{\color{#35bf28}+6.83\%}$
test_vmap_mlp_speed[False-True] 0.9681ms 0.5612ms 1.7818 KOps/s 1.7720 KOps/s $\color{#35bf28}+0.55\%$
test_vmap_mlp_speed[False-False] 0.9679ms 0.5667ms 1.7646 KOps/s 1.7772 KOps/s $\color{#d91a1a}-0.71\%$
test_vmap_mlp_speed_decorator[True-True] 1.3855ms 0.6542ms 1.5286 KOps/s 1.5101 KOps/s $\color{#35bf28}+1.23\%$
test_vmap_mlp_speed_decorator[True-False] 0.8213ms 0.6554ms 1.5258 KOps/s 1.5154 KOps/s $\color{#35bf28}+0.69\%$
test_vmap_mlp_speed_decorator[False-True] 0.7568ms 0.5922ms 1.6886 KOps/s 1.7349 KOps/s $\color{#d91a1a}-2.67\%$
test_vmap_mlp_speed_decorator[False-False] 0.7825ms 0.6093ms 1.6413 KOps/s 1.7295 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_vmap_transformer_speed[True-True] 8.5552ms 8.1074ms 123.3442 Ops/s 121.7504 Ops/s $\color{#35bf28}+1.31\%$
test_vmap_transformer_speed[True-False] 8.2191ms 8.0821ms 123.7297 Ops/s 121.9495 Ops/s $\color{#35bf28}+1.46\%$
test_vmap_transformer_speed[False-True] 8.3083ms 7.9268ms 126.1539 Ops/s 125.1864 Ops/s $\color{#35bf28}+0.77\%$
test_vmap_transformer_speed[False-False] 8.2952ms 7.9240ms 126.1982 Ops/s 125.1301 Ops/s $\color{#35bf28}+0.85\%$
test_vmap_transformer_speed_decorator[True-True] 19.1490ms 18.9490ms 52.7733 Ops/s 52.5887 Ops/s $\color{#35bf28}+0.35\%$
test_vmap_transformer_speed_decorator[True-False] 19.7359ms 19.0008ms 52.6293 Ops/s 52.4453 Ops/s $\color{#35bf28}+0.35\%$
test_vmap_transformer_speed_decorator[False-True] 19.5790ms 18.8859ms 52.9497 Ops/s 53.0164 Ops/s $\color{#d91a1a}-0.13\%$
test_vmap_transformer_speed_decorator[False-False] 19.1916ms 18.8553ms 53.0354 Ops/s 52.8592 Ops/s $\color{#35bf28}+0.33\%$
test_to_module_speed[True] 1.4631ms 0.9617ms 1.0398 KOps/s 1.0613 KOps/s $\color{#d91a1a}-2.02\%$
test_to_module_speed[False] 1.3224ms 0.9333ms 1.0715 KOps/s 1.0889 KOps/s $\color{#d91a1a}-1.60\%$
test_tc_init 0.4290ms 34.1066μs 29.3198 KOps/s 26.9044 KOps/s $\textbf{\color{#35bf28}+8.98\%}$
test_tc_init_nested 0.2480ms 68.6502μs 14.5666 KOps/s 13.7687 KOps/s $\textbf{\color{#35bf28}+5.80\%}$
test_tc_first_layer_tensor 3.3987μs 0.6827μs 1.4648 MOps/s 1.5027 MOps/s $\color{#d91a1a}-2.52\%$
test_tc_first_layer_nontensor 23.6810μs 2.2414μs 446.1480 KOps/s 447.3710 KOps/s $\color{#d91a1a}-0.27\%$
test_tc_second_layer_tensor 97.0648μs 1.3543μs 738.3664 KOps/s 738.8531 KOps/s $\color{#d91a1a}-0.07\%$
test_tc_second_layer_nontensor 0.3870ms 2.9456μs 339.4935 KOps/s 341.5296 KOps/s $\color{#d91a1a}-0.60\%$
test_unbind 0.1962s 12.0421ms 83.0423 Ops/s 92.1236 Ops/s $\textbf{\color{#d91a1a}-9.86\%}$
test_full_like 0.7906ms 0.5755ms 1.7375 KOps/s 1.7379 KOps/s $\color{#d91a1a}-0.02\%$
test_zeros_like 0.3535ms 0.1979ms 5.0527 KOps/s 5.0476 KOps/s $\color{#35bf28}+0.10\%$
test_ones_like 0.5913ms 0.1978ms 5.0544 KOps/s 5.0536 KOps/s $\color{#35bf28}+0.02\%$
test_clone 0.7150ms 0.4145ms 2.4127 KOps/s 2.4173 KOps/s $\color{#d91a1a}-0.19\%$
test_squeeze 36.3210μs 9.8736μs 101.2800 KOps/s 102.2895 KOps/s $\color{#d91a1a}-0.99\%$
test_unsqueeze 0.4569ms 75.1560μs 13.3057 KOps/s 13.2662 KOps/s $\color{#35bf28}+0.30\%$
test_split 0.5328ms 0.1580ms 6.3305 KOps/s 6.3320 KOps/s $\color{#d91a1a}-0.02\%$
test_permute 0.3087ms 0.1790ms 5.5869 KOps/s 5.3172 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_stack 1.3834ms 0.8468ms 1.1809 KOps/s 1.1793 KOps/s $\color{#35bf28}+0.14\%$
test_cat 1.3752ms 1.2311ms 812.2609 Ops/s 811.7012 Ops/s $\color{#35bf28}+0.07\%$

[ghstack-poisoned]
@vmoens vmoens merged commit d058609 into gh/vmoens/18/base Sep 17, 2024
4 of 8 checks passed
vmoens added a commit that referenced this pull request Sep 17, 2024
ghstack-source-id: d312fc1dee177275a73482210c1ecfbe73b04f9e
Pull Request resolved: #991
@vmoens vmoens deleted the gh/vmoens/18/head branch September 17, 2024 00:57