[BugFix] Fix td device sync when error is raised #988

Merged: 1 commit, Sep 13, 2024


@vmoens (Contributor) commented Sep 12, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Sep 12, 2024
ghstack-source-id: d0e810c71ca1c9945561ca5a9e71cb71445095e4
Pull Request resolved: #988
@facebook-github-bot added the "CLA Signed" label Sep 12, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 222. Improved: $\large\color{#35bf28}6$. Worsened: $\large\color{#d91a1a}16$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 54.7420μs 20.5189μs 48.7355 KOps/s 48.0005 KOps/s $\color{#35bf28}+1.53\%$
test_plain_set_stack_nested 45.1240μs 20.7673μs 48.1526 KOps/s 47.9990 KOps/s $\color{#35bf28}+0.32\%$
test_plain_set_nested_inplace 77.2840μs 22.5869μs 44.2734 KOps/s 44.3374 KOps/s $\color{#d91a1a}-0.14\%$
test_plain_set_stack_nested_inplace 78.1240μs 22.3068μs 44.8293 KOps/s 43.9829 KOps/s $\color{#35bf28}+1.92\%$
test_items 21.5610μs 4.2193μs 237.0056 KOps/s 238.4673 KOps/s $\color{#d91a1a}-0.61\%$
test_items_nested 0.4905ms 0.3275ms 3.0533 KOps/s 3.0304 KOps/s $\color{#35bf28}+0.75\%$
test_items_nested_locked 0.5218ms 0.3273ms 3.0553 KOps/s 2.9969 KOps/s $\color{#35bf28}+1.95\%$
test_items_nested_leaf 0.1467ms 85.5178μs 11.6935 KOps/s 11.9260 KOps/s $\color{#d91a1a}-1.95\%$
test_items_stack_nested 0.5091ms 0.3320ms 3.0119 KOps/s 2.9874 KOps/s $\color{#35bf28}+0.82\%$
test_items_stack_nested_leaf 0.1469ms 83.1171μs 12.0312 KOps/s 11.6643 KOps/s $\color{#35bf28}+3.15\%$
test_items_stack_nested_locked 0.4709ms 0.3316ms 3.0161 KOps/s 2.9518 KOps/s $\color{#35bf28}+2.18\%$
test_keys 22.7820μs 3.6360μs 275.0237 KOps/s 284.0990 KOps/s $\color{#d91a1a}-3.19\%$
test_keys_nested 0.1728ms 96.3639μs 10.3773 KOps/s 10.1298 KOps/s $\color{#35bf28}+2.44\%$
test_keys_nested_locked 1.6647ms 0.1038ms 9.6312 KOps/s 9.7220 KOps/s $\color{#d91a1a}-0.93\%$
test_keys_nested_leaf 0.1566ms 83.4445μs 11.9840 KOps/s 12.1161 KOps/s $\color{#d91a1a}-1.09\%$
test_keys_stack_nested 0.1771ms 95.9299μs 10.4243 KOps/s 10.3688 KOps/s $\color{#35bf28}+0.54\%$
test_keys_stack_nested_leaf 0.1622ms 80.3740μs 12.4418 KOps/s 12.1579 KOps/s $\color{#35bf28}+2.34\%$
test_keys_stack_nested_locked 0.1679ms 99.1356μs 10.0872 KOps/s 9.8312 KOps/s $\color{#35bf28}+2.60\%$
test_values 5.3800μs 1.0825μs 923.7746 KOps/s 902.1684 KOps/s $\color{#35bf28}+2.39\%$
test_values_nested 95.9290μs 47.7403μs 20.9467 KOps/s 21.0168 KOps/s $\color{#d91a1a}-0.33\%$
test_values_nested_locked 90.6200μs 47.6515μs 20.9857 KOps/s 20.9511 KOps/s $\color{#35bf28}+0.16\%$
test_values_nested_leaf 83.2560μs 42.3830μs 23.5944 KOps/s 23.4996 KOps/s $\color{#35bf28}+0.40\%$
test_values_stack_nested 94.0860μs 47.4899μs 21.0571 KOps/s 20.8287 KOps/s $\color{#35bf28}+1.10\%$
test_values_stack_nested_leaf 80.4400μs 41.2398μs 24.2484 KOps/s 23.9887 KOps/s $\color{#35bf28}+1.08\%$
test_values_stack_nested_locked 98.1940μs 47.8084μs 20.9168 KOps/s 20.5985 KOps/s $\color{#35bf28}+1.55\%$
test_membership 2.6315μs 0.6805μs 1.4695 MOps/s 1.2244 MOps/s $\textbf{\color{#35bf28}+20.02\%}$
test_membership_nested 22.0510μs 2.6874μs 372.1018 KOps/s 377.6579 KOps/s $\color{#d91a1a}-1.47\%$
test_membership_nested_leaf 39.4140μs 2.6865μs 372.2378 KOps/s 377.8384 KOps/s $\color{#d91a1a}-1.48\%$
test_membership_stacked_nested 25.7880μs 2.6886μs 371.9460 KOps/s 381.8723 KOps/s $\color{#d91a1a}-2.60\%$
test_membership_stacked_nested_leaf 30.5070μs 2.7030μs 369.9598 KOps/s 380.5307 KOps/s $\color{#d91a1a}-2.78\%$
test_membership_nested_last 42.3410μs 3.8186μs 261.8777 KOps/s 263.0406 KOps/s $\color{#d91a1a}-0.44\%$
test_membership_nested_leaf_last 44.3230μs 3.8904μs 257.0431 KOps/s 260.3649 KOps/s $\color{#d91a1a}-1.28\%$
test_membership_stacked_nested_last 33.8630μs 12.9106μs 77.4558 KOps/s 259.7614 KOps/s $\textbf{\color{#d91a1a}-70.18\%}$
test_membership_stacked_nested_leaf_last 63.1880μs 12.8078μs 78.0772 KOps/s 260.4866 KOps/s $\textbf{\color{#d91a1a}-70.03\%}$
test_nested_getleaf 53.6010μs 10.8064μs 92.5378 KOps/s 90.8269 KOps/s $\color{#35bf28}+1.88\%$
test_nested_get 52.7090μs 10.2751μs 97.3230 KOps/s 96.0310 KOps/s $\color{#35bf28}+1.35\%$
test_stacked_getleaf 55.6760μs 10.7996μs 92.5964 KOps/s 94.2640 KOps/s $\color{#d91a1a}-1.77\%$
test_stacked_get 52.8090μs 10.3581μs 96.5427 KOps/s 98.4362 KOps/s $\color{#d91a1a}-1.92\%$
test_nested_getitemleaf 56.3350μs 11.2502μs 88.8873 KOps/s 89.0524 KOps/s $\color{#d91a1a}-0.19\%$
test_nested_getitem 44.4640μs 10.4940μs 95.2927 KOps/s 96.1441 KOps/s $\color{#d91a1a}-0.89\%$
test_stacked_getitemleaf 50.7150μs 11.1777μs 89.4639 KOps/s 90.2674 KOps/s $\color{#d91a1a}-0.89\%$
test_stacked_getitem 41.6680μs 10.5202μs 95.0549 KOps/s 96.8811 KOps/s $\color{#d91a1a}-1.89\%$
test_lock_nested 79.6208ms 0.5526ms 1.8097 KOps/s 2.1453 KOps/s $\textbf{\color{#d91a1a}-15.64\%}$
test_lock_stack_nested 0.6781ms 0.4287ms 2.3325 KOps/s 2.2621 KOps/s $\color{#35bf28}+3.11\%$
test_unlock_nested 83.2885ms 0.4807ms 2.0804 KOps/s 2.5126 KOps/s $\textbf{\color{#d91a1a}-17.20\%}$
test_unlock_stack_nested 0.5348ms 0.3495ms 2.8611 KOps/s 2.7414 KOps/s $\color{#35bf28}+4.37\%$
test_flatten_speed 0.1901ms 0.1040ms 9.6180 KOps/s 9.6642 KOps/s $\color{#d91a1a}-0.48\%$
test_unflatten_speed 0.8218ms 0.4580ms 2.1834 KOps/s 2.1849 KOps/s $\color{#d91a1a}-0.07\%$
test_common_ops 4.0124ms 1.0845ms 922.0705 Ops/s 919.5362 Ops/s $\color{#35bf28}+0.28\%$
test_creation 19.5670μs 2.1154μs 472.7249 KOps/s 492.9698 KOps/s $\color{#d91a1a}-4.11\%$
test_creation_empty 49.2930μs 17.8537μs 56.0109 KOps/s 54.8276 KOps/s $\color{#35bf28}+2.16\%$
test_creation_nested_1 49.7130μs 20.9418μs 47.7513 KOps/s 46.6025 KOps/s $\color{#35bf28}+2.47\%$
test_creation_nested_2 57.5380μs 24.9276μs 40.1162 KOps/s 38.6895 KOps/s $\color{#35bf28}+3.69\%$
test_clone 77.4440μs 16.5460μs 60.4376 KOps/s 60.1801 KOps/s $\color{#35bf28}+0.43\%$
test_getitem[int] 1.0814ms 16.9326μs 59.0575 KOps/s 59.5212 KOps/s $\color{#d91a1a}-0.78\%$
test_getitem[slice_int] 0.1338ms 31.0954μs 32.1591 KOps/s 32.8472 KOps/s $\color{#d91a1a}-2.09\%$
test_getitem[range] 0.1926ms 58.0244μs 17.2341 KOps/s 18.1600 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_getitem[tuple] 0.1305ms 25.0993μs 39.8417 KOps/s 39.7057 KOps/s $\color{#35bf28}+0.34\%$
test_getitem[list] 0.2121ms 54.1091μs 18.4812 KOps/s 19.6785 KOps/s $\textbf{\color{#d91a1a}-6.08\%}$
test_setitem_dim[int] 60.5530μs 32.3482μs 30.9136 KOps/s 30.8577 KOps/s $\color{#35bf28}+0.18\%$
test_setitem_dim[slice_int] 0.1127ms 60.9405μs 16.4094 KOps/s 17.0364 KOps/s $\color{#d91a1a}-3.68\%$
test_setitem_dim[range] 0.1437ms 85.0727μs 11.7547 KOps/s 12.3442 KOps/s $\color{#d91a1a}-4.78\%$
test_setitem_dim[tuple] 75.3810μs 48.6891μs 20.5385 KOps/s 21.0064 KOps/s $\color{#d91a1a}-2.23\%$
test_setitem 0.1096ms 29.4906μs 33.9091 KOps/s 33.6360 KOps/s $\color{#35bf28}+0.81\%$
test_set 0.1163ms 28.9492μs 34.5432 KOps/s 35.0636 KOps/s $\color{#d91a1a}-1.48\%$
test_set_shared 3.0329ms 0.2092ms 4.7797 KOps/s 4.7277 KOps/s $\color{#35bf28}+1.10\%$
test_update 0.1473ms 34.9365μs 28.6233 KOps/s 28.0573 KOps/s $\color{#35bf28}+2.02\%$
test_update_nested 0.1228ms 45.5543μs 21.9518 KOps/s 21.6739 KOps/s $\color{#35bf28}+1.28\%$
test_update__nested 89.5780μs 33.1523μs 30.1638 KOps/s 29.1284 KOps/s $\color{#35bf28}+3.55\%$
test_set_nested 92.8840μs 31.1212μs 32.1325 KOps/s 31.4579 KOps/s $\color{#35bf28}+2.14\%$
test_set_nested_new 0.1158ms 36.4982μs 27.3986 KOps/s 27.1990 KOps/s $\color{#35bf28}+0.73\%$
test_select 0.2076ms 54.8420μs 18.2342 KOps/s 18.3415 KOps/s $\color{#d91a1a}-0.59\%$
test_select_nested 0.1345ms 59.1843μs 16.8964 KOps/s 16.5042 KOps/s $\color{#35bf28}+2.38\%$
test_exclude_nested 0.1563ms 76.2837μs 13.1090 KOps/s 13.1098 KOps/s $-0.01\%$
test_empty[True] 0.4850ms 0.3104ms 3.2220 KOps/s 3.1090 KOps/s $\color{#35bf28}+3.64\%$
test_empty[False] 10.8930μs 1.2818μs 780.1828 KOps/s 821.3826 KOps/s $\textbf{\color{#d91a1a}-5.02\%}$
test_unbind_speed 0.5340ms 0.2978ms 3.3579 KOps/s 3.4009 KOps/s $\color{#d91a1a}-1.27\%$
test_unbind_speed_stack0 0.5589ms 0.2844ms 3.5163 KOps/s 3.4628 KOps/s $\color{#35bf28}+1.55\%$
test_unbind_speed_stack1 86.9151ms 0.7581ms 1.3191 KOps/s 1.3923 KOps/s $\textbf{\color{#d91a1a}-5.26\%}$
test_split 2.1831ms 2.0384ms 490.5774 Ops/s 461.7836 Ops/s $\textbf{\color{#35bf28}+6.24\%}$
test_chunk 87.1319ms 2.2013ms 454.2785 Ops/s 463.0762 Ops/s $\color{#d91a1a}-1.90\%$
test_creation[device0] 0.2178ms 0.1164ms 8.5933 KOps/s 8.5431 KOps/s $\color{#35bf28}+0.59\%$
test_creation_from_tensor 3.1833ms 0.1176ms 8.5012 KOps/s 8.5174 KOps/s $\color{#d91a1a}-0.19\%$
test_add_one[memmap_tensor0] 0.1517ms 7.5495μs 132.4589 KOps/s 133.0456 KOps/s $\color{#d91a1a}-0.44\%$
test_contiguous[memmap_tensor0] 19.3960μs 1.8624μs 536.9508 KOps/s 529.6316 KOps/s $\color{#35bf28}+1.38\%$
test_stack[memmap_tensor0] 46.7170μs 5.9816μs 167.1793 KOps/s 174.0996 KOps/s $\color{#d91a1a}-3.97\%$
test_memmaptd_index 1.1287ms 0.3999ms 2.5008 KOps/s 2.5687 KOps/s $\color{#d91a1a}-2.64\%$
test_memmaptd_index_astensor 0.8051ms 0.4801ms 2.0828 KOps/s 2.1346 KOps/s $\color{#d91a1a}-2.43\%$
test_memmaptd_index_op 1.7907ms 1.0153ms 984.9176 Ops/s 983.2601 Ops/s $\color{#35bf28}+0.17\%$
test_serialize_model 0.2167s 0.1284s 7.7878 Ops/s 8.3014 Ops/s $\textbf{\color{#d91a1a}-6.19\%}$
test_serialize_model_pickle 0.5104s 0.4066s 2.4592 Ops/s 2.5058 Ops/s $\color{#d91a1a}-1.86\%$
test_serialize_weights 0.1218s 0.1134s 8.8209 Ops/s 7.5571 Ops/s $\textbf{\color{#35bf28}+16.72\%}$
test_serialize_weights_returnearly 0.2368s 0.1699s 5.8845 Ops/s 6.3495 Ops/s $\textbf{\color{#d91a1a}-7.32\%}$
test_serialize_weights_pickle 0.6323s 0.4458s 2.2434 Ops/s 2.3844 Ops/s $\textbf{\color{#d91a1a}-5.91\%}$
test_serialize_weights_filesystem 0.1448s 0.1386s 7.2159 Ops/s 7.1621 Ops/s $\color{#35bf28}+0.75\%$
test_serialize_model_filesystem 0.1540s 0.1437s 6.9598 Ops/s 5.8337 Ops/s $\textbf{\color{#35bf28}+19.30\%}$
test_reshape_pytree 87.9440μs 39.2741μs 25.4621 KOps/s 26.2823 KOps/s $\color{#d91a1a}-3.12\%$
test_reshape_td 0.1157ms 44.5145μs 22.4646 KOps/s 21.9343 KOps/s $\color{#35bf28}+2.42\%$
test_view_pytree 99.4860μs 38.0850μs 26.2570 KOps/s 26.5735 KOps/s $\color{#d91a1a}-1.19\%$
test_view_td 0.3465ms 49.5518μs 20.1809 KOps/s 19.1426 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_unbind_pytree 93.1040μs 36.0144μs 27.7667 KOps/s 27.7847 KOps/s $\color{#d91a1a}-0.07\%$
test_unbind_td 0.3031ms 44.0363μs 22.7085 KOps/s 22.6118 KOps/s $\color{#35bf28}+0.43\%$
test_split_pytree 0.2604ms 40.9397μs 24.4262 KOps/s 26.7127 KOps/s $\textbf{\color{#d91a1a}-8.56\%}$
test_split_td 0.4852ms 57.7773μs 17.3078 KOps/s 17.7818 KOps/s $\color{#d91a1a}-2.67\%$
test_add_pytree 0.1184ms 45.2140μs 22.1170 KOps/s 22.7474 KOps/s $\color{#d91a1a}-2.77\%$
test_add_td 0.1757ms 80.3765μs 12.4414 KOps/s 12.8211 KOps/s $\color{#d91a1a}-2.96\%$
test_compile_add_one_nested[tensordict-compile] 0.1154ms 54.9512μs 18.1980 KOps/s 17.8780 KOps/s $\color{#35bf28}+1.79\%$
test_compile_add_one_nested[tensordict-eager] 0.2978ms 0.1831ms 5.4625 KOps/s 5.4585 KOps/s $\color{#35bf28}+0.07\%$
test_compile_add_one_nested[pytree-compile] 0.1096ms 55.7051μs 17.9517 KOps/s 17.9316 KOps/s $\color{#35bf28}+0.11\%$
test_compile_add_one_nested[pytree-eager] 0.3415ms 0.1415ms 7.0680 KOps/s 7.0849 KOps/s $\color{#d91a1a}-0.24\%$
test_compile_copy_nested[tensordict-compile] 59.9120μs 20.4981μs 48.7849 KOps/s 48.8895 KOps/s $\color{#d91a1a}-0.21\%$
test_compile_copy_nested[tensordict-eager] 0.1521ms 66.7523μs 14.9808 KOps/s 15.3682 KOps/s $\color{#d91a1a}-2.52\%$
test_compile_copy_nested[pytree-compile] 0.1625ms 76.6109μs 13.0530 KOps/s 13.0389 KOps/s $\color{#35bf28}+0.11\%$
test_compile_copy_nested[pytree-eager] 0.1533ms 68.3531μs 14.6299 KOps/s 14.7775 KOps/s $\color{#d91a1a}-1.00\%$
test_compile_add_one_flat[tensordict-compile] 0.5801ms 0.1748ms 5.7200 KOps/s 5.8046 KOps/s $\color{#d91a1a}-1.46\%$
test_compile_add_one_flat[tensordict-eager] 0.3501ms 0.1886ms 5.3033 KOps/s 5.2928 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_flat[tensorclass-compile] 0.1017ms 45.3924μs 22.0301 KOps/s 21.6167 KOps/s $\color{#35bf28}+1.91\%$
test_compile_add_one_flat[tensorclass-eager] 0.5213ms 67.2109μs 14.8785 KOps/s 14.2203 KOps/s $\color{#35bf28}+4.63\%$
test_compile_add_one_flat[pytree-compile] 0.3932ms 0.1726ms 5.7931 KOps/s 5.7126 KOps/s $\color{#35bf28}+1.41\%$
test_compile_add_one_flat[pytree-eager] 0.6326ms 0.2966ms 3.3714 KOps/s 3.4275 KOps/s $\color{#d91a1a}-1.64\%$
test_compile_add_self_flat[tensordict-eager] 0.3501ms 0.2018ms 4.9566 KOps/s 4.9360 KOps/s $\color{#35bf28}+0.42\%$
test_compile_add_self_flat[tensordict-compile] 0.3433ms 0.1722ms 5.8065 KOps/s 5.8101 KOps/s $\color{#d91a1a}-0.06\%$
test_compile_add_self_flat[tensorclass-eager] 0.1309ms 62.4921μs 16.0020 KOps/s 15.9607 KOps/s $\color{#35bf28}+0.26\%$
test_compile_add_self_flat[tensorclass-compile] 0.1145ms 46.7061μs 21.4105 KOps/s 21.2532 KOps/s $\color{#35bf28}+0.74\%$
test_compile_add_self_flat[pytree-eager] 0.4879ms 0.2335ms 4.2829 KOps/s 4.2996 KOps/s $\color{#d91a1a}-0.39\%$
test_compile_add_self_flat[pytree-compile] 0.3418ms 0.1773ms 5.6403 KOps/s 5.7639 KOps/s $\color{#d91a1a}-2.15\%$
test_compile_copy_flat[tensordict-compile] 0.2041ms 0.1014ms 9.8655 KOps/s 9.8780 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_copy_flat[tensordict-eager] 0.1217ms 57.7168μs 17.3260 KOps/s 17.3664 KOps/s $\color{#d91a1a}-0.23\%$
test_compile_copy_flat[pytree-compile] 0.1427ms 74.9923μs 13.3347 KOps/s 12.9857 KOps/s $\color{#35bf28}+2.69\%$
test_compile_copy_flat[pytree-eager] 0.1684ms 67.6593μs 14.7799 KOps/s 14.4175 KOps/s $\color{#35bf28}+2.51\%$
test_compile_assign_and_add[tensordict-compile] 0.2896ms 0.1953ms 5.1208 KOps/s 5.1048 KOps/s $\color{#35bf28}+0.31\%$
test_compile_assign_and_add[tensordict-eager] 1.8342ms 1.6418ms 609.1020 Ops/s 615.4773 Ops/s $\color{#d91a1a}-1.04\%$
test_compile_assign_and_add[pytree-compile] 0.3927ms 0.1958ms 5.1061 KOps/s 5.1399 KOps/s $\color{#d91a1a}-0.66\%$
test_compile_assign_and_add[pytree-eager] 1.2408ms 1.1110ms 900.1267 Ops/s 920.3571 Ops/s $\color{#d91a1a}-2.20\%$
test_compile_assign_and_add_stack[compile] 0.8119ms 0.4147ms 2.4112 KOps/s 2.4008 KOps/s $\color{#35bf28}+0.43\%$
test_compile_assign_and_add_stack[eager] 4.0468ms 3.7243ms 268.5061 Ops/s 261.8371 Ops/s $\color{#35bf28}+2.55\%$
test_compile_indexing[tensor-tensordict-compile] 0.1021ms 33.2488μs 30.0763 KOps/s 29.4254 KOps/s $\color{#35bf28}+2.21\%$
test_compile_indexing[tensor-tensordict-eager] 1.0230ms 49.8845μs 20.0463 KOps/s 21.8330 KOps/s $\textbf{\color{#d91a1a}-8.18\%}$
test_compile_indexing[tensor-tensorclass-compile] 86.4420μs 28.5852μs 34.9832 KOps/s 34.3898 KOps/s $\color{#35bf28}+1.73\%$
test_compile_indexing[tensor-tensorclass-eager] 87.9340μs 29.1185μs 34.3425 KOps/s 35.9451 KOps/s $\color{#d91a1a}-4.46\%$
test_compile_indexing[tensor-pytree-compile] 99.4160μs 28.9139μs 34.5854 KOps/s 35.1161 KOps/s $\color{#d91a1a}-1.51\%$
test_compile_indexing[tensor-pytree-eager] 73.0570μs 28.9691μs 34.5195 KOps/s 35.7150 KOps/s $\color{#d91a1a}-3.35\%$
test_compile_indexing[slice-tensordict-compile] 0.1615ms 72.7969μs 13.7368 KOps/s 13.7553 KOps/s $\color{#d91a1a}-0.13\%$
test_compile_indexing[slice-tensordict-eager] 0.3306ms 29.2920μs 34.1390 KOps/s 38.1520 KOps/s $\textbf{\color{#d91a1a}-10.52\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1379ms 67.5257μs 14.8092 KOps/s 14.5380 KOps/s $\color{#35bf28}+1.87\%$
test_compile_indexing[slice-tensorclass-eager] 82.2840μs 23.9973μs 41.6713 KOps/s 44.4864 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_compile_indexing[slice-pytree-compile] 0.1370ms 67.2969μs 14.8595 KOps/s 15.0067 KOps/s $\color{#d91a1a}-0.98\%$
test_compile_indexing[slice-pytree-eager] 89.5570μs 23.3950μs 42.7441 KOps/s 44.0442 KOps/s $\color{#d91a1a}-2.95\%$
test_compile_indexing[int-tensordict-compile] 0.1503ms 71.0132μs 14.0819 KOps/s 13.8051 KOps/s $\color{#35bf28}+2.01\%$
test_compile_indexing[int-tensordict-eager] 0.7651ms 28.3622μs 35.2582 KOps/s 38.6481 KOps/s $\textbf{\color{#d91a1a}-8.77\%}$
test_compile_indexing[int-tensorclass-compile] 0.1443ms 67.2360μs 14.8730 KOps/s 15.0254 KOps/s $\color{#d91a1a}-1.01\%$
test_compile_indexing[int-tensorclass-eager] 64.9410μs 23.1968μs 43.1094 KOps/s 44.8198 KOps/s $\color{#d91a1a}-3.82\%$
test_compile_indexing[int-pytree-compile] 0.1465ms 67.7560μs 14.7588 KOps/s 15.0701 KOps/s $\color{#d91a1a}-2.07\%$
test_compile_indexing[int-pytree-eager] 63.7400μs 23.0279μs 43.4255 KOps/s 44.1298 KOps/s $\color{#d91a1a}-1.60\%$
test_mod_add[eager] 85.0990μs 23.9574μs 41.7408 KOps/s 42.8291 KOps/s $\color{#d91a1a}-2.54\%$
test_mod_add[compile] 0.1108ms 38.5543μs 25.9374 KOps/s 25.8387 KOps/s $\color{#35bf28}+0.38\%$
test_mod_add[compile-overhead] 0.1171ms 38.8462μs 25.7425 KOps/s 26.0543 KOps/s $\color{#d91a1a}-1.20\%$
test_mod_wrap[eager] 0.4121ms 0.2100ms 4.7622 KOps/s 4.9320 KOps/s $\color{#d91a1a}-3.44\%$
test_mod_wrap[compile] 0.3199ms 0.2304ms 4.3407 KOps/s 4.3872 KOps/s $\color{#d91a1a}-1.06\%$
test_mod_wrap[compile-overhead] 0.3436ms 0.2333ms 4.2869 KOps/s 4.4573 KOps/s $\color{#d91a1a}-3.82\%$
test_mod_wrap_and_backward[eager] 13.4459ms 11.3711ms 87.9422 Ops/s 86.8494 Ops/s $\color{#35bf28}+1.26\%$
test_mod_wrap_and_backward[compile] 17.3160ms 11.9237ms 83.8668 Ops/s 82.2129 Ops/s $\color{#35bf28}+2.01\%$
test_mod_wrap_and_backward[compile-overhead] 14.4306ms 11.9051ms 83.9979 Ops/s 84.8219 Ops/s $\color{#d91a1a}-0.97\%$
test_seq_add[eager] 0.2009ms 88.9791μs 11.2386 KOps/s 11.5296 KOps/s $\color{#d91a1a}-2.52\%$
test_seq_add[compile] 0.1521ms 63.0316μs 15.8650 KOps/s 16.3794 KOps/s $\color{#d91a1a}-3.14\%$
test_seq_add[compile-overhead] 0.1234ms 60.1963μs 16.6123 KOps/s 16.1774 KOps/s $\color{#35bf28}+2.69\%$
test_seq_wrap[eager] 0.5063ms 0.3813ms 2.6224 KOps/s 2.7005 KOps/s $\color{#d91a1a}-2.89\%$
test_seq_wrap[compile] 0.4197ms 0.2673ms 3.7414 KOps/s 3.7773 KOps/s $\color{#d91a1a}-0.95\%$
test_seq_wrap[compile-overhead] 0.4825ms 0.2713ms 3.6855 KOps/s 3.8296 KOps/s $\color{#d91a1a}-3.76\%$
test_func_call_runtime[False-eager] 0.7009ms 0.5302ms 1.8859 KOps/s 1.9592 KOps/s $\color{#d91a1a}-3.74\%$
test_func_call_runtime[False-compile] 0.9359ms 0.5021ms 1.9917 KOps/s 1.9963 KOps/s $\color{#d91a1a}-0.23\%$
test_func_call_runtime[False-compile-overhead] 0.8977ms 0.4991ms 2.0034 KOps/s 2.0264 KOps/s $\color{#d91a1a}-1.13\%$
test_func_call_runtime[True-eager] 0.8987ms 0.7535ms 1.3271 KOps/s 1.3834 KOps/s $\color{#d91a1a}-4.07\%$
test_func_call_runtime[True-compile] 0.5908ms 0.4983ms 2.0066 KOps/s 1.9960 KOps/s $\color{#35bf28}+0.53\%$
test_func_call_runtime[True-compile-overhead] 0.8761ms 0.5080ms 1.9685 KOps/s 1.9812 KOps/s $\color{#d91a1a}-0.64\%$
test_func_call_cm_runtime[False-eager] 0.7258ms 0.5243ms 1.9071 KOps/s 2.0010 KOps/s $\color{#d91a1a}-4.69\%$
test_func_call_cm_runtime[False-compile] 0.9294ms 0.4986ms 2.0055 KOps/s 1.9858 KOps/s $\color{#35bf28}+0.99\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9335ms 0.4982ms 2.0073 KOps/s 2.0196 KOps/s $\color{#d91a1a}-0.61\%$
test_func_call_cm_runtime[True-eager] 1.3900ms 0.8828ms 1.1328 KOps/s 1.1770 KOps/s $\color{#d91a1a}-3.76\%$
test_func_call_cm_runtime[True-compile] 1.0570ms 0.7391ms 1.3530 KOps/s 1.3841 KOps/s $\color{#d91a1a}-2.25\%$
test_func_call_cm_runtime[True-compile-overhead] 0.8495ms 0.7358ms 1.3590 KOps/s 1.3735 KOps/s $\color{#d91a1a}-1.05\%$
test_vmap_func_call_cm_runtime[eager] 2.7565ms 1.8449ms 542.0459 Ops/s 546.7703 Ops/s $\color{#d91a1a}-0.86\%$
test_vmap_func_call_cm_runtime[compile] 2.8769ms 1.9089ms 523.8739 Ops/s 531.6914 Ops/s $\color{#d91a1a}-1.47\%$
test_vmap_func_call_cm_runtime[compile-overhead] 2.6371ms 1.9006ms 526.1523 Ops/s 531.2148 Ops/s $\color{#d91a1a}-0.95\%$
test_distributed 0.2807ms 0.1246ms 8.0257 KOps/s 7.9131 KOps/s $\color{#35bf28}+1.42\%$
test_tdmodule 44.3430μs 17.5273μs 57.0538 KOps/s 55.9870 KOps/s $\color{#35bf28}+1.91\%$
test_tdmodule_dispatch 58.6100μs 35.5939μs 28.0947 KOps/s 27.0535 KOps/s $\color{#35bf28}+3.85\%$
test_tdseq 41.1270μs 20.1822μs 49.5486 KOps/s 49.8072 KOps/s $\color{#d91a1a}-0.52\%$
test_tdseq_dispatch 63.3690μs 40.4377μs 24.7294 KOps/s 24.1009 KOps/s $\color{#35bf28}+2.61\%$
test_instantiation_functorch 1.8009ms 1.5538ms 643.5786 Ops/s 632.1718 Ops/s $\color{#35bf28}+1.80\%$
test_instantiation_td 1.8249ms 1.1345ms 881.4540 Ops/s 876.6405 Ops/s $\color{#35bf28}+0.55\%$
test_exec_functorch 0.4220ms 0.1861ms 5.3748 KOps/s 5.4786 KOps/s $\color{#d91a1a}-1.89\%$
test_exec_functional_call 0.3411ms 0.1753ms 5.7050 KOps/s 5.9313 KOps/s $\color{#d91a1a}-3.82\%$
test_exec_td 0.4019ms 0.1688ms 5.9250 KOps/s 6.1271 KOps/s $\color{#d91a1a}-3.30\%$
test_exec_td_decorator 0.8240ms 0.2218ms 4.5090 KOps/s 4.6250 KOps/s $\color{#d91a1a}-2.51\%$
test_vmap_mlp_speed[True-True] 0.9886ms 0.6350ms 1.5748 KOps/s 1.5801 KOps/s $\color{#d91a1a}-0.33\%$
test_vmap_mlp_speed[True-False] 0.8441ms 0.6295ms 1.5886 KOps/s 1.5807 KOps/s $\color{#35bf28}+0.50\%$
test_vmap_mlp_speed[False-True] 0.7569ms 0.4964ms 2.0147 KOps/s 2.0597 KOps/s $\color{#d91a1a}-2.19\%$
test_vmap_mlp_speed[False-False] 0.6953ms 0.4926ms 2.0300 KOps/s 2.0612 KOps/s $\color{#d91a1a}-1.51\%$
test_vmap_mlp_speed_decorator[True-True] 1.4154ms 0.6112ms 1.6361 KOps/s 1.6399 KOps/s $\color{#d91a1a}-0.23\%$
test_vmap_mlp_speed_decorator[True-False] 0.9691ms 0.6109ms 1.6369 KOps/s 1.6346 KOps/s $\color{#35bf28}+0.15\%$
test_vmap_mlp_speed_decorator[False-True] 0.7487ms 0.5057ms 1.9775 KOps/s 2.0011 KOps/s $\color{#d91a1a}-1.18\%$
test_vmap_mlp_speed_decorator[False-False] 0.8037ms 0.5061ms 1.9757 KOps/s 1.9977 KOps/s $\color{#d91a1a}-1.10\%$
test_to_module_speed[True] 1.4515ms 1.2814ms 780.4160 Ops/s 782.7674 Ops/s $\color{#d91a1a}-0.30\%$
test_to_module_speed[False] 2.0881ms 1.2641ms 791.0929 Ops/s 808.9284 Ops/s $\color{#d91a1a}-2.20\%$
test_tc_init 82.2140μs 43.6034μs 22.9340 KOps/s 21.9664 KOps/s $\color{#35bf28}+4.40\%$
test_tc_init_nested 0.1570ms 87.5241μs 11.4254 KOps/s 10.8081 KOps/s $\textbf{\color{#35bf28}+5.71\%}$
test_tc_first_layer_tensor 19.2860μs 1.5466μs 646.5917 KOps/s 659.3353 KOps/s $\color{#d91a1a}-1.93\%$
test_tc_first_layer_nontensor 25.3570μs 4.6678μs 214.2327 KOps/s 212.8103 KOps/s $\color{#35bf28}+0.67\%$
test_tc_second_layer_tensor 22.3610μs 2.9120μs 343.4122 KOps/s 359.3074 KOps/s $\color{#d91a1a}-4.42\%$
test_tc_second_layer_nontensor 45.4740μs 6.0299μs 165.8395 KOps/s 166.6291 KOps/s $\color{#d91a1a}-0.47\%$
test_unbind 0.4447s 14.7339ms 67.8705 Ops/s 70.3303 Ops/s $\color{#d91a1a}-3.50\%$
test_full_like 7.8114ms 6.9753ms 143.3635 Ops/s 144.2791 Ops/s $\color{#d91a1a}-0.63\%$
test_zeros_like 3.0024ms 2.6575ms 376.2902 Ops/s 375.8864 Ops/s $\color{#35bf28}+0.11\%$
test_ones_like 9.4232ms 5.8553ms 170.7864 Ops/s 166.4527 Ops/s $\color{#35bf28}+2.60\%$
test_clone 12.7104ms 7.4784ms 133.7192 Ops/s 131.2904 Ops/s $\color{#35bf28}+1.85\%$
test_squeeze 71.5440μs 12.2660μs 81.5264 KOps/s 81.2834 KOps/s $\color{#35bf28}+0.30\%$
test_unsqueeze 0.2121ms 90.8917μs 11.0021 KOps/s 11.2838 KOps/s $\color{#d91a1a}-2.50\%$
test_split 0.3511ms 0.1919ms 5.2109 KOps/s 5.2620 KOps/s $\color{#d91a1a}-0.97\%$
test_permute 0.3771ms 0.2192ms 4.5624 KOps/s 4.6663 KOps/s $\color{#d91a1a}-2.23\%$
test_stack 27.1586ms 23.0610ms 43.3633 Ops/s 41.7951 Ops/s $\color{#35bf28}+3.75\%$
test_cat 30.1735ms 23.1037ms 43.2831 Ops/s 42.2925 Ops/s $\color{#35bf28}+2.34\%$
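
The Change column in the table above appears to be the relative difference between the "Ops" and "Ops on Repo HEAD" columns, and the bolded entries (6 positive, 16 negative in the CPU table) match the "Improved" and "Worsened" counts in the summary line. The following is a minimal sketch of that arithmetic, not the benchmark bot's actual code. The function names `relative_change` and `classify` are hypothetical, and the 5% threshold is an assumption inferred from which rows are bolded. The sample rows are copied from the table with the Change column left off.

```python
import re
from typing import Tuple

# One flattened row: name, max, mean, ops (+unit), ops on HEAD (+unit).
# Anything after the fifth column (the Change column) is ignored.
ROW = re.compile(
    r"^(?P<name>\S+)\s+\S+\s+\S+\s+"
    r"(?P<ops>[\d.]+) (?P<ops_unit>[KM]?Ops/s)\s+"
    r"(?P<head>[\d.]+) (?P<head_unit>[KM]?Ops/s)"
)

# Unit prefixes used in the table, converted to plain operations per second.
SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def relative_change(row: str) -> Tuple[str, float]:
    """Return (benchmark name, percent change of Ops relative to Repo HEAD)."""
    m = ROW.match(row)
    if m is None:
        raise ValueError(f"unrecognized row: {row!r}")
    ops = float(m["ops"]) * SCALE[m["ops_unit"]]
    head = float(m["head"]) * SCALE[m["head_unit"]]
    return m["name"], 100.0 * (ops - head) / head


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% cutoff is an assumption, not a documented value."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "within noise"


rows = [
    "test_plain_set_nested 54.7420μs 20.5189μs 48.7355 KOps/s 48.0005 KOps/s",
    "test_membership 2.6315μs 0.6805μs 1.4695 MOps/s 1.2244 MOps/s",
    "test_membership_stacked_nested_last 33.8630μs 12.9106μs 77.4558 KOps/s 259.7614 KOps/s",
]

for row in rows:
    name, pct = relative_change(row)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Converting both columns to plain Ops/s before dividing matters for rows where the two columns use different unit prefixes, which happens when a benchmark crosses a unit boundary between runs.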


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 228. Improved: $\large\color{#35bf28}38$. Worsened: $\large\color{#d91a1a}7$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1272ms 12.6479μs 79.0645 KOps/s 71.2511 KOps/s $\textbf{\color{#35bf28}+10.97\%}$
test_plain_set_stack_nested 43.2510μs 12.8152μs 78.0324 KOps/s 70.1767 KOps/s $\textbf{\color{#35bf28}+11.19\%}$
test_plain_set_nested_inplace 41.4410μs 13.8804μs 72.0440 KOps/s 66.0159 KOps/s $\textbf{\color{#35bf28}+9.13\%}$
test_plain_set_stack_nested_inplace 61.4710μs 13.8894μs 71.9972 KOps/s 66.6516 KOps/s $\textbf{\color{#35bf28}+8.02\%}$
test_items 45.5010μs 3.0318μs 329.8337 KOps/s 344.9332 KOps/s $\color{#d91a1a}-4.38\%$
test_items_nested 0.3503ms 0.3222ms 3.1036 KOps/s 3.1962 KOps/s $\color{#d91a1a}-2.90\%$
test_items_nested_locked 0.3647ms 0.3240ms 3.0862 KOps/s 3.1699 KOps/s $\color{#d91a1a}-2.64\%$
test_items_nested_leaf 87.5420μs 63.1215μs 15.8425 KOps/s 15.8725 KOps/s $\color{#d91a1a}-0.19\%$
test_items_stack_nested 0.3768ms 0.3243ms 3.0839 KOps/s 3.2002 KOps/s $\color{#d91a1a}-3.63\%$
test_items_stack_nested_leaf 91.5010μs 63.4123μs 15.7698 KOps/s 15.5043 KOps/s $\color{#35bf28}+1.71\%$
test_items_stack_nested_locked 0.3742ms 0.3255ms 3.0723 KOps/s 3.1784 KOps/s $\color{#d91a1a}-3.34\%$
test_keys 31.6600μs 3.4507μs 289.7946 KOps/s 278.1717 KOps/s $\color{#35bf28}+4.18\%$
test_keys_nested 84.3920μs 55.0573μs 18.1629 KOps/s 18.0186 KOps/s $\color{#35bf28}+0.80\%$
test_keys_nested_locked 2.5916ms 59.9591μs 16.6780 KOps/s 16.4156 KOps/s $\color{#35bf28}+1.60\%$
test_keys_nested_leaf 72.1220μs 45.2007μs 22.1236 KOps/s 21.5254 KOps/s $\color{#35bf28}+2.78\%$
test_keys_stack_nested 80.0520μs 54.8563μs 18.2294 KOps/s 17.7899 KOps/s $\color{#35bf28}+2.47\%$
test_keys_stack_nested_leaf 75.0620μs 46.3045μs 21.5962 KOps/s 20.9035 KOps/s $\color{#35bf28}+3.31\%$
test_keys_stack_nested_locked 0.1034ms 60.3762μs 16.5628 KOps/s 16.3752 KOps/s $\color{#35bf28}+1.15\%$
test_values 3.5061μs 0.8075μs 1.2384 MOps/s 1.1856 MOps/s $\color{#35bf28}+4.45\%$
test_values_nested 56.6010μs 27.4018μs 36.4939 KOps/s 36.3971 KOps/s $\color{#35bf28}+0.27\%$
test_values_nested_locked 60.4320μs 29.1845μs 34.2647 KOps/s 33.7568 KOps/s $\color{#35bf28}+1.50\%$
test_values_nested_leaf 49.9810μs 24.2089μs 41.3072 KOps/s 41.3675 KOps/s $\color{#d91a1a}-0.15\%$
test_values_stack_nested 63.7210μs 27.6704μs 36.1397 KOps/s 34.6599 KOps/s $\color{#35bf28}+4.27\%$
test_values_stack_nested_leaf 0.1619ms 24.4552μs 40.8911 KOps/s 39.3949 KOps/s $\color{#35bf28}+3.80\%$
test_values_stack_nested_locked 55.9010μs 29.6614μs 33.7139 KOps/s 32.3235 KOps/s $\color{#35bf28}+4.30\%$
test_membership 1.6651μs 0.4890μs 2.0451 MOps/s 2.0157 MOps/s $\color{#35bf28}+1.46\%$
test_membership_nested 14.9705μs 1.7941μs 557.3889 KOps/s 573.8109 KOps/s $\color{#d91a1a}-2.86\%$
test_membership_nested_leaf 12.7403μs 1.7592μs 568.4260 KOps/s 588.3350 KOps/s $\color{#d91a1a}-3.38\%$
test_membership_stacked_nested 35.7210μs 1.7971μs 556.4636 KOps/s 557.9374 KOps/s $\color{#d91a1a}-0.26\%$
test_membership_stacked_nested_leaf 29.7600μs 1.7739μs 563.7452 KOps/s 547.6994 KOps/s $\color{#35bf28}+2.93\%$
test_membership_nested_last 35.1200μs 2.6471μs 377.7692 KOps/s 380.3604 KOps/s $\color{#d91a1a}-0.68\%$
test_membership_nested_leaf_last 38.2710μs 2.6431μs 378.3365 KOps/s 379.3746 KOps/s $\color{#d91a1a}-0.27\%$
test_membership_stacked_nested_last 32.4410μs 2.6305μs 380.1617 KOps/s 387.1349 KOps/s $\color{#d91a1a}-1.80\%$
test_membership_stacked_nested_leaf_last 26.0610μs 2.6246μs 381.0102 KOps/s 387.3926 KOps/s $\color{#d91a1a}-1.65\%$
test_nested_getleaf 60.0610μs 6.0919μs 164.1530 KOps/s 164.2107 KOps/s $\color{#d91a1a}-0.04\%$
test_nested_get 34.7610μs 5.7195μs 174.8397 KOps/s 173.5060 KOps/s $\color{#35bf28}+0.77\%$
test_stacked_getleaf 29.3210μs 6.0826μs 164.4038 KOps/s 166.0135 KOps/s $\color{#d91a1a}-0.97\%$
test_stacked_get 43.3310μs 5.8221μs 171.7589 KOps/s 177.0581 KOps/s $\color{#d91a1a}-2.99\%$
test_nested_getitemleaf 27.4310μs 6.1445μs 162.7471 KOps/s 162.5708 KOps/s $\color{#35bf28}+0.11\%$
test_nested_getitem 43.6810μs 5.7260μs 174.6427 KOps/s 176.1164 KOps/s $\color{#d91a1a}-0.84\%$
test_stacked_getitemleaf 33.6810μs 6.1504μs 162.5899 KOps/s 165.7532 KOps/s $\color{#d91a1a}-1.91\%$
test_stacked_getitem 51.4510μs 5.7452μs 174.0584 KOps/s 178.2240 KOps/s $\color{#d91a1a}-2.34\%$
test_lock_nested 4.9713ms 0.4208ms 2.3764 KOps/s 2.3936 KOps/s $\color{#d91a1a}-0.72\%$
test_lock_stack_nested 0.4687ms 0.3847ms 2.5997 KOps/s 2.6051 KOps/s $\color{#d91a1a}-0.21\%$
test_unlock_nested 0.7767ms 0.3539ms 2.8258 KOps/s 2.8443 KOps/s $\color{#d91a1a}-0.65\%$
test_unlock_stack_nested 0.3566ms 0.3208ms 3.1169 KOps/s 3.1526 KOps/s $\color{#d91a1a}-1.13\%$
test_flatten_speed 0.3104ms 79.6385μs 12.5567 KOps/s 12.6540 KOps/s $\color{#d91a1a}-0.77\%$
test_unflatten_speed 0.3267ms 0.2859ms 3.4978 KOps/s 3.5564 KOps/s $\color{#d91a1a}-1.65\%$
test_common_ops 1.3845ms 1.1762ms 850.1715 Ops/s 798.0044 Ops/s $\textbf{\color{#35bf28}+6.54\%}$
test_creation 32.4310μs 1.4467μs 691.2330 KOps/s 693.9780 KOps/s $\color{#d91a1a}-0.40\%$
test_creation_empty 48.5410μs 13.2013μs 75.7500 KOps/s 62.8318 KOps/s $\textbf{\color{#35bf28}+20.56\%}$
test_creation_nested_1 44.8810μs 15.0899μs 66.2697 KOps/s 55.3472 KOps/s $\textbf{\color{#35bf28}+19.73\%}$
test_creation_nested_2 52.9010μs 17.4062μs 57.4508 KOps/s 48.7703 KOps/s $\textbf{\color{#35bf28}+17.80\%}$
test_clone 73.5110μs 31.1958μs 32.0556 KOps/s 35.0201 KOps/s $\textbf{\color{#d91a1a}-8.47\%}$
test_getitem[int] 1.4380ms 15.2153μs 65.7232 KOps/s 63.8887 KOps/s $\color{#35bf28}+2.87\%$
test_getitem[slice_int] 0.1174ms 26.4723μs 37.7753 KOps/s 36.6802 KOps/s $\color{#35bf28}+2.99\%$
test_getitem[range] 0.2271ms 0.1071ms 9.3379 KOps/s 9.1191 KOps/s $\color{#35bf28}+2.40\%$
test_getitem[tuple] 0.1168ms 22.3641μs 44.7146 KOps/s 43.4083 KOps/s $\color{#35bf28}+3.01\%$
test_getitem[list] 0.2153ms 95.5565μs 10.4650 KOps/s 10.2786 KOps/s $\color{#35bf28}+1.81\%$
test_setitem_dim[int] 67.9010μs 44.1850μs 22.6321 KOps/s 22.4516 KOps/s $\color{#35bf28}+0.80\%$
test_setitem_dim[slice_int] 92.1320μs 65.3007μs 15.3138 KOps/s 14.9776 KOps/s $\color{#35bf28}+2.24\%$
test_setitem_dim[range] 0.1562ms 0.1238ms 8.0743 KOps/s 7.9471 KOps/s $\color{#35bf28}+1.60\%$
test_setitem_dim[tuple] 90.6620μs 59.8246μs 16.7155 KOps/s 16.5676 KOps/s $\color{#35bf28}+0.89\%$
test_setitem 75.4410μs 39.4496μs 25.3488 KOps/s 24.1797 KOps/s $\color{#35bf28}+4.83\%$
test_set 0.1173ms 38.0212μs 26.3011 KOps/s 24.6316 KOps/s $\textbf{\color{#35bf28}+6.78\%}$
test_set_shared 0.3627ms 49.6505μs 20.1408 KOps/s 19.8430 KOps/s $\color{#35bf28}+1.50\%$
test_update 80.3610μs 45.2247μs 22.1118 KOps/s 20.5904 KOps/s $\textbf{\color{#35bf28}+7.39\%}$
test_update_nested 95.0420μs 52.8372μs 18.9261 KOps/s 18.0119 KOps/s $\textbf{\color{#35bf28}+5.08\%}$
test_update__nested 0.2019ms 58.8796μs 16.9838 KOps/s 17.0755 KOps/s $\color{#d91a1a}-0.54\%$
test_set_nested 0.1970ms 40.9434μs 24.4240 KOps/s 23.2449 KOps/s $\textbf{\color{#35bf28}+5.07\%}$
test_set_nested_new 92.9520μs 44.5847μs 22.4292 KOps/s 21.5635 KOps/s $\color{#35bf28}+4.01\%$
test_select 0.2061ms 57.1929μs 17.4847 KOps/s 16.8426 KOps/s $\color{#35bf28}+3.81\%$
test_select_nested 68.9620μs 42.4739μs 23.5439 KOps/s 23.6305 KOps/s $\color{#d91a1a}-0.37\%$
test_exclude_nested 86.6120μs 59.5582μs 16.7903 KOps/s 17.2406 KOps/s $\color{#d91a1a}-2.61\%$
test_empty[True] 0.3007ms 0.2480ms 4.0319 KOps/s 4.1560 KOps/s $\color{#d91a1a}-2.99\%$
test_empty[False] 5.4381μs 0.8112μs 1.2327 MOps/s 1.2241 MOps/s $\color{#35bf28}+0.70\%$
test_to 66.9110μs 25.4625μs 39.2734 KOps/s 39.3421 KOps/s $\color{#d91a1a}-0.17\%$
test_to_nonblocking 98.2620μs 24.2922μs 41.1654 KOps/s 41.1804 KOps/s $\color{#d91a1a}-0.04\%$
test_unbind_speed 1.2004ms 0.2754ms 3.6305 KOps/s 3.6428 KOps/s $\color{#d91a1a}-0.34\%$
test_unbind_speed_stack0 0.3732ms 0.2721ms 3.6754 KOps/s 3.6705 KOps/s $\color{#35bf28}+0.14\%$
test_unbind_speed_stack1 92.7395ms 0.7133ms 1.4020 KOps/s 1.4131 KOps/s $\color{#d91a1a}-0.79\%$
test_split 93.9280ms 2.1714ms 460.5306 Ops/s 462.5995 Ops/s $\color{#d91a1a}-0.45\%$
test_chunk 94.1099ms 2.1024ms 475.6544 Ops/s 464.5054 Ops/s $\color{#35bf28}+2.40\%$
test_creation[device0] 0.3465ms 0.1251ms 7.9962 KOps/s 8.0375 KOps/s $\color{#d91a1a}-0.51\%$
test_creation_from_tensor 0.3878ms 0.1293ms 7.7310 KOps/s 7.8486 KOps/s $\color{#d91a1a}-1.50\%$
test_add_one[memmap_tensor0] 0.2304ms 9.6655μs 103.4605 KOps/s 114.0736 KOps/s $\textbf{\color{#d91a1a}-9.30\%}$
test_contiguous[memmap_tensor0] 22.4610μs 2.1238μs 470.8497 KOps/s 470.0223 KOps/s $\color{#35bf28}+0.18\%$
test_stack[memmap_tensor0] 36.0410μs 6.4422μs 155.2253 KOps/s 151.1681 KOps/s $\color{#35bf28}+2.68\%$
test_memmaptd_index 1.0969ms 0.4059ms 2.4636 KOps/s 2.4182 KOps/s $\color{#35bf28}+1.88\%$
test_memmaptd_index_astensor 0.7351ms 0.4651ms 2.1499 KOps/s 2.1042 KOps/s $\color{#35bf28}+2.17\%$
test_memmaptd_index_op 1.4159ms 0.9776ms 1.0229 KOps/s 982.1350 Ops/s $\color{#35bf28}+4.15\%$
test_serialize_model 0.1299s 0.1294s 7.7269 Ops/s 7.7022 Ops/s $\color{#35bf28}+0.32\%$
test_serialize_model_pickle 1.3485s 1.2128s 0.8245 Ops/s 0.8250 Ops/s $\color{#d91a1a}-0.06\%$
test_serialize_weights 0.1298s 0.1284s 7.7852 Ops/s 7.0121 Ops/s $\textbf{\color{#35bf28}+11.03\%}$
test_serialize_weights_returnearly 0.2129s 55.2839ms 18.0885 Ops/s 17.8675 Ops/s $\color{#35bf28}+1.24\%$
test_serialize_weights_pickle 1.3719s 1.2166s 0.8220 Ops/s 0.8213 Ops/s $\color{#35bf28}+0.08\%$
test_reshape_pytree 79.9220μs 35.1871μs 28.4195 KOps/s 28.2253 KOps/s $\color{#35bf28}+0.69\%$
test_reshape_td 0.1151ms 40.9953μs 24.3930 KOps/s 24.0161 KOps/s $\color{#35bf28}+1.57\%$
test_view_pytree 68.5410μs 34.6722μs 28.8415 KOps/s 28.4167 KOps/s $\color{#35bf28}+1.49\%$
test_view_td 86.2020μs 45.0386μs 22.2032 KOps/s 21.1398 KOps/s $\textbf{\color{#35bf28}+5.03\%}$
test_unbind_pytree 70.5210μs 33.9355μs 29.4676 KOps/s 29.5058 KOps/s $\color{#d91a1a}-0.13\%$
test_unbind_td 0.4104ms 42.4608μs 23.5511 KOps/s 23.7364 KOps/s $\color{#d91a1a}-0.78\%$
test_split_pytree 0.3764ms 43.7660μs 22.8488 KOps/s 22.3311 KOps/s $\color{#35bf28}+2.32\%$
test_split_td 93.5106ms 63.1541μs 15.8343 KOps/s 18.2463 KOps/s $\textbf{\color{#d91a1a}-13.22\%}$
test_add_pytree 0.1055ms 55.7842μs 17.9262 KOps/s 17.3529 KOps/s $\color{#35bf28}+3.30\%$
test_add_td 0.1588ms 88.2193μs 11.3354 KOps/s 11.2727 KOps/s $\color{#35bf28}+0.56\%$
test_compile_add_one_nested[tensordict-compile] 0.4018ms 0.2034ms 4.9174 KOps/s 4.7689 KOps/s $\color{#35bf28}+3.11\%$
test_compile_add_one_nested[tensordict-eager] 0.2966ms 0.1558ms 6.4166 KOps/s 6.4119 KOps/s $\color{#35bf28}+0.07\%$
test_compile_add_one_nested[pytree-compile] 0.1955ms 0.1410ms 7.0901 KOps/s 7.0544 KOps/s $\color{#35bf28}+0.51\%$
test_compile_add_one_nested[pytree-eager] 0.2378ms 0.1768ms 5.6546 KOps/s 5.1736 KOps/s $\textbf{\color{#35bf28}+9.30\%}$
test_compile_copy_nested[tensordict-compile] 72.9410μs 20.3098μs 49.2374 KOps/s 48.1040 KOps/s $\color{#35bf28}+2.36\%$
test_compile_copy_nested[tensordict-eager] 90.3520μs 44.0442μs 22.7045 KOps/s 22.9716 KOps/s $\color{#d91a1a}-1.16\%$
test_compile_copy_nested[pytree-compile] 0.3261ms 63.9328μs 15.6414 KOps/s 15.8899 KOps/s $\color{#d91a1a}-1.56\%$
test_compile_copy_nested[pytree-eager] 0.1357ms 48.9603μs 20.4247 KOps/s 20.1194 KOps/s $\color{#35bf28}+1.52\%$
test_compile_add_one_flat[tensordict-compile] 0.4141ms 0.3092ms 3.2339 KOps/s 3.2689 KOps/s $\color{#d91a1a}-1.07\%$
test_compile_add_one_flat[tensordict-eager] 0.2666ms 0.2052ms 4.8737 KOps/s 4.7444 KOps/s $\color{#35bf28}+2.72\%$
test_compile_add_one_flat[tensorclass-compile] 0.1933ms 0.1299ms 7.6992 KOps/s 7.9547 KOps/s $\color{#d91a1a}-3.21\%$
test_compile_add_one_flat[tensorclass-eager] 0.1219ms 61.4854μs 16.2640 KOps/s 16.1005 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_one_flat[pytree-compile] 0.3562ms 0.3071ms 3.2560 KOps/s 3.2555 KOps/s $\color{#35bf28}+0.02\%$
test_compile_add_one_flat[pytree-eager] 0.6827ms 0.5927ms 1.6872 KOps/s 1.5409 KOps/s $\textbf{\color{#35bf28}+9.49\%}$
test_compile_add_self_flat[tensordict-eager] 0.3910ms 0.2466ms 4.0553 KOps/s 3.9409 KOps/s $\color{#35bf28}+2.90\%$
test_compile_add_self_flat[tensordict-compile] 0.3634ms 0.3103ms 3.2231 KOps/s 3.2291 KOps/s $\color{#d91a1a}-0.19\%$
test_compile_add_self_flat[tensorclass-eager] 0.1616ms 69.7314μs 14.3407 KOps/s 13.7803 KOps/s $\color{#35bf28}+4.07\%$
test_compile_add_self_flat[tensorclass-compile] 0.2304ms 0.1296ms 7.7172 KOps/s 7.8484 KOps/s $\color{#d91a1a}-1.67\%$
test_compile_add_self_flat[pytree-eager] 0.5883ms 0.5049ms 1.9806 KOps/s 1.7486 KOps/s $\textbf{\color{#35bf28}+13.27\%}$
test_compile_add_self_flat[pytree-compile] 0.3975ms 0.3082ms 3.2447 KOps/s 3.2403 KOps/s $\color{#35bf28}+0.13\%$
test_compile_copy_flat[tensordict-compile] 79.9120μs 19.1846μs 52.1250 KOps/s 56.0616 KOps/s $\textbf{\color{#d91a1a}-7.02\%}$
test_compile_copy_flat[tensordict-eager] 0.1041ms 29.4354μs 33.9727 KOps/s 34.5564 KOps/s $\color{#d91a1a}-1.69\%$
test_compile_copy_flat[pytree-compile] 0.1016ms 68.2966μs 14.6420 KOps/s 14.5123 KOps/s $\color{#35bf28}+0.89\%$
test_compile_copy_flat[pytree-eager] 0.1017ms 51.3213μs 19.4851 KOps/s 19.3524 KOps/s $\color{#35bf28}+0.69\%$
test_compile_assign_and_add[tensordict-compile] 2.2503ms 0.7754ms 1.2897 KOps/s 1.1800 KOps/s $\textbf{\color{#35bf28}+9.29\%}$
test_compile_assign_and_add[tensordict-eager] 3.3243ms 3.1424ms 318.2268 Ops/s 306.5514 Ops/s $\color{#35bf28}+3.81\%$
test_compile_assign_and_add[pytree-compile] 2.2384ms 0.7754ms 1.2897 KOps/s 1.1620 KOps/s $\textbf{\color{#35bf28}+10.99\%}$
test_compile_assign_and_add[pytree-eager] 3.1975ms 3.0562ms 327.2024 Ops/s 301.9275 Ops/s $\textbf{\color{#35bf28}+8.37\%}$
test_compile_indexing[tensor-tensordict-compile] 0.1539ms 0.1071ms 9.3368 KOps/s 8.9468 KOps/s $\color{#35bf28}+4.36\%$
test_compile_indexing[tensor-tensordict-eager] 0.1850ms 57.9551μs 17.2547 KOps/s 15.6137 KOps/s $\textbf{\color{#35bf28}+10.51\%}$
test_compile_indexing[tensor-tensorclass-compile] 0.2102ms 0.1017ms 9.8295 KOps/s 9.8345 KOps/s $\color{#d91a1a}-0.05\%$
test_compile_indexing[tensor-tensorclass-eager] 74.4510μs 41.0971μs 24.3326 KOps/s 23.5153 KOps/s $\color{#35bf28}+3.48\%$
test_compile_indexing[tensor-pytree-compile] 0.1427ms 0.1041ms 9.6046 KOps/s 9.7244 KOps/s $\color{#d91a1a}-1.23\%$
test_compile_indexing[tensor-pytree-eager] 82.8820μs 44.8733μs 22.2850 KOps/s 23.6254 KOps/s $\textbf{\color{#d91a1a}-5.67\%}$
test_compile_indexing[slice-tensordict-compile] 0.2644ms 0.1350ms 7.4098 KOps/s 7.4517 KOps/s $\color{#d91a1a}-0.56\%$
test_compile_indexing[slice-tensordict-eager] 0.1497ms 24.2411μs 41.2523 KOps/s 40.7895 KOps/s $\color{#35bf28}+1.13\%$
test_compile_indexing[slice-tensorclass-compile] 0.1871ms 0.1280ms 7.8118 KOps/s 7.7729 KOps/s $\color{#35bf28}+0.50\%$
test_compile_indexing[slice-tensorclass-eager] 54.1110μs 20.3984μs 49.0236 KOps/s 50.0281 KOps/s $\color{#d91a1a}-2.01\%$
test_compile_indexing[slice-pytree-compile] 0.1881ms 0.1288ms 7.7632 KOps/s 7.7411 KOps/s $\color{#35bf28}+0.29\%$
test_compile_indexing[slice-pytree-eager] 52.7410μs 19.9443μs 50.1396 KOps/s 49.9158 KOps/s $\color{#35bf28}+0.45\%$
test_compile_indexing[int-tensordict-compile] 0.1963ms 0.1346ms 7.4303 KOps/s 7.4087 KOps/s $\color{#35bf28}+0.29\%$
test_compile_indexing[int-tensordict-eager] 0.4974ms 24.2330μs 41.2660 KOps/s 40.3213 KOps/s $\color{#35bf28}+2.34\%$
test_compile_indexing[int-tensorclass-compile] 0.2725ms 0.1332ms 7.5065 KOps/s 7.7306 KOps/s $\color{#d91a1a}-2.90\%$
test_compile_indexing[int-tensorclass-eager] 46.9810μs 20.1197μs 49.7026 KOps/s 49.4971 KOps/s $\color{#35bf28}+0.42\%$
test_compile_indexing[int-pytree-compile] 0.1729ms 0.1286ms 7.7763 KOps/s 7.7427 KOps/s $\color{#35bf28}+0.43\%$
test_compile_indexing[int-pytree-eager] 50.9610μs 20.1450μs 49.6402 KOps/s 49.2318 KOps/s $\color{#35bf28}+0.83\%$
test_mod_add[eager] 72.4910μs 29.7805μs 33.5790 KOps/s 32.2745 KOps/s $\color{#35bf28}+4.04\%$
test_mod_add[compile] 0.2963ms 68.1403μs 14.6756 KOps/s 14.4776 KOps/s $\color{#35bf28}+1.37\%$
test_mod_add[compile-overhead] 0.2604ms 0.1382ms 7.2382 KOps/s 7.2275 KOps/s $\color{#35bf28}+0.15\%$
test_mod_wrap[eager] 0.3169ms 0.2327ms 4.2977 KOps/s 4.1527 KOps/s $\color{#35bf28}+3.49\%$
test_mod_wrap[compile] 1.1144ms 0.2787ms 3.5887 KOps/s 3.3521 KOps/s $\textbf{\color{#35bf28}+7.06\%}$
test_mod_wrap[compile-overhead] 7.9303ms 4.1651ms 240.0909 Ops/s 245.0415 Ops/s $\color{#d91a1a}-2.02\%$
test_mod_wrap_and_backward[eager] 1.4354ms 1.3335ms 749.9201 Ops/s 695.6471 Ops/s $\textbf{\color{#35bf28}+7.80\%}$
test_mod_wrap_and_backward[compile] 1.6683ms 1.3060ms 765.7096 Ops/s 706.3970 Ops/s $\textbf{\color{#35bf28}+8.40\%}$
test_mod_wrap_and_backward[compile-overhead] 1.2739ms 0.8724ms 1.1462 KOps/s 1.0246 KOps/s $\textbf{\color{#35bf28}+11.87\%}$
test_seq_add[eager] 0.1413ms 90.1744μs 11.0896 KOps/s 10.3740 KOps/s $\textbf{\color{#35bf28}+6.90\%}$
test_seq_add[compile] 0.3686ms 81.2631μs 12.3057 KOps/s 12.6034 KOps/s $\color{#d91a1a}-2.36\%$
test_seq_add[compile-overhead] 0.1649ms 0.1175ms 8.5125 KOps/s 8.9336 KOps/s $\color{#d91a1a}-4.71\%$
test_seq_wrap[eager] 0.4269ms 0.3731ms 2.6802 KOps/s 2.6070 KOps/s $\color{#35bf28}+2.81\%$
test_seq_wrap[compile] 0.4108ms 0.3048ms 3.2812 KOps/s 3.2936 KOps/s $\color{#d91a1a}-0.38\%$
test_seq_wrap[compile-overhead] 0.2583ms 0.2092ms 4.7796 KOps/s 4.8256 KOps/s $\color{#d91a1a}-0.95\%$
test_func_call_runtime[False-eager] 0.8610ms 0.7413ms 1.3491 KOps/s 1.3596 KOps/s $\color{#d91a1a}-0.77\%$
test_func_call_runtime[False-compile] 0.9782ms 0.7873ms 1.2701 KOps/s 1.2903 KOps/s $\color{#d91a1a}-1.57\%$
test_func_call_runtime[False-compile-overhead] 0.4119ms 0.3416ms 2.9271 KOps/s 2.9118 KOps/s $\color{#35bf28}+0.52\%$
test_func_call_runtime[True-eager] 1.0131ms 0.8878ms 1.1264 KOps/s 1.1217 KOps/s $\color{#35bf28}+0.42\%$
test_func_call_runtime[True-compile] 0.9340ms 0.7974ms 1.2541 KOps/s 1.2370 KOps/s $\color{#35bf28}+1.38\%$
test_func_call_runtime[True-compile-overhead] 0.4515ms 0.3734ms 2.6782 KOps/s 2.6576 KOps/s $\color{#35bf28}+0.77\%$
test_func_call_cm_runtime[False-eager] 0.8329ms 0.7150ms 1.3985 KOps/s 1.3767 KOps/s $\color{#35bf28}+1.59\%$
test_func_call_cm_runtime[False-compile] 0.8512ms 0.7652ms 1.3068 KOps/s 1.2833 KOps/s $\color{#35bf28}+1.83\%$
test_func_call_cm_runtime[False-compile-overhead] 0.3895ms 0.3419ms 2.9244 KOps/s 2.9003 KOps/s $\color{#35bf28}+0.83\%$
test_func_call_cm_runtime[True-eager] 1.0828ms 0.9765ms 1.0240 KOps/s 1.0150 KOps/s $\color{#35bf28}+0.89\%$
test_func_call_cm_runtime[True-compile] 0.9805ms 0.8238ms 1.2138 KOps/s 1.1945 KOps/s $\color{#35bf28}+1.62\%$
test_func_call_cm_runtime[True-compile-overhead] 0.4801ms 0.3961ms 2.5245 KOps/s 2.4745 KOps/s $\color{#35bf28}+2.02\%$
test_vmap_func_call_cm_runtime[eager] 2.5598ms 2.0446ms 489.0969 Ops/s 481.9476 Ops/s $\color{#35bf28}+1.48\%$
test_vmap_func_call_cm_runtime[compile] 0.9430ms 0.8377ms 1.1938 KOps/s 1.1571 KOps/s $\color{#35bf28}+3.17\%$
test_vmap_func_call_cm_runtime[compile-overhead] 0.5319ms 0.4034ms 2.4788 KOps/s 2.4479 KOps/s $\color{#35bf28}+1.26\%$
test_distributed 0.8357ms 0.1545ms 6.4707 KOps/s 8.4430 KOps/s $\textbf{\color{#d91a1a}-23.36\%}$
test_tdmodule 0.5401ms 14.0770μs 71.0379 KOps/s 66.6883 KOps/s $\textbf{\color{#35bf28}+6.52\%}$
test_tdmodule_dispatch 46.8610μs 26.1770μs 38.2014 KOps/s 32.4092 KOps/s $\textbf{\color{#35bf28}+17.87\%}$
test_tdseq 31.6600μs 13.5733μs 73.6738 KOps/s 60.1405 KOps/s $\textbf{\color{#35bf28}+22.50\%}$
test_tdseq_dispatch 61.0210μs 28.5867μs 34.9813 KOps/s 29.2981 KOps/s $\textbf{\color{#35bf28}+19.40\%}$
test_instantiation_functorch 1.9014ms 1.8019ms 554.9579 Ops/s 525.5641 Ops/s $\textbf{\color{#35bf28}+5.59\%}$
test_instantiation_td 1.8244ms 1.1748ms 851.2334 Ops/s 840.3486 Ops/s $\color{#35bf28}+1.30\%$
test_exec_functorch 0.2423ms 0.2072ms 4.8270 KOps/s 4.4588 KOps/s $\textbf{\color{#35bf28}+8.26\%}$
test_exec_functional_call 0.2386ms 0.2028ms 4.9304 KOps/s 4.4565 KOps/s $\textbf{\color{#35bf28}+10.63\%}$
test_exec_td 0.2416ms 0.2084ms 4.7976 KOps/s 4.6855 KOps/s $\color{#35bf28}+2.39\%$
test_exec_td_decorator 1.0271ms 0.2500ms 3.9994 KOps/s 3.8963 KOps/s $\color{#35bf28}+2.65\%$
test_vmap_mlp_speed[True-True] 0.7783ms 0.6678ms 1.4975 KOps/s 1.4635 KOps/s $\color{#35bf28}+2.32\%$
test_vmap_mlp_speed[True-False] 0.7483ms 0.6648ms 1.5043 KOps/s 1.4695 KOps/s $\color{#35bf28}+2.37\%$
test_vmap_mlp_speed[False-True] 0.6816ms 0.5684ms 1.7594 KOps/s 1.7407 KOps/s $\color{#35bf28}+1.08\%$
test_vmap_mlp_speed[False-False] 0.6118ms 0.5693ms 1.7567 KOps/s 1.7375 KOps/s $\color{#35bf28}+1.10\%$
test_vmap_mlp_speed_decorator[True-True] 1.2838ms 0.6473ms 1.5449 KOps/s 1.4878 KOps/s $\color{#35bf28}+3.84\%$
test_vmap_mlp_speed_decorator[True-False] 0.7798ms 0.6505ms 1.5373 KOps/s 1.4246 KOps/s $\textbf{\color{#35bf28}+7.91\%}$
test_vmap_mlp_speed_decorator[False-True] 0.6835ms 0.5770ms 1.7331 KOps/s 1.6193 KOps/s $\textbf{\color{#35bf28}+7.03\%}$
test_vmap_mlp_speed_decorator[False-False] 0.7144ms 0.5885ms 1.6993 KOps/s 1.6080 KOps/s $\textbf{\color{#35bf28}+5.68\%}$
test_vmap_transformer_speed[True-True] 8.2680ms 8.1784ms 122.2736 Ops/s 119.6518 Ops/s $\color{#35bf28}+2.19\%$
test_vmap_transformer_speed[True-False] 8.5151ms 8.1632ms 122.5014 Ops/s 120.1227 Ops/s $\color{#35bf28}+1.98\%$
test_vmap_transformer_speed[False-True] 8.2883ms 7.9941ms 125.0920 Ops/s 123.0879 Ops/s $\color{#35bf28}+1.63\%$
test_vmap_transformer_speed[False-False] 8.3450ms 7.9808ms 125.3005 Ops/s 122.5445 Ops/s $\color{#35bf28}+2.25\%$
test_vmap_transformer_speed_decorator[True-True] 19.9069ms 19.2251ms 52.0154 Ops/s 51.6018 Ops/s $\color{#35bf28}+0.80\%$
test_vmap_transformer_speed_decorator[True-False] 19.8358ms 19.2965ms 51.8228 Ops/s 51.4379 Ops/s $\color{#35bf28}+0.75\%$
test_vmap_transformer_speed_decorator[False-True] 19.6789ms 19.0201ms 52.5760 Ops/s 51.8664 Ops/s $\color{#35bf28}+1.37\%$
test_vmap_transformer_speed_decorator[False-False] 20.1934ms 19.0286ms 52.5524 Ops/s 51.7766 Ops/s $\color{#35bf28}+1.50\%$
test_to_module_speed[True] 1.3450ms 0.9251ms 1.0809 KOps/s 1.0895 KOps/s $\color{#d91a1a}-0.78\%$
test_to_module_speed[False] 1.2820ms 0.9074ms 1.1021 KOps/s 1.1215 KOps/s $\color{#d91a1a}-1.73\%$
test_tc_init 63.4110μs 30.5347μs 32.7496 KOps/s 27.4110 KOps/s $\textbf{\color{#35bf28}+19.48\%}$
test_tc_init_nested 0.1027ms 62.1177μs 16.0985 KOps/s 13.9107 KOps/s $\textbf{\color{#35bf28}+15.73\%}$
test_tc_first_layer_tensor 5.1630μs 0.6900μs 1.4492 MOps/s 1.4215 MOps/s $\color{#35bf28}+1.95\%$
test_tc_first_layer_nontensor 22.7210μs 2.2340μs 447.6302 KOps/s 442.2745 KOps/s $\color{#35bf28}+1.21\%$
test_tc_second_layer_tensor 17.1870μs 1.3953μs 716.6764 KOps/s 719.0131 KOps/s $\color{#d91a1a}-0.32\%$
test_tc_second_layer_nontensor 25.7410μs 2.9278μs 341.5529 KOps/s 339.2350 KOps/s $\color{#35bf28}+0.68\%$
test_unbind 0.1904s 11.9468ms 83.7044 Ops/s 93.5707 Ops/s $\textbf{\color{#d91a1a}-10.54\%}$
test_full_like 0.6585ms 0.5756ms 1.7374 KOps/s 1.7431 KOps/s $\color{#d91a1a}-0.33\%$
test_zeros_like 0.2587ms 0.1979ms 5.0538 KOps/s 5.0523 KOps/s $\color{#35bf28}+0.03\%$
test_ones_like 0.2412ms 0.1978ms 5.0560 KOps/s 5.0558 KOps/s $+0.00\%$
test_clone 0.4488ms 0.4145ms 2.4128 KOps/s 2.4182 KOps/s $\color{#d91a1a}-0.22\%$
test_squeeze 37.5510μs 9.6871μs 103.2299 KOps/s 102.0674 KOps/s $\color{#35bf28}+1.14\%$
test_unsqueeze 0.2818ms 72.4216μs 13.8080 KOps/s 13.5565 KOps/s $\color{#35bf28}+1.86\%$
test_split 0.2497ms 0.1552ms 6.4454 KOps/s 6.4005 KOps/s $\color{#35bf28}+0.70\%$
test_permute 0.2241ms 0.1765ms 5.6662 KOps/s 5.7126 KOps/s $\color{#d91a1a}-0.81\%$
test_stack 1.2560ms 0.8557ms 1.1687 KOps/s 1.1488 KOps/s $\color{#35bf28}+1.73\%$
test_cat 1.2577ms 1.2314ms 812.0757 Ops/s 811.6963 Ops/s $\color{#35bf28}+0.05\%$

@vmoens added the bug (Something isn't working) label on Sep 13, 2024
@vmoens merged commit d9d9225 into gh/vmoens/17/base on Sep 13, 2024
42 of 50 checks passed
@vmoens deleted the gh/vmoens/17/head branch on September 13, 2024, 08:21
Labels: bug (Something isn't working), CLA Signed (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)