[Refactor] Cleanup deprecation of empty td filtering in apply #665

Merged: vmoens merged 6 commits into main from cleanup-deprec on Feb 6, 2024


@vmoens (Contributor) commented Feb 6, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Feb 6, 2024

github-actions bot commented Feb 6, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 124. Improved: $\large\color{#35bf28}9$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 40.3250μs 17.2379μs 58.0118 KOps/s 60.1854 KOps/s $\color{#d91a1a}-3.61\%$
test_plain_set_stack_nested 0.1777ms 0.1471ms 6.7994 KOps/s 6.9281 KOps/s $\color{#d91a1a}-1.86\%$
test_plain_set_nested_inplace 63.6890μs 19.5148μs 51.2430 KOps/s 52.2071 KOps/s $\color{#d91a1a}-1.85\%$
test_plain_set_stack_nested_inplace 0.2601ms 0.1798ms 5.5612 KOps/s 5.5086 KOps/s $\color{#35bf28}+0.96\%$
test_items 30.9880μs 2.4678μs 405.2215 KOps/s 408.4114 KOps/s $\color{#d91a1a}-0.78\%$
test_items_nested 0.3370ms 0.2697ms 3.7074 KOps/s 3.6960 KOps/s $\color{#35bf28}+0.31\%$
test_items_nested_locked 0.5790ms 0.2717ms 3.6803 KOps/s 3.6579 KOps/s $\color{#35bf28}+0.61\%$
test_items_nested_leaf 0.2905ms 0.1680ms 5.9535 KOps/s 5.9170 KOps/s $\color{#35bf28}+0.62\%$
test_items_stack_nested 6.2849ms 1.5067ms 663.6997 Ops/s 737.2071 Ops/s $\textbf{\color{#d91a1a}-9.97\%}$
test_items_stack_nested_leaf 1.3970ms 1.2133ms 824.1877 Ops/s 822.7686 Ops/s $\color{#35bf28}+0.17\%$
test_items_stack_nested_locked 1.9186ms 0.9225ms 1.0840 KOps/s 1.1322 KOps/s $\color{#d91a1a}-4.26\%$
test_keys 25.8780μs 3.8667μs 258.6160 KOps/s 253.2375 KOps/s $\color{#35bf28}+2.12\%$
test_keys_nested 1.4817ms 0.1482ms 6.7477 KOps/s 6.6802 KOps/s $\color{#35bf28}+1.01\%$
test_keys_nested_locked 0.2949ms 0.1505ms 6.6452 KOps/s 6.4863 KOps/s $\color{#35bf28}+2.45\%$
test_keys_nested_leaf 0.2649ms 0.1288ms 7.7643 KOps/s 7.5706 KOps/s $\color{#35bf28}+2.56\%$
test_keys_stack_nested 1.6322ms 1.2662ms 789.7480 Ops/s 772.6195 Ops/s $\color{#35bf28}+2.22\%$
test_keys_stack_nested_leaf 1.9934ms 1.2737ms 785.1119 Ops/s 764.8654 Ops/s $\color{#35bf28}+2.65\%$
test_keys_stack_nested_locked 1.3230ms 0.8128ms 1.2304 KOps/s 1.2174 KOps/s $\color{#35bf28}+1.06\%$
test_values 11.0908μs 1.1202μs 892.6681 KOps/s 794.5375 KOps/s $\textbf{\color{#35bf28}+12.35\%}$
test_values_nested 0.1153ms 51.3106μs 19.4891 KOps/s 19.1650 KOps/s $\color{#35bf28}+1.69\%$
test_values_nested_locked 2.4390ms 52.0232μs 19.2222 KOps/s 19.2913 KOps/s $\color{#d91a1a}-0.36\%$
test_values_nested_leaf 97.5120μs 47.0519μs 21.2531 KOps/s 21.2954 KOps/s $\color{#d91a1a}-0.20\%$
test_values_stack_nested 1.2535ms 1.0370ms 964.3237 Ops/s 922.1148 Ops/s $\color{#35bf28}+4.58\%$
test_values_stack_nested_leaf 1.9389ms 1.0334ms 967.7208 Ops/s 948.6606 Ops/s $\color{#35bf28}+2.01\%$
test_values_stack_nested_locked 1.1305ms 0.6121ms 1.6338 KOps/s 1.6170 KOps/s $\color{#35bf28}+1.04\%$
test_membership 20.6090μs 1.3630μs 733.6725 KOps/s 728.3296 KOps/s $\color{#35bf28}+0.73\%$
test_membership_nested 47.6090μs 3.4132μs 292.9768 KOps/s 290.3993 KOps/s $\color{#35bf28}+0.89\%$
test_membership_nested_leaf 38.9830μs 3.4386μs 290.8147 KOps/s 270.5696 KOps/s $\textbf{\color{#35bf28}+7.48\%}$
test_membership_stacked_nested 39.4130μs 11.6169μs 86.0813 KOps/s 82.9280 KOps/s $\color{#35bf28}+3.80\%$
test_membership_stacked_nested_leaf 39.7840μs 11.6305μs 85.9805 KOps/s 82.9069 KOps/s $\color{#35bf28}+3.71\%$
test_membership_nested_last 38.2710μs 6.6055μs 151.3881 KOps/s 149.8462 KOps/s $\color{#35bf28}+1.03\%$
test_membership_nested_leaf_last 42.9100μs 6.6696μs 149.9337 KOps/s 151.8556 KOps/s $\color{#d91a1a}-1.27\%$
test_membership_stacked_nested_last 0.2963ms 0.1786ms 5.5981 KOps/s 5.6812 KOps/s $\color{#d91a1a}-1.46\%$
test_membership_stacked_nested_leaf_last 52.4980μs 13.8512μs 72.1959 KOps/s 71.0455 KOps/s $\color{#35bf28}+1.62\%$
test_nested_getleaf 46.6370μs 10.5853μs 94.4710 KOps/s 88.9857 KOps/s $\textbf{\color{#35bf28}+6.16\%}$
test_nested_get 66.0130μs 10.0677μs 99.3279 KOps/s 97.8670 KOps/s $\color{#35bf28}+1.49\%$
test_stacked_getleaf 0.6031ms 0.4015ms 2.4907 KOps/s 2.4203 KOps/s $\color{#35bf28}+2.91\%$
test_stacked_get 0.5410ms 0.3642ms 2.7459 KOps/s 2.6347 KOps/s $\color{#35bf28}+4.22\%$
test_nested_getitemleaf 50.1330μs 12.0339μs 83.0986 KOps/s 81.3889 KOps/s $\color{#35bf28}+2.10\%$
test_nested_getitem 55.4440μs 11.4779μs 87.1238 KOps/s 84.6358 KOps/s $\color{#35bf28}+2.94\%$
test_stacked_getitemleaf 0.5168ms 0.4048ms 2.4704 KOps/s 2.3925 KOps/s $\color{#35bf28}+3.26\%$
test_stacked_getitem 0.7510ms 0.3701ms 2.7019 KOps/s 2.5690 KOps/s $\textbf{\color{#35bf28}+5.17\%}$
test_lock_nested 3.0679ms 0.3458ms 2.8922 KOps/s 2.8641 KOps/s $\color{#35bf28}+0.98\%$
test_lock_stack_nested 97.6995ms 6.5670ms 152.2770 Ops/s 151.0803 Ops/s $\color{#35bf28}+0.79\%$
test_unlock_nested 83.8775ms 0.4296ms 2.3276 KOps/s 2.8690 KOps/s $\textbf{\color{#d91a1a}-18.87\%}$
test_unlock_stack_nested 99.3266ms 6.6320ms 150.7851 Ops/s 146.8489 Ops/s $\color{#35bf28}+2.68\%$
test_flatten_speed 0.8640ms 0.3620ms 2.7627 KOps/s 2.7181 KOps/s $\color{#35bf28}+1.64\%$
test_unflatten_speed 0.5679ms 0.4561ms 2.1923 KOps/s 2.1609 KOps/s $\color{#35bf28}+1.46\%$
test_common_ops 1.5001ms 0.7026ms 1.4233 KOps/s 1.4300 KOps/s $\color{#d91a1a}-0.47\%$
test_creation 26.1190μs 1.8579μs 538.2334 KOps/s 529.5345 KOps/s $\color{#35bf28}+1.64\%$
test_creation_empty 31.6390μs 10.4882μs 95.3456 KOps/s 101.4632 KOps/s $\textbf{\color{#d91a1a}-6.03\%}$
test_creation_nested_1 37.8710μs 12.9979μs 76.9356 KOps/s 80.1093 KOps/s $\color{#d91a1a}-3.96\%$
test_creation_nested_2 73.1970μs 16.3531μs 61.1507 KOps/s 62.3526 KOps/s $\color{#d91a1a}-1.93\%$
test_clone 49.3410μs 12.9067μs 77.4793 KOps/s 75.6702 KOps/s $\color{#35bf28}+2.39\%$
test_getitem[int] 0.1289ms 11.3177μs 88.3571 KOps/s 87.4979 KOps/s $\color{#35bf28}+0.98\%$
test_getitem[slice_int] 63.5280μs 22.8257μs 43.8104 KOps/s 44.3611 KOps/s $\color{#d91a1a}-1.24\%$
test_getitem[range] 0.1224ms 41.9781μs 23.8219 KOps/s 23.3822 KOps/s $\color{#35bf28}+1.88\%$
test_getitem[tuple] 47.9700μs 18.3755μs 54.4204 KOps/s 52.3221 KOps/s $\color{#35bf28}+4.01\%$
test_getitem[list] 0.2307ms 36.6402μs 27.2924 KOps/s 25.8591 KOps/s $\textbf{\color{#35bf28}+5.54\%}$
test_setitem_dim[int] 80.0700μs 31.9284μs 31.3201 KOps/s 32.3999 KOps/s $\color{#d91a1a}-3.33\%$
test_setitem_dim[slice_int] 99.4350μs 56.9869μs 17.5479 KOps/s 17.7227 KOps/s $\color{#d91a1a}-0.99\%$
test_setitem_dim[range] 0.1514ms 78.0432μs 12.8134 KOps/s 13.2637 KOps/s $\color{#d91a1a}-3.40\%$
test_setitem_dim[tuple] 93.0030μs 45.8694μs 21.8010 KOps/s 21.7483 KOps/s $\color{#35bf28}+0.24\%$
test_setitem 0.1033ms 20.2507μs 49.3810 KOps/s 48.9919 KOps/s $\color{#35bf28}+0.79\%$
test_set 0.1102ms 19.3612μs 51.6498 KOps/s 51.3711 KOps/s $\color{#35bf28}+0.54\%$
test_set_shared 3.5908ms 0.1421ms 7.0368 KOps/s 7.0808 KOps/s $\color{#d91a1a}-0.62\%$
test_update 0.1484ms 22.8634μs 43.7380 KOps/s 45.1178 KOps/s $\color{#d91a1a}-3.06\%$
test_update_nested 0.1590ms 31.0520μs 32.2040 KOps/s 33.5499 KOps/s $\color{#d91a1a}-4.01\%$
test_set_nested 0.1398ms 22.1803μs 45.0851 KOps/s 46.0798 KOps/s $\color{#d91a1a}-2.16\%$
test_set_nested_new 0.1149ms 25.3438μs 39.4574 KOps/s 39.4620 KOps/s $\color{#d91a1a}-0.01\%$
test_select 0.1226ms 38.1730μs 26.1966 KOps/s 25.9601 KOps/s $\color{#35bf28}+0.91\%$
test_select_nested 0.1653ms 58.5570μs 17.0774 KOps/s 17.0340 KOps/s $\color{#35bf28}+0.25\%$
test_exclude_nested 0.2106ms 0.1158ms 8.6344 KOps/s 8.4510 KOps/s $\color{#35bf28}+2.17\%$
test_empty[True] 0.5462ms 0.4056ms 2.4656 KOps/s 2.4317 KOps/s $\color{#35bf28}+1.39\%$
test_empty[False] 11.6116μs 1.0393μs 962.1523 KOps/s 973.0841 KOps/s $\color{#d91a1a}-1.12\%$
test_unbind_speed 3.7376ms 0.2694ms 3.7118 KOps/s 3.8515 KOps/s $\color{#d91a1a}-3.63\%$
test_unbind_speed_stack0 93.1091ms 3.5204ms 284.0589 Ops/s 305.1447 Ops/s $\textbf{\color{#d91a1a}-6.91\%}$
test_unbind_speed_stack1 36.5780μs 2.1177μs 472.2160 KOps/s 496.6142 KOps/s $\color{#d91a1a}-4.91\%$
test_split 85.1643ms 1.6579ms 603.1879 Ops/s 582.1753 Ops/s $\color{#35bf28}+3.61\%$
test_chunk 2.1878ms 1.4459ms 691.6258 Ops/s 609.7770 Ops/s $\textbf{\color{#35bf28}+13.42\%}$
test_creation[device0] 0.2974ms 0.1029ms 9.7136 KOps/s 9.4836 KOps/s $\color{#35bf28}+2.43\%$
test_creation_from_tensor 4.5053ms 83.2829μs 12.0073 KOps/s 11.8546 KOps/s $\color{#35bf28}+1.29\%$
test_add_one[memmap_tensor0] 0.3700ms 5.3855μs 185.6835 KOps/s 186.0810 KOps/s $\color{#d91a1a}-0.21\%$
test_contiguous[memmap_tensor0] 15.2890μs 0.6228μs 1.6055 MOps/s 1.5444 MOps/s $\color{#35bf28}+3.96\%$
test_stack[memmap_tensor0] 0.1151ms 3.6223μs 276.0639 KOps/s 271.4379 KOps/s $\color{#35bf28}+1.70\%$
test_memmaptd_index 1.1092ms 0.2394ms 4.1763 KOps/s 4.0088 KOps/s $\color{#35bf28}+4.18\%$
test_memmaptd_index_astensor 0.6541ms 0.3007ms 3.3259 KOps/s 3.2010 KOps/s $\color{#35bf28}+3.90\%$
test_memmaptd_index_op 85.8339ms 0.6576ms 1.5206 KOps/s 1.6443 KOps/s $\textbf{\color{#d91a1a}-7.52\%}$
test_serialize_model 0.1116s 0.1060s 9.4348 Ops/s 8.4292 Ops/s $\textbf{\color{#35bf28}+11.93\%}$
test_serialize_model_pickle 0.5113s 0.3920s 2.5508 Ops/s 2.5546 Ops/s $\color{#d91a1a}-0.15\%$
test_serialize_weights 0.1118s 0.1068s 9.3665 Ops/s 9.6503 Ops/s $\color{#d91a1a}-2.94\%$
test_serialize_weights_returnearly 0.2119s 0.1372s 7.2888 Ops/s 7.5863 Ops/s $\color{#d91a1a}-3.92\%$
test_serialize_weights_pickle 1.0416s 0.6265s 1.5962 Ops/s 2.3583 Ops/s $\textbf{\color{#d91a1a}-32.32\%}$
test_serialize_weights_filesystem 0.1058s 95.7363ms 10.4454 Ops/s 9.4955 Ops/s $\textbf{\color{#35bf28}+10.00\%}$
test_serialize_model_filesystem 0.1864s 0.1085s 9.2163 Ops/s 10.2039 Ops/s $\textbf{\color{#d91a1a}-9.68\%}$
test_reshape_pytree 0.1133ms 21.7827μs 45.9080 KOps/s 47.4999 KOps/s $\color{#d91a1a}-3.35\%$
test_reshape_td 89.3670μs 32.1794μs 31.0758 KOps/s 32.3196 KOps/s $\color{#d91a1a}-3.85\%$
test_view_pytree 56.9660μs 20.8435μs 47.9766 KOps/s 48.1601 KOps/s $\color{#d91a1a}-0.38\%$
test_view_td 94.7583ms 12.3486μs 80.9808 KOps/s 84.0831 KOps/s $\color{#d91a1a}-3.69\%$
test_unbind_pytree 63.7590μs 24.5907μs 40.6658 KOps/s 40.6062 KOps/s $\color{#35bf28}+0.15\%$
test_unbind_td 0.1331ms 36.6239μs 27.3046 KOps/s 27.4257 KOps/s $\color{#d91a1a}-0.44\%$
test_split_pytree 56.9670μs 23.6513μs 42.2810 KOps/s 41.7419 KOps/s $\color{#35bf28}+1.29\%$
test_split_td 0.4588ms 40.3711μs 24.7702 KOps/s 24.3280 KOps/s $\color{#35bf28}+1.82\%$
test_add_pytree 92.0420μs 29.7307μs 33.6353 KOps/s 32.7800 KOps/s $\color{#35bf28}+2.61\%$
test_add_td 0.1572ms 56.2543μs 17.7764 KOps/s 18.7669 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_distributed 0.2559ms 0.1022ms 9.7849 KOps/s 9.7174 KOps/s $\color{#35bf28}+0.69\%$
test_tdmodule 0.3141ms 22.9189μs 43.6320 KOps/s 44.2526 KOps/s $\color{#d91a1a}-1.40\%$
test_tdmodule_dispatch 0.2137ms 44.4653μs 22.4895 KOps/s 22.9512 KOps/s $\color{#d91a1a}-2.01\%$
test_tdseq 0.1182ms 25.5907μs 39.0767 KOps/s 38.6318 KOps/s $\color{#35bf28}+1.15\%$
test_tdseq_dispatch 0.1377ms 47.7583μs 20.9388 KOps/s 20.7904 KOps/s $\color{#35bf28}+0.71\%$
test_instantiation_functorch 2.1253ms 1.3224ms 756.1819 Ops/s 753.1875 Ops/s $\color{#35bf28}+0.40\%$
test_instantiation_td 1.7469ms 1.0220ms 978.4398 Ops/s 860.3935 Ops/s $\textbf{\color{#35bf28}+13.72\%}$
test_exec_functorch 0.3072ms 0.1578ms 6.3386 KOps/s 6.1310 KOps/s $\color{#35bf28}+3.39\%$
test_exec_functional_call 0.3394ms 0.1478ms 6.7637 KOps/s 6.5615 KOps/s $\color{#35bf28}+3.08\%$
test_exec_td 0.3487ms 0.1465ms 6.8276 KOps/s 6.8035 KOps/s $\color{#35bf28}+0.35\%$
test_exec_td_decorator 0.8739ms 0.1992ms 5.0198 KOps/s 4.9710 KOps/s $\color{#35bf28}+0.98\%$
test_vmap_mlp_speed[True-True] 1.2823ms 0.8748ms 1.1431 KOps/s 1.1086 KOps/s $\color{#35bf28}+3.12\%$
test_vmap_mlp_speed[True-False] 0.7157ms 0.4776ms 2.0937 KOps/s 2.0961 KOps/s $\color{#d91a1a}-0.11\%$
test_vmap_mlp_speed[False-True] 1.0259ms 0.7630ms 1.3107 KOps/s 1.2805 KOps/s $\color{#35bf28}+2.36\%$
test_vmap_mlp_speed[False-False] 0.6822ms 0.3927ms 2.5463 KOps/s 2.5507 KOps/s $\color{#d91a1a}-0.17\%$
test_vmap_mlp_speed_decorator[True-True] 3.2946ms 2.3129ms 432.3495 Ops/s 426.2614 Ops/s $\color{#35bf28}+1.43\%$
test_vmap_mlp_speed_decorator[True-False] 1.2657ms 0.5509ms 1.8152 KOps/s 1.8150 KOps/s $+0.01\%$
test_vmap_mlp_speed_decorator[False-True] 2.4511ms 1.8977ms 526.9422 Ops/s 520.5958 Ops/s $\color{#35bf28}+1.22\%$
test_vmap_mlp_speed_decorator[False-False] 0.9387ms 0.4243ms 2.3567 KOps/s 2.3723 KOps/s $\color{#d91a1a}-0.66\%$
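
The Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces it for three rows copied from the CPU table. The 5% cutoff used to label a row as improved or worsened is an assumption inferred from which rows are bolded; it is not stated anywhere in this thread, and the helper names are hypothetical.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percentage change of the PR's throughput relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


# Assumed cutoff: inferred from the bolded rows, not documented in the thread.
THRESHOLD_PERCENT = 5.0


def classify(change_percent: float) -> str:
    """Label a row the way the summary line seems to count it."""
    if change_percent >= THRESHOLD_PERCENT:
        return "improved"
    if change_percent <= -THRESHOLD_PERCENT:
        return "worsened"
    return "within noise"


# (name, Ops, Ops on Repo HEAD), both in KOps/s, copied from the CPU table.
rows = [
    ("test_plain_set_nested", 58.0118, 60.1854),
    ("test_values", 892.6681, 794.5375),
    ("test_unlock_nested", 2.3276, 2.8690),
]

for name, ops_pr, ops_head in rows:
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under that assumed cutoff, counting the labelled rows in the CPU table gives the 9 improved and 8 worsened benchmarks reported in the summary line. Both units must match before dividing: rows reported in Ops/s or MOps/s need the same unit on both sides.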


github-actions bot commented Feb 6, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 132. Improved: $\large\color{#35bf28}7$. Worsened: $\large\color{#d91a1a}27$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 81.5410μs 13.8421μs 72.2436 KOps/s 72.6614 KOps/s $\color{#d91a1a}-0.57\%$
test_plain_set_stack_nested 0.1475ms 0.1186ms 8.4282 KOps/s 8.3745 KOps/s $\color{#35bf28}+0.64\%$
test_plain_set_nested_inplace 43.0310μs 15.2209μs 65.6992 KOps/s 66.2599 KOps/s $\color{#d91a1a}-0.85\%$
test_plain_set_stack_nested_inplace 0.1800ms 0.1478ms 6.7661 KOps/s 6.7267 KOps/s $\color{#35bf28}+0.59\%$
test_items 26.2500μs 4.7252μs 211.6301 KOps/s 199.7016 KOps/s $\textbf{\color{#35bf28}+5.97\%}$
test_items_nested 0.3882ms 0.3380ms 2.9587 KOps/s 2.9224 KOps/s $\color{#35bf28}+1.24\%$
test_items_nested_locked 0.3685ms 0.3425ms 2.9201 KOps/s 2.8934 KOps/s $\color{#35bf28}+0.92\%$
test_items_nested_leaf 0.2423ms 0.2004ms 4.9900 KOps/s 4.9182 KOps/s $\color{#35bf28}+1.46\%$
test_items_stack_nested 1.4220ms 1.3154ms 760.2228 Ops/s 764.7719 Ops/s $\color{#d91a1a}-0.59\%$
test_items_stack_nested_leaf 1.2591ms 1.1506ms 869.0952 Ops/s 870.3376 Ops/s $\color{#d91a1a}-0.14\%$
test_items_stack_nested_locked 0.9409ms 0.8942ms 1.1183 KOps/s 1.1225 KOps/s $\color{#d91a1a}-0.38\%$
test_keys 23.5000μs 4.5605μs 219.2741 KOps/s 207.2289 KOps/s $\textbf{\color{#35bf28}+5.81\%}$
test_keys_nested 0.5520ms 94.4232μs 10.5906 KOps/s 10.4623 KOps/s $\color{#35bf28}+1.23\%$
test_keys_nested_locked 0.1207ms 98.5078μs 10.1515 KOps/s 10.0133 KOps/s $\color{#35bf28}+1.38\%$
test_keys_nested_leaf 0.1801ms 77.9004μs 12.8369 KOps/s 12.5385 KOps/s $\color{#35bf28}+2.38\%$
test_keys_stack_nested 1.3293ms 1.1473ms 871.5897 Ops/s 867.7975 Ops/s $\color{#35bf28}+0.44\%$
test_keys_stack_nested_leaf 1.2134ms 1.1357ms 880.5226 Ops/s 871.6868 Ops/s $\color{#35bf28}+1.01\%$
test_keys_stack_nested_locked 0.8171ms 0.7185ms 1.3918 KOps/s 1.3846 KOps/s $\color{#35bf28}+0.52\%$
test_values 11.4100μs 1.8963μs 527.3498 KOps/s 529.2929 KOps/s $\color{#d91a1a}-0.37\%$
test_values_nested 77.7010μs 45.1256μs 22.1604 KOps/s 22.1556 KOps/s $\color{#35bf28}+0.02\%$
test_values_nested_locked 71.7410μs 47.1987μs 21.1870 KOps/s 20.9743 KOps/s $\color{#35bf28}+1.01\%$
test_values_nested_leaf 58.3700μs 39.1759μs 25.5259 KOps/s 25.1491 KOps/s $\color{#35bf28}+1.50\%$
test_values_stack_nested 1.0107ms 0.9643ms 1.0370 KOps/s 1.0450 KOps/s $\color{#d91a1a}-0.77\%$
test_values_stack_nested_leaf 1.1098ms 0.9617ms 1.0398 KOps/s 1.0487 KOps/s $\color{#d91a1a}-0.85\%$
test_values_stack_nested_locked 0.8193ms 0.5773ms 1.7321 KOps/s 1.7671 KOps/s $\color{#d91a1a}-1.98\%$
test_membership 25.3400μs 1.0454μs 956.5916 KOps/s 900.6288 KOps/s $\textbf{\color{#35bf28}+6.21\%}$
test_membership_nested 20.9310μs 2.8562μs 350.1214 KOps/s 345.6228 KOps/s $\color{#35bf28}+1.30\%$
test_membership_nested_leaf 22.5900μs 2.8562μs 350.1196 KOps/s 344.7942 KOps/s $\color{#35bf28}+1.54\%$
test_membership_stacked_nested 35.2410μs 11.1748μs 89.4874 KOps/s 88.9068 KOps/s $\color{#35bf28}+0.65\%$
test_membership_stacked_nested_leaf 49.2700μs 11.2689μs 88.7395 KOps/s 88.0927 KOps/s $\color{#35bf28}+0.73\%$
test_membership_nested_last 30.8110μs 5.3300μs 187.6178 KOps/s 184.0657 KOps/s $\color{#35bf28}+1.93\%$
test_membership_nested_leaf_last 26.5110μs 5.3098μs 188.3322 KOps/s 184.1005 KOps/s $\color{#35bf28}+2.30\%$
test_membership_stacked_nested_last 0.2145ms 0.1567ms 6.3808 KOps/s 6.3411 KOps/s $\color{#35bf28}+0.63\%$
test_membership_stacked_nested_leaf_last 43.5800μs 13.1810μs 75.8667 KOps/s 75.1370 KOps/s $\color{#35bf28}+0.97\%$
test_nested_getleaf 34.1800μs 8.3909μs 119.1774 KOps/s 118.3762 KOps/s $\color{#35bf28}+0.68\%$
test_nested_get 22.9610μs 7.9376μs 125.9820 KOps/s 125.3265 KOps/s $\color{#35bf28}+0.52\%$
test_stacked_getleaf 0.3719ms 0.3293ms 3.0366 KOps/s 3.0256 KOps/s $\color{#35bf28}+0.37\%$
test_stacked_get 0.3265ms 0.2969ms 3.3677 KOps/s 3.3392 KOps/s $\color{#35bf28}+0.85\%$
test_nested_getitemleaf 32.8500μs 9.7774μs 102.2764 KOps/s 101.7732 KOps/s $\color{#35bf28}+0.49\%$
test_nested_getitem 27.5010μs 9.3509μs 106.9418 KOps/s 107.4961 KOps/s $\color{#d91a1a}-0.52\%$
test_stacked_getitemleaf 0.3759ms 0.3328ms 3.0044 KOps/s 2.9774 KOps/s $\color{#35bf28}+0.90\%$
test_stacked_getitem 0.3435ms 0.2976ms 3.3601 KOps/s 3.3025 KOps/s $\color{#35bf28}+1.75\%$
test_lock_nested 0.8409ms 0.3502ms 2.8551 KOps/s 2.8251 KOps/s $\color{#35bf28}+1.06\%$
test_lock_stack_nested 84.9891ms 6.2910ms 158.9584 Ops/s 159.4773 Ops/s $\color{#d91a1a}-0.33\%$
test_unlock_nested 78.4958ms 0.4322ms 2.3139 KOps/s 2.8701 KOps/s $\textbf{\color{#d91a1a}-19.38\%}$
test_unlock_stack_nested 89.1001ms 6.4789ms 154.3473 Ops/s 155.2719 Ops/s $\color{#d91a1a}-0.60\%$
test_flatten_speed 0.3550ms 0.2609ms 3.8322 KOps/s 3.7989 KOps/s $\color{#35bf28}+0.88\%$
test_unflatten_speed 0.4537ms 0.3621ms 2.7617 KOps/s 2.7521 KOps/s $\color{#35bf28}+0.35\%$
test_common_ops 1.0807ms 0.6306ms 1.5857 KOps/s 1.6698 KOps/s $\textbf{\color{#d91a1a}-5.04\%}$
test_creation 19.3000μs 1.6169μs 618.4791 KOps/s 640.2807 KOps/s $\color{#d91a1a}-3.41\%$
test_creation_empty 49.0910μs 9.0707μs 110.2455 KOps/s 114.5366 KOps/s $\color{#d91a1a}-3.75\%$
test_creation_nested_1 30.4600μs 10.8322μs 92.3173 KOps/s 96.2482 KOps/s $\color{#d91a1a}-4.08\%$
test_creation_nested_2 28.7400μs 13.2325μs 75.5714 KOps/s 77.1564 KOps/s $\color{#d91a1a}-2.05\%$
test_clone 45.0900μs 14.2912μs 69.9732 KOps/s 74.1147 KOps/s $\textbf{\color{#d91a1a}-5.59\%}$
test_getitem[int] 28.1710μs 10.9054μs 91.6977 KOps/s 91.4133 KOps/s $\color{#35bf28}+0.31\%$
test_getitem[slice_int] 60.4310μs 21.5776μs 46.3443 KOps/s 47.2087 KOps/s $\color{#d91a1a}-1.83\%$
test_getitem[range] 59.0100μs 37.8827μs 26.3973 KOps/s 27.1000 KOps/s $\color{#d91a1a}-2.59\%$
test_getitem[tuple] 43.2810μs 19.0440μs 52.5099 KOps/s 54.1369 KOps/s $\color{#d91a1a}-3.01\%$
test_getitem[list] 0.1681ms 35.9784μs 27.7944 KOps/s 30.0251 KOps/s $\textbf{\color{#d91a1a}-7.43\%}$
test_setitem_dim[int] 46.2310μs 28.6331μs 34.9246 KOps/s 37.3211 KOps/s $\textbf{\color{#d91a1a}-6.42\%}$
test_setitem_dim[slice_int] 76.7010μs 49.8657μs 20.0539 KOps/s 21.3541 KOps/s $\textbf{\color{#d91a1a}-6.09\%}$
test_setitem_dim[range] 86.7610μs 67.1187μs 14.8990 KOps/s 16.1223 KOps/s $\textbf{\color{#d91a1a}-7.59\%}$
test_setitem_dim[tuple] 64.2510μs 44.3745μs 22.5355 KOps/s 24.4347 KOps/s $\textbf{\color{#d91a1a}-7.77\%}$
test_setitem 81.8710μs 20.4068μs 49.0033 KOps/s 54.4345 KOps/s $\textbf{\color{#d91a1a}-9.98\%}$
test_set 69.3210μs 19.6231μs 50.9604 KOps/s 55.5619 KOps/s $\textbf{\color{#d91a1a}-8.28\%}$
test_set_shared 85.9557ms 0.1199ms 8.3420 KOps/s 9.9925 KOps/s $\textbf{\color{#d91a1a}-16.52\%}$
test_update 0.1432ms 22.1625μs 45.1212 KOps/s 48.3463 KOps/s $\textbf{\color{#d91a1a}-6.67\%}$
test_update_nested 0.1463ms 30.4293μs 32.8631 KOps/s 37.0295 KOps/s $\textbf{\color{#d91a1a}-11.25\%}$
test_set_nested 0.1489ms 20.7576μs 48.1752 KOps/s 51.8583 KOps/s $\textbf{\color{#d91a1a}-7.10\%}$
test_set_nested_new 0.1454ms 25.1184μs 39.8114 KOps/s 46.7833 KOps/s $\textbf{\color{#d91a1a}-14.90\%}$
test_select 0.1602ms 37.9235μs 26.3688 KOps/s 28.6289 KOps/s $\textbf{\color{#d91a1a}-7.89\%}$
test_select_nested 82.9810μs 52.6913μs 18.9785 KOps/s 18.5742 KOps/s $\color{#35bf28}+2.18\%$
test_exclude_nested 0.9053ms 0.1156ms 8.6532 KOps/s 8.6533 KOps/s $-0.00\%$
test_empty[True] 0.9988ms 0.3941ms 2.5374 KOps/s 2.5565 KOps/s $\color{#d91a1a}-0.75\%$
test_empty[False] 3.2480μs 0.8713μs 1.1476 MOps/s 1.1735 MOps/s $\color{#d91a1a}-2.20\%$
test_to 74.7410μs 55.3386μs 18.0706 KOps/s 18.7131 KOps/s $\color{#d91a1a}-3.43\%$
test_to_nonblocking 56.9010μs 35.5504μs 28.1291 KOps/s 29.4616 KOps/s $\color{#d91a1a}-4.52\%$
test_unbind_speed 0.2932ms 0.2688ms 3.7197 KOps/s 3.7365 KOps/s $\color{#d91a1a}-0.45\%$
test_unbind_speed_stack0 85.1697ms 3.7659ms 265.5430 Ops/s 287.6001 Ops/s $\textbf{\color{#d91a1a}-7.67\%}$
test_unbind_speed_stack1 9.2467μs 1.6912μs 591.3008 KOps/s 580.5359 KOps/s $\color{#35bf28}+1.85\%$
test_split 2.1170ms 1.5482ms 645.8999 Ops/s 575.4880 Ops/s $\textbf{\color{#35bf28}+12.24\%}$
test_chunk 82.2679ms 1.6721ms 598.0607 Ops/s 595.0855 Ops/s $\color{#35bf28}+0.50\%$
test_creation[device0] 0.1387ms 75.7585μs 13.1998 KOps/s 14.1261 KOps/s $\textbf{\color{#d91a1a}-6.56\%}$
test_creation_from_tensor 0.1432ms 56.8620μs 17.5864 KOps/s 18.7933 KOps/s $\textbf{\color{#d91a1a}-6.42\%}$
test_add_one[memmap_tensor0] 0.1243ms 6.8307μs 146.3969 KOps/s 158.5645 KOps/s $\textbf{\color{#d91a1a}-7.67\%}$
test_contiguous[memmap_tensor0] 13.6100μs 0.6646μs 1.5047 MOps/s 1.5563 MOps/s $\color{#d91a1a}-3.32\%$
test_stack[memmap_tensor0] 29.5000μs 4.5508μs 219.7408 KOps/s 230.1464 KOps/s $\color{#d91a1a}-4.52\%$
test_memmaptd_index 1.0723ms 0.2660ms 3.7593 KOps/s 3.8763 KOps/s $\color{#d91a1a}-3.02\%$
test_memmaptd_index_astensor 0.6825ms 0.3251ms 3.0763 KOps/s 3.1881 KOps/s $\color{#d91a1a}-3.51\%$
test_memmaptd_index_op 1.1631ms 0.6435ms 1.5539 KOps/s 1.6590 KOps/s $\textbf{\color{#d91a1a}-6.33\%}$
test_serialize_model 0.1738s 97.8400ms 10.2208 Ops/s 10.8372 Ops/s $\textbf{\color{#d91a1a}-5.69\%}$
test_serialize_model_pickle 1.3489s 1.2371s 0.8084 Ops/s 0.8090 Ops/s $\color{#d91a1a}-0.07\%$
test_serialize_weights 0.1722s 96.7584ms 10.3350 Ops/s 11.1386 Ops/s $\textbf{\color{#d91a1a}-7.21\%}$
test_serialize_weights_returnearly 0.2838s 83.8561ms 11.9252 Ops/s 16.9424 Ops/s $\textbf{\color{#d91a1a}-29.61\%}$
test_serialize_weights_pickle 1.3460s 1.2455s 0.8029 Ops/s 0.8085 Ops/s $\color{#d91a1a}-0.69\%$
test_reshape_pytree 47.2210μs 26.1940μs 38.1767 KOps/s 40.3706 KOps/s $\textbf{\color{#d91a1a}-5.43\%}$
test_reshape_td 52.4100μs 33.4616μs 29.8850 KOps/s 32.2620 KOps/s $\textbf{\color{#d91a1a}-7.37\%}$
test_view_pytree 52.8400μs 24.9069μs 40.1495 KOps/s 41.0480 KOps/s $\color{#d91a1a}-2.19\%$
test_view_td 0.3938ms 6.7553μs 148.0313 KOps/s 95.6512 KOps/s $\textbf{\color{#35bf28}+54.76\%}$
test_unbind_pytree 69.2410μs 29.6604μs 33.7150 KOps/s 33.8897 KOps/s $\color{#d91a1a}-0.52\%$
test_unbind_td 72.8810μs 40.4626μs 24.7142 KOps/s 24.9513 KOps/s $\color{#d91a1a}-0.95\%$
test_split_pytree 54.5710μs 28.2622μs 35.3830 KOps/s 35.8305 KOps/s $\color{#d91a1a}-1.25\%$
test_split_td 0.3370ms 38.5793μs 25.9206 KOps/s 25.6655 KOps/s $\color{#35bf28}+0.99\%$
test_add_pytree 57.2710μs 35.8722μs 27.8768 KOps/s 28.0798 KOps/s $\color{#d91a1a}-0.72\%$
test_add_td 75.1010μs 51.0476μs 19.5896 KOps/s 20.2299 KOps/s $\color{#d91a1a}-3.17\%$
test_distributed 0.1828ms 69.2687μs 14.4365 KOps/s 10.9743 KOps/s $\textbf{\color{#35bf28}+31.55\%}$
test_tdmodule 33.5310μs 18.3184μs 54.5900 KOps/s 55.7460 KOps/s $\color{#d91a1a}-2.07\%$
test_tdmodule_dispatch 0.2143ms 37.8088μs 26.4489 KOps/s 25.4774 KOps/s $\color{#35bf28}+3.81\%$
test_tdseq 36.5600μs 21.2813μs 46.9897 KOps/s 48.1442 KOps/s $\color{#d91a1a}-2.40\%$
test_tdseq_dispatch 61.1910μs 40.3237μs 24.7993 KOps/s 25.3663 KOps/s $\color{#d91a1a}-2.24\%$
test_instantiation_functorch 1.8122ms 1.6866ms 592.9218 Ops/s 591.1668 Ops/s $\color{#35bf28}+0.30\%$
test_instantiation_td 1.7066ms 1.1678ms 856.2772 Ops/s 858.2159 Ops/s $\color{#d91a1a}-0.23\%$
test_exec_functorch 0.1949ms 0.1568ms 6.3783 KOps/s 6.3775 KOps/s $\color{#35bf28}+0.01\%$
test_exec_functional_call 0.2144ms 0.1522ms 6.5687 KOps/s 6.5731 KOps/s $\color{#d91a1a}-0.07\%$
test_exec_td 0.1741ms 0.1436ms 6.9638 KOps/s 6.9852 KOps/s $\color{#d91a1a}-0.31\%$
test_exec_td_decorator 0.8888ms 0.1990ms 5.0251 KOps/s 5.0525 KOps/s $\color{#d91a1a}-0.54\%$
test_vmap_mlp_speed[True-True] 1.4114ms 1.0145ms 985.6827 Ops/s 990.5163 Ops/s $\color{#d91a1a}-0.49\%$
test_vmap_mlp_speed[True-False] 0.6456ms 0.5796ms 1.7252 KOps/s 1.7204 KOps/s $\color{#35bf28}+0.28\%$
test_vmap_mlp_speed[False-True] 0.9928ms 0.9245ms 1.0817 KOps/s 1.0792 KOps/s $\color{#35bf28}+0.22\%$
test_vmap_mlp_speed[False-False] 0.5804ms 0.5100ms 1.9609 KOps/s 1.9570 KOps/s $\color{#35bf28}+0.20\%$
test_vmap_mlp_speed_decorator[True-True] 2.8511ms 2.3153ms 431.9090 Ops/s 432.1185 Ops/s $\color{#d91a1a}-0.05\%$
test_vmap_mlp_speed_decorator[True-False] 1.0421ms 0.6414ms 1.5590 KOps/s 1.5539 KOps/s $\color{#35bf28}+0.33\%$
test_vmap_mlp_speed_decorator[False-True] 2.3311ms 1.9371ms 516.2289 Ops/s 520.2520 Ops/s $\color{#d91a1a}-0.77\%$
test_vmap_mlp_speed_decorator[False-False] 0.9648ms 0.5388ms 1.8559 KOps/s 1.8441 KOps/s $\color{#35bf28}+0.64\%$
test_vmap_transformer_speed[True-True] 12.0160ms 11.9269ms 83.8439 Ops/s 84.0433 Ops/s $\color{#d91a1a}-0.24\%$
test_vmap_transformer_speed[True-False] 7.8466ms 7.7920ms 128.3362 Ops/s 129.3136 Ops/s $\color{#d91a1a}-0.76\%$
test_vmap_transformer_speed[False-True] 11.8879ms 11.7740ms 84.9330 Ops/s 85.2884 Ops/s $\color{#d91a1a}-0.42\%$
test_vmap_transformer_speed[False-False] 8.1420ms 7.7120ms 129.6677 Ops/s 130.6693 Ops/s $\color{#d91a1a}-0.77\%$
test_vmap_transformer_speed_decorator[True-True] 72.5910ms 71.9424ms 13.9000 Ops/s 12.5026 Ops/s $\textbf{\color{#35bf28}+11.18\%}$
test_vmap_transformer_speed_decorator[True-False] 20.3618ms 18.9072ms 52.8900 Ops/s 52.9914 Ops/s $\color{#d91a1a}-0.19\%$
test_vmap_transformer_speed_decorator[False-True] 0.1993s 73.2004ms 13.6611 Ops/s 15.4318 Ops/s $\textbf{\color{#d91a1a}-11.47\%}$
test_vmap_transformer_speed_decorator[False-False] 20.2860ms 18.5940ms 53.7809 Ops/s 53.9525 Ops/s $\color{#d91a1a}-0.32\%$

@vmoens added the enhancement and Refactor labels Feb 6, 2024
@vmoens merged commit 5ede453 into main Feb 6, 2024
47 of 48 checks passed
@vmoens deleted the cleanup-deprec branch February 6, 2024 17:44