[BugFix] Fix deepcopy of TensorDictParams #580

Merged: 1 commit, Nov 27, 2023

@vmoens (Contributor) commented Nov 27, 2023

No description provided.

@facebook-github-bot added the CLA Signed label Nov 27, 2023
@vmoens merged commit 1a9aca2 into main Nov 27, 2023
7 checks passed
@vmoens added the bug label Nov 27, 2023
@vmoens deleted the fix_deepcopy_params branch November 27, 2023 10:28

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 113. Improved: $\large\color{#35bf28}2$. Worsened: $\large\color{#d91a1a}1$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 36.0980μs 15.5915μs 64.1375 KOps/s 64.6313 KOps/s $\color{#d91a1a}-0.76\%$
test_plain_set_stack_nested 0.1929ms 0.1401ms 7.1365 KOps/s 7.1208 KOps/s $\color{#35bf28}+0.22\%$
test_plain_set_nested_inplace 53.9510μs 18.7993μs 53.1933 KOps/s 53.6930 KOps/s $\color{#d91a1a}-0.93\%$
test_plain_set_stack_nested_inplace 0.2366ms 0.1705ms 5.8660 KOps/s 5.8903 KOps/s $\color{#d91a1a}-0.41\%$
test_items 37.6000μs 2.4151μs 414.0699 KOps/s 416.4833 KOps/s $\color{#d91a1a}-0.58\%$
test_items_nested 1.4954ms 0.2735ms 3.6564 KOps/s 3.6720 KOps/s $\color{#d91a1a}-0.42\%$
test_items_nested_locked 0.9440ms 0.2661ms 3.7582 KOps/s 3.7062 KOps/s $\color{#35bf28}+1.41\%$
test_items_nested_leaf 0.2281ms 0.1624ms 6.1593 KOps/s 5.9949 KOps/s $\color{#35bf28}+2.74\%$
test_items_stack_nested 2.2696ms 1.4981ms 667.5093 Ops/s 670.8292 Ops/s $\color{#d91a1a}-0.49\%$
test_items_stack_nested_leaf 1.4797ms 1.3630ms 733.6929 Ops/s 737.9298 Ops/s $\color{#d91a1a}-0.57\%$
test_items_stack_nested_locked 1.9855ms 0.7675ms 1.3030 KOps/s 1.2977 KOps/s $\color{#35bf28}+0.41\%$
test_keys 39.0630μs 3.8794μs 257.7696 KOps/s 252.2415 KOps/s $\color{#35bf28}+2.19\%$
test_keys_nested 0.5015ms 0.1409ms 7.0990 KOps/s 6.7010 KOps/s $\textbf{\color{#35bf28}+5.94\%}$
test_keys_nested_locked 0.2812ms 0.1397ms 7.1596 KOps/s 7.0978 KOps/s $\color{#35bf28}+0.87\%$
test_keys_nested_leaf 0.6683ms 0.1398ms 7.1535 KOps/s 7.0588 KOps/s $\color{#35bf28}+1.34\%$
test_keys_stack_nested 2.3177ms 1.4287ms 699.9165 Ops/s 708.7003 Ops/s $\color{#d91a1a}-1.24\%$
test_keys_stack_nested_leaf 2.2287ms 1.4182ms 705.1138 Ops/s 703.0662 Ops/s $\color{#35bf28}+0.29\%$
test_keys_stack_nested_locked 1.1046ms 0.6757ms 1.4800 KOps/s 1.4566 KOps/s $\color{#35bf28}+1.60\%$
test_values 11.8620μs 1.1632μs 859.6643 KOps/s 861.2057 KOps/s $\color{#d91a1a}-0.18\%$
test_values_nested 92.4730μs 49.5839μs 20.1678 KOps/s 20.3128 KOps/s $\color{#d91a1a}-0.71\%$
test_values_nested_locked 0.1247ms 49.9865μs 20.0054 KOps/s 20.3913 KOps/s $\color{#d91a1a}-1.89\%$
test_values_nested_leaf 56.4360μs 44.3504μs 22.5477 KOps/s 22.6784 KOps/s $\color{#d91a1a}-0.58\%$
test_values_stack_nested 1.4440ms 1.2112ms 825.6191 Ops/s 833.0815 Ops/s $\color{#d91a1a}-0.90\%$
test_values_stack_nested_leaf 1.7850ms 1.2027ms 831.4517 Ops/s 833.4488 Ops/s $\color{#d91a1a}-0.24\%$
test_values_stack_nested_locked 0.7080ms 0.5104ms 1.9594 KOps/s 1.9449 KOps/s $\color{#35bf28}+0.75\%$
test_membership 11.0210μs 1.3927μs 718.0239 KOps/s 744.8756 KOps/s $\color{#d91a1a}-3.60\%$
test_membership_nested 37.0390μs 2.8349μs 352.7491 KOps/s 359.2262 KOps/s $\color{#d91a1a}-1.80\%$
test_membership_nested_leaf 21.5700μs 2.8305μs 353.2966 KOps/s 347.9091 KOps/s $\color{#35bf28}+1.55\%$
test_membership_stacked_nested 56.4660μs 11.8968μs 84.0565 KOps/s 85.6936 KOps/s $\color{#d91a1a}-1.91\%$
test_membership_stacked_nested_leaf 36.8190μs 11.9354μs 83.7841 KOps/s 86.6811 KOps/s $\color{#d91a1a}-3.34\%$
test_membership_nested_last 47.2880μs 6.0131μs 166.3039 KOps/s 171.1005 KOps/s $\color{#d91a1a}-2.80\%$
test_membership_nested_leaf_last 24.6660μs 6.0903μs 164.1947 KOps/s 167.8239 KOps/s $\color{#d91a1a}-2.16\%$
test_membership_stacked_nested_last 0.2734ms 0.1667ms 5.9971 KOps/s 5.9058 KOps/s $\color{#35bf28}+1.55\%$
test_membership_stacked_nested_leaf_last 62.0050μs 14.0682μs 71.0825 KOps/s 71.8245 KOps/s $\color{#d91a1a}-1.03\%$
test_nested_getleaf 44.3630μs 10.5898μs 94.4303 KOps/s 95.1432 KOps/s $\color{#d91a1a}-0.75\%$
test_nested_get 52.1180μs 10.1077μs 98.9348 KOps/s 99.1479 KOps/s $\color{#d91a1a}-0.21\%$
test_stacked_getleaf 1.0167ms 0.6748ms 1.4819 KOps/s 1.5322 KOps/s $\color{#d91a1a}-3.29\%$
test_stacked_get 1.1084ms 0.6279ms 1.5926 KOps/s 1.6094 KOps/s $\color{#d91a1a}-1.04\%$
test_nested_getitemleaf 0.2504ms 10.7969μs 92.6189 KOps/s 94.6107 KOps/s $\color{#d91a1a}-2.11\%$
test_nested_getitem 26.6990μs 10.0806μs 99.2003 KOps/s 99.0332 KOps/s $\color{#35bf28}+0.17\%$
test_stacked_getitemleaf 0.8300ms 0.6519ms 1.5339 KOps/s 1.5216 KOps/s $\color{#35bf28}+0.81\%$
test_stacked_getitem 1.0508ms 0.6219ms 1.6081 KOps/s 1.5982 KOps/s $\color{#35bf28}+0.62\%$
test_lock_nested 2.7229ms 0.5606ms 1.7838 KOps/s 1.7555 KOps/s $\color{#35bf28}+1.62\%$
test_lock_stack_nested 9.4970ms 5.0417ms 198.3474 Ops/s 193.9954 Ops/s $\color{#35bf28}+2.24\%$
test_unlock_nested 68.2797ms 0.5100ms 1.9609 KOps/s 2.2424 KOps/s $\textbf{\color{#d91a1a}-12.55\%}$
test_unlock_stack_nested 68.0591ms 6.6064ms 151.3680 Ops/s 149.5173 Ops/s $\color{#35bf28}+1.24\%$
test_flatten_speed 0.3300ms 0.2677ms 3.7350 KOps/s 3.7453 KOps/s $\color{#d91a1a}-0.27\%$
test_unflatten_speed 0.5498ms 0.4663ms 2.1445 KOps/s 2.1933 KOps/s $\color{#d91a1a}-2.23\%$
test_common_ops 5.7535ms 0.6734ms 1.4850 KOps/s 1.5140 KOps/s $\color{#d91a1a}-1.92\%$
test_creation 16.7110μs 2.4704μs 404.7910 KOps/s 403.7569 KOps/s $\color{#35bf28}+0.26\%$
test_creation_empty 51.8460μs 8.0822μs 123.7294 KOps/s 123.2130 KOps/s $\color{#35bf28}+0.42\%$
test_creation_nested_1 62.0260μs 11.4482μs 87.3503 KOps/s 87.4336 KOps/s $\color{#d91a1a}-0.10\%$
test_creation_nested_2 38.9830μs 14.8176μs 67.4874 KOps/s 66.4795 KOps/s $\color{#35bf28}+1.52\%$
test_clone 82.7940μs 13.4454μs 74.3750 KOps/s 74.9428 KOps/s $\color{#d91a1a}-0.76\%$
test_getitem[int] 38.7720μs 13.4294μs 74.4635 KOps/s 74.6993 KOps/s $\color{#d91a1a}-0.32\%$
test_getitem[slice_int] 0.1244ms 24.9459μs 40.0868 KOps/s 39.1822 KOps/s $\color{#35bf28}+2.31\%$
test_getitem[range] 93.1950μs 42.9139μs 23.3025 KOps/s 22.4710 KOps/s $\color{#35bf28}+3.70\%$
test_getitem[tuple] 50.2340μs 20.6015μs 48.5401 KOps/s 48.2954 KOps/s $\color{#35bf28}+0.51\%$
test_getitem[list] 0.1768ms 38.3795μs 26.0556 KOps/s 25.2767 KOps/s $\color{#35bf28}+3.08\%$
test_setitem_dim[int] 54.4520μs 27.9547μs 35.7722 KOps/s 36.6144 KOps/s $\color{#d91a1a}-2.30\%$
test_setitem_dim[slice_int] 89.9680μs 51.3281μs 19.4825 KOps/s 19.7898 KOps/s $\color{#d91a1a}-1.55\%$
test_setitem_dim[range] 0.1190ms 70.5560μs 14.1731 KOps/s 14.2609 KOps/s $\color{#d91a1a}-0.62\%$
test_setitem_dim[tuple] 72.4350μs 40.7141μs 24.5615 KOps/s 24.1155 KOps/s $\color{#35bf28}+1.85\%$
test_setitem 0.1020ms 17.8686μs 55.9641 KOps/s 54.4636 KOps/s $\color{#35bf28}+2.75\%$
test_set 81.3720μs 17.3850μs 57.5209 KOps/s 57.0617 KOps/s $\color{#35bf28}+0.80\%$
test_set_shared 1.8353ms 0.1422ms 7.0326 KOps/s 7.0822 KOps/s $\color{#d91a1a}-0.70\%$
test_update 89.2870μs 18.7559μs 53.3167 KOps/s 53.0826 KOps/s $\color{#35bf28}+0.44\%$
test_update_nested 84.7380μs 26.1645μs 38.2197 KOps/s 37.0279 KOps/s $\color{#35bf28}+3.22\%$
test_set_nested 82.4640μs 19.2768μs 51.8758 KOps/s 50.2011 KOps/s $\color{#35bf28}+3.34\%$
test_set_nested_new 0.1567ms 24.8316μs 40.2712 KOps/s 40.2886 KOps/s $\color{#d91a1a}-0.04\%$
test_select 0.1100ms 49.6145μs 20.1554 KOps/s 19.6779 KOps/s $\color{#35bf28}+2.43\%$
test_unbind_speed 0.4498ms 0.3740ms 2.6739 KOps/s 2.6780 KOps/s $\color{#d91a1a}-0.15\%$
test_unbind_speed_stack0 67.0823ms 4.4858ms 222.9233 Ops/s 216.3860 Ops/s $\color{#35bf28}+3.02\%$
test_unbind_speed_stack1 2.6515μs 0.6278μs 1.5928 MOps/s 1.5542 MOps/s $\color{#35bf28}+2.48\%$
test_split 60.6789ms 1.7731ms 563.9796 Ops/s 567.1761 Ops/s $\color{#d91a1a}-0.56\%$
test_chunk 52.7587ms 1.7301ms 577.9889 Ops/s 564.5423 Ops/s $\color{#35bf28}+2.38\%$
test_creation[device0] 2.4433ms 0.3011ms 3.3217 KOps/s 3.4297 KOps/s $\color{#d91a1a}-3.15\%$
test_creation_from_tensor 0.6591ms 0.3321ms 3.0114 KOps/s 3.0578 KOps/s $\color{#d91a1a}-1.52\%$
test_add_one[memmap_tensor0] 69.7800μs 24.9391μs 40.0977 KOps/s 30.4387 KOps/s $\textbf{\color{#35bf28}+31.73\%}$
test_contiguous[memmap_tensor0] 45.0340μs 5.6796μs 176.0693 KOps/s 170.2983 KOps/s $\color{#35bf28}+3.39\%$
test_stack[memmap_tensor0] 64.8610μs 19.0352μs 52.5343 KOps/s 52.8563 KOps/s $\color{#d91a1a}-0.61\%$
test_memmaptd_index 0.7431ms 0.4027ms 2.4833 KOps/s 2.5226 KOps/s $\color{#d91a1a}-1.56\%$
test_memmaptd_index_astensor 0.5451ms 0.4598ms 2.1747 KOps/s 2.1550 KOps/s $\color{#35bf28}+0.91\%$
test_memmaptd_index_op 0.7841ms 0.6921ms 1.4448 KOps/s 1.4332 KOps/s $\color{#35bf28}+0.81\%$
test_reshape_pytree 61.1140μs 23.1317μs 43.2307 KOps/s 42.9600 KOps/s $\color{#35bf28}+0.63\%$
test_reshape_td 67.3960μs 31.4157μs 31.8312 KOps/s 30.9041 KOps/s $\color{#35bf28}+3.00\%$
test_view_pytree 68.4280μs 23.1760μs 43.1481 KOps/s 43.1958 KOps/s $\color{#d91a1a}-0.11\%$
test_view_td 38.7320μs 4.8122μs 207.8068 KOps/s 202.2294 KOps/s $\color{#35bf28}+2.76\%$
test_unbind_pytree 57.2370μs 26.2960μs 38.0287 KOps/s 37.8948 KOps/s $\color{#35bf28}+0.35\%$
test_unbind_td 0.1480ms 59.7357μs 16.7404 KOps/s 16.6764 KOps/s $\color{#35bf28}+0.38\%$
test_split_pytree 81.2320μs 26.3218μs 37.9913 KOps/s 37.5274 KOps/s $\color{#35bf28}+1.24\%$
test_split_td 0.1095ms 46.3339μs 21.5825 KOps/s 21.2264 KOps/s $\color{#35bf28}+1.68\%$
test_add_pytree 86.6920μs 31.3869μs 31.8604 KOps/s 31.5560 KOps/s $\color{#35bf28}+0.96\%$
test_add_td 99.8160μs 43.3264μs 23.0806 KOps/s 22.8799 KOps/s $\color{#35bf28}+0.88\%$
test_distributed 25.5380μs 5.9511μs 168.0370 KOps/s 162.9832 KOps/s $\color{#35bf28}+3.10\%$
test_tdmodule 0.1191ms 20.5407μs 48.6838 KOps/s 47.5645 KOps/s $\color{#35bf28}+2.35\%$
test_tdmodule_dispatch 0.1713ms 38.8559μs 25.7361 KOps/s 25.9470 KOps/s $\color{#d91a1a}-0.81\%$
test_tdseq 0.1177ms 23.7731μs 42.0644 KOps/s 41.1950 KOps/s $\color{#35bf28}+2.11\%$
test_tdseq_dispatch 0.1303ms 41.8552μs 23.8919 KOps/s 23.1482 KOps/s $\color{#35bf28}+3.21\%$
test_instantiation_functorch 1.5247ms 1.2898ms 775.3430 Ops/s 763.2196 Ops/s $\color{#35bf28}+1.59\%$
test_instantiation_td 1.5837ms 1.0190ms 981.3726 Ops/s 976.4524 Ops/s $\color{#35bf28}+0.50\%$
test_exec_functorch 0.2403ms 0.1608ms 6.2208 KOps/s 6.4075 KOps/s $\color{#d91a1a}-2.91\%$
test_exec_functional_call 0.2708ms 0.1487ms 6.7251 KOps/s 6.9331 KOps/s $\color{#d91a1a}-3.00\%$
test_exec_td 0.2327ms 0.1447ms 6.9116 KOps/s 7.1167 KOps/s $\color{#d91a1a}-2.88\%$
test_exec_td_decorator 0.9597ms 0.1778ms 5.6232 KOps/s 5.7453 KOps/s $\color{#d91a1a}-2.13\%$
test_vmap_mlp_speed[True-True] 1.1588ms 0.8724ms 1.1463 KOps/s 1.1212 KOps/s $\color{#35bf28}+2.24\%$
test_vmap_mlp_speed[True-False] 0.6696ms 0.4555ms 2.1953 KOps/s 2.1117 KOps/s $\color{#35bf28}+3.96\%$
test_vmap_mlp_speed[False-True] 1.1363ms 0.7587ms 1.3181 KOps/s 1.2872 KOps/s $\color{#35bf28}+2.39\%$
test_vmap_mlp_speed[False-False] 0.4690ms 0.3767ms 2.6548 KOps/s 2.6136 KOps/s $\color{#35bf28}+1.58\%$
test_vmap_mlp_speed_decorator[True-True] 2.6925ms 1.7635ms 567.0620 Ops/s 569.8970 Ops/s $\color{#d91a1a}-0.50\%$
test_vmap_mlp_speed_decorator[True-False] 0.9507ms 0.5054ms 1.9787 KOps/s 1.9219 KOps/s $\color{#35bf28}+2.96\%$
test_vmap_mlp_speed_decorator[False-True] 1.9440ms 1.4753ms 677.8128 Ops/s 677.6770 Ops/s $\color{#35bf28}+0.02\%$
test_vmap_mlp_speed_decorator[False-False] 0.7768ms 0.3919ms 2.5517 KOps/s 2.4893 KOps/s $\color{#35bf28}+2.51\%$
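
The Ops and Change columns above can be reproduced from the other columns: Ops looks like the reciprocal of Mean, and Change looks like the relative difference between Ops and Ops on Repo HEAD. The sketch below checks that reading against a few rows copied from the table. The function names and the 5% significance threshold are assumptions made for illustration. The threshold is inferred from which rows are bolded and from the "Improved"/"Worsened" counts in the summary line, not taken from the benchmark tooling itself.

```python
def ops_per_second(mean_seconds: float) -> float:
    """Throughput implied by a mean runtime (assumed: Ops = 1 / Mean)."""
    return 1.0 / mean_seconds


def percent_change(ops_pr: float, ops_head: float) -> float:
    """Relative change of the PR's throughput versus repo HEAD, in percent."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bolded rows."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) in KOps/s, copied from the CPU table above.
rows = {
    "test_plain_set_nested": (64.1375, 64.6313),
    "test_keys_nested": (7.0990, 6.7010),
    "test_unlock_nested": (1.9609, 2.2424),
    "test_add_one[memmap_tensor0]": (40.0977, 30.4387),
}

for name, (ops_pr, ops_head) in rows.items():
    change = percent_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, a large Max with a small Mean (as in `test_unlock_nested`) points to a few slow outlier runs, so a single bolded regression is worth re-running before attributing it to the change.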

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 127. Improved: $\large\color{#35bf28}4$. Worsened: $\large\color{#d91a1a}3$.

Detailed results (columns: Name, Max, Mean, Ops, Ops on Repo HEAD, Change):
test_plain_set_nested 0.5347ms 12.9203μs 77.3977 KOps/s 78.5866 KOps/s $\color{#d91a1a}-1.51\%$
test_plain_set_stack_nested 0.1655ms 0.1154ms 8.6639 KOps/s 8.7026 KOps/s $\color{#d91a1a}-0.44\%$
test_plain_set_nested_inplace 74.6550μs 15.4134μs 64.8788 KOps/s 64.0814 KOps/s $\color{#35bf28}+1.24\%$
test_plain_set_stack_nested_inplace 0.1896ms 0.1421ms 7.0353 KOps/s 7.1188 KOps/s $\color{#d91a1a}-1.17\%$
test_items 28.9320μs 4.7401μs 210.9668 KOps/s 209.8114 KOps/s $\color{#35bf28}+0.55\%$
test_items_nested 0.3937ms 0.3395ms 2.9455 KOps/s 2.9747 KOps/s $\color{#d91a1a}-0.98\%$
test_items_nested_locked 0.3963ms 0.3418ms 2.9255 KOps/s 2.9339 KOps/s $\color{#d91a1a}-0.29\%$
test_items_nested_leaf 0.2472ms 0.2000ms 4.9993 KOps/s 5.0181 KOps/s $\color{#d91a1a}-0.37\%$
test_items_stack_nested 1.5388ms 1.4771ms 677.0015 Ops/s 668.3283 Ops/s $\color{#35bf28}+1.30\%$
test_items_stack_nested_leaf 1.4009ms 1.2997ms 769.3988 Ops/s 752.1567 Ops/s $\color{#35bf28}+2.29\%$
test_items_stack_nested_locked 0.8881ms 0.8246ms 1.2128 KOps/s 1.2022 KOps/s $\color{#35bf28}+0.88\%$
test_keys 22.4210μs 4.5863μs 218.0417 KOps/s 218.2793 KOps/s $\color{#d91a1a}-0.11\%$
test_keys_nested 3.3192ms 91.1289μs 10.9735 KOps/s 11.0195 KOps/s $\color{#d91a1a}-0.42\%$
test_keys_nested_locked 0.1367ms 90.9048μs 11.0005 KOps/s 10.9563 KOps/s $\color{#35bf28}+0.40\%$
test_keys_nested_leaf 41.1734ms 86.7956μs 11.5213 KOps/s 12.0756 KOps/s $\color{#d91a1a}-4.59\%$
test_keys_stack_nested 1.3729ms 1.2824ms 779.7575 Ops/s 762.4751 Ops/s $\color{#35bf28}+2.27\%$
test_keys_stack_nested_leaf 1.3775ms 1.2676ms 788.8670 Ops/s 781.8785 Ops/s $\color{#35bf28}+0.89\%$
test_keys_stack_nested_locked 0.7005ms 0.6191ms 1.6152 KOps/s 1.5982 KOps/s $\color{#35bf28}+1.06\%$
test_values 8.9770μs 1.9131μs 522.7074 KOps/s 528.3974 KOps/s $\color{#d91a1a}-1.08\%$
test_values_nested 70.6150μs 42.8299μs 23.3482 KOps/s 23.1715 KOps/s $\color{#35bf28}+0.76\%$
test_values_nested_locked 71.4440μs 45.1750μs 22.1361 KOps/s 22.1252 KOps/s $\color{#35bf28}+0.05\%$
test_values_nested_leaf 66.9740μs 37.2132μs 26.8722 KOps/s 26.5549 KOps/s $\color{#35bf28}+1.19\%$
test_values_stack_nested 1.3021ms 1.1428ms 875.0611 Ops/s 872.3161 Ops/s $\color{#35bf28}+0.31\%$
test_values_stack_nested_leaf 1.1879ms 1.1206ms 892.3681 Ops/s 876.2039 Ops/s $\color{#35bf28}+1.84\%$
test_values_stack_nested_locked 0.5712ms 0.4968ms 2.0128 KOps/s 1.9923 KOps/s $\color{#35bf28}+1.03\%$
test_membership 5.8384μs 0.9563μs 1.0457 MOps/s 1.0732 MOps/s $\color{#d91a1a}-2.56\%$
test_membership_nested 14.2405μs 2.1780μs 459.1369 KOps/s 481.4444 KOps/s $\color{#d91a1a}-4.63\%$
test_membership_nested_leaf 11.7360μs 2.1378μs 467.7664 KOps/s 487.4223 KOps/s $\color{#d91a1a}-4.03\%$
test_membership_stacked_nested 32.9920μs 11.0446μs 90.5417 KOps/s 93.4440 KOps/s $\color{#d91a1a}-3.11\%$
test_membership_stacked_nested_leaf 45.3930μs 10.9853μs 91.0309 KOps/s 93.4771 KOps/s $\color{#d91a1a}-2.62\%$
test_membership_nested_last 28.5620μs 4.6907μs 213.1875 KOps/s 216.2233 KOps/s $\color{#d91a1a}-1.40\%$
test_membership_nested_leaf_last 20.0320μs 4.6308μs 215.9458 KOps/s 218.4463 KOps/s $\color{#d91a1a}-1.14\%$
test_membership_stacked_nested_last 0.1681ms 0.1350ms 7.4050 KOps/s 7.4465 KOps/s $\color{#d91a1a}-0.56\%$
test_membership_stacked_nested_leaf_last 34.1520μs 12.7809μs 78.2418 KOps/s 79.2952 KOps/s $\color{#d91a1a}-1.33\%$
test_nested_getleaf 31.2220μs 8.4948μs 117.7194 KOps/s 118.6017 KOps/s $\color{#d91a1a}-0.74\%$
test_nested_get 29.2920μs 7.9898μs 125.1592 KOps/s 125.5309 KOps/s $\color{#d91a1a}-0.30\%$
test_stacked_getleaf 0.6304ms 0.5563ms 1.7976 KOps/s 1.7879 KOps/s $\color{#35bf28}+0.54\%$
test_stacked_get 0.6143ms 0.5393ms 1.8542 KOps/s 1.8639 KOps/s $\color{#d91a1a}-0.52\%$
test_nested_getitemleaf 28.8210μs 8.5519μs 116.9336 KOps/s 118.0729 KOps/s $\color{#d91a1a}-0.96\%$
test_nested_getitem 26.9020μs 8.0625μs 124.0306 KOps/s 124.6860 KOps/s $\color{#d91a1a}-0.53\%$
test_stacked_getitemleaf 0.6234ms 0.5694ms 1.7563 KOps/s 1.8011 KOps/s $\color{#d91a1a}-2.49\%$
test_stacked_getitem 0.6074ms 0.5392ms 1.8546 KOps/s 1.8798 KOps/s $\color{#d91a1a}-1.34\%$
test_lock_nested 3.3002ms 0.5465ms 1.8298 KOps/s 1.7903 KOps/s $\color{#35bf28}+2.21\%$
test_lock_stack_nested 80.1772ms 7.1115ms 140.6179 Ops/s 140.0467 Ops/s $\color{#35bf28}+0.41\%$
test_unlock_nested 2.3465ms 0.4287ms 2.3327 KOps/s 2.3618 KOps/s $\color{#d91a1a}-1.23\%$
test_unlock_stack_nested 65.9513ms 6.1574ms 162.4056 Ops/s 162.3168 Ops/s $\color{#35bf28}+0.05\%$
test_flatten_speed 0.2390ms 0.1882ms 5.3130 KOps/s 5.3376 KOps/s $\color{#d91a1a}-0.46\%$
test_unflatten_speed 0.4381ms 0.3686ms 2.7133 KOps/s 2.7570 KOps/s $\color{#d91a1a}-1.59\%$
test_common_ops 1.0859ms 0.5915ms 1.6907 KOps/s 1.6897 KOps/s $\color{#35bf28}+0.05\%$
test_creation 32.7820μs 2.0666μs 483.8944 KOps/s 483.0486 KOps/s $\color{#35bf28}+0.18\%$
test_creation_empty 23.0210μs 7.0775μs 141.2929 KOps/s 142.0325 KOps/s $\color{#d91a1a}-0.52\%$
test_creation_nested_1 34.2520μs 9.3638μs 106.7941 KOps/s 106.7297 KOps/s $\color{#35bf28}+0.06\%$
test_creation_nested_2 26.7220μs 12.2891μs 81.3731 KOps/s 83.1280 KOps/s $\color{#d91a1a}-2.11\%$
test_clone 86.4950μs 13.5169μs 73.9817 KOps/s 71.9414 KOps/s $\color{#35bf28}+2.84\%$
test_getitem[int] 29.4710μs 12.1498μs 82.3057 KOps/s 82.3698 KOps/s $\color{#d91a1a}-0.08\%$
test_getitem[slice_int] 51.1230μs 23.6391μs 42.3028 KOps/s 42.3830 KOps/s $\color{#d91a1a}-0.19\%$
test_getitem[range] 55.8530μs 37.3379μs 26.7824 KOps/s 27.1113 KOps/s $\color{#d91a1a}-1.21\%$
test_getitem[tuple] 40.2620μs 20.0182μs 49.9545 KOps/s 50.3084 KOps/s $\color{#d91a1a}-0.70\%$
test_getitem[list] 0.2541ms 34.3596μs 29.1040 KOps/s 29.7113 KOps/s $\color{#d91a1a}-2.04\%$
test_setitem_dim[int] 41.5630μs 25.5316μs 39.1671 KOps/s 39.6052 KOps/s $\color{#d91a1a}-1.11\%$
test_setitem_dim[slice_int] 65.7840μs 45.8228μs 21.8232 KOps/s 21.7200 KOps/s $\color{#35bf28}+0.47\%$
test_setitem_dim[range] 82.8850μs 60.2997μs 16.5838 KOps/s 16.9012 KOps/s $\color{#d91a1a}-1.88\%$
test_setitem_dim[tuple] 63.3540μs 39.0586μs 25.6025 KOps/s 25.7123 KOps/s $\color{#d91a1a}-0.43\%$
test_setitem 77.9150μs 17.7770μs 56.2523 KOps/s 57.8765 KOps/s $\color{#d91a1a}-2.81\%$
test_set 84.0850μs 17.0639μs 58.6033 KOps/s 58.6960 KOps/s $\color{#d91a1a}-0.16\%$
test_set_shared 2.9097ms 0.1026ms 9.7486 KOps/s 8.8428 KOps/s $\textbf{\color{#35bf28}+10.24\%}$
test_update 0.1051ms 18.2483μs 54.7995 KOps/s 54.3669 KOps/s $\color{#35bf28}+0.80\%$
test_update_nested 0.1075ms 24.9091μs 40.1459 KOps/s 39.6962 KOps/s $\color{#35bf28}+1.13\%$
test_set_nested 88.3160μs 18.7981μs 53.1968 KOps/s 53.8061 KOps/s $\color{#d91a1a}-1.13\%$
test_set_nested_new 87.1150μs 22.8372μs 43.7882 KOps/s 44.2415 KOps/s $\color{#d91a1a}-1.02\%$
test_select 0.9184ms 46.1765μs 21.6561 KOps/s 22.7378 KOps/s $\color{#d91a1a}-4.76\%$
test_to 73.5950μs 51.3283μs 19.4824 KOps/s 20.0835 KOps/s $\color{#d91a1a}-2.99\%$
test_to_nonblocking 64.3740μs 33.4815μs 29.8672 KOps/s 28.4335 KOps/s $\textbf{\color{#35bf28}+5.04\%}$
test_unbind_speed 0.4160ms 0.3569ms 2.8017 KOps/s 2.8355 KOps/s $\color{#d91a1a}-1.19\%$
test_unbind_speed_stack0 61.7920ms 4.5454ms 220.0032 Ops/s 254.9646 Ops/s $\textbf{\color{#d91a1a}-13.71\%}$
test_unbind_speed_stack1 2.0391μs 0.5339μs 1.8731 MOps/s 1.8712 MOps/s $\color{#35bf28}+0.10\%$
test_split 1.9844ms 1.7003ms 588.1307 Ops/s 549.0604 Ops/s $\textbf{\color{#35bf28}+7.12\%}$
test_chunk 53.6464ms 1.7898ms 558.7124 Ops/s 563.3509 Ops/s $\color{#d91a1a}-0.82\%$
test_creation[device0] 0.4160ms 0.3066ms 3.2617 KOps/s 3.2787 KOps/s $\color{#d91a1a}-0.52\%$
test_creation[device1] 54.2075ms 0.3325ms 3.0075 KOps/s 3.2402 KOps/s $\textbf{\color{#d91a1a}-7.18\%}$
test_creation_from_tensor 0.5497ms 0.3329ms 3.0038 KOps/s 3.0156 KOps/s $\color{#d91a1a}-0.39\%$
test_add_one[memmap_tensor0] 0.1578ms 22.9458μs 43.5809 KOps/s 43.7343 KOps/s $\color{#d91a1a}-0.35\%$
test_add_one[memmap_tensor1] 0.2053ms 70.3150μs 14.2217 KOps/s 13.9332 KOps/s $\color{#35bf28}+2.07\%$
test_contiguous[memmap_tensor0] 23.5920μs 5.5504μs 180.1666 KOps/s 180.1596 KOps/s $+0.00\%$
test_contiguous[memmap_tensor1] 45.7520μs 20.9585μs 47.7133 KOps/s 46.3434 KOps/s $\color{#35bf28}+2.96\%$
test_stack[memmap_tensor0] 52.4730μs 18.2914μs 54.6705 KOps/s 54.4328 KOps/s $\color{#35bf28}+0.44\%$
test_stack[memmap_tensor1] 0.1475ms 71.1823μs 14.0484 KOps/s 13.7696 KOps/s $\color{#35bf28}+2.02\%$
test_memmaptd_index 0.4737ms 0.4162ms 2.4028 KOps/s 2.4601 KOps/s $\color{#d91a1a}-2.33\%$
test_memmaptd_index_astensor 0.5171ms 0.4722ms 2.1177 KOps/s 2.1377 KOps/s $\color{#d91a1a}-0.94\%$
test_memmaptd_index_op 0.8176ms 0.7292ms 1.3713 KOps/s 1.4017 KOps/s $\color{#d91a1a}-2.17\%$
test_reshape_pytree 39.9130μs 20.8317μs 48.0038 KOps/s 47.8622 KOps/s $\color{#35bf28}+0.30\%$
test_reshape_td 52.9140μs 28.9401μs 34.5541 KOps/s 34.4957 KOps/s $\color{#35bf28}+0.17\%$
test_view_pytree 40.7330μs 20.4395μs 48.9250 KOps/s 48.6757 KOps/s $\color{#35bf28}+0.51\%$
test_view_td 27.8910μs 3.9953μs 250.2945 KOps/s 250.3900 KOps/s $\color{#d91a1a}-0.04\%$
test_unbind_pytree 49.3330μs 25.6927μs 38.9216 KOps/s 39.0379 KOps/s $\color{#d91a1a}-0.30\%$
test_unbind_td 81.1550μs 56.0498μs 17.8413 KOps/s 18.3201 KOps/s $\color{#d91a1a}-2.61\%$
test_split_pytree 42.5630μs 23.8427μs 41.9416 KOps/s 42.3878 KOps/s $\color{#d91a1a}-1.05\%$
test_split_td 74.8050μs 44.1966μs 22.6262 KOps/s 22.5943 KOps/s $\color{#35bf28}+0.14\%$
test_add_pytree 51.5630μs 31.2012μs 32.0501 KOps/s 32.5730 KOps/s $\color{#d91a1a}-1.61\%$
test_add_td 70.9250μs 42.9578μs 23.2786 KOps/s 24.3082 KOps/s $\color{#d91a1a}-4.24\%$
test_distributed 20.6620μs 5.4879μs 182.2186 KOps/s 182.3864 KOps/s $\color{#d91a1a}-0.09\%$
test_tdmodule 33.6420μs 16.4462μs 60.8043 KOps/s 60.0704 KOps/s $\color{#35bf28}+1.22\%$
test_tdmodule_dispatch 0.1237ms 32.4015μs 30.8628 KOps/s 30.3135 KOps/s $\color{#35bf28}+1.81\%$
test_tdseq 34.8220μs 19.5496μs 51.1520 KOps/s 51.3968 KOps/s $\color{#d91a1a}-0.48\%$
test_tdseq_dispatch 61.4340μs 35.4883μs 28.1783 KOps/s 28.3164 KOps/s $\color{#d91a1a}-0.49\%$
test_instantiation_functorch 2.0282ms 1.6634ms 601.1660 Ops/s 602.8886 Ops/s $\color{#d91a1a}-0.29\%$
test_instantiation_td 1.6712ms 1.1742ms 851.6326 Ops/s 864.1386 Ops/s $\color{#d91a1a}-1.45\%$
test_exec_functorch 0.2095ms 0.1573ms 6.3561 KOps/s 6.4421 KOps/s $\color{#d91a1a}-1.34\%$
test_exec_functional_call 0.2023ms 0.1571ms 6.3672 KOps/s 6.4233 KOps/s $\color{#d91a1a}-0.87\%$
test_exec_td 0.2033ms 0.1447ms 6.9118 KOps/s 6.7978 KOps/s $\color{#35bf28}+1.68\%$
test_exec_td_decorator 64.3547ms 0.1990ms 5.0263 KOps/s 5.3907 KOps/s $\textbf{\color{#d91a1a}-6.76\%}$
test_vmap_mlp_speed[True-True] 1.1143ms 1.0502ms 952.2405 Ops/s 948.1179 Ops/s $\color{#35bf28}+0.43\%$
test_vmap_mlp_speed[True-False] 0.7085ms 0.6057ms 1.6509 KOps/s 1.6356 KOps/s $\color{#35bf28}+0.93\%$
test_vmap_mlp_speed[False-True] 1.2350ms 0.9582ms 1.0437 KOps/s 1.0314 KOps/s $\color{#35bf28}+1.19\%$
test_vmap_mlp_speed[False-False] 0.5989ms 0.5353ms 1.8680 KOps/s 1.8486 KOps/s $\color{#35bf28}+1.05\%$
test_vmap_mlp_speed_decorator[True-True] 2.4716ms 1.9953ms 501.1751 Ops/s 500.0344 Ops/s $\color{#35bf28}+0.23\%$
test_vmap_mlp_speed_decorator[True-False] 1.0522ms 0.6490ms 1.5409 KOps/s 1.5295 KOps/s $\color{#35bf28}+0.74\%$
test_vmap_mlp_speed_decorator[False-True] 2.1624ms 1.7269ms 579.0652 Ops/s 576.1480 Ops/s $\color{#35bf28}+0.51\%$
test_vmap_mlp_speed_decorator[False-False] 0.9938ms 0.5497ms 1.8192 KOps/s 1.6633 KOps/s $\textbf{\color{#35bf28}+9.38\%}$
test_vmap_transformer_speed[True-True] 12.5894ms 12.3254ms 81.1336 Ops/s 80.5312 Ops/s $\color{#35bf28}+0.75\%$
test_vmap_transformer_speed[True-False] 8.3336ms 8.1261ms 123.0597 Ops/s 120.9641 Ops/s $\color{#35bf28}+1.73\%$
test_vmap_transformer_speed[False-True] 12.4710ms 12.2603ms 81.5639 Ops/s 81.3622 Ops/s $\color{#35bf28}+0.25\%$
test_vmap_transformer_speed[False-False] 8.2494ms 8.0724ms 123.8784 Ops/s 122.4047 Ops/s $\color{#35bf28}+1.20\%$
test_vmap_transformer_speed_decorator[True-True] 63.8816ms 62.8526ms 15.9102 Ops/s 15.8575 Ops/s $\color{#35bf28}+0.33\%$
test_vmap_transformer_speed_decorator[True-False] 97.9972ms 21.2384ms 47.0845 Ops/s 46.7779 Ops/s $\color{#35bf28}+0.66\%$
test_vmap_transformer_speed_decorator[False-True] 58.6484ms 57.2288ms 17.4737 Ops/s 17.4741 Ops/s $-0.00\%$
test_vmap_transformer_speed_decorator[False-False] 21.4340ms 19.3317ms 51.7285 Ops/s 51.3657 Ops/s $\color{#35bf28}+0.71\%$

Labels: bug, CLA Signed

2 participants