[BugFix] Fix (keys, values) in sub #907

Merged 1 commit on Jul 22, 2024


vmoens (Contributor) commented on Jul 22, 2024

No description provided.

facebook-github-bot added the "CLA Signed" label on Jul 22, 2024
vmoens added the "bug" label on Jul 22, 2024
vmoens merged commit 43faf04 into main on Jul 22, 2024
21 of 26 checks passed

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: 28. Worsened: 6.
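
The Change column appears to be the relative difference between the PR's throughput (Ops) and the throughput on repo HEAD, and the Improved and Worsened totals match the number of bolded entries in the table, all of which exceed 5% in magnitude in the rows shown. The sketch below reproduces that arithmetic on three rows from the CPU table. The 5% threshold and the helper names `relative_change` and `classify` are assumptions inferred from the numbers, not taken from the benchmark bot.

```python
def relative_change(ops: float, ops_head: float) -> float:
    """Percentage change of the PR's throughput relative to repo HEAD."""
    return (ops - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold:
        return "improved"
    if change_pct < -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from the CPU table; both values in a
# pair use the same unit, so the ratio is unit-free.
rows = {
    "test_plain_set_nested": (44.0096, 42.9488),  # KOps/s
    "test_membership": (1.3251, 1.0378),          # MOps/s
    "test_serialize_model": (7.0825, 7.6632),     # Ops/s
}

for name, (ops, ops_head) in rows.items():
    change = relative_change(ops, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Because the table values are rounded, a recomputed percentage can differ from the reported one in the last digit.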

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 41.3970μs 22.7223μs 44.0096 KOps/s 42.9488 KOps/s $\color{#35bf28}+2.47\%$
test_plain_set_stack_nested 63.8190μs 23.1330μs 43.2283 KOps/s 42.6912 KOps/s $\color{#35bf28}+1.26\%$
test_plain_set_nested_inplace 65.0420μs 25.0312μs 39.9501 KOps/s 38.7644 KOps/s $\color{#35bf28}+3.06\%$
test_plain_set_stack_nested_inplace 55.6240μs 24.7946μs 40.3313 KOps/s 38.7870 KOps/s $\color{#35bf28}+3.98\%$
test_items 28.8130μs 2.6502μs 377.3367 KOps/s 372.5310 KOps/s $\color{#35bf28}+1.29\%$
test_items_nested 0.5623ms 0.3695ms 2.7063 KOps/s 2.7213 KOps/s $\color{#d91a1a}-0.55\%$
test_items_nested_locked 0.5869ms 0.3751ms 2.6658 KOps/s 2.7018 KOps/s $\color{#d91a1a}-1.33\%$
test_items_nested_leaf 0.1605ms 87.9632μs 11.3684 KOps/s 11.2806 KOps/s $\color{#35bf28}+0.78\%$
test_items_stack_nested 0.7602ms 0.3716ms 2.6908 KOps/s 2.7426 KOps/s $\color{#d91a1a}-1.89\%$
test_items_stack_nested_leaf 0.1704ms 89.4505μs 11.1794 KOps/s 11.3340 KOps/s $\color{#d91a1a}-1.36\%$
test_items_stack_nested_locked 0.6034ms 0.3696ms 2.7060 KOps/s 2.7500 KOps/s $\color{#d91a1a}-1.60\%$
test_keys 28.2130μs 3.8927μs 256.8899 KOps/s 257.5916 KOps/s $\color{#d91a1a}-0.27\%$
test_keys_nested 0.2977ms 0.1453ms 6.8811 KOps/s 7.0211 KOps/s $\color{#d91a1a}-1.99\%$
test_keys_nested_locked 0.7078ms 0.1521ms 6.5747 KOps/s 6.6802 KOps/s $\color{#d91a1a}-1.58\%$
test_keys_nested_leaf 0.2384ms 0.1241ms 8.0581 KOps/s 8.1164 KOps/s $\color{#d91a1a}-0.72\%$
test_keys_stack_nested 0.2618ms 0.1457ms 6.8616 KOps/s 6.9307 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_stack_nested_leaf 0.2296ms 0.1228ms 8.1436 KOps/s 8.1600 KOps/s $\color{#d91a1a}-0.20\%$
test_keys_stack_nested_locked 0.2869ms 0.1508ms 6.6327 KOps/s 6.6890 KOps/s $\color{#d91a1a}-0.84\%$
test_values 5.0454μs 1.1799μs 847.5023 KOps/s 881.5315 KOps/s $\color{#d91a1a}-3.86\%$
test_values_nested 0.1114ms 50.1914μs 19.9237 KOps/s 19.9377 KOps/s $\color{#d91a1a}-0.07\%$
test_values_nested_locked 99.0950μs 49.8736μs 20.0507 KOps/s 20.1141 KOps/s $\color{#d91a1a}-0.32\%$
test_values_nested_leaf 0.1002ms 44.9677μs 22.2382 KOps/s 22.2438 KOps/s $\color{#d91a1a}-0.03\%$
test_values_stack_nested 0.1342ms 50.5068μs 19.7993 KOps/s 20.0596 KOps/s $\color{#d91a1a}-1.30\%$
test_values_stack_nested_leaf 86.7020μs 44.7836μs 22.3296 KOps/s 22.1042 KOps/s $\color{#35bf28}+1.02\%$
test_values_stack_nested_locked 93.3240μs 50.1876μs 19.9252 KOps/s 19.6618 KOps/s $\color{#35bf28}+1.34\%$
test_membership 4.0519μs 0.7547μs 1.3251 MOps/s 1.0378 MOps/s $\textbf{\color{#35bf28}+27.68\%}$
test_membership_nested 25.4480μs 2.6621μs 375.6388 KOps/s 352.2627 KOps/s $\textbf{\color{#35bf28}+6.64\%}$
test_membership_nested_leaf 29.9150μs 2.6633μs 375.4752 KOps/s 355.0363 KOps/s $\textbf{\color{#35bf28}+5.76\%}$
test_membership_stacked_nested 23.1430μs 2.6516μs 377.1337 KOps/s 367.1076 KOps/s $\color{#35bf28}+2.73\%$
test_membership_stacked_nested_leaf 20.6290μs 2.6405μs 378.7155 KOps/s 366.1049 KOps/s $\color{#35bf28}+3.44\%$
test_membership_nested_last 0.1291ms 4.2525μs 235.1575 KOps/s 244.1595 KOps/s $\color{#d91a1a}-3.69\%$
test_membership_nested_leaf_last 43.8120μs 4.1427μs 241.3895 KOps/s 242.6957 KOps/s $\color{#d91a1a}-0.54\%$
test_membership_stacked_nested_last 27.2500μs 4.1068μs 243.5007 KOps/s 254.3055 KOps/s $\color{#d91a1a}-4.25\%$
test_membership_stacked_nested_leaf_last 19.4760μs 4.1184μs 242.8111 KOps/s 248.3047 KOps/s $\color{#d91a1a}-2.21\%$
test_nested_getleaf 39.0020μs 11.0709μs 90.3269 KOps/s 93.5041 KOps/s $\color{#d91a1a}-3.40\%$
test_nested_get 54.8830μs 10.3804μs 96.3351 KOps/s 96.8802 KOps/s $\color{#d91a1a}-0.56\%$
test_stacked_getleaf 31.7990μs 10.9362μs 91.4391 KOps/s 93.4414 KOps/s $\color{#d91a1a}-2.14\%$
test_stacked_get 0.2606ms 10.5305μs 94.9625 KOps/s 96.5816 KOps/s $\color{#d91a1a}-1.68\%$
test_nested_getitemleaf 39.3540μs 11.4960μs 86.9867 KOps/s 88.9138 KOps/s $\color{#d91a1a}-2.17\%$
test_nested_getitem 33.9630μs 10.5635μs 94.6654 KOps/s 97.3102 KOps/s $\color{#d91a1a}-2.72\%$
test_stacked_getitemleaf 33.5930μs 11.4031μs 87.6951 KOps/s 89.2123 KOps/s $\color{#d91a1a}-1.70\%$
test_stacked_getitem 33.6630μs 10.5168μs 95.0859 KOps/s 96.8387 KOps/s $\color{#d91a1a}-1.81\%$
test_lock_nested 1.0027ms 0.5149ms 1.9423 KOps/s 1.6350 KOps/s $\textbf{\color{#35bf28}+18.80\%}$
test_lock_stack_nested 0.8449ms 0.4890ms 2.0450 KOps/s 2.0399 KOps/s $\color{#35bf28}+0.25\%$
test_unlock_nested 0.8382ms 0.4376ms 2.2851 KOps/s 2.2946 KOps/s $\color{#d91a1a}-0.41\%$
test_unlock_stack_nested 0.6994ms 0.4029ms 2.4820 KOps/s 2.4852 KOps/s $\color{#d91a1a}-0.13\%$
test_flatten_speed 0.1967ms 0.1060ms 9.4354 KOps/s 9.2749 KOps/s $\color{#35bf28}+1.73\%$
test_unflatten_speed 0.6569ms 0.4482ms 2.2313 KOps/s 2.2290 KOps/s $\color{#35bf28}+0.10\%$
test_common_ops 1.9171ms 1.1469ms 871.8908 Ops/s 827.1231 Ops/s $\textbf{\color{#35bf28}+5.41\%}$
test_creation 26.0090μs 2.5031μs 399.5028 KOps/s 400.6651 KOps/s $\color{#d91a1a}-0.29\%$
test_creation_empty 55.5840μs 20.0006μs 49.9985 KOps/s 44.4805 KOps/s $\textbf{\color{#35bf28}+12.41\%}$
test_creation_nested_1 53.6900μs 23.6549μs 42.2746 KOps/s 38.9234 KOps/s $\textbf{\color{#35bf28}+8.61\%}$
test_creation_nested_2 1.3242ms 27.4953μs 36.3699 KOps/s 34.2073 KOps/s $\textbf{\color{#35bf28}+6.32\%}$
test_clone 67.9070μs 17.1473μs 58.3184 KOps/s 55.3606 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_getitem[int] 0.8654ms 13.1401μs 76.1030 KOps/s 74.2859 KOps/s $\color{#35bf28}+2.45\%$
test_getitem[slice_int] 0.1324ms 33.6768μs 29.6940 KOps/s 28.5413 KOps/s $\color{#35bf28}+4.04\%$
test_getitem[range] 0.1593ms 58.3569μs 17.1359 KOps/s 16.5401 KOps/s $\color{#35bf28}+3.60\%$
test_getitem[tuple] 0.1256ms 27.1581μs 36.8215 KOps/s 36.0892 KOps/s $\color{#35bf28}+2.03\%$
test_getitem[list] 0.1522ms 53.1790μs 18.8044 KOps/s 18.2029 KOps/s $\color{#35bf28}+3.30\%$
test_setitem_dim[int] 61.2740μs 37.4377μs 26.7110 KOps/s 26.2411 KOps/s $\color{#35bf28}+1.79\%$
test_setitem_dim[slice_int] 0.1349ms 75.2224μs 13.2939 KOps/s 12.8331 KOps/s $\color{#35bf28}+3.59\%$
test_setitem_dim[range] 0.1613ms 97.5180μs 10.2545 KOps/s 10.1237 KOps/s $\color{#35bf28}+1.29\%$
test_setitem_dim[tuple] 0.1021ms 64.8460μs 15.4212 KOps/s 15.8374 KOps/s $\color{#d91a1a}-2.63\%$
test_setitem 88.7650μs 30.9344μs 32.3265 KOps/s 30.3524 KOps/s $\textbf{\color{#35bf28}+6.50\%}$
test_set 87.6430μs 30.2976μs 33.0059 KOps/s 30.6419 KOps/s $\textbf{\color{#35bf28}+7.72\%}$
test_set_shared 3.4067ms 0.2170ms 4.6074 KOps/s 4.5915 KOps/s $\color{#35bf28}+0.35\%$
test_update 0.8476ms 41.2699μs 24.2308 KOps/s 24.1848 KOps/s $\color{#35bf28}+0.19\%$
test_update_nested 0.1128ms 49.0020μs 20.4073 KOps/s 19.3734 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_update__nested 0.1121ms 34.5393μs 28.9525 KOps/s 27.3815 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_set_nested 93.1530μs 33.0043μs 30.2991 KOps/s 28.4198 KOps/s $\textbf{\color{#35bf28}+6.61\%}$
test_set_nested_new 86.3710μs 38.1884μs 26.1860 KOps/s 25.0105 KOps/s $\color{#35bf28}+4.70\%$
test_select 0.1453ms 55.9143μs 17.8845 KOps/s 17.5315 KOps/s $\color{#35bf28}+2.01\%$
test_select_nested 0.1162ms 61.6962μs 16.2085 KOps/s 16.8244 KOps/s $\color{#d91a1a}-3.66\%$
test_exclude_nested 0.1547ms 82.5450μs 12.1146 KOps/s 12.2687 KOps/s $\color{#d91a1a}-1.26\%$
test_empty[True] 0.5343ms 0.3493ms 2.8629 KOps/s 2.9206 KOps/s $\color{#d91a1a}-1.97\%$
test_empty[False] 7.2913μs 1.2757μs 783.8547 KOps/s 767.2170 KOps/s $\color{#35bf28}+2.17\%$
test_unbind_speed 0.4695ms 0.3232ms 3.0938 KOps/s 3.0006 KOps/s $\color{#35bf28}+3.11\%$
test_unbind_speed_stack0 0.6772ms 0.3227ms 3.0991 KOps/s 3.0645 KOps/s $\color{#35bf28}+1.13\%$
test_unbind_speed_stack1 77.2287ms 0.8258ms 1.2109 KOps/s 1.2703 KOps/s $\color{#d91a1a}-4.68\%$
test_split 74.9783ms 2.2543ms 443.5996 Ops/s 394.6534 Ops/s $\textbf{\color{#35bf28}+12.40\%}$
test_chunk 75.8780ms 2.2670ms 441.1028 Ops/s 454.7622 Ops/s $\color{#d91a1a}-3.00\%$
test_creation[device0] 0.2238ms 0.1217ms 8.2189 KOps/s 7.9083 KOps/s $\color{#35bf28}+3.93\%$
test_creation_from_tensor 3.6174ms 0.1226ms 8.1533 KOps/s 8.2892 KOps/s $\color{#d91a1a}-1.64\%$
test_add_one[memmap_tensor0] 0.1557ms 7.7420μs 129.1661 KOps/s 121.5098 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_contiguous[memmap_tensor0] 17.4930μs 2.1958μs 455.4133 KOps/s 439.9498 KOps/s $\color{#35bf28}+3.51\%$
test_stack[memmap_tensor0] 45.2050μs 5.7251μs 174.6685 KOps/s 159.9693 KOps/s $\textbf{\color{#35bf28}+9.19\%}$
test_memmaptd_index 1.0570ms 0.4420ms 2.2624 KOps/s 2.2521 KOps/s $\color{#35bf28}+0.46\%$
test_memmaptd_index_astensor 0.7569ms 0.5219ms 1.9159 KOps/s 1.9262 KOps/s $\color{#d91a1a}-0.53\%$
test_memmaptd_index_op 1.8859ms 1.0994ms 909.5905 Ops/s 786.3875 Ops/s $\textbf{\color{#35bf28}+15.67\%}$
test_serialize_model 0.2012s 0.1412s 7.0825 Ops/s 7.6632 Ops/s $\textbf{\color{#d91a1a}-7.58\%}$
test_serialize_model_pickle 0.4476s 0.3950s 2.5316 Ops/s 2.5328 Ops/s $\color{#d91a1a}-0.05\%$
test_serialize_weights 0.1311s 0.1244s 8.0365 Ops/s 7.1286 Ops/s $\textbf{\color{#35bf28}+12.74\%}$
test_serialize_weights_returnearly 0.2449s 0.1823s 5.4867 Ops/s 6.0668 Ops/s $\textbf{\color{#d91a1a}-9.56\%}$
test_serialize_weights_pickle 0.4946s 0.4199s 2.3814 Ops/s 2.5223 Ops/s $\textbf{\color{#d91a1a}-5.59\%}$
test_serialize_weights_filesystem 0.1460s 0.1430s 6.9911 Ops/s 7.0024 Ops/s $\color{#d91a1a}-0.16\%$
test_serialize_model_filesystem 0.1575s 0.1518s 6.5863 Ops/s 6.5629 Ops/s $\color{#35bf28}+0.36\%$
test_reshape_pytree 84.6380μs 39.5313μs 25.2964 KOps/s 24.9644 KOps/s $\color{#35bf28}+1.33\%$
test_reshape_td 0.1182ms 49.1289μs 20.3546 KOps/s 19.3407 KOps/s $\textbf{\color{#35bf28}+5.24\%}$
test_view_pytree 93.9550μs 39.5774μs 25.2669 KOps/s 25.3909 KOps/s $\color{#d91a1a}-0.49\%$
test_view_td 0.1500ms 56.8299μs 17.5964 KOps/s 17.5768 KOps/s $\color{#35bf28}+0.11\%$
test_unbind_pytree 71.7140μs 35.7072μs 28.0055 KOps/s 27.6903 KOps/s $\color{#35bf28}+1.14\%$
test_unbind_td 0.3267ms 47.9816μs 20.8413 KOps/s 20.5346 KOps/s $\color{#35bf28}+1.49\%$
test_split_pytree 82.6240μs 38.8807μs 25.7197 KOps/s 25.0254 KOps/s $\color{#35bf28}+2.77\%$
test_split_td 75.4821ms 71.5753μs 13.9713 KOps/s 15.4184 KOps/s $\textbf{\color{#d91a1a}-9.39\%}$
test_add_pytree 0.1321ms 43.5371μs 22.9689 KOps/s 21.6018 KOps/s $\textbf{\color{#35bf28}+6.33\%}$
test_add_td 0.1878ms 87.2304μs 11.4639 KOps/s 10.5285 KOps/s $\textbf{\color{#35bf28}+8.88\%}$
test_distributed 0.2670ms 0.1329ms 7.5245 KOps/s 7.6189 KOps/s $\color{#d91a1a}-1.24\%$
test_tdmodule 33.2420μs 16.9692μs 58.9304 KOps/s 51.7545 KOps/s $\textbf{\color{#35bf28}+13.87\%}$
test_tdmodule_dispatch 59.9920μs 36.3369μs 27.5202 KOps/s 25.0349 KOps/s $\textbf{\color{#35bf28}+9.93\%}$
test_tdseq 43.0200μs 19.8579μs 50.3579 KOps/s 47.5730 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_tdseq_dispatch 62.7470μs 40.6067μs 24.6265 KOps/s 23.0984 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_instantiation_functorch 1.7539ms 1.5976ms 625.9477 Ops/s 626.3290 Ops/s $\color{#d91a1a}-0.06\%$
test_instantiation_td 2.2004ms 1.1651ms 858.3297 Ops/s 863.5623 Ops/s $\color{#d91a1a}-0.61\%$
test_exec_functorch 0.3418ms 0.1806ms 5.5362 KOps/s 5.5202 KOps/s $\color{#35bf28}+0.29\%$
test_exec_functional_call 5.3986ms 0.1736ms 5.7592 KOps/s 5.7652 KOps/s $\color{#d91a1a}-0.10\%$
test_exec_td 0.3373ms 0.1748ms 5.7198 KOps/s 5.7485 KOps/s $\color{#d91a1a}-0.50\%$
test_exec_td_decorator 0.4673ms 0.2617ms 3.8207 KOps/s 3.8998 KOps/s $\color{#d91a1a}-2.03\%$
test_vmap_mlp_speed[True-True] 0.8755ms 0.6111ms 1.6363 KOps/s 1.6015 KOps/s $\color{#35bf28}+2.17\%$
test_vmap_mlp_speed[True-False] 0.8763ms 0.6156ms 1.6244 KOps/s 1.6107 KOps/s $\color{#35bf28}+0.85\%$
test_vmap_mlp_speed[False-True] 0.8169ms 0.5013ms 1.9947 KOps/s 1.9531 KOps/s $\color{#35bf28}+2.13\%$
test_vmap_mlp_speed[False-False] 1.1274ms 0.4995ms 2.0019 KOps/s 1.9537 KOps/s $\color{#35bf28}+2.47\%$
test_vmap_mlp_speed_decorator[True-True] 1.0838ms 0.7077ms 1.4131 KOps/s 1.4084 KOps/s $\color{#35bf28}+0.33\%$
test_vmap_mlp_speed_decorator[True-False] 1.1696ms 0.7070ms 1.4144 KOps/s 1.3557 KOps/s $\color{#35bf28}+4.33\%$
test_vmap_mlp_speed_decorator[False-True] 1.1555ms 0.6042ms 1.6552 KOps/s 1.7191 KOps/s $\color{#d91a1a}-3.72\%$
test_vmap_mlp_speed_decorator[False-False] 0.9156ms 0.5779ms 1.7303 KOps/s 1.7153 KOps/s $\color{#35bf28}+0.88\%$
test_to_module_speed[True] 2.3631ms 1.8233ms 548.4482 Ops/s 556.4121 Ops/s $\color{#d91a1a}-1.43\%$
test_to_module_speed[False] 2.2468ms 1.7745ms 563.5460 Ops/s 562.0112 Ops/s $\color{#35bf28}+0.27\%$
test_tc_init 96.4300μs 45.5573μs 21.9504 KOps/s 20.4369 KOps/s $\textbf{\color{#35bf28}+7.41\%}$
test_tc_init_nested 0.1679ms 94.6353μs 10.5669 KOps/s 10.1004 KOps/s $\color{#35bf28}+4.62\%$
test_tc_first_layer_tensor 54.1100μs 9.0989μs 109.9030 KOps/s 108.7524 KOps/s $\color{#35bf28}+1.06\%$
test_tc_first_layer_nontensor 53.6600μs 9.1465μs 109.3312 KOps/s 109.0340 KOps/s $\color{#35bf28}+0.27\%$
test_tc_second_layer_tensor 19.6860μs 2.8978μs 345.0876 KOps/s 351.9284 KOps/s $\color{#d91a1a}-1.94\%$
test_tc_second_layer_nontensor 55.8040μs 10.2803μs 97.2738 KOps/s 97.3509 KOps/s $\color{#d91a1a}-0.08\%$
test_unbind 97.5956ms 13.7305ms 72.8308 Ops/s 75.1765 Ops/s $\color{#d91a1a}-3.12\%$
test_full_like 8.9573ms 7.2619ms 137.7050 Ops/s 139.4639 Ops/s $\color{#d91a1a}-1.26\%$
test_zeros_like 13.0549ms 6.4885ms 154.1185 Ops/s 159.6210 Ops/s $\color{#d91a1a}-3.45\%$
test_ones_like 14.5941ms 7.6714ms 130.3535 Ops/s 140.7343 Ops/s $\textbf{\color{#d91a1a}-7.38\%}$
test_clone 14.1756ms 9.3179ms 107.3198 Ops/s 113.6487 Ops/s $\textbf{\color{#d91a1a}-5.57\%}$
test_squeeze 64.1000μs 13.9890μs 71.4846 KOps/s 69.6551 KOps/s $\color{#35bf28}+2.63\%$
test_unsqueeze 0.1957ms 97.6287μs 10.2429 KOps/s 9.2213 KOps/s $\textbf{\color{#35bf28}+11.08\%}$
test_split 0.4519ms 0.2085ms 4.7951 KOps/s 4.6939 KOps/s $\color{#35bf28}+2.16\%$
test_permute 0.4665ms 0.2265ms 4.4147 KOps/s 4.3355 KOps/s $\color{#35bf28}+1.83\%$
test_stack 31.6002ms 25.2329ms 39.6308 Ops/s 41.3034 Ops/s $\color{#d91a1a}-4.05\%$
test_cat 29.5932ms 25.0251ms 39.9599 Ops/s 41.5027 Ops/s $\color{#d91a1a}-3.72\%$
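
In the rows above, the Ops column is consistent with the reciprocal of the Mean column once units are normalized (for example, a mean of 22.7223μs corresponds to about 44.0 KOps/s). A minimal sketch of that conversion follows; `parse_duration` and `ops_per_second` are hypothetical helper names used only for illustration, and the reciprocal relationship is inferred from the table values, not confirmed by the benchmark tooling.

```python
# Seconds per unit for the duration suffixes that appear in the table.
# "\u03bcs" is the Greek-mu spelling of microseconds.
UNIT_SECONDS = {"ms": 1e-3, "\u03bcs": 1e-6, "s": 1.0}


def parse_duration(text: str) -> float:
    """Convert a table cell such as '0.3695ms' into seconds."""
    # Accept the micro sign (U+00B5) as well as Greek mu (U+03BC).
    text = text.strip().replace("\u00b5", "\u03bc")
    # Check two-character suffixes before the bare 's' suffix.
    for unit in ("ms", "\u03bcs", "s"):
        if text.endswith(unit):
            return float(text[: -len(unit)]) * UNIT_SECONDS[unit]
    raise ValueError(f"unrecognized duration: {text!r}")


def ops_per_second(mean: str) -> float:
    """Throughput implied by a mean duration, in operations per second."""
    return 1.0 / parse_duration(mean)


# Mean values copied from the CPU table.
for name, mean in [
    ("test_plain_set_nested", "22.7223\u03bcs"),
    ("test_common_ops", "1.1469ms"),
    ("test_serialize_model", "0.1412s"),
]:
    print(f"{name}: {ops_per_second(mean):.4f} Ops/s")
```

Since the Mean column is itself rounded, the recomputed throughput matches the Ops column only to within rounding.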


⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 219. Improved: 18. Worsened: 10.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 26.9410μs 15.4074μs 64.9040 KOps/s 59.5570 KOps/s $\textbf{\color{#35bf28}+8.98\%}$
test_plain_set_stack_nested 37.0600μs 15.3124μs 65.3066 KOps/s 59.6148 KOps/s $\textbf{\color{#35bf28}+9.55\%}$
test_plain_set_nested_inplace 45.7110μs 16.4164μs 60.9147 KOps/s 55.9982 KOps/s $\textbf{\color{#35bf28}+8.78\%}$
test_plain_set_stack_nested_inplace 43.2710μs 16.4360μs 60.8421 KOps/s 56.4249 KOps/s $\textbf{\color{#35bf28}+7.83\%}$
test_items 16.3400μs 4.6081μs 217.0075 KOps/s 214.4173 KOps/s $\color{#35bf28}+1.21\%$
test_items_nested 0.4405ms 0.3875ms 2.5808 KOps/s 2.5366 KOps/s $\color{#35bf28}+1.74\%$
test_items_nested_locked 0.4290ms 0.3909ms 2.5583 KOps/s 2.5504 KOps/s $\color{#35bf28}+0.31\%$
test_items_nested_leaf 0.1064ms 85.5378μs 11.6907 KOps/s 11.5313 KOps/s $\color{#35bf28}+1.38\%$
test_items_stack_nested 0.4498ms 0.3952ms 2.5303 KOps/s 2.5347 KOps/s $\color{#d91a1a}-0.17\%$
test_items_stack_nested_leaf 0.1073ms 86.1002μs 11.6144 KOps/s 11.5491 KOps/s $\color{#35bf28}+0.56\%$
test_items_stack_nested_locked 0.4273ms 0.3988ms 2.5073 KOps/s 2.5264 KOps/s $\color{#d91a1a}-0.75\%$
test_keys 21.9510μs 4.3641μs 229.1415 KOps/s 226.0935 KOps/s $\color{#35bf28}+1.35\%$
test_keys_nested 91.7630μs 67.1304μs 14.8964 KOps/s 15.1861 KOps/s $\color{#d91a1a}-1.91\%$
test_keys_nested_locked 1.8699ms 71.0104μs 14.0824 KOps/s 13.7021 KOps/s $\color{#35bf28}+2.78\%$
test_keys_nested_leaf 79.6810μs 56.0910μs 17.8282 KOps/s 17.2879 KOps/s $\color{#35bf28}+3.13\%$
test_keys_stack_nested 86.1620μs 65.3446μs 15.3035 KOps/s 15.3780 KOps/s $\color{#d91a1a}-0.48\%$
test_keys_stack_nested_leaf 80.6520μs 56.7201μs 17.6304 KOps/s 17.3493 KOps/s $\color{#35bf28}+1.62\%$
test_keys_stack_nested_locked 0.1014ms 71.9450μs 13.8995 KOps/s 13.7432 KOps/s $\color{#35bf28}+1.14\%$
test_values 8.6567μs 1.7559μs 569.5081 KOps/s 567.3254 KOps/s $\color{#35bf28}+0.38\%$
test_values_nested 54.9710μs 33.8518μs 29.5406 KOps/s 29.5624 KOps/s $\color{#d91a1a}-0.07\%$
test_values_nested_locked 58.6110μs 35.8708μs 27.8778 KOps/s 28.0283 KOps/s $\color{#d91a1a}-0.54\%$
test_values_nested_leaf 44.6310μs 30.0903μs 33.2333 KOps/s 33.3791 KOps/s $\color{#d91a1a}-0.44\%$
test_values_stack_nested 60.0510μs 34.6169μs 28.8876 KOps/s 29.4781 KOps/s $\color{#d91a1a}-2.00\%$
test_values_stack_nested_leaf 50.7010μs 30.9576μs 32.3022 KOps/s 33.1711 KOps/s $\color{#d91a1a}-2.62\%$
test_values_stack_nested_locked 54.4920μs 36.7370μs 27.2205 KOps/s 28.0019 KOps/s $\color{#d91a1a}-2.79\%$
test_membership 1.4145μs 0.5450μs 1.8349 MOps/s 1.8462 MOps/s $\color{#d91a1a}-0.61\%$
test_membership_nested 18.1200μs 2.0881μs 478.9077 KOps/s 471.6394 KOps/s $\color{#35bf28}+1.54\%$
test_membership_nested_leaf 16.0755μs 2.0520μs 487.3194 KOps/s 488.4312 KOps/s $\color{#d91a1a}-0.23\%$
test_membership_stacked_nested 24.7510μs 2.0837μs 479.9114 KOps/s 472.1140 KOps/s $\color{#35bf28}+1.65\%$
test_membership_stacked_nested_leaf 31.7910μs 2.0463μs 488.6778 KOps/s 479.9480 KOps/s $\color{#35bf28}+1.82\%$
test_membership_nested_last 20.6810μs 3.0028μs 333.0175 KOps/s 325.2360 KOps/s $\color{#35bf28}+2.39\%$
test_membership_nested_leaf_last 31.0510μs 2.9933μs 334.0821 KOps/s 331.4320 KOps/s $\color{#35bf28}+0.80\%$
test_membership_stacked_nested_last 24.7100μs 4.3758μs 228.5318 KOps/s 333.7789 KOps/s $\textbf{\color{#d91a1a}-31.53\%}$
test_membership_stacked_nested_leaf_last 22.5610μs 4.3195μs 231.5083 KOps/s 328.2295 KOps/s $\textbf{\color{#d91a1a}-29.47\%}$
test_nested_getleaf 37.7910μs 7.9930μs 125.1099 KOps/s 124.3339 KOps/s $\color{#35bf28}+0.62\%$
test_nested_get 30.4000μs 7.5451μs 132.5367 KOps/s 131.2504 KOps/s $\color{#35bf28}+0.98\%$
test_stacked_getleaf 29.4310μs 8.0503μs 124.2196 KOps/s 124.3413 KOps/s $\color{#d91a1a}-0.10\%$
test_stacked_get 32.9410μs 7.5112μs 133.1353 KOps/s 131.9112 KOps/s $\color{#35bf28}+0.93\%$
test_nested_getitemleaf 23.5910μs 8.1293μs 123.0121 KOps/s 122.4599 KOps/s $\color{#35bf28}+0.45\%$
test_nested_getitem 27.6510μs 7.6819μs 130.1763 KOps/s 128.8586 KOps/s $\color{#35bf28}+1.02\%$
test_stacked_getitemleaf 22.6410μs 8.1915μs 122.0781 KOps/s 121.6771 KOps/s $\color{#35bf28}+0.33\%$
test_stacked_getitem 31.8810μs 7.7037μs 129.8077 KOps/s 129.3669 KOps/s $\color{#35bf28}+0.34\%$
test_lock_nested 4.3976ms 0.4848ms 2.0626 KOps/s 2.0714 KOps/s $\color{#d91a1a}-0.42\%$
test_lock_stack_nested 0.4736ms 0.4341ms 2.3036 KOps/s 2.2499 KOps/s $\color{#35bf28}+2.39\%$
test_unlock_nested 0.8472ms 0.4007ms 2.4958 KOps/s 2.4614 KOps/s $\color{#35bf28}+1.40\%$
test_unlock_stack_nested 0.4092ms 0.3547ms 2.8195 KOps/s 2.7658 KOps/s $\color{#35bf28}+1.94\%$
test_flatten_speed 0.1992ms 0.1069ms 9.3531 KOps/s 9.5155 KOps/s $\color{#d91a1a}-1.71\%$
test_unflatten_speed 0.3455ms 0.2966ms 3.3710 KOps/s 3.3776 KOps/s $\color{#d91a1a}-0.19\%$
test_common_ops 1.6971ms 1.2787ms 782.0374 Ops/s 748.5977 Ops/s $\color{#35bf28}+4.47\%$
test_creation 17.0500μs 1.9916μs 502.1201 KOps/s 504.0542 KOps/s $\color{#d91a1a}-0.38\%$
test_creation_empty 36.9310μs 14.3929μs 69.4789 KOps/s 59.4282 KOps/s $\textbf{\color{#35bf28}+16.91\%}$
test_creation_nested_1 41.9510μs 16.3887μs 61.0175 KOps/s 52.5891 KOps/s $\textbf{\color{#35bf28}+16.03\%}$
test_creation_nested_2 43.8600μs 18.8916μs 52.9336 KOps/s 45.4877 KOps/s $\textbf{\color{#35bf28}+16.37\%}$
test_clone 57.3610μs 31.3326μs 31.9157 KOps/s 32.0896 KOps/s $\color{#d91a1a}-0.54\%$
test_getitem[int] 1.1849ms 17.7399μs 56.3701 KOps/s 57.3979 KOps/s $\color{#d91a1a}-1.79\%$
test_getitem[slice_int] 0.1538ms 29.8664μs 33.4825 KOps/s 34.2937 KOps/s $\color{#d91a1a}-2.37\%$
test_getitem[range] 0.3110ms 0.1195ms 8.3690 KOps/s 8.3887 KOps/s $\color{#d91a1a}-0.23\%$
test_getitem[tuple] 0.1502ms 26.2479μs 38.0983 KOps/s 38.5674 KOps/s $\color{#d91a1a}-1.22\%$
test_getitem[list] 0.2773ms 0.1089ms 9.1793 KOps/s 9.3232 KOps/s $\color{#d91a1a}-1.54\%$
test_setitem_dim[int] 74.0720μs 51.7468μs 19.3249 KOps/s 18.1243 KOps/s $\textbf{\color{#35bf28}+6.62\%}$
test_setitem_dim[slice_int] 0.1005ms 77.0893μs 12.9720 KOps/s 12.6242 KOps/s $\color{#35bf28}+2.76\%$
test_setitem_dim[range] 0.1731ms 0.1412ms 7.0816 KOps/s 7.0082 KOps/s $\color{#35bf28}+1.05\%$
test_setitem_dim[tuple] 0.1619ms 70.1824μs 14.2486 KOps/s 13.8029 KOps/s $\color{#35bf28}+3.23\%$
test_setitem 78.0820μs 43.3361μs 23.0754 KOps/s 22.6521 KOps/s $\color{#35bf28}+1.87\%$
test_set 69.6410μs 41.7419μs 23.9567 KOps/s 22.8678 KOps/s $\color{#35bf28}+4.76\%$
test_set_shared 0.3903ms 55.0147μs 18.1770 KOps/s 18.1636 KOps/s $\color{#35bf28}+0.07\%$
test_update 87.5620μs 51.6162μs 19.3738 KOps/s 18.7866 KOps/s $\color{#35bf28}+3.13\%$
test_update_nested 85.8320μs 59.8592μs 16.7059 KOps/s 16.5913 KOps/s $\color{#35bf28}+0.69\%$
test_update__nested 0.1336ms 68.3630μs 14.6278 KOps/s 15.9756 KOps/s $\textbf{\color{#d91a1a}-8.44\%}$
test_set_nested 77.2620μs 47.9349μs 20.8616 KOps/s 22.0050 KOps/s $\textbf{\color{#d91a1a}-5.20\%}$
test_set_nested_new 74.5720μs 51.7860μs 19.3103 KOps/s 20.0040 KOps/s $\color{#d91a1a}-3.47\%$
test_select 0.1003ms 67.3722μs 14.8429 KOps/s 15.3017 KOps/s $\color{#d91a1a}-3.00\%$
test_select_nested 0.4868ms 54.3975μs 18.3832 KOps/s 18.8719 KOps/s $\color{#d91a1a}-2.59\%$
test_exclude_nested 99.2020μs 75.0972μs 13.3161 KOps/s 13.7418 KOps/s $\color{#d91a1a}-3.10\%$
test_empty[True] 0.3401ms 0.3003ms 3.3295 KOps/s 3.3730 KOps/s $\color{#d91a1a}-1.29\%$
test_empty[False] 2.7380μs 0.9330μs 1.0718 MOps/s 1.0867 MOps/s $\color{#d91a1a}-1.37\%$
test_to 57.8710μs 38.0436μs 26.2856 KOps/s 27.1533 KOps/s $\color{#d91a1a}-3.20\%$
test_to_nonblocking 53.4410μs 23.7716μs 42.0669 KOps/s 42.1143 KOps/s $\color{#d91a1a}-0.11\%$
test_unbind_speed 0.3418ms 0.3070ms 3.2573 KOps/s 3.1425 KOps/s $\color{#35bf28}+3.65\%$
test_unbind_speed_stack0 0.3513ms 0.3014ms 3.3174 KOps/s 3.2050 KOps/s $\color{#35bf28}+3.51\%$
test_unbind_speed_stack1 88.9872ms 0.7859ms 1.2725 KOps/s 1.2597 KOps/s $\color{#35bf28}+1.01\%$
test_split 91.7578ms 2.3881ms 418.7425 Ops/s 420.8472 Ops/s $\color{#d91a1a}-0.50\%$
test_chunk 91.2964ms 2.3755ms 420.9599 Ops/s 419.4262 Ops/s $\color{#35bf28}+0.37\%$
test_creation[device0] 0.1581ms 0.1044ms 9.5781 KOps/s 9.5118 KOps/s $\color{#35bf28}+0.70\%$
test_creation_from_tensor 0.1564ms 0.1025ms 9.7561 KOps/s 9.7913 KOps/s $\color{#d91a1a}-0.36\%$
test_add_one[memmap_tensor0] 26.0510μs 10.1685μs 98.3431 KOps/s 93.5751 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_contiguous[memmap_tensor0] 21.5500μs 2.1845μs 457.7724 KOps/s 448.4145 KOps/s $\color{#35bf28}+2.09\%$
test_stack[memmap_tensor0] 52.5110μs 7.1864μs 139.1509 KOps/s 139.6515 KOps/s $\color{#d91a1a}-0.36\%$
test_memmaptd_index 1.1103ms 0.4469ms 2.2375 KOps/s 2.2420 KOps/s $\color{#d91a1a}-0.20\%$
test_memmaptd_index_astensor 0.7734ms 0.5098ms 1.9614 KOps/s 1.8550 KOps/s $\textbf{\color{#35bf28}+5.74\%}$
test_memmaptd_index_op 1.4590ms 1.0311ms 969.8087 Ops/s 909.8104 Ops/s $\textbf{\color{#35bf28}+6.59\%}$
test_serialize_model 99.3850ms 94.9508ms 10.5318 Ops/s 10.1877 Ops/s $\color{#35bf28}+3.38\%$
test_serialize_model_pickle 1.3694s 1.2395s 0.8068 Ops/s 0.8064 Ops/s $\color{#35bf28}+0.05\%$
test_serialize_weights 0.1847s 0.1017s 9.8307 Ops/s 9.4492 Ops/s $\color{#35bf28}+4.04\%$
test_serialize_weights_returnearly 0.3139s 88.4810ms 11.3019 Ops/s 11.6390 Ops/s $\color{#d91a1a}-2.90\%$
test_serialize_weights_pickle 1.3511s 1.2362s 0.8089 Ops/s 0.8086 Ops/s $\color{#35bf28}+0.04\%$
test_reshape_pytree 68.9310μs 38.8653μs 25.7299 KOps/s 25.4244 KOps/s $\color{#35bf28}+1.20\%$
test_reshape_td 68.6820μs 48.1644μs 20.7622 KOps/s 21.9670 KOps/s $\textbf{\color{#d91a1a}-5.48\%}$
test_view_pytree 75.1420μs 39.2586μs 25.4721 KOps/s 24.5007 KOps/s $\color{#35bf28}+3.96\%$
test_view_td 0.2260ms 58.2442μs 17.1691 KOps/s 19.5174 KOps/s $\textbf{\color{#d91a1a}-12.03\%}$
test_unbind_pytree 0.1614ms 37.8559μs 26.4160 KOps/s 26.0910 KOps/s $\color{#35bf28}+1.25\%$
test_unbind_td 0.4196ms 46.3014μs 21.5976 KOps/s 21.0134 KOps/s $\color{#35bf28}+2.78\%$
test_split_pytree 80.6210μs 51.4238μs 19.4462 KOps/s 19.4300 KOps/s $\color{#35bf28}+0.08\%$
test_split_td 0.4968ms 61.2298μs 16.3319 KOps/s 16.1170 KOps/s $\color{#35bf28}+1.33\%$
test_add_pytree 0.1003ms 60.7351μs 16.4649 KOps/s 16.4037 KOps/s $\color{#35bf28}+0.37\%$
test_add_td 0.1355ms 91.8794μs 10.8838 KOps/s 10.1946 KOps/s $\textbf{\color{#35bf28}+6.76\%}$
test_compile_add_one_nested[tensordict-compile] 0.4089ms 0.2113ms 4.7316 KOps/s 4.7349 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_add_one_nested[tensordict-eager] 0.2598ms 0.1738ms 5.7535 KOps/s 5.7726 KOps/s $\color{#d91a1a}-0.33\%$
test_compile_add_one_nested[pytree-compile] 0.1806ms 0.1465ms 6.8260 KOps/s 6.7872 KOps/s $\color{#35bf28}+0.57\%$
test_compile_add_one_nested[pytree-eager] 0.2483ms 0.1956ms 5.1132 KOps/s 5.0622 KOps/s $\color{#35bf28}+1.01\%$
test_compile_copy_nested[tensordict-compile] 45.0810μs 21.6034μs 46.2889 KOps/s 44.6837 KOps/s $\color{#35bf28}+3.59\%$
test_compile_copy_nested[tensordict-eager] 76.8810μs 48.8530μs 20.4696 KOps/s 20.5212 KOps/s $\color{#d91a1a}-0.25\%$
test_compile_copy_nested[pytree-compile] 99.2830μs 72.1268μs 13.8645 KOps/s 13.6492 KOps/s $\color{#35bf28}+1.58\%$
test_compile_copy_nested[pytree-eager] 83.0610μs 59.7963μs 16.7234 KOps/s 16.7125 KOps/s $\color{#35bf28}+0.07\%$
test_compile_add_one_flat[tensordict-compile] 0.4068ms 0.3283ms 3.0462 KOps/s 3.0349 KOps/s $\color{#35bf28}+0.37\%$
test_compile_add_one_flat[tensordict-eager] 0.2561ms 0.2222ms 4.5010 KOps/s 4.4932 KOps/s $\color{#35bf28}+0.17\%$
test_compile_add_one_flat[tensorclass-compile] 0.1704ms 0.1311ms 7.6299 KOps/s 7.6081 KOps/s $\color{#35bf28}+0.29\%$
test_compile_add_one_flat[tensorclass-eager] 0.1171ms 62.8806μs 15.9032 KOps/s 15.8431 KOps/s $\color{#35bf28}+0.38\%$
test_compile_add_one_flat[pytree-compile] 0.3707ms 0.3278ms 3.0502 KOps/s 3.0715 KOps/s $\color{#d91a1a}-0.69\%$
test_compile_add_one_flat[pytree-eager] 0.7052ms 0.6426ms 1.5562 KOps/s 1.5523 KOps/s $\color{#35bf28}+0.25\%$
test_compile_add_self_flat[tensordict-eager] 0.3214ms 0.2723ms 3.6722 KOps/s 3.6849 KOps/s $\color{#d91a1a}-0.34\%$
test_compile_add_self_flat[tensordict-compile] 0.4068ms 0.3298ms 3.0324 KOps/s 3.0189 KOps/s $\color{#35bf28}+0.45\%$
test_compile_add_self_flat[tensorclass-eager] 0.1556ms 75.0415μs 13.3260 KOps/s 13.2308 KOps/s $\color{#35bf28}+0.72\%$
test_compile_add_self_flat[tensorclass-compile] 0.2839ms 0.1323ms 7.5601 KOps/s 7.4552 KOps/s $\color{#35bf28}+1.41\%$
test_compile_add_self_flat[pytree-eager] 0.6151ms 0.5427ms 1.8426 KOps/s 1.8241 KOps/s $\color{#35bf28}+1.02\%$
test_compile_add_self_flat[pytree-compile] 0.3603ms 0.3273ms 3.0552 KOps/s 3.0600 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_copy_flat[tensordict-compile] 40.6810μs 18.6343μs 53.6645 KOps/s 52.9300 KOps/s $\color{#35bf28}+1.39\%$
test_compile_copy_flat[tensordict-eager] 48.0710μs 31.6253μs 31.6202 KOps/s 31.1130 KOps/s $\color{#35bf28}+1.63\%$
test_compile_copy_flat[pytree-compile] 0.1093ms 75.4315μs 13.2571 KOps/s 13.2171 KOps/s $\color{#35bf28}+0.30\%$
test_compile_copy_flat[pytree-eager] 91.8220μs 60.3925μs 16.5583 KOps/s 16.4098 KOps/s $\color{#35bf28}+0.91\%$
test_compile_assign_and_add[tensordict-compile] 2.5066ms 0.9271ms 1.0787 KOps/s 1.0595 KOps/s $\color{#35bf28}+1.81\%$
test_compile_assign_and_add[tensordict-eager] 3.4825ms 3.3472ms 298.7546 Ops/s 290.0832 Ops/s $\color{#35bf28}+2.99\%$
test_compile_assign_and_add[pytree-compile] 2.4925ms 0.9134ms 1.0948 KOps/s 1.0840 KOps/s $\color{#35bf28}+0.99\%$
test_compile_assign_and_add[pytree-eager] 4.5653ms 3.3909ms 294.9053 Ops/s 291.7828 Ops/s $\color{#35bf28}+1.07\%$
test_compile_indexing[tensor-tensordict-compile] 0.1592ms 0.1139ms 8.7772 KOps/s 8.8538 KOps/s $\color{#d91a1a}-0.87\%$
test_compile_indexing[tensor-tensordict-eager] 0.2493ms 67.8281μs 14.7432 KOps/s 14.8670 KOps/s $\color{#d91a1a}-0.83\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1358ms 0.1041ms 9.6027 KOps/s 9.5039 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[tensor-tensorclass-eager] 96.4820μs 46.9714μs 21.2896 KOps/s 21.0671 KOps/s $\color{#35bf28}+1.06\%$
test_compile_indexing[tensor-pytree-compile] 0.1394ms 0.1065ms 9.3916 KOps/s 9.3252 KOps/s $\color{#35bf28}+0.71\%$
test_compile_indexing[tensor-pytree-eager] 80.3120μs 48.2253μs 20.7360 KOps/s 21.1023 KOps/s $\color{#d91a1a}-1.74\%$
test_compile_indexing[slice-tensordict-compile] 0.1721ms 0.1401ms 7.1364 KOps/s 7.0571 KOps/s $\color{#35bf28}+1.12\%$
test_compile_indexing[slice-tensordict-eager] 0.1887ms 26.9879μs 37.0536 KOps/s 36.5376 KOps/s $\color{#35bf28}+1.41\%$
test_compile_indexing[slice-tensorclass-compile] 0.1746ms 0.1312ms 7.6214 KOps/s 7.5215 KOps/s $\color{#35bf28}+1.33\%$
test_compile_indexing[slice-tensorclass-eager] 51.1410μs 22.9556μs 43.5623 KOps/s 42.8208 KOps/s $\color{#35bf28}+1.73\%$
test_compile_indexing[slice-pytree-compile] 0.1642ms 0.1317ms 7.5954 KOps/s 7.4833 KOps/s $\color{#35bf28}+1.50\%$
test_compile_indexing[slice-pytree-eager] 47.5510μs 23.0950μs 43.2994 KOps/s 43.1073 KOps/s $\color{#35bf28}+0.45\%$
test_compile_indexing[int-tensordict-compile] 0.1664ms 0.1398ms 7.1545 KOps/s 7.1064 KOps/s $\color{#35bf28}+0.68\%$
test_compile_indexing[int-tensordict-eager] 0.4971ms 27.2378μs 36.7137 KOps/s 37.1769 KOps/s $\color{#d91a1a}-1.25\%$
test_compile_indexing[int-tensorclass-compile] 0.1614ms 0.1316ms 7.5998 KOps/s 7.5092 KOps/s $\color{#35bf28}+1.21\%$
test_compile_indexing[int-tensorclass-eager] 59.1110μs 23.0454μs 43.3926 KOps/s 43.1061 KOps/s $\color{#35bf28}+0.66\%$
test_compile_indexing[int-pytree-compile] 0.1643ms 0.1315ms 7.6057 KOps/s 7.5312 KOps/s $\color{#35bf28}+0.99\%$
test_compile_indexing[int-pytree-eager] 46.3910μs 22.8789μs 43.7084 KOps/s 43.1523 KOps/s $\color{#35bf28}+1.29\%$
test_mod_add[eager] 70.6310μs 37.9329μs 26.3623 KOps/s 26.3564 KOps/s $\color{#35bf28}+0.02\%$
test_mod_add[compile] 0.1729ms 68.0717μs 14.6904 KOps/s 14.4239 KOps/s $\color{#35bf28}+1.85\%$
test_mod_add[compile-overhead] 0.2628ms 0.1485ms 6.7328 KOps/s 6.6398 KOps/s $\color{#35bf28}+1.40\%$
test_mod_wrap[eager] 0.4046ms 0.2565ms 3.8985 KOps/s 3.8738 KOps/s $\color{#35bf28}+0.64\%$
test_mod_wrap[compile] 1.2240ms 0.2976ms 3.3599 KOps/s 3.3256 KOps/s $\color{#35bf28}+1.03\%$
test_mod_wrap[compile-overhead] 8.2351ms 4.2980ms 232.6657 Ops/s 233.0126 Ops/s $\color{#d91a1a}-0.15\%$
test_mod_wrap_and_backward[eager] 1.5965ms 1.4536ms 687.9593 Ops/s 685.5892 Ops/s $\color{#35bf28}+0.35\%$
test_mod_wrap_and_backward[compile] 1.6111ms 1.4737ms 678.5525 Ops/s 726.7149 Ops/s $\textbf{\color{#d91a1a}-6.63\%}$
test_mod_wrap_and_backward[compile-overhead] 1.7846ms 1.0622ms 941.4182 Ops/s 1.1076 KOps/s $\textbf{\color{#d91a1a}-15.00\%}$
test_seq_add[eager] 0.1524ms 0.1114ms 8.9741 KOps/s 8.5038 KOps/s $\textbf{\color{#35bf28}+5.53\%}$
test_seq_add[compile] 0.1503ms 89.3284μs 11.1946 KOps/s 11.4003 KOps/s $\color{#d91a1a}-1.80\%$
test_seq_add[compile-overhead] 0.1595ms 0.1246ms 8.0264 KOps/s 8.0867 KOps/s $\color{#d91a1a}-0.75\%$
test_seq_wrap[eager] 0.4987ms 0.4368ms 2.2893 KOps/s 2.2156 KOps/s $\color{#35bf28}+3.33\%$
test_seq_wrap[compile] 1.4878ms 0.3445ms 2.9024 KOps/s 2.8442 KOps/s $\color{#35bf28}+2.05\%$
test_seq_wrap[compile-overhead] 0.3045s 0.1456s 6.8703 Ops/s 6.8077 Ops/s $\color{#35bf28}+0.92\%$
test_func_call_runtime[False-eager] 1.1009ms 0.7953ms 1.2574 KOps/s 1.3137 KOps/s $\color{#d91a1a}-4.29\%$
test_func_call_runtime[False-compile] 0.9050ms 0.8293ms 1.2058 KOps/s 1.2065 KOps/s $\color{#d91a1a}-0.06\%$
test_func_call_runtime[False-compile-overhead] 0.4349ms 0.3629ms 2.7554 KOps/s 2.7363 KOps/s $\color{#35bf28}+0.70\%$
test_func_call_runtime[True-eager] 1.3599ms 1.0122ms 987.8991 Ops/s 985.7730 Ops/s $\color{#35bf28}+0.22\%$
test_func_call_runtime[True-compile] 0.9513ms 0.8633ms 1.1583 KOps/s 1.1538 KOps/s $\color{#35bf28}+0.39\%$
test_func_call_runtime[True-compile-overhead] 0.4617ms 0.4049ms 2.4697 KOps/s 2.4655 KOps/s $\color{#35bf28}+0.17\%$
test_distributed 0.2726ms 68.0825μs 14.6881 KOps/s 11.1452 KOps/s $\textbf{\color{#35bf28}+31.79\%}$
test_tdmodule 29.1110μs 14.0585μs 71.1312 KOps/s 62.3466 KOps/s $\textbf{\color{#35bf28}+14.09\%}$
test_tdmodule_dispatch 50.8310μs 29.0860μs 34.3808 KOps/s 29.4473 KOps/s $\textbf{\color{#35bf28}+16.75\%}$
test_tdseq 31.8110μs 15.2549μs 65.5528 KOps/s 57.3123 KOps/s $\textbf{\color{#35bf28}+14.38\%}$
test_tdseq_dispatch 60.0410μs 32.0455μs 31.2056 KOps/s 27.6620 KOps/s $\textbf{\color{#35bf28}+12.81\%}$
test_instantiation_functorch 2.1616ms 2.0276ms 493.1979 Ops/s 496.6940 Ops/s $\color{#d91a1a}-0.70\%$
test_instantiation_td 2.0158ms 1.3146ms 760.6660 Ops/s 759.9221 Ops/s $\color{#35bf28}+0.10\%$
test_exec_functorch 0.2871ms 0.2312ms 4.3253 KOps/s 4.3571 KOps/s $\color{#d91a1a}-0.73\%$
test_exec_functional_call 0.3109ms 0.2263ms 4.4191 KOps/s 4.4353 KOps/s $\color{#d91a1a}-0.37\%$
test_exec_td 0.3293ms 0.2244ms 4.4560 KOps/s 4.4546 KOps/s $\color{#35bf28}+0.03\%$
test_exec_td_decorator 1.0205ms 0.3125ms 3.2002 KOps/s 3.3469 KOps/s $\color{#d91a1a}-4.38\%$
test_vmap_mlp_speed[True-True] 0.8154ms 0.6747ms 1.4821 KOps/s 1.4756 KOps/s $\color{#35bf28}+0.44\%$
test_vmap_mlp_speed[True-False] 0.7866ms 0.7014ms 1.4257 KOps/s 1.4824 KOps/s $\color{#d91a1a}-3.83\%$
test_vmap_mlp_speed[False-True] 0.6872ms 0.6080ms 1.6448 KOps/s 1.6974 KOps/s $\color{#d91a1a}-3.10\%$
test_vmap_mlp_speed[False-False] 0.6813ms 0.6225ms 1.6063 KOps/s 1.6978 KOps/s $\textbf{\color{#d91a1a}-5.39\%}$
test_vmap_mlp_speed_decorator[True-True] 0.8493ms 0.7516ms 1.3306 KOps/s 1.3236 KOps/s $\color{#35bf28}+0.52\%$
test_vmap_mlp_speed_decorator[True-False] 1.1379ms 0.7527ms 1.3285 KOps/s 1.3005 KOps/s $\color{#35bf28}+2.15\%$
test_vmap_mlp_speed_decorator[False-True] 0.8044ms 0.6617ms 1.5113 KOps/s 1.5207 KOps/s $\color{#d91a1a}-0.61\%$
test_vmap_mlp_speed_decorator[False-False] 0.8255ms 0.6616ms 1.5115 KOps/s 1.5195 KOps/s $\color{#d91a1a}-0.52\%$
test_vmap_transformer_speed[True-True] 9.0935ms 8.9162ms 112.1553 Ops/s 112.8696 Ops/s $\color{#d91a1a}-0.63\%$
test_vmap_transformer_speed[True-False] 9.0660ms 8.8972ms 112.3943 Ops/s 112.9501 Ops/s $\color{#d91a1a}-0.49\%$
test_vmap_transformer_speed[False-True] 10.5280ms 8.8407ms 113.1128 Ops/s 114.1490 Ops/s $\color{#d91a1a}-0.91\%$
test_vmap_transformer_speed[False-False] 8.8836ms 8.8008ms 113.6256 Ops/s 114.2045 Ops/s $\color{#d91a1a}-0.51\%$
test_vmap_transformer_speed_decorator[True-True] 22.0313ms 21.2545ms 47.0490 Ops/s 47.4176 Ops/s $\color{#d91a1a}-0.78\%$
test_vmap_transformer_speed_decorator[True-False] 21.9396ms 21.2154ms 47.1355 Ops/s 47.6472 Ops/s $\color{#d91a1a}-1.07\%$
test_vmap_transformer_speed_decorator[False-True] 21.7730ms 21.0526ms 47.5002 Ops/s 48.0240 Ops/s $\color{#d91a1a}-1.09\%$
test_vmap_transformer_speed_decorator[False-False] 21.7879ms 21.0123ms 47.5911 Ops/s 48.0876 Ops/s $\color{#d91a1a}-1.03\%$
test_to_module_speed[True] 2.7672ms 1.4838ms 673.9417 Ops/s 666.2330 Ops/s $\color{#35bf28}+1.16\%$
test_to_module_speed[False] 1.9976ms 1.4845ms 673.6394 Ops/s 666.1134 Ops/s $\color{#35bf28}+1.13\%$
test_tc_init 63.4020μs 33.9831μs 29.4264 KOps/s 28.0399 KOps/s $\color{#35bf28}+4.94\%$
test_tc_init_nested 0.1601ms 68.2841μs 14.6447 KOps/s 14.1729 KOps/s $\color{#35bf28}+3.33\%$
test_tc_first_layer_tensor 18.6710μs 3.9873μs 250.7974 KOps/s 249.5989 KOps/s $\color{#35bf28}+0.48\%$
test_tc_first_layer_nontensor 26.2600μs 4.0249μs 248.4535 KOps/s 248.1238 KOps/s $\color{#35bf28}+0.13\%$
test_tc_second_layer_tensor 4.7703μs 1.2904μs 774.9263 KOps/s 777.2183 KOps/s $\color{#d91a1a}-0.29\%$
test_tc_second_layer_nontensor 26.5710μs 4.6097μs 216.9349 KOps/s 215.5012 KOps/s $\color{#35bf28}+0.67\%$
test_unbind 0.3151s 12.9163ms 77.4214 Ops/s 76.3961 Ops/s $\color{#35bf28}+1.34\%$
test_full_like 0.6593ms 0.5786ms 1.7283 KOps/s 1.7284 KOps/s $-0.00\%$
test_zeros_like 0.2617ms 0.1977ms 5.0584 KOps/s 5.0582 KOps/s $+0.00\%$
test_ones_like 0.2255ms 0.1975ms 5.0625 KOps/s 5.0632 KOps/s $\color{#d91a1a}-0.01\%$
test_clone 0.4481ms 0.4146ms 2.4118 KOps/s 2.4116 KOps/s $+0.01\%$
test_squeeze 38.5010μs 12.5660μs 79.5800 KOps/s 83.8658 KOps/s $\textbf{\color{#d91a1a}-5.11\%}$
test_unsqueeze 0.2624ms 89.5282μs 11.1697 KOps/s 11.5915 KOps/s $\color{#d91a1a}-3.64\%$
test_split 0.4613ms 0.1843ms 5.4264 KOps/s 5.4105 KOps/s $\color{#35bf28}+0.30\%$
test_permute 0.3120ms 0.2013ms 4.9689 KOps/s 5.0381 KOps/s $\color{#d91a1a}-1.37\%$
test_stack 1.2567ms 0.9049ms 1.1051 KOps/s 1.0827 KOps/s $\color{#35bf28}+2.07\%$
test_cat 1.2646ms 1.2318ms 811.8360 Ops/s 811.8973 Ops/s $-0.01\%$

@vmoens vmoens deleted the fix-return-val branch October 21, 2024 14:01