[Feature] from_modules expand_identical kwarg #911

Merged: 3 commits, Jul 23, 2024

@vmoens (Contributor) commented Jul 23, 2024

No description provided.

@facebook-github-bot added the CLA Signed label Jul 23, 2024

github-actions bot commented Jul 23, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 144. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}5$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 47.9890μs 23.1346μs 43.2253 KOps/s 42.5180 KOps/s $\color{#35bf28}+1.66\%$
test_plain_set_stack_nested 56.9160μs 23.3865μs 42.7597 KOps/s 42.5296 KOps/s $\color{#35bf28}+0.54\%$
test_plain_set_nested_inplace 60.6030μs 25.1798μs 39.7145 KOps/s 39.4347 KOps/s $\color{#35bf28}+0.71\%$
test_plain_set_stack_nested_inplace 60.5030μs 25.1248μs 39.8012 KOps/s 39.6354 KOps/s $\color{#35bf28}+0.42\%$
test_items 16.9520μs 2.6125μs 382.7770 KOps/s 378.1491 KOps/s $\color{#35bf28}+1.22\%$
test_items_nested 0.5156ms 0.3667ms 2.7272 KOps/s 2.7661 KOps/s $\color{#d91a1a}-1.41\%$
test_items_nested_locked 1.3875ms 0.3663ms 2.7301 KOps/s 2.7566 KOps/s $\color{#d91a1a}-0.96\%$
test_items_nested_leaf 0.1635ms 86.7250μs 11.5307 KOps/s 11.2564 KOps/s $\color{#35bf28}+2.44\%$
test_items_stack_nested 0.6266ms 0.3670ms 2.7247 KOps/s 2.7495 KOps/s $\color{#d91a1a}-0.90\%$
test_items_stack_nested_leaf 0.1701ms 88.4321μs 11.3081 KOps/s 11.3087 KOps/s $-0.01\%$
test_items_stack_nested_locked 0.5832ms 0.3679ms 2.7185 KOps/s 2.7511 KOps/s $\color{#d91a1a}-1.19\%$
test_keys 17.8440μs 3.9601μs 252.5192 KOps/s 250.4890 KOps/s $\color{#35bf28}+0.81\%$
test_keys_nested 0.2594ms 0.1437ms 6.9614 KOps/s 6.8706 KOps/s $\color{#35bf28}+1.32\%$
test_keys_nested_locked 0.7761ms 0.1510ms 6.6246 KOps/s 6.6266 KOps/s $\color{#d91a1a}-0.03\%$
test_keys_nested_leaf 0.1894ms 0.1229ms 8.1367 KOps/s 7.9478 KOps/s $\color{#35bf28}+2.38\%$
test_keys_stack_nested 0.3063ms 0.1449ms 6.9028 KOps/s 6.7881 KOps/s $\color{#35bf28}+1.69\%$
test_keys_stack_nested_leaf 0.2081ms 0.1244ms 8.0366 KOps/s 7.9696 KOps/s $\color{#35bf28}+0.84\%$
test_keys_stack_nested_locked 0.2939ms 0.1519ms 6.5837 KOps/s 6.6516 KOps/s $\color{#d91a1a}-1.02\%$
test_values 5.7557μs 1.1477μs 871.2892 KOps/s 851.4647 KOps/s $\color{#35bf28}+2.33\%$
test_values_nested 94.5760μs 50.0082μs 19.9967 KOps/s 19.8068 KOps/s $\color{#35bf28}+0.96\%$
test_values_nested_locked 98.4730μs 50.2364μs 19.9059 KOps/s 19.7084 KOps/s $\color{#35bf28}+1.00\%$
test_values_nested_leaf 92.7430μs 45.1688μs 22.1392 KOps/s 21.7956 KOps/s $\color{#35bf28}+1.58\%$
test_values_stack_nested 0.1057ms 51.1355μs 19.5559 KOps/s 19.2068 KOps/s $\color{#35bf28}+1.82\%$
test_values_stack_nested_leaf 85.0880μs 44.8937μs 22.2748 KOps/s 21.7305 KOps/s $\color{#35bf28}+2.50\%$
test_values_stack_nested_locked 0.1428ms 50.9696μs 19.6195 KOps/s 19.4534 KOps/s $\color{#35bf28}+0.85\%$
test_membership 4.9507μs 0.7548μs 1.3249 MOps/s 1.2420 MOps/s $\textbf{\color{#35bf28}+6.67\%}$
test_membership_nested 24.3650μs 2.6213μs 381.4947 KOps/s 368.5423 KOps/s $\color{#35bf28}+3.51\%$
test_membership_nested_leaf 26.5400μs 2.6576μs 376.2806 KOps/s 370.6125 KOps/s $\color{#35bf28}+1.53\%$
test_membership_stacked_nested 24.6360μs 2.6349μs 379.5157 KOps/s 371.0964 KOps/s $\color{#35bf28}+2.27\%$
test_membership_stacked_nested_leaf 30.5160μs 2.6703μs 374.4855 KOps/s 370.1891 KOps/s $\color{#35bf28}+1.16\%$
test_membership_nested_last 28.1530μs 4.0000μs 249.9980 KOps/s 248.3669 KOps/s $\color{#35bf28}+0.66\%$
test_membership_nested_leaf_last 27.9920μs 3.9836μs 251.0265 KOps/s 247.5793 KOps/s $\color{#35bf28}+1.39\%$
test_membership_stacked_nested_last 34.0240μs 7.5332μs 132.7454 KOps/s 247.1101 KOps/s $\textbf{\color{#d91a1a}-46.28\%}$
test_membership_stacked_nested_leaf_last 0.1428ms 7.9847μs 125.2393 KOps/s 245.9261 KOps/s $\textbf{\color{#d91a1a}-49.07\%}$
test_nested_getleaf 36.6680μs 10.8825μs 91.8905 KOps/s 93.1421 KOps/s $\color{#d91a1a}-1.34\%$
test_nested_get 32.2900μs 10.3834μs 96.3078 KOps/s 95.4187 KOps/s $\color{#35bf28}+0.93\%$
test_stacked_getleaf 38.5620μs 10.8248μs 92.3806 KOps/s 87.1255 KOps/s $\textbf{\color{#35bf28}+6.03\%}$
test_stacked_get 28.0320μs 10.2522μs 97.5400 KOps/s 94.8731 KOps/s $\color{#35bf28}+2.81\%$
test_nested_getitemleaf 43.5710μs 11.4157μs 87.5986 KOps/s 85.5973 KOps/s $\color{#35bf28}+2.34\%$
test_nested_getitem 49.0280μs 10.4425μs 95.7628 KOps/s 92.8310 KOps/s $\color{#35bf28}+3.16\%$
test_stacked_getitemleaf 0.1702ms 11.9067μs 83.9860 KOps/s 87.0051 KOps/s $\color{#d91a1a}-3.47\%$
test_stacked_getitem 33.6330μs 10.5165μs 95.0884 KOps/s 94.0309 KOps/s $\color{#35bf28}+1.12\%$
test_lock_nested 1.2542ms 0.5138ms 1.9464 KOps/s 1.6824 KOps/s $\textbf{\color{#35bf28}+15.69\%}$
test_lock_stack_nested 1.8940ms 0.4818ms 2.0757 KOps/s 2.0783 KOps/s $\color{#d91a1a}-0.12\%$
test_unlock_nested 0.8027ms 0.4350ms 2.2988 KOps/s 2.3399 KOps/s $\color{#d91a1a}-1.76\%$
test_unlock_stack_nested 0.7131ms 0.3951ms 2.5312 KOps/s 2.5407 KOps/s $\color{#d91a1a}-0.37\%$
test_flatten_speed 0.6115ms 0.1079ms 9.2721 KOps/s 9.3786 KOps/s $\color{#d91a1a}-1.14\%$
test_unflatten_speed 1.0046ms 0.4563ms 2.1914 KOps/s 2.1955 KOps/s $\color{#d91a1a}-0.19\%$
test_common_ops 5.0869ms 1.1801ms 847.3864 Ops/s 805.8173 Ops/s $\textbf{\color{#35bf28}+5.16\%}$
test_creation 15.4580μs 2.5205μs 396.7415 KOps/s 399.5331 KOps/s $\color{#d91a1a}-0.70\%$
test_creation_empty 55.0730μs 20.5055μs 48.7673 KOps/s 48.4989 KOps/s $\color{#35bf28}+0.55\%$
test_creation_nested_1 63.8480μs 24.2766μs 41.1919 KOps/s 41.9261 KOps/s $\color{#d91a1a}-1.75\%$
test_creation_nested_2 92.2520μs 28.1893μs 35.4745 KOps/s 36.0820 KOps/s $\color{#d91a1a}-1.68\%$
test_clone 65.0910μs 17.7550μs 56.3223 KOps/s 59.0401 KOps/s $\color{#d91a1a}-4.60\%$
test_getitem[int] 1.3031ms 12.9214μs 77.3910 KOps/s 80.1417 KOps/s $\color{#d91a1a}-3.43\%$
test_getitem[slice_int] 0.1366ms 34.3209μs 29.1367 KOps/s 30.5954 KOps/s $\color{#d91a1a}-4.77\%$
test_getitem[range] 0.1619ms 58.3166μs 17.1478 KOps/s 16.8105 KOps/s $\color{#35bf28}+2.01\%$
test_getitem[tuple] 0.1251ms 27.4851μs 36.3833 KOps/s 37.3770 KOps/s $\color{#d91a1a}-2.66\%$
test_getitem[list] 0.1813ms 53.5021μs 18.6909 KOps/s 18.2725 KOps/s $\color{#35bf28}+2.29\%$
test_setitem_dim[int] 73.5270μs 35.4770μs 28.1873 KOps/s 28.1081 KOps/s $\color{#35bf28}+0.28\%$
test_setitem_dim[slice_int] 0.1247ms 75.3576μs 13.2701 KOps/s 13.3901 KOps/s $\color{#d91a1a}-0.90\%$
test_setitem_dim[range] 0.1749ms 94.0498μs 10.6327 KOps/s 10.3400 KOps/s $\color{#35bf28}+2.83\%$
test_setitem_dim[tuple] 0.1028ms 61.7032μs 16.2066 KOps/s 16.1306 KOps/s $\color{#35bf28}+0.47\%$
test_setitem 84.9090μs 32.0430μs 31.2081 KOps/s 32.7126 KOps/s $\color{#d91a1a}-4.60\%$
test_set 80.2790μs 30.8765μs 32.3871 KOps/s 33.5999 KOps/s $\color{#d91a1a}-3.61\%$
test_set_shared 3.0448ms 0.2164ms 4.6204 KOps/s 4.5231 KOps/s $\color{#35bf28}+2.15\%$
test_update 0.1410ms 39.0498μs 25.6083 KOps/s 26.3296 KOps/s $\color{#d91a1a}-2.74\%$
test_update_nested 0.1053ms 50.2181μs 19.9131 KOps/s 20.2989 KOps/s $\color{#d91a1a}-1.90\%$
test_update__nested 96.6300μs 35.6208μs 28.0735 KOps/s 28.8125 KOps/s $\color{#d91a1a}-2.56\%$
test_set_nested 83.4150μs 33.2221μs 30.1004 KOps/s 31.4621 KOps/s $\color{#d91a1a}-4.33\%$
test_set_nested_new 90.7090μs 38.4862μs 25.9833 KOps/s 27.0837 KOps/s $\color{#d91a1a}-4.06\%$
test_select 0.1556ms 56.0216μs 17.8502 KOps/s 18.3562 KOps/s $\color{#d91a1a}-2.76\%$
test_select_nested 0.9738ms 61.5470μs 16.2477 KOps/s 16.4532 KOps/s $\color{#d91a1a}-1.25\%$
test_exclude_nested 0.1586ms 80.4035μs 12.4373 KOps/s 12.2985 KOps/s $\color{#35bf28}+1.13\%$
test_empty[True] 0.5368ms 0.3417ms 2.9269 KOps/s 2.9012 KOps/s $\color{#35bf28}+0.89\%$
test_empty[False] 7.9497μs 1.2262μs 815.5287 KOps/s 785.2401 KOps/s $\color{#35bf28}+3.86\%$
test_unbind_speed 0.5865ms 0.3281ms 3.0479 KOps/s 3.0862 KOps/s $\color{#d91a1a}-1.24\%$
test_unbind_speed_stack0 0.4579ms 0.3160ms 3.1646 KOps/s 3.1687 KOps/s $\color{#d91a1a}-0.13\%$
test_unbind_speed_stack1 78.9196ms 0.8020ms 1.2469 KOps/s 1.3095 KOps/s $\color{#d91a1a}-4.78\%$
test_split 77.4570ms 2.2799ms 438.6123 Ops/s 438.6565 Ops/s $\color{#d91a1a}-0.01\%$
test_chunk 77.6320ms 2.2771ms 439.1481 Ops/s 442.0022 Ops/s $\color{#d91a1a}-0.65\%$
test_creation[device0] 0.2350ms 0.1210ms 8.2675 KOps/s 8.0849 KOps/s $\color{#35bf28}+2.26\%$
test_creation_from_tensor 4.4125ms 0.1224ms 8.1720 KOps/s 8.2962 KOps/s $\color{#d91a1a}-1.50\%$
test_add_one[memmap_tensor0] 0.1887ms 8.1383μs 122.8761 KOps/s 125.8527 KOps/s $\color{#d91a1a}-2.37\%$
test_contiguous[memmap_tensor0] 23.8250μs 2.1979μs 454.9825 KOps/s 453.6411 KOps/s $\color{#35bf28}+0.30\%$
test_stack[memmap_tensor0] 34.0130μs 6.1540μs 162.4948 KOps/s 169.2464 KOps/s $\color{#d91a1a}-3.99\%$
test_memmaptd_index 1.1517ms 0.4443ms 2.2506 KOps/s 2.2436 KOps/s $\color{#35bf28}+0.31\%$
test_memmaptd_index_astensor 0.7656ms 0.5168ms 1.9350 KOps/s 1.8071 KOps/s $\textbf{\color{#35bf28}+7.07\%}$
test_memmaptd_index_op 1.8612ms 1.1114ms 899.7748 Ops/s 921.5409 Ops/s $\color{#d91a1a}-2.36\%$
test_serialize_model 0.2114s 0.1408s 7.1035 Ops/s 7.7341 Ops/s $\textbf{\color{#d91a1a}-8.15\%}$
test_serialize_model_pickle 0.5028s 0.4106s 2.4356 Ops/s 2.5355 Ops/s $\color{#d91a1a}-3.94\%$
test_serialize_weights 0.1426s 0.1283s 7.7924 Ops/s 7.1624 Ops/s $\textbf{\color{#35bf28}+8.80\%}$
test_serialize_weights_returnearly 0.1776s 0.1676s 5.9673 Ops/s 6.1107 Ops/s $\color{#d91a1a}-2.35\%$
test_serialize_weights_pickle 0.4831s 0.4015s 2.4908 Ops/s 2.5339 Ops/s $\color{#d91a1a}-1.70\%$
test_serialize_weights_filesystem 0.1522s 0.1424s 7.0205 Ops/s 6.9226 Ops/s $\color{#35bf28}+1.41\%$
test_serialize_model_filesystem 0.1613s 0.1487s 6.7259 Ops/s 6.1264 Ops/s $\textbf{\color{#35bf28}+9.78\%}$
test_reshape_pytree 93.2830μs 41.3964μs 24.1567 KOps/s 24.3423 KOps/s $\color{#d91a1a}-0.76\%$
test_reshape_td 91.8510μs 49.7429μs 20.1034 KOps/s 19.9925 KOps/s $\color{#35bf28}+0.55\%$
test_view_pytree 82.2230μs 39.6692μs 25.2084 KOps/s 25.4828 KOps/s $\color{#d91a1a}-1.08\%$
test_view_td 0.1225ms 55.9412μs 17.8759 KOps/s 17.5068 KOps/s $\color{#35bf28}+2.11\%$
test_unbind_pytree 97.4810μs 36.2415μs 27.5927 KOps/s 28.2273 KOps/s $\color{#d91a1a}-2.25\%$
test_unbind_td 0.3863ms 48.2591μs 20.7215 KOps/s 21.0779 KOps/s $\color{#d91a1a}-1.69\%$
test_split_pytree 0.1834ms 39.3918μs 25.3860 KOps/s 25.4532 KOps/s $\color{#d91a1a}-0.26\%$
test_split_td 0.6360ms 63.9391μs 15.6399 KOps/s 16.1949 KOps/s $\color{#d91a1a}-3.43\%$
test_add_pytree 0.1121ms 44.5795μs 22.4318 KOps/s 21.9943 KOps/s $\color{#35bf28}+1.99\%$
test_add_td 0.1760ms 87.8301μs 11.3856 KOps/s 11.4732 KOps/s $\color{#d91a1a}-0.76\%$
test_distributed 0.3688ms 0.1333ms 7.5020 KOps/s 7.4955 KOps/s $\color{#35bf28}+0.09\%$
test_tdmodule 38.9630μs 18.3845μs 54.3936 KOps/s 54.4706 KOps/s $\color{#d91a1a}-0.14\%$
test_tdmodule_dispatch 73.2670μs 38.1798μs 26.1918 KOps/s 26.5152 KOps/s $\color{#d91a1a}-1.22\%$
test_tdseq 45.8550μs 20.4464μs 48.9084 KOps/s 48.7261 KOps/s $\color{#35bf28}+0.37\%$
test_tdseq_dispatch 69.3490μs 43.3697μs 23.0576 KOps/s 23.8518 KOps/s $\color{#d91a1a}-3.33\%$
test_instantiation_functorch 1.8030ms 1.5916ms 628.3023 Ops/s 611.2044 Ops/s $\color{#35bf28}+2.80\%$
test_instantiation_td 2.6964ms 1.2087ms 827.3201 Ops/s 865.5087 Ops/s $\color{#d91a1a}-4.41\%$
test_exec_functorch 0.2928ms 0.1847ms 5.4132 KOps/s 5.5218 KOps/s $\color{#d91a1a}-1.97\%$
test_exec_functional_call 5.8400ms 0.1780ms 5.6165 KOps/s 5.7361 KOps/s $\color{#d91a1a}-2.09\%$
test_exec_td 0.2954ms 0.1855ms 5.3904 KOps/s 5.7570 KOps/s $\textbf{\color{#d91a1a}-6.37\%}$
test_exec_td_decorator 0.7022ms 0.2599ms 3.8470 KOps/s 3.8558 KOps/s $\color{#d91a1a}-0.23\%$
test_vmap_mlp_speed[True-True] 1.8699ms 0.6254ms 1.5990 KOps/s 1.6085 KOps/s $\color{#d91a1a}-0.59\%$
test_vmap_mlp_speed[True-False] 0.7059ms 0.5998ms 1.6673 KOps/s 1.6202 KOps/s $\color{#35bf28}+2.91\%$
test_vmap_mlp_speed[False-True] 0.7106ms 0.4954ms 2.0187 KOps/s 1.9587 KOps/s $\color{#35bf28}+3.06\%$
test_vmap_mlp_speed[False-False] 0.7048ms 0.4954ms 2.0187 KOps/s 1.9738 KOps/s $\color{#35bf28}+2.28\%$
test_vmap_mlp_speed_decorator[True-True] 1.2764ms 0.7012ms 1.4261 KOps/s 1.4044 KOps/s $\color{#35bf28}+1.55\%$
test_vmap_mlp_speed_decorator[True-False] 1.3146ms 0.7048ms 1.4187 KOps/s 1.4095 KOps/s $\color{#35bf28}+0.65\%$
test_vmap_mlp_speed_decorator[False-True] 0.9198ms 0.5834ms 1.7141 KOps/s 1.7105 KOps/s $\color{#35bf28}+0.21\%$
test_vmap_mlp_speed_decorator[False-False] 0.9101ms 0.5843ms 1.7115 KOps/s 1.7111 KOps/s $\color{#35bf28}+0.02\%$
test_to_module_speed[True] 2.4937ms 1.8667ms 535.7162 Ops/s 553.8411 Ops/s $\color{#d91a1a}-3.27\%$
test_to_module_speed[False] 2.0749ms 1.8383ms 543.9744 Ops/s 567.2628 Ops/s $\color{#d91a1a}-4.11\%$
test_tc_init 88.7050μs 45.6718μs 21.8954 KOps/s 22.8922 KOps/s $\color{#d91a1a}-4.35\%$
test_tc_init_nested 0.1735ms 94.3546μs 10.5983 KOps/s 11.4463 KOps/s $\textbf{\color{#d91a1a}-7.41\%}$
test_tc_first_layer_tensor 44.1920μs 9.6152μs 104.0025 KOps/s 108.1275 KOps/s $\color{#d91a1a}-3.81\%$
test_tc_first_layer_nontensor 57.1160μs 9.3772μs 106.6420 KOps/s 108.6628 KOps/s $\color{#d91a1a}-1.86\%$
test_tc_second_layer_tensor 42.3590μs 2.9235μs 342.0509 KOps/s 348.4032 KOps/s $\color{#d91a1a}-1.82\%$
test_tc_second_layer_nontensor 39.1130μs 10.5855μs 94.4689 KOps/s 95.7683 KOps/s $\color{#d91a1a}-1.36\%$
test_unbind 8.9434ms 8.7289ms 114.5621 Ops/s 69.8449 Ops/s $\textbf{\color{#35bf28}+64.02\%}$
test_full_like 10.1084ms 7.6558ms 130.6198 Ops/s 132.0580 Ops/s $\color{#d91a1a}-1.09\%$
test_zeros_like 14.0229ms 6.4522ms 154.9856 Ops/s 133.7796 Ops/s $\textbf{\color{#35bf28}+15.85\%}$
test_ones_like 13.6877ms 7.6355ms 130.9667 Ops/s 132.2644 Ops/s $\color{#d91a1a}-0.98\%$
test_clone 20.6080ms 9.7000ms 103.0927 Ops/s 98.6638 Ops/s $\color{#35bf28}+4.49\%$
test_squeeze 85.4190μs 15.1300μs 66.0940 KOps/s 68.9748 KOps/s $\color{#d91a1a}-4.18\%$
test_unsqueeze 0.2007ms 0.1012ms 9.8861 KOps/s 9.6478 KOps/s $\color{#35bf28}+2.47\%$
test_split 0.3666ms 0.2105ms 4.7502 KOps/s 4.7259 KOps/s $\color{#35bf28}+0.51\%$
test_permute 0.3563ms 0.2295ms 4.3572 KOps/s 4.3410 KOps/s $\color{#35bf28}+0.37\%$
test_stack 31.4041ms 24.6276ms 40.6048 Ops/s 39.0892 Ops/s $\color{#35bf28}+3.88\%$
test_cat 31.7140ms 24.3596ms 41.0515 Ops/s 39.0554 Ops/s $\textbf{\color{#35bf28}+5.11\%}$
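
For readers checking the table by hand, the Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below rests on that assumption, which is inferred from the numbers rather than confirmed from the benchmark tooling, and `percent_change` is a hypothetical helper name, not part of the benchmark suite. The sample rows are copied from the CPU table above.

```python
def percent_change(ops: float, ops_head: float) -> float:
    """Relative throughput change versus the baseline, in percent.

    Assumed formula (inferred from the table, not from the bot's source):
    (Ops / Ops on Repo HEAD - 1) * 100.
    """
    return (ops / ops_head - 1.0) * 100.0


# (name, Ops, Ops on Repo HEAD); both values in a row use the same unit.
rows = [
    ("test_plain_set_nested", 43.2253, 42.5180),                 # KOps/s
    ("test_membership_stacked_nested_last", 132.7454, 247.1101),  # KOps/s
    ("test_unbind", 114.5621, 69.8449),                          # Ops/s
]

for name, ops, ops_head in rows:
    print(f"{name}: {percent_change(ops, ops_head):+.2f}%")
```

Under this assumption the three rows come out at +1.66%, -46.28% and +64.02%, matching the Change values reported for them.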


github-actions bot commented Jul 23, 2024

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 219. Improved: $\large\color{#35bf28}10$. Worsened: $\large\color{#d91a1a}2$.

Name Max Mean Ops Ops on Repo HEAD Change
test_plain_set_nested 0.1503ms 16.1151μs 62.0536 KOps/s 60.5532 KOps/s $\color{#35bf28}+2.48\%$
test_plain_set_stack_nested 44.0520μs 16.3405μs 61.1977 KOps/s 60.8847 KOps/s $\color{#35bf28}+0.51\%$
test_plain_set_nested_inplace 43.2030μs 17.1602μs 58.2744 KOps/s 56.8305 KOps/s $\color{#35bf28}+2.54\%$
test_plain_set_stack_nested_inplace 48.3930μs 17.2750μs 57.8873 KOps/s 57.1290 KOps/s $\color{#35bf28}+1.33\%$
test_items 15.6510μs 4.6197μs 216.4662 KOps/s 215.0275 KOps/s $\color{#35bf28}+0.67\%$
test_items_nested 0.5028ms 0.3945ms 2.5348 KOps/s 2.5119 KOps/s $\color{#35bf28}+0.91\%$
test_items_nested_locked 0.4186ms 0.3954ms 2.5293 KOps/s 2.5163 KOps/s $\color{#35bf28}+0.51\%$
test_items_nested_leaf 0.1022ms 85.6951μs 11.6693 KOps/s 11.6972 KOps/s $\color{#d91a1a}-0.24\%$
test_items_stack_nested 0.4446ms 0.3965ms 2.5219 KOps/s 2.5682 KOps/s $\color{#d91a1a}-1.80\%$
test_items_stack_nested_leaf 0.1038ms 86.4380μs 11.5690 KOps/s 11.6552 KOps/s $\color{#d91a1a}-0.74\%$
test_items_stack_nested_locked 0.4585ms 0.3971ms 2.5180 KOps/s 2.5475 KOps/s $\color{#d91a1a}-1.16\%$
test_keys 17.2610μs 4.3697μs 228.8481 KOps/s 229.6237 KOps/s $\color{#d91a1a}-0.34\%$
test_keys_nested 89.8450μs 66.9640μs 14.9334 KOps/s 15.1213 KOps/s $\color{#d91a1a}-1.24\%$
test_keys_nested_locked 2.1728ms 72.5651μs 13.7807 KOps/s 13.6387 KOps/s $\color{#35bf28}+1.04\%$
test_keys_nested_leaf 76.5240μs 57.1717μs 17.4912 KOps/s 17.2024 KOps/s $\color{#35bf28}+1.68\%$
test_keys_stack_nested 91.5550μs 66.9998μs 14.9254 KOps/s 15.0768 KOps/s $\color{#d91a1a}-1.00\%$
test_keys_stack_nested_leaf 75.4640μs 57.4133μs 17.4176 KOps/s 17.3997 KOps/s $\color{#35bf28}+0.10\%$
test_keys_stack_nested_locked 95.4560μs 71.1789μs 14.0491 KOps/s 14.1107 KOps/s $\color{#d91a1a}-0.44\%$
test_values 8.5340μs 1.7595μs 568.3377 KOps/s 566.7445 KOps/s $\color{#35bf28}+0.28\%$
test_values_nested 51.8430μs 33.7316μs 29.6458 KOps/s 29.6333 KOps/s $\color{#35bf28}+0.04\%$
test_values_nested_locked 57.3930μs 35.3748μs 28.2687 KOps/s 27.6907 KOps/s $\color{#35bf28}+2.09\%$
test_values_nested_leaf 50.6330μs 29.9890μs 33.3456 KOps/s 33.3724 KOps/s $\color{#d91a1a}-0.08\%$
test_values_stack_nested 51.3030μs 34.0513μs 29.3675 KOps/s 28.8446 KOps/s $\color{#35bf28}+1.81\%$
test_values_stack_nested_leaf 54.1830μs 30.2004μs 33.1122 KOps/s 32.6056 KOps/s $\color{#35bf28}+1.55\%$
test_values_stack_nested_locked 52.0720μs 35.5123μs 28.1593 KOps/s 27.1221 KOps/s $\color{#35bf28}+3.82\%$
test_membership 1.5156μs 0.5506μs 1.8161 MOps/s 1.7751 MOps/s $\color{#35bf28}+2.31\%$
test_membership_nested 25.3410μs 2.0758μs 481.7498 KOps/s 505.6741 KOps/s $\color{#d91a1a}-4.73\%$
test_membership_nested_leaf 12.4805μs 2.0185μs 495.4256 KOps/s 518.1331 KOps/s $\color{#d91a1a}-4.38\%$
test_membership_stacked_nested 13.8300μs 2.0593μs 485.5928 KOps/s 498.8071 KOps/s $\color{#d91a1a}-2.65\%$
test_membership_stacked_nested_leaf 21.2810μs 2.0495μs 487.9237 KOps/s 496.0931 KOps/s $\color{#d91a1a}-1.65\%$
test_membership_nested_last 15.6610μs 2.9572μs 338.1563 KOps/s 337.7853 KOps/s $\color{#35bf28}+0.11\%$
test_membership_nested_leaf_last 31.0510μs 2.9503μs 338.9491 KOps/s 339.1804 KOps/s $\color{#d91a1a}-0.07\%$
test_membership_stacked_nested_last 22.3810μs 2.9640μs 337.3868 KOps/s 108.6829 KOps/s $\textbf{\color{#35bf28}+210.43\%}$
test_membership_stacked_nested_leaf_last 31.8420μs 2.9747μs 336.1703 KOps/s 109.0206 KOps/s $\textbf{\color{#35bf28}+208.35\%}$
test_nested_getleaf 30.5510μs 7.9978μs 125.0351 KOps/s 124.3522 KOps/s $\color{#35bf28}+0.55\%$
test_nested_get 26.9120μs 7.5349μs 132.7154 KOps/s 131.6263 KOps/s $\color{#35bf28}+0.83\%$
test_stacked_getleaf 34.2320μs 8.0269μs 124.5816 KOps/s 123.1861 KOps/s $\color{#35bf28}+1.13\%$
test_stacked_get 20.7310μs 7.5537μs 132.3861 KOps/s 131.6090 KOps/s $\color{#35bf28}+0.59\%$
test_nested_getitemleaf 30.5310μs 8.2073μs 121.8428 KOps/s 121.7109 KOps/s $\color{#35bf28}+0.11\%$
test_nested_getitem 25.8210μs 7.7179μs 129.5696 KOps/s 129.6423 KOps/s $\color{#d91a1a}-0.06\%$
test_stacked_getitemleaf 24.1010μs 8.1871μs 122.1435 KOps/s 120.1426 KOps/s $\color{#35bf28}+1.67\%$
test_stacked_getitem 29.3810μs 7.7332μs 129.3122 KOps/s 128.8823 KOps/s $\color{#35bf28}+0.33\%$
test_lock_nested 7.0491ms 0.4827ms 2.0718 KOps/s 2.0889 KOps/s $\color{#d91a1a}-0.82\%$
test_lock_stack_nested 0.4663ms 0.4342ms 2.3029 KOps/s 2.3717 KOps/s $\color{#d91a1a}-2.90\%$
test_unlock_nested 0.8590ms 0.3952ms 2.5304 KOps/s 2.5360 KOps/s $\color{#d91a1a}-0.22\%$
test_unlock_stack_nested 0.3955ms 0.3533ms 2.8301 KOps/s 2.9272 KOps/s $\color{#d91a1a}-3.32\%$
test_flatten_speed 0.4030ms 0.1064ms 9.4024 KOps/s 9.3839 KOps/s $\color{#35bf28}+0.20\%$
test_unflatten_speed 0.3181ms 0.2931ms 3.4119 KOps/s 3.3725 KOps/s $\color{#35bf28}+1.17\%$
test_common_ops 1.5125ms 1.2825ms 779.7463 Ops/s 749.2621 Ops/s $\color{#35bf28}+4.07\%$
test_creation 15.4110μs 1.9727μs 506.9118 KOps/s 506.1312 KOps/s $\color{#35bf28}+0.15\%$
test_creation_empty 44.7530μs 15.9105μs 62.8517 KOps/s 60.7270 KOps/s $\color{#35bf28}+3.50\%$
test_creation_nested_1 44.4720μs 17.7768μs 56.2532 KOps/s 54.4193 KOps/s $\color{#35bf28}+3.37\%$
test_creation_nested_2 43.1320μs 20.5124μs 48.7511 KOps/s 47.9906 KOps/s $\color{#35bf28}+1.58\%$
test_clone 90.9950μs 29.1828μs 34.2668 KOps/s 31.5411 KOps/s $\textbf{\color{#35bf28}+8.64\%}$
test_getitem[int] 1.1373ms 16.7624μs 59.6573 KOps/s 57.7972 KOps/s $\color{#35bf28}+3.22\%$
test_getitem[slice_int] 0.1568ms 29.0661μs 34.4043 KOps/s 31.9388 KOps/s $\textbf{\color{#35bf28}+7.72\%}$
test_getitem[range] 0.2901ms 0.1149ms 8.7049 KOps/s 8.7305 KOps/s $\color{#d91a1a}-0.29\%$
test_getitem[tuple] 0.1562ms 24.7785μs 40.3575 KOps/s 39.5676 KOps/s $\color{#35bf28}+2.00\%$
test_getitem[list] 0.2332ms 0.1044ms 9.5797 KOps/s 9.6300 KOps/s $\color{#d91a1a}-0.52\%$
test_setitem_dim[int] 70.1640μs 51.3111μs 19.4890 KOps/s 19.9335 KOps/s $\color{#d91a1a}-2.23\%$
test_setitem_dim[slice_int] 98.5760μs 77.5812μs 12.8897 KOps/s 13.3516 KOps/s $\color{#d91a1a}-3.46\%$
test_setitem_dim[range] 0.3087ms 0.1402ms 7.1308 KOps/s 7.2756 KOps/s $\color{#d91a1a}-1.99\%$
test_setitem_dim[tuple] 0.1073ms 70.1597μs 14.2532 KOps/s 14.7160 KOps/s $\color{#d91a1a}-3.15\%$
test_setitem 75.8040μs 41.9050μs 23.8635 KOps/s 23.4159 KOps/s $\color{#35bf28}+1.91\%$
test_set 79.6940μs 40.8289μs 24.4924 KOps/s 23.4228 KOps/s $\color{#35bf28}+4.57\%$
test_set_shared 0.3917ms 52.9694μs 18.8788 KOps/s 18.8961 KOps/s $\color{#d91a1a}-0.09\%$
test_update 76.3540μs 49.0549μs 20.3853 KOps/s 20.0044 KOps/s $\color{#35bf28}+1.90\%$
test_update_nested 88.4350μs 57.7283μs 17.3225 KOps/s 16.7255 KOps/s $\color{#35bf28}+3.57\%$
test_update__nested 85.6950μs 60.3602μs 16.5672 KOps/s 15.5767 KOps/s $\textbf{\color{#35bf28}+6.36\%}$
test_set_nested 74.3240μs 43.8046μs 22.8287 KOps/s 21.7203 KOps/s $\textbf{\color{#35bf28}+5.10\%}$
test_set_nested_new 74.8040μs 46.9945μs 21.2791 KOps/s 20.0185 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_select 0.1479ms 62.4507μs 16.0126 KOps/s 15.3570 KOps/s $\color{#35bf28}+4.27\%$
test_select_nested 75.1240μs 52.5391μs 19.0335 KOps/s 19.1845 KOps/s $\color{#d91a1a}-0.79\%$
test_exclude_nested 91.6360μs 71.9874μs 13.8913 KOps/s 14.1696 KOps/s $\color{#d91a1a}-1.96\%$
test_empty[True] 0.3186ms 0.2954ms 3.3852 KOps/s 3.4127 KOps/s $\color{#d91a1a}-0.81\%$
test_empty[False] 2.9482μs 0.9763μs 1.0243 MOps/s 1.0678 MOps/s $\color{#d91a1a}-4.07\%$
test_to 69.5040μs 36.2015μs 27.6232 KOps/s 27.0540 KOps/s $\color{#35bf28}+2.10\%$
test_to_nonblocking 49.1730μs 23.0876μs 43.3133 KOps/s 43.7317 KOps/s $\color{#d91a1a}-0.96\%$
test_unbind_speed 1.2864ms 0.3000ms 3.3336 KOps/s 3.2666 KOps/s $\color{#35bf28}+2.05\%$
test_unbind_speed_stack0 0.3459ms 0.2925ms 3.4184 KOps/s 3.3618 KOps/s $\color{#35bf28}+1.68\%$
test_unbind_speed_stack1 89.5938ms 0.7807ms 1.2809 KOps/s 1.3126 KOps/s $\color{#d91a1a}-2.42\%$
test_split 91.0505ms 2.3208ms 430.8933 Ops/s 432.3662 Ops/s $\color{#d91a1a}-0.34\%$
test_chunk 93.6567ms 2.3320ms 428.8196 Ops/s 428.9476 Ops/s $\color{#d91a1a}-0.03\%$
test_creation[device0] 0.1812ms 0.1012ms 9.8795 KOps/s 9.7374 KOps/s $\color{#35bf28}+1.46\%$
test_creation_from_tensor 0.1543ms 0.1013ms 9.8728 KOps/s 9.9989 KOps/s $\color{#d91a1a}-1.26\%$
test_add_one[memmap_tensor0] 0.1546ms 8.9755μs 111.4147 KOps/s 109.6679 KOps/s $\color{#35bf28}+1.59\%$
test_contiguous[memmap_tensor0] 28.8010μs 2.1450μs 466.1927 KOps/s 456.0233 KOps/s $\color{#35bf28}+2.23\%$
test_stack[memmap_tensor0] 28.8820μs 6.6566μs 150.2265 KOps/s 145.6105 KOps/s $\color{#35bf28}+3.17\%$
test_memmaptd_index 1.1215ms 0.4228ms 2.3651 KOps/s 2.3795 KOps/s $\color{#d91a1a}-0.60\%$
test_memmaptd_index_astensor 0.7444ms 0.4881ms 2.0489 KOps/s 2.0453 KOps/s $\color{#35bf28}+0.18\%$
test_memmaptd_index_op 1.3928ms 1.0096ms 990.4644 Ops/s 972.6125 Ops/s $\color{#35bf28}+1.84\%$
test_serialize_model 98.2257ms 94.5216ms 10.5796 Ops/s 10.2507 Ops/s $\color{#35bf28}+3.21\%$
test_serialize_model_pickle 1.3481s 1.2361s 0.8090 Ops/s 0.8073 Ops/s $\color{#35bf28}+0.22\%$
test_serialize_weights 0.1874s 0.1025s 9.7586 Ops/s 9.3361 Ops/s $\color{#35bf28}+4.53\%$
test_serialize_weights_returnearly 0.2961s 86.8233ms 11.5176 Ops/s 11.5587 Ops/s $\color{#d91a1a}-0.36\%$
test_serialize_weights_pickle 1.3486s 1.2365s 0.8087 Ops/s 0.8085 Ops/s $\color{#35bf28}+0.03\%$
test_reshape_pytree 67.8140μs 38.4009μs 26.0411 KOps/s 26.0292 KOps/s $\color{#35bf28}+0.05\%$
test_reshape_td 67.4230μs 43.4809μs 22.9986 KOps/s 22.9846 KOps/s $\color{#35bf28}+0.06\%$
test_view_pytree 59.5240μs 37.1411μs 26.9244 KOps/s 26.4767 KOps/s $\color{#35bf28}+1.69\%$
test_view_td 70.5240μs 47.8034μs 20.9190 KOps/s 20.6744 KOps/s $\color{#35bf28}+1.18\%$
test_unbind_pytree 60.8640μs 36.4035μs 27.4699 KOps/s 27.1035 KOps/s $\color{#35bf28}+1.35\%$
test_unbind_td 0.3825ms 45.5273μs 21.9648 KOps/s 21.4838 KOps/s $\color{#35bf28}+2.24\%$
test_split_pytree 0.3557ms 50.3481μs 19.8617 KOps/s 19.8289 KOps/s $\color{#35bf28}+0.17\%$
test_split_td 0.1681ms 58.1451μs 17.1984 KOps/s 16.9217 KOps/s $\color{#35bf28}+1.63\%$
test_add_pytree 93.0950μs 59.8685μs 16.7033 KOps/s 16.9145 KOps/s $\color{#d91a1a}-1.25\%$
test_add_td 0.1284ms 91.0626μs 10.9815 KOps/s 9.8495 KOps/s $\textbf{\color{#35bf28}+11.49\%}$
test_compile_add_one_nested[tensordict-compile] 0.4089ms 0.2074ms 4.8213 KOps/s 4.8450 KOps/s $\color{#d91a1a}-0.49\%$
test_compile_add_one_nested[tensordict-eager] 0.2620ms 0.1765ms 5.6644 KOps/s 5.8389 KOps/s $\color{#d91a1a}-2.99\%$
test_compile_add_one_nested[pytree-compile] 0.1832ms 0.1427ms 7.0083 KOps/s 6.9756 KOps/s $\color{#35bf28}+0.47\%$
test_compile_add_one_nested[pytree-eager] 0.2556ms 0.1951ms 5.1253 KOps/s 5.1692 KOps/s $\color{#d91a1a}-0.85\%$
test_compile_copy_nested[tensordict-compile] 46.9920μs 22.2679μs 44.9076 KOps/s 44.9561 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_copy_nested[tensordict-eager] 74.6640μs 48.5825μs 20.5836 KOps/s 20.4688 KOps/s $\color{#35bf28}+0.56\%$
test_compile_copy_nested[pytree-compile] 0.1025ms 71.8815μs 13.9118 KOps/s 14.0386 KOps/s $\color{#d91a1a}-0.90\%$
test_compile_copy_nested[pytree-eager] 80.4340μs 59.8818μs 16.6996 KOps/s 16.7849 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_add_one_flat[tensordict-compile] 0.4118ms 0.3206ms 3.1187 KOps/s 3.1125 KOps/s $\color{#35bf28}+0.20\%$
test_compile_add_one_flat[tensordict-eager] 0.2692ms 0.2217ms 4.5104 KOps/s 4.5411 KOps/s $\color{#d91a1a}-0.68\%$
test_compile_add_one_flat[tensorclass-compile] 0.2061ms 0.1288ms 7.7658 KOps/s 7.8060 KOps/s $\color{#d91a1a}-0.52\%$
test_compile_add_one_flat[tensorclass-eager] 0.1265ms 64.1869μs 15.5795 KOps/s 15.9490 KOps/s $\color{#d91a1a}-2.32\%$
test_compile_add_one_flat[pytree-compile] 0.3732ms 0.3202ms 3.1230 KOps/s 3.1417 KOps/s $\color{#d91a1a}-0.59\%$
test_compile_add_one_flat[pytree-eager] 0.6909ms 0.6274ms 1.5938 KOps/s 1.5938 KOps/s $+0.00\%$
test_compile_add_self_flat[tensordict-eager] 0.3194ms 0.2719ms 3.6782 KOps/s 3.7439 KOps/s $\color{#d91a1a}-1.76\%$
test_compile_add_self_flat[tensordict-compile] 0.3624ms 0.3235ms 3.0912 KOps/s 3.0940 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_add_self_flat[tensorclass-eager] 0.1722ms 78.3853μs 12.7575 KOps/s 13.2040 KOps/s $\color{#d91a1a}-3.38\%$
test_compile_add_self_flat[tensorclass-compile] 0.2591ms 0.1290ms 7.7546 KOps/s 7.7577 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_add_self_flat[pytree-eager] 0.5887ms 0.5307ms 1.8844 KOps/s 1.8606 KOps/s $\color{#35bf28}+1.28\%$
test_compile_add_self_flat[pytree-compile] 0.3646ms 0.3206ms 3.1192 KOps/s 3.1353 KOps/s $\color{#d91a1a}-0.51\%$
test_compile_copy_flat[tensordict-compile] 42.7620μs 18.6984μs 53.4804 KOps/s 49.9210 KOps/s $\textbf{\color{#35bf28}+7.13\%}$
test_compile_copy_flat[tensordict-eager] 53.5530μs 32.1989μs 31.0570 KOps/s 31.0680 KOps/s $\color{#d91a1a}-0.04\%$
test_compile_copy_flat[pytree-compile] 0.1073ms 74.9095μs 13.3494 KOps/s 13.3583 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_copy_flat[pytree-eager] 87.8050μs 60.6165μs 16.4972 KOps/s 16.4582 KOps/s $\color{#35bf28}+0.24\%$
test_compile_assign_and_add[tensordict-compile] 2.5256ms 0.9203ms 1.0867 KOps/s 1.0856 KOps/s $\color{#35bf28}+0.10\%$
test_compile_assign_and_add[tensordict-eager] 3.4465ms 3.3030ms 302.7596 Ops/s 300.4911 Ops/s $\color{#35bf28}+0.75\%$
test_compile_assign_and_add[pytree-compile] 2.4870ms 0.9023ms 1.1082 KOps/s 1.0923 KOps/s $\color{#35bf28}+1.46\%$
test_compile_assign_and_add[pytree-eager] 3.3601ms 3.2918ms 303.7806 Ops/s 300.8587 Ops/s $\color{#35bf28}+0.97\%$
test_compile_indexing[tensor-tensordict-compile] 0.1383ms 0.1094ms 9.1445 KOps/s 9.0807 KOps/s $\color{#35bf28}+0.70\%$
test_compile_indexing[tensor-tensordict-eager] 0.2340ms 62.6558μs 15.9602 KOps/s 15.4238 KOps/s $\color{#35bf28}+3.48\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1332ms 0.1015ms 9.8555 KOps/s 9.7545 KOps/s $\color{#35bf28}+1.04\%$
test_compile_indexing[tensor-tensorclass-eager] 82.9550μs 45.6193μs 21.9205 KOps/s 21.9240 KOps/s $\color{#d91a1a}-0.02\%$
test_compile_indexing[tensor-pytree-compile] 0.1484ms 0.1036ms 9.6566 KOps/s 9.6404 KOps/s $\color{#35bf28}+0.17\%$
test_compile_indexing[tensor-pytree-eager] 91.3150μs 45.5959μs 21.9318 KOps/s 22.2766 KOps/s $\color{#d91a1a}-1.55\%$
test_compile_indexing[slice-tensordict-compile] 0.1801ms 0.1382ms 7.2356 KOps/s 7.2331 KOps/s $\color{#35bf28}+0.03\%$
test_compile_indexing[slice-tensordict-eager] 0.1885ms 26.2222μs 38.1356 KOps/s 38.1632 KOps/s $\color{#d91a1a}-0.07\%$
test_compile_indexing[slice-tensorclass-compile] 0.2305ms 0.1299ms 7.6967 KOps/s 7.6960 KOps/s $+0.01\%$
test_compile_indexing[slice-tensorclass-eager] 52.8730μs 22.3485μs 44.7457 KOps/s 45.7893 KOps/s $\color{#d91a1a}-2.28\%$
test_compile_indexing[slice-pytree-compile] 0.1704ms 0.1295ms 7.7245 KOps/s 7.3912 KOps/s $\color{#35bf28}+4.51\%$
test_compile_indexing[slice-pytree-eager] 49.9730μs 22.2336μs 44.9770 KOps/s 44.1257 KOps/s $\color{#35bf28}+1.93\%$
test_compile_indexing[int-tensordict-compile] 0.2058ms 0.1373ms 7.2831 KOps/s 7.2632 KOps/s $\color{#35bf28}+0.27\%$
test_compile_indexing[int-tensordict-eager] 0.5280ms 26.2072μs 38.1574 KOps/s 39.0316 KOps/s $\color{#d91a1a}-2.24\%$
test_compile_indexing[int-tensorclass-compile] 0.1712ms 0.1293ms 7.7326 KOps/s 7.4475 KOps/s $\color{#35bf28}+3.83\%$
test_compile_indexing[int-tensorclass-eager] 53.3830μs 21.7753μs 45.9237 KOps/s 45.8961 KOps/s $\color{#35bf28}+0.06\%$
test_compile_indexing[int-pytree-compile] 0.1840ms 0.1290ms 7.7528 KOps/s 7.5564 KOps/s $\color{#35bf28}+2.60\%$
test_compile_indexing[int-pytree-eager] 55.1730μs 22.2112μs 45.0224 KOps/s 46.2377 KOps/s $\color{#d91a1a}-2.63\%$
test_mod_add[eager] 81.9350μs 37.4959μs 26.6696 KOps/s 26.1562 KOps/s $\color{#35bf28}+1.96\%$
test_mod_add[compile] 0.2223ms 66.5021μs 15.0371 KOps/s 14.8957 KOps/s $\color{#35bf28}+0.95\%$
test_mod_add[compile-overhead] 0.2776ms 0.1471ms 6.7991 KOps/s 6.8597 KOps/s $\color{#d91a1a}-0.88\%$
test_mod_wrap[eager] 0.3496ms 0.2536ms 3.9438 KOps/s 3.8417 KOps/s $\color{#35bf28}+2.66\%$
test_mod_wrap[compile] 1.2290ms 0.2895ms 3.4547 KOps/s 3.3895 KOps/s $\color{#35bf28}+1.92\%$
test_mod_wrap[compile-overhead] 8.1905ms 4.3093ms 232.0582 Ops/s 227.7363 Ops/s $\color{#35bf28}+1.90\%$
test_mod_wrap_and_backward[eager] 1.5305ms 1.4265ms 701.0390 Ops/s 695.9611 Ops/s $\color{#35bf28}+0.73\%$
test_mod_wrap_and_backward[compile] 1.5696ms 1.4356ms 696.5696 Ops/s 749.9832 Ops/s $\textbf{\color{#d91a1a}-7.12\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4473ms 0.9902ms 1.0099 KOps/s 1.0411 KOps/s $\color{#d91a1a}-3.00\%$
test_seq_add[eager] 0.1585ms 0.1083ms 9.2319 KOps/s 9.3196 KOps/s $\color{#d91a1a}-0.94\%$
test_seq_add[compile] 0.2078ms 84.4950μs 11.8350 KOps/s 12.0624 KOps/s $\color{#d91a1a}-1.89\%$
test_seq_add[compile-overhead] 0.1592ms 0.1215ms 8.2290 KOps/s 8.0952 KOps/s $\color{#35bf28}+1.65\%$
test_seq_wrap[eager] 0.4815ms 0.4182ms 2.3910 KOps/s 2.3390 KOps/s $\color{#35bf28}+2.22\%$
test_seq_wrap[compile] 1.4694ms 0.3193ms 3.1321 KOps/s 3.1191 KOps/s $\color{#35bf28}+0.42\%$
test_seq_wrap[compile-overhead] 0.3082s 0.1475s 6.7819 Ops/s 6.8147 Ops/s $\color{#d91a1a}-0.48\%$
test_func_call_runtime[False-eager] 0.7784ms 0.7425ms 1.3468 KOps/s 1.3438 KOps/s $\color{#35bf28}+0.22\%$
test_func_call_runtime[False-compile] 0.8627ms 0.8010ms 1.2484 KOps/s 1.2114 KOps/s $\color{#35bf28}+3.05\%$
test_func_call_runtime[False-compile-overhead] 0.4103ms 0.3564ms 2.8061 KOps/s 2.7911 KOps/s $\color{#35bf28}+0.54\%$
test_func_call_runtime[True-eager] 1.0494ms 0.9838ms 1.0165 KOps/s 993.1818 Ops/s $\color{#35bf28}+2.35\%$
test_func_call_runtime[True-compile] 0.8865ms 0.8416ms 1.1882 KOps/s 1.1683 KOps/s $\color{#35bf28}+1.71\%$
test_func_call_runtime[True-compile-overhead] 0.4539ms 0.3975ms 2.5158 KOps/s 2.4943 KOps/s $\color{#35bf28}+0.86\%$
test_distributed 0.2678ms 67.9732μs 14.7117 KOps/s 13.9846 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_tdmodule 40.1520μs 15.2450μs 65.5954 KOps/s 63.2390 KOps/s $\color{#35bf28}+3.73\%$
test_tdmodule_dispatch 46.8520μs 31.2317μs 32.0188 KOps/s 30.6596 KOps/s $\color{#35bf28}+4.43\%$
test_tdseq 31.0420μs 15.8082μs 63.2585 KOps/s 60.4443 KOps/s $\color{#35bf28}+4.66\%$
test_tdseq_dispatch 53.1730μs 33.0989μs 30.2125 KOps/s 29.1364 KOps/s $\color{#35bf28}+3.69\%$
test_instantiation_functorch 2.0828ms 2.0056ms 498.6105 Ops/s 502.3022 Ops/s $\color{#d91a1a}-0.73\%$
test_instantiation_td 2.0019ms 1.2939ms 772.8581 Ops/s 779.5555 Ops/s $\color{#d91a1a}-0.86\%$
test_exec_functorch 0.2796ms 0.2209ms 4.5266 KOps/s 4.5544 KOps/s $\color{#d91a1a}-0.61\%$
test_exec_functional_call 0.3648ms 0.2173ms 4.6028 KOps/s 4.5375 KOps/s $\color{#35bf28}+1.44\%$
test_exec_td 0.2428ms 0.2164ms 4.6219 KOps/s 4.4410 KOps/s $\color{#35bf28}+4.07\%$
test_exec_td_decorator 1.0915ms 0.2933ms 3.4094 KOps/s 3.3420 KOps/s $\color{#35bf28}+2.02\%$
test_vmap_mlp_speed[True-True] 0.8101ms 0.6694ms 1.4938 KOps/s 1.4714 KOps/s $\color{#35bf28}+1.52\%$
test_vmap_mlp_speed[True-False] 0.7281ms 0.6680ms 1.4970 KOps/s 1.4807 KOps/s $\color{#35bf28}+1.10\%$
test_vmap_mlp_speed[False-True] 0.6295ms 0.5879ms 1.7009 KOps/s 1.6971 KOps/s $\color{#35bf28}+0.22\%$
test_vmap_mlp_speed[False-False] 0.6555ms 0.5896ms 1.6962 KOps/s 1.6869 KOps/s $\color{#35bf28}+0.55\%$
test_vmap_mlp_speed_decorator[True-True] 1.2639ms 0.7506ms 1.3323 KOps/s 1.3346 KOps/s $\color{#d91a1a}-0.18\%$
test_vmap_mlp_speed_decorator[True-False] 0.8644ms 0.7495ms 1.3342 KOps/s 1.3397 KOps/s $\color{#d91a1a}-0.41\%$
test_vmap_mlp_speed_decorator[False-True] 0.8871ms 0.6546ms 1.5276 KOps/s 1.5254 KOps/s $\color{#35bf28}+0.14\%$
test_vmap_mlp_speed_decorator[False-False] 0.8142ms 0.6549ms 1.5269 KOps/s 1.5346 KOps/s $\color{#d91a1a}-0.50\%$
test_vmap_transformer_speed[True-True] 8.8948ms 8.8067ms 113.5498 Ops/s 113.4149 Ops/s $\color{#35bf28}+0.12\%$
test_vmap_transformer_speed[True-False] 8.8767ms 8.8076ms 113.5377 Ops/s 113.6520 Ops/s $\color{#d91a1a}-0.10\%$
test_vmap_transformer_speed[False-True] 9.1055ms 8.7650ms 114.0904 Ops/s 114.9823 Ops/s $\color{#d91a1a}-0.78\%$
test_vmap_transformer_speed[False-False] 8.8007ms 8.7055ms 114.8702 Ops/s 114.8687 Ops/s $+0.00\%$
test_vmap_transformer_speed_decorator[True-True] 21.0758ms 21.0062ms 47.6050 Ops/s 47.4779 Ops/s $\color{#35bf28}+0.27\%$
test_vmap_transformer_speed_decorator[True-False] 21.0788ms 20.9918ms 47.6376 Ops/s 47.5675 Ops/s $\color{#35bf28}+0.15\%$
test_vmap_transformer_speed_decorator[False-True] 21.2681ms 20.8074ms 48.0598 Ops/s 48.0192 Ops/s $\color{#35bf28}+0.08\%$
test_vmap_transformer_speed_decorator[False-False] 20.9405ms 20.8057ms 48.0638 Ops/s 47.9699 Ops/s $\color{#35bf28}+0.20\%$
test_to_module_speed[True] 2.8941ms 1.4815ms 675.0067 Ops/s 672.7747 Ops/s $\color{#35bf28}+0.33\%$
test_to_module_speed[False] 1.9151ms 1.4656ms 682.3212 Ops/s 682.1307 Ops/s $\color{#35bf28}+0.03\%$
test_tc_init 51.0730μs 33.8570μs 29.5360 KOps/s 28.8001 KOps/s $\color{#35bf28}+2.56\%$
test_tc_init_nested 0.2006ms 71.3604μs 14.0134 KOps/s 14.2723 KOps/s $\color{#d91a1a}-1.81\%$
test_tc_first_layer_tensor 17.4910μs 4.0045μs 249.7209 KOps/s 247.8738 KOps/s $\color{#35bf28}+0.75\%$
test_tc_first_layer_nontensor 16.5110μs 4.0213μs 248.6787 KOps/s 246.4074 KOps/s $\color{#35bf28}+0.92\%$
test_tc_second_layer_tensor 30.7068μs 1.3028μs 767.5665 KOps/s 775.0990 KOps/s $\color{#d91a1a}-0.97\%$
test_tc_second_layer_nontensor 17.4910μs 4.6231μs 216.3066 KOps/s 218.6052 KOps/s $\color{#d91a1a}-1.05\%$
test_unbind 0.3183s 12.9437ms 77.2578 Ops/s 82.5423 Ops/s $\textbf{\color{#d91a1a}-6.40\%}$
test_full_like 0.6619ms 0.5774ms 1.7318 KOps/s 1.7318 KOps/s $+0.00\%$
test_zeros_like 0.2659ms 0.1977ms 5.0582 KOps/s 5.0559 KOps/s $\color{#35bf28}+0.05\%$
test_ones_like 0.2179ms 0.1975ms 5.0630 KOps/s 5.0580 KOps/s $\color{#35bf28}+0.10\%$
test_clone 0.4441ms 0.4146ms 2.4118 KOps/s 2.4141 KOps/s $\color{#d91a1a}-0.09\%$
test_squeeze 28.2210μs 11.6828μs 85.5963 KOps/s 84.9616 KOps/s $\color{#35bf28}+0.75\%$
test_unsqueeze 0.2612ms 81.1712μs 12.3196 KOps/s 11.9810 KOps/s $\color{#35bf28}+2.83\%$
test_split 0.4694ms 0.1815ms 5.5107 KOps/s 5.5152 KOps/s $\color{#d91a1a}-0.08\%$
test_permute 0.3005ms 0.1919ms 5.2119 KOps/s 5.1288 KOps/s $\color{#35bf28}+1.62\%$
test_stack 1.2518ms 0.9072ms 1.1023 KOps/s 1.1345 KOps/s $\color{#d91a1a}-2.84\%$
test_cat 1.2490ms 1.2316ms 811.9562 Ops/s 812.0408 Ops/s $\color{#d91a1a}-0.01\%$

@vmoens added the enhancement label Jul 23, 2024
@vmoens merged commit 7de33a4 into main Jul 23, 2024
39 of 41 checks passed
@vmoens deleted the from_modules-expand branch July 23, 2024 12:21