[Doc] Streaming tensordicts #956

Merged (5 commits) on Aug 9, 2024

vmoens (Contributor) commented Aug 9, 2024

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 969e272d2c8a8823d162c55381a8b70b3787931c
Pull Request resolved: #956
facebook-github-bot added the CLA Signed label Aug 9, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 98813cf349698cd4b2edd0e34efd16e17f42d644
Pull Request resolved: #956
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 11a3a8a01bdd84f7a6ecfbe7f4b66895db76a55f
Pull Request resolved: #956
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 76094e20f9486fb7363e8aee57d51bc24b4fe525
Pull Request resolved: #956
vmoens added the documentation label Aug 9, 2024

github-actions bot commented Aug 9, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 219. Improved: 6. Worsened: 38.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 50.8050μs 22.1213μs 45.2054 KOps/s 46.9291 KOps/s $\color{#d91a1a}-3.67\%$
test_plain_set_stack_nested 62.6370μs 22.2489μs 44.9461 KOps/s 47.6114 KOps/s $\textbf{\color{#d91a1a}-5.60\%}$
test_plain_set_nested_inplace 75.1340μs 23.8617μs 41.9082 KOps/s 43.8111 KOps/s $\color{#d91a1a}-4.34\%$
test_plain_set_stack_nested_inplace 76.0730μs 24.3317μs 41.0986 KOps/s 43.6710 KOps/s $\textbf{\color{#d91a1a}-5.89\%}$
test_items 20.7090μs 2.7378μs 365.2526 KOps/s 379.3574 KOps/s $\color{#d91a1a}-3.72\%$
test_items_nested 2.2305ms 0.3484ms 2.8699 KOps/s 3.0338 KOps/s $\textbf{\color{#d91a1a}-5.40\%}$
test_items_nested_locked 0.6186ms 0.3504ms 2.8538 KOps/s 2.9996 KOps/s $\color{#d91a1a}-4.86\%$
test_items_nested_leaf 0.1530ms 82.3176μs 12.1481 KOps/s 12.2048 KOps/s $\color{#d91a1a}-0.46\%$
test_items_stack_nested 0.5789ms 0.3407ms 2.9349 KOps/s 2.9997 KOps/s $\color{#d91a1a}-2.16\%$
test_items_stack_nested_leaf 0.1520ms 82.4611μs 12.1269 KOps/s 12.7122 KOps/s $\color{#d91a1a}-4.60\%$
test_items_stack_nested_locked 0.5843ms 0.3404ms 2.9376 KOps/s 2.9455 KOps/s $\color{#d91a1a}-0.27\%$
test_keys 23.0630μs 3.8171μs 261.9821 KOps/s 261.6522 KOps/s $\color{#35bf28}+0.13\%$
test_keys_nested 0.2947ms 0.1401ms 7.1362 KOps/s 6.9494 KOps/s $\color{#35bf28}+2.69\%$
test_keys_nested_locked 0.6989ms 0.1472ms 6.7927 KOps/s 6.6931 KOps/s $\color{#35bf28}+1.49\%$
test_keys_nested_leaf 0.2426ms 0.1225ms 8.1605 KOps/s 8.1625 KOps/s $\color{#d91a1a}-0.02\%$
test_keys_stack_nested 0.2539ms 0.1431ms 6.9874 KOps/s 7.1098 KOps/s $\color{#d91a1a}-1.72\%$
test_keys_stack_nested_leaf 0.2249ms 0.1211ms 8.2581 KOps/s 8.2695 KOps/s $\color{#d91a1a}-0.14\%$
test_keys_stack_nested_locked 0.2447ms 0.1496ms 6.6863 KOps/s 6.7931 KOps/s $\color{#d91a1a}-1.57\%$
test_values 13.5205μs 2.2475μs 444.9459 KOps/s 851.4352 KOps/s $\textbf{\color{#d91a1a}-47.74\%}$
test_values_nested 89.0170μs 48.6968μs 20.5352 KOps/s 20.3904 KOps/s $\color{#35bf28}+0.71\%$
test_values_nested_locked 0.1064ms 48.9223μs 20.4406 KOps/s 20.4614 KOps/s $\color{#d91a1a}-0.10\%$
test_values_nested_leaf 95.1990μs 44.5201μs 22.4618 KOps/s 22.3074 KOps/s $\color{#35bf28}+0.69\%$
test_values_stack_nested 99.3060μs 49.6965μs 20.1222 KOps/s 19.6784 KOps/s $\color{#35bf28}+2.25\%$
test_values_stack_nested_leaf 88.0250μs 43.9989μs 22.7278 KOps/s 23.1266 KOps/s $\color{#d91a1a}-1.72\%$
test_values_stack_nested_locked 90.7200μs 49.3471μs 20.2646 KOps/s 19.7143 KOps/s $\color{#35bf28}+2.79\%$
test_membership 2.5493μs 0.7238μs 1.3816 MOps/s 1.3453 MOps/s $\color{#35bf28}+2.70\%$
test_membership_nested 30.4470μs 2.5934μs 385.5886 KOps/s 378.1200 KOps/s $\color{#35bf28}+1.98\%$
test_membership_nested_leaf 42.2490μs 2.6357μs 379.4052 KOps/s 389.1424 KOps/s $\color{#d91a1a}-2.50\%$
test_membership_stacked_nested 29.3860μs 2.6155μs 382.3300 KOps/s 401.7692 KOps/s $\color{#d91a1a}-4.84\%$
test_membership_stacked_nested_leaf 25.8380μs 2.6390μs 378.9258 KOps/s 392.3195 KOps/s $\color{#d91a1a}-3.41\%$
test_membership_nested_last 28.3330μs 3.9099μs 255.7634 KOps/s 263.5290 KOps/s $\color{#d91a1a}-2.95\%$
test_membership_nested_leaf_last 30.4670μs 3.9042μs 256.1333 KOps/s 254.4892 KOps/s $\color{#35bf28}+0.65\%$
test_membership_stacked_nested_last 24.4450μs 3.9255μs 254.7453 KOps/s 78.6358 KOps/s $\textbf{\color{#35bf28}+223.96\%}$
test_membership_stacked_nested_leaf_last 43.0300μs 3.8070μs 262.6746 KOps/s 80.6033 KOps/s $\textbf{\color{#35bf28}+225.89\%}$
test_nested_getleaf 41.2470μs 10.4037μs 96.1198 KOps/s 96.1789 KOps/s $\color{#d91a1a}-0.06\%$
test_nested_get 50.8450μs 9.7485μs 102.5799 KOps/s 104.7909 KOps/s $\color{#d91a1a}-2.11\%$
test_stacked_getleaf 39.5270μs 10.3153μs 96.9433 KOps/s 95.5553 KOps/s $\color{#35bf28}+1.45\%$
test_stacked_get 32.7310μs 9.6172μs 103.9802 KOps/s 102.9688 KOps/s $\color{#35bf28}+0.98\%$
test_nested_getitemleaf 52.1880μs 10.9828μs 91.0511 KOps/s 94.8358 KOps/s $\color{#d91a1a}-3.99\%$
test_nested_getitem 53.4600μs 10.0788μs 99.2182 KOps/s 102.4214 KOps/s $\color{#d91a1a}-3.13\%$
test_stacked_getitemleaf 44.2930μs 11.0666μs 90.3618 KOps/s 94.3194 KOps/s $\color{#d91a1a}-4.20\%$
test_stacked_getitem 30.9990μs 10.0009μs 99.9913 KOps/s 100.8612 KOps/s $\color{#d91a1a}-0.86\%$
test_lock_nested 80.1372ms 0.5698ms 1.7551 KOps/s 2.0350 KOps/s $\textbf{\color{#d91a1a}-13.75\%}$
test_lock_stack_nested 0.7302ms 0.4648ms 2.1515 KOps/s 2.3230 KOps/s $\textbf{\color{#d91a1a}-7.38\%}$
test_unlock_nested 80.1133ms 0.4919ms 2.0329 KOps/s 2.4290 KOps/s $\textbf{\color{#d91a1a}-16.31\%}$
test_unlock_stack_nested 0.6147ms 0.3759ms 2.6599 KOps/s 2.7956 KOps/s $\color{#d91a1a}-4.85\%$
test_flatten_speed 0.1961ms 0.1015ms 9.8502 KOps/s 9.9062 KOps/s $\color{#d91a1a}-0.57\%$
test_unflatten_speed 0.8908ms 0.4550ms 2.1977 KOps/s 2.2592 KOps/s $\color{#d91a1a}-2.72\%$
test_common_ops 3.8549ms 1.1478ms 871.2471 Ops/s 930.9555 Ops/s $\textbf{\color{#d91a1a}-6.41\%}$
test_creation 21.0190μs 2.0107μs 497.3489 KOps/s 487.4781 KOps/s $\color{#35bf28}+2.02\%$
test_creation_empty 47.0080μs 18.0837μs 55.2983 KOps/s 57.3083 KOps/s $\color{#d91a1a}-3.51\%$
test_creation_nested_1 79.4090μs 21.6670μs 46.1532 KOps/s 48.0131 KOps/s $\color{#d91a1a}-3.87\%$
test_creation_nested_2 60.8140μs 25.8720μs 38.6518 KOps/s 40.0496 KOps/s $\color{#d91a1a}-3.49\%$
test_clone 63.7690μs 16.6489μs 60.0641 KOps/s 58.9920 KOps/s $\color{#35bf28}+1.82\%$
test_getitem[int] 1.1166ms 16.5893μs 60.2796 KOps/s 62.3522 KOps/s $\color{#d91a1a}-3.32\%$
test_getitem[slice_int] 0.1382ms 32.1640μs 31.0907 KOps/s 31.9540 KOps/s $\color{#d91a1a}-2.70\%$
test_getitem[range] 0.1904ms 58.3519μs 17.1374 KOps/s 17.4949 KOps/s $\color{#d91a1a}-2.04\%$
test_getitem[tuple] 0.1242ms 26.6571μs 37.5134 KOps/s 39.7972 KOps/s $\textbf{\color{#d91a1a}-5.74\%}$
test_getitem[list] 0.1808ms 53.2215μs 18.7894 KOps/s 19.3687 KOps/s $\color{#d91a1a}-2.99\%$
test_setitem_dim[int] 64.1800μs 41.5662μs 24.0580 KOps/s 23.8293 KOps/s $\color{#35bf28}+0.96\%$
test_setitem_dim[slice_int] 0.1166ms 72.0086μs 13.8872 KOps/s 14.2225 KOps/s $\color{#d91a1a}-2.36\%$
test_setitem_dim[range] 0.1613ms 94.4621μs 10.5863 KOps/s 10.7491 KOps/s $\color{#d91a1a}-1.51\%$
test_setitem_dim[tuple] 0.1319ms 63.3339μs 15.7893 KOps/s 16.9953 KOps/s $\textbf{\color{#d91a1a}-7.10\%}$
test_setitem 88.5460μs 30.2292μs 33.0806 KOps/s 34.3680 KOps/s $\color{#d91a1a}-3.75\%$
test_set 0.1194ms 29.3009μs 34.1286 KOps/s 35.7499 KOps/s $\color{#d91a1a}-4.54\%$
test_set_shared 1.2410ms 0.2108ms 4.7437 KOps/s 4.6768 KOps/s $\color{#35bf28}+1.43\%$
test_update 0.1368ms 38.0333μs 26.2927 KOps/s 27.6364 KOps/s $\color{#d91a1a}-4.86\%$
test_update_nested 2.4328ms 49.1819μs 20.3327 KOps/s 22.3683 KOps/s $\textbf{\color{#d91a1a}-9.10\%}$
test_update__nested 0.1367ms 35.9852μs 27.7892 KOps/s 30.1758 KOps/s $\textbf{\color{#d91a1a}-7.91\%}$
test_set_nested 0.1062ms 33.6566μs 29.7119 KOps/s 32.1969 KOps/s $\textbf{\color{#d91a1a}-7.72\%}$
test_set_nested_new 0.1072ms 37.8656μs 26.4092 KOps/s 28.4151 KOps/s $\textbf{\color{#d91a1a}-7.06\%}$
test_select 0.1559ms 53.7072μs 18.6195 KOps/s 19.6196 KOps/s $\textbf{\color{#d91a1a}-5.10\%}$
test_select_nested 0.1388ms 57.6372μs 17.3499 KOps/s 17.5861 KOps/s $\color{#d91a1a}-1.34\%$
test_exclude_nested 0.1666ms 75.4828μs 13.2481 KOps/s 13.4105 KOps/s $\color{#d91a1a}-1.21\%$
test_empty[True] 0.4533ms 0.3222ms 3.1033 KOps/s 3.2109 KOps/s $\color{#d91a1a}-3.35\%$
test_empty[False] 5.8488μs 1.1318μs 883.5292 KOps/s 888.1983 KOps/s $\color{#d91a1a}-0.53\%$
test_unbind_speed 0.3562ms 0.2889ms 3.4618 KOps/s 3.2825 KOps/s $\textbf{\color{#35bf28}+5.46\%}$
test_unbind_speed_stack0 0.6814ms 0.2956ms 3.3831 KOps/s 3.5595 KOps/s $\color{#d91a1a}-4.95\%$
test_unbind_speed_stack1 83.0676ms 0.7535ms 1.3271 KOps/s 1.4793 KOps/s $\textbf{\color{#d91a1a}-10.29\%}$
test_split 81.8858ms 2.0206ms 494.8983 Ops/s 440.3197 Ops/s $\textbf{\color{#35bf28}+12.40\%}$
test_chunk 83.8380ms 2.0211ms 494.7700 Ops/s 512.7088 Ops/s $\color{#d91a1a}-3.50\%$
test_creation[device0] 4.2665ms 0.1177ms 8.4954 KOps/s 8.2454 KOps/s $\color{#35bf28}+3.03\%$
test_creation_from_tensor 0.2283ms 0.1219ms 8.2051 KOps/s 8.6709 KOps/s $\textbf{\color{#d91a1a}-5.37\%}$
test_add_one[memmap_tensor0] 0.2274ms 7.7485μs 129.0572 KOps/s 129.6854 KOps/s $\color{#d91a1a}-0.48\%$
test_contiguous[memmap_tensor0] 29.6660μs 2.0095μs 497.6379 KOps/s 501.0831 KOps/s $\color{#d91a1a}-0.69\%$
test_stack[memmap_tensor0] 56.7360μs 5.7693μs 173.3322 KOps/s 179.4097 KOps/s $\color{#d91a1a}-3.39\%$
test_memmaptd_index 1.0497ms 0.3944ms 2.5352 KOps/s 2.5582 KOps/s $\color{#d91a1a}-0.90\%$
test_memmaptd_index_astensor 0.9891ms 0.4763ms 2.0996 KOps/s 2.1319 KOps/s $\color{#d91a1a}-1.52\%$
test_memmaptd_index_op 1.3412ms 1.0304ms 970.4619 Ops/s 986.1767 Ops/s $\color{#d91a1a}-1.59\%$
test_serialize_model 0.1227s 0.1154s 8.6689 Ops/s 8.8360 Ops/s $\color{#d91a1a}-1.89\%$
test_serialize_model_pickle 0.4691s 0.3933s 2.5427 Ops/s 2.5751 Ops/s $\color{#d91a1a}-1.26\%$
test_serialize_weights 0.1255s 0.1148s 8.7130 Ops/s 8.6048 Ops/s $\color{#35bf28}+1.26\%$
test_serialize_weights_returnearly 0.1842s 0.1607s 6.2234 Ops/s 6.6857 Ops/s $\textbf{\color{#d91a1a}-6.91\%}$
test_serialize_weights_pickle 0.4935s 0.4097s 2.4409 Ops/s 2.4326 Ops/s $\color{#35bf28}+0.34\%$
test_serialize_weights_filesystem 0.2159s 0.1472s 6.7943 Ops/s 6.6155 Ops/s $\color{#35bf28}+2.70\%$
test_serialize_model_filesystem 0.1626s 0.1488s 6.7211 Ops/s 6.5693 Ops/s $\color{#35bf28}+2.31\%$
test_reshape_pytree 99.7560μs 39.9408μs 25.0371 KOps/s 25.7887 KOps/s $\color{#d91a1a}-2.91\%$
test_reshape_td 0.1491ms 46.1178μs 21.6836 KOps/s 22.6204 KOps/s $\color{#d91a1a}-4.14\%$
test_view_pytree 85.7200μs 37.3977μs 26.7396 KOps/s 25.8662 KOps/s $\color{#35bf28}+3.38\%$
test_view_td 0.1178ms 52.8992μs 18.9039 KOps/s 19.4729 KOps/s $\color{#d91a1a}-2.92\%$
test_unbind_pytree 71.7140μs 36.5132μs 27.3874 KOps/s 26.4886 KOps/s $\color{#35bf28}+3.39\%$
test_unbind_td 0.3475ms 45.7989μs 21.8346 KOps/s 22.0983 KOps/s $\color{#d91a1a}-1.19\%$
test_split_pytree 0.1266ms 40.7171μs 24.5597 KOps/s 25.0395 KOps/s $\color{#d91a1a}-1.92\%$
test_split_td 0.4835ms 57.3181μs 17.4465 KOps/s 17.4714 KOps/s $\color{#d91a1a}-0.14\%$
test_add_pytree 0.1327ms 46.3715μs 21.5650 KOps/s 20.9437 KOps/s $\color{#35bf28}+2.97\%$
test_add_td 0.2147ms 82.9381μs 12.0572 KOps/s 12.0761 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_add_one_nested[tensordict-compile] 0.1247ms 55.0544μs 18.1638 KOps/s 19.2303 KOps/s $\textbf{\color{#d91a1a}-5.55\%}$
test_compile_add_one_nested[tensordict-eager] 0.4262ms 0.1871ms 5.3437 KOps/s 5.2553 KOps/s $\color{#35bf28}+1.68\%$
test_compile_add_one_nested[pytree-compile] 0.1672ms 55.6856μs 17.9580 KOps/s 19.3185 KOps/s $\textbf{\color{#d91a1a}-7.04\%}$
test_compile_add_one_nested[pytree-eager] 0.2574ms 0.1452ms 6.8856 KOps/s 6.9254 KOps/s $\color{#d91a1a}-0.57\%$
test_compile_copy_nested[tensordict-compile] 92.6930μs 20.0171μs 49.9573 KOps/s 51.1677 KOps/s $\color{#d91a1a}-2.37\%$
test_compile_copy_nested[tensordict-eager] 0.1392ms 62.9847μs 15.8769 KOps/s 15.6960 KOps/s $\color{#35bf28}+1.15\%$
test_compile_copy_nested[pytree-compile] 0.1639ms 78.3114μs 12.7695 KOps/s 12.5430 KOps/s $\color{#35bf28}+1.81\%$
test_compile_copy_nested[pytree-eager] 0.1568ms 70.1176μs 14.2617 KOps/s 14.3399 KOps/s $\color{#d91a1a}-0.54\%$
test_compile_add_one_flat[tensordict-compile] 0.4073ms 0.1735ms 5.7636 KOps/s 5.9750 KOps/s $\color{#d91a1a}-3.54\%$
test_compile_add_one_flat[tensordict-eager] 0.3427ms 0.1876ms 5.3302 KOps/s 5.2359 KOps/s $\color{#35bf28}+1.80\%$
test_compile_add_one_flat[tensorclass-compile] 79.7790μs 38.3436μs 26.0800 KOps/s 26.0357 KOps/s $\color{#35bf28}+0.17\%$
test_compile_add_one_flat[tensorclass-eager] 0.8747ms 71.5792μs 13.9705 KOps/s 14.2801 KOps/s $\color{#d91a1a}-2.17\%$
test_compile_add_one_flat[pytree-compile] 0.4451ms 0.1708ms 5.8556 KOps/s 5.7002 KOps/s $\color{#35bf28}+2.73\%$
test_compile_add_one_flat[pytree-eager] 0.5843ms 0.2873ms 3.4802 KOps/s 3.3867 KOps/s $\color{#35bf28}+2.76\%$
test_compile_add_self_flat[tensordict-eager] 0.3513ms 0.2047ms 4.8851 KOps/s 4.9339 KOps/s $\color{#d91a1a}-0.99\%$
test_compile_add_self_flat[tensordict-compile] 0.5942ms 0.1849ms 5.4091 KOps/s 5.7868 KOps/s $\textbf{\color{#d91a1a}-6.53\%}$
test_compile_add_self_flat[tensorclass-eager] 0.7223ms 61.9265μs 16.1482 KOps/s 15.4179 KOps/s $\color{#35bf28}+4.74\%$
test_compile_add_self_flat[tensorclass-compile] 0.1222ms 40.4746μs 24.7068 KOps/s 26.1352 KOps/s $\textbf{\color{#d91a1a}-5.47\%}$
test_compile_add_self_flat[pytree-eager] 0.4585ms 0.2387ms 4.1892 KOps/s 4.1298 KOps/s $\color{#35bf28}+1.44\%$
test_compile_add_self_flat[pytree-compile] 0.5730ms 0.1784ms 5.6040 KOps/s 5.9170 KOps/s $\textbf{\color{#d91a1a}-5.29\%}$
test_compile_copy_flat[tensordict-compile] 0.1890ms 0.1092ms 9.1544 KOps/s 9.3946 KOps/s $\color{#d91a1a}-2.56\%$
test_compile_copy_flat[tensordict-eager] 0.1187ms 55.5192μs 18.0118 KOps/s 18.2197 KOps/s $\color{#d91a1a}-1.14\%$
test_compile_copy_flat[pytree-compile] 0.1923ms 79.5288μs 12.5741 KOps/s 12.4543 KOps/s $\color{#35bf28}+0.96\%$
test_compile_copy_flat[pytree-eager] 0.1596ms 70.5934μs 14.1656 KOps/s 14.1226 KOps/s $\color{#35bf28}+0.30\%$
test_compile_assign_and_add[tensordict-compile] 0.3744ms 0.1892ms 5.2863 KOps/s 5.6742 KOps/s $\textbf{\color{#d91a1a}-6.84\%}$
test_compile_assign_and_add[tensordict-eager] 1.7978ms 1.5941ms 627.3206 Ops/s 629.4596 Ops/s $\color{#d91a1a}-0.34\%$
test_compile_assign_and_add[pytree-compile] 0.2589ms 0.1841ms 5.4322 KOps/s 5.4736 KOps/s $\color{#d91a1a}-0.76\%$
test_compile_assign_and_add[pytree-eager] 1.2500ms 1.0946ms 913.5807 Ops/s 906.4682 Ops/s $\color{#35bf28}+0.78\%$
test_compile_assign_and_add_stack[compile] 0.5032ms 0.4024ms 2.4849 KOps/s 2.4481 KOps/s $\color{#35bf28}+1.50\%$
test_compile_assign_and_add_stack[eager] 6.0025ms 3.8352ms 260.7445 Ops/s 261.5586 Ops/s $\color{#d91a1a}-0.31\%$
test_compile_indexing[tensor-tensordict-compile] 0.1027ms 34.5615μs 28.9339 KOps/s 31.1249 KOps/s $\textbf{\color{#d91a1a}-7.04\%}$
test_compile_indexing[tensor-tensordict-eager] 0.6876ms 49.0686μs 20.3796 KOps/s 20.7403 KOps/s $\color{#d91a1a}-1.74\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1014ms 29.6494μs 33.7275 KOps/s 35.5073 KOps/s $\textbf{\color{#d91a1a}-5.01\%}$
test_compile_indexing[tensor-tensorclass-eager] 90.9610μs 30.6873μs 32.5868 KOps/s 32.2175 KOps/s $\color{#35bf28}+1.15\%$
test_compile_indexing[tensor-pytree-compile] 89.4570μs 29.1739μs 34.2772 KOps/s 36.1667 KOps/s $\textbf{\color{#d91a1a}-5.22\%}$
test_compile_indexing[tensor-pytree-eager] 74.3390μs 30.2095μs 33.1022 KOps/s 32.0791 KOps/s $\color{#35bf28}+3.19\%$
test_compile_indexing[slice-tensordict-compile] 0.1493ms 74.1263μs 13.4905 KOps/s 13.9865 KOps/s $\color{#d91a1a}-3.55\%$
test_compile_indexing[slice-tensordict-eager] 0.4810ms 27.5960μs 36.2371 KOps/s 35.6280 KOps/s $\color{#35bf28}+1.71\%$
test_compile_indexing[slice-tensorclass-compile] 0.1362ms 67.5965μs 14.7937 KOps/s 14.9875 KOps/s $\color{#d91a1a}-1.29\%$
test_compile_indexing[slice-tensorclass-eager] 80.9320μs 24.6755μs 40.5260 KOps/s 40.0183 KOps/s $\color{#35bf28}+1.27\%$
test_compile_indexing[slice-pytree-compile] 0.1366ms 68.7803μs 14.5390 KOps/s 14.8858 KOps/s $\color{#d91a1a}-2.33\%$
test_compile_indexing[slice-pytree-eager] 87.4100μs 23.7950μs 42.0256 KOps/s 40.1481 KOps/s $\color{#35bf28}+4.68\%$
test_compile_indexing[int-tensordict-compile] 0.1630ms 74.5150μs 13.4201 KOps/s 14.1689 KOps/s $\textbf{\color{#d91a1a}-5.28\%}$
test_compile_indexing[int-tensordict-eager] 0.8566ms 28.0015μs 35.7123 KOps/s 35.5151 KOps/s $\color{#35bf28}+0.56\%$
test_compile_indexing[int-tensorclass-compile] 0.1373ms 66.6155μs 15.0115 KOps/s 14.9792 KOps/s $\color{#35bf28}+0.22\%$
test_compile_indexing[int-tensorclass-eager] 76.8640μs 23.9733μs 41.7130 KOps/s 39.7755 KOps/s $\color{#35bf28}+4.87\%$
test_compile_indexing[int-pytree-compile] 0.1410ms 68.1062μs 14.6830 KOps/s 14.7989 KOps/s $\color{#d91a1a}-0.78\%$
test_compile_indexing[int-pytree-eager] 96.7370μs 23.9240μs 41.7991 KOps/s 40.1230 KOps/s $\color{#35bf28}+4.18\%$
test_mod_add[eager] 79.3420μs 25.9158μs 38.5865 KOps/s 40.4335 KOps/s $\color{#d91a1a}-4.57\%$
test_mod_add[compile] 0.1008ms 38.1308μs 26.2255 KOps/s 27.9114 KOps/s $\textbf{\color{#d91a1a}-6.04\%}$
test_mod_add[compile-overhead] 0.1206ms 38.0815μs 26.2594 KOps/s 27.9664 KOps/s $\textbf{\color{#d91a1a}-6.10\%}$
test_mod_wrap[eager] 0.4360ms 0.2160ms 4.6297 KOps/s 4.7913 KOps/s $\color{#d91a1a}-3.37\%$
test_mod_wrap[compile] 1.7432ms 0.2336ms 4.2808 KOps/s 4.2914 KOps/s $\color{#d91a1a}-0.25\%$
test_mod_wrap[compile-overhead] 0.4201ms 0.2265ms 4.4151 KOps/s 4.4128 KOps/s $\color{#35bf28}+0.05\%$
test_mod_wrap_and_backward[eager] 17.3096ms 12.7882ms 78.1971 Ops/s 92.2091 Ops/s $\textbf{\color{#d91a1a}-15.20\%}$
test_mod_wrap_and_backward[compile] 14.7260ms 11.5685ms 86.4416 Ops/s 85.7060 Ops/s $\color{#35bf28}+0.86\%$
test_mod_wrap_and_backward[compile-overhead] 18.0968ms 11.7087ms 85.4063 Ops/s 85.4381 Ops/s $\color{#d91a1a}-0.04\%$
test_seq_add[eager] 0.1774ms 91.0680μs 10.9808 KOps/s 11.7742 KOps/s $\textbf{\color{#d91a1a}-6.74\%}$
test_seq_add[compile] 0.1596ms 62.4547μs 16.0116 KOps/s 16.2754 KOps/s $\color{#d91a1a}-1.62\%$
test_seq_add[compile-overhead] 0.1544ms 61.9439μs 16.1436 KOps/s 16.8881 KOps/s $\color{#d91a1a}-4.41\%$
test_seq_wrap[eager] 0.4958ms 0.3825ms 2.6146 KOps/s 2.6358 KOps/s $\color{#d91a1a}-0.80\%$
test_seq_wrap[compile] 0.4023ms 0.2639ms 3.7895 KOps/s 3.7617 KOps/s $\color{#35bf28}+0.74\%$
test_seq_wrap[compile-overhead] 0.6233ms 0.2623ms 3.8129 KOps/s 3.7737 KOps/s $\color{#35bf28}+1.04\%$
test_func_call_runtime[False-eager] 0.9264ms 0.5242ms 1.9076 KOps/s 1.9085 KOps/s $\color{#d91a1a}-0.05\%$
test_func_call_runtime[False-compile] 0.8347ms 0.4969ms 2.0124 KOps/s 1.9780 KOps/s $\color{#35bf28}+1.74\%$
test_func_call_runtime[False-compile-overhead] 0.6087ms 0.4902ms 2.0401 KOps/s 2.0360 KOps/s $\color{#35bf28}+0.20\%$
test_func_call_runtime[True-eager] 1.2069ms 0.7424ms 1.3471 KOps/s 1.3162 KOps/s $\color{#35bf28}+2.35\%$
test_func_call_runtime[True-compile] 0.6987ms 0.5031ms 1.9877 KOps/s 1.9695 KOps/s $\color{#35bf28}+0.92\%$
test_func_call_runtime[True-compile-overhead] 1.0138ms 0.5137ms 1.9465 KOps/s 1.9544 KOps/s $\color{#d91a1a}-0.40\%$
test_func_call_cm_runtime[False-eager] 0.8895ms 0.5367ms 1.8633 KOps/s 1.9010 KOps/s $\color{#d91a1a}-1.98\%$
test_func_call_cm_runtime[False-compile] 0.8583ms 0.4966ms 2.0137 KOps/s 2.0222 KOps/s $\color{#d91a1a}-0.42\%$
test_func_call_cm_runtime[False-compile-overhead] 0.9194ms 0.5124ms 1.9517 KOps/s 2.0070 KOps/s $\color{#d91a1a}-2.75\%$
test_func_call_cm_runtime[True-eager] 1.0309ms 0.8888ms 1.1251 KOps/s 1.1357 KOps/s $\color{#d91a1a}-0.93\%$
test_func_call_cm_runtime[True-compile] 0.9591ms 0.8364ms 1.1956 KOps/s 1.1889 KOps/s $\color{#35bf28}+0.57\%$
test_func_call_cm_runtime[True-compile-overhead] 1.4083ms 0.8485ms 1.1785 KOps/s 1.2178 KOps/s $\color{#d91a1a}-3.23\%$
test_distributed 0.2309ms 0.1291ms 7.7470 KOps/s 7.7164 KOps/s $\color{#35bf28}+0.40\%$
test_tdmodule 0.1118ms 18.7839μs 53.2371 KOps/s 58.0495 KOps/s $\textbf{\color{#d91a1a}-8.29\%}$
test_tdmodule_dispatch 63.7500μs 36.9392μs 27.0715 KOps/s 27.9120 KOps/s $\color{#d91a1a}-3.01\%$
test_tdseq 38.7530μs 19.3335μs 51.7237 KOps/s 55.6705 KOps/s $\textbf{\color{#d91a1a}-7.09\%}$
test_tdseq_dispatch 72.9370μs 41.1859μs 24.2802 KOps/s 25.9842 KOps/s $\textbf{\color{#d91a1a}-6.56\%}$
test_instantiation_functorch 2.5988ms 1.6440ms 608.2579 Ops/s 610.4911 Ops/s $\color{#d91a1a}-0.37\%$
test_instantiation_td 1.8002ms 1.1711ms 853.8966 Ops/s 867.0962 Ops/s $\color{#d91a1a}-1.52\%$
test_exec_functorch 0.3081ms 0.1834ms 5.4537 KOps/s 5.5605 KOps/s $\color{#d91a1a}-1.92\%$
test_exec_functional_call 0.4337ms 0.1769ms 5.6525 KOps/s 5.9190 KOps/s $\color{#d91a1a}-4.50\%$
test_exec_td 0.2740ms 0.1783ms 5.6072 KOps/s 6.1399 KOps/s $\textbf{\color{#d91a1a}-8.68\%}$
test_exec_td_decorator 0.4364ms 0.2352ms 4.2523 KOps/s 4.5975 KOps/s $\textbf{\color{#d91a1a}-7.51\%}$
test_vmap_mlp_speed[True-True] 0.9979ms 0.5837ms 1.7132 KOps/s 1.7217 KOps/s $\color{#d91a1a}-0.49\%$
test_vmap_mlp_speed[True-False] 0.8548ms 0.5859ms 1.7067 KOps/s 1.7744 KOps/s $\color{#d91a1a}-3.82\%$
test_vmap_mlp_speed[False-True] 0.5987ms 0.4809ms 2.0795 KOps/s 2.0966 KOps/s $\color{#d91a1a}-0.82\%$
test_vmap_mlp_speed[False-False] 0.6836ms 0.4761ms 2.1006 KOps/s 2.1051 KOps/s $\color{#d91a1a}-0.22\%$
test_vmap_mlp_speed_decorator[True-True] 1.4037ms 0.6276ms 1.5934 KOps/s 1.5680 KOps/s $\color{#35bf28}+1.62\%$
test_vmap_mlp_speed_decorator[True-False] 0.8114ms 0.6323ms 1.5814 KOps/s 1.5815 KOps/s $-0.00\%$
test_vmap_mlp_speed_decorator[False-True] 0.7655ms 0.5099ms 1.9612 KOps/s 1.9208 KOps/s $\color{#35bf28}+2.10\%$
test_vmap_mlp_speed_decorator[False-False] 0.8735ms 0.5230ms 1.9120 KOps/s 1.9137 KOps/s $\color{#d91a1a}-0.09\%$
test_to_module_speed[True] 2.1547ms 1.3257ms 754.3450 Ops/s 762.3985 Ops/s $\color{#d91a1a}-1.06\%$
test_to_module_speed[False] 1.9147ms 1.2922ms 773.8783 Ops/s 803.3222 Ops/s $\color{#d91a1a}-3.67\%$
test_tc_init 75.0100μs 43.7404μs 22.8622 KOps/s 24.0358 KOps/s $\color{#d91a1a}-4.88\%$
test_tc_init_nested 0.1526ms 81.6331μs 12.2499 KOps/s 12.2176 KOps/s $\color{#35bf28}+0.26\%$
test_tc_first_layer_tensor 22.8130μs 1.4896μs 671.3391 KOps/s 681.7609 KOps/s $\color{#d91a1a}-1.53\%$
test_tc_first_layer_nontensor 31.2080μs 4.3057μs 232.2523 KOps/s 241.0800 KOps/s $\color{#d91a1a}-3.66\%$
test_tc_second_layer_tensor 39.9240μs 2.6911μs 371.5917 KOps/s 376.6347 KOps/s $\color{#d91a1a}-1.34\%$
test_tc_second_layer_nontensor 23.8040μs 5.3613μs 186.5202 KOps/s 189.8637 KOps/s $\color{#d91a1a}-1.76\%$
test_unbind 0.4499s 13.5578ms 73.7582 Ops/s 77.8211 Ops/s $\textbf{\color{#d91a1a}-5.22\%}$
test_full_like 9.1547ms 7.1727ms 139.4179 Ops/s 141.2983 Ops/s $\color{#d91a1a}-1.33\%$
test_zeros_like 10.5868ms 6.4270ms 155.5925 Ops/s 134.0806 Ops/s $\textbf{\color{#35bf28}+16.04\%}$
test_ones_like 13.0813ms 7.6339ms 130.9942 Ops/s 123.6974 Ops/s $\textbf{\color{#35bf28}+5.90\%}$
test_clone 15.4455ms 9.0470ms 110.5342 Ops/s 110.4748 Ops/s $\color{#35bf28}+0.05\%$
test_squeeze 73.1260μs 13.3245μs 75.0500 KOps/s 77.5264 KOps/s $\color{#d91a1a}-3.19\%$
test_unsqueeze 0.1682ms 93.5248μs 10.6924 KOps/s 10.8410 KOps/s $\color{#d91a1a}-1.37\%$
test_split 0.4576ms 0.1997ms 5.0085 KOps/s 5.0036 KOps/s $\color{#35bf28}+0.10\%$
test_permute 0.4866ms 0.2221ms 4.5027 KOps/s 4.5921 KOps/s $\color{#d91a1a}-1.95\%$
test_stack 29.4656ms 24.0781ms 41.5315 Ops/s 40.7821 Ops/s $\color{#35bf28}+1.84\%$
test_cat 32.6967ms 24.1859ms 41.3464 Ops/s 40.8805 Ops/s $\color{#35bf28}+1.14\%$
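
For anyone re-checking the table, the Change column appears to be the relative difference between the Ops column (this PR) and Ops on Repo HEAD. The sketch below re-derives one row under that assumption. The helper names are hypothetical and are not taken from the benchmark workflow, and the 5% threshold for flagging a row is inferred from which rows are bold in the table, not from the bot's source.

```python
# Hypothetical helpers for re-deriving the "Change" column; these names are
# illustrative and are not taken from the benchmark workflow itself.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}

# Assumption: a row is flagged (bold) when the change exceeds 5% in magnitude.
# This is inferred from the table, not from the bot's source.
THRESHOLD_PERCENT = 5.0


def to_ops_per_second(value: float, unit: str) -> float:
    """Convert a throughput such as 444.9459 KOps/s to plain Ops/s."""
    return value * UNIT_SCALE[unit]


def relative_change(pr_ops: float, head_ops: float) -> float:
    """Percent change of the PR throughput relative to repo HEAD."""
    return (pr_ops - head_ops) / head_ops * 100.0


def classify(change_percent: float) -> str:
    """Label a change using the assumed +/-5% threshold."""
    if change_percent > THRESHOLD_PERCENT:
        return "improved"
    if change_percent < -THRESHOLD_PERCENT:
        return "worsened"
    return "unchanged"


# test_values row from the CPU table: 444.9459 KOps/s vs 851.4352 KOps/s on HEAD.
values_change = relative_change(
    to_ops_per_second(444.9459, "KOps/s"),
    to_ops_per_second(851.4352, "KOps/s"),
)
print(f"test_values: {values_change:+.2f}% ({classify(values_change)})")
```

Under these assumptions the computed value matches the -47.74% reported for test_values, and the same formula gives +223.96% for test_membership_stacked_nested_last.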

github-actions bot commented Aug 9, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 225. Improved: 47. Worsened: 4.

Name | Max | Mean | Ops | Ops on Repo HEAD | Change
test_plain_set_nested 88.7720μs 16.4811μs 60.6757 KOps/s 57.4391 KOps/s $\textbf{\color{#35bf28}+5.63\%}$
test_plain_set_stack_nested 42.1510μs 16.5407μs 60.4570 KOps/s 57.0619 KOps/s $\textbf{\color{#35bf28}+5.95\%}$
test_plain_set_nested_inplace 36.5210μs 17.5936μs 56.8390 KOps/s 53.9215 KOps/s $\textbf{\color{#35bf28}+5.41\%}$
test_plain_set_stack_nested_inplace 36.7500μs 17.5913μs 56.8462 KOps/s 54.0373 KOps/s $\textbf{\color{#35bf28}+5.20\%}$
test_items 21.4710μs 4.7121μs 212.2187 KOps/s 212.9091 KOps/s $\color{#d91a1a}-0.32\%$
test_items_nested 0.3992ms 0.3605ms 2.7740 KOps/s 2.7604 KOps/s $\color{#35bf28}+0.49\%$
test_items_nested_locked 0.4018ms 0.3605ms 2.7742 KOps/s 2.7133 KOps/s $\color{#35bf28}+2.25\%$
test_items_nested_leaf 0.1090ms 86.9033μs 11.5070 KOps/s 11.8862 KOps/s $\color{#d91a1a}-3.19\%$
test_items_stack_nested 0.4021ms 0.3610ms 2.7701 KOps/s 2.7128 KOps/s $\color{#35bf28}+2.11\%$
test_items_stack_nested_leaf 0.1091ms 84.2656μs 11.8672 KOps/s 11.7385 KOps/s $\color{#35bf28}+1.10\%$
test_items_stack_nested_locked 0.3993ms 0.3647ms 2.7417 KOps/s 2.7169 KOps/s $\color{#35bf28}+0.91\%$
test_keys 26.0790μs 4.3448μs 230.1590 KOps/s 228.3803 KOps/s $\color{#35bf28}+0.78\%$
test_keys_nested 90.0610μs 67.4461μs 14.8267 KOps/s 14.9945 KOps/s $\color{#d91a1a}-1.12\%$
test_keys_nested_locked 2.4560ms 72.2595μs 13.8390 KOps/s 13.9412 KOps/s $\color{#d91a1a}-0.73\%$
test_keys_nested_leaf 76.1110μs 57.0947μs 17.5148 KOps/s 17.6999 KOps/s $\color{#d91a1a}-1.05\%$
test_keys_stack_nested 87.7400μs 67.5043μs 14.8139 KOps/s 15.1523 KOps/s $\color{#d91a1a}-2.23\%$
test_keys_stack_nested_leaf 79.4010μs 57.5890μs 17.3644 KOps/s 17.7699 KOps/s $\color{#d91a1a}-2.28\%$
test_keys_stack_nested_locked 88.8120μs 72.3060μs 13.8301 KOps/s 14.0993 KOps/s $\color{#d91a1a}-1.91\%$
test_values 10.5705μs 1.7791μs 562.0664 KOps/s 569.3669 KOps/s $\color{#d91a1a}-1.28\%$
test_values_nested 58.7410μs 33.7805μs 29.6029 KOps/s 29.5694 KOps/s $\color{#35bf28}+0.11\%$
test_values_nested_locked 56.1620μs 35.5087μs 28.1621 KOps/s 27.8615 KOps/s $\color{#35bf28}+1.08\%$
test_values_nested_leaf 52.2900μs 29.9729μs 33.3635 KOps/s 32.9954 KOps/s $\color{#35bf28}+1.12\%$
test_values_stack_nested 58.6010μs 34.0386μs 29.3784 KOps/s 29.1563 KOps/s $\color{#35bf28}+0.76\%$
test_values_stack_nested_leaf 50.9910μs 30.2465μs 33.0616 KOps/s 32.6745 KOps/s $\color{#35bf28}+1.18\%$
test_values_stack_nested_locked 56.6510μs 35.9864μs 27.7883 KOps/s 27.7168 KOps/s $\color{#35bf28}+0.26\%$
test_membership 17.2810μs 0.6593μs 1.5168 MOps/s 1.8595 MOps/s $\textbf{\color{#d91a1a}-18.43\%}$
test_membership_nested 29.9510μs 2.0171μs 495.7548 KOps/s 516.6183 KOps/s $\color{#d91a1a}-4.04\%$
test_membership_nested_leaf 13.4555μs 1.9497μs 512.8884 KOps/s 522.9499 KOps/s $\color{#d91a1a}-1.92\%$
test_membership_stacked_nested 22.6610μs 2.0384μs 490.5771 KOps/s 506.4207 KOps/s $\color{#d91a1a}-3.13\%$
test_membership_stacked_nested_leaf 31.7200μs 1.9888μs 502.8214 KOps/s 513.0577 KOps/s $\color{#d91a1a}-2.00\%$
test_membership_nested_last 34.4420μs 2.9392μs 340.2255 KOps/s 338.0408 KOps/s $\color{#35bf28}+0.65\%$
test_membership_nested_leaf_last 24.2600μs 2.9788μs 335.7051 KOps/s 340.2811 KOps/s $\color{#d91a1a}-1.34\%$
test_membership_stacked_nested_last 38.5110μs 2.9359μs 340.6112 KOps/s 334.7437 KOps/s $\color{#35bf28}+1.75\%$
test_membership_stacked_nested_leaf_last 18.9600μs 2.9028μs 344.4967 KOps/s 341.7779 KOps/s $\color{#35bf28}+0.80\%$
test_nested_getleaf 34.7600μs 7.8620μs 127.1943 KOps/s 127.6049 KOps/s $\color{#d91a1a}-0.32\%$
test_nested_get 29.6810μs 7.3846μs 135.4164 KOps/s 136.3190 KOps/s $\color{#d91a1a}-0.66\%$
test_stacked_getleaf 36.3400μs 7.8493μs 127.3996 KOps/s 127.8768 KOps/s $\color{#d91a1a}-0.37\%$
test_stacked_get 23.9910μs 7.3897μs 135.3233 KOps/s 136.5724 KOps/s $\color{#d91a1a}-0.91\%$
test_nested_getitemleaf 23.6110μs 8.0835μs 123.7095 KOps/s 123.4178 KOps/s $\color{#35bf28}+0.24\%$
test_nested_getitem 25.1000μs 7.6567μs 130.6047 KOps/s 130.8080 KOps/s $\color{#d91a1a}-0.16\%$
test_stacked_getitemleaf 34.2110μs 8.0971μs 123.5005 KOps/s 123.4172 KOps/s $\color{#35bf28}+0.07\%$
test_stacked_getitem 30.0500μs 7.6103μs 131.4012 KOps/s 131.8058 KOps/s $\color{#d91a1a}-0.31\%$
test_lock_nested 7.4554ms 0.4725ms 2.1166 KOps/s 2.0975 KOps/s $\color{#35bf28}+0.91\%$
test_lock_stack_nested 0.4842ms 0.4321ms 2.3141 KOps/s 2.2640 KOps/s $\color{#35bf28}+2.21\%$
test_unlock_nested 0.8820ms 0.3815ms 2.6212 KOps/s 2.5091 KOps/s $\color{#35bf28}+4.47\%$
test_unlock_stack_nested 0.3844ms 0.3497ms 2.8592 KOps/s 2.7583 KOps/s $\color{#35bf28}+3.66\%$
test_flatten_speed 0.5054ms 0.1038ms 9.6312 KOps/s 9.5787 KOps/s $\color{#35bf28}+0.55\%$
test_unflatten_speed 0.3638ms 0.3169ms 3.1552 KOps/s 3.1430 KOps/s $\color{#35bf28}+0.39\%$
test_common_ops 1.5976ms 1.3559ms 737.4926 Ops/s 709.6307 Ops/s $\color{#35bf28}+3.93\%$
test_creation 20.0100μs 1.6344μs 611.8436 KOps/s 605.7707 KOps/s $\color{#35bf28}+1.00\%$
test_creation_empty 38.2810μs 16.1998μs 61.7290 KOps/s 54.4259 KOps/s $\textbf{\color{#35bf28}+13.42\%}$
test_creation_nested_1 1.0858ms 18.3865μs 54.3877 KOps/s 50.2701 KOps/s $\textbf{\color{#35bf28}+8.19\%}$
test_creation_nested_2 47.5410μs 21.2142μs 47.1383 KOps/s 42.8106 KOps/s $\textbf{\color{#35bf28}+10.11\%}$
test_clone 54.9620μs 29.8092μs 33.5467 KOps/s 30.3126 KOps/s $\textbf{\color{#35bf28}+10.67\%}$
test_getitem[int] 1.1760ms 17.6444μs 56.6752 KOps/s 53.1296 KOps/s $\textbf{\color{#35bf28}+6.67\%}$
test_getitem[slice_int] 0.1520ms 29.7492μs 33.6143 KOps/s 31.9003 KOps/s $\textbf{\color{#35bf28}+5.37\%}$
test_getitem[range] 0.2947ms 0.1200ms 8.3366 KOps/s 8.5380 KOps/s $\color{#d91a1a}-2.36\%$
test_getitem[tuple] 0.1508ms 26.8672μs 37.2201 KOps/s 36.7824 KOps/s $\color{#35bf28}+1.19\%$
test_getitem[list] 0.2300ms 0.1078ms 9.2804 KOps/s 9.3285 KOps/s $\color{#d91a1a}-0.52\%$
test_setitem_dim[int] 76.6220μs 55.0845μs 18.1539 KOps/s 16.9365 KOps/s $\textbf{\color{#35bf28}+7.19\%}$
test_setitem_dim[slice_int] 0.1042ms 80.1279μs 12.4801 KOps/s 12.0336 KOps/s $\color{#35bf28}+3.71\%$
test_setitem_dim[range] 0.1899ms 0.1453ms 6.8832 KOps/s 6.8413 KOps/s $\color{#35bf28}+0.61\%$
test_setitem_dim[tuple] 0.1017ms 77.0709μs 12.9751 KOps/s 13.1575 KOps/s $\color{#d91a1a}-1.39\%$
test_setitem 79.9320μs 45.1690μs 22.1391 KOps/s 21.4140 KOps/s $\color{#35bf28}+3.39\%$
test_set 78.4220μs 43.8273μs 22.8168 KOps/s 21.7069 KOps/s $\textbf{\color{#35bf28}+5.11\%}$
test_set_shared 92.8241ms 63.5737μs 15.7298 KOps/s 17.3249 KOps/s $\textbf{\color{#d91a1a}-9.21\%}$
test_update 89.5320μs 53.0263μs 18.8586 KOps/s 17.7105 KOps/s $\textbf{\color{#35bf28}+6.48\%}$
test_update_nested 90.4520μs 59.8253μs 16.7153 KOps/s 15.1220 KOps/s $\textbf{\color{#35bf28}+10.54\%}$
test_update__nested 98.7710μs 61.0891μs 16.3695 KOps/s 14.6318 KOps/s $\textbf{\color{#35bf28}+11.88\%}$
test_set_nested 88.5910μs 46.4047μs 21.5495 KOps/s 20.4075 KOps/s $\textbf{\color{#35bf28}+5.60\%}$
test_set_nested_new 85.2110μs 47.9680μs 20.8472 KOps/s 17.9946 KOps/s $\textbf{\color{#35bf28}+15.85\%}$
test_select 0.1227ms 64.7519μs 15.4436 KOps/s 14.6730 KOps/s $\textbf{\color{#35bf28}+5.25\%}$
test_select_nested 0.4925ms 52.4386μs 19.0699 KOps/s 19.3167 KOps/s $\color{#d91a1a}-1.28\%$
test_exclude_nested 98.8910μs 69.3698μs 14.4155 KOps/s 14.3987 KOps/s $\color{#35bf28}+0.12\%$
test_empty[True] 0.3315ms 0.2829ms 3.5352 KOps/s 3.5181 KOps/s $\color{#35bf28}+0.48\%$
test_empty[False] 2.9071μs 0.8555μs 1.1689 MOps/s 1.1538 MOps/s $\color{#35bf28}+1.31\%$
test_to 55.5110μs 27.0267μs 37.0005 KOps/s 35.3081 KOps/s $\color{#35bf28}+4.79\%$
test_to_nonblocking 54.2900μs 26.3348μs 37.9725 KOps/s 37.7068 KOps/s $\color{#35bf28}+0.70\%$
test_unbind_speed 1.2973ms 0.2979ms 3.3573 KOps/s 3.2047 KOps/s $\color{#35bf28}+4.76\%$
test_unbind_speed_stack0 0.3442ms 0.2958ms 3.3805 KOps/s 3.2325 KOps/s $\color{#35bf28}+4.58\%$
test_unbind_speed_stack1 91.7662ms 0.7696ms 1.2994 KOps/s 1.2594 KOps/s $\color{#35bf28}+3.17\%$
test_split 92.9094ms 2.3549ms 424.6432 Ops/s 411.7045 Ops/s $\color{#35bf28}+3.14\%$
test_chunk 2.2977ms 2.1641ms 462.0962 Ops/s 411.2591 Ops/s $\textbf{\color{#35bf28}+12.36\%}$
test_creation[device0] 0.1639ms 0.1104ms 9.0606 KOps/s 9.2341 KOps/s $\color{#d91a1a}-1.88\%$
test_creation_from_tensor 0.1664ms 0.1082ms 9.2400 KOps/s 9.4295 KOps/s $\color{#d91a1a}-2.01\%$
test_add_one[memmap_tensor0] 0.1512ms 9.4687μs 105.6106 KOps/s 99.7784 KOps/s $\textbf{\color{#35bf28}+5.85\%}$
test_contiguous[memmap_tensor0] 24.4200μs 2.2703μs 440.4676 KOps/s 424.9915 KOps/s $\color{#35bf28}+3.64\%$
test_stack[memmap_tensor0] 37.6600μs 6.9396μs 144.1005 KOps/s 135.5578 KOps/s $\textbf{\color{#35bf28}+6.30\%}$
test_memmaptd_index 1.2806ms 0.4546ms 2.1999 KOps/s 2.1374 KOps/s $\color{#35bf28}+2.92\%$
test_memmaptd_index_astensor 97.1417ms 0.6090ms 1.6419 KOps/s 1.8697 KOps/s $\textbf{\color{#d91a1a}-12.18\%}$
test_memmaptd_index_op 1.4804ms 1.0898ms 917.5655 Ops/s 867.4427 Ops/s $\textbf{\color{#35bf28}+5.78\%}$
test_serialize_model 91.9756ms 88.9556ms 11.2416 Ops/s 10.8865 Ops/s $\color{#35bf28}+3.26\%$
test_serialize_model_pickle 1.3488s 1.2366s 0.8087 Ops/s 0.8080 Ops/s $\color{#35bf28}+0.08\%$
test_serialize_weights 89.8306ms 85.7310ms 11.6644 Ops/s 11.0766 Ops/s $\textbf{\color{#35bf28}+5.31\%}$
test_serialize_weights_returnearly 58.5255ms 53.0578ms 18.8474 Ops/s 14.9228 Ops/s $\textbf{\color{#35bf28}+26.30\%}$
test_serialize_weights_pickle 1.3508s 1.2375s 0.8081 Ops/s 0.8083 Ops/s $\color{#d91a1a}-0.03\%$
test_reshape_pytree 63.3320μs 38.3906μs 26.0481 KOps/s 24.7657 KOps/s $\textbf{\color{#35bf28}+5.18\%}$
test_reshape_td 0.2486ms 43.6455μs 22.9119 KOps/s 21.8843 KOps/s $\color{#35bf28}+4.70\%$
test_view_pytree 61.7710μs 37.6279μs 26.5761 KOps/s 25.3308 KOps/s $\color{#35bf28}+4.92\%$
test_view_td 0.2593ms 47.9099μs 20.8725 KOps/s 18.9791 KOps/s $\textbf{\color{#35bf28}+9.98\%}$
test_unbind_pytree 68.4820μs 37.0180μs 27.0139 KOps/s 25.8622 KOps/s $\color{#35bf28}+4.45\%$
test_unbind_td 0.3654ms 45.9114μs 21.7811 KOps/s 20.6732 KOps/s $\textbf{\color{#35bf28}+5.36\%}$
test_split_pytree 76.9310μs 50.3647μs 19.8552 KOps/s 18.2497 KOps/s $\textbf{\color{#35bf28}+8.80\%}$
test_split_td 0.4517ms 58.4309μs 17.1142 KOps/s 13.4623 KOps/s $\textbf{\color{#35bf28}+27.13\%}$
test_add_pytree 89.4720μs 59.1530μs 16.9053 KOps/s 15.3729 KOps/s $\textbf{\color{#35bf28}+9.97\%}$
test_add_td 0.3129ms 95.9315μs 10.4241 KOps/s 9.4713 KOps/s $\textbf{\color{#35bf28}+10.06\%}$
test_compile_add_one_nested[tensordict-compile] 0.4244ms 0.2203ms 4.5390 KOps/s 4.4948 KOps/s $\color{#35bf28}+0.98\%$
test_compile_add_one_nested[tensordict-eager] 0.2664ms 0.1746ms 5.7283 KOps/s 5.5315 KOps/s $\color{#35bf28}+3.56\%$
test_compile_add_one_nested[pytree-compile] 0.1936ms 0.1552ms 6.4437 KOps/s 6.3200 KOps/s $\color{#35bf28}+1.96\%$
test_compile_add_one_nested[pytree-eager] 0.2466ms 0.1929ms 5.1848 KOps/s 4.9598 KOps/s $\color{#35bf28}+4.54\%$
test_compile_copy_nested[tensordict-compile] 59.1220μs 22.6339μs 44.1815 KOps/s 43.7496 KOps/s $\color{#35bf28}+0.99\%$
test_compile_copy_nested[tensordict-eager] 79.8410μs 48.0632μs 20.8059 KOps/s 19.9822 KOps/s $\color{#35bf28}+4.12\%$
test_compile_copy_nested[pytree-compile] 0.1108ms 74.7653μs 13.3752 KOps/s 13.5879 KOps/s $\color{#d91a1a}-1.57\%$
test_compile_copy_nested[pytree-eager] 84.0420μs 59.3263μs 16.8559 KOps/s 16.8789 KOps/s $\color{#d91a1a}-0.14\%$
test_compile_add_one_flat[tensordict-compile] 0.5053ms 0.3463ms 2.8873 KOps/s 2.9084 KOps/s $\color{#d91a1a}-0.73\%$
test_compile_add_one_flat[tensordict-eager] 0.2803ms 0.2219ms 4.5055 KOps/s 4.3551 KOps/s $\color{#35bf28}+3.45\%$
test_compile_add_one_flat[tensorclass-compile] 0.1873ms 0.1384ms 7.2261 KOps/s 7.2020 KOps/s $\color{#35bf28}+0.34\%$
test_compile_add_one_flat[tensorclass-eager] 0.1230ms 64.6867μs 15.4591 KOps/s 15.2609 KOps/s $\color{#35bf28}+1.30\%$
test_compile_add_one_flat[pytree-compile] 0.3958ms 0.3449ms 2.8994 KOps/s 2.9019 KOps/s $\color{#d91a1a}-0.09\%$
test_compile_add_one_flat[pytree-eager] 0.7147ms 0.6400ms 1.5625 KOps/s 1.5011 KOps/s $\color{#35bf28}+4.09\%$
test_compile_add_self_flat[tensordict-eager] 0.3122ms 0.2691ms 3.7167 KOps/s 3.6184 KOps/s $\color{#35bf28}+2.71\%$
test_compile_add_self_flat[tensordict-compile] 0.3945ms 0.3468ms 2.8834 KOps/s 2.8866 KOps/s $\color{#d91a1a}-0.11\%$
test_compile_add_self_flat[tensorclass-eager] 0.1751ms 76.9582μs 12.9941 KOps/s 12.8248 KOps/s $\color{#35bf28}+1.32\%$
test_compile_add_self_flat[tensorclass-compile] 0.1934ms 0.1440ms 6.9435 KOps/s 7.1761 KOps/s $\color{#d91a1a}-3.24\%$
test_compile_add_self_flat[pytree-eager] 0.6705ms 0.5424ms 1.8436 KOps/s 1.7574 KOps/s $\color{#35bf28}+4.91\%$
test_compile_add_self_flat[pytree-compile] 0.4340ms 0.3445ms 2.9030 KOps/s 2.8976 KOps/s $\color{#35bf28}+0.19\%$
test_compile_copy_flat[tensordict-compile] 48.3510μs 19.9274μs 50.1822 KOps/s 49.0181 KOps/s $\color{#35bf28}+2.37\%$
test_compile_copy_flat[tensordict-eager] 64.7410μs 31.6887μs 31.5570 KOps/s 30.4326 KOps/s $\color{#35bf28}+3.69\%$
test_compile_copy_flat[pytree-compile] 0.1136ms 76.8512μs 13.0122 KOps/s 12.9645 KOps/s $\color{#35bf28}+0.37\%$
test_compile_copy_flat[pytree-eager] 87.4720μs 60.5115μs 16.5258 KOps/s 16.4745 KOps/s $\color{#35bf28}+0.31\%$
test_compile_assign_and_add[tensordict-compile] 2.4807ms 0.8786ms 1.1382 KOps/s 1.0578 KOps/s $\textbf{\color{#35bf28}+7.60\%}$
test_compile_assign_and_add[tensordict-eager] 3.6026ms 3.3976ms 294.3232 Ops/s 285.1702 Ops/s $\color{#35bf28}+3.21\%$
test_compile_assign_and_add[pytree-compile] 2.5032ms 0.8813ms 1.1347 KOps/s 1.0700 KOps/s $\textbf{\color{#35bf28}+6.05\%}$
test_compile_assign_and_add[pytree-eager] 3.4935ms 3.4550ms 289.4320 Ops/s 281.3667 Ops/s $\color{#35bf28}+2.87\%$
test_compile_indexing[tensor-tensordict-compile] 0.1814ms 0.1184ms 8.4445 KOps/s 8.3998 KOps/s $\color{#35bf28}+0.53\%$
test_compile_indexing[tensor-tensordict-eager] 0.2250ms 63.6984μs 15.6990 KOps/s 15.1795 KOps/s $\color{#35bf28}+3.42\%$
test_compile_indexing[tensor-tensorclass-compile] 0.1440ms 0.1110ms 9.0052 KOps/s 8.9459 KOps/s $\color{#35bf28}+0.66\%$
test_compile_indexing[tensor-tensorclass-eager] 0.1837ms 46.5963μs 21.4609 KOps/s 20.9719 KOps/s $\color{#35bf28}+2.33\%$
test_compile_indexing[tensor-pytree-compile] 0.1373ms 0.1107ms 9.0308 KOps/s 8.6678 KOps/s $\color{#35bf28}+4.19\%$
test_compile_indexing[tensor-pytree-eager] 90.1720μs 47.3374μs 21.1249 KOps/s 19.4909 KOps/s $\textbf{\color{#35bf28}+8.38\%}$
test_compile_indexing[slice-tensordict-compile] 0.1880ms 0.1481ms 6.7537 KOps/s 6.7029 KOps/s $\color{#35bf28}+0.76\%$
test_compile_indexing[slice-tensordict-eager] 0.1727ms 26.5483μs 37.6672 KOps/s 35.7572 KOps/s $\textbf{\color{#35bf28}+5.34\%}$
test_compile_indexing[slice-tensorclass-compile] 0.1705ms 0.1404ms 7.1220 KOps/s 7.1204 KOps/s $\color{#35bf28}+0.02\%$
test_compile_indexing[slice-tensorclass-eager] 46.6810μs 22.9661μs 43.5424 KOps/s 41.9504 KOps/s $\color{#35bf28}+3.80\%$
test_compile_indexing[slice-pytree-compile] 0.2396ms 0.1400ms 7.1403 KOps/s 7.0138 KOps/s $\color{#35bf28}+1.80\%$
test_compile_indexing[slice-pytree-eager] 51.5500μs 22.7697μs 43.9180 KOps/s 42.4636 KOps/s $\color{#35bf28}+3.43\%$
test_compile_indexing[int-tensordict-compile] 0.1838ms 0.1488ms 6.7221 KOps/s 6.7328 KOps/s $\color{#d91a1a}-0.16\%$
test_compile_indexing[int-tensordict-eager] 0.4862ms 26.0994μs 38.3151 KOps/s 37.1112 KOps/s $\color{#35bf28}+3.24\%$
test_compile_indexing[int-tensorclass-compile] 0.2480ms 0.1397ms 7.1558 KOps/s 7.1301 KOps/s $\color{#35bf28}+0.36\%$
test_compile_indexing[int-tensorclass-eager] 47.0810μs 23.0887μs 43.3112 KOps/s 42.7477 KOps/s $\color{#35bf28}+1.32\%$
test_compile_indexing[int-pytree-compile] 0.1830ms 0.1397ms 7.1571 KOps/s 7.1386 KOps/s $\color{#35bf28}+0.26\%$
test_compile_indexing[int-pytree-eager] 45.1910μs 22.6792μs 44.0933 KOps/s 42.2548 KOps/s $\color{#35bf28}+4.35\%$
test_mod_add[eager] 67.8810μs 33.0774μs 30.2321 KOps/s 28.8900 KOps/s $\color{#35bf28}+4.65\%$
test_mod_add[compile] 0.1034ms 73.9123μs 13.5296 KOps/s 12.9670 KOps/s $\color{#35bf28}+4.34\%$
test_mod_add[compile-overhead] 0.2676ms 0.1447ms 6.9091 KOps/s 6.3987 KOps/s $\textbf{\color{#35bf28}+7.98\%}$
test_mod_wrap[eager] 0.3517ms 0.2510ms 3.9834 KOps/s 3.6498 KOps/s $\textbf{\color{#35bf28}+9.14\%}$
test_mod_wrap[compile] 1.2023ms 0.2995ms 3.3393 KOps/s 3.1106 KOps/s $\textbf{\color{#35bf28}+7.35\%}$
test_mod_wrap[compile-overhead] 8.2204ms 4.3193ms 231.5172 Ops/s 233.6075 Ops/s $\color{#d91a1a}-0.89\%$
test_mod_wrap_and_backward[eager] 1.5236ms 1.3898ms 719.5095 Ops/s 674.1418 Ops/s $\textbf{\color{#35bf28}+6.73\%}$
test_mod_wrap_and_backward[compile] 1.7028ms 1.3714ms 729.1751 Ops/s 665.9277 Ops/s $\textbf{\color{#35bf28}+9.50\%}$
test_mod_wrap_and_backward[compile-overhead] 1.4380ms 0.9983ms 1.0017 KOps/s 959.7093 Ops/s $\color{#35bf28}+4.37\%$
test_seq_add[eager] 0.1594ms 0.1032ms 9.6885 KOps/s 8.8873 KOps/s $\textbf{\color{#35bf28}+9.02\%}$
test_seq_add[compile] 0.1456ms 87.0193μs 11.4917 KOps/s 11.3781 KOps/s $\color{#35bf28}+1.00\%$
test_seq_add[compile-overhead] 0.1595ms 0.1235ms 8.0952 KOps/s 7.9855 KOps/s $\color{#35bf28}+1.37\%$
test_seq_wrap[eager] 0.4671ms 0.3976ms 2.5149 KOps/s 2.4534 KOps/s $\color{#35bf28}+2.51\%$
test_seq_wrap[compile] 0.3811ms 0.3246ms 3.0808 KOps/s 2.9971 KOps/s $\color{#35bf28}+2.79\%$
test_seq_wrap[compile-overhead] 0.2989ms 0.2375ms 4.2097 KOps/s 4.1618 KOps/s $\color{#35bf28}+1.15\%$
test_func_call_runtime[False-eager] 0.8490ms 0.7582ms 1.3189 KOps/s 1.2462 KOps/s $\textbf{\color{#35bf28}+5.83\%}$
test_func_call_runtime[False-compile] 0.9139ms 0.8030ms 1.2453 KOps/s 1.1902 KOps/s $\color{#35bf28}+4.63\%$
test_func_call_runtime[False-compile-overhead] 0.4217ms 0.3828ms 2.6121 KOps/s 2.5870 KOps/s $\color{#35bf28}+0.97\%$
test_func_call_runtime[True-eager] 1.0709ms 0.9401ms 1.0637 KOps/s 1.0366 KOps/s $\color{#35bf28}+2.62\%$
test_func_call_runtime[True-compile] 0.9291ms 0.8509ms 1.1752 KOps/s 1.1305 KOps/s $\color{#35bf28}+3.96\%$
test_func_call_runtime[True-compile-overhead] 0.4777ms 0.4240ms 2.3587 KOps/s 2.3184 KOps/s $\color{#35bf28}+1.73\%$
test_func_call_cm_runtime[False-eager] 0.8739ms 0.7522ms 1.3295 KOps/s 1.3048 KOps/s $\color{#35bf28}+1.89\%$
test_func_call_cm_runtime[False-compile] 0.8843ms 0.8035ms 1.2445 KOps/s 1.1867 KOps/s $\color{#35bf28}+4.87\%$
test_func_call_cm_runtime[False-compile-overhead] 0.4291ms 0.3845ms 2.6007 KOps/s 2.5950 KOps/s $\color{#35bf28}+0.22\%$
test_func_call_cm_runtime[True-eager] 1.1784ms 1.0482ms 954.0519 Ops/s 925.8179 Ops/s $\color{#35bf28}+3.05\%$
test_func_call_cm_runtime[True-compile] 1.1282ms 1.0275ms 973.2354 Ops/s 947.3746 Ops/s $\color{#35bf28}+2.73\%$
test_func_call_cm_runtime[True-compile-overhead] 1.1373ms 1.0334ms 967.6626 Ops/s 946.6131 Ops/s $\color{#35bf28}+2.22\%$
test_distributed 2.2169ms 71.0611μs 14.0724 KOps/s 14.3982 KOps/s $\color{#d91a1a}-2.26\%$
test_tdmodule 0.1306ms 16.1199μs 62.0352 KOps/s 57.3812 KOps/s $\textbf{\color{#35bf28}+8.11\%}$
test_tdmodule_dispatch 51.1400μs 33.1497μs 30.1661 KOps/s 29.3308 KOps/s $\color{#35bf28}+2.85\%$
test_tdseq 31.4410μs 16.3116μs 61.3059 KOps/s 55.3768 KOps/s $\textbf{\color{#35bf28}+10.71\%}$
test_tdseq_dispatch 62.4510μs 34.5900μs 28.9101 KOps/s 27.4430 KOps/s $\textbf{\color{#35bf28}+5.35\%}$
test_instantiation_functorch 2.1484ms 2.0231ms 494.2865 Ops/s 480.7596 Ops/s $\color{#35bf28}+2.81\%$
test_instantiation_td 1.9956ms 1.3137ms 761.2191 Ops/s 741.9893 Ops/s $\color{#35bf28}+2.59\%$
test_exec_functorch 0.2882ms 0.2332ms 4.2877 KOps/s 4.3083 KOps/s $\color{#d91a1a}-0.48\%$
test_exec_functional_call 0.2756ms 0.2245ms 4.4544 KOps/s 4.4359 KOps/s $\color{#35bf28}+0.41\%$
test_exec_td 0.2863ms 0.2322ms 4.3066 KOps/s 4.2257 KOps/s $\color{#35bf28}+1.91\%$
test_exec_td_decorator 0.4203ms 0.2842ms 3.5184 KOps/s 3.5008 KOps/s $\color{#35bf28}+0.50\%$
test_vmap_mlp_speed[True-True] 1.0818ms 0.6871ms 1.4555 KOps/s 1.4842 KOps/s $\color{#d91a1a}-1.94\%$
test_vmap_mlp_speed[True-False] 0.7460ms 0.6777ms 1.4757 KOps/s 1.5217 KOps/s $\color{#d91a1a}-3.03\%$
test_vmap_mlp_speed[False-True] 0.7069ms 0.6034ms 1.6573 KOps/s 1.6750 KOps/s $\color{#d91a1a}-1.05\%$
test_vmap_mlp_speed[False-False] 0.7562ms 0.6046ms 1.6539 KOps/s 1.7382 KOps/s $\color{#d91a1a}-4.85\%$
test_vmap_mlp_speed_decorator[True-True] 1.2762ms 0.7253ms 1.3787 KOps/s 1.3920 KOps/s $\color{#d91a1a}-0.96\%$
test_vmap_mlp_speed_decorator[True-False] 0.8631ms 0.7230ms 1.3831 KOps/s 1.4004 KOps/s $\color{#d91a1a}-1.24\%$
test_vmap_mlp_speed_decorator[False-True] 0.8084ms 0.6420ms 1.5576 KOps/s 1.6131 KOps/s $\color{#d91a1a}-3.44\%$
test_vmap_mlp_speed_decorator[False-False] 0.7681ms 0.6418ms 1.5580 KOps/s 1.6130 KOps/s $\color{#d91a1a}-3.41\%$
test_vmap_transformer_speed[True-True] 9.7319ms 9.0996ms 109.8953 Ops/s 112.4520 Ops/s $\color{#d91a1a}-2.27\%$
test_vmap_transformer_speed[True-False] 9.3235ms 9.1003ms 109.8863 Ops/s 112.5415 Ops/s $\color{#d91a1a}-2.36\%$
test_vmap_transformer_speed[False-True] 9.3839ms 9.0476ms 110.5263 Ops/s 113.6750 Ops/s $\color{#d91a1a}-2.77\%$
test_vmap_transformer_speed[False-False] 9.4642ms 9.0727ms 110.2211 Ops/s 113.6027 Ops/s $\color{#d91a1a}-2.98\%$
test_vmap_transformer_speed_decorator[True-True] 21.9784ms 21.2502ms 47.0584 Ops/s 47.8920 Ops/s $\color{#d91a1a}-1.74\%$
test_vmap_transformer_speed_decorator[True-False] 22.2636ms 21.2681ms 47.0188 Ops/s 47.7093 Ops/s $\color{#d91a1a}-1.45\%$
test_vmap_transformer_speed_decorator[False-True] 21.9077ms 21.5709ms 46.3589 Ops/s 48.1983 Ops/s $\color{#d91a1a}-3.82\%$
test_vmap_transformer_speed_decorator[False-False] 22.4343ms 21.5105ms 46.4889 Ops/s 48.0699 Ops/s $\color{#d91a1a}-3.29\%$
test_to_module_speed[True] 1.2477ms 1.1494ms 870.0229 Ops/s 864.9787 Ops/s $\color{#35bf28}+0.58\%$
test_to_module_speed[False] 1.6208ms 1.1216ms 891.6029 Ops/s 890.6113 Ops/s $\color{#35bf28}+0.11\%$
test_tc_init 93.0110μs 38.3116μs 26.1018 KOps/s 25.4783 KOps/s $\color{#35bf28}+2.45\%$
test_tc_init_nested 0.1114ms 80.5705μs 12.4115 KOps/s 12.6318 KOps/s $\color{#d91a1a}-1.74\%$
test_tc_first_layer_tensor 4.7368μs 0.7915μs 1.2633 MOps/s 1.2772 MOps/s $\color{#d91a1a}-1.09\%$
test_tc_first_layer_nontensor 15.6300μs 2.5609μs 390.4902 KOps/s 392.9770 KOps/s $\color{#d91a1a}-0.63\%$
test_tc_second_layer_tensor 6.4033μs 1.6170μs 618.4151 KOps/s 619.2610 KOps/s $\color{#d91a1a}-0.14\%$
test_tc_second_layer_nontensor 16.9910μs 3.3483μs 298.6625 KOps/s 295.8309 KOps/s $\color{#35bf28}+0.96\%$
test_unbind 0.1813s 10.5059ms 95.1843 Ops/s 64.0069 Ops/s $\textbf{\color{#35bf28}+48.71\%}$
test_full_like 0.1691s 0.6604ms 1.5142 KOps/s 1.7340 KOps/s $\textbf{\color{#d91a1a}-12.68\%}$
test_zeros_like 0.2650ms 0.1974ms 5.0652 KOps/s 5.0590 KOps/s $\color{#35bf28}+0.12\%$
test_ones_like 0.2334ms 0.1975ms 5.0642 KOps/s 5.0626 KOps/s $\color{#35bf28}+0.03\%$
test_clone 0.4427ms 0.4144ms 2.4130 KOps/s 2.4134 KOps/s $\color{#d91a1a}-0.02\%$
test_squeeze 29.2110μs 10.6579μs 93.8269 KOps/s 88.6054 KOps/s $\textbf{\color{#35bf28}+5.89\%}$
test_unsqueeze 0.2146ms 80.3084μs 12.4520 KOps/s 11.9587 KOps/s $\color{#35bf28}+4.13\%$
test_split 0.4437ms 0.1734ms 5.7681 KOps/s 5.6765 KOps/s $\color{#35bf28}+1.61\%$
test_permute 0.2507ms 0.1896ms 5.2755 KOps/s 5.2027 KOps/s $\color{#35bf28}+1.40\%$
test_stack 1.2481ms 0.9055ms 1.1043 KOps/s 1.0808 KOps/s $\color{#35bf28}+2.18\%$
test_cat 1.2476ms 1.2316ms 811.9537 Ops/s 811.8915 Ops/s $+0.01\%$
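
The Change column appears to be the relative difference between the Ops column and the Ops on Repo HEAD column. The sketch below reproduces a few rows under that assumption. The function names and the 5% cutoff for bold entries are inferred from the table and are not taken from the benchmark tooling itself.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD.

    Both arguments must use the same unit (e.g. both KOps/s); a few rows
    in the table mix KOps/s and Ops/s and need converting first.
    """
    return (ops_pr / ops_head - 1.0) * 100.0


def is_highlighted(change_pct: float, threshold: float = 5.0) -> bool:
    """Assumed rule: rows are bold when the absolute change exceeds 5%."""
    return abs(change_pct) > threshold


# Values copied from the rows above (Ops, Ops on Repo HEAD).
rows = {
    "test_creation_nested_2": (47.1383, 42.8106),
    "test_set_shared": (15.7298, 17.3249),
    "test_cat": (811.9537, 811.8915),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    marker = " (bold)" if is_highlighted(change) else ""
    print(f"{name}: {change:+.2f}%{marker}")
```

Because Max and Mean are timings and Ops is a throughput, a positive Change means the PR ran more operations per second than HEAD.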

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: f16f5593b780c1d4538c1115b0d84b8ff173d0c7
Pull Request resolved: #956
@vmoens vmoens merged commit d30a323 into gh/vmoens/11/base Aug 9, 2024
20 of 35 checks passed
@vmoens vmoens deleted the gh/vmoens/11/head branch August 9, 2024 23:27