[Doc] Streaming tensordicts #956
Merged
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 969e272d2c8a8823d162c55381a8b70b3787931c Pull Request resolved: #956
facebook-github-bot added the CLA Signed label Aug 9, 2024 (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 98813cf349698cd4b2edd0e34efd16e17f42d644 Pull Request resolved: #956
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 11a3a8a01bdd84f7a6ecfbe7f4b66895db76a55f Pull Request resolved: #956
vmoens added a commit that referenced this pull request Aug 9, 2024
ghstack-source-id: 76094e20f9486fb7363e8aee57d51bc24b4fe525 Pull Request resolved: #956
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 50.8050μs | 22.1213μs | 45.2054 KOps/s | 46.9291 KOps/s | |
test_plain_set_stack_nested | 62.6370μs | 22.2489μs | 44.9461 KOps/s | 47.6114 KOps/s | |
test_plain_set_nested_inplace | 75.1340μs | 23.8617μs | 41.9082 KOps/s | 43.8111 KOps/s | |
test_plain_set_stack_nested_inplace | 76.0730μs | 24.3317μs | 41.0986 KOps/s | 43.6710 KOps/s | |
test_items | 20.7090μs | 2.7378μs | 365.2526 KOps/s | 379.3574 KOps/s | |
test_items_nested | 2.2305ms | 0.3484ms | 2.8699 KOps/s | 3.0338 KOps/s | |
test_items_nested_locked | 0.6186ms | 0.3504ms | 2.8538 KOps/s | 2.9996 KOps/s | |
test_items_nested_leaf | 0.1530ms | 82.3176μs | 12.1481 KOps/s | 12.2048 KOps/s | |
test_items_stack_nested | 0.5789ms | 0.3407ms | 2.9349 KOps/s | 2.9997 KOps/s | |
test_items_stack_nested_leaf | 0.1520ms | 82.4611μs | 12.1269 KOps/s | 12.7122 KOps/s | |
test_items_stack_nested_locked | 0.5843ms | 0.3404ms | 2.9376 KOps/s | 2.9455 KOps/s | |
test_keys | 23.0630μs | 3.8171μs | 261.9821 KOps/s | 261.6522 KOps/s | |
test_keys_nested | 0.2947ms | 0.1401ms | 7.1362 KOps/s | 6.9494 KOps/s | |
test_keys_nested_locked | 0.6989ms | 0.1472ms | 6.7927 KOps/s | 6.6931 KOps/s | |
test_keys_nested_leaf | 0.2426ms | 0.1225ms | 8.1605 KOps/s | 8.1625 KOps/s | |
test_keys_stack_nested | 0.2539ms | 0.1431ms | 6.9874 KOps/s | 7.1098 KOps/s | |
test_keys_stack_nested_leaf | 0.2249ms | 0.1211ms | 8.2581 KOps/s | 8.2695 KOps/s | |
test_keys_stack_nested_locked | 0.2447ms | 0.1496ms | 6.6863 KOps/s | 6.7931 KOps/s | |
test_values | 13.5205μs | 2.2475μs | 444.9459 KOps/s | 851.4352 KOps/s | |
test_values_nested | 89.0170μs | 48.6968μs | 20.5352 KOps/s | 20.3904 KOps/s | |
test_values_nested_locked | 0.1064ms | 48.9223μs | 20.4406 KOps/s | 20.4614 KOps/s | |
test_values_nested_leaf | 95.1990μs | 44.5201μs | 22.4618 KOps/s | 22.3074 KOps/s | |
test_values_stack_nested | 99.3060μs | 49.6965μs | 20.1222 KOps/s | 19.6784 KOps/s | |
test_values_stack_nested_leaf | 88.0250μs | 43.9989μs | 22.7278 KOps/s | 23.1266 KOps/s | |
test_values_stack_nested_locked | 90.7200μs | 49.3471μs | 20.2646 KOps/s | 19.7143 KOps/s | |
test_membership | 2.5493μs | 0.7238μs | 1.3816 MOps/s | 1.3453 MOps/s | |
test_membership_nested | 30.4470μs | 2.5934μs | 385.5886 KOps/s | 378.1200 KOps/s | |
test_membership_nested_leaf | 42.2490μs | 2.6357μs | 379.4052 KOps/s | 389.1424 KOps/s | |
test_membership_stacked_nested | 29.3860μs | 2.6155μs | 382.3300 KOps/s | 401.7692 KOps/s | |
test_membership_stacked_nested_leaf | 25.8380μs | 2.6390μs | 378.9258 KOps/s | 392.3195 KOps/s | |
test_membership_nested_last | 28.3330μs | 3.9099μs | 255.7634 KOps/s | 263.5290 KOps/s | |
test_membership_nested_leaf_last | 30.4670μs | 3.9042μs | 256.1333 KOps/s | 254.4892 KOps/s | |
test_membership_stacked_nested_last | 24.4450μs | 3.9255μs | 254.7453 KOps/s | 78.6358 KOps/s | |
test_membership_stacked_nested_leaf_last | 43.0300μs | 3.8070μs | 262.6746 KOps/s | 80.6033 KOps/s | |
test_nested_getleaf | 41.2470μs | 10.4037μs | 96.1198 KOps/s | 96.1789 KOps/s | |
test_nested_get | 50.8450μs | 9.7485μs | 102.5799 KOps/s | 104.7909 KOps/s | |
test_stacked_getleaf | 39.5270μs | 10.3153μs | 96.9433 KOps/s | 95.5553 KOps/s | |
test_stacked_get | 32.7310μs | 9.6172μs | 103.9802 KOps/s | 102.9688 KOps/s | |
test_nested_getitemleaf | 52.1880μs | 10.9828μs | 91.0511 KOps/s | 94.8358 KOps/s | |
test_nested_getitem | 53.4600μs | 10.0788μs | 99.2182 KOps/s | 102.4214 KOps/s | |
test_stacked_getitemleaf | 44.2930μs | 11.0666μs | 90.3618 KOps/s | 94.3194 KOps/s | |
test_stacked_getitem | 30.9990μs | 10.0009μs | 99.9913 KOps/s | 100.8612 KOps/s | |
test_lock_nested | 80.1372ms | 0.5698ms | 1.7551 KOps/s | 2.0350 KOps/s | |
test_lock_stack_nested | 0.7302ms | 0.4648ms | 2.1515 KOps/s | 2.3230 KOps/s | |
test_unlock_nested | 80.1133ms | 0.4919ms | 2.0329 KOps/s | 2.4290 KOps/s | |
test_unlock_stack_nested | 0.6147ms | 0.3759ms | 2.6599 KOps/s | 2.7956 KOps/s | |
test_flatten_speed | 0.1961ms | 0.1015ms | 9.8502 KOps/s | 9.9062 KOps/s | |
test_unflatten_speed | 0.8908ms | 0.4550ms | 2.1977 KOps/s | 2.2592 KOps/s | |
test_common_ops | 3.8549ms | 1.1478ms | 871.2471 Ops/s | 930.9555 Ops/s | |
test_creation | 21.0190μs | 2.0107μs | 497.3489 KOps/s | 487.4781 KOps/s | |
test_creation_empty | 47.0080μs | 18.0837μs | 55.2983 KOps/s | 57.3083 KOps/s | |
test_creation_nested_1 | 79.4090μs | 21.6670μs | 46.1532 KOps/s | 48.0131 KOps/s | |
test_creation_nested_2 | 60.8140μs | 25.8720μs | 38.6518 KOps/s | 40.0496 KOps/s | |
test_clone | 63.7690μs | 16.6489μs | 60.0641 KOps/s | 58.9920 KOps/s | |
test_getitem[int] | 1.1166ms | 16.5893μs | 60.2796 KOps/s | 62.3522 KOps/s | |
test_getitem[slice_int] | 0.1382ms | 32.1640μs | 31.0907 KOps/s | 31.9540 KOps/s | |
test_getitem[range] | 0.1904ms | 58.3519μs | 17.1374 KOps/s | 17.4949 KOps/s | |
test_getitem[tuple] | 0.1242ms | 26.6571μs | 37.5134 KOps/s | 39.7972 KOps/s | |
test_getitem[list] | 0.1808ms | 53.2215μs | 18.7894 KOps/s | 19.3687 KOps/s | |
test_setitem_dim[int] | 64.1800μs | 41.5662μs | 24.0580 KOps/s | 23.8293 KOps/s | |
test_setitem_dim[slice_int] | 0.1166ms | 72.0086μs | 13.8872 KOps/s | 14.2225 KOps/s | |
test_setitem_dim[range] | 0.1613ms | 94.4621μs | 10.5863 KOps/s | 10.7491 KOps/s | |
test_setitem_dim[tuple] | 0.1319ms | 63.3339μs | 15.7893 KOps/s | 16.9953 KOps/s | |
test_setitem | 88.5460μs | 30.2292μs | 33.0806 KOps/s | 34.3680 KOps/s | |
test_set | 0.1194ms | 29.3009μs | 34.1286 KOps/s | 35.7499 KOps/s | |
test_set_shared | 1.2410ms | 0.2108ms | 4.7437 KOps/s | 4.6768 KOps/s | |
test_update | 0.1368ms | 38.0333μs | 26.2927 KOps/s | 27.6364 KOps/s | |
test_update_nested | 2.4328ms | 49.1819μs | 20.3327 KOps/s | 22.3683 KOps/s | |
test_update__nested | 0.1367ms | 35.9852μs | 27.7892 KOps/s | 30.1758 KOps/s | |
test_set_nested | 0.1062ms | 33.6566μs | 29.7119 KOps/s | 32.1969 KOps/s | |
test_set_nested_new | 0.1072ms | 37.8656μs | 26.4092 KOps/s | 28.4151 KOps/s | |
test_select | 0.1559ms | 53.7072μs | 18.6195 KOps/s | 19.6196 KOps/s | |
test_select_nested | 0.1388ms | 57.6372μs | 17.3499 KOps/s | 17.5861 KOps/s | |
test_exclude_nested | 0.1666ms | 75.4828μs | 13.2481 KOps/s | 13.4105 KOps/s | |
test_empty[True] | 0.4533ms | 0.3222ms | 3.1033 KOps/s | 3.2109 KOps/s | |
test_empty[False] | 5.8488μs | 1.1318μs | 883.5292 KOps/s | 888.1983 KOps/s | |
test_unbind_speed | 0.3562ms | 0.2889ms | 3.4618 KOps/s | 3.2825 KOps/s | |
test_unbind_speed_stack0 | 0.6814ms | 0.2956ms | 3.3831 KOps/s | 3.5595 KOps/s | |
test_unbind_speed_stack1 | 83.0676ms | 0.7535ms | 1.3271 KOps/s | 1.4793 KOps/s | |
test_split | 81.8858ms | 2.0206ms | 494.8983 Ops/s | 440.3197 Ops/s | |
test_chunk | 83.8380ms | 2.0211ms | 494.7700 Ops/s | 512.7088 Ops/s | |
test_creation[device0] | 4.2665ms | 0.1177ms | 8.4954 KOps/s | 8.2454 KOps/s | |
test_creation_from_tensor | 0.2283ms | 0.1219ms | 8.2051 KOps/s | 8.6709 KOps/s | |
test_add_one[memmap_tensor0] | 0.2274ms | 7.7485μs | 129.0572 KOps/s | 129.6854 KOps/s | |
test_contiguous[memmap_tensor0] | 29.6660μs | 2.0095μs | 497.6379 KOps/s | 501.0831 KOps/s | |
test_stack[memmap_tensor0] | 56.7360μs | 5.7693μs | 173.3322 KOps/s | 179.4097 KOps/s | |
test_memmaptd_index | 1.0497ms | 0.3944ms | 2.5352 KOps/s | 2.5582 KOps/s | |
test_memmaptd_index_astensor | 0.9891ms | 0.4763ms | 2.0996 KOps/s | 2.1319 KOps/s | |
test_memmaptd_index_op | 1.3412ms | 1.0304ms | 970.4619 Ops/s | 986.1767 Ops/s | |
test_serialize_model | 0.1227s | 0.1154s | 8.6689 Ops/s | 8.8360 Ops/s | |
test_serialize_model_pickle | 0.4691s | 0.3933s | 2.5427 Ops/s | 2.5751 Ops/s | |
test_serialize_weights | 0.1255s | 0.1148s | 8.7130 Ops/s | 8.6048 Ops/s | |
test_serialize_weights_returnearly | 0.1842s | 0.1607s | 6.2234 Ops/s | 6.6857 Ops/s | |
test_serialize_weights_pickle | 0.4935s | 0.4097s | 2.4409 Ops/s | 2.4326 Ops/s | |
test_serialize_weights_filesystem | 0.2159s | 0.1472s | 6.7943 Ops/s | 6.6155 Ops/s | |
test_serialize_model_filesystem | 0.1626s | 0.1488s | 6.7211 Ops/s | 6.5693 Ops/s | |
test_reshape_pytree | 99.7560μs | 39.9408μs | 25.0371 KOps/s | 25.7887 KOps/s | |
test_reshape_td | 0.1491ms | 46.1178μs | 21.6836 KOps/s | 22.6204 KOps/s | |
test_view_pytree | 85.7200μs | 37.3977μs | 26.7396 KOps/s | 25.8662 KOps/s | |
test_view_td | 0.1178ms | 52.8992μs | 18.9039 KOps/s | 19.4729 KOps/s | |
test_unbind_pytree | 71.7140μs | 36.5132μs | 27.3874 KOps/s | 26.4886 KOps/s | |
test_unbind_td | 0.3475ms | 45.7989μs | 21.8346 KOps/s | 22.0983 KOps/s | |
test_split_pytree | 0.1266ms | 40.7171μs | 24.5597 KOps/s | 25.0395 KOps/s | |
test_split_td | 0.4835ms | 57.3181μs | 17.4465 KOps/s | 17.4714 KOps/s | |
test_add_pytree | 0.1327ms | 46.3715μs | 21.5650 KOps/s | 20.9437 KOps/s | |
test_add_td | 0.2147ms | 82.9381μs | 12.0572 KOps/s | 12.0761 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1247ms | 55.0544μs | 18.1638 KOps/s | 19.2303 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.4262ms | 0.1871ms | 5.3437 KOps/s | 5.2553 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1672ms | 55.6856μs | 17.9580 KOps/s | 19.3185 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2574ms | 0.1452ms | 6.8856 KOps/s | 6.9254 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 92.6930μs | 20.0171μs | 49.9573 KOps/s | 51.1677 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1392ms | 62.9847μs | 15.8769 KOps/s | 15.6960 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1639ms | 78.3114μs | 12.7695 KOps/s | 12.5430 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1568ms | 70.1176μs | 14.2617 KOps/s | 14.3399 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4073ms | 0.1735ms | 5.7636 KOps/s | 5.9750 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3427ms | 0.1876ms | 5.3302 KOps/s | 5.2359 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 79.7790μs | 38.3436μs | 26.0800 KOps/s | 26.0357 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.8747ms | 71.5792μs | 13.9705 KOps/s | 14.2801 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4451ms | 0.1708ms | 5.8556 KOps/s | 5.7002 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.5843ms | 0.2873ms | 3.4802 KOps/s | 3.3867 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3513ms | 0.2047ms | 4.8851 KOps/s | 4.9339 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.5942ms | 0.1849ms | 5.4091 KOps/s | 5.7868 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.7223ms | 61.9265μs | 16.1482 KOps/s | 15.4179 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1222ms | 40.4746μs | 24.7068 KOps/s | 26.1352 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4585ms | 0.2387ms | 4.1892 KOps/s | 4.1298 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.5730ms | 0.1784ms | 5.6040 KOps/s | 5.9170 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1890ms | 0.1092ms | 9.1544 KOps/s | 9.3946 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1187ms | 55.5192μs | 18.0118 KOps/s | 18.2197 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1923ms | 79.5288μs | 12.5741 KOps/s | 12.4543 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1596ms | 70.5934μs | 14.1656 KOps/s | 14.1226 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3744ms | 0.1892ms | 5.2863 KOps/s | 5.6742 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.7978ms | 1.5941ms | 627.3206 Ops/s | 629.4596 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.2589ms | 0.1841ms | 5.4322 KOps/s | 5.4736 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2500ms | 1.0946ms | 913.5807 Ops/s | 906.4682 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.5032ms | 0.4024ms | 2.4849 KOps/s | 2.4481 KOps/s | |
test_compile_assign_and_add_stack[eager] | 6.0025ms | 3.8352ms | 260.7445 Ops/s | 261.5586 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1027ms | 34.5615μs | 28.9339 KOps/s | 31.1249 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6876ms | 49.0686μs | 20.3796 KOps/s | 20.7403 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1014ms | 29.6494μs | 33.7275 KOps/s | 35.5073 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 90.9610μs | 30.6873μs | 32.5868 KOps/s | 32.2175 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 89.4570μs | 29.1739μs | 34.2772 KOps/s | 36.1667 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 74.3390μs | 30.2095μs | 33.1022 KOps/s | 32.0791 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1493ms | 74.1263μs | 13.4905 KOps/s | 13.9865 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.4810ms | 27.5960μs | 36.2371 KOps/s | 35.6280 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1362ms | 67.5965μs | 14.7937 KOps/s | 14.9875 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 80.9320μs | 24.6755μs | 40.5260 KOps/s | 40.0183 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1366ms | 68.7803μs | 14.5390 KOps/s | 14.8858 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 87.4100μs | 23.7950μs | 42.0256 KOps/s | 40.1481 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1630ms | 74.5150μs | 13.4201 KOps/s | 14.1689 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.8566ms | 28.0015μs | 35.7123 KOps/s | 35.5151 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1373ms | 66.6155μs | 15.0115 KOps/s | 14.9792 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 76.8640μs | 23.9733μs | 41.7130 KOps/s | 39.7755 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1410ms | 68.1062μs | 14.6830 KOps/s | 14.7989 KOps/s | |
test_compile_indexing[int-pytree-eager] | 96.7370μs | 23.9240μs | 41.7991 KOps/s | 40.1230 KOps/s | |
test_mod_add[eager] | 79.3420μs | 25.9158μs | 38.5865 KOps/s | 40.4335 KOps/s | |
test_mod_add[compile] | 0.1008ms | 38.1308μs | 26.2255 KOps/s | 27.9114 KOps/s | |
test_mod_add[compile-overhead] | 0.1206ms | 38.0815μs | 26.2594 KOps/s | 27.9664 KOps/s | |
test_mod_wrap[eager] | 0.4360ms | 0.2160ms | 4.6297 KOps/s | 4.7913 KOps/s | |
test_mod_wrap[compile] | 1.7432ms | 0.2336ms | 4.2808 KOps/s | 4.2914 KOps/s | |
test_mod_wrap[compile-overhead] | 0.4201ms | 0.2265ms | 4.4151 KOps/s | 4.4128 KOps/s | |
test_mod_wrap_and_backward[eager] | 17.3096ms | 12.7882ms | 78.1971 Ops/s | 92.2091 Ops/s | |
test_mod_wrap_and_backward[compile] | 14.7260ms | 11.5685ms | 86.4416 Ops/s | 85.7060 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 18.0968ms | 11.7087ms | 85.4063 Ops/s | 85.4381 Ops/s | |
test_seq_add[eager] | 0.1774ms | 91.0680μs | 10.9808 KOps/s | 11.7742 KOps/s | |
test_seq_add[compile] | 0.1596ms | 62.4547μs | 16.0116 KOps/s | 16.2754 KOps/s | |
test_seq_add[compile-overhead] | 0.1544ms | 61.9439μs | 16.1436 KOps/s | 16.8881 KOps/s | |
test_seq_wrap[eager] | 0.4958ms | 0.3825ms | 2.6146 KOps/s | 2.6358 KOps/s | |
test_seq_wrap[compile] | 0.4023ms | 0.2639ms | 3.7895 KOps/s | 3.7617 KOps/s | |
test_seq_wrap[compile-overhead] | 0.6233ms | 0.2623ms | 3.8129 KOps/s | 3.7737 KOps/s | |
test_func_call_runtime[False-eager] | 0.9264ms | 0.5242ms | 1.9076 KOps/s | 1.9085 KOps/s | |
test_func_call_runtime[False-compile] | 0.8347ms | 0.4969ms | 2.0124 KOps/s | 1.9780 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.6087ms | 0.4902ms | 2.0401 KOps/s | 2.0360 KOps/s | |
test_func_call_runtime[True-eager] | 1.2069ms | 0.7424ms | 1.3471 KOps/s | 1.3162 KOps/s | |
test_func_call_runtime[True-compile] | 0.6987ms | 0.5031ms | 1.9877 KOps/s | 1.9695 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 1.0138ms | 0.5137ms | 1.9465 KOps/s | 1.9544 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8895ms | 0.5367ms | 1.8633 KOps/s | 1.9010 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8583ms | 0.4966ms | 2.0137 KOps/s | 2.0222 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9194ms | 0.5124ms | 1.9517 KOps/s | 2.0070 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0309ms | 0.8888ms | 1.1251 KOps/s | 1.1357 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9591ms | 0.8364ms | 1.1956 KOps/s | 1.1889 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.4083ms | 0.8485ms | 1.1785 KOps/s | 1.2178 KOps/s | |
test_distributed | 0.2309ms | 0.1291ms | 7.7470 KOps/s | 7.7164 KOps/s | |
test_tdmodule | 0.1118ms | 18.7839μs | 53.2371 KOps/s | 58.0495 KOps/s | |
test_tdmodule_dispatch | 63.7500μs | 36.9392μs | 27.0715 KOps/s | 27.9120 KOps/s | |
test_tdseq | 38.7530μs | 19.3335μs | 51.7237 KOps/s | 55.6705 KOps/s | |
test_tdseq_dispatch | 72.9370μs | 41.1859μs | 24.2802 KOps/s | 25.9842 KOps/s | |
test_instantiation_functorch | 2.5988ms | 1.6440ms | 608.2579 Ops/s | 610.4911 Ops/s | |
test_instantiation_td | 1.8002ms | 1.1711ms | 853.8966 Ops/s | 867.0962 Ops/s | |
test_exec_functorch | 0.3081ms | 0.1834ms | 5.4537 KOps/s | 5.5605 KOps/s | |
test_exec_functional_call | 0.4337ms | 0.1769ms | 5.6525 KOps/s | 5.9190 KOps/s | |
test_exec_td | 0.2740ms | 0.1783ms | 5.6072 KOps/s | 6.1399 KOps/s | |
test_exec_td_decorator | 0.4364ms | 0.2352ms | 4.2523 KOps/s | 4.5975 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.9979ms | 0.5837ms | 1.7132 KOps/s | 1.7217 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8548ms | 0.5859ms | 1.7067 KOps/s | 1.7744 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.5987ms | 0.4809ms | 2.0795 KOps/s | 2.0966 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6836ms | 0.4761ms | 2.1006 KOps/s | 2.1051 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.4037ms | 0.6276ms | 1.5934 KOps/s | 1.5680 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8114ms | 0.6323ms | 1.5814 KOps/s | 1.5815 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7655ms | 0.5099ms | 1.9612 KOps/s | 1.9208 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8735ms | 0.5230ms | 1.9120 KOps/s | 1.9137 KOps/s | |
test_to_module_speed[True] | 2.1547ms | 1.3257ms | 754.3450 Ops/s | 762.3985 Ops/s | |
test_to_module_speed[False] | 1.9147ms | 1.2922ms | 773.8783 Ops/s | 803.3222 Ops/s | |
test_tc_init | 75.0100μs | 43.7404μs | 22.8622 KOps/s | 24.0358 KOps/s | |
test_tc_init_nested | 0.1526ms | 81.6331μs | 12.2499 KOps/s | 12.2176 KOps/s | |
test_tc_first_layer_tensor | 22.8130μs | 1.4896μs | 671.3391 KOps/s | 681.7609 KOps/s | |
test_tc_first_layer_nontensor | 31.2080μs | 4.3057μs | 232.2523 KOps/s | 241.0800 KOps/s | |
test_tc_second_layer_tensor | 39.9240μs | 2.6911μs | 371.5917 KOps/s | 376.6347 KOps/s | |
test_tc_second_layer_nontensor | 23.8040μs | 5.3613μs | 186.5202 KOps/s | 189.8637 KOps/s | |
test_unbind | 0.4499s | 13.5578ms | 73.7582 Ops/s | 77.8211 Ops/s | |
test_full_like | 9.1547ms | 7.1727ms | 139.4179 Ops/s | 141.2983 Ops/s | |
test_zeros_like | 10.5868ms | 6.4270ms | 155.5925 Ops/s | 134.0806 Ops/s | |
test_ones_like | 13.0813ms | 7.6339ms | 130.9942 Ops/s | 123.6974 Ops/s | |
test_clone | 15.4455ms | 9.0470ms | 110.5342 Ops/s | 110.4748 Ops/s | |
test_squeeze | 73.1260μs | 13.3245μs | 75.0500 KOps/s | 77.5264 KOps/s | |
test_unsqueeze | 0.1682ms | 93.5248μs | 10.6924 KOps/s | 10.8410 KOps/s | |
test_split | 0.4576ms | 0.1997ms | 5.0085 KOps/s | 5.0036 KOps/s | |
test_permute | 0.4866ms | 0.2221ms | 4.5027 KOps/s | 4.5921 KOps/s | |
test_stack | 29.4656ms | 24.0781ms | 41.5315 Ops/s | 40.7821 Ops/s | |
test_cat | 32.6967ms | 24.1859ms | 41.3464 Ops/s | 40.8805 Ops/s |
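
The Change column is empty in the rows above, so the difference between the two throughput columns has to be worked out by hand. The following is a minimal sketch of one way to do that, assuming that "Ops" is the throughput measured on this pull request and "Ops on Repo HEAD" is the baseline. The helper names `parse_ops` and `relative_change` are hypothetical and are not part of tensordict or of the benchmark tooling, and the sign convention (positive means the pull request is faster) is also an assumption.

```python
import re
from typing import Tuple

# Multipliers for the throughput units that appear in the table.
_UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '444.9459 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognised throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * _UNIT_SCALE[unit]


def relative_change(row: str) -> Tuple[str, float]:
    """Return (test name, relative change) for one pipe-delimited table row.

    Assumption: column 3 ("Ops") is the pull request and column 4
    ("Ops on Repo HEAD") is the baseline. A positive result means the
    pull request achieved higher throughput than the baseline.
    """
    cells = [c.strip() for c in row.split("|")]
    name, ops_pr, ops_head = cells[0], cells[3], cells[4]
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return name, (pr - head) / head


if __name__ == "__main__":
    rows = [
        "test_values | 13.5205μs | 2.2475μs | 444.9459 KOps/s | 851.4352 KOps/s | |",
        "test_membership_stacked_nested_last | 24.4450μs | 3.9255μs | 254.7453 KOps/s | 78.6358 KOps/s | |",
    ]
    for row in rows:
        name, change = relative_change(row)
        print(f"{name}: {change:+.1%}")
```

The two sample rows are the ones in the table above whose throughput columns differ the most. Most other rows differ by only a few percent, and this sketch does not attempt to judge whether such differences are statistically meaningful.
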

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 88.7720μs | 16.4811μs | 60.6757 KOps/s | 57.4391 KOps/s | |
test_plain_set_stack_nested | 42.1510μs | 16.5407μs | 60.4570 KOps/s | 57.0619 KOps/s | |
test_plain_set_nested_inplace | 36.5210μs | 17.5936μs | 56.8390 KOps/s | 53.9215 KOps/s | |
test_plain_set_stack_nested_inplace | 36.7500μs | 17.5913μs | 56.8462 KOps/s | 54.0373 KOps/s | |
test_items | 21.4710μs | 4.7121μs | 212.2187 KOps/s | 212.9091 KOps/s | |
test_items_nested | 0.3992ms | 0.3605ms | 2.7740 KOps/s | 2.7604 KOps/s | |
test_items_nested_locked | 0.4018ms | 0.3605ms | 2.7742 KOps/s | 2.7133 KOps/s | |
test_items_nested_leaf | 0.1090ms | 86.9033μs | 11.5070 KOps/s | 11.8862 KOps/s | |
test_items_stack_nested | 0.4021ms | 0.3610ms | 2.7701 KOps/s | 2.7128 KOps/s | |
test_items_stack_nested_leaf | 0.1091ms | 84.2656μs | 11.8672 KOps/s | 11.7385 KOps/s | |
test_items_stack_nested_locked | 0.3993ms | 0.3647ms | 2.7417 KOps/s | 2.7169 KOps/s | |
test_keys | 26.0790μs | 4.3448μs | 230.1590 KOps/s | 228.3803 KOps/s | |
test_keys_nested | 90.0610μs | 67.4461μs | 14.8267 KOps/s | 14.9945 KOps/s | |
test_keys_nested_locked | 2.4560ms | 72.2595μs | 13.8390 KOps/s | 13.9412 KOps/s | |
test_keys_nested_leaf | 76.1110μs | 57.0947μs | 17.5148 KOps/s | 17.6999 KOps/s | |
test_keys_stack_nested | 87.7400μs | 67.5043μs | 14.8139 KOps/s | 15.1523 KOps/s | |
test_keys_stack_nested_leaf | 79.4010μs | 57.5890μs | 17.3644 KOps/s | 17.7699 KOps/s | |
test_keys_stack_nested_locked | 88.8120μs | 72.3060μs | 13.8301 KOps/s | 14.0993 KOps/s | |
test_values | 10.5705μs | 1.7791μs | 562.0664 KOps/s | 569.3669 KOps/s | |
test_values_nested | 58.7410μs | 33.7805μs | 29.6029 KOps/s | 29.5694 KOps/s | |
test_values_nested_locked | 56.1620μs | 35.5087μs | 28.1621 KOps/s | 27.8615 KOps/s | |
test_values_nested_leaf | 52.2900μs | 29.9729μs | 33.3635 KOps/s | 32.9954 KOps/s | |
test_values_stack_nested | 58.6010μs | 34.0386μs | 29.3784 KOps/s | 29.1563 KOps/s | |
test_values_stack_nested_leaf | 50.9910μs | 30.2465μs | 33.0616 KOps/s | 32.6745 KOps/s | |
test_values_stack_nested_locked | 56.6510μs | 35.9864μs | 27.7883 KOps/s | 27.7168 KOps/s | |
test_membership | 17.2810μs | 0.6593μs | 1.5168 MOps/s | 1.8595 MOps/s | |
test_membership_nested | 29.9510μs | 2.0171μs | 495.7548 KOps/s | 516.6183 KOps/s | |
test_membership_nested_leaf | 13.4555μs | 1.9497μs | 512.8884 KOps/s | 522.9499 KOps/s | |
test_membership_stacked_nested | 22.6610μs | 2.0384μs | 490.5771 KOps/s | 506.4207 KOps/s | |
test_membership_stacked_nested_leaf | 31.7200μs | 1.9888μs | 502.8214 KOps/s | 513.0577 KOps/s | |
test_membership_nested_last | 34.4420μs | 2.9392μs | 340.2255 KOps/s | 338.0408 KOps/s | |
test_membership_nested_leaf_last | 24.2600μs | 2.9788μs | 335.7051 KOps/s | 340.2811 KOps/s | |
test_membership_stacked_nested_last | 38.5110μs | 2.9359μs | 340.6112 KOps/s | 334.7437 KOps/s | |
test_membership_stacked_nested_leaf_last | 18.9600μs | 2.9028μs | 344.4967 KOps/s | 341.7779 KOps/s | |
test_nested_getleaf | 34.7600μs | 7.8620μs | 127.1943 KOps/s | 127.6049 KOps/s | |
test_nested_get | 29.6810μs | 7.3846μs | 135.4164 KOps/s | 136.3190 KOps/s | |
test_stacked_getleaf | 36.3400μs | 7.8493μs | 127.3996 KOps/s | 127.8768 KOps/s | |
test_stacked_get | 23.9910μs | 7.3897μs | 135.3233 KOps/s | 136.5724 KOps/s | |
test_nested_getitemleaf | 23.6110μs | 8.0835μs | 123.7095 KOps/s | 123.4178 KOps/s | |
test_nested_getitem | 25.1000μs | 7.6567μs | 130.6047 KOps/s | 130.8080 KOps/s | |
test_stacked_getitemleaf | 34.2110μs | 8.0971μs | 123.5005 KOps/s | 123.4172 KOps/s | |
test_stacked_getitem | 30.0500μs | 7.6103μs | 131.4012 KOps/s | 131.8058 KOps/s | |
test_lock_nested | 7.4554ms | 0.4725ms | 2.1166 KOps/s | 2.0975 KOps/s | |
test_lock_stack_nested | 0.4842ms | 0.4321ms | 2.3141 KOps/s | 2.2640 KOps/s | |
test_unlock_nested | 0.8820ms | 0.3815ms | 2.6212 KOps/s | 2.5091 KOps/s | |
test_unlock_stack_nested | 0.3844ms | 0.3497ms | 2.8592 KOps/s | 2.7583 KOps/s | |
test_flatten_speed | 0.5054ms | 0.1038ms | 9.6312 KOps/s | 9.5787 KOps/s | |
test_unflatten_speed | 0.3638ms | 0.3169ms | 3.1552 KOps/s | 3.1430 KOps/s | |
test_common_ops | 1.5976ms | 1.3559ms | 737.4926 Ops/s | 709.6307 Ops/s | |
test_creation | 20.0100μs | 1.6344μs | 611.8436 KOps/s | 605.7707 KOps/s | |
test_creation_empty | 38.2810μs | 16.1998μs | 61.7290 KOps/s | 54.4259 KOps/s | |
test_creation_nested_1 | 1.0858ms | 18.3865μs | 54.3877 KOps/s | 50.2701 KOps/s | |
test_creation_nested_2 | 47.5410μs | 21.2142μs | 47.1383 KOps/s | 42.8106 KOps/s | |
test_clone | 54.9620μs | 29.8092μs | 33.5467 KOps/s | 30.3126 KOps/s | |
test_getitem[int] | 1.1760ms | 17.6444μs | 56.6752 KOps/s | 53.1296 KOps/s | |
test_getitem[slice_int] | 0.1520ms | 29.7492μs | 33.6143 KOps/s | 31.9003 KOps/s | |
test_getitem[range] | 0.2947ms | 0.1200ms | 8.3366 KOps/s | 8.5380 KOps/s | |
test_getitem[tuple] | 0.1508ms | 26.8672μs | 37.2201 KOps/s | 36.7824 KOps/s | |
test_getitem[list] | 0.2300ms | 0.1078ms | 9.2804 KOps/s | 9.3285 KOps/s | |
test_setitem_dim[int] | 76.6220μs | 55.0845μs | 18.1539 KOps/s | 16.9365 KOps/s | |
test_setitem_dim[slice_int] | 0.1042ms | 80.1279μs | 12.4801 KOps/s | 12.0336 KOps/s | |
test_setitem_dim[range] | 0.1899ms | 0.1453ms | 6.8832 KOps/s | 6.8413 KOps/s | |
test_setitem_dim[tuple] | 0.1017ms | 77.0709μs | 12.9751 KOps/s | 13.1575 KOps/s | |
test_setitem | 79.9320μs | 45.1690μs | 22.1391 KOps/s | 21.4140 KOps/s | |
test_set | 78.4220μs | 43.8273μs | 22.8168 KOps/s | 21.7069 KOps/s | |
test_set_shared | 92.8241ms | 63.5737μs | 15.7298 KOps/s | 17.3249 KOps/s | |
test_update | 89.5320μs | 53.0263μs | 18.8586 KOps/s | 17.7105 KOps/s | |
test_update_nested | 90.4520μs | 59.8253μs | 16.7153 KOps/s | 15.1220 KOps/s | |
test_update__nested | 98.7710μs | 61.0891μs | 16.3695 KOps/s | 14.6318 KOps/s | |
test_set_nested | 88.5910μs | 46.4047μs | 21.5495 KOps/s | 20.4075 KOps/s | |
test_set_nested_new | 85.2110μs | 47.9680μs | 20.8472 KOps/s | 17.9946 KOps/s | |
test_select | 0.1227ms | 64.7519μs | 15.4436 KOps/s | 14.6730 KOps/s | |
test_select_nested | 0.4925ms | 52.4386μs | 19.0699 KOps/s | 19.3167 KOps/s | |
test_exclude_nested | 98.8910μs | 69.3698μs | 14.4155 KOps/s | 14.3987 KOps/s | |
test_empty[True] | 0.3315ms | 0.2829ms | 3.5352 KOps/s | 3.5181 KOps/s | |
test_empty[False] | 2.9071μs | 0.8555μs | 1.1689 MOps/s | 1.1538 MOps/s | |
test_to | 55.5110μs | 27.0267μs | 37.0005 KOps/s | 35.3081 KOps/s | |
test_to_nonblocking | 54.2900μs | 26.3348μs | 37.9725 KOps/s | 37.7068 KOps/s | |
test_unbind_speed | 1.2973ms | 0.2979ms | 3.3573 KOps/s | 3.2047 KOps/s | |
test_unbind_speed_stack0 | 0.3442ms | 0.2958ms | 3.3805 KOps/s | 3.2325 KOps/s | |
test_unbind_speed_stack1 | 91.7662ms | 0.7696ms | 1.2994 KOps/s | 1.2594 KOps/s | |
test_split | 92.9094ms | 2.3549ms | 424.6432 Ops/s | 411.7045 Ops/s | |
test_chunk | 2.2977ms | 2.1641ms | 462.0962 Ops/s | 411.2591 Ops/s | |
test_creation[device0] | 0.1639ms | 0.1104ms | 9.0606 KOps/s | 9.2341 KOps/s | |
test_creation_from_tensor | 0.1664ms | 0.1082ms | 9.2400 KOps/s | 9.4295 KOps/s | |
test_add_one[memmap_tensor0] | 0.1512ms | 9.4687μs | 105.6106 KOps/s | 99.7784 KOps/s | |
test_contiguous[memmap_tensor0] | 24.4200μs | 2.2703μs | 440.4676 KOps/s | 424.9915 KOps/s | |
test_stack[memmap_tensor0] | 37.6600μs | 6.9396μs | 144.1005 KOps/s | 135.5578 KOps/s | |
test_memmaptd_index | 1.2806ms | 0.4546ms | 2.1999 KOps/s | 2.1374 KOps/s | |
test_memmaptd_index_astensor | 97.1417ms | 0.6090ms | 1.6419 KOps/s | 1.8697 KOps/s | |
test_memmaptd_index_op | 1.4804ms | 1.0898ms | 917.5655 Ops/s | 867.4427 Ops/s | |
test_serialize_model | 91.9756ms | 88.9556ms | 11.2416 Ops/s | 10.8865 Ops/s | |
test_serialize_model_pickle | 1.3488s | 1.2366s | 0.8087 Ops/s | 0.8080 Ops/s | |
test_serialize_weights | 89.8306ms | 85.7310ms | 11.6644 Ops/s | 11.0766 Ops/s | |
test_serialize_weights_returnearly | 58.5255ms | 53.0578ms | 18.8474 Ops/s | 14.9228 Ops/s | |
test_serialize_weights_pickle | 1.3508s | 1.2375s | 0.8081 Ops/s | 0.8083 Ops/s | |
test_reshape_pytree | 63.3320μs | 38.3906μs | 26.0481 KOps/s | 24.7657 KOps/s | |
test_reshape_td | 0.2486ms | 43.6455μs | 22.9119 KOps/s | 21.8843 KOps/s | |
test_view_pytree | 61.7710μs | 37.6279μs | 26.5761 KOps/s | 25.3308 KOps/s | |
test_view_td | 0.2593ms | 47.9099μs | 20.8725 KOps/s | 18.9791 KOps/s | |
test_unbind_pytree | 68.4820μs | 37.0180μs | 27.0139 KOps/s | 25.8622 KOps/s | |
test_unbind_td | 0.3654ms | 45.9114μs | 21.7811 KOps/s | 20.6732 KOps/s | |
test_split_pytree | 76.9310μs | 50.3647μs | 19.8552 KOps/s | 18.2497 KOps/s | |
test_split_td | 0.4517ms | 58.4309μs | 17.1142 KOps/s | 13.4623 KOps/s | |
test_add_pytree | 89.4720μs | 59.1530μs | 16.9053 KOps/s | 15.3729 KOps/s | |
test_add_td | 0.3129ms | 95.9315μs | 10.4241 KOps/s | 9.4713 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4244ms | 0.2203ms | 4.5390 KOps/s | 4.4948 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2664ms | 0.1746ms | 5.7283 KOps/s | 5.5315 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1936ms | 0.1552ms | 6.4437 KOps/s | 6.3200 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2466ms | 0.1929ms | 5.1848 KOps/s | 4.9598 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 59.1220μs | 22.6339μs | 44.1815 KOps/s | 43.7496 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 79.8410μs | 48.0632μs | 20.8059 KOps/s | 19.9822 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1108ms | 74.7653μs | 13.3752 KOps/s | 13.5879 KOps/s | |
test_compile_copy_nested[pytree-eager] | 84.0420μs | 59.3263μs | 16.8559 KOps/s | 16.8789 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.5053ms | 0.3463ms | 2.8873 KOps/s | 2.9084 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2803ms | 0.2219ms | 4.5055 KOps/s | 4.3551 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1873ms | 0.1384ms | 7.2261 KOps/s | 7.2020 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1230ms | 64.6867μs | 15.4591 KOps/s | 15.2609 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3958ms | 0.3449ms | 2.8994 KOps/s | 2.9019 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.7147ms | 0.6400ms | 1.5625 KOps/s | 1.5011 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3122ms | 0.2691ms | 3.7167 KOps/s | 3.6184 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3945ms | 0.3468ms | 2.8834 KOps/s | 2.8866 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1751ms | 76.9582μs | 12.9941 KOps/s | 12.8248 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1934ms | 0.1440ms | 6.9435 KOps/s | 7.1761 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6705ms | 0.5424ms | 1.8436 KOps/s | 1.7574 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4340ms | 0.3445ms | 2.9030 KOps/s | 2.8976 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 48.3510μs | 19.9274μs | 50.1822 KOps/s | 49.0181 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 64.7410μs | 31.6887μs | 31.5570 KOps/s | 30.4326 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1136ms | 76.8512μs | 13.0122 KOps/s | 12.9645 KOps/s | |
test_compile_copy_flat[pytree-eager] | 87.4720μs | 60.5115μs | 16.5258 KOps/s | 16.4745 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.4807ms | 0.8786ms | 1.1382 KOps/s | 1.0578 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6026ms | 3.3976ms | 294.3232 Ops/s | 285.1702 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.5032ms | 0.8813ms | 1.1347 KOps/s | 1.0700 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4935ms | 3.4550ms | 289.4320 Ops/s | 281.3667 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1814ms | 0.1184ms | 8.4445 KOps/s | 8.3998 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2250ms | 63.6984μs | 15.6990 KOps/s | 15.1795 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.1440ms | 0.1110ms | 9.0052 KOps/s | 8.9459 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1837ms | 46.5963μs | 21.4609 KOps/s | 20.9719 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1373ms | 0.1107ms | 9.0308 KOps/s | 8.6678 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 90.1720μs | 47.3374μs | 21.1249 KOps/s | 19.4909 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1880ms | 0.1481ms | 6.7537 KOps/s | 6.7029 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1727ms | 26.5483μs | 37.6672 KOps/s | 35.7572 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1705ms | 0.1404ms | 7.1220 KOps/s | 7.1204 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 46.6810μs | 22.9661μs | 43.5424 KOps/s | 41.9504 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2396ms | 0.1400ms | 7.1403 KOps/s | 7.0138 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 51.5500μs | 22.7697μs | 43.9180 KOps/s | 42.4636 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1838ms | 0.1488ms | 6.7221 KOps/s | 6.7328 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4862ms | 26.0994μs | 38.3151 KOps/s | 37.1112 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2480ms | 0.1397ms | 7.1558 KOps/s | 7.1301 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 47.0810μs | 23.0887μs | 43.3112 KOps/s | 42.7477 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1830ms | 0.1397ms | 7.1571 KOps/s | 7.1386 KOps/s | |
test_compile_indexing[int-pytree-eager] | 45.1910μs | 22.6792μs | 44.0933 KOps/s | 42.2548 KOps/s | |
test_mod_add[eager] | 67.8810μs | 33.0774μs | 30.2321 KOps/s | 28.8900 KOps/s | |
test_mod_add[compile] | 0.1034ms | 73.9123μs | 13.5296 KOps/s | 12.9670 KOps/s | |
test_mod_add[compile-overhead] | 0.2676ms | 0.1447ms | 6.9091 KOps/s | 6.3987 KOps/s | |
test_mod_wrap[eager] | 0.3517ms | 0.2510ms | 3.9834 KOps/s | 3.6498 KOps/s | |
test_mod_wrap[compile] | 1.2023ms | 0.2995ms | 3.3393 KOps/s | 3.1106 KOps/s | |
test_mod_wrap[compile-overhead] | 8.2204ms | 4.3193ms | 231.5172 Ops/s | 233.6075 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.5236ms | 1.3898ms | 719.5095 Ops/s | 674.1418 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.7028ms | 1.3714ms | 729.1751 Ops/s | 665.9277 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4380ms | 0.9983ms | 1.0017 KOps/s | 959.7093 Ops/s | |
test_seq_add[eager] | 0.1594ms | 0.1032ms | 9.6885 KOps/s | 8.8873 KOps/s | |
test_seq_add[compile] | 0.1456ms | 87.0193μs | 11.4917 KOps/s | 11.3781 KOps/s | |
test_seq_add[compile-overhead] | 0.1595ms | 0.1235ms | 8.0952 KOps/s | 7.9855 KOps/s | |
test_seq_wrap[eager] | 0.4671ms | 0.3976ms | 2.5149 KOps/s | 2.4534 KOps/s | |
test_seq_wrap[compile] | 0.3811ms | 0.3246ms | 3.0808 KOps/s | 2.9971 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2989ms | 0.2375ms | 4.2097 KOps/s | 4.1618 KOps/s | |
test_func_call_runtime[False-eager] | 0.8490ms | 0.7582ms | 1.3189 KOps/s | 1.2462 KOps/s | |
test_func_call_runtime[False-compile] | 0.9139ms | 0.8030ms | 1.2453 KOps/s | 1.1902 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4217ms | 0.3828ms | 2.6121 KOps/s | 2.5870 KOps/s | |
test_func_call_runtime[True-eager] | 1.0709ms | 0.9401ms | 1.0637 KOps/s | 1.0366 KOps/s | |
test_func_call_runtime[True-compile] | 0.9291ms | 0.8509ms | 1.1752 KOps/s | 1.1305 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4777ms | 0.4240ms | 2.3587 KOps/s | 2.3184 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8739ms | 0.7522ms | 1.3295 KOps/s | 1.3048 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8843ms | 0.8035ms | 1.2445 KOps/s | 1.1867 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.4291ms | 0.3845ms | 2.6007 KOps/s | 2.5950 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1784ms | 1.0482ms | 954.0519 Ops/s | 925.8179 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.1282ms | 1.0275ms | 973.2354 Ops/s | 947.3746 Ops/s | |
test_func_call_cm_runtime[True-compile-overhead] | 1.1373ms | 1.0334ms | 967.6626 Ops/s | 946.6131 Ops/s | |
test_distributed | 2.2169ms | 71.0611μs | 14.0724 KOps/s | 14.3982 KOps/s | |
test_tdmodule | 0.1306ms | 16.1199μs | 62.0352 KOps/s | 57.3812 KOps/s | |
test_tdmodule_dispatch | 51.1400μs | 33.1497μs | 30.1661 KOps/s | 29.3308 KOps/s | |
test_tdseq | 31.4410μs | 16.3116μs | 61.3059 KOps/s | 55.3768 KOps/s | |
test_tdseq_dispatch | 62.4510μs | 34.5900μs | 28.9101 KOps/s | 27.4430 KOps/s | |
test_instantiation_functorch | 2.1484ms | 2.0231ms | 494.2865 Ops/s | 480.7596 Ops/s | |
test_instantiation_td | 1.9956ms | 1.3137ms | 761.2191 Ops/s | 741.9893 Ops/s | |
test_exec_functorch | 0.2882ms | 0.2332ms | 4.2877 KOps/s | 4.3083 KOps/s | |
test_exec_functional_call | 0.2756ms | 0.2245ms | 4.4544 KOps/s | 4.4359 KOps/s | |
test_exec_td | 0.2863ms | 0.2322ms | 4.3066 KOps/s | 4.2257 KOps/s | |
test_exec_td_decorator | 0.4203ms | 0.2842ms | 3.5184 KOps/s | 3.5008 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0818ms | 0.6871ms | 1.4555 KOps/s | 1.4842 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7460ms | 0.6777ms | 1.4757 KOps/s | 1.5217 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7069ms | 0.6034ms | 1.6573 KOps/s | 1.6750 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7562ms | 0.6046ms | 1.6539 KOps/s | 1.7382 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2762ms | 0.7253ms | 1.3787 KOps/s | 1.3920 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8631ms | 0.7230ms | 1.3831 KOps/s | 1.4004 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.8084ms | 0.6420ms | 1.5576 KOps/s | 1.6131 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7681ms | 0.6418ms | 1.5580 KOps/s | 1.6130 KOps/s | |
test_vmap_transformer_speed[True-True] | 9.7319ms | 9.0996ms | 109.8953 Ops/s | 112.4520 Ops/s | |
test_vmap_transformer_speed[True-False] | 9.3235ms | 9.1003ms | 109.8863 Ops/s | 112.5415 Ops/s | |
test_vmap_transformer_speed[False-True] | 9.3839ms | 9.0476ms | 110.5263 Ops/s | 113.6750 Ops/s | |
test_vmap_transformer_speed[False-False] | 9.4642ms | 9.0727ms | 110.2211 Ops/s | 113.6027 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 21.9784ms | 21.2502ms | 47.0584 Ops/s | 47.8920 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 22.2636ms | 21.2681ms | 47.0188 Ops/s | 47.7093 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 21.9077ms | 21.5709ms | 46.3589 Ops/s | 48.1983 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 22.4343ms | 21.5105ms | 46.4889 Ops/s | 48.0699 Ops/s | |
test_to_module_speed[True] | 1.2477ms | 1.1494ms | 870.0229 Ops/s | 864.9787 Ops/s | |
test_to_module_speed[False] | 1.6208ms | 1.1216ms | 891.6029 Ops/s | 890.6113 Ops/s | |
test_tc_init | 93.0110μs | 38.3116μs | 26.1018 KOps/s | 25.4783 KOps/s | |
test_tc_init_nested | 0.1114ms | 80.5705μs | 12.4115 KOps/s | 12.6318 KOps/s | |
test_tc_first_layer_tensor | 4.7368μs | 0.7915μs | 1.2633 MOps/s | 1.2772 MOps/s | |
test_tc_first_layer_nontensor | 15.6300μs | 2.5609μs | 390.4902 KOps/s | 392.9770 KOps/s | |
test_tc_second_layer_tensor | 6.4033μs | 1.6170μs | 618.4151 KOps/s | 619.2610 KOps/s | |
test_tc_second_layer_nontensor | 16.9910μs | 3.3483μs | 298.6625 KOps/s | 295.8309 KOps/s | |
test_unbind | 0.1813s | 10.5059ms | 95.1843 Ops/s | 64.0069 Ops/s | |
test_full_like | 0.1691s | 0.6604ms | 1.5142 KOps/s | 1.7340 KOps/s | |
test_zeros_like | 0.2650ms | 0.1974ms | 5.0652 KOps/s | 5.0590 KOps/s | |
test_ones_like | 0.2334ms | 0.1975ms | 5.0642 KOps/s | 5.0626 KOps/s | |
test_clone | 0.4427ms | 0.4144ms | 2.4130 KOps/s | 2.4134 KOps/s | |
test_squeeze | 29.2110μs | 10.6579μs | 93.8269 KOps/s | 88.6054 KOps/s | |
test_unsqueeze | 0.2146ms | 80.3084μs | 12.4520 KOps/s | 11.9587 KOps/s | |
test_split | 0.4437ms | 0.1734ms | 5.7681 KOps/s | 5.6765 KOps/s | |
test_permute | 0.2507ms | 0.1896ms | 5.2755 KOps/s | 5.2027 KOps/s | |
test_stack | 1.2481ms | 0.9055ms | 1.1043 KOps/s | 1.0808 KOps/s | |
test_cat | 1.2476ms | 1.2316ms | 811.9537 Ops/s | 811.8915 Ops/s |
vmoens added a commit that referenced this pull request on Aug 9, 2024
ghstack-source-id: f16f5593b780c1d4538c1115b0d84b8ff173d0c7 Pull Request resolved: #956
Labels
CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
documentation: Improvements or additions to documentation
Stack from ghstack (oldest at bottom):