[Refactor] Make _set_dispatch_td_nn_modules compatible with compile #1084
Merged
vmoens added a commit that referenced this pull request on Nov 11, 2024
ghstack-source-id: 85a78cd6086233b414fcfe221dd8129e2e38f71c Pull Request resolved: #1084
facebook-github-bot added the CLA Signed label on Nov 11, 2024
This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 56.1350μs | 18.2674μs | 54.7424 KOps/s | 52.6051 KOps/s | |
test_plain_set_stack_nested | 54.9320μs | 18.2921μs | 54.6685 KOps/s | 52.2908 KOps/s | |
test_plain_set_nested_inplace | 60.7840μs | 19.5274μs | 51.2102 KOps/s | 47.7334 KOps/s | |
test_plain_set_stack_nested_inplace | 60.5330μs | 19.8163μs | 50.4636 KOps/s | 47.9882 KOps/s | |
test_items | 31.8300μs | 4.0950μs | 244.1983 KOps/s | 236.4556 KOps/s | |
test_items_nested | 0.5016ms | 0.3440ms | 2.9071 KOps/s | 2.9467 KOps/s | |
test_items_nested_locked | 0.5551ms | 0.3446ms | 2.9015 KOps/s | 2.8863 KOps/s | |
test_items_nested_leaf | 0.1401ms | 71.5893μs | 13.9686 KOps/s | 13.9227 KOps/s | |
test_items_stack_nested | 0.6900ms | 0.3440ms | 2.9066 KOps/s | 2.9304 KOps/s | |
test_items_stack_nested_leaf | 0.1555ms | 74.9646μs | 13.3396 KOps/s | 13.1065 KOps/s | |
test_items_stack_nested_locked | 0.7159ms | 0.3479ms | 2.8744 KOps/s | 2.9150 KOps/s | |
test_keys | 27.4010μs | 3.4793μs | 287.4106 KOps/s | 285.9070 KOps/s | |
test_keys_nested | 0.2308ms | 0.1358ms | 7.3626 KOps/s | 7.2917 KOps/s | |
test_keys_nested_locked | 1.9609ms | 0.1418ms | 7.0539 KOps/s | 7.0262 KOps/s | |
test_keys_nested_leaf | 0.2420ms | 0.1172ms | 8.5352 KOps/s | 8.5083 KOps/s | |
test_keys_stack_nested | 0.2693ms | 0.1354ms | 7.3870 KOps/s | 7.2711 KOps/s | |
test_keys_stack_nested_leaf | 0.1969ms | 0.1147ms | 8.7174 KOps/s | 8.4358 KOps/s | |
test_keys_stack_nested_locked | 0.2613ms | 0.1399ms | 7.1501 KOps/s | 7.0895 KOps/s | |
test_values | 6.1736μs | 1.0282μs | 972.5802 KOps/s | 963.4700 KOps/s | |
test_values_nested | 0.1185ms | 55.5497μs | 18.0019 KOps/s | 18.0522 KOps/s | |
test_values_nested_locked | 0.1084ms | 55.5991μs | 17.9859 KOps/s | 17.9743 KOps/s | |
test_values_nested_leaf | 0.1171ms | 60.9250μs | 16.4136 KOps/s | 16.3559 KOps/s | |
test_values_stack_nested | 0.1140ms | 57.6414μs | 17.3486 KOps/s | 17.5296 KOps/s | |
test_values_stack_nested_leaf | 0.1108ms | 60.1675μs | 16.6203 KOps/s | 16.3607 KOps/s | |
test_values_stack_nested_locked | 0.1121ms | 57.0579μs | 17.5261 KOps/s | 16.8257 KOps/s | |
test_membership | 15.0380μs | 0.8620μs | 1.1601 MOps/s | 1.1394 MOps/s | |
test_membership_nested | 16.7910μs | 2.7555μs | 362.9117 KOps/s | 367.6993 KOps/s | |
test_membership_nested_leaf | 29.0840μs | 2.7758μs | 360.2601 KOps/s | 364.2827 KOps/s | |
test_membership_stacked_nested | 21.7210μs | 2.7583μs | 362.5375 KOps/s | 362.8547 KOps/s | |
test_membership_stacked_nested_leaf | 23.8340μs | 2.7349μs | 365.6438 KOps/s | 358.8729 KOps/s | |
test_membership_nested_last | 38.7520μs | 4.1224μs | 242.5782 KOps/s | 246.9717 KOps/s | |
test_membership_nested_leaf_last | 29.4650μs | 4.0682μs | 245.8117 KOps/s | 244.6397 KOps/s | |
test_membership_stacked_nested_last | 49.6620μs | 13.1804μs | 75.8701 KOps/s | 247.3790 KOps/s | |
test_membership_stacked_nested_leaf_last | 40.2850μs | 12.9003μs | 77.5175 KOps/s | 247.4021 KOps/s | |
test_nested_getleaf | 35.6160μs | 10.7434μs | 93.0803 KOps/s | 92.5089 KOps/s | |
test_nested_get | 36.8980μs | 10.1276μs | 98.7397 KOps/s | 97.9351 KOps/s | |
test_stacked_getleaf | 36.8390μs | 10.5862μs | 94.4628 KOps/s | 92.6034 KOps/s | |
test_stacked_get | 30.5170μs | 10.0931μs | 99.0776 KOps/s | 97.7631 KOps/s | |
test_nested_getitemleaf | 37.3400μs | 11.1577μs | 89.6243 KOps/s | 88.6915 KOps/s | |
test_nested_getitem | 36.2370μs | 10.5097μs | 95.1506 KOps/s | 95.4455 KOps/s | |
test_stacked_getitemleaf | 47.3890μs | 11.2598μs | 88.8114 KOps/s | 86.6843 KOps/s | |
test_stacked_getitem | 35.5360μs | 10.3592μs | 96.5322 KOps/s | 94.3501 KOps/s | |
test_lock_nested | 2.7781ms | 0.4476ms | 2.2344 KOps/s | 1.8388 KOps/s | |
test_lock_stack_nested | 0.7311ms | 0.4038ms | 2.4766 KOps/s | 2.4102 KOps/s | |
test_unlock_nested | 0.7440ms | 0.3606ms | 2.7732 KOps/s | 2.7071 KOps/s | |
test_unlock_stack_nested | 0.4721ms | 0.3200ms | 3.1246 KOps/s | 2.9810 KOps/s | |
test_flatten_speed | 0.1846ms | 92.4299μs | 10.8190 KOps/s | 10.7926 KOps/s | |
test_unflatten_speed | 0.6366ms | 0.4812ms | 2.0783 KOps/s | 2.1347 KOps/s | |
test_common_ops | 1.5415ms | 0.7629ms | 1.3108 KOps/s | 1.2126 KOps/s | |
test_creation | 21.7110μs | 2.1419μs | 466.8822 KOps/s | 481.7156 KOps/s | |
test_creation_empty | 36.5780μs | 10.9703μs | 91.1554 KOps/s | 75.2768 KOps/s | |
test_creation_nested_1 | 42.8100μs | 13.7493μs | 72.7310 KOps/s | 62.1495 KOps/s | |
test_creation_nested_2 | 60.7530μs | 17.9737μs | 55.6369 KOps/s | 48.9896 KOps/s | |
test_clone | 79.7580μs | 13.1859μs | 75.8387 KOps/s | 76.0140 KOps/s | |
test_getitem[int] | 1.3457ms | 12.8439μs | 77.8581 KOps/s | 78.1964 KOps/s | |
test_getitem[slice_int] | 0.1393ms | 24.1681μs | 41.3769 KOps/s | 41.0851 KOps/s | |
test_getitem[range] | 0.1808ms | 48.2985μs | 20.7046 KOps/s | 20.7213 KOps/s | |
test_getitem[tuple] | 0.1504ms | 20.2769μs | 49.3171 KOps/s | 49.6313 KOps/s | |
test_getitem[list] | 0.1793ms | 44.4660μs | 22.4891 KOps/s | 22.7358 KOps/s | |
test_setitem_dim[int] | 56.0640μs | 26.2157μs | 38.1451 KOps/s | 37.5755 KOps/s | |
test_setitem_dim[slice_int] | 86.6610μs | 53.3631μs | 18.7395 KOps/s | 19.0240 KOps/s | |
test_setitem_dim[range] | 0.1205ms | 73.9736μs | 13.5183 KOps/s | 13.4412 KOps/s | |
test_setitem_dim[tuple] | 68.2670μs | 41.0504μs | 24.3603 KOps/s | 23.6853 KOps/s | |
test_setitem | 0.1313ms | 20.7145μs | 48.2754 KOps/s | 45.9047 KOps/s | |
test_set | 98.2330μs | 19.7738μs | 50.5720 KOps/s | 47.1724 KOps/s | |
test_set_shared | 3.2864ms | 0.1687ms | 5.9263 KOps/s | 5.9412 KOps/s | |
test_update | 0.8096ms | 22.4693μs | 44.5052 KOps/s | 40.3389 KOps/s | |
test_update_nested | 0.1306ms | 33.2430μs | 30.0815 KOps/s | 28.6179 KOps/s | |
test_update__nested | 0.1774ms | 33.7926μs | 29.5923 KOps/s | 30.9673 KOps/s | |
test_set_nested | 0.1017ms | 21.9456μs | 45.5673 KOps/s | 42.4397 KOps/s | |
test_set_nested_new | 81.6730μs | 26.8578μs | 37.2331 KOps/s | 35.4875 KOps/s | |
test_select | 0.2234ms | 42.5996μs | 23.4744 KOps/s | 22.4723 KOps/s | |
test_select_nested | 0.1349ms | 61.3189μs | 16.3082 KOps/s | 16.6308 KOps/s | |
test_exclude_nested | 0.1929ms | 77.2839μs | 12.9393 KOps/s | 13.2903 KOps/s | |
test_empty[True] | 0.5509ms | 0.3529ms | 2.8339 KOps/s | 2.8619 KOps/s | |
test_empty[False] | 10.8277μs | 1.3239μs | 755.3193 KOps/s | 755.1481 KOps/s | |
test_unbind_speed | 0.3521ms | 0.2661ms | 3.7586 KOps/s | 3.7083 KOps/s | |
test_unbind_speed_stack0 | 0.5181ms | 0.2552ms | 3.9188 KOps/s | 3.8043 KOps/s | |
test_unbind_speed_stack1 | 0.1023s | 0.7444ms | 1.3434 KOps/s | 1.4081 KOps/s | |
test_split | 2.5819ms | 1.6026ms | 623.9773 Ops/s | 563.3246 Ops/s | |
test_chunk | 0.1110s | 1.9461ms | 513.8536 Ops/s | 566.6031 Ops/s | |
test_consolidate_njt[False-None] | 11.1408ms | 8.1286ms | 123.0220 Ops/s | 123.2658 Ops/s | |
test_creation[device0] | 3.6947ms | 93.2979μs | 10.7184 KOps/s | 10.9292 KOps/s | |
test_creation_from_tensor | 0.2786ms | 92.6178μs | 10.7971 KOps/s | 10.7024 KOps/s | |
test_add_one[memmap_tensor0] | 0.2361ms | 4.7276μs | 211.5233 KOps/s | 203.4126 KOps/s | |
test_contiguous[memmap_tensor0] | 15.3890μs | 0.5157μs | 1.9389 MOps/s | 1.9576 MOps/s | |
test_stack[memmap_tensor0] | 29.7450μs | 3.2717μs | 305.6530 KOps/s | 290.7916 KOps/s | |
test_memmaptd_index | 1.0541ms | 0.2386ms | 4.1915 KOps/s | 4.1696 KOps/s | |
test_memmaptd_index_astensor | 0.5724ms | 0.3174ms | 3.1510 KOps/s | 3.1644 KOps/s | |
test_memmaptd_index_op | 0.9333ms | 0.5763ms | 1.7352 KOps/s | 1.5977 KOps/s | |
test_serialize_model | 0.1220s | 0.1167s | 8.5684 Ops/s | 7.4313 Ops/s | |
test_serialize_model_pickle | 0.5029s | 0.3903s | 2.5624 Ops/s | 2.5702 Ops/s | |
test_serialize_weights | 0.1175s | 0.1120s | 8.9286 Ops/s | 8.6973 Ops/s | |
test_serialize_weights_returnearly | 0.3521s | 0.1842s | 5.4294 Ops/s | 6.2473 Ops/s | |
test_serialize_weights_pickle | 1.0420s | 0.6750s | 1.4816 Ops/s | 2.4481 Ops/s | |
test_serialize_weights_filesystem | 0.1466s | 0.1430s | 6.9919 Ops/s | 6.3830 Ops/s | |
test_serialize_model_filesystem | 0.1487s | 0.1435s | 6.9677 Ops/s | 6.6023 Ops/s | |
test_reshape_pytree | 0.1188ms | 27.5915μs | 36.2430 KOps/s | 37.8629 KOps/s | |
test_reshape_td | 70.2910μs | 33.0757μs | 30.2337 KOps/s | 30.0614 KOps/s | |
test_view_pytree | 62.1360μs | 27.0018μs | 37.0346 KOps/s | 38.0179 KOps/s | |
test_view_td | 90.6390μs | 38.4528μs | 26.0059 KOps/s | 26.4979 KOps/s | |
test_unbind_pytree | 99.0040μs | 30.3155μs | 32.9864 KOps/s | 33.6151 KOps/s | |
test_unbind_td | 0.3525ms | 39.2897μs | 25.4520 KOps/s | 25.6194 KOps/s | |
test_split_pytree | 74.8190μs | 29.7768μs | 33.5832 KOps/s | 34.0818 KOps/s | |
test_split_td | 0.1093s | 55.7086μs | 17.9506 KOps/s | 22.1238 KOps/s | |
test_add_pytree | 81.5620μs | 34.9723μs | 28.5941 KOps/s | 27.8055 KOps/s | |
test_add_td | 0.1144ms | 53.0755μs | 18.8411 KOps/s | 16.7867 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1912ms | 64.0043μs | 15.6240 KOps/s | 15.8475 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3694ms | 0.1623ms | 6.1618 KOps/s | 6.2463 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1019ms | 46.7984μs | 21.3682 KOps/s | 21.8698 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2068ms | 0.1173ms | 8.5285 KOps/s | 8.4053 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 58.1980μs | 26.3172μs | 37.9979 KOps/s | 39.3175 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1027ms | 54.0959μs | 18.4857 KOps/s | 18.6664 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1787ms | 79.9248μs | 12.5118 KOps/s | 12.8657 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1368ms | 67.8369μs | 14.7412 KOps/s | 14.7272 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.1860ms | 0.1053ms | 9.4957 KOps/s | 9.5042 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.4292ms | 0.2032ms | 4.9223 KOps/s | 5.0107 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1375ms | 45.6190μs | 21.9207 KOps/s | 22.3365 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.4702ms | 63.7519μs | 15.6858 KOps/s | 16.2515 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.1868ms | 0.1039ms | 9.6261 KOps/s | 9.7459 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.3832ms | 0.2011ms | 4.9735 KOps/s | 4.8783 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3863ms | 0.2111ms | 4.7363 KOps/s | 4.7126 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2396ms | 0.1076ms | 9.2896 KOps/s | 9.5640 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.3595ms | 58.1955μs | 17.1835 KOps/s | 18.3894 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1240ms | 48.1319μs | 20.7762 KOps/s | 21.6987 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.6680ms | 0.1592ms | 6.2816 KOps/s | 6.2741 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.1917ms | 0.1056ms | 9.4723 KOps/s | 9.3446 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 84.7580μs | 22.2648μs | 44.9140 KOps/s | 47.8046 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1196ms | 59.4986μs | 16.8071 KOps/s | 17.3337 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1619ms | 81.3400μs | 12.2941 KOps/s | 12.5311 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1275ms | 68.4117μs | 14.6174 KOps/s | 14.7641 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.3037ms | 0.2107ms | 4.7457 KOps/s | 4.7428 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.5675ms | 1.3099ms | 763.4260 Ops/s | 771.0764 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.4097ms | 0.2089ms | 4.7872 KOps/s | 4.8978 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.4165ms | 0.7843ms | 1.2750 KOps/s | 1.2764 KOps/s | |
test_compile_assign_and_add_stack[compile] | 0.5736ms | 0.4607ms | 2.1704 KOps/s | 2.1695 KOps/s | |
test_compile_assign_and_add_stack[eager] | 3.7087ms | 2.5746ms | 388.4051 Ops/s | 360.9966 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 86.7710μs | 37.2154μs | 26.8706 KOps/s | 27.7346 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.5072ms | 32.0443μs | 31.2068 KOps/s | 29.6975 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 86.3110μs | 30.7627μs | 32.5069 KOps/s | 33.7074 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 70.0300μs | 22.7817μs | 43.8949 KOps/s | 41.9018 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 82.3030μs | 31.2648μs | 31.9848 KOps/s | 32.1741 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 84.7470μs | 23.1040μs | 43.2825 KOps/s | 42.2989 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1138ms | 51.6761μs | 19.3513 KOps/s | 19.3391 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.5266ms | 19.5174μs | 51.2363 KOps/s | 49.5024 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 97.8720μs | 44.3731μs | 22.5362 KOps/s | 22.8456 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1001ms | 18.9954μs | 52.6444 KOps/s | 52.8330 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1071ms | 45.7547μs | 21.8557 KOps/s | 22.2612 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 54.8020μs | 18.8761μs | 52.9770 KOps/s | 53.0848 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1735ms | 53.2082μs | 18.7941 KOps/s | 19.0330 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.9424ms | 19.3547μs | 51.6671 KOps/s | 50.4656 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1229ms | 45.8359μs | 21.8170 KOps/s | 22.3963 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 85.2680μs | 18.8996μs | 52.9113 KOps/s | 53.2338 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1202ms | 45.9164μs | 21.7787 KOps/s | 22.2282 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1083ms | 19.0111μs | 52.6010 KOps/s | 53.6024 KOps/s | |
test_mod_add[eager] | 73.1470μs | 26.6568μs | 37.5138 KOps/s | 35.4504 KOps/s | |
test_mod_add[compile] | 95.6080μs | 45.2552μs | 22.0969 KOps/s | 21.5891 KOps/s | |
test_mod_add[compile-overhead] | 0.1069ms | 45.0004μs | 22.2220 KOps/s | 21.8358 KOps/s | |
test_mod_wrap[eager] | 0.4248ms | 0.2160ms | 4.6300 KOps/s | 4.5688 KOps/s | |
test_mod_wrap[compile] | 1.7112ms | 0.2014ms | 4.9653 KOps/s | 4.8223 KOps/s | |
test_mod_wrap[compile-overhead] | 1.7755ms | 0.2001ms | 4.9963 KOps/s | 4.8799 KOps/s | |
test_mod_wrap_and_backward[eager] | 15.0048ms | 11.1929ms | 89.3423 Ops/s | 77.7969 Ops/s | |
test_mod_wrap_and_backward[compile] | 16.6407ms | 11.3919ms | 87.7814 Ops/s | 76.1291 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 17.4138ms | 11.9899ms | 83.4039 Ops/s | 73.9850 Ops/s | |
test_seq_add[eager] | 0.2235ms | 93.2954μs | 10.7186 KOps/s | 10.3658 KOps/s | |
test_seq_add[compile] | 0.1271ms | 59.6013μs | 16.7782 KOps/s | 16.9614 KOps/s | |
test_seq_add[compile-overhead] | 0.1122ms | 57.4730μs | 17.3995 KOps/s | 17.0058 KOps/s | |
test_seq_wrap[eager] | 0.7408ms | 0.3968ms | 2.5204 KOps/s | 2.4835 KOps/s | |
test_seq_wrap[compile] | 0.4460ms | 0.2250ms | 4.4452 KOps/s | 4.3903 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3709ms | 0.2254ms | 4.4373 KOps/s | 4.3980 KOps/s | |
test_func_call_runtime[False-eager] | 0.7806ms | 0.5580ms | 1.7923 KOps/s | 1.8341 KOps/s | |
test_func_call_runtime[False-compile] | 0.8699ms | 0.4317ms | 2.3162 KOps/s | 2.3583 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8060ms | 0.4287ms | 2.3325 KOps/s | 2.3393 KOps/s | |
test_func_call_runtime[True-eager] | 1.0603ms | 0.7691ms | 1.3002 KOps/s | 1.3189 KOps/s | |
test_func_call_runtime[True-compile] | 0.8935ms | 0.4663ms | 2.1445 KOps/s | 2.1440 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8509ms | 0.4645ms | 2.1526 KOps/s | 2.1524 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8241ms | 0.5513ms | 1.8138 KOps/s | 1.8469 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9320ms | 0.4270ms | 2.3422 KOps/s | 2.3356 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.7998ms | 0.4317ms | 2.3162 KOps/s | 2.3522 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3085ms | 0.9124ms | 1.0960 KOps/s | 1.1020 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.1178ms | 0.4946ms | 2.0220 KOps/s | 2.0109 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6091ms | 0.4956ms | 2.0177 KOps/s | 2.0393 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5856ms | 1.9125ms | 522.8853 Ops/s | 528.0477 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.7926ms | 0.5224ms | 1.9141 KOps/s | 1.9018 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.6947ms | 0.5218ms | 1.9165 KOps/s | 1.8536 KOps/s | |
test_distributed | 0.2734ms | 0.1277ms | 7.8295 KOps/s | 7.7423 KOps/s | |
test_tdmodule | 40.6350μs | 18.8986μs | 52.9139 KOps/s | 50.2387 KOps/s | |
test_tdmodule_dispatch | 64.0700μs | 36.5632μs | 27.3499 KOps/s | 25.4335 KOps/s | |
test_tdseq | 45.0950μs | 21.4472μs | 46.6261 KOps/s | 43.3396 KOps/s | |
test_tdseq_dispatch | 77.0140μs | 42.3937μs | 23.5884 KOps/s | 21.8855 KOps/s | |
test_instantiation_functorch | 1.7030ms | 1.5536ms | 643.6662 Ops/s | 644.1281 Ops/s | |
test_exec_functorch | 0.3240ms | 0.1791ms | 5.5836 KOps/s | 5.5623 KOps/s | |
test_exec_functional_call | 0.3063ms | 0.1774ms | 5.6364 KOps/s | 5.7575 KOps/s | |
test_exec_td_decorator | 0.4917ms | 0.2328ms | 4.2949 KOps/s | 4.3695 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9946ms | 0.6464ms | 1.5471 KOps/s | 1.5275 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.3531ms | 0.6438ms | 1.5533 KOps/s | 1.5678 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9965ms | 0.5348ms | 1.8697 KOps/s | 1.9271 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7328ms | 0.5284ms | 1.8927 KOps/s | 1.9262 KOps/s | |
test_to_module_speed[True] | 2.1881ms | 1.3058ms | 765.8382 Ops/s | 778.6525 Ops/s | |
test_to_module_speed[False] | 2.1386ms | 1.2949ms | 772.2884 Ops/s | 803.8592 Ops/s | |
test_tc_init | 88.4950μs | 44.2686μs | 22.5894 KOps/s | 19.7066 KOps/s | |
test_tc_init_nested | 0.1651ms | 87.9831μs | 11.3658 KOps/s | 9.8737 KOps/s | |
test_tc_first_layer_tensor | 27.6620μs | 1.5477μs | 646.1076 KOps/s | 646.4828 KOps/s | |
test_tc_first_layer_nontensor | 23.0830μs | 4.8016μs | 208.2627 KOps/s | 206.7035 KOps/s | |
test_tc_second_layer_tensor | 28.3730μs | 2.8793μs | 347.3042 KOps/s | 343.4473 KOps/s | |
test_tc_second_layer_nontensor | 57.3060μs | 6.2294μs | 160.5285 KOps/s | 159.8004 KOps/s | |
test_unbind | 0.2386s | 12.9622ms | 77.1476 Ops/s | 75.0869 Ops/s | |
test_full_like | 15.5450ms | 12.2456ms | 81.6620 Ops/s | 130.8345 Ops/s | |
test_zeros_like | 15.2822ms | 7.5922ms | 131.7133 Ops/s | 344.8787 Ops/s | |
test_ones_like | 12.1736ms | 7.6456ms | 130.7948 Ops/s | 288.8177 Ops/s | |
test_clone | 12.6744ms | 9.1825ms | 108.9022 Ops/s | 189.1269 Ops/s | |
test_squeeze | 57.2270μs | 11.8140μs | 84.6454 KOps/s | 86.2258 KOps/s | |
test_unsqueeze | 0.1803ms | 87.9719μs | 11.3673 KOps/s | 11.2393 KOps/s | |
test_split | 0.5017ms | 0.1936ms | 5.1657 KOps/s | 5.3684 KOps/s | |
test_permute | 0.3106ms | 0.2228ms | 4.4878 KOps/s | 4.5901 KOps/s | |
test_stack | 29.7855ms | 23.8272ms | 41.9688 Ops/s | 39.2103 Ops/s | |
test_cat | 25.0331ms | 23.5335ms | 42.4926 Ops/s | 39.5533 Ops/s |
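
The Change column is empty in the rows above. A minimal sketch of how it could be filled in from the two throughput columns, assuming "Ops" is measured on this PR's branch and "Ops on Repo HEAD" is the baseline (the page does not say so explicitly). The helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling:

```python
# Multipliers that convert each throughput unit used in the table to plain Ops/s.
UNIT_TO_OPS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Turn a cell such as '54.7424 KOps/s' into a float in Ops/s."""
    value, unit = cell.split()
    return float(value) * UNIT_TO_OPS[unit]


def relative_change(row: str):
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    name, ops, head = cells[0], parse_ops(cells[3]), parse_ops(cells[4])
    return name, (ops - head) / head * 100.0


# Two rows copied verbatim from the table above.
rows = [
    "test_membership_stacked_nested_last | 49.6620μs | 13.1804μs | 75.8701 KOps/s | 247.3790 KOps/s | |",
    "test_split | 2.5819ms | 1.6026ms | 623.9773 Ops/s | 563.3246 Ops/s | |",
]
changes = dict(relative_change(r) for r in rows)
for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Normalising the units first matters because rows mix Ops/s, KOps/s, and MOps/s. Under the stated assumption, a negative value means the branch is slower than HEAD for that test; single-run numbers like these can be noisy, so a large swing is a prompt to re-run rather than proof of a regression.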
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 32.5000μs | 11.3883μs | 87.8097 KOps/s | 89.1362 KOps/s | |
test_plain_set_stack_nested | 0.1094ms | 11.5582μs | 86.5187 KOps/s | 87.3994 KOps/s | |
test_plain_set_nested_inplace | 49.9210μs | 12.3201μs | 81.1679 KOps/s | 81.2741 KOps/s | |
test_plain_set_stack_nested_inplace | 39.3200μs | 12.3157μs | 81.1972 KOps/s | 81.7740 KOps/s | |
test_items | 40.9810μs | 2.8940μs | 345.5433 KOps/s | 342.5475 KOps/s | |
test_items_nested | 0.3697ms | 0.3225ms | 3.1004 KOps/s | 3.0755 KOps/s | |
test_items_nested_locked | 0.3947ms | 0.3240ms | 3.0865 KOps/s | 3.0399 KOps/s | |
test_items_nested_leaf | 92.6820μs | 58.0128μs | 17.2376 KOps/s | 17.2242 KOps/s | |
test_items_stack_nested | 0.4935ms | 0.3233ms | 3.0927 KOps/s | 3.0097 KOps/s | |
test_items_stack_nested_leaf | 0.2338ms | 59.7639μs | 16.7325 KOps/s | 16.9906 KOps/s | |
test_items_stack_nested_locked | 0.5208ms | 0.3272ms | 3.0566 KOps/s | 3.0694 KOps/s | |
test_keys | 29.1810μs | 3.4657μs | 288.5386 KOps/s | 286.1515 KOps/s | |
test_keys_nested | 0.1025ms | 69.7941μs | 14.3278 KOps/s | 14.2681 KOps/s | |
test_keys_nested_locked | 2.8663ms | 75.2042μs | 13.2971 KOps/s | 13.0717 KOps/s | |
test_keys_nested_leaf | 91.5720μs | 60.8320μs | 16.4387 KOps/s | 16.3216 KOps/s | |
test_keys_stack_nested | 0.1064ms | 70.7589μs | 14.1325 KOps/s | 14.1295 KOps/s | |
test_keys_stack_nested_leaf | 0.2260ms | 61.6364μs | 16.2242 KOps/s | 16.1147 KOps/s | |
test_keys_stack_nested_locked | 0.2528ms | 75.8371μs | 13.1862 KOps/s | 13.1087 KOps/s | |
test_values | 28.5588μs | 0.8645μs | 1.1567 MOps/s | 1.1740 MOps/s | |
test_values_nested | 58.2020μs | 31.0739μs | 32.1814 KOps/s | 32.1401 KOps/s | |
test_values_nested_locked | 0.1163ms | 32.6200μs | 30.6560 KOps/s | 30.6457 KOps/s | |
test_values_nested_leaf | 0.1468ms | 33.7138μs | 29.6615 KOps/s | 29.8668 KOps/s | |
test_values_stack_nested | 59.4810μs | 31.5063μs | 31.7397 KOps/s | 31.6733 KOps/s | |
test_values_stack_nested_leaf | 89.5410μs | 34.4418μs | 29.0345 KOps/s | 29.3643 KOps/s | |
test_values_stack_nested_locked | 87.6510μs | 33.0175μs | 30.2869 KOps/s | 30.3634 KOps/s | |
test_membership | 5.1046μs | 0.5067μs | 1.9737 MOps/s | 1.9656 MOps/s | |
test_membership_nested | 17.0205μs | 1.8838μs | 530.8398 KOps/s | 540.8424 KOps/s | |
test_membership_nested_leaf | 16.7600μs | 1.8935μs | 528.1225 KOps/s | 532.3517 KOps/s | |
test_membership_stacked_nested | 40.4500μs | 2.0221μs | 494.5318 KOps/s | 508.8040 KOps/s | |
test_membership_stacked_nested_leaf | 29.0100μs | 1.9932μs | 501.7107 KOps/s | 516.9342 KOps/s | |
test_membership_nested_last | 27.7300μs | 2.7840μs | 359.1937 KOps/s | 360.7225 KOps/s | |
test_membership_nested_leaf_last | 45.7410μs | 2.7934μs | 357.9896 KOps/s | 356.0740 KOps/s | |
test_membership_stacked_nested_last | 26.5510μs | 3.2699μs | 305.8202 KOps/s | 361.8897 KOps/s | |
test_membership_stacked_nested_leaf_last | 98.8920μs | 3.2261μs | 309.9762 KOps/s | 361.3993 KOps/s | |
test_nested_getleaf | 1.6967ms | 6.0433μs | 165.4734 KOps/s | 167.2056 KOps/s | |
test_nested_get | 88.4820μs | 5.6562μs | 176.7983 KOps/s | 174.7884 KOps/s | |
test_stacked_getleaf | 39.7110μs | 5.9889μs | 166.9761 KOps/s | 167.0494 KOps/s | |
test_stacked_get | 38.0300μs | 5.6442μs | 177.1722 KOps/s | 175.6811 KOps/s | |
test_nested_getitemleaf | 34.7710μs | 6.0985μs | 163.9745 KOps/s | 164.2013 KOps/s | |
test_nested_getitem | 30.7610μs | 5.7551μs | 173.7598 KOps/s | 172.9363 KOps/s | |
test_stacked_getitemleaf | 61.6810μs | 6.1014μs | 163.8973 KOps/s | 164.5502 KOps/s | |
test_stacked_getitem | 32.9910μs | 5.7812μs | 172.9759 KOps/s | 173.7539 KOps/s | |
test_lock_nested | 8.1975ms | 0.3749ms | 2.6671 KOps/s | 2.7364 KOps/s | |
test_lock_stack_nested | 0.4625ms | 0.3314ms | 3.0178 KOps/s | 2.9871 KOps/s | |
test_unlock_nested | 0.6775ms | 0.3056ms | 3.2724 KOps/s | 3.2891 KOps/s | |
test_unlock_stack_nested | 0.3908ms | 0.2721ms | 3.6757 KOps/s | 3.6690 KOps/s | |
test_flatten_speed | 0.1452ms | 71.6516μs | 13.9564 KOps/s | 13.8196 KOps/s | |
test_unflatten_speed | 0.4895ms | 0.2929ms | 3.4139 KOps/s | 3.4324 KOps/s | |
test_common_ops | 1.8711ms | 0.6168ms | 1.6213 KOps/s | 1.6195 KOps/s | |
test_creation | 28.2300μs | 1.4660μs | 682.1208 KOps/s | 682.9713 KOps/s | |
test_creation_empty | 35.7910μs | 9.1000μs | 109.8907 KOps/s | 112.9340 KOps/s | |
test_creation_nested_1 | 0.2039ms | 10.7046μs | 93.4178 KOps/s | 97.2732 KOps/s | |
test_creation_nested_2 | 0.2041ms | 13.2008μs | 75.7527 KOps/s | 76.5733 KOps/s | |
test_clone | 55.2510μs | 10.3212μs | 96.8884 KOps/s | 89.7223 KOps/s | |
test_getitem[int] | 2.4140ms | 11.0408μs | 90.5730 KOps/s | 91.8136 KOps/s | |
test_getitem[slice_int] | 0.1438ms | 21.4285μs | 46.6669 KOps/s | 46.3757 KOps/s | |
test_getitem[range] | 0.1823ms | 38.4024μs | 26.0400 KOps/s | 26.0234 KOps/s | |
test_getitem[tuple] | 0.1189ms | 18.9865μs | 52.6689 KOps/s | 54.3388 KOps/s | |
test_getitem[list] | 0.1052s | 43.0739μs | 23.2159 KOps/s | 29.2813 KOps/s | |
test_setitem_dim[int] | 43.6510μs | 18.9388μs | 52.8015 KOps/s | 51.2410 KOps/s | |
test_setitem_dim[slice_int] | 0.1445ms | 38.0389μs | 26.2889 KOps/s | 25.8845 KOps/s | |
test_setitem_dim[range] | 83.2220μs | 52.7085μs | 18.9723 KOps/s | 18.2420 KOps/s | |
test_setitem_dim[tuple] | 53.2010μs | 31.3952μs | 31.8521 KOps/s | 30.2295 KOps/s | |
test_setitem | 0.1844ms | 15.8060μs | 63.2673 KOps/s | 61.5877 KOps/s | |
test_set | 0.1522ms | 14.9224μs | 67.0135 KOps/s | 64.4870 KOps/s | |
test_set_shared | 1.6343ms | 0.1476ms | 6.7751 KOps/s | 6.6528 KOps/s | |
test_update | 0.3991ms | 18.2917μs | 54.6695 KOps/s | 53.0100 KOps/s | |
test_update_nested | 0.2200ms | 23.6162μs | 42.3438 KOps/s | 42.0733 KOps/s | |
test_update__nested | 0.7529ms | 24.2966μs | 41.1581 KOps/s | 40.3456 KOps/s | |
test_set_nested | 0.1861ms | 16.0926μs | 62.1402 KOps/s | 59.2297 KOps/s | |
test_set_nested_new | 0.2179ms | 18.4091μs | 54.3210 KOps/s | 51.9380 KOps/s | |
test_select | 0.2168ms | 30.1267μs | 33.1932 KOps/s | 32.1104 KOps/s | |
test_select_nested | 0.2290ms | 41.2994μs | 24.2134 KOps/s | 23.8543 KOps/s | |
test_exclude_nested | 84.3120μs | 58.0883μs | 17.2152 KOps/s | 16.9448 KOps/s | |
test_empty[True] | 0.3113ms | 0.2515ms | 3.9766 KOps/s | 3.8838 KOps/s | |
test_empty[False] | 7.3741μs | 0.7421μs | 1.3475 MOps/s | 1.3376 MOps/s | |
test_to | 86.6520μs | 53.6085μs | 18.6538 KOps/s | 17.9424 KOps/s | |
test_to_nonblocking | 0.2160ms | 45.8219μs | 21.8236 KOps/s | 21.7702 KOps/s | |
test_unbind_speed | 0.3101ms | 0.2294ms | 4.3597 KOps/s | 4.3309 KOps/s | |
test_unbind_speed_stack0 | 0.2976ms | 0.2279ms | 4.3875 KOps/s | 4.2895 KOps/s | |
test_unbind_speed_stack1 | 0.1065s | 0.6482ms | 1.5428 KOps/s | 1.5196 KOps/s | |
test_split | 0.1119s | 1.6502ms | 605.9972 Ops/s | 554.8156 Ops/s | |
test_chunk | 0.1084s | 1.6418ms | 609.1046 Ops/s | 665.6010 Ops/s | |
test_consolidate[False-None] | 0.1128s | 2.8957ms | 345.3408 Ops/s | 322.5240 Ops/s | |
test_consolidate[default-None] | 1.9138ms | 1.7063ms | 586.0560 Ops/s | 591.7415 Ops/s | |
test_consolidate[reduce-overhead-None] | 1.9124ms | 1.7089ms | 585.1662 Ops/s | 585.8267 Ops/s | |
test_consolidate_njt[False-None] | 7.0360ms | 6.5145ms | 153.5040 Ops/s | 150.9491 Ops/s | |
test_to[False-False-None] | 1.9021ms | 1.6288ms | 613.9611 Ops/s | 600.7517 Ops/s | |
test_to[True-False-None] | 1.6549ms | 1.3063ms | 765.5494 Ops/s | 777.5490 Ops/s | |
test_to[within-False-None] | 0.3925s | 5.5652ms | 179.6889 Ops/s | 248.4570 Ops/s | |
test_to[True-default-None] | 5.3996ms | 5.0235ms | 199.0650 Ops/s | 196.9447 Ops/s | |
test_to_njt[False-False-None] | 7.3135ms | 6.8848ms | 145.2483 Ops/s | 143.8930 Ops/s | |
test_to_njt[True-False-None] | 5.9313ms | 5.4898ms | 182.1565 Ops/s | 182.3257 Ops/s | |
test_to_njt[within-False-None] | 12.8135ms | 12.2443ms | 81.6709 Ops/s | 82.7125 Ops/s | |
test_creation[device0] | 0.3767ms | 79.9599μs | 12.5063 KOps/s | 11.8890 KOps/s | |
test_creation_from_tensor | 0.4908ms | 82.7540μs | 12.0840 KOps/s | 11.7804 KOps/s | |
test_add_one[memmap_tensor0] | 0.7389ms | 7.2326μs | 138.2627 KOps/s | 134.8497 KOps/s | |
test_contiguous[memmap_tensor0] | 4.0631μs | 0.4167μs | 2.4001 MOps/s | 2.3970 MOps/s | |
test_stack[memmap_tensor0] | 0.1525ms | 4.7008μs | 212.7311 KOps/s | 210.3067 KOps/s | |
test_memmaptd_index | 2.1199ms | 0.2540ms | 3.9364 KOps/s | 3.8637 KOps/s | |
test_memmaptd_index_astensor | 0.8629ms | 0.3089ms | 3.2372 KOps/s | 3.1490 KOps/s | |
test_memmaptd_index_op | 1.1104ms | 0.6163ms | 1.6225 KOps/s | 1.5789 KOps/s | |
test_serialize_model | 0.1325s | 0.1316s | 7.5993 Ops/s | 5.0754 Ops/s | |
test_serialize_model_pickle | 1.4539s | 1.2080s | 0.8278 Ops/s | 0.8387 Ops/s | |
test_serialize_weights | 0.1329s | 0.1312s | 7.6196 Ops/s | 7.6034 Ops/s | |
test_serialize_weights_returnearly | 0.4422s | 72.9851ms | 13.7014 Ops/s | 23.6464 Ops/s | |
test_serialize_weights_pickle | 1.3756s | 1.2167s | 0.8219 Ops/s | 0.8218 Ops/s | |
test_reshape_pytree | 0.1644ms | 22.6936μs | 44.0652 KOps/s | 43.6354 KOps/s | |
test_reshape_td | 0.1020ms | 26.8232μs | 37.2811 KOps/s | 37.6644 KOps/s | |
test_view_pytree | 0.1628ms | 22.2938μs | 44.8554 KOps/s | 44.0888 KOps/s | |
test_view_td | 0.1501ms | 30.0140μs | 33.3178 KOps/s | 32.2702 KOps/s | |
test_unbind_pytree | 0.1830ms | 29.5896μs | 33.7956 KOps/s | 35.0241 KOps/s | |
test_unbind_td | 0.9932ms | 35.5801μs | 28.1056 KOps/s | 28.0141 KOps/s | |
test_split_pytree | 0.1284ms | 31.0076μs | 32.2502 KOps/s | 32.5514 KOps/s | |
test_split_td | 0.1762ms | 39.1792μs | 25.5238 KOps/s | 25.1869 KOps/s | |
test_add_pytree | 0.1678ms | 35.6635μs | 28.0399 KOps/s | 27.6133 KOps/s | |
test_add_td | 0.1605ms | 50.6014μs | 19.7623 KOps/s | 19.9620 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.2680ms | 0.1175ms | 8.5127 KOps/s | 8.1249 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3104ms | 0.1242ms | 8.0485 KOps/s | 7.8203 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2758ms | 97.8208μs | 10.2228 KOps/s | 10.3662 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 1.2083ms | 0.1487ms | 6.7233 KOps/s | 6.6740 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.2399ms | 29.5493μs | 33.8418 KOps/s | 47.1882 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1756ms | 26.9844μs | 37.0584 KOps/s | 36.9096 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.2505ms | 64.8102μs | 15.4297 KOps/s | 15.2210 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1969ms | 49.3607μs | 20.2590 KOps/s | 20.1023 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.2872ms | 0.1418ms | 7.0516 KOps/s | 6.8960 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3530ms | 0.2094ms | 4.7762 KOps/s | 4.8665 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2776ms | 99.5097μs | 10.0493 KOps/s | 10.2604 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2221ms | 52.6053μs | 19.0095 KOps/s | 18.9408 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.2840ms | 0.1360ms | 7.3529 KOps/s | 7.2734 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6575ms | 0.4793ms | 2.0863 KOps/s | 2.0723 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4277ms | 0.2462ms | 4.0613 KOps/s | 4.0658 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.2803ms | 0.1418ms | 7.0531 KOps/s | 6.9930 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2287ms | 62.9570μs | 15.8839 KOps/s | 15.7356 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2437ms | 98.2499μs | 10.1781 KOps/s | 10.1487 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5595ms | 0.3983ms | 2.5107 KOps/s | 2.5245 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.2640ms | 0.1363ms | 7.3371 KOps/s | 7.4219 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1227ms | 17.9296μs | 55.7737 KOps/s | 55.1572 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 63.0010μs | 27.2366μs | 36.7152 KOps/s | 37.0941 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1727ms | 69.8884μs | 14.3085 KOps/s | 14.3390 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1093ms | 51.6777μs | 19.3507 KOps/s | 19.3800 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 1.7104ms | 0.4669ms | 2.1420 KOps/s | 2.1924 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 2.8783ms | 2.5589ms | 390.7871 Ops/s | 387.8118 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 1.5960ms | 0.4360ms | 2.2934 KOps/s | 2.2146 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.0246ms | 2.6027ms | 384.2122 Ops/s | 379.3321 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.5426ms | 0.1181ms | 8.4697 KOps/s | 8.4597 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.6013ms | 82.4743μs | 12.1250 KOps/s | 12.4443 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.3029ms | 0.1101ms | 9.0830 KOps/s | 9.2111 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.5063ms | 70.3212μs | 14.2205 KOps/s | 13.6841 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.3202ms | 0.1119ms | 8.9369 KOps/s | 9.4773 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.2638ms | 70.2705μs | 14.2307 KOps/s | 14.6060 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2946ms | 0.1049ms | 9.5354 KOps/s | 9.8343 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.2158ms | 17.2269μs | 58.0489 KOps/s | 55.8557 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2749ms | 0.1014ms | 9.8638 KOps/s | 10.3116 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1435ms | 16.1843μs | 61.7882 KOps/s | 61.2369 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2738ms | 0.1025ms | 9.7606 KOps/s | 9.9729 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 0.1601ms | 16.2736μs | 61.4493 KOps/s | 61.8302 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2831ms | 0.1040ms | 9.6125 KOps/s | 9.6846 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.6707ms | 17.0136μs | 58.7765 KOps/s | 57.5651 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2398ms | 98.2744μs | 10.1756 KOps/s | 9.8469 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1592ms | 16.2935μs | 61.3740 KOps/s | 61.1839 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2743ms | 0.1019ms | 9.8100 KOps/s | 10.2262 KOps/s | |
test_compile_indexing[int-pytree-eager] | 0.1553ms | 16.1676μs | 61.8520 KOps/s | 61.9217 KOps/s | |
test_mod_add[eager] | 0.2091ms | 35.0275μs | 28.5490 KOps/s | 30.4882 KOps/s | |
test_mod_add[compile] | 0.2206ms | 77.0749μs | 12.9744 KOps/s | 12.9497 KOps/s | |
test_mod_add[compile-overhead] | 0.3308ms | 0.1755ms | 5.6993 KOps/s | 5.7671 KOps/s | |
test_mod_wrap[eager] | 0.3941ms | 0.2447ms | 4.0861 KOps/s | 4.0513 KOps/s | |
test_mod_wrap[compile] | 1.7697ms | 0.2933ms | 3.4095 KOps/s | 3.4830 KOps/s | |
test_mod_wrap[compile-overhead] | 7.6743ms | 4.0743ms | 245.4398 Ops/s | 238.1531 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.6147ms | 1.3410ms | 745.7308 Ops/s | 695.7402 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.5978ms | 1.2870ms | 776.9765 Ops/s | 723.7585 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4096ms | 0.9536ms | 1.0487 KOps/s | 949.4089 Ops/s | |
test_seq_add[eager] | 0.2911ms | 99.6050μs | 10.0397 KOps/s | 10.0313 KOps/s | |
test_seq_add[compile] | 0.2932ms | 87.2108μs | 11.4665 KOps/s | 11.1815 KOps/s | |
test_seq_add[compile-overhead] | 0.2770ms | 0.1277ms | 7.8298 KOps/s | 7.7505 KOps/s | |
test_seq_wrap[eager] | 0.5636ms | 0.3887ms | 2.5729 KOps/s | 2.5652 KOps/s | |
test_seq_wrap[compile] | 0.5118ms | 0.3070ms | 3.2576 KOps/s | 3.1396 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3725ms | 0.2247ms | 4.4508 KOps/s | 4.3020 KOps/s | |
test_func_call_runtime[False-eager] | 0.9010ms | 0.7374ms | 1.3561 KOps/s | 1.3236 KOps/s | |
test_func_call_runtime[False-compile] | 0.9146ms | 0.7506ms | 1.3324 KOps/s | 1.3203 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5126ms | 0.3644ms | 2.7439 KOps/s | 2.7241 KOps/s | |
test_func_call_runtime[True-eager] | 1.2079ms | 0.9069ms | 1.1027 KOps/s | 1.0989 KOps/s | |
test_func_call_runtime[True-compile] | 0.9648ms | 0.7772ms | 1.2866 KOps/s | 1.2845 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5319ms | 0.3856ms | 2.5934 KOps/s | 2.6090 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.9782ms | 0.7913ms | 1.2638 KOps/s | 1.3509 KOps/s | |
test_func_call_cm_runtime[False-compile] | 1.0223ms | 0.7706ms | 1.2978 KOps/s | 1.3192 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.5093ms | 0.3660ms | 2.7324 KOps/s | 2.7032 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.1619ms | 1.0000ms | 1.0000 KOps/s | 989.7729 Ops/s | |
test_func_call_cm_runtime[True-compile] | 1.0028ms | 0.8065ms | 1.2400 KOps/s | 1.2313 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.6001ms | 0.4113ms | 2.4312 KOps/s | 2.4091 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5033ms | 2.0557ms | 486.4499 Ops/s | 478.5895 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9882ms | 0.8112ms | 1.2328 KOps/s | 1.2236 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5745ms | 0.4142ms | 2.4142 KOps/s | 2.3688 KOps/s | |
test_distributed | 1.7962ms | 0.1261ms | 7.9298 KOps/s | 8.3300 KOps/s | |
test_tdmodule | 0.1286ms | 15.7193μs | 63.6160 KOps/s | 69.5759 KOps/s | |
test_tdmodule_dispatch | 0.1348ms | 30.4377μs | 32.8540 KOps/s | 35.8400 KOps/s | |
test_tdseq | 36.5910μs | 16.6913μs | 59.9114 KOps/s | 63.8164 KOps/s | |
test_tdseq_dispatch | 56.4110μs | 33.3502μs | 29.9848 KOps/s | 31.4026 KOps/s | |
test_instantiation_functorch | 1.7225ms | 1.5626ms | 639.9490 Ops/s | 641.5820 Ops/s | |
test_exec_functorch | 0.2374ms | 0.1465ms | 6.8252 KOps/s | 6.6015 KOps/s | |
test_exec_functional_call | 0.3119ms | 0.1448ms | 6.9072 KOps/s | 6.7344 KOps/s | |
test_exec_td_decorator | 0.3776ms | 0.1871ms | 5.3460 KOps/s | 5.2221 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8811ms | 0.6915ms | 1.4460 KOps/s | 1.4844 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.8963ms | 0.6898ms | 1.4497 KOps/s | 1.4792 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7787ms | 0.6050ms | 1.6530 KOps/s | 1.6947 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8053ms | 0.6090ms | 1.6420 KOps/s | 1.6950 KOps/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.6310ms | 19.0673ms | 52.4457 Ops/s | 52.3863 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.5216ms | 19.1161ms | 52.3119 Ops/s | 52.3085 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.5443ms | 19.0206ms | 52.5746 Ops/s | 52.8920 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 19.6738ms | 18.9483ms | 52.7753 Ops/s | 52.5685 Ops/s | |
test_to_module_speed[True] | 2.2845ms | 0.9297ms | 1.0756 KOps/s | 1.0799 KOps/s | |
test_to_module_speed[False] | 1.0255ms | 0.9047ms | 1.1053 KOps/s | 1.1051 KOps/s | |
test_tc_init | 0.1829ms | 35.8820μs | 27.8691 KOps/s | 27.3924 KOps/s | |
test_tc_init_nested | 0.2379ms | 72.4052μs | 13.8112 KOps/s | 13.4962 KOps/s | |
test_tc_first_layer_tensor | 11.9373μs | 0.7015μs | 1.4256 MOps/s | 1.4374 MOps/s | |
test_tc_first_layer_nontensor | 34.1910μs | 2.2974μs | 435.2687 KOps/s | 427.0781 KOps/s | |
test_tc_second_layer_tensor | 9.0103μs | 1.4068μs | 710.8218 KOps/s | 713.0397 KOps/s | |
test_tc_second_layer_nontensor | 25.0510μs | 3.0231μs | 330.7841 KOps/s | 328.0502 KOps/s | |
test_unbind | 6.8783ms | 6.6514ms | 150.3449 Ops/s | 150.6938 Ops/s | |
test_full_like | 13.4280ms | 10.9989ms | 90.9182 Ops/s | 89.3423 Ops/s | |
test_zeros_like | 6.2730ms | 4.9244ms | 203.0706 Ops/s | 209.4616 Ops/s | |
test_ones_like | 6.1918ms | 4.8955ms | 204.2675 Ops/s | 202.6660 Ops/s | |
test_clone | 9.5622ms | 7.9227ms | 126.2203 Ops/s | 124.6901 Ops/s | |
test_squeeze | 67.6610μs | 9.2297μs | 108.3460 KOps/s | 106.3564 KOps/s | |
test_unsqueeze | 0.1218ms | 72.1058μs | 13.8685 KOps/s | 14.2932 KOps/s | |
test_split | 0.2674s | 0.2338ms | 4.2766 KOps/s | 6.2824 KOps/s | |
test_permute | 0.3614ms | 0.1803ms | 5.5464 KOps/s | 5.6320 KOps/s | |
test_stack | 55.2714ms | 53.7744ms | 18.5962 Ops/s | 18.7371 Ops/s | |
test_cat | 58.4928ms | 53.9091ms | 18.5497 Ops/s | 18.6760 Ops/s |
Stack from ghstack (oldest at bottom):