[Feature] grad and data for tensorclasses #904
Merged
facebook-github-bot added the CLA Signed label on Jul 19, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 51.0260μs | 21.1951μs | 47.1807 KOps/s | 46.3557 KOps/s | |
test_plain_set_stack_nested | 54.4310μs | 21.5088μs | 46.4926 KOps/s | 46.1894 KOps/s | |
test_plain_set_nested_inplace | 66.0230μs | 23.4562μs | 42.6327 KOps/s | 42.1527 KOps/s | |
test_plain_set_stack_nested_inplace | 79.0180μs | 23.4014μs | 42.7324 KOps/s | 42.4058 KOps/s | |
test_items | 29.3850μs | 2.6703μs | 374.4874 KOps/s | 384.9355 KOps/s | |
test_items_nested | 0.5178ms | 0.3658ms | 2.7335 KOps/s | 2.7672 KOps/s | |
test_items_nested_locked | 0.4740ms | 0.3648ms | 2.7409 KOps/s | 2.7505 KOps/s | |
test_items_nested_leaf | 0.1751ms | 87.4353μs | 11.4370 KOps/s | 11.5714 KOps/s | |
test_items_stack_nested | 0.5989ms | 0.3632ms | 2.7532 KOps/s | 2.7484 KOps/s | |
test_items_stack_nested_leaf | 0.1525ms | 88.0684μs | 11.3548 KOps/s | 11.3414 KOps/s | |
test_items_stack_nested_locked | 1.4865ms | 0.3666ms | 2.7275 KOps/s | 2.7348 KOps/s | |
test_keys | 29.5650μs | 3.8694μs | 258.4408 KOps/s | 250.0362 KOps/s | |
test_keys_nested | 0.2478ms | 0.1438ms | 6.9538 KOps/s | 6.9634 KOps/s | |
test_keys_nested_locked | 0.8173ms | 0.1498ms | 6.6751 KOps/s | 6.6394 KOps/s | |
test_keys_nested_leaf | 0.2116ms | 0.1226ms | 8.1594 KOps/s | 8.1674 KOps/s | |
test_keys_stack_nested | 0.2522ms | 0.1451ms | 6.8899 KOps/s | 6.9555 KOps/s | |
test_keys_stack_nested_leaf | 0.2143ms | 0.1229ms | 8.1343 KOps/s | 8.1141 KOps/s | |
test_keys_stack_nested_locked | 0.2411ms | 0.1494ms | 6.6944 KOps/s | 6.6516 KOps/s | |
test_values | 8.6060μs | 1.1520μs | 868.0416 KOps/s | 856.5357 KOps/s | |
test_values_nested | 89.9570μs | 50.1450μs | 19.9422 KOps/s | 19.8873 KOps/s | |
test_values_nested_locked | 0.1297ms | 50.4074μs | 19.8384 KOps/s | 19.9931 KOps/s | |
test_values_nested_leaf | 82.2340μs | 45.3495μs | 22.0510 KOps/s | 22.3714 KOps/s | |
test_values_stack_nested | 94.5470μs | 50.6952μs | 19.7257 KOps/s | 19.7640 KOps/s | |
test_values_stack_nested_leaf | 0.1070ms | 45.7251μs | 21.8698 KOps/s | 22.2294 KOps/s | |
test_values_stack_nested_locked | 0.1336ms | 50.6324μs | 19.7502 KOps/s | 19.7601 KOps/s | |
test_membership | 2.4485μs | 0.7288μs | 1.3722 MOps/s | 1.1091 MOps/s | |
test_membership_nested | 29.0350μs | 2.6812μs | 372.9733 KOps/s | 366.3534 KOps/s | |
test_membership_nested_leaf | 49.6520μs | 2.6898μs | 371.7808 KOps/s | 365.2147 KOps/s | |
test_membership_stacked_nested | 23.2440μs | 2.6834μs | 372.6587 KOps/s | 359.7256 KOps/s | |
test_membership_stacked_nested_leaf | 38.2210μs | 2.7370μs | 365.3602 KOps/s | 328.8534 KOps/s | |
test_membership_nested_last | 30.6070μs | 4.0451μs | 247.2103 KOps/s | 249.4387 KOps/s | |
test_membership_nested_leaf_last | 23.7240μs | 4.0440μs | 247.2775 KOps/s | 247.2145 KOps/s | |
test_membership_stacked_nested_last | 35.7370μs | 4.5713μs | 218.7583 KOps/s | 252.3663 KOps/s | |
test_membership_stacked_nested_leaf_last | 24.2450μs | 4.6727μs | 214.0097 KOps/s | 248.4814 KOps/s | |
test_nested_getleaf | 34.1540μs | 10.9142μs | 91.6240 KOps/s | 90.8065 KOps/s | |
test_nested_get | 41.3370μs | 10.5260μs | 95.0032 KOps/s | 97.2059 KOps/s | |
test_stacked_getleaf | 36.4580μs | 11.0031μs | 90.8832 KOps/s | 91.6695 KOps/s | |
test_stacked_get | 35.2860μs | 10.3175μs | 96.9229 KOps/s | 96.7558 KOps/s | |
test_nested_getitemleaf | 43.5710μs | 11.3649μs | 87.9899 KOps/s | 87.5971 KOps/s | |
test_nested_getitem | 46.4670μs | 10.5228μs | 95.0315 KOps/s | 95.7963 KOps/s | |
test_stacked_getitemleaf | 48.4410μs | 11.4198μs | 87.5670 KOps/s | 89.6079 KOps/s | |
test_stacked_getitem | 47.2780μs | 10.5456μs | 94.8266 KOps/s | 97.2471 KOps/s | |
test_lock_nested | 0.9992ms | 0.5075ms | 1.9703 KOps/s | 1.7058 KOps/s | |
test_lock_stack_nested | 0.8167ms | 0.4819ms | 2.0752 KOps/s | 2.0634 KOps/s | |
test_unlock_nested | 0.8084ms | 0.4281ms | 2.3357 KOps/s | 1.9573 KOps/s | |
test_unlock_stack_nested | 0.6438ms | 0.3975ms | 2.5157 KOps/s | 2.5124 KOps/s | |
test_flatten_speed | 0.2405ms | 0.1081ms | 9.2478 KOps/s | 9.4442 KOps/s | |
test_unflatten_speed | 0.5470ms | 0.4442ms | 2.2513 KOps/s | 2.2572 KOps/s | |
test_common_ops | 1.8211ms | 1.0812ms | 924.8665 Ops/s | 897.3385 Ops/s | |
test_creation | 92.2520μs | 2.4792μs | 403.3505 KOps/s | 396.9578 KOps/s | |
test_creation_empty | 54.6820μs | 17.4540μs | 57.2933 KOps/s | 53.9922 KOps/s | |
test_creation_nested_1 | 63.0080μs | 21.0167μs | 47.5813 KOps/s | 46.1294 KOps/s | |
test_creation_nested_2 | 57.7680μs | 24.7647μs | 40.3801 KOps/s | 38.9029 KOps/s | |
test_clone | 72.2440μs | 16.9607μs | 58.9598 KOps/s | 57.2911 KOps/s | |
test_getitem[int] | 0.9338ms | 12.6450μs | 79.0827 KOps/s | 78.5365 KOps/s | |
test_getitem[slice_int] | 0.1222ms | 31.8758μs | 31.3718 KOps/s | 30.8818 KOps/s | |
test_getitem[range] | 0.2701ms | 55.9735μs | 17.8656 KOps/s | 17.5680 KOps/s | |
test_getitem[tuple] | 0.1838ms | 26.2366μs | 38.1147 KOps/s | 37.7443 KOps/s | |
test_getitem[list] | 0.3431ms | 49.0621μs | 20.3823 KOps/s | 19.1984 KOps/s | |
test_setitem_dim[int] | 55.9240μs | 31.0047μs | 32.2531 KOps/s | 31.1486 KOps/s | |
test_setitem_dim[slice_int] | 0.1184ms | 66.9752μs | 14.9309 KOps/s | 14.1100 KOps/s | |
test_setitem_dim[range] | 0.1307ms | 87.5227μs | 11.4256 KOps/s | 11.0484 KOps/s | |
test_setitem_dim[tuple] | 85.1590μs | 54.7042μs | 18.2801 KOps/s | 17.3749 KOps/s | |
test_setitem | 0.1104ms | 28.1436μs | 35.5320 KOps/s | 33.2765 KOps/s | |
test_set | 0.1823ms | 27.3329μs | 36.5859 KOps/s | 34.6009 KOps/s | |
test_set_shared | 3.3364ms | 0.2165ms | 4.6200 KOps/s | 4.6547 KOps/s | |
test_update | 0.1985ms | 33.6879μs | 29.6843 KOps/s | 27.8567 KOps/s | |
test_update_nested | 0.1578ms | 43.2339μs | 23.1300 KOps/s | 22.0045 KOps/s | |
test_update__nested | 0.1581ms | 34.0066μs | 29.4061 KOps/s | 28.9460 KOps/s | |
test_set_nested | 0.1393ms | 30.1530μs | 33.1642 KOps/s | 31.3991 KOps/s | |
test_set_nested_new | 0.1797ms | 34.8941μs | 28.6582 KOps/s | 27.1920 KOps/s | |
test_select | 0.1221ms | 52.0430μs | 19.2149 KOps/s | 18.7216 KOps/s | |
test_select_nested | 0.1528ms | 60.7521μs | 16.4603 KOps/s | 16.7550 KOps/s | |
test_exclude_nested | 0.1570ms | 80.4873μs | 12.4243 KOps/s | 12.6809 KOps/s | |
test_empty[True] | 0.7512ms | 0.3436ms | 2.9103 KOps/s | 2.9749 KOps/s | |
test_empty[False] | 13.7390μs | 1.2425μs | 804.8024 KOps/s | 796.8952 KOps/s | |
test_unbind_speed | 0.5130ms | 0.3222ms | 3.1040 KOps/s | 3.1136 KOps/s | |
test_unbind_speed_stack0 | 0.7449ms | 0.3205ms | 3.1198 KOps/s | 3.1544 KOps/s | |
test_unbind_speed_stack1 | 83.6750ms | 0.8287ms | 1.2067 KOps/s | 1.3022 KOps/s | |
test_split | 76.4589ms | 2.2146ms | 451.5464 Ops/s | 442.7136 Ops/s | |
test_chunk | 78.5357ms | 2.2177ms | 450.9204 Ops/s | 410.3240 Ops/s | |
test_creation[device0] | 4.1083ms | 0.1223ms | 8.1791 KOps/s | 8.2124 KOps/s | |
test_creation_from_tensor | 0.2574ms | 0.1185ms | 8.4391 KOps/s | 8.3805 KOps/s | |
test_add_one[memmap_tensor0] | 0.1606ms | 7.8955μs | 126.6552 KOps/s | 124.8608 KOps/s | |
test_contiguous[memmap_tensor0] | 23.3440μs | 2.2221μs | 450.0332 KOps/s | 467.9648 KOps/s | |
test_stack[memmap_tensor0] | 78.0850μs | 5.9195μs | 168.9344 KOps/s | 169.4848 KOps/s | |
test_memmaptd_index | 1.2972ms | 0.4319ms | 2.3154 KOps/s | 2.3388 KOps/s | |
test_memmaptd_index_astensor | 0.7470ms | 0.5026ms | 1.9897 KOps/s | 1.9637 KOps/s | |
test_memmaptd_index_op | 1.4077ms | 1.0214ms | 979.0341 Ops/s | 950.8310 Ops/s | |
test_serialize_model | 0.2044s | 0.1407s | 7.1056 Ops/s | 7.8946 Ops/s | |
test_serialize_model_pickle | 0.4515s | 0.3966s | 2.5214 Ops/s | 2.4897 Ops/s | |
test_serialize_weights | 0.1298s | 0.1245s | 8.0341 Ops/s | 7.1043 Ops/s | |
test_serialize_weights_returnearly | 0.1852s | 0.1678s | 5.9585 Ops/s | 5.9026 Ops/s | |
test_serialize_weights_pickle | 1.0533s | 0.7427s | 1.3464 Ops/s | 2.4235 Ops/s | |
test_serialize_weights_filesystem | 0.1550s | 0.1434s | 6.9743 Ops/s | 6.9419 Ops/s | |
test_serialize_model_filesystem | 0.1548s | 0.1457s | 6.8647 Ops/s | 6.0947 Ops/s | |
test_reshape_pytree | 85.7710μs | 39.6912μs | 25.1945 KOps/s | 26.0755 KOps/s | |
test_reshape_td | 0.1050ms | 50.5104μs | 19.7979 KOps/s | 20.1431 KOps/s | |
test_view_pytree | 88.3160μs | 39.6711μs | 25.2073 KOps/s | 25.1947 KOps/s | |
test_view_td | 0.1440ms | 57.5327μs | 17.3814 KOps/s | 17.2109 KOps/s | |
test_unbind_pytree | 97.2110μs | 36.0554μs | 27.7351 KOps/s | 27.3301 KOps/s | |
test_unbind_td | 0.3761ms | 47.4005μs | 21.0968 KOps/s | 20.8849 KOps/s | |
test_split_pytree | 80.1400μs | 38.7093μs | 25.8336 KOps/s | 26.2727 KOps/s | |
test_split_td | 0.5560ms | 61.3841μs | 16.2909 KOps/s | 16.4323 KOps/s | |
test_add_pytree | 90.5190μs | 44.1528μs | 22.6486 KOps/s | 22.4159 KOps/s | |
test_add_td | 0.1481ms | 80.3172μs | 12.4506 KOps/s | 11.8108 KOps/s | |
test_distributed | 1.4906ms | 0.1315ms | 7.6017 KOps/s | 7.3497 KOps/s | |
test_tdmodule | 55.6340μs | 15.8907μs | 62.9297 KOps/s | 57.0194 KOps/s | |
test_tdmodule_dispatch | 57.8780μs | 33.9663μs | 29.4409 KOps/s | 27.1012 KOps/s | |
test_tdseq | 34.0540μs | 17.8646μs | 55.9766 KOps/s | 52.1978 KOps/s | |
test_tdseq_dispatch | 64.8910μs | 37.9470μs | 26.3525 KOps/s | 23.9626 KOps/s | |
test_instantiation_functorch | 1.8680ms | 1.5864ms | 630.3631 Ops/s | 623.8587 Ops/s | |
test_instantiation_td | 81.1004ms | 1.2600ms | 793.6748 Ops/s | 856.6215 Ops/s | |
test_exec_functorch | 0.3169ms | 0.1820ms | 5.4955 KOps/s | 5.3777 KOps/s | |
test_exec_functional_call | 0.3317ms | 0.1730ms | 5.7795 KOps/s | 5.8070 KOps/s | |
test_exec_td | 0.2820ms | 0.1733ms | 5.7687 KOps/s | 5.4209 KOps/s | |
test_exec_td_decorator | 1.0195ms | 0.2565ms | 3.8988 KOps/s | 3.8601 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.0644ms | 0.6027ms | 1.6592 KOps/s | 1.6300 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8950ms | 0.5963ms | 1.6770 KOps/s | 1.6479 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7310ms | 0.4995ms | 2.0019 KOps/s | 1.9776 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7960ms | 0.4956ms | 2.0178 KOps/s | 1.9839 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.0688ms | 0.6886ms | 1.4523 KOps/s | 1.4249 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.0353ms | 0.6887ms | 1.4521 KOps/s | 1.4359 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 1.0194ms | 0.5774ms | 1.7318 KOps/s | 1.7132 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.9246ms | 0.5804ms | 1.7229 KOps/s | 1.7162 KOps/s | |
test_to_module_speed[True] | 2.8180ms | 1.7979ms | 556.2123 Ops/s | 557.2429 Ops/s | |
test_to_module_speed[False] | 2.0787ms | 1.7650ms | 566.5763 Ops/s | 571.7137 Ops/s | |
test_tc_init | 88.7970μs | 44.0693μs | 22.6915 KOps/s | 23.5764 KOps/s | |
test_tc_init_nested | 0.1585ms | 87.0463μs | 11.4881 KOps/s | 11.4958 KOps/s | |
test_tc_first_layer_tensor | 34.0440μs | 9.2608μs | 107.9824 KOps/s | 111.7676 KOps/s | |
test_tc_first_layer_nontensor | 32.6210μs | 9.1976μs | 108.7246 KOps/s | 110.2235 KOps/s | |
test_tc_second_layer_tensor | 32.0700μs | 2.8612μs | 349.5059 KOps/s | 360.9619 KOps/s | |
test_tc_second_layer_nontensor | 37.0490μs | 10.3962μs | 96.1885 KOps/s | 99.4335 KOps/s | |
test_unbind | 97.3084ms | 12.8342ms | 77.9171 Ops/s | 73.0647 Ops/s | |
test_full_like | 11.7384ms | 7.9320ms | 126.0722 Ops/s | 134.4754 Ops/s | |
test_zeros_like | 11.5188ms | 7.6050ms | 131.4928 Ops/s | 143.3878 Ops/s | |
test_ones_like | 12.0563ms | 7.5870ms | 131.8052 Ops/s | 126.5863 Ops/s | |
test_clone | 12.8221ms | 8.8912ms | 112.4714 Ops/s | 111.5478 Ops/s | |
test_squeeze | 70.1910μs | 14.0237μs | 71.3080 KOps/s | 70.5200 KOps/s | |
test_unsqueeze | 0.2108ms | 97.0056μs | 10.3087 KOps/s | 10.1014 KOps/s | |
test_split | 0.4477ms | 0.2092ms | 4.7790 KOps/s | 4.8188 KOps/s | |
test_permute | 0.3574ms | 0.2290ms | 4.3674 KOps/s | 4.4706 KOps/s | |
test_stack | 28.5600ms | 24.4114ms | 40.9644 Ops/s | 40.4560 Ops/s | |
test_cat | 29.5394ms | 24.0248ms | 41.6236 Ops/s | 41.0165 Ops/s |

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 40.1010μs | 16.6648μs | 60.0066 KOps/s | 56.9594 KOps/s | |
test_plain_set_stack_nested | 0.1402ms | 16.7211μs | 59.8047 KOps/s | 56.7335 KOps/s | |
test_plain_set_nested_inplace | 38.8410μs | 17.9962μs | 55.5673 KOps/s | 53.1535 KOps/s | |
test_plain_set_stack_nested_inplace | 39.8310μs | 17.9466μs | 55.7208 KOps/s | 53.6446 KOps/s | |
test_items | 17.4100μs | 4.7546μs | 210.3225 KOps/s | 211.0559 KOps/s | |
test_items_nested | 0.4440ms | 0.4008ms | 2.4952 KOps/s | 2.5619 KOps/s | |
test_items_nested_locked | 0.4222ms | 0.3998ms | 2.5012 KOps/s | 2.5249 KOps/s | |
test_items_nested_leaf | 0.1060ms | 85.7015μs | 11.6684 KOps/s | 11.5463 KOps/s | |
test_items_stack_nested | 0.4458ms | 0.3936ms | 2.5408 KOps/s | 2.5276 KOps/s | |
test_items_stack_nested_leaf | 0.1016ms | 86.5591μs | 11.5528 KOps/s | 11.5199 KOps/s | |
test_items_stack_nested_locked | 0.4302ms | 0.4021ms | 2.4869 KOps/s | 2.5312 KOps/s | |
test_keys | 17.2800μs | 4.3639μs | 229.1506 KOps/s | 228.7743 KOps/s | |
test_keys_nested | 85.9730μs | 67.4913μs | 14.8167 KOps/s | 15.1929 KOps/s | |
test_keys_nested_locked | 0.9158ms | 72.3816μs | 13.8157 KOps/s | 13.5941 KOps/s | |
test_keys_nested_leaf | 76.7720μs | 56.9047μs | 17.5732 KOps/s | 17.7862 KOps/s | |
test_keys_stack_nested | 0.2399ms | 66.5337μs | 15.0300 KOps/s | 15.0114 KOps/s | |
test_keys_stack_nested_leaf | 0.2236ms | 56.2978μs | 17.7627 KOps/s | 17.3896 KOps/s | |
test_keys_stack_nested_locked | 0.2599ms | 71.9810μs | 13.8926 KOps/s | 13.8236 KOps/s | |
test_values | 62.4450μs | 1.7493μs | 571.6460 KOps/s | 562.6350 KOps/s | |
test_values_nested | 49.1610μs | 33.8985μs | 29.4998 KOps/s | 29.6296 KOps/s | |
test_values_nested_locked | 0.2309ms | 35.6621μs | 28.0410 KOps/s | 27.9598 KOps/s | |
test_values_nested_leaf | 52.3610μs | 30.1087μs | 33.2129 KOps/s | 33.0408 KOps/s | |
test_values_stack_nested | 57.6730μs | 34.6930μs | 28.8243 KOps/s | 29.1207 KOps/s | |
test_values_stack_nested_leaf | 0.1650ms | 30.8026μs | 32.4648 KOps/s | 32.6667 KOps/s | |
test_values_stack_nested_locked | 0.1367ms | 36.6219μs | 27.3061 KOps/s | 27.7118 KOps/s | |
test_membership | 1.3275μs | 0.5392μs | 1.8548 MOps/s | 1.8464 MOps/s | |
test_membership_nested | 15.2210μs | 2.0978μs | 476.6923 KOps/s | 480.6010 KOps/s | |
test_membership_nested_leaf | 10.2505μs | 2.0172μs | 495.7457 KOps/s | 492.4195 KOps/s | |
test_membership_stacked_nested | 19.5010μs | 2.0681μs | 483.5318 KOps/s | 478.4099 KOps/s | |
test_membership_stacked_nested_leaf | 15.1300μs | 2.0815μs | 480.4138 KOps/s | 480.5678 KOps/s | |
test_membership_nested_last | 27.4010μs | 2.9802μs | 335.5507 KOps/s | 330.7265 KOps/s | |
test_membership_nested_leaf_last | 27.6300μs | 2.9697μs | 336.7400 KOps/s | 330.8515 KOps/s | |
test_membership_stacked_nested_last | 28.2210μs | 9.1427μs | 109.3775 KOps/s | 288.0346 KOps/s | |
test_membership_stacked_nested_leaf_last | 25.1810μs | 9.2178μs | 108.4861 KOps/s | 290.7614 KOps/s | |
test_nested_getleaf | 26.3300μs | 8.0977μs | 123.4926 KOps/s | 124.3144 KOps/s | |
test_nested_get | 23.1710μs | 7.6213μs | 131.2110 KOps/s | 132.2257 KOps/s | |
test_stacked_getleaf | 24.9810μs | 8.0739μs | 123.8556 KOps/s | 123.5019 KOps/s | |
test_stacked_get | 29.6410μs | 7.5431μs | 132.5716 KOps/s | 132.4665 KOps/s | |
test_nested_getitemleaf | 64.6610μs | 8.1834μs | 122.1988 KOps/s | 122.3977 KOps/s | |
test_nested_getitem | 23.2510μs | 7.7221μs | 129.4992 KOps/s | 129.8298 KOps/s | |
test_stacked_getitemleaf | 24.3410μs | 8.2016μs | 121.9271 KOps/s | 121.4170 KOps/s | |
test_stacked_getitem | 31.0110μs | 7.7392μs | 129.2129 KOps/s | 129.1854 KOps/s | |
test_lock_nested | 1.0553ms | 0.4738ms | 2.1108 KOps/s | 2.1170 KOps/s | |
test_lock_stack_nested | 0.5442ms | 0.4218ms | 2.3707 KOps/s | 2.2934 KOps/s | |
test_unlock_nested | 0.8169ms | 0.3921ms | 2.5507 KOps/s | 2.5439 KOps/s | |
test_unlock_stack_nested | 0.5283ms | 0.3418ms | 2.9260 KOps/s | 2.8395 KOps/s | |
test_flatten_speed | 0.2164ms | 0.1049ms | 9.5337 KOps/s | 9.4190 KOps/s | |
test_unflatten_speed | 0.3125ms | 0.2914ms | 3.4314 KOps/s | 3.3808 KOps/s | |
test_common_ops | 1.5964ms | 1.3379ms | 747.4414 Ops/s | 725.8187 Ops/s | |
test_creation | 16.8510μs | 1.9599μs | 510.2184 KOps/s | 514.7404 KOps/s | |
test_creation_empty | 33.7910μs | 17.1420μs | 58.3362 KOps/s | 53.0736 KOps/s | |
test_creation_nested_1 | 0.1149ms | 19.1035μs | 52.3463 KOps/s | 47.7541 KOps/s | |
test_creation_nested_2 | 41.7710μs | 22.0749μs | 45.3003 KOps/s | 42.3061 KOps/s | |
test_clone | 0.1774ms | 30.3060μs | 32.9968 KOps/s | 32.7868 KOps/s | |
test_getitem[int] | 1.2204ms | 16.5929μs | 60.2668 KOps/s | 59.7231 KOps/s | |
test_getitem[slice_int] | 0.2035ms | 29.4975μs | 33.9012 KOps/s | 33.0570 KOps/s | |
test_getitem[range] | 0.2385ms | 0.1123ms | 8.9043 KOps/s | 8.8276 KOps/s | |
test_getitem[tuple] | 0.1704ms | 25.0368μs | 39.9411 KOps/s | 37.9400 KOps/s | |
test_getitem[list] | 0.2503ms | 0.1024ms | 9.7613 KOps/s | 9.1077 KOps/s | |
test_setitem_dim[int] | 0.1795ms | 53.7644μs | 18.5997 KOps/s | 16.9572 KOps/s | |
test_setitem_dim[slice_int] | 0.2281ms | 82.2584μs | 12.1568 KOps/s | 11.7789 KOps/s | |
test_setitem_dim[range] | 0.3050ms | 0.1467ms | 6.8177 KOps/s | 6.6574 KOps/s | |
test_setitem_dim[tuple] | 0.2371ms | 73.9016μs | 13.5315 KOps/s | 12.9827 KOps/s | |
test_setitem | 0.2289ms | 47.7231μs | 20.9542 KOps/s | 20.4589 KOps/s | |
test_set | 0.2172ms | 46.4533μs | 21.5270 KOps/s | 20.7360 KOps/s | |
test_set_shared | 0.4194ms | 54.6707μs | 18.2913 KOps/s | 17.7905 KOps/s | |
test_update | 0.2011ms | 51.0447μs | 19.5907 KOps/s | 17.6888 KOps/s | |
test_update_nested | 0.2461ms | 63.3124μs | 15.7947 KOps/s | 15.2540 KOps/s | |
test_update__nested | 0.2423ms | 65.6895μs | 15.2231 KOps/s | 14.7663 KOps/s | |
test_set_nested | 0.2234ms | 48.6427μs | 20.5581 KOps/s | 19.7251 KOps/s | |
test_set_nested_new | 0.2316ms | 52.2692μs | 19.1317 KOps/s | 18.3201 KOps/s | |
test_select | 0.2298ms | 68.3825μs | 14.6236 KOps/s | 14.2181 KOps/s | |
test_select_nested | 0.1777ms | 53.3003μs | 18.7616 KOps/s | 18.4679 KOps/s | |
test_exclude_nested | 0.1959ms | 72.6556μs | 13.7636 KOps/s | 13.5351 KOps/s | |
test_empty[True] | 0.3540ms | 0.3030ms | 3.3001 KOps/s | 3.3263 KOps/s | |
test_empty[False] | 2.2810μs | 0.9304μs | 1.0748 MOps/s | 1.0942 MOps/s | |
test_to | 0.1481ms | 38.5576μs | 25.9353 KOps/s | 26.1420 KOps/s | |
test_to_nonblocking | 0.1095ms | 24.7754μs | 40.3626 KOps/s | 42.0701 KOps/s | |
test_unbind_speed | 0.5051ms | 0.3078ms | 3.2485 KOps/s | 3.3027 KOps/s | |
test_unbind_speed_stack0 | 0.3999ms | 0.2946ms | 3.3944 KOps/s | 3.3024 KOps/s | |
test_unbind_speed_stack1 | 93.2608ms | 0.8276ms | 1.2083 KOps/s | 1.2804 KOps/s | |
test_split | 92.1612ms | 2.3225ms | 430.5648 Ops/s | 435.0392 Ops/s | |
test_chunk | 2.3739ms | 2.1285ms | 469.8204 Ops/s | 430.9545 Ops/s | |
test_creation[device0] | 0.2900ms | 0.1038ms | 9.6372 KOps/s | 9.5877 KOps/s | |
test_creation_from_tensor | 0.3145ms | 0.1060ms | 9.4317 KOps/s | 9.9745 KOps/s | |
test_add_one[memmap_tensor0] | 21.2510μs | 8.6594μs | 115.4819 KOps/s | 115.7026 KOps/s | |
test_contiguous[memmap_tensor0] | 0.1150ms | 2.1593μs | 463.1094 KOps/s | 461.2617 KOps/s | |
test_stack[memmap_tensor0] | 55.8410μs | 6.5297μs | 153.1463 KOps/s | 152.4144 KOps/s | |
test_memmaptd_index | 1.3826ms | 0.4227ms | 2.3659 KOps/s | 2.3942 KOps/s | |
test_memmaptd_index_astensor | 0.8574ms | 0.4873ms | 2.0523 KOps/s | 2.0741 KOps/s | |
test_memmaptd_index_op | 1.4819ms | 1.0346ms | 966.5912 Ops/s | 965.4060 Ops/s | |
test_serialize_model | 0.1009s | 97.0387ms | 10.3052 Ops/s | 10.0918 Ops/s | |
test_serialize_model_pickle | 1.3475s | 1.2375s | 0.8081 Ops/s | 0.8072 Ops/s | |
test_serialize_weights | 96.0035ms | 92.7856ms | 10.7775 Ops/s | 9.1239 Ops/s | |
test_serialize_weights_returnearly | 89.4512ms | 72.8553ms | 13.7258 Ops/s | 14.0488 Ops/s | |
test_serialize_weights_pickle | 1.3513s | 1.2237s | 0.8172 Ops/s | 0.8182 Ops/s | |
test_reshape_pytree | 0.1838ms | 38.8866μs | 25.7158 KOps/s | 25.8300 KOps/s | |
test_reshape_td | 84.6120μs | 45.8119μs | 21.8284 KOps/s | 21.8215 KOps/s | |
test_view_pytree | 0.2745ms | 38.6121μs | 25.8986 KOps/s | 26.1689 KOps/s | |
test_view_td | 0.2486ms | 54.0185μs | 18.5122 KOps/s | 18.4220 KOps/s | |
test_unbind_pytree | 0.1593ms | 36.6378μs | 27.2942 KOps/s | 26.2123 KOps/s | |
test_unbind_td | 0.3751ms | 45.5833μs | 21.9379 KOps/s | 21.9666 KOps/s | |
test_split_pytree | 0.3469ms | 51.8724μs | 19.2781 KOps/s | 19.4907 KOps/s | |
test_split_td | 91.0628ms | 70.1126μs | 14.2628 KOps/s | 14.3498 KOps/s | |
test_add_pytree | 0.2050ms | 58.5318μs | 17.0847 KOps/s | 16.6386 KOps/s | |
test_add_td | 0.4172ms | 0.1038ms | 9.6298 KOps/s | 9.6474 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4137ms | 0.2062ms | 4.8489 KOps/s | 4.8347 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.3188ms | 0.1758ms | 5.6875 KOps/s | 5.7340 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.2845ms | 0.1439ms | 6.9495 KOps/s | 6.9284 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3647ms | 0.1942ms | 5.1503 KOps/s | 5.2010 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 0.1515ms | 21.5861μs | 46.3260 KOps/s | 45.8068 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1885ms | 48.4879μs | 20.6237 KOps/s | 20.6822 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1598ms | 72.5545μs | 13.7828 KOps/s | 13.8260 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1208ms | 59.5846μs | 16.7829 KOps/s | 16.6599 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4343ms | 0.3242ms | 3.0849 KOps/s | 3.0942 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3410ms | 0.2228ms | 4.4887 KOps/s | 4.4784 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.2926ms | 0.1344ms | 7.4425 KOps/s | 7.7399 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.2516ms | 66.5338μs | 15.0299 KOps/s | 15.7708 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.4283ms | 0.3244ms | 3.0825 KOps/s | 3.1017 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.8663ms | 0.6635ms | 1.5071 KOps/s | 1.6174 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.4752ms | 0.2758ms | 3.6259 KOps/s | 3.6601 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.4904ms | 0.3276ms | 3.0529 KOps/s | 3.0578 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.2813ms | 79.7626μs | 12.5372 KOps/s | 12.6710 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2977ms | 0.1340ms | 7.4638 KOps/s | 7.4508 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.7553ms | 0.5441ms | 1.8378 KOps/s | 1.8762 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.4458ms | 0.3221ms | 3.1042 KOps/s | 3.0826 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.1419ms | 18.6080μs | 53.7404 KOps/s | 51.2897 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 67.1420μs | 32.9510μs | 30.3481 KOps/s | 30.7431 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1097ms | 74.9015μs | 13.3509 KOps/s | 13.0662 KOps/s | |
test_compile_copy_flat[pytree-eager] | 92.4730μs | 60.4134μs | 16.5526 KOps/s | 16.3672 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.7822ms | 0.9734ms | 1.0273 KOps/s | 1.0475 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.6439ms | 3.3066ms | 302.4240 Ops/s | 308.8549 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.5985ms | 0.9305ms | 1.0747 KOps/s | 1.0666 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.4319ms | 3.1719ms | 315.2719 Ops/s | 314.4910 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.2813ms | 0.1128ms | 8.8664 KOps/s | 9.1573 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.2567ms | 66.7888μs | 14.9726 KOps/s | 16.2945 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2508ms | 0.1022ms | 9.7803 KOps/s | 9.8404 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 0.1952ms | 44.8062μs | 22.3184 KOps/s | 22.4887 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.2494ms | 0.1024ms | 9.7612 KOps/s | 9.3526 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 0.1923ms | 45.0072μs | 22.2187 KOps/s | 21.1833 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2761ms | 0.1383ms | 7.2308 KOps/s | 7.2459 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1999ms | 26.2953μs | 38.0295 KOps/s | 38.7162 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.2741ms | 0.1294ms | 7.7304 KOps/s | 7.7279 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 0.1351ms | 22.5439μs | 44.3578 KOps/s | 44.9007 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.2802ms | 0.1294ms | 7.7274 KOps/s | 7.5000 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 53.8010μs | 22.8799μs | 43.7065 KOps/s | 44.3448 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.2874ms | 0.1381ms | 7.2410 KOps/s | 7.2132 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.5210ms | 26.5643μs | 37.6445 KOps/s | 39.2206 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2741ms | 0.1294ms | 7.7294 KOps/s | 7.6199 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 0.1382ms | 23.0498μs | 43.3844 KOps/s | 44.9965 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.2809ms | 0.1293ms | 7.7350 KOps/s | 7.4246 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.3610μs | 22.7091μs | 44.0352 KOps/s | 43.9705 KOps/s | |
test_mod_add[eager] | 0.1835ms | 37.4056μs | 26.7339 KOps/s | 26.4104 KOps/s | |
test_mod_add[compile] | 0.2450ms | 68.4863μs | 14.6015 KOps/s | 14.7263 KOps/s | |
test_mod_add[compile-overhead] | 0.2624ms | 0.1450ms | 6.8959 KOps/s | 6.6611 KOps/s | |
test_mod_wrap[eager] | 0.4181ms | 0.2506ms | 3.9911 KOps/s | 4.0149 KOps/s | |
test_mod_wrap[compile] | 0.4621ms | 0.2984ms | 3.3511 KOps/s | 3.3508 KOps/s | |
test_mod_wrap[compile-overhead] | 8.1936ms | 4.3655ms | 229.0689 Ops/s | 233.5367 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.7401ms | 1.4185ms | 704.9905 Ops/s | 700.1131 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.7857ms | 1.4743ms | 678.2800 Ops/s | 675.1183 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.4662ms | 0.9955ms | 1.0046 KOps/s | 992.5771 Ops/s | |
test_seq_add[eager] | 0.2524ms | 0.1088ms | 9.1880 KOps/s | 8.9020 KOps/s | |
test_seq_add[compile] | 0.2637ms | 87.6502μs | 11.4090 KOps/s | 11.6768 KOps/s | |
test_seq_add[compile-overhead] | 0.3037ms | 0.1263ms | 7.9160 KOps/s | 8.2274 KOps/s | |
test_seq_wrap[eager] | 0.6386ms | 0.4416ms | 2.2643 KOps/s | 2.3535 KOps/s | |
test_seq_wrap[compile] | 1.5464ms | 0.3401ms | 2.9402 KOps/s | 3.0192 KOps/s | |
test_seq_wrap[compile-overhead] | 0.3066s | 0.1467s | 6.8159 Ops/s | 6.7561 Ops/s | |
test_func_call_runtime[False-eager] | 0.9729ms | 0.7410ms | 1.3495 KOps/s | 1.3618 KOps/s | |
test_func_call_runtime[False-compile] | 0.9901ms | 0.8278ms | 1.2080 KOps/s | 1.2049 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.5346ms | 0.3713ms | 2.6935 KOps/s | 2.6920 KOps/s | |
test_func_call_runtime[True-eager] | 1.2591ms | 0.9927ms | 1.0074 KOps/s | 1.0121 KOps/s | |
test_func_call_runtime[True-compile] | 1.0333ms | 0.8712ms | 1.1478 KOps/s | 1.1567 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.5792ms | 0.4128ms | 2.4225 KOps/s | 2.4325 KOps/s | |
test_distributed | 2.5607ms | 72.9434μs | 13.7093 KOps/s | 14.2064 KOps/s | |
test_tdmodule | 38.7410μs | 16.6809μs | 59.9489 KOps/s | 59.0896 KOps/s | |
test_tdmodule_dispatch | 53.0410μs | 33.7473μs | 29.6320 KOps/s | 29.0043 KOps/s | |
test_tdseq | 32.8810μs | 16.9850μs | 58.8753 KOps/s | 57.6200 KOps/s | |
test_tdseq_dispatch | 54.2610μs | 35.9868μs | 27.7880 KOps/s | 27.2601 KOps/s | |
test_instantiation_functorch | 2.2367ms | 2.0166ms | 495.8894 Ops/s | 504.9812 Ops/s | |
test_instantiation_td | 2.0427ms | 1.3104ms | 763.1180 Ops/s | 767.2721 Ops/s | |
test_exec_functorch | 0.3969ms | 0.2303ms | 4.3419 KOps/s | 4.5002 KOps/s | |
test_exec_functional_call | 0.4288ms | 0.2303ms | 4.3413 KOps/s | 4.6229 KOps/s | |
test_exec_td | 0.4263ms | 0.2313ms | 4.3240 KOps/s | 4.6336 KOps/s | |
test_exec_td_decorator | 0.5029ms | 0.3053ms | 3.2754 KOps/s | 3.4324 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.1326ms | 0.6959ms | 1.4371 KOps/s | 1.5111 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8748ms | 0.6935ms | 1.4419 KOps/s | 1.5160 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7970ms | 0.6104ms | 1.6382 KOps/s | 1.7002 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.7994ms | 0.6070ms | 1.6474 KOps/s | 1.6603 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.1838ms | 0.7565ms | 1.3219 KOps/s | 1.3534 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9612ms | 0.7567ms | 1.3215 KOps/s | 1.3580 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.9286ms | 0.6454ms | 1.5495 KOps/s | 1.5403 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8328ms | 0.6400ms | 1.5625 KOps/s | 1.5365 KOps/s | |
test_vmap_transformer_speed[True-True] | 9.2810ms | 8.6221ms | 115.9814 Ops/s | 116.6427 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.7915ms | 8.5221ms | 117.3422 Ops/s | 116.9780 Ops/s | |
test_vmap_transformer_speed[False-True] | 9.1360ms | 8.5545ms | 116.8972 Ops/s | 117.5954 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.6737ms | 8.4344ms | 118.5621 Ops/s | 117.2465 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 21.4650ms | 20.4838ms | 48.8192 Ops/s | 48.5833 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 20.9730ms | 20.4156ms | 48.9822 Ops/s | 48.7020 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 20.8915ms | 20.1900ms | 49.5294 Ops/s | 49.3271 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.0686ms | 20.2677ms | 49.3395 Ops/s | 48.9703 Ops/s | |
test_to_module_speed[True] | 1.6163ms | 1.4898ms | 671.2368 Ops/s | 670.0949 Ops/s | |
test_to_module_speed[False] | 1.5940ms | 1.4652ms | 682.4839 Ops/s | 678.1156 Ops/s | |
test_tc_init | 56.6120μs | 36.9251μs | 27.0818 KOps/s | 25.2378 KOps/s | |
test_tc_init_nested | 0.1845ms | 76.5208μs | 13.0683 KOps/s | 12.1407 KOps/s | |
test_tc_first_layer_tensor | 19.8310μs | 3.9787μs | 251.3371 KOps/s | 251.4609 KOps/s | |
test_tc_first_layer_nontensor | 26.4600μs | 3.9895μs | 250.6607 KOps/s | 248.2448 KOps/s | |
test_tc_second_layer_tensor | 6.1252μs | 1.3051μs | 766.1978 KOps/s | 776.4833 KOps/s | |
test_tc_second_layer_nontensor | 20.2700μs | 4.6085μs | 216.9886 KOps/s | 216.0545 KOps/s | |
test_unbind | 0.3207s | 13.0766ms | 76.4727 Ops/s | 76.0125 Ops/s | |
test_full_like | 0.7636ms | 0.5769ms | 1.7333 KOps/s | 1.7290 KOps/s | |
test_zeros_like | 0.3487ms | 0.1979ms | 5.0531 KOps/s | 5.0469 KOps/s | |
test_ones_like | 0.3595ms | 0.1979ms | 5.0520 KOps/s | 5.0510 KOps/s | |
test_clone | 0.5687ms | 0.4143ms | 2.4136 KOps/s | 2.4034 KOps/s | |
test_squeeze | 0.1366ms | 11.6507μs | 85.8314 KOps/s | 84.8725 KOps/s | |
test_unsqueeze | 0.2810ms | 85.9941μs | 11.6287 KOps/s | 11.6640 KOps/s | |
test_split | 0.4912ms | 0.1850ms | 5.4044 KOps/s | 5.4564 KOps/s | |
test_permute | 0.3748ms | 0.2020ms | 4.9515 KOps/s | 5.0250 KOps/s | |
test_stack | 1.3801ms | 0.8985ms | 1.1130 KOps/s | 1.1143 KOps/s | |
test_cat | 1.3710ms | 1.2320ms | 811.6871 Ops/s | 811.6402 Ops/s |
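
The Change column is empty in both tables above. A minimal sketch of how it could be filled in, assuming Change is meant to be the relative difference between Ops and Ops on Repo HEAD (that definition is an assumption; the benchmark bot's actual formula is not shown on this page). The helper names `parse_ops` and `relative_change` are hypothetical and only illustrate the arithmetic, including the unit conversion between Ops/s, KOps/s and MOps/s.

```python
import re

# Multipliers to convert the table's throughput units to plain Ops/s.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell):
    """Convert a cell such as '47.1807 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", cell)
    if match is None:
        raise ValueError(f"unrecognized throughput cell: {cell!r}")
    value, unit = match.groups()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row):
    """Return (test name, percent change of Ops relative to Ops on Repo HEAD).

    Assumes the column order used above:
    Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])
    head = parse_ops(cells[4])
    return name, 100.0 * (ops - head) / head


if __name__ == "__main__":
    rows = [
        "test_membership_stacked_nested_last | 28.2210μs | 9.1427μs | 109.3775 KOps/s | 288.0346 KOps/s | |",
        "test_mod_wrap_and_backward[compile-overhead] | 1.4662ms | 0.9955ms | 1.0046 KOps/s | 992.5771 Ops/s | |",
    ]
    for row in rows:
        name, pct = relative_change(row)
        print(f"{name}: {pct:+.2f}%")
```

Under that assumed definition, most rows stay within a few percent of HEAD, while `test_membership_stacked_nested_last` in the second table comes out at roughly -62%, which is the kind of row worth re-running before reading it as a regression.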
Labels: CLA Signed, enhancement
No description provided.