[Performance] Faster dispatch #487
Merged
facebook-github-bot added the CLA Signed label Jul 11, 2023
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 30.5260μs | 15.6958μs | 63.7114 KOps/s | 61.9288 KOps/s | |
test_plain_set_stack_nested | 0.2428ms | 0.1414ms | 7.0709 KOps/s | 6.8254 KOps/s | |
test_plain_set_nested_inplace | 43.5120μs | 18.2504μs | 54.7933 KOps/s | 53.7699 KOps/s | |
test_plain_set_stack_nested_inplace | 0.3471ms | 0.1747ms | 5.7255 KOps/s | 5.5595 KOps/s | |
test_items | 18.3650μs | 2.4134μs | 414.3484 KOps/s | 414.9788 KOps/s | |
test_items_nested | 0.4152ms | 0.2658ms | 3.7619 KOps/s | 3.7083 KOps/s | |
test_items_nested_locked | 1.0286ms | 0.2663ms | 3.7547 KOps/s | 3.6711 KOps/s | |
test_items_nested_leaf | 0.5890ms | 0.1659ms | 6.0278 KOps/s | 5.9852 KOps/s | |
test_items_stack_nested | 2.5178ms | 1.5200ms | 657.8966 Ops/s | 675.0379 Ops/s | |
test_items_stack_nested_leaf | 2.2768ms | 1.3650ms | 732.5766 Ops/s | 737.5453 Ops/s | |
test_items_stack_nested_locked | 2.0061ms | 0.7595ms | 1.3166 KOps/s | 1.2946 KOps/s | |
test_keys | 24.5250μs | 3.8789μs | 257.8063 KOps/s | 257.8284 KOps/s | |
test_keys_nested | 0.5844ms | 0.1397ms | 7.1596 KOps/s | 6.6960 KOps/s | |
test_keys_nested_locked | 0.2592ms | 0.1383ms | 7.2302 KOps/s | 7.1588 KOps/s | |
test_keys_nested_leaf | 0.3842ms | 0.1390ms | 7.1939 KOps/s | 7.1261 KOps/s | |
test_keys_stack_nested | 2.1886ms | 1.4125ms | 707.9519 Ops/s | 710.0191 Ops/s | |
test_keys_stack_nested_leaf | 1.5480ms | 1.4064ms | 711.0130 Ops/s | 713.1970 Ops/s | |
test_keys_stack_nested_locked | 0.8370ms | 0.6745ms | 1.4827 KOps/s | 1.4768 KOps/s | |
test_values | 18.8150μs | 1.1682μs | 856.0403 KOps/s | 878.3878 KOps/s | |
test_values_nested | 0.1498ms | 48.9412μs | 20.4327 KOps/s | 19.2628 KOps/s | |
test_values_nested_locked | 0.1098ms | 49.1303μs | 20.3541 KOps/s | 19.1211 KOps/s | |
test_values_nested_leaf | 58.0880μs | 43.8428μs | 22.8087 KOps/s | 21.9550 KOps/s | |
test_values_stack_nested | 1.3629ms | 1.1946ms | 837.1037 Ops/s | 826.4048 Ops/s | |
test_values_stack_nested_leaf | 1.4554ms | 1.1922ms | 838.7543 Ops/s | 837.9293 Ops/s | |
test_values_stack_nested_locked | 0.9492ms | 0.5069ms | 1.9726 KOps/s | 1.9646 KOps/s | |
test_membership | 40.1150μs | 1.3232μs | 755.7500 KOps/s | 746.0375 KOps/s | |
test_membership_nested | 25.7680μs | 2.7471μs | 364.0214 KOps/s | 352.7022 KOps/s | |
test_membership_nested_leaf | 25.7980μs | 2.7653μs | 361.6307 KOps/s | 353.6573 KOps/s | |
test_membership_stacked_nested | 44.2320μs | 11.6669μs | 85.7128 KOps/s | 83.8660 KOps/s | |
test_membership_stacked_nested_leaf | 51.8570μs | 11.7869μs | 84.8400 KOps/s | 83.6280 KOps/s | |
test_membership_nested_last | 24.9360μs | 5.9095μs | 169.2190 KOps/s | 153.2159 KOps/s | |
test_membership_nested_leaf_last | 31.4980μs | 5.9197μs | 168.9267 KOps/s | 158.0767 KOps/s | |
test_membership_stacked_nested_last | 0.2346ms | 0.1668ms | 5.9958 KOps/s | 5.7948 KOps/s | |
test_membership_stacked_nested_leaf_last | 38.1810μs | 13.8143μs | 72.3890 KOps/s | 71.1061 KOps/s | |
test_nested_getleaf | 36.6390μs | 10.5426μs | 94.8532 KOps/s | 93.8267 KOps/s | |
test_nested_get | 37.6610μs | 10.0242μs | 99.7582 KOps/s | 99.1306 KOps/s | |
test_stacked_getleaf | 1.1108ms | 0.6408ms | 1.5605 KOps/s | 1.5404 KOps/s | |
test_stacked_get | 0.7868ms | 0.6117ms | 1.6348 KOps/s | 1.6227 KOps/s | |
test_nested_getitemleaf | 33.9340μs | 10.7571μs | 92.9623 KOps/s | 92.7375 KOps/s | |
test_nested_getitem | 37.3290μs | 10.1335μs | 98.6824 KOps/s | 98.1794 KOps/s | |
test_stacked_getitemleaf | 0.7437ms | 0.6406ms | 1.5611 KOps/s | 1.5355 KOps/s | |
test_stacked_getitem | 0.7179ms | 0.6128ms | 1.6319 KOps/s | 1.6195 KOps/s | |
test_lock_nested | 68.1233ms | 0.6250ms | 1.6000 KOps/s | 1.7374 KOps/s | |
test_lock_stack_nested | 10.0283ms | 5.1915ms | 192.6216 Ops/s | 188.7329 Ops/s | |
test_unlock_nested | 1.1159ms | 0.4455ms | 2.2446 KOps/s | 2.2012 KOps/s | |
test_unlock_stack_nested | 88.1510ms | 7.5359ms | 132.6974 Ops/s | 130.0962 Ops/s | |
test_flatten_speed | 0.5608ms | 0.2655ms | 3.7665 KOps/s | 3.7330 KOps/s | |
test_unflatten_speed | 0.5305ms | 0.4518ms | 2.2133 KOps/s | 2.1607 KOps/s | |
test_common_ops | 4.2245ms | 0.6770ms | 1.4772 KOps/s | 1.3824 KOps/s | |
test_creation | 26.3390μs | 2.4319μs | 411.2070 KOps/s | 397.7633 KOps/s | |
test_creation_empty | 24.0850μs | 8.5216μs | 117.3491 KOps/s | 112.6013 KOps/s | |
test_creation_nested_1 | 32.8110μs | 11.7469μs | 85.1286 KOps/s | 83.1161 KOps/s | |
test_creation_nested_2 | 41.7480μs | 15.5627μs | 64.2560 KOps/s | 64.0741 KOps/s | |
test_clone | 63.8900μs | 13.0330μs | 76.7284 KOps/s | 72.7030 KOps/s | |
test_getitem[int] | 34.4840μs | 13.1768μs | 75.8909 KOps/s | 74.9631 KOps/s | |
test_getitem[slice_int] | 61.9260μs | 25.9399μs | 38.5507 KOps/s | 38.4083 KOps/s | |
test_getitem[range] | 0.1022ms | 44.3927μs | 22.5262 KOps/s | 22.0677 KOps/s | |
test_getitem[tuple] | 68.7390μs | 20.0260μs | 49.9352 KOps/s | 48.5040 KOps/s | |
test_getitem[list] | 0.1069ms | 40.2161μs | 24.8656 KOps/s | 24.5739 KOps/s | |
test_setitem_dim[int] | 51.7870μs | 28.1146μs | 35.5687 KOps/s | 33.4296 KOps/s | |
test_setitem_dim[slice_int] | 80.5610μs | 53.0344μs | 18.8557 KOps/s | 18.6405 KOps/s | |
test_setitem_dim[range] | 0.1095ms | 69.6835μs | 14.3506 KOps/s | 13.5184 KOps/s | |
test_setitem_dim[tuple] | 66.0830μs | 41.2589μs | 24.2372 KOps/s | 23.5753 KOps/s | |
test_setitem | 0.1347ms | 18.3469μs | 54.5051 KOps/s | 52.3642 KOps/s | |
test_set | 0.1048ms | 17.7077μs | 56.4727 KOps/s | 53.4342 KOps/s | |
test_set_shared | 3.3297ms | 0.1443ms | 6.9281 KOps/s | 6.6625 KOps/s | |
test_update | 0.1722ms | 19.2014μs | 52.0796 KOps/s | 48.5285 KOps/s | |
test_update_nested | 0.1163ms | 26.7969μs | 37.3177 KOps/s | 34.9906 KOps/s | |
test_set_nested | 0.1045ms | 19.4948μs | 51.2958 KOps/s | 48.4564 KOps/s | |
test_set_nested_new | 0.1165ms | 24.5027μs | 40.8119 KOps/s | 37.6817 KOps/s | |
test_select | 0.1525ms | 50.3697μs | 19.8532 KOps/s | 19.3081 KOps/s | |
test_unbind_speed | 0.5041ms | 0.3696ms | 2.7058 KOps/s | 2.6641 KOps/s | |
test_unbind_speed_stack0 | 77.6024ms | 4.8405ms | 206.5923 Ops/s | 215.0144 Ops/s | |
test_unbind_speed_stack1 | 1.9647μs | 0.6467μs | 1.5464 MOps/s | 1.4957 MOps/s | |
test_split | 67.3025ms | 1.7939ms | 557.4445 Ops/s | 539.8391 Ops/s | |
test_chunk | 69.5471ms | 1.7569ms | 569.1771 Ops/s | 554.8260 Ops/s | |
test_creation[device0] | 0.7357ms | 0.2995ms | 3.3387 KOps/s | 3.3473 KOps/s | |
test_creation_from_tensor | 4.8286ms | 0.3399ms | 2.9423 KOps/s | 2.9828 KOps/s | |
test_add_one[memmap_tensor0] | 95.2280μs | 25.8432μs | 38.6949 KOps/s | 38.7606 KOps/s | |
test_contiguous[memmap_tensor0] | 24.6160μs | 5.6784μs | 176.1065 KOps/s | 175.9022 KOps/s | |
test_stack[memmap_tensor0] | 0.1246ms | 19.4475μs | 51.4204 KOps/s | 51.9102 KOps/s | |
test_memmaptd_index | 0.2746ms | 0.2013ms | 4.9665 KOps/s | 4.9756 KOps/s | |
test_memmaptd_index_astensor | 0.3811ms | 0.2624ms | 3.8107 KOps/s | 3.8317 KOps/s | |
test_memmaptd_index_op | 0.6397ms | 0.5149ms | 1.9420 KOps/s | 1.9159 KOps/s | |
test_reshape_pytree | 54.9520μs | 22.6726μs | 44.1061 KOps/s | 43.2106 KOps/s | |
test_reshape_td | 0.1001ms | 31.9581μs | 31.2909 KOps/s | 30.6752 KOps/s | |
test_view_pytree | 53.1800μs | 22.6555μs | 44.1394 KOps/s | 43.0792 KOps/s | |
test_view_td | 27.7820μs | 4.8826μs | 204.8097 KOps/s | 204.5462 KOps/s | |
test_unbind_pytree | 1.5129ms | 26.3819μs | 37.9048 KOps/s | 38.3672 KOps/s | |
test_unbind_td | 0.1220ms | 59.1526μs | 16.9054 KOps/s | 16.3627 KOps/s | |
test_split_pytree | 70.8620μs | 26.0919μs | 38.3261 KOps/s | 38.4914 KOps/s | |
test_split_td | 0.1065ms | 47.0455μs | 21.2560 KOps/s | 21.1228 KOps/s | |
test_add_pytree | 81.0010μs | 32.4108μs | 30.8539 KOps/s | 31.2902 KOps/s | |
test_add_td | 95.9280μs | 46.6120μs | 21.4537 KOps/s | 20.8235 KOps/s | |
test_distributed | 24.3650μs | 6.2148μs | 160.9066 KOps/s | 164.4790 KOps/s | |
test_tdmodule | 0.1357ms | 20.6830μs | 48.3488 KOps/s | 46.2319 KOps/s | |
test_tdmodule_dispatch | 0.2028ms | 40.1882μs | 24.8830 KOps/s | 24.7973 KOps/s | |
test_tdseq | 56.9760μs | 23.5455μs | 42.4710 KOps/s | 40.1976 KOps/s | |
test_tdseq_dispatch | 0.4909ms | 43.5487μs | 22.9628 KOps/s | 22.5591 KOps/s | |
test_instantiation_functorch | 1.3817ms | 1.2620ms | 792.3864 Ops/s | 761.6305 Ops/s | |
test_instantiation_td | 2.1840ms | 1.0061ms | 993.9611 Ops/s | 967.8178 Ops/s | |
test_exec_functorch | 0.2471ms | 0.1566ms | 6.3860 KOps/s | 6.1369 KOps/s | |
test_exec_functional_call | 0.2240ms | 0.1469ms | 6.8061 KOps/s | 6.6538 KOps/s | |
test_exec_td | 0.2255ms | 0.1430ms | 6.9937 KOps/s | 6.6993 KOps/s | |
test_exec_td_decorator | 78.4193ms | 0.2002ms | 4.9950 KOps/s | 5.3828 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2088ms | 0.8917ms | 1.1215 KOps/s | 1.1007 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.6983ms | 0.4650ms | 2.1504 KOps/s | 2.0878 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.0660ms | 0.7752ms | 1.2900 KOps/s | 1.2435 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.5836ms | 0.3890ms | 2.5710 KOps/s | 2.5353 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 2.7496ms | 1.7888ms | 559.0249 Ops/s | 544.0713 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1699ms | 0.5139ms | 1.9458 KOps/s | 1.8694 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.0782ms | 1.4733ms | 678.7629 Ops/s | 638.5780 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.2733ms | 0.4019ms | 2.4881 KOps/s | 2.4594 KOps/s |
# Conflicts:
#	tensordict/nn/common.py
#	tensordict/tensordict.py
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.5137ms | 12.7531μs | 78.4123 KOps/s | 78.5245 KOps/s | |
test_plain_set_stack_nested | 0.1564ms | 0.1159ms | 8.6262 KOps/s | 8.3392 KOps/s | |
test_plain_set_nested_inplace | 35.7100μs | 14.0447μs | 71.2011 KOps/s | 70.6553 KOps/s | |
test_plain_set_stack_nested_inplace | 0.1721ms | 0.1446ms | 6.9136 KOps/s | 6.8927 KOps/s | |
test_items | 60.4200μs | 4.7425μs | 210.8606 KOps/s | 213.0502 KOps/s | |
test_items_nested | 0.3832ms | 0.3358ms | 2.9779 KOps/s | 2.9476 KOps/s | |
test_items_nested_locked | 0.3889ms | 0.3408ms | 2.9345 KOps/s | 2.9129 KOps/s | |
test_items_nested_leaf | 0.2187ms | 0.1984ms | 5.0405 KOps/s | 4.9850 KOps/s | |
test_items_stack_nested | 1.7790ms | 1.4971ms | 667.9760 Ops/s | 670.9979 Ops/s | |
test_items_stack_nested_leaf | 1.3719ms | 1.3198ms | 757.6999 Ops/s | 763.0325 Ops/s | |
test_items_stack_nested_locked | 0.8804ms | 0.8397ms | 1.1909 KOps/s | 1.1831 KOps/s | |
test_keys | 19.7700μs | 4.6033μs | 217.2354 KOps/s | 216.0365 KOps/s | |
test_keys_nested | 3.3810ms | 90.6469μs | 11.0318 KOps/s | 11.1171 KOps/s | |
test_keys_nested_locked | 0.1204ms | 90.0355μs | 11.1067 KOps/s | 11.1980 KOps/s | |
test_keys_nested_leaf | 41.1171ms | 87.2258μs | 11.4645 KOps/s | 12.2559 KOps/s | |
test_keys_stack_nested | 1.3705ms | 1.2988ms | 769.9358 Ops/s | 759.5057 Ops/s | |
test_keys_stack_nested_leaf | 1.3339ms | 1.2860ms | 777.6083 Ops/s | 775.1688 Ops/s | |
test_keys_stack_nested_locked | 0.6835ms | 0.6386ms | 1.5659 KOps/s | 1.5579 KOps/s | |
test_values | 11.6567μs | 1.8739μs | 533.6542 KOps/s | 527.7192 KOps/s | |
test_values_nested | 63.0410μs | 43.3559μs | 23.0649 KOps/s | 23.1825 KOps/s | |
test_values_nested_locked | 71.4210μs | 45.7255μs | 21.8696 KOps/s | 21.9513 KOps/s | |
test_values_nested_leaf | 58.4310μs | 37.6098μs | 26.5888 KOps/s | 26.6280 KOps/s | |
test_values_stack_nested | 1.1903ms | 1.1546ms | 866.1262 Ops/s | 874.1020 Ops/s | |
test_values_stack_nested_leaf | 1.2249ms | 1.1322ms | 883.2507 Ops/s | 888.0929 Ops/s | |
test_values_stack_nested_locked | 0.5473ms | 0.5133ms | 1.9483 KOps/s | 1.9729 KOps/s | |
test_membership | 5.1202μs | 0.9479μs | 1.0550 MOps/s | 1.0326 MOps/s | |
test_membership_nested | 51.5510μs | 2.2511μs | 444.2255 KOps/s | 445.8706 KOps/s | |
test_membership_nested_leaf | 21.5305μs | 2.1482μs | 465.5068 KOps/s | 465.2635 KOps/s | |
test_membership_stacked_nested | 53.7410μs | 11.1687μs | 89.5361 KOps/s | 89.7294 KOps/s | |
test_membership_stacked_nested_leaf | 61.8910μs | 11.1228μs | 89.9054 KOps/s | 90.0321 KOps/s | |
test_membership_nested_last | 31.2300μs | 4.6532μs | 214.9038 KOps/s | 216.9416 KOps/s | |
test_membership_nested_leaf_last | 31.5300μs | 4.6738μs | 213.9578 KOps/s | 215.4698 KOps/s | |
test_membership_stacked_nested_last | 0.1603ms | 0.1345ms | 7.4359 KOps/s | 7.4054 KOps/s | |
test_membership_stacked_nested_leaf_last | 46.4400μs | 13.0688μs | 76.5183 KOps/s | 77.2260 KOps/s | |
test_nested_getleaf | 56.2810μs | 8.3933μs | 119.1424 KOps/s | 118.6573 KOps/s | |
test_nested_get | 23.3000μs | 7.9481μs | 125.8166 KOps/s | 125.6936 KOps/s | |
test_stacked_getleaf | 0.5993ms | 0.5702ms | 1.7538 KOps/s | 1.7798 KOps/s | |
test_stacked_get | 0.7201ms | 0.5291ms | 1.8899 KOps/s | 1.8945 KOps/s | |
test_nested_getitemleaf | 30.9810μs | 8.4792μs | 117.9350 KOps/s | 118.3127 KOps/s | |
test_nested_getitem | 32.6800μs | 8.0201μs | 124.6874 KOps/s | 125.1163 KOps/s | |
test_stacked_getitemleaf | 0.6217ms | 0.5633ms | 1.7752 KOps/s | 1.7780 KOps/s | |
test_stacked_getitem | 0.5866ms | 0.5315ms | 1.8815 KOps/s | 1.8997 KOps/s | |
test_lock_nested | 3.1556ms | 0.5502ms | 1.8175 KOps/s | 1.8188 KOps/s | |
test_lock_stack_nested | 81.1771ms | 7.1552ms | 139.7585 Ops/s | 138.2470 Ops/s | |
test_unlock_nested | 2.3154ms | 0.4301ms | 2.3249 KOps/s | 2.3308 KOps/s | |
test_unlock_stack_nested | 66.8682ms | 6.2050ms | 161.1606 Ops/s | 163.1291 Ops/s | |
test_flatten_speed | 0.2235ms | 0.1860ms | 5.3775 KOps/s | 5.3501 KOps/s | |
test_unflatten_speed | 0.4021ms | 0.3656ms | 2.7349 KOps/s | 2.7563 KOps/s | |
test_common_ops | 1.1094ms | 0.6193ms | 1.6146 KOps/s | 1.6127 KOps/s | |
test_creation | 51.3110μs | 2.1167μs | 472.4301 KOps/s | 474.3712 KOps/s | |
test_creation_empty | 25.0200μs | 7.2725μs | 137.5042 KOps/s | 139.5736 KOps/s | |
test_creation_nested_1 | 30.0900μs | 9.6389μs | 103.7462 KOps/s | 105.4041 KOps/s | |
test_creation_nested_2 | 34.5300μs | 12.3994μs | 80.6488 KOps/s | 82.4804 KOps/s | |
test_clone | 86.5710μs | 14.8482μs | 67.3481 KOps/s | 66.2146 KOps/s | |
test_getitem[int] | 32.4200μs | 12.1158μs | 82.5366 KOps/s | 80.1325 KOps/s | |
test_getitem[slice_int] | 68.3000μs | 23.6910μs | 42.2101 KOps/s | 40.4622 KOps/s | |
test_getitem[range] | 69.3700μs | 42.4412μs | 23.5620 KOps/s | 24.0487 KOps/s | |
test_getitem[tuple] | 59.5110μs | 20.7867μs | 48.1078 KOps/s | 48.2770 KOps/s | |
test_getitem[list] | 0.2847ms | 38.3327μs | 26.0874 KOps/s | 26.5028 KOps/s | |
test_setitem_dim[int] | 56.5820μs | 28.1540μs | 35.5190 KOps/s | 37.0973 KOps/s | |
test_setitem_dim[slice_int] | 67.0910μs | 48.1196μs | 20.7816 KOps/s | 21.2626 KOps/s | |
test_setitem_dim[range] | 83.3910μs | 64.7650μs | 15.4404 KOps/s | 15.6291 KOps/s | |
test_setitem_dim[tuple] | 57.9910μs | 41.1404μs | 24.3070 KOps/s | 24.7801 KOps/s | |
test_setitem | 83.3700μs | 18.8871μs | 52.9463 KOps/s | 52.5325 KOps/s | |
test_set | 82.7210μs | 18.4901μs | 54.0830 KOps/s | 53.2258 KOps/s | |
test_set_shared | 2.8528ms | 0.1061ms | 9.4288 KOps/s | 8.5721 KOps/s | |
test_update | 0.1100ms | 19.7668μs | 50.5898 KOps/s | 50.1241 KOps/s | |
test_update_nested | 85.1210μs | 26.3728μs | 37.9178 KOps/s | 38.0203 KOps/s | |
test_set_nested | 77.7010μs | 19.4939μs | 51.2981 KOps/s | 47.5042 KOps/s | |
test_set_nested_new | 75.0910μs | 23.9359μs | 41.7782 KOps/s | 38.8219 KOps/s | |
test_select | 0.1628ms | 47.3357μs | 21.1257 KOps/s | 21.2659 KOps/s | |
test_to | 76.0810μs | 55.2496μs | 18.0997 KOps/s | 18.5178 KOps/s | |
test_to_nonblocking | 0.1728ms | 36.0692μs | 27.7245 KOps/s | 28.3106 KOps/s | |
test_unbind_speed | 0.4036ms | 0.3608ms | 2.7713 KOps/s | 2.8308 KOps/s | |
test_unbind_speed_stack0 | 62.3145ms | 4.2709ms | 234.1453 Ops/s | 249.6919 Ops/s | |
test_unbind_speed_stack1 | 1.6065μs | 0.5276μs | 1.8954 MOps/s | 1.9174 MOps/s | |
test_split | 54.0970ms | 1.7695ms | 565.1360 Ops/s | 568.3826 Ops/s | |
test_chunk | 53.5724ms | 1.7511ms | 571.0843 Ops/s | 573.1520 Ops/s | |
test_creation[device0] | 0.5353ms | 0.3103ms | 3.2232 KOps/s | 3.2419 KOps/s | |
test_creation[device1] | 54.9833ms | 0.3362ms | 2.9748 KOps/s | 3.2061 KOps/s | |
test_creation_from_tensor | 0.6700ms | 0.3381ms | 2.9577 KOps/s | 2.9667 KOps/s | |
test_add_one[memmap_tensor0] | 0.1535ms | 24.7970μs | 40.3275 KOps/s | 40.4631 KOps/s | |
test_add_one[memmap_tensor1] | 0.2042ms | 74.3051μs | 13.4580 KOps/s | 13.7965 KOps/s | |
test_contiguous[memmap_tensor0] | 32.0900μs | 6.1196μs | 163.4092 KOps/s | 167.1838 KOps/s | |
test_contiguous[memmap_tensor1] | 0.1662ms | 22.9552μs | 43.5631 KOps/s | 44.7876 KOps/s | |
test_stack[memmap_tensor0] | 52.0590μs | 20.1529μs | 49.6206 KOps/s | 50.0033 KOps/s | |
test_stack[memmap_tensor1] | 0.1647ms | 74.2727μs | 13.4639 KOps/s | 13.5986 KOps/s | |
test_memmaptd_index | 0.2666ms | 0.2416ms | 4.1399 KOps/s | 4.2706 KOps/s | |
test_memmaptd_index_astensor | 0.3339ms | 0.2959ms | 3.3792 KOps/s | 3.4285 KOps/s | |
test_memmaptd_index_op | 0.6462ms | 0.5885ms | 1.6991 KOps/s | 1.7267 KOps/s | |
test_reshape_pytree | 63.3090μs | 20.9864μs | 47.6500 KOps/s | 47.3137 KOps/s | |
test_reshape_td | 52.2290μs | 31.1779μs | 32.0740 KOps/s | 32.3355 KOps/s | |
test_view_pytree | 49.5190μs | 20.6910μs | 48.3301 KOps/s | 48.0140 KOps/s | |
test_view_td | 17.3690μs | 4.0640μs | 246.0654 KOps/s | 246.8623 KOps/s | |
test_unbind_pytree | 46.4500μs | 26.2314μs | 38.1222 KOps/s | 38.6842 KOps/s | |
test_unbind_td | 89.0410μs | 56.7894μs | 17.6089 KOps/s | 17.4791 KOps/s | |
test_split_pytree | 44.1600μs | 24.6879μs | 40.5056 KOps/s | 41.0953 KOps/s | |
test_split_td | 71.2610μs | 44.1815μs | 22.6339 KOps/s | 22.3686 KOps/s | |
test_add_pytree | 88.7010μs | 33.8527μs | 29.5397 KOps/s | 30.1407 KOps/s | |
test_add_td | 0.1640ms | 47.5740μs | 21.0199 KOps/s | 21.2159 KOps/s | |
test_distributed | 22.3810μs | 5.4322μs | 184.0890 KOps/s | 183.2855 KOps/s | |
test_tdmodule | 88.3210μs | 17.2330μs | 58.0282 KOps/s | 59.8419 KOps/s | |
test_tdmodule_dispatch | 0.2007ms | 33.9577μs | 29.4484 KOps/s | 30.0871 KOps/s | |
test_tdseq | 50.2100μs | 20.4098μs | 48.9961 KOps/s | 50.6939 KOps/s | |
test_tdseq_dispatch | 56.8100μs | 36.9858μs | 27.0374 KOps/s | 27.5299 KOps/s | |
test_instantiation_functorch | 1.7363ms | 1.6881ms | 592.3811 Ops/s | 591.2192 Ops/s | |
test_instantiation_td | 1.6609ms | 1.1927ms | 838.4273 Ops/s | 837.4719 Ops/s | |
test_exec_functorch | 0.2163ms | 0.1612ms | 6.2038 KOps/s | 6.1525 KOps/s | |
test_exec_functional_call | 0.2102ms | 0.1602ms | 6.2426 KOps/s | 6.2531 KOps/s | |
test_exec_td | 0.1842ms | 0.1479ms | 6.7623 KOps/s | 6.7717 KOps/s | |
test_exec_td_decorator | 0.8268ms | 0.1870ms | 5.3479 KOps/s | 5.3063 KOps/s | |
test_vmap_mlp_speed[True-True] | 1.2408ms | 1.0615ms | 942.0459 Ops/s | 953.3747 Ops/s | |
test_vmap_mlp_speed[True-False] | 0.7547ms | 0.6058ms | 1.6508 KOps/s | 1.6662 KOps/s | |
test_vmap_mlp_speed[False-True] | 1.1140ms | 0.9684ms | 1.0327 KOps/s | 1.0407 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6589ms | 0.5365ms | 1.8639 KOps/s | 1.8994 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 67.3704ms | 2.1700ms | 460.8278 Ops/s | 498.7410 Ops/s | |
test_vmap_mlp_speed_decorator[True-False] | 1.1773ms | 0.6521ms | 1.5335 KOps/s | 1.5518 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 2.2404ms | 1.7650ms | 566.5815 Ops/s | 575.9941 Ops/s | |
test_vmap_mlp_speed_decorator[False-False] | 1.0433ms | 0.5538ms | 1.8058 KOps/s | 1.8436 KOps/s | |
test_vmap_transformer_speed[True-True] | 12.6508ms | 12.5271ms | 79.8270 Ops/s | 81.4172 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.3593ms | 8.2163ms | 121.7087 Ops/s | 123.9095 Ops/s | |
test_vmap_transformer_speed[False-True] | 14.2216ms | 12.5113ms | 79.9276 Ops/s | 81.5993 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.3206ms | 8.1458ms | 122.7623 Ops/s | 125.1145 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 65.3222ms | 64.2980ms | 15.5526 Ops/s | 14.6537 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 98.4854ms | 21.4509ms | 46.6181 Ops/s | 50.9391 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 59.5347ms | 58.4805ms | 17.0997 Ops/s | 17.3815 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 21.6782ms | 19.5073ms | 51.2627 Ops/s | 52.1782 Ops/s |
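
The Change column is empty in both tables above. As a rough way to read the rows, the sketch below parses the `Ops` and `Ops on Repo HEAD` cells and reports their relative difference. The helper names (`parse_ops`, `relative_change`) are hypothetical, and defining "change" as `(Ops - HEAD) / HEAD` is an assumption; the benchmark bot's own formula is not shown in this thread.

```python
import re

# Multipliers for the throughput units that appear in the "Ops" columns.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text):
    """Convert a cell such as '9.4288 KOps/s' to operations per second."""
    match = re.fullmatch(r"\s*([0-9.]+)\s*([KM]?Ops/s)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized throughput: {text!r}")
    value, unit = match.groups()
    return float(value) * _UNITS[unit]


def relative_change(row):
    """Return (test name, fractional change of Ops relative to Ops on Repo HEAD).

    Assumed definition: (ops - ops_head) / ops_head. Positive means the PR
    branch is faster than HEAD for that benchmark.
    """
    cells = [cell.strip() for cell in row.split("|")]
    name, ops, ops_head = cells[0], cells[3], cells[4]
    new, base = parse_ops(ops), parse_ops(ops_head)
    return name, (new - base) / base


# Three rows copied verbatim from the second table.
rows = [
    "test_set_shared | 2.8528ms | 0.1061ms | 9.4288 KOps/s | 8.5721 KOps/s | |",
    "test_membership | 5.1202μs | 0.9479μs | 1.0550 MOps/s | 1.0326 MOps/s | |",
    "test_vmap_mlp_speed[True-True] | 1.2408ms | 1.0615ms | 942.0459 Ops/s | 953.3747 Ops/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.2%}")
```

Differences of a percent or two in either direction are common in these rows and may simply reflect run-to-run noise, so a single comparison like this is only a first look.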
No description provided.