[Performance] Minor efficiency improvements #703
Merged
facebook-github-bot added the CLA Signed label Mar 8, 2024
(CLA Signed: This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)

Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 53.2790μs | 16.7948μs | 59.5421 KOps/s | 59.3768 KOps/s | |
test_plain_set_stack_nested | 49.0590μs | 16.8181μs | 59.4598 KOps/s | 59.8188 KOps/s | |
test_plain_set_nested_inplace | 46.6270μs | 19.1037μs | 52.3457 KOps/s | 52.5112 KOps/s | |
test_plain_set_stack_nested_inplace | 59.3610μs | 19.0214μs | 52.5724 KOps/s | 52.5418 KOps/s | |
test_items | 34.4040μs | 2.3238μs | 430.3339 KOps/s | 429.5628 KOps/s | |
test_items_nested | 0.4760ms | 0.2699ms | 3.7055 KOps/s | 3.7804 KOps/s | |
test_items_nested_locked | 0.3463ms | 0.2700ms | 3.7035 KOps/s | 3.7906 KOps/s | |
test_items_nested_leaf | 0.3395ms | 0.1654ms | 6.0472 KOps/s | 6.1905 KOps/s | |
test_items_stack_nested | 0.3429ms | 0.2711ms | 3.6884 KOps/s | 3.8277 KOps/s | |
test_items_stack_nested_leaf | 0.3059ms | 0.1663ms | 6.0124 KOps/s | 6.0949 KOps/s | |
test_items_stack_nested_locked | 0.3726ms | 0.2712ms | 3.6874 KOps/s | 3.8388 KOps/s | |
test_keys | 29.3440μs | 3.7209μs | 268.7534 KOps/s | 267.0487 KOps/s | |
test_keys_nested | 0.8200ms | 0.1439ms | 6.9501 KOps/s | 6.9868 KOps/s | |
test_keys_nested_locked | 0.2412ms | 0.1464ms | 6.8286 KOps/s | 6.7330 KOps/s | |
test_keys_nested_leaf | 37.2150ms | 0.1300ms | 7.6947 KOps/s | 8.0764 KOps/s | |
test_keys_stack_nested | 0.2113ms | 0.1437ms | 6.9614 KOps/s | 6.8573 KOps/s | |
test_keys_stack_nested_leaf | 0.2240ms | 0.1275ms | 7.8420 KOps/s | 7.7884 KOps/s | |
test_keys_stack_nested_locked | 0.2172ms | 0.1533ms | 6.5247 KOps/s | 6.7255 KOps/s | |
test_values | 6.8446μs | 1.1218μs | 891.4559 KOps/s | 880.9146 KOps/s | |
test_values_nested | 96.0890μs | 50.8660μs | 19.6595 KOps/s | 20.2135 KOps/s | |
test_values_nested_locked | 95.1760μs | 50.6384μs | 19.7478 KOps/s | 19.6699 KOps/s | |
test_values_nested_leaf | 84.9880μs | 45.7654μs | 21.8506 KOps/s | 22.0703 KOps/s | |
test_values_stack_nested | 0.1005ms | 51.0894μs | 19.5735 KOps/s | 19.9743 KOps/s | |
test_values_stack_nested_leaf | 93.1330μs | 45.5138μs | 21.9714 KOps/s | 22.3772 KOps/s | |
test_values_stack_nested_locked | 0.1183ms | 51.7904μs | 19.3086 KOps/s | 19.5553 KOps/s | |
test_membership | 28.9730μs | 1.3292μs | 752.3521 KOps/s | 769.7910 KOps/s | |
test_membership_nested | 27.7020μs | 3.3711μs | 296.6392 KOps/s | 291.9265 KOps/s | |
test_membership_nested_leaf | 34.6040μs | 3.4143μs | 292.8817 KOps/s | 289.8442 KOps/s | |
test_membership_stacked_nested | 26.9800μs | 3.4044μs | 293.7408 KOps/s | 241.0081 KOps/s | |
test_membership_stacked_nested_leaf | 26.1180μs | 3.3952μs | 294.5316 KOps/s | 283.1872 KOps/s | |
test_membership_nested_last | 33.5620μs | 4.1875μs | 238.8080 KOps/s | 233.0439 KOps/s | |
test_membership_nested_leaf_last | 36.0870μs | 4.2018μs | 237.9932 KOps/s | 235.0371 KOps/s | |
test_membership_stacked_nested_last | 23.5830μs | 4.1911μs | 238.6009 KOps/s | 240.3064 KOps/s | |
test_membership_stacked_nested_leaf_last | 32.7110μs | 4.1781μs | 239.3413 KOps/s | 237.6010 KOps/s | |
test_nested_getleaf | 47.2370μs | 10.5358μs | 94.9145 KOps/s | 97.3747 KOps/s | |
test_nested_get | 41.0060μs | 9.8264μs | 101.7662 KOps/s | 100.5304 KOps/s | |
test_stacked_getleaf | 51.7540μs | 10.9491μs | 91.3314 KOps/s | 94.1009 KOps/s | |
test_stacked_get | 27.3410μs | 9.9294μs | 100.7115 KOps/s | 103.9720 KOps/s | |
test_nested_getitemleaf | 35.4360μs | 10.9399μs | 91.4085 KOps/s | 91.2843 KOps/s | |
test_nested_getitem | 30.7370μs | 10.4969μs | 95.2663 KOps/s | 96.6390 KOps/s | |
test_stacked_getitemleaf | 30.7770μs | 10.9587μs | 91.2518 KOps/s | 91.8325 KOps/s | |
test_stacked_getitem | 45.3540μs | 10.3934μs | 96.2151 KOps/s | 96.8220 KOps/s | |
test_lock_nested | 0.7058ms | 0.3246ms | 3.0808 KOps/s | 3.0284 KOps/s | |
test_lock_stack_nested | 0.4135ms | 0.2957ms | 3.3815 KOps/s | 3.3492 KOps/s | |
test_unlock_nested | 74.9622ms | 0.4062ms | 2.4621 KOps/s | 2.4357 KOps/s | |
test_unlock_stack_nested | 0.5010ms | 0.3047ms | 3.2814 KOps/s | 3.2523 KOps/s | |
test_flatten_speed | 0.5951ms | 0.2832ms | 3.5315 KOps/s | 3.6646 KOps/s | |
test_unflatten_speed | 0.6101ms | 0.4025ms | 2.4846 KOps/s | 2.5987 KOps/s | |
test_common_ops | 4.9187ms | 0.7067ms | 1.4150 KOps/s | 1.4967 KOps/s | |
test_creation | 16.3200μs | 1.7963μs | 556.7074 KOps/s | 570.5859 KOps/s | |
test_creation_empty | 48.0990μs | 10.9198μs | 91.5770 KOps/s | 105.6021 KOps/s | |
test_creation_nested_1 | 36.2770μs | 13.4935μs | 74.1098 KOps/s | 80.7965 KOps/s | |
test_creation_nested_2 | 44.8630μs | 16.5979μs | 60.2487 KOps/s | 63.7997 KOps/s | |
test_clone | 59.5810μs | 12.7096μs | 78.6806 KOps/s | 76.9423 KOps/s | |
test_getitem[int] | 37.7200μs | 10.6934μs | 93.5153 KOps/s | 90.8638 KOps/s | |
test_getitem[slice_int] | 60.3920μs | 22.0614μs | 45.3280 KOps/s | 44.9791 KOps/s | |
test_getitem[range] | 0.1532ms | 43.4105μs | 23.0359 KOps/s | 24.1646 KOps/s | |
test_getitem[tuple] | 49.3120μs | 18.0811μs | 55.3065 KOps/s | 53.9962 KOps/s | |
test_getitem[list] | 0.1613ms | 37.4906μs | 26.6734 KOps/s | 27.8949 KOps/s | |
test_setitem_dim[int] | 67.1940μs | 36.2458μs | 27.5894 KOps/s | 31.1813 KOps/s | |
test_setitem_dim[slice_int] | 0.1030ms | 63.1862μs | 15.8262 KOps/s | 17.4055 KOps/s | |
test_setitem_dim[range] | 0.1147ms | 83.2869μs | 12.0067 KOps/s | 12.6599 KOps/s | |
test_setitem_dim[tuple] | 87.0320μs | 51.5092μs | 19.4140 KOps/s | 20.3265 KOps/s | |
test_setitem | 73.4970μs | 20.0347μs | 49.9133 KOps/s | 53.5022 KOps/s | |
test_set | 70.9820μs | 19.4956μs | 51.2935 KOps/s | 55.0577 KOps/s | |
test_set_shared | 3.4358ms | 0.1388ms | 7.2039 KOps/s | 7.1631 KOps/s | |
test_update | 87.7030μs | 22.8901μs | 43.6869 KOps/s | 46.3136 KOps/s | |
test_update_nested | 84.5670μs | 29.7726μs | 33.5879 KOps/s | 35.6086 KOps/s | |
test_set_nested | 83.8750μs | 21.5288μs | 46.4494 KOps/s | 49.7746 KOps/s | |
test_set_nested_new | 60.9830μs | 24.8120μs | 40.3031 KOps/s | 42.3334 KOps/s | |
test_select | 86.7010μs | 39.1428μs | 25.5475 KOps/s | 26.8230 KOps/s | |
test_select_nested | 0.1586ms | 58.1795μs | 17.1882 KOps/s | 17.4385 KOps/s | |
test_exclude_nested | 0.2672ms | 0.1168ms | 8.5590 KOps/s | 8.8690 KOps/s | |
test_empty[True] | 0.7846ms | 0.4040ms | 2.4754 KOps/s | 2.4846 KOps/s | |
test_empty[False] | 5.0374μs | 1.0205μs | 979.9268 KOps/s | 1.0069 MOps/s | |
test_unbind_speed | 0.5116ms | 0.2383ms | 4.1959 KOps/s | 4.1918 KOps/s | |
test_unbind_speed_stack0 | 0.3919ms | 0.2385ms | 4.1926 KOps/s | 4.1322 KOps/s | |
test_unbind_speed_stack1 | 0.1211s | 0.6744ms | 1.4827 KOps/s | 1.4973 KOps/s | |
test_split | 0.1052s | 1.5880ms | 629.7209 Ops/s | 629.3406 Ops/s | |
test_chunk | 1.5112ms | 1.3812ms | 723.9841 Ops/s | 714.4765 Ops/s | |
test_creation[device0] | 3.3523ms | 0.1024ms | 9.7633 KOps/s | 9.8519 KOps/s | |
test_creation_from_tensor | 0.1711ms | 80.5514μs | 12.4144 KOps/s | 12.2989 KOps/s | |
test_add_one[memmap_tensor0] | 92.6520μs | 5.2961μs | 188.8179 KOps/s | 188.1702 KOps/s | |
test_contiguous[memmap_tensor0] | 10.8910μs | 0.6134μs | 1.6302 MOps/s | 1.6020 MOps/s | |
test_stack[memmap_tensor0] | 21.0890μs | 3.4342μs | 291.1913 KOps/s | 269.0497 KOps/s | |
test_memmaptd_index | 1.1414ms | 0.2298ms | 4.3517 KOps/s | 4.1371 KOps/s | |
test_memmaptd_index_astensor | 0.6284ms | 0.2922ms | 3.4218 KOps/s | 3.3254 KOps/s | |
test_memmaptd_index_op | 1.1774ms | 0.5923ms | 1.6884 KOps/s | 1.7039 KOps/s | |
test_serialize_model | 0.2081s | 0.1115s | 8.9663 Ops/s | 8.8171 Ops/s | |
test_serialize_model_pickle | 0.4524s | 0.3753s | 2.6643 Ops/s | 2.6335 Ops/s | |
test_serialize_weights | 0.1085s | 0.1003s | 9.9728 Ops/s | 10.2583 Ops/s | |
test_serialize_weights_returnearly | 0.2344s | 0.1314s | 7.6086 Ops/s | 7.2721 Ops/s | |
test_serialize_weights_pickle | 1.0109s | 0.5860s | 1.7065 Ops/s | 2.4168 Ops/s | |
test_serialize_weights_filesystem | 0.1019s | 91.5194ms | 10.9266 Ops/s | 10.6056 Ops/s | |
test_serialize_model_filesystem | 95.2376ms | 89.9942ms | 11.1118 Ops/s | 10.7953 Ops/s | |
test_reshape_pytree | 45.2840μs | 20.6860μs | 48.3420 KOps/s | 50.3734 KOps/s | |
test_reshape_td | 81.8830μs | 30.8450μs | 32.4202 KOps/s | 33.2837 KOps/s | |
test_view_pytree | 67.8360μs | 20.6098μs | 48.5206 KOps/s | 50.1535 KOps/s | |
test_view_td | 0.1202s | 58.1472μs | 17.1977 KOps/s | 17.9954 KOps/s | |
test_unbind_pytree | 71.5730μs | 24.2345μs | 41.2635 KOps/s | 44.7255 KOps/s | |
test_unbind_td | 0.4184ms | 35.6790μs | 28.0277 KOps/s | 26.5691 KOps/s | |
test_split_pytree | 69.3490μs | 23.8176μs | 41.9857 KOps/s | 43.5307 KOps/s | |
test_split_td | 0.1480ms | 38.9018μs | 25.7058 KOps/s | 26.1930 KOps/s | |
test_add_pytree | 66.8140μs | 29.5203μs | 33.8750 KOps/s | 35.5674 KOps/s | |
test_add_td | 0.1644ms | 56.0561μs | 17.8393 KOps/s | 19.6850 KOps/s | |
test_distributed | 0.3782ms | 98.5473μs | 10.1474 KOps/s | 10.3650 KOps/s | |
test_tdmodule | 68.4770μs | 18.1236μs | 55.1766 KOps/s | 58.1555 KOps/s | |
test_tdmodule_dispatch | 60.7220μs | 34.1705μs | 29.2650 KOps/s | 31.7089 KOps/s | |
test_tdseq | 49.4720μs | 20.8243μs | 48.0209 KOps/s | 52.0778 KOps/s | |
test_tdseq_dispatch | 72.6850μs | 39.7685μs | 25.1455 KOps/s | 26.4804 KOps/s | |
test_instantiation_functorch | 2.0346ms | 1.3031ms | 767.4175 Ops/s | 790.6928 Ops/s | |
test_instantiation_td | 6.3793ms | 0.9908ms | 1.0093 KOps/s | 1.0459 KOps/s | |
test_exec_functorch | 0.2480ms | 0.1574ms | 6.3537 KOps/s | 6.4934 KOps/s | |
test_exec_functional_call | 0.2908ms | 0.1496ms | 6.6836 KOps/s | 7.2649 KOps/s | |
test_exec_td | 0.3435ms | 0.1415ms | 7.0648 KOps/s | 7.3531 KOps/s | |
test_exec_td_decorator | 0.5375ms | 0.1941ms | 5.1531 KOps/s | 5.4406 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.6527ms | 0.4758ms | 2.1018 KOps/s | 2.2168 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8998ms | 0.4759ms | 2.1012 KOps/s | 2.2206 KOps/s | |
test_vmap_mlp_speed[False-True] | 2.8308ms | 0.3911ms | 2.5568 KOps/s | 2.7174 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6258ms | 0.3854ms | 2.5944 KOps/s | 2.7411 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.9554ms | 0.4963ms | 2.0147 KOps/s | 2.1279 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7598ms | 0.4954ms | 2.0187 KOps/s | 2.1263 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.5445ms | 0.4004ms | 2.4977 KOps/s | 2.6137 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7839ms | 0.4021ms | 2.4871 KOps/s | 2.6271 KOps/s | |
test_to_module_speed[True] | 2.1659ms | 1.3673ms | 731.3583 Ops/s | 756.6089 Ops/s | |
test_to_module_speed[False] | 1.8075ms | 1.3624ms | 733.9827 Ops/s | 746.4541 Ops/s |
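
The Change column above is empty, so the relative difference between the two throughput columns has to be worked out by hand. The following is a minimal sketch of that arithmetic, assuming that "Ops" is the throughput measured on this PR and "Ops on Repo HEAD" is the baseline; the helper names `parse_ops` and `relative_change` are hypothetical and are not part of the benchmark tooling. The three sample rows are copied from the table.

```python
# Multipliers for the throughput units that appear in the table.
UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(text):
    """Convert a string such as '59.5421 KOps/s' to operations per second."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def relative_change(ops_pr, ops_head):
    """Throughput change of the PR relative to repo HEAD, in percent.

    Assumption: a positive value means the PR is faster than HEAD.
    """
    pr, head = parse_ops(ops_pr), parse_ops(ops_head)
    return (pr - head) / head * 100.0


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the table.
rows = {
    "test_membership_stacked_nested": ("293.7408 KOps/s", "241.0081 KOps/s"),
    "test_serialize_weights_pickle": ("1.7065 Ops/s", "2.4168 Ops/s"),
    "test_empty[False]": ("979.9268 KOps/s", "1.0069 MOps/s"),
}

changes = {name: relative_change(pr, head) for name, (pr, head) in rows.items()}
for name, change in changes.items():
    print(f"{name}: {change:+.1f}%")
```

Normalizing to operations per second first matters for rows such as `test_empty[False]`, where the two columns use different units (KOps/s and MOps/s).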

vmoens added a commit that referenced this pull request Mar 24, 2024
vmoens added a commit that referenced this pull request Mar 24, 2024 (cherry picked from commit be7c991)
vmoens added a commit that referenced this pull request Mar 25, 2024 (cherry picked from commit be7c991)
Labels: CLA Signed, Performance
No description provided.