[BugFix] consistent use of non_blocking in tensordict and torch.Tensor #734
Merged
facebook-github-bot added the CLA Signed label on Apr 18, 2024. (This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.)
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 46.0660μs | 16.0714μs | 62.2224 KOps/s | 60.1791 KOps/s | |
test_plain_set_stack_nested | 38.6630μs | 15.9613μs | 62.6517 KOps/s | 58.9656 KOps/s | |
test_plain_set_nested_inplace | 51.2960μs | 18.7123μs | 53.4408 KOps/s | 52.3316 KOps/s | |
test_plain_set_stack_nested_inplace | 40.0650μs | 19.0158μs | 52.5877 KOps/s | 51.9327 KOps/s | |
test_items | 28.6340μs | 2.4990μs | 400.1585 KOps/s | 392.8515 KOps/s | |
test_items_nested | 0.4895ms | 0.2690ms | 3.7170 KOps/s | 3.7101 KOps/s | |
test_items_nested_locked | 6.6776ms | 0.2693ms | 3.7130 KOps/s | 3.6773 KOps/s | |
test_items_nested_leaf | 0.1488ms | 76.7337μs | 13.0321 KOps/s | 12.7672 KOps/s | |
test_items_stack_nested | 1.3256ms | 0.2715ms | 3.6830 KOps/s | 3.6542 KOps/s | |
test_items_stack_nested_leaf | 0.1525ms | 78.5861μs | 12.7249 KOps/s | 13.1298 KOps/s | |
test_items_stack_nested_locked | 0.3991ms | 0.2722ms | 3.6743 KOps/s | 3.6420 KOps/s | |
test_keys | 32.2300μs | 3.9301μs | 254.4457 KOps/s | 253.7211 KOps/s | |
test_keys_nested | 0.2377ms | 0.1358ms | 7.3632 KOps/s | 7.3512 KOps/s | |
test_keys_nested_locked | 0.8783ms | 0.1408ms | 7.1007 KOps/s | 7.1427 KOps/s | |
test_keys_nested_leaf | 0.1984ms | 0.1152ms | 8.6780 KOps/s | 8.7017 KOps/s | |
test_keys_stack_nested | 0.2581ms | 0.1370ms | 7.3008 KOps/s | 7.4877 KOps/s | |
test_keys_stack_nested_leaf | 0.1944ms | 0.1149ms | 8.7055 KOps/s | 8.8714 KOps/s | |
test_keys_stack_nested_locked | 0.2690ms | 0.1410ms | 7.0911 KOps/s | 7.2576 KOps/s | |
test_values | 7.7620μs | 1.1585μs | 863.1550 KOps/s | 862.4270 KOps/s | |
test_values_nested | 0.1050ms | 50.6321μs | 19.7503 KOps/s | 19.5438 KOps/s | |
test_values_nested_locked | 91.2610μs | 50.5478μs | 19.7832 KOps/s | 19.6746 KOps/s | |
test_values_nested_leaf | 93.5150μs | 45.7974μs | 21.8353 KOps/s | 21.8742 KOps/s | |
test_values_stack_nested | 0.1432ms | 51.7879μs | 19.3095 KOps/s | 19.2234 KOps/s | |
test_values_stack_nested_leaf | 0.1004ms | 45.7943μs | 21.8368 KOps/s | 22.2885 KOps/s | |
test_values_stack_nested_locked | 83.7670μs | 51.3221μs | 19.4848 KOps/s | 19.4272 KOps/s | |
test_membership | 17.1720μs | 1.3590μs | 735.8101 KOps/s | 762.7415 KOps/s | |
test_membership_nested | 28.9140μs | 3.4539μs | 289.5249 KOps/s | 288.9927 KOps/s | |
test_membership_nested_leaf | 43.7820μs | 3.4698μs | 288.1975 KOps/s | 282.4631 KOps/s | |
test_membership_stacked_nested | 29.4950μs | 3.4930μs | 286.2843 KOps/s | 281.2370 KOps/s | |
test_membership_stacked_nested_leaf | 32.0000μs | 3.4276μs | 291.7473 KOps/s | 289.8649 KOps/s | |
test_membership_nested_last | 29.2550μs | 4.2474μs | 235.4355 KOps/s | 234.3747 KOps/s | |
test_membership_nested_leaf_last | 56.4390μs | 4.2939μs | 232.8889 KOps/s | 231.6506 KOps/s | |
test_membership_stacked_nested_last | 28.3530μs | 7.2851μs | 137.2668 KOps/s | 73.9942 KOps/s | |
test_membership_stacked_nested_leaf_last | 43.5820μs | 7.3169μs | 136.6702 KOps/s | 74.1721 KOps/s | |
test_nested_getleaf | 40.3460μs | 10.7366μs | 93.1393 KOps/s | 92.5121 KOps/s | |
test_nested_get | 32.0700μs | 10.0224μs | 99.7768 KOps/s | 96.6559 KOps/s | |
test_stacked_getleaf | 44.4640μs | 10.6192μs | 94.1689 KOps/s | 94.4023 KOps/s | |
test_stacked_get | 95.0380μs | 10.1014μs | 98.9959 KOps/s | 98.3904 KOps/s | |
test_nested_getitemleaf | 48.2500μs | 11.2256μs | 89.0819 KOps/s | 87.9032 KOps/s | |
test_nested_getitem | 37.2190μs | 10.3996μs | 96.1577 KOps/s | 96.2683 KOps/s | |
test_stacked_getitemleaf | 29.5860μs | 11.1651μs | 89.5651 KOps/s | 88.5772 KOps/s | |
test_stacked_getitem | 94.5070μs | 10.4661μs | 95.5462 KOps/s | 92.4303 KOps/s | |
test_lock_nested | 50.7443ms | 0.3951ms | 2.5308 KOps/s | 2.9129 KOps/s | |
test_lock_stack_nested | 0.4254ms | 0.3025ms | 3.3053 KOps/s | 3.4111 KOps/s | |
test_unlock_nested | 99.8771ms | 0.4499ms | 2.2227 KOps/s | 2.2003 KOps/s | |
test_unlock_stack_nested | 0.4525ms | 0.3123ms | 3.2016 KOps/s | 3.2945 KOps/s | |
test_flatten_speed | 0.5859ms | 92.5311μs | 10.8072 KOps/s | 10.8323 KOps/s | |
test_unflatten_speed | 0.6103ms | 0.4097ms | 2.4409 KOps/s | 2.4842 KOps/s | |
test_common_ops | 4.8074ms | 0.6764ms | 1.4785 KOps/s | 1.4060 KOps/s | |
test_creation | 14.6780μs | 1.9086μs | 523.9375 KOps/s | 537.1130 KOps/s | |
test_creation_empty | 33.4820μs | 8.7742μs | 113.9710 KOps/s | 99.6015 KOps/s | |
test_creation_nested_1 | 40.3060μs | 11.6237μs | 86.0311 KOps/s | 79.3169 KOps/s | |
test_creation_nested_2 | 41.2570μs | 14.8691μs | 67.2534 KOps/s | 61.4990 KOps/s | |
test_clone | 0.1097ms | 13.6995μs | 72.9953 KOps/s | 73.7388 KOps/s | |
test_getitem[int] | 29.6250μs | 11.4072μs | 87.6642 KOps/s | 86.9499 KOps/s | |
test_getitem[slice_int] | 93.9290μs | 22.7518μs | 43.9525 KOps/s | 42.7840 KOps/s | |
test_getitem[range] | 82.2040μs | 42.5656μs | 23.4932 KOps/s | 23.6840 KOps/s | |
test_getitem[tuple] | 63.1480μs | 18.8821μs | 52.9602 KOps/s | 53.6705 KOps/s | |
test_getitem[list] | 0.4537ms | 38.3524μs | 26.0740 KOps/s | 26.0151 KOps/s | |
test_setitem_dim[int] | 67.1860μs | 32.3917μs | 30.8721 KOps/s | 28.8167 KOps/s | |
test_setitem_dim[slice_int] | 0.1050ms | 57.0606μs | 17.5252 KOps/s | 16.2082 KOps/s | |
test_setitem_dim[range] | 0.1391ms | 75.3641μs | 13.2689 KOps/s | 12.7165 KOps/s | |
test_setitem_dim[tuple] | 91.6820μs | 48.2449μs | 20.7276 KOps/s | 20.2070 KOps/s | |
test_setitem | 0.1495ms | 19.3586μs | 51.6565 KOps/s | 48.5863 KOps/s | |
test_set | 0.1772ms | 19.1330μs | 52.2658 KOps/s | 49.2695 KOps/s | |
test_set_shared | 1.6746ms | 0.1416ms | 7.0620 KOps/s | 6.9757 KOps/s | |
test_update | 0.1423ms | 19.7734μs | 50.5729 KOps/s | 44.9133 KOps/s | |
test_update_nested | 0.1542ms | 27.6864μs | 36.1188 KOps/s | 32.5868 KOps/s | |
test_update__nested | 0.1385ms | 26.0721μs | 38.3551 KOps/s | 40.3502 KOps/s | |
test_set_nested | 0.1208ms | 20.1564μs | 49.6121 KOps/s | 45.1859 KOps/s | |
test_set_nested_new | 0.1459ms | 24.5539μs | 40.7267 KOps/s | 38.5447 KOps/s | |
test_select | 1.1191ms | 39.3876μs | 25.3887 KOps/s | 24.8411 KOps/s | |
test_select_nested | 0.1194ms | 59.3850μs | 16.8393 KOps/s | 16.6388 KOps/s | |
test_exclude_nested | 0.2700ms | 0.1199ms | 8.3425 KOps/s | 8.4730 KOps/s | |
test_empty[True] | 0.5607ms | 0.3903ms | 2.5620 KOps/s | 2.5637 KOps/s | |
test_empty[False] | 9.6030μs | 1.0612μs | 942.2968 KOps/s | 959.8328 KOps/s | |
test_unbind_speed | 1.8589ms | 0.2507ms | 3.9890 KOps/s | 4.0156 KOps/s | |
test_unbind_speed_stack0 | 0.4808ms | 0.2440ms | 4.0985 KOps/s | 4.1840 KOps/s | |
test_unbind_speed_stack1 | 0.1297s | 0.6927ms | 1.4436 KOps/s | 1.4856 KOps/s | |
test_split | 1.7122ms | 1.4950ms | 668.9110 Ops/s | 576.0954 Ops/s | |
test_chunk | 0.1318s | 1.6966ms | 589.4188 Ops/s | 663.9852 Ops/s | |
test_creation[device0] | 0.2467ms | 0.1020ms | 9.8000 KOps/s | 9.3962 KOps/s | |
test_creation_from_tensor | 5.7008ms | 83.0387μs | 12.0426 KOps/s | 11.8533 KOps/s | |
test_add_one[memmap_tensor0] | 0.1032ms | 5.5758μs | 179.3458 KOps/s | 171.6828 KOps/s | |
test_contiguous[memmap_tensor0] | 7.8950μs | 0.6455μs | 1.5493 MOps/s | 1.5418 MOps/s | |
test_stack[memmap_tensor0] | 39.7040μs | 3.5487μs | 281.7901 KOps/s | 271.3755 KOps/s | |
test_memmaptd_index | 0.9921ms | 0.2443ms | 4.0939 KOps/s | 4.1867 KOps/s | |
test_memmaptd_index_astensor | 0.7291ms | 0.3080ms | 3.2464 KOps/s | 3.3141 KOps/s | |
test_memmaptd_index_op | 1.1796ms | 0.5805ms | 1.7226 KOps/s | 1.6667 KOps/s | |
test_serialize_model | 0.1147s | 0.1037s | 9.6418 Ops/s | 8.2787 Ops/s | |
test_serialize_model_pickle | 0.4480s | 0.3778s | 2.6467 Ops/s | 2.6246 Ops/s | |
test_serialize_weights | 0.1102s | 0.1019s | 9.8149 Ops/s | 9.6875 Ops/s | |
test_serialize_weights_returnearly | 0.1355s | 0.1237s | 8.0819 Ops/s | 6.9568 Ops/s | |
test_serialize_weights_pickle | 0.5801s | 0.4506s | 2.2194 Ops/s | 2.3231 Ops/s | |
test_serialize_weights_filesystem | 0.2307s | 0.1127s | 8.8702 Ops/s | 10.9648 Ops/s | |
test_serialize_model_filesystem | 0.1006s | 93.4403ms | 10.7020 Ops/s | 10.3244 Ops/s | |
test_reshape_pytree | 66.5250μs | 21.1236μs | 47.3404 KOps/s | 47.0566 KOps/s | |
test_reshape_td | 74.2600μs | 33.5718μs | 29.7869 KOps/s | 30.0081 KOps/s | |
test_view_pytree | 54.7720μs | 21.0341μs | 47.5418 KOps/s | 46.5941 KOps/s | |
test_view_td | 0.1375s | 68.4285μs | 14.6138 KOps/s | 15.8680 KOps/s | |
test_unbind_pytree | 69.4900μs | 24.5754μs | 40.6911 KOps/s | 41.1327 KOps/s | |
test_unbind_td | 0.1706s | 51.6720μs | 19.3529 KOps/s | 27.6053 KOps/s | |
test_split_pytree | 56.9470μs | 23.9957μs | 41.6742 KOps/s | 42.1310 KOps/s | |
test_split_td | 0.1274ms | 41.4795μs | 24.1083 KOps/s | 24.3467 KOps/s | |
test_add_pytree | 0.1071ms | 30.0780μs | 33.2469 KOps/s | 32.0342 KOps/s | |
test_add_td | 0.1173ms | 51.3217μs | 19.4849 KOps/s | 18.6348 KOps/s | |
test_distributed | 0.2440ms | 0.1007ms | 9.9274 KOps/s | 9.7970 KOps/s | |
test_tdmodule | 81.9540μs | 16.7677μs | 59.6385 KOps/s | 57.5032 KOps/s | |
test_tdmodule_dispatch | 57.3970μs | 33.0593μs | 30.2486 KOps/s | 29.4662 KOps/s | |
test_tdseq | 46.5670μs | 19.4814μs | 51.3310 KOps/s | 49.7710 KOps/s | |
test_tdseq_dispatch | 74.1290μs | 37.7541μs | 26.4872 KOps/s | 25.7731 KOps/s | |
test_instantiation_functorch | 1.5830ms | 1.3047ms | 766.4729 Ops/s | 763.5343 Ops/s | |
test_instantiation_td | 1.6662ms | 1.0233ms | 977.2028 Ops/s | 994.6530 Ops/s | |
test_exec_functorch | 0.4182ms | 0.1646ms | 6.0738 KOps/s | 6.2512 KOps/s | |
test_exec_functional_call | 0.3173ms | 0.1479ms | 6.7591 KOps/s | 6.8374 KOps/s | |
test_exec_td | 0.2577ms | 0.1451ms | 6.8904 KOps/s | 7.0272 KOps/s | |
test_exec_td_decorator | 0.8829ms | 0.2010ms | 4.9750 KOps/s | 5.0514 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7057ms | 0.4720ms | 2.1188 KOps/s | 2.0863 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7506ms | 0.4685ms | 2.1343 KOps/s | 2.0826 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6248ms | 0.3842ms | 2.6029 KOps/s | 2.5656 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6194ms | 0.3859ms | 2.5912 KOps/s | 2.5525 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 0.8924ms | 0.4911ms | 2.0364 KOps/s | 1.9885 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7966ms | 0.4940ms | 2.0243 KOps/s | 1.9802 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6869ms | 0.4211ms | 2.3749 KOps/s | 2.4422 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.6685ms | 0.4034ms | 2.4789 KOps/s | 2.4403 KOps/s | |
test_to_module_speed[True] | 2.5723ms | 1.5608ms | 640.6998 Ops/s | 714.8989 Ops/s | |
test_to_module_speed[False] | 2.8347ms | 1.4103ms | 709.0915 Ops/s | 724.3699 Ops/s |
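
The Change column did not survive on this page, so a rough comparison has to be derived from the Ops and Ops on Repo HEAD columns. The sketch below is one way to do that, assuming the change is the relative difference between the two throughput values. The page does not say how the original column was computed, and the helper names `parse_ops` and `relative_change` are hypothetical.

```python
# Unit multipliers for throughput cells as they are printed in the table.
_UNITS = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '62.2224 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * _UNITS[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (benchmark name, relative change of Ops vs. Ops on Repo HEAD).

    Assumption: change = (ops - ops_head) / ops_head. The original
    definition of the Change column is not shown on the page.
    """
    cells = [c.strip() for c in row.split("|")]
    name = cells[0]
    ops = parse_ops(cells[3])       # column "Ops" (this PR)
    ops_head = parse_ops(cells[4])  # column "Ops on Repo HEAD"
    return name, (ops - ops_head) / ops_head


# Two rows copied verbatim from the table above.
rows = [
    "test_membership_stacked_nested_last | 28.3530μs | 7.2851μs | 137.2668 KOps/s | 73.9942 KOps/s | |",
    "test_unbind_td | 0.1706s | 51.6720μs | 19.3529 KOps/s | 27.6053 KOps/s | |",
]

for row in rows:
    name, change = relative_change(row)
    print(f"{name}: {change:+.1%}")
```

Under that assumption, the first row comes out about 85.5% faster than Repo HEAD and the second about 29.9% slower. Both rows also show a Max far above the Mean, so differences of this size may reflect run-to-run noise rather than an effect of the PR.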
Labels: bug (Something isn't working), CLA Signed
No description provided.