[BugFix] Fix td device sync when error is raised #988
Merged
vmoens added a commit that referenced this pull request Sep 12, 2024
ghstack-source-id: d0e810c71ca1c9945561ca5a9e71cb71445095e4
Pull Request resolved: #988
facebook-github-bot added the CLA Signed label Sep 12, 2024
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 54.7420μs | 20.5189μs | 48.7355 KOps/s | 48.0005 KOps/s | |
test_plain_set_stack_nested | 45.1240μs | 20.7673μs | 48.1526 KOps/s | 47.9990 KOps/s | |
test_plain_set_nested_inplace | 77.2840μs | 22.5869μs | 44.2734 KOps/s | 44.3374 KOps/s | |
test_plain_set_stack_nested_inplace | 78.1240μs | 22.3068μs | 44.8293 KOps/s | 43.9829 KOps/s | |
test_items | 21.5610μs | 4.2193μs | 237.0056 KOps/s | 238.4673 KOps/s | |
test_items_nested | 0.4905ms | 0.3275ms | 3.0533 KOps/s | 3.0304 KOps/s | |
test_items_nested_locked | 0.5218ms | 0.3273ms | 3.0553 KOps/s | 2.9969 KOps/s | |
test_items_nested_leaf | 0.1467ms | 85.5178μs | 11.6935 KOps/s | 11.9260 KOps/s | |
test_items_stack_nested | 0.5091ms | 0.3320ms | 3.0119 KOps/s | 2.9874 KOps/s | |
test_items_stack_nested_leaf | 0.1469ms | 83.1171μs | 12.0312 KOps/s | 11.6643 KOps/s | |
test_items_stack_nested_locked | 0.4709ms | 0.3316ms | 3.0161 KOps/s | 2.9518 KOps/s | |
test_keys | 22.7820μs | 3.6360μs | 275.0237 KOps/s | 284.0990 KOps/s | |
test_keys_nested | 0.1728ms | 96.3639μs | 10.3773 KOps/s | 10.1298 KOps/s | |
test_keys_nested_locked | 1.6647ms | 0.1038ms | 9.6312 KOps/s | 9.7220 KOps/s | |
test_keys_nested_leaf | 0.1566ms | 83.4445μs | 11.9840 KOps/s | 12.1161 KOps/s | |
test_keys_stack_nested | 0.1771ms | 95.9299μs | 10.4243 KOps/s | 10.3688 KOps/s | |
test_keys_stack_nested_leaf | 0.1622ms | 80.3740μs | 12.4418 KOps/s | 12.1579 KOps/s | |
test_keys_stack_nested_locked | 0.1679ms | 99.1356μs | 10.0872 KOps/s | 9.8312 KOps/s | |
test_values | 5.3800μs | 1.0825μs | 923.7746 KOps/s | 902.1684 KOps/s | |
test_values_nested | 95.9290μs | 47.7403μs | 20.9467 KOps/s | 21.0168 KOps/s | |
test_values_nested_locked | 90.6200μs | 47.6515μs | 20.9857 KOps/s | 20.9511 KOps/s | |
test_values_nested_leaf | 83.2560μs | 42.3830μs | 23.5944 KOps/s | 23.4996 KOps/s | |
test_values_stack_nested | 94.0860μs | 47.4899μs | 21.0571 KOps/s | 20.8287 KOps/s | |
test_values_stack_nested_leaf | 80.4400μs | 41.2398μs | 24.2484 KOps/s | 23.9887 KOps/s | |
test_values_stack_nested_locked | 98.1940μs | 47.8084μs | 20.9168 KOps/s | 20.5985 KOps/s | |
test_membership | 2.6315μs | 0.6805μs | 1.4695 MOps/s | 1.2244 MOps/s | |
test_membership_nested | 22.0510μs | 2.6874μs | 372.1018 KOps/s | 377.6579 KOps/s | |
test_membership_nested_leaf | 39.4140μs | 2.6865μs | 372.2378 KOps/s | 377.8384 KOps/s | |
test_membership_stacked_nested | 25.7880μs | 2.6886μs | 371.9460 KOps/s | 381.8723 KOps/s | |
test_membership_stacked_nested_leaf | 30.5070μs | 2.7030μs | 369.9598 KOps/s | 380.5307 KOps/s | |
test_membership_nested_last | 42.3410μs | 3.8186μs | 261.8777 KOps/s | 263.0406 KOps/s | |
test_membership_nested_leaf_last | 44.3230μs | 3.8904μs | 257.0431 KOps/s | 260.3649 KOps/s | |
test_membership_stacked_nested_last | 33.8630μs | 12.9106μs | 77.4558 KOps/s | 259.7614 KOps/s | |
test_membership_stacked_nested_leaf_last | 63.1880μs | 12.8078μs | 78.0772 KOps/s | 260.4866 KOps/s | |
test_nested_getleaf | 53.6010μs | 10.8064μs | 92.5378 KOps/s | 90.8269 KOps/s | |
test_nested_get | 52.7090μs | 10.2751μs | 97.3230 KOps/s | 96.0310 KOps/s | |
test_stacked_getleaf | 55.6760μs | 10.7996μs | 92.5964 KOps/s | 94.2640 KOps/s | |
test_stacked_get | 52.8090μs | 10.3581μs | 96.5427 KOps/s | 98.4362 KOps/s | |
test_nested_getitemleaf | 56.3350μs | 11.2502μs | 88.8873 KOps/s | 89.0524 KOps/s | |
test_nested_getitem | 44.4640μs | 10.4940μs | 95.2927 KOps/s | 96.1441 KOps/s | |
test_stacked_getitemleaf | 50.7150μs | 11.1777μs | 89.4639 KOps/s | 90.2674 KOps/s | |
test_stacked_getitem | 41.6680μs | 10.5202μs | 95.0549 KOps/s | 96.8811 KOps/s | |
test_lock_nested | 79.6208ms | 0.5526ms | 1.8097 KOps/s | 2.1453 KOps/s | |
test_lock_stack_nested | 0.6781ms | 0.4287ms | 2.3325 KOps/s | 2.2621 KOps/s | |
test_unlock_nested | 83.2885ms | 0.4807ms | 2.0804 KOps/s | 2.5126 KOps/s | |
test_unlock_stack_nested | 0.5348ms | 0.3495ms | 2.8611 KOps/s | 2.7414 KOps/s | |
test_flatten_speed | 0.1901ms | 0.1040ms | 9.6180 KOps/s | 9.6642 KOps/s | |
test_unflatten_speed | 0.8218ms | 0.4580ms | 2.1834 KOps/s | 2.1849 KOps/s | |
test_common_ops | 4.0124ms | 1.0845ms | 922.0705 Ops/s | 919.5362 Ops/s | |
test_creation | 19.5670μs | 2.1154μs | 472.7249 KOps/s | 492.9698 KOps/s | |
test_creation_empty | 49.2930μs | 17.8537μs | 56.0109 KOps/s | 54.8276 KOps/s | |
test_creation_nested_1 | 49.7130μs | 20.9418μs | 47.7513 KOps/s | 46.6025 KOps/s | |
test_creation_nested_2 | 57.5380μs | 24.9276μs | 40.1162 KOps/s | 38.6895 KOps/s | |
test_clone | 77.4440μs | 16.5460μs | 60.4376 KOps/s | 60.1801 KOps/s | |
test_getitem[int] | 1.0814ms | 16.9326μs | 59.0575 KOps/s | 59.5212 KOps/s | |
test_getitem[slice_int] | 0.1338ms | 31.0954μs | 32.1591 KOps/s | 32.8472 KOps/s | |
test_getitem[range] | 0.1926ms | 58.0244μs | 17.2341 KOps/s | 18.1600 KOps/s | |
test_getitem[tuple] | 0.1305ms | 25.0993μs | 39.8417 KOps/s | 39.7057 KOps/s | |
test_getitem[list] | 0.2121ms | 54.1091μs | 18.4812 KOps/s | 19.6785 KOps/s | |
test_setitem_dim[int] | 60.5530μs | 32.3482μs | 30.9136 KOps/s | 30.8577 KOps/s | |
test_setitem_dim[slice_int] | 0.1127ms | 60.9405μs | 16.4094 KOps/s | 17.0364 KOps/s | |
test_setitem_dim[range] | 0.1437ms | 85.0727μs | 11.7547 KOps/s | 12.3442 KOps/s | |
test_setitem_dim[tuple] | 75.3810μs | 48.6891μs | 20.5385 KOps/s | 21.0064 KOps/s | |
test_setitem | 0.1096ms | 29.4906μs | 33.9091 KOps/s | 33.6360 KOps/s | |
test_set | 0.1163ms | 28.9492μs | 34.5432 KOps/s | 35.0636 KOps/s | |
test_set_shared | 3.0329ms | 0.2092ms | 4.7797 KOps/s | 4.7277 KOps/s | |
test_update | 0.1473ms | 34.9365μs | 28.6233 KOps/s | 28.0573 KOps/s | |
test_update_nested | 0.1228ms | 45.5543μs | 21.9518 KOps/s | 21.6739 KOps/s | |
test_update__nested | 89.5780μs | 33.1523μs | 30.1638 KOps/s | 29.1284 KOps/s | |
test_set_nested | 92.8840μs | 31.1212μs | 32.1325 KOps/s | 31.4579 KOps/s | |
test_set_nested_new | 0.1158ms | 36.4982μs | 27.3986 KOps/s | 27.1990 KOps/s | |
test_select | 0.2076ms | 54.8420μs | 18.2342 KOps/s | 18.3415 KOps/s | |
test_select_nested | 0.1345ms | 59.1843μs | 16.8964 KOps/s | 16.5042 KOps/s | |
test_exclude_nested | 0.1563ms | 76.2837μs | 13.1090 KOps/s | 13.1098 KOps/s | |
test_empty[True] | 0.4850ms | 0.3104ms | 3.2220 KOps/s | 3.1090 KOps/s | |
test_empty[False] | 10.8930μs | 1.2818μs | 780.1828 KOps/s | 821.3826 KOps/s | |
test_unbind_speed | 0.5340ms | 0.2978ms | 3.3579 KOps/s | 3.4009 KOps/s | |
test_unbind_speed_stack0 | 0.5589ms | 0.2844ms | 3.5163 KOps/s | 3.4628 KOps/s | |
test_unbind_speed_stack1 | 86.9151ms | 0.7581ms | 1.3191 KOps/s | 1.3923 KOps/s | |
test_split | 2.1831ms | 2.0384ms | 490.5774 Ops/s | 461.7836 Ops/s | |
test_chunk | 87.1319ms | 2.2013ms | 454.2785 Ops/s | 463.0762 Ops/s | |
test_creation[device0] | 0.2178ms | 0.1164ms | 8.5933 KOps/s | 8.5431 KOps/s | |
test_creation_from_tensor | 3.1833ms | 0.1176ms | 8.5012 KOps/s | 8.5174 KOps/s | |
test_add_one[memmap_tensor0] | 0.1517ms | 7.5495μs | 132.4589 KOps/s | 133.0456 KOps/s | |
test_contiguous[memmap_tensor0] | 19.3960μs | 1.8624μs | 536.9508 KOps/s | 529.6316 KOps/s | |
test_stack[memmap_tensor0] | 46.7170μs | 5.9816μs | 167.1793 KOps/s | 174.0996 KOps/s | |
test_memmaptd_index | 1.1287ms | 0.3999ms | 2.5008 KOps/s | 2.5687 KOps/s | |
test_memmaptd_index_astensor | 0.8051ms | 0.4801ms | 2.0828 KOps/s | 2.1346 KOps/s | |
test_memmaptd_index_op | 1.7907ms | 1.0153ms | 984.9176 Ops/s | 983.2601 Ops/s | |
test_serialize_model | 0.2167s | 0.1284s | 7.7878 Ops/s | 8.3014 Ops/s | |
test_serialize_model_pickle | 0.5104s | 0.4066s | 2.4592 Ops/s | 2.5058 Ops/s | |
test_serialize_weights | 0.1218s | 0.1134s | 8.8209 Ops/s | 7.5571 Ops/s | |
test_serialize_weights_returnearly | 0.2368s | 0.1699s | 5.8845 Ops/s | 6.3495 Ops/s | |
test_serialize_weights_pickle | 0.6323s | 0.4458s | 2.2434 Ops/s | 2.3844 Ops/s | |
test_serialize_weights_filesystem | 0.1448s | 0.1386s | 7.2159 Ops/s | 7.1621 Ops/s | |
test_serialize_model_filesystem | 0.1540s | 0.1437s | 6.9598 Ops/s | 5.8337 Ops/s | |
test_reshape_pytree | 87.9440μs | 39.2741μs | 25.4621 KOps/s | 26.2823 KOps/s | |
test_reshape_td | 0.1157ms | 44.5145μs | 22.4646 KOps/s | 21.9343 KOps/s | |
test_view_pytree | 99.4860μs | 38.0850μs | 26.2570 KOps/s | 26.5735 KOps/s | |
test_view_td | 0.3465ms | 49.5518μs | 20.1809 KOps/s | 19.1426 KOps/s | |
test_unbind_pytree | 93.1040μs | 36.0144μs | 27.7667 KOps/s | 27.7847 KOps/s | |
test_unbind_td | 0.3031ms | 44.0363μs | 22.7085 KOps/s | 22.6118 KOps/s | |
test_split_pytree | 0.2604ms | 40.9397μs | 24.4262 KOps/s | 26.7127 KOps/s | |
test_split_td | 0.4852ms | 57.7773μs | 17.3078 KOps/s | 17.7818 KOps/s | |
test_add_pytree | 0.1184ms | 45.2140μs | 22.1170 KOps/s | 22.7474 KOps/s | |
test_add_td | 0.1757ms | 80.3765μs | 12.4414 KOps/s | 12.8211 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.1154ms | 54.9512μs | 18.1980 KOps/s | 17.8780 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2978ms | 0.1831ms | 5.4625 KOps/s | 5.4585 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1096ms | 55.7051μs | 17.9517 KOps/s | 17.9316 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.3415ms | 0.1415ms | 7.0680 KOps/s | 7.0849 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 59.9120μs | 20.4981μs | 48.7849 KOps/s | 48.8895 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 0.1521ms | 66.7523μs | 14.9808 KOps/s | 15.3682 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.1625ms | 76.6109μs | 13.0530 KOps/s | 13.0389 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1533ms | 68.3531μs | 14.6299 KOps/s | 14.7775 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.5801ms | 0.1748ms | 5.7200 KOps/s | 5.8046 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.3501ms | 0.1886ms | 5.3033 KOps/s | 5.2928 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1017ms | 45.3924μs | 22.0301 KOps/s | 21.6167 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.5213ms | 67.2109μs | 14.8785 KOps/s | 14.2203 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3932ms | 0.1726ms | 5.7931 KOps/s | 5.7126 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6326ms | 0.2966ms | 3.3714 KOps/s | 3.4275 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3501ms | 0.2018ms | 4.9566 KOps/s | 4.9360 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3433ms | 0.1722ms | 5.8065 KOps/s | 5.8101 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1309ms | 62.4921μs | 16.0020 KOps/s | 15.9607 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.1145ms | 46.7061μs | 21.4105 KOps/s | 21.2532 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.4879ms | 0.2335ms | 4.2829 KOps/s | 4.2996 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3418ms | 0.1773ms | 5.6403 KOps/s | 5.7639 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 0.2041ms | 0.1014ms | 9.8655 KOps/s | 9.8780 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1217ms | 57.7168μs | 17.3260 KOps/s | 17.3664 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1427ms | 74.9923μs | 13.3347 KOps/s | 12.9857 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1684ms | 67.6593μs | 14.7799 KOps/s | 14.4175 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 0.2896ms | 0.1953ms | 5.1208 KOps/s | 5.1048 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 1.8342ms | 1.6418ms | 609.1020 Ops/s | 615.4773 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 0.3927ms | 0.1958ms | 5.1061 KOps/s | 5.1399 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 1.2408ms | 1.1110ms | 900.1267 Ops/s | 920.3571 Ops/s | |
test_compile_assign_and_add_stack[compile] | 0.8119ms | 0.4147ms | 2.4112 KOps/s | 2.4008 KOps/s | |
test_compile_assign_and_add_stack[eager] | 4.0468ms | 3.7243ms | 268.5061 Ops/s | 261.8371 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1021ms | 33.2488μs | 30.0763 KOps/s | 29.4254 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 1.0230ms | 49.8845μs | 20.0463 KOps/s | 21.8330 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 86.4420μs | 28.5852μs | 34.9832 KOps/s | 34.3898 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 87.9340μs | 29.1185μs | 34.3425 KOps/s | 35.9451 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 99.4160μs | 28.9139μs | 34.5854 KOps/s | 35.1161 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 73.0570μs | 28.9691μs | 34.5195 KOps/s | 35.7150 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.1615ms | 72.7969μs | 13.7368 KOps/s | 13.7553 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.3306ms | 29.2920μs | 34.1390 KOps/s | 38.1520 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1379ms | 67.5257μs | 14.8092 KOps/s | 14.5380 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 82.2840μs | 23.9973μs | 41.6713 KOps/s | 44.4864 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1370ms | 67.2969μs | 14.8595 KOps/s | 15.0067 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 89.5570μs | 23.3950μs | 42.7441 KOps/s | 44.0442 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1503ms | 71.0132μs | 14.0819 KOps/s | 13.8051 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.7651ms | 28.3622μs | 35.2582 KOps/s | 38.6481 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.1443ms | 67.2360μs | 14.8730 KOps/s | 15.0254 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 64.9410μs | 23.1968μs | 43.1094 KOps/s | 44.8198 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1465ms | 67.7560μs | 14.7588 KOps/s | 15.0701 KOps/s | |
test_compile_indexing[int-pytree-eager] | 63.7400μs | 23.0279μs | 43.4255 KOps/s | 44.1298 KOps/s | |
test_mod_add[eager] | 85.0990μs | 23.9574μs | 41.7408 KOps/s | 42.8291 KOps/s | |
test_mod_add[compile] | 0.1108ms | 38.5543μs | 25.9374 KOps/s | 25.8387 KOps/s | |
test_mod_add[compile-overhead] | 0.1171ms | 38.8462μs | 25.7425 KOps/s | 26.0543 KOps/s | |
test_mod_wrap[eager] | 0.4121ms | 0.2100ms | 4.7622 KOps/s | 4.9320 KOps/s | |
test_mod_wrap[compile] | 0.3199ms | 0.2304ms | 4.3407 KOps/s | 4.3872 KOps/s | |
test_mod_wrap[compile-overhead] | 0.3436ms | 0.2333ms | 4.2869 KOps/s | 4.4573 KOps/s | |
test_mod_wrap_and_backward[eager] | 13.4459ms | 11.3711ms | 87.9422 Ops/s | 86.8494 Ops/s | |
test_mod_wrap_and_backward[compile] | 17.3160ms | 11.9237ms | 83.8668 Ops/s | 82.2129 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 14.4306ms | 11.9051ms | 83.9979 Ops/s | 84.8219 Ops/s | |
test_seq_add[eager] | 0.2009ms | 88.9791μs | 11.2386 KOps/s | 11.5296 KOps/s | |
test_seq_add[compile] | 0.1521ms | 63.0316μs | 15.8650 KOps/s | 16.3794 KOps/s | |
test_seq_add[compile-overhead] | 0.1234ms | 60.1963μs | 16.6123 KOps/s | 16.1774 KOps/s | |
test_seq_wrap[eager] | 0.5063ms | 0.3813ms | 2.6224 KOps/s | 2.7005 KOps/s | |
test_seq_wrap[compile] | 0.4197ms | 0.2673ms | 3.7414 KOps/s | 3.7773 KOps/s | |
test_seq_wrap[compile-overhead] | 0.4825ms | 0.2713ms | 3.6855 KOps/s | 3.8296 KOps/s | |
test_func_call_runtime[False-eager] | 0.7009ms | 0.5302ms | 1.8859 KOps/s | 1.9592 KOps/s | |
test_func_call_runtime[False-compile] | 0.9359ms | 0.5021ms | 1.9917 KOps/s | 1.9963 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.8977ms | 0.4991ms | 2.0034 KOps/s | 2.0264 KOps/s | |
test_func_call_runtime[True-eager] | 0.8987ms | 0.7535ms | 1.3271 KOps/s | 1.3834 KOps/s | |
test_func_call_runtime[True-compile] | 0.5908ms | 0.4983ms | 2.0066 KOps/s | 1.9960 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.8761ms | 0.5080ms | 1.9685 KOps/s | 1.9812 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.7258ms | 0.5243ms | 1.9071 KOps/s | 2.0010 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.9294ms | 0.4986ms | 2.0055 KOps/s | 1.9858 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.9335ms | 0.4982ms | 2.0073 KOps/s | 2.0196 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.3900ms | 0.8828ms | 1.1328 KOps/s | 1.1770 KOps/s | |
test_func_call_cm_runtime[True-compile] | 1.0570ms | 0.7391ms | 1.3530 KOps/s | 1.3841 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.8495ms | 0.7358ms | 1.3590 KOps/s | 1.3735 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.7565ms | 1.8449ms | 542.0459 Ops/s | 546.7703 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 2.8769ms | 1.9089ms | 523.8739 Ops/s | 531.6914 Ops/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 2.6371ms | 1.9006ms | 526.1523 Ops/s | 531.2148 Ops/s | |
test_distributed | 0.2807ms | 0.1246ms | 8.0257 KOps/s | 7.9131 KOps/s | |
test_tdmodule | 44.3430μs | 17.5273μs | 57.0538 KOps/s | 55.9870 KOps/s | |
test_tdmodule_dispatch | 58.6100μs | 35.5939μs | 28.0947 KOps/s | 27.0535 KOps/s | |
test_tdseq | 41.1270μs | 20.1822μs | 49.5486 KOps/s | 49.8072 KOps/s | |
test_tdseq_dispatch | 63.3690μs | 40.4377μs | 24.7294 KOps/s | 24.1009 KOps/s | |
test_instantiation_functorch | 1.8009ms | 1.5538ms | 643.5786 Ops/s | 632.1718 Ops/s | |
test_instantiation_td | 1.8249ms | 1.1345ms | 881.4540 Ops/s | 876.6405 Ops/s | |
test_exec_functorch | 0.4220ms | 0.1861ms | 5.3748 KOps/s | 5.4786 KOps/s | |
test_exec_functional_call | 0.3411ms | 0.1753ms | 5.7050 KOps/s | 5.9313 KOps/s | |
test_exec_td | 0.4019ms | 0.1688ms | 5.9250 KOps/s | 6.1271 KOps/s | |
test_exec_td_decorator | 0.8240ms | 0.2218ms | 4.5090 KOps/s | 4.6250 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.9886ms | 0.6350ms | 1.5748 KOps/s | 1.5801 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.8441ms | 0.6295ms | 1.5886 KOps/s | 1.5807 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.7569ms | 0.4964ms | 2.0147 KOps/s | 2.0597 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6953ms | 0.4926ms | 2.0300 KOps/s | 2.0612 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.4154ms | 0.6112ms | 1.6361 KOps/s | 1.6399 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.9691ms | 0.6109ms | 1.6369 KOps/s | 1.6346 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.7487ms | 0.5057ms | 1.9775 KOps/s | 2.0011 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.8037ms | 0.5061ms | 1.9757 KOps/s | 1.9977 KOps/s | |
test_to_module_speed[True] | 1.4515ms | 1.2814ms | 780.4160 Ops/s | 782.7674 Ops/s | |
test_to_module_speed[False] | 2.0881ms | 1.2641ms | 791.0929 Ops/s | 808.9284 Ops/s | |
test_tc_init | 82.2140μs | 43.6034μs | 22.9340 KOps/s | 21.9664 KOps/s | |
test_tc_init_nested | 0.1570ms | 87.5241μs | 11.4254 KOps/s | 10.8081 KOps/s | |
test_tc_first_layer_tensor | 19.2860μs | 1.5466μs | 646.5917 KOps/s | 659.3353 KOps/s | |
test_tc_first_layer_nontensor | 25.3570μs | 4.6678μs | 214.2327 KOps/s | 212.8103 KOps/s | |
test_tc_second_layer_tensor | 22.3610μs | 2.9120μs | 343.4122 KOps/s | 359.3074 KOps/s | |
test_tc_second_layer_nontensor | 45.4740μs | 6.0299μs | 165.8395 KOps/s | 166.6291 KOps/s | |
test_unbind | 0.4447s | 14.7339ms | 67.8705 Ops/s | 70.3303 Ops/s | |
test_full_like | 7.8114ms | 6.9753ms | 143.3635 Ops/s | 144.2791 Ops/s | |
test_zeros_like | 3.0024ms | 2.6575ms | 376.2902 Ops/s | 375.8864 Ops/s | |
test_ones_like | 9.4232ms | 5.8553ms | 170.7864 Ops/s | 166.4527 Ops/s | |
test_clone | 12.7104ms | 7.4784ms | 133.7192 Ops/s | 131.2904 Ops/s | |
test_squeeze | 71.5440μs | 12.2660μs | 81.5264 KOps/s | 81.2834 KOps/s | |
test_unsqueeze | 0.2121ms | 90.8917μs | 11.0021 KOps/s | 11.2838 KOps/s | |
test_split | 0.3511ms | 0.1919ms | 5.2109 KOps/s | 5.2620 KOps/s | |
test_permute | 0.3771ms | 0.2192ms | 4.5624 KOps/s | 4.6663 KOps/s | |
test_stack | 27.1586ms | 23.0610ms | 43.3633 Ops/s | 41.7951 Ops/s | |
test_cat | 30.1735ms | 23.1037ms | 43.2831 Ops/s | 42.2925 Ops/s |
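
The Change column is empty in these results, so the difference between the PR and Repo HEAD has to be read off the two Ops columns. A minimal sketch of that arithmetic follows, assuming the row layout shown in the table above; the helper names `parse_ops` and `relative_change` are hypothetical and not part of the benchmark tooling, and the two sample rows are copied from the table.

```python
# Multipliers for the throughput units that appear in the tables.
UNIT_SCALE = {"Ops/s": 1.0, "KOps/s": 1e3, "MOps/s": 1e6}


def parse_ops(cell: str) -> float:
    """Convert a cell such as '77.4558 KOps/s' to operations per second."""
    value, unit = cell.split()
    return float(value) * UNIT_SCALE[unit]


def relative_change(row: str) -> tuple[str, float]:
    """Return (test name, relative change of the PR's Ops versus Repo HEAD)."""
    cells = [c.strip() for c in row.split("|")]
    # Column order: Name | Max | Mean | Ops | Ops on Repo HEAD | Change
    name, pr_cell, head_cell = cells[0], cells[3], cells[4]
    pr_ops = parse_ops(pr_cell)
    head_ops = parse_ops(head_cell)
    return name, (pr_ops - head_ops) / head_ops


# Two rows copied verbatim from the table above.
rows = [
    "test_membership_stacked_nested_last | 33.8630μs | 12.9106μs | 77.4558 KOps/s | 259.7614 KOps/s | |",
    "test_membership | 2.6315μs | 0.6805μs | 1.4695 MOps/s | 1.2244 MOps/s | |",
]

for row in rows:
    name, change = relative_change(row)
    # A negative value means the PR is slower than Repo HEAD.
    print(f"{name}: {change:+.1%}")
```

Single-run differences of a few percent are often within measurement noise for microbenchmarks, so a script like this is mainly useful for flagging large gaps such as the `test_membership_stacked_nested_last` row.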
Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
---|---|---|---|---|---|
test_plain_set_nested | 0.1272ms | 12.6479μs | 79.0645 KOps/s | 71.2511 KOps/s | |
test_plain_set_stack_nested | 43.2510μs | 12.8152μs | 78.0324 KOps/s | 70.1767 KOps/s | |
test_plain_set_nested_inplace | 41.4410μs | 13.8804μs | 72.0440 KOps/s | 66.0159 KOps/s | |
test_plain_set_stack_nested_inplace | 61.4710μs | 13.8894μs | 71.9972 KOps/s | 66.6516 KOps/s | |
test_items | 45.5010μs | 3.0318μs | 329.8337 KOps/s | 344.9332 KOps/s | |
test_items_nested | 0.3503ms | 0.3222ms | 3.1036 KOps/s | 3.1962 KOps/s | |
test_items_nested_locked | 0.3647ms | 0.3240ms | 3.0862 KOps/s | 3.1699 KOps/s | |
test_items_nested_leaf | 87.5420μs | 63.1215μs | 15.8425 KOps/s | 15.8725 KOps/s | |
test_items_stack_nested | 0.3768ms | 0.3243ms | 3.0839 KOps/s | 3.2002 KOps/s | |
test_items_stack_nested_leaf | 91.5010μs | 63.4123μs | 15.7698 KOps/s | 15.5043 KOps/s | |
test_items_stack_nested_locked | 0.3742ms | 0.3255ms | 3.0723 KOps/s | 3.1784 KOps/s | |
test_keys | 31.6600μs | 3.4507μs | 289.7946 KOps/s | 278.1717 KOps/s | |
test_keys_nested | 84.3920μs | 55.0573μs | 18.1629 KOps/s | 18.0186 KOps/s | |
test_keys_nested_locked | 2.5916ms | 59.9591μs | 16.6780 KOps/s | 16.4156 KOps/s | |
test_keys_nested_leaf | 72.1220μs | 45.2007μs | 22.1236 KOps/s | 21.5254 KOps/s | |
test_keys_stack_nested | 80.0520μs | 54.8563μs | 18.2294 KOps/s | 17.7899 KOps/s | |
test_keys_stack_nested_leaf | 75.0620μs | 46.3045μs | 21.5962 KOps/s | 20.9035 KOps/s | |
test_keys_stack_nested_locked | 0.1034ms | 60.3762μs | 16.5628 KOps/s | 16.3752 KOps/s | |
test_values | 3.5061μs | 0.8075μs | 1.2384 MOps/s | 1.1856 MOps/s | |
test_values_nested | 56.6010μs | 27.4018μs | 36.4939 KOps/s | 36.3971 KOps/s | |
test_values_nested_locked | 60.4320μs | 29.1845μs | 34.2647 KOps/s | 33.7568 KOps/s | |
test_values_nested_leaf | 49.9810μs | 24.2089μs | 41.3072 KOps/s | 41.3675 KOps/s | |
test_values_stack_nested | 63.7210μs | 27.6704μs | 36.1397 KOps/s | 34.6599 KOps/s | |
test_values_stack_nested_leaf | 0.1619ms | 24.4552μs | 40.8911 KOps/s | 39.3949 KOps/s | |
test_values_stack_nested_locked | 55.9010μs | 29.6614μs | 33.7139 KOps/s | 32.3235 KOps/s | |
test_membership | 1.6651μs | 0.4890μs | 2.0451 MOps/s | 2.0157 MOps/s | |
test_membership_nested | 14.9705μs | 1.7941μs | 557.3889 KOps/s | 573.8109 KOps/s | |
test_membership_nested_leaf | 12.7403μs | 1.7592μs | 568.4260 KOps/s | 588.3350 KOps/s | |
test_membership_stacked_nested | 35.7210μs | 1.7971μs | 556.4636 KOps/s | 557.9374 KOps/s | |
test_membership_stacked_nested_leaf | 29.7600μs | 1.7739μs | 563.7452 KOps/s | 547.6994 KOps/s | |
test_membership_nested_last | 35.1200μs | 2.6471μs | 377.7692 KOps/s | 380.3604 KOps/s | |
test_membership_nested_leaf_last | 38.2710μs | 2.6431μs | 378.3365 KOps/s | 379.3746 KOps/s | |
test_membership_stacked_nested_last | 32.4410μs | 2.6305μs | 380.1617 KOps/s | 387.1349 KOps/s | |
test_membership_stacked_nested_leaf_last | 26.0610μs | 2.6246μs | 381.0102 KOps/s | 387.3926 KOps/s | |
test_nested_getleaf | 60.0610μs | 6.0919μs | 164.1530 KOps/s | 164.2107 KOps/s | |
test_nested_get | 34.7610μs | 5.7195μs | 174.8397 KOps/s | 173.5060 KOps/s | |
test_stacked_getleaf | 29.3210μs | 6.0826μs | 164.4038 KOps/s | 166.0135 KOps/s | |
test_stacked_get | 43.3310μs | 5.8221μs | 171.7589 KOps/s | 177.0581 KOps/s | |
test_nested_getitemleaf | 27.4310μs | 6.1445μs | 162.7471 KOps/s | 162.5708 KOps/s | |
test_nested_getitem | 43.6810μs | 5.7260μs | 174.6427 KOps/s | 176.1164 KOps/s | |
test_stacked_getitemleaf | 33.6810μs | 6.1504μs | 162.5899 KOps/s | 165.7532 KOps/s | |
test_stacked_getitem | 51.4510μs | 5.7452μs | 174.0584 KOps/s | 178.2240 KOps/s | |
test_lock_nested | 4.9713ms | 0.4208ms | 2.3764 KOps/s | 2.3936 KOps/s | |
test_lock_stack_nested | 0.4687ms | 0.3847ms | 2.5997 KOps/s | 2.6051 KOps/s | |
test_unlock_nested | 0.7767ms | 0.3539ms | 2.8258 KOps/s | 2.8443 KOps/s | |
test_unlock_stack_nested | 0.3566ms | 0.3208ms | 3.1169 KOps/s | 3.1526 KOps/s | |
test_flatten_speed | 0.3104ms | 79.6385μs | 12.5567 KOps/s | 12.6540 KOps/s | |
test_unflatten_speed | 0.3267ms | 0.2859ms | 3.4978 KOps/s | 3.5564 KOps/s | |
test_common_ops | 1.3845ms | 1.1762ms | 850.1715 Ops/s | 798.0044 Ops/s | |
test_creation | 32.4310μs | 1.4467μs | 691.2330 KOps/s | 693.9780 KOps/s | |
test_creation_empty | 48.5410μs | 13.2013μs | 75.7500 KOps/s | 62.8318 KOps/s | |
test_creation_nested_1 | 44.8810μs | 15.0899μs | 66.2697 KOps/s | 55.3472 KOps/s | |
test_creation_nested_2 | 52.9010μs | 17.4062μs | 57.4508 KOps/s | 48.7703 KOps/s | |
test_clone | 73.5110μs | 31.1958μs | 32.0556 KOps/s | 35.0201 KOps/s | |
test_getitem[int] | 1.4380ms | 15.2153μs | 65.7232 KOps/s | 63.8887 KOps/s | |
test_getitem[slice_int] | 0.1174ms | 26.4723μs | 37.7753 KOps/s | 36.6802 KOps/s | |
test_getitem[range] | 0.2271ms | 0.1071ms | 9.3379 KOps/s | 9.1191 KOps/s | |
test_getitem[tuple] | 0.1168ms | 22.3641μs | 44.7146 KOps/s | 43.4083 KOps/s | |
test_getitem[list] | 0.2153ms | 95.5565μs | 10.4650 KOps/s | 10.2786 KOps/s | |
test_setitem_dim[int] | 67.9010μs | 44.1850μs | 22.6321 KOps/s | 22.4516 KOps/s | |
test_setitem_dim[slice_int] | 92.1320μs | 65.3007μs | 15.3138 KOps/s | 14.9776 KOps/s | |
test_setitem_dim[range] | 0.1562ms | 0.1238ms | 8.0743 KOps/s | 7.9471 KOps/s | |
test_setitem_dim[tuple] | 90.6620μs | 59.8246μs | 16.7155 KOps/s | 16.5676 KOps/s | |
test_setitem | 75.4410μs | 39.4496μs | 25.3488 KOps/s | 24.1797 KOps/s | |
test_set | 0.1173ms | 38.0212μs | 26.3011 KOps/s | 24.6316 KOps/s | |
test_set_shared | 0.3627ms | 49.6505μs | 20.1408 KOps/s | 19.8430 KOps/s | |
test_update | 80.3610μs | 45.2247μs | 22.1118 KOps/s | 20.5904 KOps/s | |
test_update_nested | 95.0420μs | 52.8372μs | 18.9261 KOps/s | 18.0119 KOps/s | |
test_update__nested | 0.2019ms | 58.8796μs | 16.9838 KOps/s | 17.0755 KOps/s | |
test_set_nested | 0.1970ms | 40.9434μs | 24.4240 KOps/s | 23.2449 KOps/s | |
test_set_nested_new | 92.9520μs | 44.5847μs | 22.4292 KOps/s | 21.5635 KOps/s | |
test_select | 0.2061ms | 57.1929μs | 17.4847 KOps/s | 16.8426 KOps/s | |
test_select_nested | 68.9620μs | 42.4739μs | 23.5439 KOps/s | 23.6305 KOps/s | |
test_exclude_nested | 86.6120μs | 59.5582μs | 16.7903 KOps/s | 17.2406 KOps/s | |
test_empty[True] | 0.3007ms | 0.2480ms | 4.0319 KOps/s | 4.1560 KOps/s | |
test_empty[False] | 5.4381μs | 0.8112μs | 1.2327 MOps/s | 1.2241 MOps/s | |
test_to | 66.9110μs | 25.4625μs | 39.2734 KOps/s | 39.3421 KOps/s | |
test_to_nonblocking | 98.2620μs | 24.2922μs | 41.1654 KOps/s | 41.1804 KOps/s | |
test_unbind_speed | 1.2004ms | 0.2754ms | 3.6305 KOps/s | 3.6428 KOps/s | |
test_unbind_speed_stack0 | 0.3732ms | 0.2721ms | 3.6754 KOps/s | 3.6705 KOps/s | |
test_unbind_speed_stack1 | 92.7395ms | 0.7133ms | 1.4020 KOps/s | 1.4131 KOps/s | |
test_split | 93.9280ms | 2.1714ms | 460.5306 Ops/s | 462.5995 Ops/s | |
test_chunk | 94.1099ms | 2.1024ms | 475.6544 Ops/s | 464.5054 Ops/s | |
test_creation[device0] | 0.3465ms | 0.1251ms | 7.9962 KOps/s | 8.0375 KOps/s | |
test_creation_from_tensor | 0.3878ms | 0.1293ms | 7.7310 KOps/s | 7.8486 KOps/s | |
test_add_one[memmap_tensor0] | 0.2304ms | 9.6655μs | 103.4605 KOps/s | 114.0736 KOps/s | |
test_contiguous[memmap_tensor0] | 22.4610μs | 2.1238μs | 470.8497 KOps/s | 470.0223 KOps/s | |
test_stack[memmap_tensor0] | 36.0410μs | 6.4422μs | 155.2253 KOps/s | 151.1681 KOps/s | |
test_memmaptd_index | 1.0969ms | 0.4059ms | 2.4636 KOps/s | 2.4182 KOps/s | |
test_memmaptd_index_astensor | 0.7351ms | 0.4651ms | 2.1499 KOps/s | 2.1042 KOps/s | |
test_memmaptd_index_op | 1.4159ms | 0.9776ms | 1.0229 KOps/s | 982.1350 Ops/s | |
test_serialize_model | 0.1299s | 0.1294s | 7.7269 Ops/s | 7.7022 Ops/s | |
test_serialize_model_pickle | 1.3485s | 1.2128s | 0.8245 Ops/s | 0.8250 Ops/s | |
test_serialize_weights | 0.1298s | 0.1284s | 7.7852 Ops/s | 7.0121 Ops/s | |
test_serialize_weights_returnearly | 0.2129s | 55.2839ms | 18.0885 Ops/s | 17.8675 Ops/s | |
test_serialize_weights_pickle | 1.3719s | 1.2166s | 0.8220 Ops/s | 0.8213 Ops/s | |
test_reshape_pytree | 79.9220μs | 35.1871μs | 28.4195 KOps/s | 28.2253 KOps/s | |
test_reshape_td | 0.1151ms | 40.9953μs | 24.3930 KOps/s | 24.0161 KOps/s | |
test_view_pytree | 68.5410μs | 34.6722μs | 28.8415 KOps/s | 28.4167 KOps/s | |
test_view_td | 86.2020μs | 45.0386μs | 22.2032 KOps/s | 21.1398 KOps/s | |
test_unbind_pytree | 70.5210μs | 33.9355μs | 29.4676 KOps/s | 29.5058 KOps/s | |
test_unbind_td | 0.4104ms | 42.4608μs | 23.5511 KOps/s | 23.7364 KOps/s | |
test_split_pytree | 0.3764ms | 43.7660μs | 22.8488 KOps/s | 22.3311 KOps/s | |
test_split_td | 93.5106ms | 63.1541μs | 15.8343 KOps/s | 18.2463 KOps/s | |
test_add_pytree | 0.1055ms | 55.7842μs | 17.9262 KOps/s | 17.3529 KOps/s | |
test_add_td | 0.1588ms | 88.2193μs | 11.3354 KOps/s | 11.2727 KOps/s | |
test_compile_add_one_nested[tensordict-compile] | 0.4018ms | 0.2034ms | 4.9174 KOps/s | 4.7689 KOps/s | |
test_compile_add_one_nested[tensordict-eager] | 0.2966ms | 0.1558ms | 6.4166 KOps/s | 6.4119 KOps/s | |
test_compile_add_one_nested[pytree-compile] | 0.1955ms | 0.1410ms | 7.0901 KOps/s | 7.0544 KOps/s | |
test_compile_add_one_nested[pytree-eager] | 0.2378ms | 0.1768ms | 5.6546 KOps/s | 5.1736 KOps/s | |
test_compile_copy_nested[tensordict-compile] | 72.9410μs | 20.3098μs | 49.2374 KOps/s | 48.1040 KOps/s | |
test_compile_copy_nested[tensordict-eager] | 90.3520μs | 44.0442μs | 22.7045 KOps/s | 22.9716 KOps/s | |
test_compile_copy_nested[pytree-compile] | 0.3261ms | 63.9328μs | 15.6414 KOps/s | 15.8899 KOps/s | |
test_compile_copy_nested[pytree-eager] | 0.1357ms | 48.9603μs | 20.4247 KOps/s | 20.1194 KOps/s | |
test_compile_add_one_flat[tensordict-compile] | 0.4141ms | 0.3092ms | 3.2339 KOps/s | 3.2689 KOps/s | |
test_compile_add_one_flat[tensordict-eager] | 0.2666ms | 0.2052ms | 4.8737 KOps/s | 4.7444 KOps/s | |
test_compile_add_one_flat[tensorclass-compile] | 0.1933ms | 0.1299ms | 7.6992 KOps/s | 7.9547 KOps/s | |
test_compile_add_one_flat[tensorclass-eager] | 0.1219ms | 61.4854μs | 16.2640 KOps/s | 16.1005 KOps/s | |
test_compile_add_one_flat[pytree-compile] | 0.3562ms | 0.3071ms | 3.2560 KOps/s | 3.2555 KOps/s | |
test_compile_add_one_flat[pytree-eager] | 0.6827ms | 0.5927ms | 1.6872 KOps/s | 1.5409 KOps/s | |
test_compile_add_self_flat[tensordict-eager] | 0.3910ms | 0.2466ms | 4.0553 KOps/s | 3.9409 KOps/s | |
test_compile_add_self_flat[tensordict-compile] | 0.3634ms | 0.3103ms | 3.2231 KOps/s | 3.2291 KOps/s | |
test_compile_add_self_flat[tensorclass-eager] | 0.1616ms | 69.7314μs | 14.3407 KOps/s | 13.7803 KOps/s | |
test_compile_add_self_flat[tensorclass-compile] | 0.2304ms | 0.1296ms | 7.7172 KOps/s | 7.8484 KOps/s | |
test_compile_add_self_flat[pytree-eager] | 0.5883ms | 0.5049ms | 1.9806 KOps/s | 1.7486 KOps/s | |
test_compile_add_self_flat[pytree-compile] | 0.3975ms | 0.3082ms | 3.2447 KOps/s | 3.2403 KOps/s | |
test_compile_copy_flat[tensordict-compile] | 79.9120μs | 19.1846μs | 52.1250 KOps/s | 56.0616 KOps/s | |
test_compile_copy_flat[tensordict-eager] | 0.1041ms | 29.4354μs | 33.9727 KOps/s | 34.5564 KOps/s | |
test_compile_copy_flat[pytree-compile] | 0.1016ms | 68.2966μs | 14.6420 KOps/s | 14.5123 KOps/s | |
test_compile_copy_flat[pytree-eager] | 0.1017ms | 51.3213μs | 19.4851 KOps/s | 19.3524 KOps/s | |
test_compile_assign_and_add[tensordict-compile] | 2.2503ms | 0.7754ms | 1.2897 KOps/s | 1.1800 KOps/s | |
test_compile_assign_and_add[tensordict-eager] | 3.3243ms | 3.1424ms | 318.2268 Ops/s | 306.5514 Ops/s | |
test_compile_assign_and_add[pytree-compile] | 2.2384ms | 0.7754ms | 1.2897 KOps/s | 1.1620 KOps/s | |
test_compile_assign_and_add[pytree-eager] | 3.1975ms | 3.0562ms | 327.2024 Ops/s | 301.9275 Ops/s | |
test_compile_indexing[tensor-tensordict-compile] | 0.1539ms | 0.1071ms | 9.3368 KOps/s | 8.9468 KOps/s | |
test_compile_indexing[tensor-tensordict-eager] | 0.1850ms | 57.9551μs | 17.2547 KOps/s | 15.6137 KOps/s | |
test_compile_indexing[tensor-tensorclass-compile] | 0.2102ms | 0.1017ms | 9.8295 KOps/s | 9.8345 KOps/s | |
test_compile_indexing[tensor-tensorclass-eager] | 74.4510μs | 41.0971μs | 24.3326 KOps/s | 23.5153 KOps/s | |
test_compile_indexing[tensor-pytree-compile] | 0.1427ms | 0.1041ms | 9.6046 KOps/s | 9.7244 KOps/s | |
test_compile_indexing[tensor-pytree-eager] | 82.8820μs | 44.8733μs | 22.2850 KOps/s | 23.6254 KOps/s | |
test_compile_indexing[slice-tensordict-compile] | 0.2644ms | 0.1350ms | 7.4098 KOps/s | 7.4517 KOps/s | |
test_compile_indexing[slice-tensordict-eager] | 0.1497ms | 24.2411μs | 41.2523 KOps/s | 40.7895 KOps/s | |
test_compile_indexing[slice-tensorclass-compile] | 0.1871ms | 0.1280ms | 7.8118 KOps/s | 7.7729 KOps/s | |
test_compile_indexing[slice-tensorclass-eager] | 54.1110μs | 20.3984μs | 49.0236 KOps/s | 50.0281 KOps/s | |
test_compile_indexing[slice-pytree-compile] | 0.1881ms | 0.1288ms | 7.7632 KOps/s | 7.7411 KOps/s | |
test_compile_indexing[slice-pytree-eager] | 52.7410μs | 19.9443μs | 50.1396 KOps/s | 49.9158 KOps/s | |
test_compile_indexing[int-tensordict-compile] | 0.1963ms | 0.1346ms | 7.4303 KOps/s | 7.4087 KOps/s | |
test_compile_indexing[int-tensordict-eager] | 0.4974ms | 24.2330μs | 41.2660 KOps/s | 40.3213 KOps/s | |
test_compile_indexing[int-tensorclass-compile] | 0.2725ms | 0.1332ms | 7.5065 KOps/s | 7.7306 KOps/s | |
test_compile_indexing[int-tensorclass-eager] | 46.9810μs | 20.1197μs | 49.7026 KOps/s | 49.4971 KOps/s | |
test_compile_indexing[int-pytree-compile] | 0.1729ms | 0.1286ms | 7.7763 KOps/s | 7.7427 KOps/s | |
test_compile_indexing[int-pytree-eager] | 50.9610μs | 20.1450μs | 49.6402 KOps/s | 49.2318 KOps/s | |
test_mod_add[eager] | 72.4910μs | 29.7805μs | 33.5790 KOps/s | 32.2745 KOps/s | |
test_mod_add[compile] | 0.2963ms | 68.1403μs | 14.6756 KOps/s | 14.4776 KOps/s | |
test_mod_add[compile-overhead] | 0.2604ms | 0.1382ms | 7.2382 KOps/s | 7.2275 KOps/s | |
test_mod_wrap[eager] | 0.3169ms | 0.2327ms | 4.2977 KOps/s | 4.1527 KOps/s | |
test_mod_wrap[compile] | 1.1144ms | 0.2787ms | 3.5887 KOps/s | 3.3521 KOps/s | |
test_mod_wrap[compile-overhead] | 7.9303ms | 4.1651ms | 240.0909 Ops/s | 245.0415 Ops/s | |
test_mod_wrap_and_backward[eager] | 1.4354ms | 1.3335ms | 749.9201 Ops/s | 695.6471 Ops/s | |
test_mod_wrap_and_backward[compile] | 1.6683ms | 1.3060ms | 765.7096 Ops/s | 706.3970 Ops/s | |
test_mod_wrap_and_backward[compile-overhead] | 1.2739ms | 0.8724ms | 1.1462 KOps/s | 1.0246 KOps/s | |
test_seq_add[eager] | 0.1413ms | 90.1744μs | 11.0896 KOps/s | 10.3740 KOps/s | |
test_seq_add[compile] | 0.3686ms | 81.2631μs | 12.3057 KOps/s | 12.6034 KOps/s | |
test_seq_add[compile-overhead] | 0.1649ms | 0.1175ms | 8.5125 KOps/s | 8.9336 KOps/s | |
test_seq_wrap[eager] | 0.4269ms | 0.3731ms | 2.6802 KOps/s | 2.6070 KOps/s | |
test_seq_wrap[compile] | 0.4108ms | 0.3048ms | 3.2812 KOps/s | 3.2936 KOps/s | |
test_seq_wrap[compile-overhead] | 0.2583ms | 0.2092ms | 4.7796 KOps/s | 4.8256 KOps/s | |
test_func_call_runtime[False-eager] | 0.8610ms | 0.7413ms | 1.3491 KOps/s | 1.3596 KOps/s | |
test_func_call_runtime[False-compile] | 0.9782ms | 0.7873ms | 1.2701 KOps/s | 1.2903 KOps/s | |
test_func_call_runtime[False-compile-overhead] | 0.4119ms | 0.3416ms | 2.9271 KOps/s | 2.9118 KOps/s | |
test_func_call_runtime[True-eager] | 1.0131ms | 0.8878ms | 1.1264 KOps/s | 1.1217 KOps/s | |
test_func_call_runtime[True-compile] | 0.9340ms | 0.7974ms | 1.2541 KOps/s | 1.2370 KOps/s | |
test_func_call_runtime[True-compile-overhead] | 0.4515ms | 0.3734ms | 2.6782 KOps/s | 2.6576 KOps/s | |
test_func_call_cm_runtime[False-eager] | 0.8329ms | 0.7150ms | 1.3985 KOps/s | 1.3767 KOps/s | |
test_func_call_cm_runtime[False-compile] | 0.8512ms | 0.7652ms | 1.3068 KOps/s | 1.2833 KOps/s | |
test_func_call_cm_runtime[False-compile-overhead] | 0.3895ms | 0.3419ms | 2.9244 KOps/s | 2.9003 KOps/s | |
test_func_call_cm_runtime[True-eager] | 1.0828ms | 0.9765ms | 1.0240 KOps/s | 1.0150 KOps/s | |
test_func_call_cm_runtime[True-compile] | 0.9805ms | 0.8238ms | 1.2138 KOps/s | 1.1945 KOps/s | |
test_func_call_cm_runtime[True-compile-overhead] | 0.4801ms | 0.3961ms | 2.5245 KOps/s | 2.4745 KOps/s | |
test_vmap_func_call_cm_runtime[eager] | 2.5598ms | 2.0446ms | 489.0969 Ops/s | 481.9476 Ops/s | |
test_vmap_func_call_cm_runtime[compile] | 0.9430ms | 0.8377ms | 1.1938 KOps/s | 1.1571 KOps/s | |
test_vmap_func_call_cm_runtime[compile-overhead] | 0.5319ms | 0.4034ms | 2.4788 KOps/s | 2.4479 KOps/s | |
test_distributed | 0.8357ms | 0.1545ms | 6.4707 KOps/s | 8.4430 KOps/s | |
test_tdmodule | 0.5401ms | 14.0770μs | 71.0379 KOps/s | 66.6883 KOps/s | |
test_tdmodule_dispatch | 46.8610μs | 26.1770μs | 38.2014 KOps/s | 32.4092 KOps/s | |
test_tdseq | 31.6600μs | 13.5733μs | 73.6738 KOps/s | 60.1405 KOps/s | |
test_tdseq_dispatch | 61.0210μs | 28.5867μs | 34.9813 KOps/s | 29.2981 KOps/s | |
test_instantiation_functorch | 1.9014ms | 1.8019ms | 554.9579 Ops/s | 525.5641 Ops/s | |
test_instantiation_td | 1.8244ms | 1.1748ms | 851.2334 Ops/s | 840.3486 Ops/s | |
test_exec_functorch | 0.2423ms | 0.2072ms | 4.8270 KOps/s | 4.4588 KOps/s | |
test_exec_functional_call | 0.2386ms | 0.2028ms | 4.9304 KOps/s | 4.4565 KOps/s | |
test_exec_td | 0.2416ms | 0.2084ms | 4.7976 KOps/s | 4.6855 KOps/s | |
test_exec_td_decorator | 1.0271ms | 0.2500ms | 3.9994 KOps/s | 3.8963 KOps/s | |
test_vmap_mlp_speed[True-True] | 0.7783ms | 0.6678ms | 1.4975 KOps/s | 1.4635 KOps/s | |
test_vmap_mlp_speed[True-False] | 0.7483ms | 0.6648ms | 1.5043 KOps/s | 1.4695 KOps/s | |
test_vmap_mlp_speed[False-True] | 0.6816ms | 0.5684ms | 1.7594 KOps/s | 1.7407 KOps/s | |
test_vmap_mlp_speed[False-False] | 0.6118ms | 0.5693ms | 1.7567 KOps/s | 1.7375 KOps/s | |
test_vmap_mlp_speed_decorator[True-True] | 1.2838ms | 0.6473ms | 1.5449 KOps/s | 1.4878 KOps/s | |
test_vmap_mlp_speed_decorator[True-False] | 0.7798ms | 0.6505ms | 1.5373 KOps/s | 1.4246 KOps/s | |
test_vmap_mlp_speed_decorator[False-True] | 0.6835ms | 0.5770ms | 1.7331 KOps/s | 1.6193 KOps/s | |
test_vmap_mlp_speed_decorator[False-False] | 0.7144ms | 0.5885ms | 1.6993 KOps/s | 1.6080 KOps/s | |
test_vmap_transformer_speed[True-True] | 8.2680ms | 8.1784ms | 122.2736 Ops/s | 119.6518 Ops/s | |
test_vmap_transformer_speed[True-False] | 8.5151ms | 8.1632ms | 122.5014 Ops/s | 120.1227 Ops/s | |
test_vmap_transformer_speed[False-True] | 8.2883ms | 7.9941ms | 125.0920 Ops/s | 123.0879 Ops/s | |
test_vmap_transformer_speed[False-False] | 8.3450ms | 7.9808ms | 125.3005 Ops/s | 122.5445 Ops/s | |
test_vmap_transformer_speed_decorator[True-True] | 19.9069ms | 19.2251ms | 52.0154 Ops/s | 51.6018 Ops/s | |
test_vmap_transformer_speed_decorator[True-False] | 19.8358ms | 19.2965ms | 51.8228 Ops/s | 51.4379 Ops/s | |
test_vmap_transformer_speed_decorator[False-True] | 19.6789ms | 19.0201ms | 52.5760 Ops/s | 51.8664 Ops/s | |
test_vmap_transformer_speed_decorator[False-False] | 20.1934ms | 19.0286ms | 52.5524 Ops/s | 51.7766 Ops/s | |
test_to_module_speed[True] | 1.3450ms | 0.9251ms | 1.0809 KOps/s | 1.0895 KOps/s | |
test_to_module_speed[False] | 1.2820ms | 0.9074ms | 1.1021 KOps/s | 1.1215 KOps/s | |
test_tc_init | 63.4110μs | 30.5347μs | 32.7496 KOps/s | 27.4110 KOps/s | |
test_tc_init_nested | 0.1027ms | 62.1177μs | 16.0985 KOps/s | 13.9107 KOps/s | |
test_tc_first_layer_tensor | 5.1630μs | 0.6900μs | 1.4492 MOps/s | 1.4215 MOps/s | |
test_tc_first_layer_nontensor | 22.7210μs | 2.2340μs | 447.6302 KOps/s | 442.2745 KOps/s | |
test_tc_second_layer_tensor | 17.1870μs | 1.3953μs | 716.6764 KOps/s | 719.0131 KOps/s | |
test_tc_second_layer_nontensor | 25.7410μs | 2.9278μs | 341.5529 KOps/s | 339.2350 KOps/s | |
test_unbind | 0.1904s | 11.9468ms | 83.7044 Ops/s | 93.5707 Ops/s | |
test_full_like | 0.6585ms | 0.5756ms | 1.7374 KOps/s | 1.7431 KOps/s | |
test_zeros_like | 0.2587ms | 0.1979ms | 5.0538 KOps/s | 5.0523 KOps/s | |
test_ones_like | 0.2412ms | 0.1978ms | 5.0560 KOps/s | 5.0558 KOps/s | |
test_clone | 0.4488ms | 0.4145ms | 2.4128 KOps/s | 2.4182 KOps/s | |
test_squeeze | 37.5510μs | 9.6871μs | 103.2299 KOps/s | 102.0674 KOps/s | |
test_unsqueeze | 0.2818ms | 72.4216μs | 13.8080 KOps/s | 13.5565 KOps/s | |
test_split | 0.2497ms | 0.1552ms | 6.4454 KOps/s | 6.4005 KOps/s | |
test_permute | 0.2241ms | 0.1765ms | 5.6662 KOps/s | 5.7126 KOps/s | |
test_stack | 1.2560ms | 0.8557ms | 1.1687 KOps/s | 1.1488 KOps/s | |
test_cat | 1.2577ms | 1.2314ms | 812.0757 Ops/s | 811.6963 Ops/s |
Labels: bug, CLA Signed
Stack from ghstack (oldest at bottom):