[Feature] MCTS Scoring functions #2358

Open: vmoens wants to merge 14 commits into base `gh/vmoens/6/base`

vmoens (Contributor) commented Aug 4, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]

pytorch-bot bot commented Aug 4, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2358

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 5 New Failures, 16 Unrelated Failures

As of commit 5a59f00 with merge base 408cf7d:

NEW FAILURES - The following jobs have failed:

FLAKY - The following jobs failed but were likely due to flakiness present on trunk:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

facebook-github-bot added the `CLA Signed` label Aug 4, 2024
vmoens added a commit that referenced this pull request Aug 7, 2024
ghstack-source-id: 7ee601a3fb990db519cd33906ecba54dd1695676
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 14, 2024
ghstack-source-id: 90b80a6bcd2bf905074e828292d308647a094774
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 15, 2024
ghstack-source-id: f4674b6206d0f41908f6d161855ffa58a8468635
Pull Request resolved: #2358
vmoens added the `enhancement` label Oct 26, 2024
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 26, 2024
ghstack-source-id: 2e60d6f52d70e4ae84f6df2ed4150f3fd2790bcb
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Oct 26, 2024
ghstack-source-id: 5fdfbeab44f579aa01e333f6900cf0c9297d58aa
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 4, 2024
ghstack-source-id: 2b296b99c31559e46bec430c9b428d6a855760f8
Pull Request resolved: #2358

github-actions bot commented Nov 4, 2024

⚠ Warning: Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: 38. Worsened: 4.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| `test_simple` | 0.4245s | 0.4232s | 2.3629 Ops/s | 2.1784 Ops/s | **+8.47%** |
| `test_transformed` | 0.6031s | 0.6004s | 1.6656 Ops/s | 1.5856 Ops/s | **+5.04%** |
| `test_serial` | 1.3556s | 1.3474s | 0.7422 Ops/s | 0.7257 Ops/s | +2.27% |
| `test_parallel` | 1.2918s | 1.2829s | 0.7795 Ops/s | 0.7615 Ops/s | +2.36% |
| `test_step_mdp_speed[True-True-True-True-True]` | 0.1477ms | 26.8220μs | 37.2828 KOps/s | 37.8292 KOps/s | -1.44% |
| `test_step_mdp_speed[True-True-True-True-False]` | 48.6500μs | 15.6593μs | 63.8599 KOps/s | 64.4673 KOps/s | -0.94% |
| `test_step_mdp_speed[True-True-True-False-True]` | 50.7750μs | 15.4089μs | 64.8975 KOps/s | 65.6176 KOps/s | -1.10% |
| `test_step_mdp_speed[True-True-True-False-False]` | 35.8570μs | 8.9890μs | 111.2476 KOps/s | 111.8099 KOps/s | -0.50% |
| `test_step_mdp_speed[True-True-False-True-True]` | 72.5850μs | 28.6416μs | 34.9142 KOps/s | 35.2091 KOps/s | -0.84% |
| `test_step_mdp_speed[True-True-False-True-False]` | 69.0290μs | 17.2408μs | 58.0018 KOps/s | 58.2469 KOps/s | -0.42% |
| `test_step_mdp_speed[True-True-False-False-True]` | 68.9910μs | 16.7076μs | 59.8529 KOps/s | 59.5311 KOps/s | +0.54% |
| `test_step_mdp_speed[True-True-False-False-False]` | 43.6010μs | 10.4957μs | 95.2768 KOps/s | 94.4474 KOps/s | +0.88% |
| `test_step_mdp_speed[True-False-True-True-True]` | 80.4100μs | 29.9247μs | 33.4172 KOps/s | 33.2540 KOps/s | +0.49% |
| `test_step_mdp_speed[True-False-True-True-False]` | 49.1720μs | 18.8671μs | 53.0024 KOps/s | 53.0679 KOps/s | -0.12% |
| `test_step_mdp_speed[True-False-True-False-True]` | 60.4230μs | 16.7196μs | 59.8101 KOps/s | 59.4053 KOps/s | +0.68% |
| `test_step_mdp_speed[True-False-True-False-False]` | 52.8380μs | 10.4956μs | 95.2780 KOps/s | 95.8764 KOps/s | -0.62% |
| `test_step_mdp_speed[True-False-False-True-True]` | 67.2950μs | 31.2873μs | 31.9619 KOps/s | 31.6534 KOps/s | +0.97% |
| `test_step_mdp_speed[True-False-False-True-False]` | 51.8060μs | 20.5312μs | 48.7063 KOps/s | 49.4185 KOps/s | -1.44% |
| `test_step_mdp_speed[True-False-False-False-True]` | 62.8470μs | 18.2050μs | 54.9301 KOps/s | 50.8569 KOps/s | **+8.01%** |
| `test_step_mdp_speed[True-False-False-False-False]` | 49.9530μs | 12.4799μs | 80.1287 KOps/s | 81.7542 KOps/s | -1.99% |
| `test_step_mdp_speed[False-True-True-True-True]` | 76.2920μs | 30.1171μs | 33.2037 KOps/s | 32.8292 KOps/s | +1.14% |
| `test_step_mdp_speed[False-True-True-True-False]` | 48.3300μs | 18.9145μs | 52.8695 KOps/s | 52.1853 KOps/s | +1.31% |
| `test_step_mdp_speed[False-True-True-False-True]` | 57.4780μs | 19.3391μs | 51.7087 KOps/s | 51.3837 KOps/s | +0.63% |
| `test_step_mdp_speed[False-True-True-False-False]` | 44.3930μs | 11.7369μs | 85.2017 KOps/s | 85.1788 KOps/s | +0.03% |
| `test_step_mdp_speed[False-True-False-True-True]` | 80.4900μs | 31.4434μs | 31.8032 KOps/s | 31.7568 KOps/s | +0.15% |
| `test_step_mdp_speed[False-True-False-True-False]` | 65.6630μs | 20.4496μs | 48.9007 KOps/s | 49.2818 KOps/s | -0.77% |
| `test_step_mdp_speed[False-True-False-False-True]` | 2.7977ms | 20.7778μs | 48.1283 KOps/s | 48.2396 KOps/s | -0.23% |
| `test_step_mdp_speed[False-True-False-False-False]` | 60.5430μs | 13.3419μs | 74.9520 KOps/s | 74.9395 KOps/s | +0.02% |
| `test_step_mdp_speed[False-False-True-True-True]` | 77.5650μs | 32.9046μs | 30.3909 KOps/s | 30.1647 KOps/s | +0.75% |
| `test_step_mdp_speed[False-False-True-True-False]` | 65.7530μs | 21.7955μs | 45.8811 KOps/s | 45.6898 KOps/s | +0.42% |
| `test_step_mdp_speed[False-False-True-False-True]` | 51.3450μs | 20.6079μs | 48.5251 KOps/s | 48.3731 KOps/s | +0.31% |
| `test_step_mdp_speed[False-False-True-False-False]` | 78.2840μs | 13.2419μs | 75.5181 KOps/s | 74.8849 KOps/s | +0.85% |
| `test_step_mdp_speed[False-False-False-True-True]` | 64.7100μs | 34.2183μs | 29.2241 KOps/s | 29.1331 KOps/s | +0.31% |
| `test_step_mdp_speed[False-False-False-True-False]` | 55.0530μs | 23.2786μs | 42.9578 KOps/s | 42.7307 KOps/s | +0.53% |
| `test_step_mdp_speed[False-False-False-False-True]` | 82.5340μs | 21.9763μs | 45.5035 KOps/s | 44.9061 KOps/s | +1.33% |
| `test_step_mdp_speed[False-False-False-False-False]` | 47.9790μs | 14.6323μs | 68.3420 KOps/s | 67.6583 KOps/s | +1.01% |
| `test_values[generalized_advantage_estimate-True-True]` | 10.7144ms | 9.9577ms | 100.4246 Ops/s | 95.2617 Ops/s | **+5.42%** |
| `test_values[vec_generalized_advantage_estimate-True-True]` | 38.9823ms | 34.0526ms | 29.3663 Ops/s | 27.9711 Ops/s | +4.99% |
| `test_values[td0_return_estimate-False-False]` | 0.2169ms | 0.1951ms | 5.1248 KOps/s | 5.1596 KOps/s | -0.67% |
| `test_values[td1_return_estimate-False-False]` | 29.1183ms | 24.4930ms | 40.8280 Ops/s | 39.3217 Ops/s | +3.83% |
| `test_values[vec_td1_return_estimate-False-False]` | 36.3315ms | 34.1032ms | 29.3228 Ops/s | 27.8490 Ops/s | **+5.29%** |
| `test_values[td_lambda_return_estimate-True-False]` | 36.8646ms | 35.2297ms | 28.3851 Ops/s | 27.4665 Ops/s | +3.34% |
| `test_values[vec_td_lambda_return_estimate-True-False]` | 38.5649ms | 34.1282ms | 29.3013 Ops/s | 24.8017 Ops/s | **+18.14%** |
| `test_gae_speed[generalized_advantage_estimate-False-1-512]` | 8.8160ms | 8.6099ms | 116.1448 Ops/s | 113.5161 Ops/s | +2.32% |
| `test_gae_speed[vec_generalized_advantage_estimate-True-1-512]` | 2.7430ms | 1.8492ms | 540.7611 Ops/s | 560.8079 Ops/s | -3.57% |
| `test_gae_speed[vec_generalized_advantage_estimate-False-1-512]` | 0.5561ms | 0.3625ms | 2.7586 KOps/s | 2.7379 KOps/s | +0.75% |
| `test_gae_speed[vec_generalized_advantage_estimate-True-32-512]` | 45.1166ms | 40.8294ms | 24.4922 Ops/s | 25.4707 Ops/s | -3.84% |
| `test_gae_speed[vec_generalized_advantage_estimate-False-32-512]` | 4.2538ms | 3.0636ms | 326.4125 Ops/s | 323.1184 Ops/s | +1.02% |
| `test_dqn_speed[False-None]` | 1.8411ms | 1.4426ms | 693.1773 Ops/s | 731.2881 Ops/s | **-5.21%** |
| `test_dqn_speed[False-backward]` | 1.9915ms | 1.8941ms | 527.9418 Ops/s | 536.5725 Ops/s | -1.61% |
| `test_dqn_speed[True-None]` | 0.7662ms | 0.4711ms | 2.1226 KOps/s | 2.0755 KOps/s | +2.27% |
| `test_dqn_speed[True-backward]` | 1.0278ms | 0.9433ms | 1.0601 KOps/s | 1.0858 KOps/s | -2.36% |
| `test_dqn_speed[reduce-overhead-None]` | 0.8040ms | 0.4743ms | 2.1083 KOps/s | 2.0947 KOps/s | +0.65% |
| `test_dqn_speed[reduce-overhead-backward]` | 1.0017ms | 0.9138ms | 1.0943 KOps/s | 1.0800 KOps/s | +1.33% |
| `test_ddpg_speed[False-None]` | 3.6748ms | 2.9647ms | 337.3031 Ops/s | 350.4748 Ops/s | -3.76% |
| `test_ddpg_speed[False-backward]` | 4.0392ms | 3.9545ms | 252.8793 Ops/s | 247.1004 Ops/s | +2.34% |
| `test_ddpg_speed[True-None]` | 1.2355ms | 1.0234ms | 977.1618 Ops/s | 935.3674 Ops/s | +4.47% |
| `test_ddpg_speed[True-backward]` | 1.9938ms | 1.9288ms | 518.4657 Ops/s | 431.7466 Ops/s | **+20.09%** |
| `test_ddpg_speed[reduce-overhead-None]` | 2.8707ms | 1.0783ms | 927.3504 Ops/s | 966.4628 Ops/s | -4.05% |
| `test_ddpg_speed[reduce-overhead-backward]` | 2.0108ms | 1.9252ms | 519.4366 Ops/s | 499.5933 Ops/s | +3.97% |
| `test_sac_speed[False-None]` | 9.1218ms | 7.9770ms | 125.3606 Ops/s | 113.4169 Ops/s | **+10.53%** |
| `test_sac_speed[False-backward]` | 11.6069ms | 10.7414ms | 93.0978 Ops/s | 86.1813 Ops/s | **+8.03%** |
| `test_sac_speed[True-None]` | 2.4368ms | 1.8591ms | 537.8842 Ops/s | 501.3875 Ops/s | **+7.28%** |
| `test_sac_speed[True-backward]` | 3.8791ms | 3.5466ms | 281.9617 Ops/s | 250.7365 Ops/s | **+12.45%** |
| `test_sac_speed[reduce-overhead-None]` | 4.2459ms | 1.8973ms | 527.0576 Ops/s | 508.6319 Ops/s | +3.62% |
| `test_sac_speed[reduce-overhead-backward]` | 3.8734ms | 3.7113ms | 269.4450 Ops/s | 250.9397 Ops/s | **+7.37%** |
| `test_redq_speed[False-None]` | 15.1384ms | 13.1325ms | 76.1469 Ops/s | 73.8428 Ops/s | +3.12% |
| `test_redq_speed[False-backward]` | 25.5816ms | 22.5990ms | 44.2497 Ops/s | 43.1495 Ops/s | +2.55% |
| `test_redq_speed[True-None]` | 5.7806ms | 4.8267ms | 207.1823 Ops/s | 176.5600 Ops/s | **+17.34%** |
| `test_redq_speed[True-backward]` | 13.8106ms | 11.9804ms | 83.4698 Ops/s | 79.8447 Ops/s | +4.54% |
| `test_redq_speed[reduce-overhead-None]` | 5.5068ms | 4.5103ms | 221.7155 Ops/s | 196.7703 Ops/s | **+12.68%** |
| `test_redq_speed[reduce-overhead-backward]` | 13.7353ms | 12.1176ms | 82.5244 Ops/s | 78.5287 Ops/s | **+5.09%** |
| `test_redq_deprec_speed[False-None]` | 13.6069ms | 12.5120ms | 79.9234 Ops/s | 73.3375 Ops/s | **+8.98%** |
| `test_redq_deprec_speed[False-backward]` | 20.3702ms | 18.2249ms | 54.8700 Ops/s | 50.3188 Ops/s | **+9.04%** |
| `test_redq_deprec_speed[True-None]` | 4.5944ms | 3.5921ms | 278.3901 Ops/s | 268.3310 Ops/s | +3.75% |
| `test_redq_deprec_speed[True-backward]` | 8.5049ms | 8.0236ms | 124.6320 Ops/s | 112.1819 Ops/s | **+11.10%** |
| `test_redq_deprec_speed[reduce-overhead-None]` | 4.5106ms | 3.6271ms | 275.6993 Ops/s | 268.8514 Ops/s | +2.55% |
| `test_redq_deprec_speed[reduce-overhead-backward]` | 9.6093ms | 8.2544ms | 121.1472 Ops/s | 117.2931 Ops/s | +3.29% |
| `test_td3_speed[False-None]` | 8.2086ms | 7.8666ms | 127.1193 Ops/s | 123.4367 Ops/s | +2.98% |
| `test_td3_speed[False-backward]` | 10.8236ms | 10.3119ms | 96.9756 Ops/s | 95.8242 Ops/s | +1.20% |
| `test_td3_speed[True-None]` | 1.9209ms | 1.7352ms | 576.3186 Ops/s | 566.1392 Ops/s | +1.80% |
| `test_td3_speed[True-backward]` | 3.4391ms | 3.3650ms | 297.1743 Ops/s | 290.2692 Ops/s | +2.38% |
| `test_td3_speed[reduce-overhead-None]` | 2.1312ms | 1.7504ms | 571.2961 Ops/s | 572.5881 Ops/s | -0.23% |
| `test_td3_speed[reduce-overhead-backward]` | 3.8456ms | 3.4002ms | 294.1011 Ops/s | 273.8703 Ops/s | **+7.39%** |
| `test_cql_speed[False-None]` | 39.1643ms | 35.9002ms | 27.8550 Ops/s | 26.6629 Ops/s | +4.47% |
| `test_cql_speed[False-backward]` | 48.5355ms | 45.5647ms | 21.9468 Ops/s | 21.1049 Ops/s | +3.99% |
| `test_cql_speed[True-None]` | 17.3134ms | 15.8134ms | 63.2376 Ops/s | 61.8370 Ops/s | +2.27% |
| `test_cql_speed[True-backward]` | 23.7335ms | 22.6015ms | 44.2449 Ops/s | 43.0828 Ops/s | +2.70% |
| `test_cql_speed[reduce-overhead-None]` | 16.6464ms | 15.7210ms | 63.6090 Ops/s | 62.3733 Ops/s | +1.98% |
| `test_cql_speed[reduce-overhead-backward]` | 24.6425ms | 22.2587ms | 44.9263 Ops/s | 42.3444 Ops/s | **+6.10%** |
| `test_a2c_speed[False-None]` | 8.1211ms | 7.0584ms | 141.6761 Ops/s | 128.5104 Ops/s | **+10.24%** |
| `test_a2c_speed[False-backward]` | 16.5282ms | 14.4442ms | 69.2321 Ops/s | 64.4244 Ops/s | **+7.46%** |
| `test_a2c_speed[True-None]` | 3.7135ms | 3.2996ms | 303.0664 Ops/s | 280.7972 Ops/s | **+7.93%** |
| `test_a2c_speed[True-backward]` | 10.6763ms | 9.9231ms | 100.7752 Ops/s | 94.5419 Ops/s | **+6.59%** |
| `test_a2c_speed[reduce-overhead-None]` | 5.3782ms | 3.4044ms | 293.7394 Ops/s | 291.6656 Ops/s | +0.71% |
| `test_a2c_speed[reduce-overhead-backward]` | 9.9587ms | 9.6893ms | 103.2070 Ops/s | 96.5629 Ops/s | **+6.88%** |
| `test_ppo_speed[False-None]` | 9.3525ms | 7.3020ms | 136.9479 Ops/s | 129.9074 Ops/s | **+5.42%** |
| `test_ppo_speed[False-backward]` | 17.0978ms | 14.5942ms | 68.5205 Ops/s | 66.0180 Ops/s | +3.79% |
| `test_ppo_speed[True-None]` | 4.3508ms | 3.7035ms | 270.0167 Ops/s | 263.3887 Ops/s | +2.52% |
| `test_ppo_speed[True-backward]` | 9.9547ms | 9.5571ms | 104.6339 Ops/s | 100.4866 Ops/s | +4.13% |
| `test_ppo_speed[reduce-overhead-None]` | 4.5466ms | 3.7152ms | 269.1674 Ops/s | 262.1466 Ops/s | +2.68% |
| `test_ppo_speed[reduce-overhead-backward]` | 10.9198ms | 9.8499ms | 101.5244 Ops/s | 99.6337 Ops/s | +1.90% |
| `test_reinforce_speed[False-None]` | 8.2343ms | 6.5809ms | 151.9560 Ops/s | 152.7794 Ops/s | -0.54% |
| `test_reinforce_speed[False-backward]` | 11.6184ms | 10.0437ms | 99.5651 Ops/s | 100.4610 Ops/s | -0.89% |
| `test_reinforce_speed[True-None]` | 4.9457ms | 2.7887ms | 358.5946 Ops/s | 361.7219 Ops/s | -0.86% |
| `test_reinforce_speed[True-backward]` | 9.4116ms | 8.7215ms | 114.6591 Ops/s | 108.8612 Ops/s | **+5.33%** |
| `test_reinforce_speed[reduce-overhead-None]` | 3.5898ms | 2.6786ms | 373.3246 Ops/s | 360.6498 Ops/s | +3.51% |
| `test_reinforce_speed[reduce-overhead-backward]` | 9.5030ms | 8.7296ms | 114.5524 Ops/s | 107.0845 Ops/s | **+6.97%** |
| `test_iql_speed[False-None]` | 35.1085ms | 32.2283ms | 31.0286 Ops/s | 30.1679 Ops/s | +2.85% |
| `test_iql_speed[False-backward]` | 47.0783ms | 44.8705ms | 22.2864 Ops/s | 21.6180 Ops/s | +3.09% |
| `test_iql_speed[True-None]` | 12.1111ms | 10.5678ms | 94.6269 Ops/s | 89.2328 Ops/s | **+6.05%** |
| `test_iql_speed[True-backward]` | 23.5670ms | 21.5693ms | 46.3622 Ops/s | 43.8194 Ops/s | **+5.80%** |
| `test_iql_speed[reduce-overhead-None]` | 12.3973ms | 10.8615ms | 92.0684 Ops/s | 89.5048 Ops/s | +2.86% |
| `test_iql_speed[reduce-overhead-backward]` | 23.6614ms | 21.5196ms | 46.4693 Ops/s | 44.1013 Ops/s | **+5.37%** |
| `test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]` | 6.0741ms | 4.8009ms | 208.2944 Ops/s | 198.5251 Ops/s | +4.92% |
| `test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]` | 0.7547ms | 0.5183ms | 1.9293 KOps/s | 1.9635 KOps/s | -1.74% |
| `test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]` | 1.3786ms | 0.5137ms | 1.9465 KOps/s | 2.0298 KOps/s | -4.10% |
| `test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]` | 4.9823ms | 4.5404ms | 220.2448 Ops/s | 206.2759 Ops/s | **+6.77%** |
| `test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]` | 1.3558ms | 0.4943ms | 2.0231 KOps/s | 1.9251 KOps/s | **+5.09%** |
| `test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]` | 0.7830ms | 0.4779ms | 2.0924 KOps/s | 2.0546 KOps/s | +1.84% |
| `test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000]` | 2.1861ms | 1.6338ms | 612.0860 Ops/s | 603.9677 Ops/s | +1.34% |
| `test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000]` | 2.0335ms | 1.5790ms | 633.2977 Ops/s | 624.8215 Ops/s | +1.36% |
| `test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]` | 7.4543ms | 4.6839ms | 213.4994 Ops/s | 193.8914 Ops/s | **+10.11%** |
| `test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]` | 2.0244ms | 0.6356ms | 1.5733 KOps/s | 1.5088 KOps/s | +4.27% |
| `test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]` | 0.9791ms | 0.6202ms | 1.6125 KOps/s | 1.5825 KOps/s | +1.89% |
| `test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000]` | 7.3914ms | 4.6263ms | 216.1540 Ops/s | 206.1996 Ops/s | +4.83% |
| `test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000]` | 0.7457ms | 0.5153ms | 1.9408 KOps/s | 1.9056 KOps/s | +1.84% |
| `test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000]` | 7.1940ms | 0.5037ms | 1.9854 KOps/s | 1.9965 KOps/s | -0.56% |
| `test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000]` | 5.3919ms | 4.7281ms | 211.5002 Ops/s | 206.7992 Ops/s | +2.27% |
| `test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000]` | 1.2309ms | 0.5059ms | 1.9768 KOps/s | 1.9511 KOps/s | +1.32% |
| `test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000]` | 0.6717ms | 0.4745ms | 2.1073 KOps/s | 2.0752 KOps/s | +1.54% |
| `test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000]` | 6.6581ms | 5.0611ms | 197.5869 Ops/s | 198.3926 Ops/s | -0.41% |
| `test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000]` | 3.0462ms | 0.6663ms | 1.5009 KOps/s | 1.5245 KOps/s | -1.55% |
| `test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000]` | 0.9379ms | 0.6311ms | 1.5847 KOps/s | 1.5783 KOps/s | +0.41% |
| `test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]` | 5.5090ms | 4.1278ms | 242.2574 Ops/s | 241.5715 Ops/s | +0.28% |
| `test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400]` | 8.1017ms | 2.2917ms | 436.3604 Ops/s | 416.9007 Ops/s | +4.67% |
| `test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400]` | 2.5292ms | 1.2481ms | 801.2422 Ops/s | 792.0154 Ops/s | +1.16% |
| `test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400]` | 0.4032s | 12.1135ms | 82.5525 Ops/s | 247.2428 Ops/s | **-66.61%** |
| `test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400]` | 6.8411ms | 2.2879ms | 437.0801 Ops/s | 383.7659 Ops/s | **+13.89%** |
| `test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400]` | 1.7114ms | 1.2063ms | 828.9525 Ops/s | 744.8188 Ops/s | **+11.30%** |
| `test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]` | 6.2129ms | 4.3419ms | 230.3151 Ops/s | 247.3410 Ops/s | **-6.88%** |
| `test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400]` | 9.2325ms | 2.6675ms | 374.8866 Ops/s | 410.7045 Ops/s | **-8.72%** |
| `test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400]` | 5.0960ms | 1.4996ms | 666.8314 Ops/s | 651.6908 Ops/s | +2.32% |
| `test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True]` | 12.1270ms | 11.2151ms | 89.1659 Ops/s | 84.4333 Ops/s | **+5.61%** |
| `test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False]` | 15.1338ms | 14.6549ms | 68.2368 Ops/s | 69.3760 Ops/s | -1.64% |
| `test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True]` | 22.7115ms | 20.1685ms | 49.5822 Ops/s | 47.6960 Ops/s | +3.95% |
| `test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False]` | 16.7335ms | 14.7366ms | 67.8584 Ops/s | 67.6561 Ops/s | +0.30% |
| `test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True]` | 20.8213ms | 20.0691ms | 49.8279 Ops/s | 47.3313 Ops/s | **+5.27%** |
| `test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False]` | 16.2109ms | 15.7905ms | 63.3292 Ops/s | 61.7337 Ops/s | +2.58% |
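
For readers checking the numbers: the Change column appears to be the relative difference between the Ops value for this PR and the Ops value on repo HEAD, and the emphasized rows (the ones counted as Improved or Worsened) appear to be those whose change exceeds 5% in either direction. Both points are inferred from the reported values, not taken from the benchmark tooling, so the sketch below is an assumption-laden reconstruction. The helper names `relative_change` and `classify` are hypothetical.

```python
def relative_change(ops_pr: float, ops_head: float) -> float:
    """Percent change in throughput of the PR relative to repo HEAD."""
    return (ops_pr - ops_head) / ops_head * 100.0


def classify(change_pct: float, threshold_pct: float = 5.0) -> str:
    """Label a change; the 5% threshold is an assumption inferred from the table."""
    if change_pct > threshold_pct:
        return "improved"
    if change_pct < -threshold_pct:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU results.
rows = {
    "test_simple": (2.3629, 2.1784),
    "test_dqn_speed[False-None]": (693.1773, 731.2881),
    "test_serial": (0.7422, 0.7257),
}

for name, (ops_pr, ops_head) in rows.items():
    change = relative_change(ops_pr, ops_head)
    print(f"{name}: {change:+.2f}% ({classify(change)})")
```

Under this reading, `test_simple` (+8.47%) counts as improved, `test_dqn_speed[False-None]` (-5.21%) as worsened, and `test_serial` (+2.27%) as neither, which matches how those rows are marked in the report.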

github-actions bot commented Nov 4, 2024

⚠ Warning: Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: 13. Worsened: 5.

Detailed results:

| Name | Max | Mean | Ops | Ops on Repo HEAD | Change |
| --- | --- | --- | --- | --- | --- |
| `test_simple` | 0.7333s | 0.7302s | 1.3694 Ops/s | 1.3239 Ops/s | +3.44% |
| `test_transformed` | 0.9695s | 0.9677s | 1.0333 Ops/s | 1.0153 Ops/s | +1.78% |
| `test_serial` | 2.0882s | 2.0868s | 0.4792 Ops/s | 0.4708 Ops/s | +1.79% |
| `test_parallel` | 1.9735s | 1.9445s | 0.5143 Ops/s | 0.5042 Ops/s | +2.00% |
| `test_step_mdp_speed[True-True-True-True-True]` | 0.2030ms | 35.4631μs | 28.1983 KOps/s | 28.4153 KOps/s | -0.76% |
| `test_step_mdp_speed[True-True-True-True-False]` | 48.0200μs | 20.6173μs | 48.5030 KOps/s | 49.2996 KOps/s | -1.62% |
| `test_step_mdp_speed[True-True-True-False-True]` | 46.9510μs | 19.9404μs | 50.1495 KOps/s | 50.6730 KOps/s | -1.03% |
| `test_step_mdp_speed[True-True-True-False-False]` | 39.6810μs | 11.4664μs | 87.2111 KOps/s | 88.4767 KOps/s | -1.43% |
| `test_step_mdp_speed[True-True-False-True-True]` | 72.7210μs | 37.6514μs | 26.5594 KOps/s | 26.9589 KOps/s | -1.48% |
| `test_step_mdp_speed[True-True-False-True-False]` | 64.1010μs | 22.0518μs | 45.3478 KOps/s | 45.0050 KOps/s | +0.76% |
| `test_step_mdp_speed[True-True-False-False-True]` | 54.6210μs | 22.0393μs | 45.3735 KOps/s | 45.4314 KOps/s | -0.13% |
| `test_step_mdp_speed[True-True-False-False-False]` | 46.4710μs | 13.4063μs | 74.5918 KOps/s | 74.5203 KOps/s | +0.10% |
| `test_step_mdp_speed[True-False-True-True-True]` | 89.6110μs | 39.1301μs | 25.5558 KOps/s | 25.3818 KOps/s | +0.69% |
| `test_step_mdp_speed[True-False-True-True-False]` | 87.6110μs | 23.6096μs | 42.3556 KOps/s | 41.2034 KOps/s | +2.80% |
| `test_step_mdp_speed[True-False-True-False-True]` | 52.4110μs | 21.5758μs | 46.3483 KOps/s | 45.0087 KOps/s | +2.98% |
| `test_step_mdp_speed[True-False-True-False-False]` | 46.9110μs | 13.3292μs | 75.0234 KOps/s | 73.9296 KOps/s | +1.48% |
| `test_step_mdp_speed[True-False-False-True-True]` | 75.5210μs | 40.7960μs | 24.5122 KOps/s | 24.0048 KOps/s | +2.11% |
| `test_step_mdp_speed[True-False-False-True-False]` | 61.9610μs | 25.7437μs | 38.8444 KOps/s | 37.9271 KOps/s | +2.42% |
| `test_step_mdp_speed[True-False-False-False-True]` | 71.0510μs | 23.4982μs | 42.5565 KOps/s | 42.4616 KOps/s | +0.22% |
| `test_step_mdp_speed[True-False-False-False-False]` | 51.0310μs | 15.2522μs | 65.5645 KOps/s | 65.1699 KOps/s | +0.61% |
| `test_step_mdp_speed[False-True-True-True-True]` | 82.5610μs | 39.3169μs | 25.4344 KOps/s | 24.6902 KOps/s | +3.01% |
| `test_step_mdp_speed[False-True-True-True-False]` | 60.7810μs | 24.2302μs | 41.2708 KOps/s | 40.6446 KOps/s | +1.54% |
| `test_step_mdp_speed[False-True-True-False-True]` | 52.2110μs | 25.7965μs | 38.7649 KOps/s | 38.3793 KOps/s | +1.00% |
| `test_step_mdp_speed[False-True-True-False-False]` | 42.9300μs | 15.0397μs | 66.4906 KOps/s | 65.2073 KOps/s | +1.97% |
| `test_step_mdp_speed[False-True-False-True-True]` | 74.5910μs | 40.9337μs | 24.4298 KOps/s | 23.8871 KOps/s | +2.27% |
| `test_step_mdp_speed[False-True-False-True-False]` | 57.5610μs | 25.5978μs | 39.0659 KOps/s | 38.1915 KOps/s | +2.29% |
| `test_step_mdp_speed[False-True-False-False-True]` | 3.4367ms | 27.5527μs | 36.2941 KOps/s | 36.2221 KOps/s | +0.20% |
| `test_step_mdp_speed[False-True-False-False-False]` | 59.3200μs | 16.8602μs | 59.3114 KOps/s | 58.7394 KOps/s | +0.97% |
| `test_step_mdp_speed[False-False-True-True-True]` | 75.2710μs | 43.3441μs | 23.0712 KOps/s | 22.9778 KOps/s | +0.41% |
| `test_step_mdp_speed[False-False-True-True-False]` | 57.4310μs | 27.9113μs | 35.8278 KOps/s | 35.3314 KOps/s | +1.41% |
| `test_step_mdp_speed[False-False-True-False-True]` | 60.4210μs | 27.0774μs | 36.9311 KOps/s | 36.5865 KOps/s | +0.94% |
| `test_step_mdp_speed[False-False-True-False-False]` | 53.7410μs | 17.0494μs | 58.6531 KOps/s | 59.1478 KOps/s | -0.84% |
| `test_step_mdp_speed[False-False-False-True-True]` | 92.5510μs | 45.5689μs | 21.9448 KOps/s | 21.8576 KOps/s | +0.40% |
| `test_step_mdp_speed[False-False-False-True-False]` | 64.8200μs | 29.9043μs | 33.4400 KOps/s | 33.2427 KOps/s | +0.59% |
| `test_step_mdp_speed[False-False-False-False-True]` | 64.6910μs | 28.6735μs | 34.8754 KOps/s | 34.4419 KOps/s | +1.26% |
| `test_step_mdp_speed[False-False-False-False-False]` | 49.6610μs | 18.6401μs | 53.6477 KOps/s | 53.3043 KOps/s | +0.64% |
| `test_values[generalized_advantage_estimate-True-True]` | 25.0522ms | 24.5356ms | 40.7571 Ops/s | 40.9862 Ops/s | -0.56% |
| `test_values[vec_generalized_advantage_estimate-True-True]` | 0.1026s | 2.9396ms | 340.1783 Ops/s | 320.1125 Ops/s | **+6.27%** |
| `test_values[td0_return_estimate-False-False]` | 90.1510μs | 64.4813μs | 15.5084 KOps/s | 15.6960 KOps/s | -1.20% |
| `test_values[td1_return_estimate-False-False]` | 56.8741ms | 55.0894ms | 18.1523 Ops/s | 18.2683 Ops/s | -0.63% |
| `test_values[vec_td1_return_estimate-False-False]` | 1.2708ms | 1.0675ms | 936.7679 Ops/s | 942.6748 Ops/s | -0.63% |
| `test_values[td_lambda_return_estimate-True-False]` | 88.8969ms | 87.3155ms | 11.4527 Ops/s | 11.5497 Ops/s | -0.84% |
| `test_values[vec_td_lambda_return_estimate-True-False]` | 1.2439ms | 1.0654ms | 938.5729 Ops/s | 937.2659 Ops/s | +0.14% |
| `test_gae_speed[generalized_advantage_estimate-False-1-512]` | 24.9625ms | 24.6639ms | 40.5452 Ops/s | 40.9633 Ops/s | -1.02% |
| `test_gae_speed[vec_generalized_advantage_estimate-True-1-512]` | 1.0208ms | 0.7394ms | 1.3524 KOps/s | 1.3738 KOps/s | -1.56% |
| `test_gae_speed[vec_generalized_advantage_estimate-False-1-512]` | 0.7422ms | 0.6558ms | 1.5248 KOps/s | 1.5351 KOps/s | -0.67% |
| `test_gae_speed[vec_generalized_advantage_estimate-True-32-512]` | 1.5051ms | 1.4651ms | 682.5654 Ops/s | 686.6393 Ops/s | -0.59% |
| `test_gae_speed[vec_generalized_advantage_estimate-False-32-512]` | 0.7055ms | 0.6710ms | 1.4904 KOps/s | 1.5033 KOps/s | -0.86% |
| `test_dqn_speed[False-None]` | 0.1017s | 1.4258ms | 701.3683 Ops/s | 767.9257 Ops/s | **-8.67%** |
| `test_dqn_speed[False-backward]` | 1.8174ms | 1.7717ms | 564.4202 Ops/s | 555.7347 Ops/s | +1.56% |
| `test_dqn_speed[True-None]` | 1.2061ms | 0.5543ms | 1.8042 KOps/s | 1.8210 KOps/s | -0.92% |
| `test_dqn_speed[True-backward]` | 1.0364ms | 1.0015ms | 998.4845 Ops/s | 796.4257 Ops/s | **+25.37%** |
| `test_dqn_speed[reduce-overhead-None]` | 0.6690ms | 0.5571ms | 1.7951 KOps/s | 1.7773 KOps/s | +1.00% |
| `test_dqn_speed[reduce-overhead-backward]` | 1.0563ms | 1.0061ms | 993.9444 Ops/s | 997.0125 Ops/s | -0.31% |
| `test_ddpg_speed[False-None]` | 3.2381ms | 2.6199ms | 381.6947 Ops/s | 374.6476 Ops/s | +1.88% |
| `test_ddpg_speed[False-backward]` | 4.4109ms | 3.8976ms | 256.5679 Ops/s | 254.4655 Ops/s | +0.83% |
| `test_ddpg_speed[True-None]` | 1.3023ms | 1.2203ms | 819.4755 Ops/s | 813.9390 Ops/s | +0.68% |
| `test_ddpg_speed[True-backward]` | 2.2614ms | 2.2109ms | 452.3066 Ops/s | 395.1379 Ops/s | **+14.47%** |
| `test_ddpg_speed[reduce-overhead-None]` | 1.4017ms | 1.2282ms | 814.1687 Ops/s | 800.4739 Ops/s | +1.71% |
| `test_ddpg_speed[reduce-overhead-backward]` | 2.2562ms | 2.2015ms | 454.2308 Ops/s | 453.8883 Ops/s | +0.08% |
| `test_sac_speed[False-None]` | 8.2057ms | 7.3027ms | 136.9358 Ops/s | 133.9261 Ops/s | +2.25% |
| `test_sac_speed[False-backward]` | 11.0137ms | 10.5257ms | 95.0055 Ops/s | 93.4320 Ops/s | +1.68% |
| `test_sac_speed[True-None]` | 2.1527ms | 1.9734ms | 506.7368 Ops/s | 495.0174 Ops/s | +2.37% |
| `test_sac_speed[True-backward]` | 4.0362ms | 3.9144ms | 255.4687 Ops/s | 257.1092 Ops/s | -0.64% |
| `test_sac_speed[reduce-overhead-None]` | 2.1187ms | 2.0024ms | 499.4113 Ops/s | 500.3977 Ops/s | -0.20% |
| `test_sac_speed[reduce-overhead-backward]` | 4.2113ms | 3.9464ms | 253.3987 Ops/s | 256.6692 Ops/s | -1.27% |
| `test_redq_speed[False-None]` | 15.6241ms | 10.3683ms | 96.4476 Ops/s | 86.7418 Ops/s | **+11.19%** |
| `test_redq_speed[False-backward]` | 18.0721ms | 17.0875ms | 58.5224 Ops/s | 56.7032 Ops/s | +3.21% |
| `test_redq_speed[True-None]` | 3.9496ms | 3.4902ms | 286.5144 Ops/s | 290.7941 Ops/s | -1.47% |
| `test_redq_speed[True-backward]` | 8.7444ms | 8.5129ms | 117.4686 Ops/s | 118.0206 Ops/s | -0.47% |
| `test_redq_speed[reduce-overhead-None]` | 3.8102ms | 3.4790ms | 287.4352 Ops/s | 285.5574 Ops/s | +0.66% |
| `test_redq_speed[reduce-overhead-backward]` | 8.7349ms | 8.4341ms | 118.5669 Ops/s | 117.2273 Ops/s | +1.14% |
| `test_redq_deprec_speed[False-None]` | 10.7937ms | 10.3324ms | 96.7830 Ops/s | 96.2887 Ops/s | +0.51% |
| `test_redq_deprec_speed[False-backward]` | 15.5816ms | 15.1355ms | 66.0697 Ops/s | 66.1336 Ops/s | -0.10% |
| `test_redq_deprec_speed[True-None]` | 3.3536ms | 3.1838ms | 314.0935 Ops/s | 320.4441 Ops/s | -1.98% |
| `test_redq_deprec_speed[True-backward]` | 7.3032ms | 7.0799ms | 141.2458 Ops/s | 141.6902 Ops/s | -0.31% |
| `test_redq_deprec_speed[reduce-overhead-None]` | 3.8737ms | 3.1956ms | 312.9341 Ops/s | 321.0033 Ops/s | -2.51% |
| `test_redq_deprec_speed[reduce-overhead-backward]` | 7.2430ms | 7.0595ms | 141.6532 Ops/s | 149.3338 Ops/s | **-5.14%** |
| `test_td3_speed[False-None]` | 7.4733ms | 7.2809ms | 137.3452 Ops/s | 134.2344 Ops/s | +2.32% |
| `test_td3_speed[False-backward]` | 10.5159ms | 10.1183ms | 98.8307 Ops/s | 95.8246 Ops/s | +3.14% |
| `test_td3_speed[True-None]` | 1.9042ms | 1.8796ms | 532.0263 Ops/s | 534.4133 Ops/s | -0.45% |
| `test_td3_speed[True-backward]` | 3.7941ms | 3.6665ms | 272.7427 Ops/s | 264.8446 Ops/s | +2.98% |
| `test_td3_speed[reduce-overhead-None]` | 1.9650ms | 1.8789ms | 532.2366 Ops/s | 536.7999 Ops/s | -0.85% |
| `test_td3_speed[reduce-overhead-backward]` | 3.7057ms | 3.6354ms | 275.0705 Ops/s | 275.7223 Ops/s | -0.24% |
| `test_cql_speed[False-None]` | 26.9076ms | 24.4456ms | 40.9072 Ops/s | 40.5022 Ops/s | +1.00% |
| `test_cql_speed[False-backward]` | 38.1267ms | 33.6839ms | 29.6877 Ops/s | 29.7717 Ops/s | -0.28% |
| `test_cql_speed[True-None]` | 10.9977ms | 10.7259ms | 93.2320 Ops/s | 93.7045 Ops/s | -0.50% |
| `test_cql_speed[True-backward]` | 17.1489ms | 16.7213ms | 59.8040 Ops/s | 61.6674 Ops/s | -3.02% |
| `test_cql_speed[reduce-overhead-None]` | 11.0428ms | 10.7841ms | 92.7295 Ops/s | 94.9744 Ops/s | -2.36% |
| `test_cql_speed[reduce-overhead-backward]` | 16.9306ms | 16.6413ms | 60.0916 Ops/s | 61.4987 Ops/s | -2.29% |
| `test_a2c_speed[False-None]` | 5.4585ms | 5.1905ms | 192.6615 Ops/s | 187.1788 Ops/s | +2.93% |
| `test_a2c_speed[False-backward]` | 13.2703ms | 11.7929ms | 84.7971 Ops/s | 84.7910 Ops/s | +0.01% |
| `test_a2c_speed[True-None]` | 3.0988ms | 3.0151ms | 331.6617 Ops/s | 331.3115 Ops/s | +0.11% |
| `test_a2c_speed[True-backward]` | 8.5343ms | 8.3532ms | 119.7142 Ops/s | 121.6197 Ops/s | -1.57% |
| `test_a2c_speed[reduce-overhead-None]` | 3.1270ms | 2.9943ms | 333.9696 Ops/s | 331.6086 Ops/s | +0.71% |
| `test_a2c_speed[reduce-overhead-backward]` | 8.5749ms | 8.3376ms | 119.9384 Ops/s | 121.1155 Ops/s | -0.97% |
| `test_ppo_speed[False-None]` | 7.3128ms | 5.5867ms | 178.9968 Ops/s | 178.6870 Ops/s | +0.17% |
| `test_ppo_speed[False-backward]` | 12.5052ms | 12.2399ms | 81.7000 Ops/s | 83.3624 Ops/s | -1.99% |
| `test_ppo_speed[True-None]` | 3.4504ms | 3.3437ms | 299.0669 Ops/s | 299.9169 Ops/s | -0.28% |
| `test_ppo_speed[True-backward]` | 8.2645ms | 8.1059ms | 123.3670 Ops/s | 123.2819 Ops/s | +0.07% |
| `test_ppo_speed[reduce-overhead-None]` | 3.4862ms | 3.3623ms | 297.4195 Ops/s | 297.9063 Ops/s | -0.16% |
| `test_ppo_speed[reduce-overhead-backward]` | 8.2051ms | 8.0574ms | 124.1090 Ops/s | 125.3359 Ops/s | -0.98% |
| `test_reinforce_speed[False-None]` | 6.3816ms | 4.4038ms | 227.0771 Ops/s | 225.2111 Ops/s | +0.83% |
| `test_reinforce_speed[False-backward]` | 7.3813ms | 7.1988ms | 138.9127 Ops/s | 136.0435 Ops/s | +2.11% |
| `test_reinforce_speed[True-None]` | 2.4338ms | 2.2058ms | 453.3582 Ops/s | 438.9400 Ops/s | +3.28% |
test_reinforce_speed[True-backward] 7.2587ms 6.9821ms 143.2241 Ops/s 125.9508 Ops/s $\textbf{\color{#35bf28}+13.71\%}$
test_reinforce_speed[reduce-overhead-None] 2.3115ms 2.1952ms 455.5354 Ops/s 460.1964 Ops/s $\color{#d91a1a}-1.01\%$
test_reinforce_speed[reduce-overhead-backward] 7.3228ms 7.0373ms 142.0990 Ops/s 143.8459 Ops/s $\color{#d91a1a}-1.21\%$
test_iql_speed[False-None] 20.2237ms 19.4504ms 51.4128 Ops/s 51.6311 Ops/s $\color{#d91a1a}-0.42\%$
test_iql_speed[False-backward] 31.9622ms 29.9541ms 33.3845 Ops/s 33.5214 Ops/s $\color{#d91a1a}-0.41\%$
test_iql_speed[True-None] 6.9966ms 6.6212ms 151.0293 Ops/s 146.7898 Ops/s $\color{#35bf28}+2.89\%$
test_iql_speed[True-backward] 15.7603ms 15.3882ms 64.9849 Ops/s 65.2174 Ops/s $\color{#d91a1a}-0.36\%$
test_iql_speed[reduce-overhead-None] 7.0620ms 6.6729ms 149.8591 Ops/s 150.6892 Ops/s $\color{#d91a1a}-0.55\%$
test_iql_speed[reduce-overhead-backward] 15.7664ms 15.3621ms 65.0953 Ops/s 65.8698 Ops/s $\color{#d91a1a}-1.18\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.4999ms 6.3109ms 158.4570 Ops/s 157.6107 Ops/s $\color{#35bf28}+0.54\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7682ms 0.2623ms 3.8123 KOps/s 3.7924 KOps/s $\color{#35bf28}+0.53\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4471ms 0.2423ms 4.1273 KOps/s 4.1424 KOps/s $\color{#d91a1a}-0.37\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.2497ms 6.0290ms 165.8641 Ops/s 163.7141 Ops/s $\color{#35bf28}+1.31\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 2.4452ms 0.2541ms 3.9358 KOps/s 3.9284 KOps/s $\color{#35bf28}+0.19\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5227ms 0.2421ms 4.1298 KOps/s 4.2491 KOps/s $\color{#d91a1a}-2.81\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.5612ms 1.2939ms 772.8805 Ops/s 819.1815 Ops/s $\textbf{\color{#d91a1a}-5.65\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.3946ms 1.1856ms 843.4758 Ops/s 849.0907 Ops/s $\color{#d91a1a}-0.66\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.5549ms 6.2090ms 161.0566 Ops/s 158.4713 Ops/s $\color{#35bf28}+1.63\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 1.4117ms 0.4379ms 2.2837 KOps/s 2.3265 KOps/s $\color{#d91a1a}-1.84\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7008ms 0.4197ms 2.3829 KOps/s 2.3354 KOps/s $\color{#35bf28}+2.03\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.3357ms 6.0880ms 164.2563 Ops/s 164.5975 Ops/s $\color{#d91a1a}-0.21\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.6011ms 0.2626ms 3.8077 KOps/s 3.3078 KOps/s $\textbf{\color{#35bf28}+15.11\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.4518ms 0.2416ms 4.1391 KOps/s 3.6872 KOps/s $\textbf{\color{#35bf28}+12.26\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.3005ms 6.0702ms 164.7392 Ops/s 163.6414 Ops/s $\color{#35bf28}+0.67\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.4943ms 0.2528ms 3.9558 KOps/s 3.3071 KOps/s $\textbf{\color{#35bf28}+19.62\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 6.7767ms 0.2455ms 4.0730 KOps/s 4.1877 KOps/s $\color{#d91a1a}-2.74\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.7056ms 6.2014ms 161.2545 Ops/s 161.8023 Ops/s $\color{#d91a1a}-0.34\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.2907ms 0.4424ms 2.2606 KOps/s 2.2730 KOps/s $\color{#d91a1a}-0.55\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6187ms 0.4537ms 2.2043 KOps/s 2.5447 KOps/s $\textbf{\color{#d91a1a}-13.38\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4544s 14.3271ms 69.7977 Ops/s 191.4337 Ops/s $\textbf{\color{#d91a1a}-63.54\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 9.4784ms 2.0177ms 495.6022 Ops/s 444.7555 Ops/s $\textbf{\color{#35bf28}+11.43\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 6.9958ms 1.1670ms 856.8785 Ops/s 762.9166 Ops/s $\textbf{\color{#35bf28}+12.32\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.0470ms 5.3335ms 187.4949 Ops/s 190.0063 Ops/s $\color{#d91a1a}-1.32\%$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.7470ms 2.0094ms 497.6696 Ops/s 490.6418 Ops/s $\color{#35bf28}+1.43\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.8835ms 1.1501ms 869.4712 Ops/s 865.5948 Ops/s $\color{#35bf28}+0.45\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.3761s 12.9231ms 77.3808 Ops/s 35.1278 Ops/s $\textbf{\color{#35bf28}+120.28\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 3.9547ms 1.9373ms 516.1851 Ops/s 479.2062 Ops/s $\textbf{\color{#35bf28}+7.72\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 2.4034ms 1.2062ms 829.0675 Ops/s 748.5763 Ops/s $\textbf{\color{#35bf28}+10.75\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.3178ms 12.7252ms 78.5843 Ops/s 76.2990 Ops/s $\color{#35bf28}+3.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.9575ms 16.5663ms 60.3636 Ops/s 60.3302 Ops/s $\color{#35bf28}+0.06\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 18.0629ms 17.2880ms 57.8435 Ops/s 55.7938 Ops/s $\color{#35bf28}+3.67\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.1068ms 16.6153ms 60.1856 Ops/s 60.5591 Ops/s $\color{#d91a1a}-0.62\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 17.7099ms 17.2857ms 57.8512 Ops/s 56.3408 Ops/s $\color{#35bf28}+2.68\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 18.6230ms 17.9627ms 55.6709 Ops/s 56.0894 Ops/s $\color{#d91a1a}-0.75\%$
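
The percentage in the last column appears to be the relative change between the two throughput columns. A minimal sketch, assuming the first Ops/s value is this PR's run and the second is the baseline (the column headers are not shown here, and `relative_change` is a hypothetical helper, not part of the benchmark tooling):

```python
def relative_change(pr_ops: float, base_ops: float) -> float:
    """Percent change in throughput of the first column relative to the second."""
    return (pr_ops - base_ops) / base_ops * 100.0


# (first Ops/s column, second Ops/s column), copied from three rows of the table above.
rows = {
    "test_redq_speed[False-None]": (96.4476, 86.7418),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (69.7977, 191.4337),
    "test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400]": (77.3808, 35.1278),
}

for name, (pr_ops, base_ops) in rows.items():
    print(f"{name}: {relative_change(pr_ops, base_ops):+.2f}%")
```

Under that assumption the three rows reproduce the reported +11.19%, -63.54% and +120.28%. Rows reported in KOps/s work the same way, since both columns share the unit.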

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 8, 2024
ghstack-source-id: f72713742c6ef7936f1d560e4001b52f5cb641e8
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 8, 2024
ghstack-source-id: 6e84abfb640e6814c53da86ee60996269e739261
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 8, 2024
ghstack-source-id: 6d8a80a81ef1a97d1cd77cefc92af653c2d8e128
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 8, 2024
ghstack-source-id: 4c9984d9e75c7b8011756d5867a35c95097d848f
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 8, 2024
ghstack-source-id: 83d3d79c70b1b81e224593cd2f03149a08edf883
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: b03ec8b90bd26620906d5592e755e36e20971128
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: 55838b4e3bb8b984f14c85ba41e40ca96620571a
Pull Request resolved: #2358
[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 18, 2024
ghstack-source-id: e3d38491d6d88753a61178c4a5b8aeae11d1ae44
Pull Request resolved: #2358
vmoens added a commit that referenced this pull request Nov 19, 2024
ghstack-source-id: e3d38491d6d88753a61178c4a5b8aeae11d1ae44
Pull Request resolved: #2358
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request