[Doc] Add AOTInductor back #2564

Merged 1 commit on Nov 14, 2024

@vmoens (Contributor) commented Nov 14, 2024

Stack from ghstack (oldest at bottom):

[ghstack-poisoned]
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 774eb5973045861f284fdc67f74945b1eecdeaf2
Pull Request resolved: #2564

pytorch-bot bot commented Nov 14, 2024

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/rl/2564

Note: Links to docs will display an error until the docs builds have been completed.

❗ 1 Active SEV

There is 1 currently active SEV. If your PR is affected, please view it below:

❌ 18 New Failures, 2 Pending, 3 Unrelated Failures

As of commit 78ac0a5 with merge base 9d292a0:

NEW FAILURES - The following jobs have failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@facebook-github-bot added the "CLA Signed" label (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed) Nov 14, 2024
@vmoens merged commit 78ac0a5 into gh/vmoens/40/base Nov 14, 2024
16 of 28 checks passed
vmoens added a commit that referenced this pull request Nov 14, 2024
ghstack-source-id: 774eb5973045861f284fdc67f74945b1eecdeaf2
Pull Request resolved: #2564
@vmoens deleted the gh/vmoens/40/head branch November 14, 2024 14:26

$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of CPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}3$. Worsened: $\large\color{#d91a1a}39$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.4319s 0.4309s 2.3210 Ops/s 2.2665 Ops/s $\color{#35bf28}+2.40\%$
test_transformed 0.7135s 0.6199s 1.6131 Ops/s 1.6470 Ops/s $\color{#d91a1a}-2.06\%$
test_serial 1.3438s 1.3352s 0.7490 Ops/s 0.7455 Ops/s $\color{#35bf28}+0.46\%$
test_parallel 1.3023s 1.2901s 0.7751 Ops/s 0.7716 Ops/s $\color{#35bf28}+0.46\%$
test_step_mdp_speed[True-True-True-True-True] 0.2359ms 27.3223μs 36.6002 KOps/s 37.4960 KOps/s $\color{#d91a1a}-2.39\%$
test_step_mdp_speed[True-True-True-True-False] 73.1260μs 16.0322μs 62.3746 KOps/s 63.0476 KOps/s $\color{#d91a1a}-1.07\%$
test_step_mdp_speed[True-True-True-False-True] 47.2580μs 15.8814μs 62.9669 KOps/s 65.2681 KOps/s $\color{#d91a1a}-3.53\%$
test_step_mdp_speed[True-True-True-False-False] 59.7610μs 9.1563μs 109.2146 KOps/s 110.9245 KOps/s $\color{#d91a1a}-1.54\%$
test_step_mdp_speed[True-True-False-True-True] 90.1580μs 29.3258μs 34.0997 KOps/s 34.6462 KOps/s $\color{#d91a1a}-1.58\%$
test_step_mdp_speed[True-True-False-True-False] 52.9390μs 17.8238μs 56.1049 KOps/s 55.9115 KOps/s $\color{#35bf28}+0.35\%$
test_step_mdp_speed[True-True-False-False-True] 73.1870μs 17.2926μs 57.8282 KOps/s 58.7473 KOps/s $\color{#d91a1a}-1.56\%$
test_step_mdp_speed[True-True-False-False-False] 0.6259ms 10.6974μs 93.4806 KOps/s 93.7403 KOps/s $\color{#d91a1a}-0.28\%$
test_step_mdp_speed[True-False-True-True-True] 0.2248ms 32.2839μs 30.9752 KOps/s 32.4909 KOps/s $\color{#d91a1a}-4.66\%$
test_step_mdp_speed[True-False-True-True-False] 65.7520μs 19.5983μs 51.0247 KOps/s 52.0914 KOps/s $\color{#d91a1a}-2.05\%$
test_step_mdp_speed[True-False-True-False-True] 0.1521ms 17.7198μs 56.4341 KOps/s 58.3969 KOps/s $\color{#d91a1a}-3.36\%$
test_step_mdp_speed[True-False-True-False-False] 39.7050μs 10.8878μs 91.8455 KOps/s 93.3213 KOps/s $\color{#d91a1a}-1.58\%$
test_step_mdp_speed[True-False-False-True-True] 98.4040μs 32.2654μs 30.9929 KOps/s 31.4873 KOps/s $\color{#d91a1a}-1.57\%$
test_step_mdp_speed[True-False-False-True-False] 68.4470μs 21.0566μs 47.4911 KOps/s 48.1469 KOps/s $\color{#d91a1a}-1.36\%$
test_step_mdp_speed[True-False-False-False-True] 56.8160μs 18.7298μs 53.3907 KOps/s 53.5028 KOps/s $\color{#d91a1a}-0.21\%$
test_step_mdp_speed[True-False-False-False-False] 63.1680μs 12.2607μs 81.5615 KOps/s 81.8459 KOps/s $\color{#d91a1a}-0.35\%$
test_step_mdp_speed[False-True-True-True-True] 87.2030μs 30.9030μs 32.3593 KOps/s 32.9435 KOps/s $\color{#d91a1a}-1.77\%$
test_step_mdp_speed[False-True-True-True-False] 60.0920μs 19.2442μs 51.9637 KOps/s 51.9315 KOps/s $\color{#35bf28}+0.06\%$
test_step_mdp_speed[False-True-True-False-True] 2.4347ms 20.4711μs 48.8493 KOps/s 50.9860 KOps/s $\color{#d91a1a}-4.19\%$
test_step_mdp_speed[False-True-True-False-False] 0.1730ms 12.0809μs 82.7750 KOps/s 83.7277 KOps/s $\color{#d91a1a}-1.14\%$
test_step_mdp_speed[False-True-False-True-True] 70.1710μs 32.4468μs 30.8196 KOps/s 30.3694 KOps/s $\color{#35bf28}+1.48\%$
test_step_mdp_speed[False-True-False-True-False] 0.1698ms 20.8115μs 48.0505 KOps/s 48.1175 KOps/s $\color{#d91a1a}-0.14\%$
test_step_mdp_speed[False-True-False-False-True] 85.2290μs 21.3214μs 46.9011 KOps/s 46.1893 KOps/s $\color{#35bf28}+1.54\%$
test_step_mdp_speed[False-True-False-False-False] 44.4930μs 13.7965μs 72.4820 KOps/s 73.7059 KOps/s $\color{#d91a1a}-1.66\%$
test_step_mdp_speed[False-False-True-True-True] 0.1064ms 33.9914μs 29.4192 KOps/s 29.7416 KOps/s $\color{#d91a1a}-1.08\%$
test_step_mdp_speed[False-False-True-True-False] 0.6262ms 22.4882μs 44.4677 KOps/s 44.8458 KOps/s $\color{#d91a1a}-0.84\%$
test_step_mdp_speed[False-False-True-False-True] 65.2710μs 21.4105μs 46.7060 KOps/s 47.8513 KOps/s $\color{#d91a1a}-2.39\%$
test_step_mdp_speed[False-False-True-False-False] 71.9640μs 13.7324μs 72.8206 KOps/s 74.6605 KOps/s $\color{#d91a1a}-2.46\%$
test_step_mdp_speed[False-False-False-True-True] 0.1033ms 35.2782μs 28.3461 KOps/s 28.7187 KOps/s $\color{#d91a1a}-1.30\%$
test_step_mdp_speed[False-False-False-True-False] 63.3490μs 24.0395μs 41.5983 KOps/s 41.5974 KOps/s $+0.00\%$
test_step_mdp_speed[False-False-False-False-True] 52.1370μs 22.6717μs 44.1078 KOps/s 44.0921 KOps/s $\color{#35bf28}+0.04\%$
test_step_mdp_speed[False-False-False-False-False] 59.1900μs 15.0124μs 66.6114 KOps/s 67.9967 KOps/s $\color{#d91a1a}-2.04\%$
test_values[generalized_advantage_estimate-True-True] 10.5980ms 9.8513ms 101.5093 Ops/s 102.9201 Ops/s $\color{#d91a1a}-1.37\%$
test_values[vec_generalized_advantage_estimate-True-True] 37.3204ms 35.6334ms 28.0636 Ops/s 29.8769 Ops/s $\textbf{\color{#d91a1a}-6.07\%}$
test_values[td0_return_estimate-False-False] 0.2438ms 0.1843ms 5.4253 KOps/s 5.1463 KOps/s $\textbf{\color{#35bf28}+5.42\%}$
test_values[td1_return_estimate-False-False] 28.1687ms 25.0481ms 39.9232 Ops/s 40.9881 Ops/s $\color{#d91a1a}-2.60\%$
test_values[vec_td1_return_estimate-False-False] 37.6548ms 35.8617ms 27.8849 Ops/s 29.8439 Ops/s $\textbf{\color{#d91a1a}-6.56\%}$
test_values[td_lambda_return_estimate-True-False] 38.7630ms 35.5513ms 28.1284 Ops/s 28.6507 Ops/s $\color{#d91a1a}-1.82\%$
test_values[vec_td_lambda_return_estimate-True-False] 37.3170ms 35.8659ms 27.8817 Ops/s 29.7984 Ops/s $\textbf{\color{#d91a1a}-6.43\%}$
test_gae_speed[generalized_advantage_estimate-False-1-512] 8.8224ms 8.5108ms 117.4973 Ops/s 119.5975 Ops/s $\color{#d91a1a}-1.76\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 2.3794ms 1.9400ms 515.4649 Ops/s 491.2866 Ops/s $\color{#35bf28}+4.92\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.6057ms 0.3594ms 2.7823 KOps/s 2.7586 KOps/s $\color{#35bf28}+0.86\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 47.5215ms 44.4989ms 22.4725 Ops/s 24.6610 Ops/s $\textbf{\color{#d91a1a}-8.87\%}$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 3.2520ms 3.0708ms 325.6479 Ops/s 327.0568 Ops/s $\color{#d91a1a}-0.43\%$
test_dqn_speed[False-None] 6.1949ms 1.3452ms 743.4044 Ops/s 740.8264 Ops/s $\color{#35bf28}+0.35\%$
test_dqn_speed[False-backward] 2.4734ms 1.8754ms 533.2085 Ops/s 541.2518 Ops/s $\color{#d91a1a}-1.49\%$
test_dqn_speed[True-None] 0.9217ms 0.4653ms 2.1490 KOps/s 2.1107 KOps/s $\color{#35bf28}+1.81\%$
test_dqn_speed[True-backward] 1.0058ms 0.9094ms 1.0996 KOps/s 1.1063 KOps/s $\color{#d91a1a}-0.60\%$
test_dqn_speed[reduce-overhead-None] 0.5672ms 0.4684ms 2.1351 KOps/s 2.1245 KOps/s $\color{#35bf28}+0.50\%$
test_dqn_speed[reduce-overhead-backward] 1.0018ms 0.9202ms 1.0868 KOps/s 1.0982 KOps/s $\color{#d91a1a}-1.04\%$
test_ddpg_speed[False-None] 4.5346ms 2.8401ms 352.0969 Ops/s 356.4370 Ops/s $\color{#d91a1a}-1.22\%$
test_ddpg_speed[False-backward] 4.3682ms 4.0489ms 246.9809 Ops/s 252.2957 Ops/s $\color{#d91a1a}-2.11\%$
test_ddpg_speed[True-None] 1.7671ms 1.0221ms 978.3445 Ops/s 996.2250 Ops/s $\color{#d91a1a}-1.79\%$
test_ddpg_speed[True-backward] 2.4479ms 2.0523ms 487.2534 Ops/s 514.3913 Ops/s $\textbf{\color{#d91a1a}-5.28\%}$
test_ddpg_speed[reduce-overhead-None] 2.0080ms 1.0298ms 971.0757 Ops/s 998.2227 Ops/s $\color{#d91a1a}-2.72\%$
test_ddpg_speed[reduce-overhead-backward] 2.0341ms 1.9433ms 514.5904 Ops/s 511.5013 Ops/s $\color{#35bf28}+0.60\%$
test_sac_speed[False-None] 8.5579ms 7.9727ms 125.4287 Ops/s 124.5568 Ops/s $\color{#35bf28}+0.70\%$
test_sac_speed[False-backward] 12.7859ms 12.0417ms 83.0445 Ops/s 89.3723 Ops/s $\textbf{\color{#d91a1a}-7.08\%}$
test_sac_speed[True-None] 2.3560ms 1.8473ms 541.3271 Ops/s 539.0540 Ops/s $\color{#35bf28}+0.42\%$
test_sac_speed[True-backward] 4.1055ms 3.7932ms 263.6275 Ops/s 279.5320 Ops/s $\textbf{\color{#d91a1a}-5.69\%}$
test_sac_speed[reduce-overhead-None] 2.1999ms 1.9663ms 508.5773 Ops/s 538.2780 Ops/s $\textbf{\color{#d91a1a}-5.52\%}$
test_sac_speed[reduce-overhead-backward] 4.0848ms 3.9265ms 254.6827 Ops/s 280.9728 Ops/s $\textbf{\color{#d91a1a}-9.36\%}$
test_redq_speed[False-None] 15.5184ms 13.3879ms 74.6945 Ops/s 76.2073 Ops/s $\color{#d91a1a}-1.99\%$
test_redq_speed[False-backward] 24.2283ms 23.2737ms 42.9669 Ops/s 43.9077 Ops/s $\color{#d91a1a}-2.14\%$
test_redq_speed[True-None] 6.7066ms 5.8787ms 170.1060 Ops/s 182.7352 Ops/s $\textbf{\color{#d91a1a}-6.91\%}$
test_redq_speed[True-backward] 14.0966ms 13.3857ms 74.7065 Ops/s 75.2759 Ops/s $\color{#d91a1a}-0.76\%$
test_redq_speed[reduce-overhead-None] 6.7765ms 5.8967ms 169.5876 Ops/s 182.5667 Ops/s $\textbf{\color{#d91a1a}-7.11\%}$
test_redq_speed[reduce-overhead-backward] 13.9996ms 13.3746ms 74.7688 Ops/s 76.8699 Ops/s $\color{#d91a1a}-2.73\%$
test_redq_deprec_speed[False-None] 16.6616ms 14.6278ms 68.3631 Ops/s 69.8646 Ops/s $\color{#d91a1a}-2.15\%$
test_redq_deprec_speed[False-backward] 29.5058ms 20.9418ms 47.7514 Ops/s 49.5368 Ops/s $\color{#d91a1a}-3.60\%$
test_redq_deprec_speed[True-None] 5.5145ms 4.6242ms 216.2518 Ops/s 216.0258 Ops/s $\color{#35bf28}+0.10\%$
test_redq_deprec_speed[True-backward] 10.0169ms 9.6158ms 103.9955 Ops/s 103.9060 Ops/s $\color{#35bf28}+0.09\%$
test_redq_deprec_speed[reduce-overhead-None] 5.3253ms 4.6842ms 213.4816 Ops/s 212.4229 Ops/s $\color{#35bf28}+0.50\%$
test_redq_deprec_speed[reduce-overhead-backward] 11.2426ms 9.7395ms 102.6742 Ops/s 105.6877 Ops/s $\color{#d91a1a}-2.85\%$
test_td3_speed[False-None] 9.3744ms 8.9635ms 111.5633 Ops/s 106.8497 Ops/s $\color{#35bf28}+4.41\%$
test_td3_speed[False-backward] 12.5756ms 11.9482ms 83.6945 Ops/s 82.3406 Ops/s $\color{#35bf28}+1.64\%$
test_td3_speed[True-None] 2.5162ms 2.1323ms 468.9671 Ops/s 554.8604 Ops/s $\textbf{\color{#d91a1a}-15.48\%}$
test_td3_speed[True-backward] 4.4769ms 4.1085ms 243.3994 Ops/s 272.9553 Ops/s $\textbf{\color{#d91a1a}-10.83\%}$
test_td3_speed[reduce-overhead-None] 2.1358ms 1.9362ms 516.4636 Ops/s 558.3436 Ops/s $\textbf{\color{#d91a1a}-7.50\%}$
test_td3_speed[reduce-overhead-backward] 4.2026ms 3.7797ms 264.5724 Ops/s 288.2938 Ops/s $\textbf{\color{#d91a1a}-8.23\%}$
test_cql_speed[False-None] 47.1136ms 38.1879ms 26.1863 Ops/s 27.2735 Ops/s $\color{#d91a1a}-3.99\%$
test_cql_speed[False-backward] 51.5253ms 48.8341ms 20.4775 Ops/s 21.7075 Ops/s $\textbf{\color{#d91a1a}-5.67\%}$
test_cql_speed[True-None] 18.0746ms 16.6239ms 60.1544 Ops/s 62.3020 Ops/s $\color{#d91a1a}-3.45\%$
test_cql_speed[True-backward] 25.4420ms 24.2876ms 41.1733 Ops/s 43.1037 Ops/s $\color{#d91a1a}-4.48\%$
test_cql_speed[reduce-overhead-None] 17.6968ms 16.5382ms 60.4661 Ops/s 61.9924 Ops/s $\color{#d91a1a}-2.46\%$
test_cql_speed[reduce-overhead-backward] 24.6109ms 23.7856ms 42.0422 Ops/s 43.4689 Ops/s $\color{#d91a1a}-3.28\%$
test_a2c_speed[False-None] 8.9050ms 7.8643ms 127.1577 Ops/s 132.4567 Ops/s $\color{#d91a1a}-4.00\%$
test_a2c_speed[False-backward] 21.1008ms 15.7478ms 63.5010 Ops/s 51.2115 Ops/s $\textbf{\color{#35bf28}+24.00\%}$
test_a2c_speed[True-None] 8.9486ms 4.0369ms 247.7159 Ops/s 291.5284 Ops/s $\textbf{\color{#d91a1a}-15.03\%}$
test_a2c_speed[True-backward] 10.9375ms 10.6962ms 93.4911 Ops/s 99.8853 Ops/s $\textbf{\color{#d91a1a}-6.40\%}$
test_a2c_speed[reduce-overhead-None] 4.5491ms 3.7810ms 264.4801 Ops/s 297.2476 Ops/s $\textbf{\color{#d91a1a}-11.02\%}$
test_a2c_speed[reduce-overhead-backward] 11.2153ms 10.8694ms 92.0012 Ops/s 98.1232 Ops/s $\textbf{\color{#d91a1a}-6.24\%}$
test_ppo_speed[False-None] 8.7201ms 8.1992ms 121.9628 Ops/s 122.0636 Ops/s $\color{#d91a1a}-0.08\%$
test_ppo_speed[False-backward] 17.8374ms 16.1976ms 61.7377 Ops/s 64.5452 Ops/s $\color{#d91a1a}-4.35\%$
test_ppo_speed[True-None] 5.4234ms 4.3113ms 231.9479 Ops/s 254.7221 Ops/s $\textbf{\color{#d91a1a}-8.94\%}$
test_ppo_speed[True-backward] 11.0398ms 10.6780ms 93.6501 Ops/s 96.6136 Ops/s $\color{#d91a1a}-3.07\%$
test_ppo_speed[reduce-overhead-None] 4.6546ms 4.2845ms 233.3996 Ops/s 237.8366 Ops/s $\color{#d91a1a}-1.87\%$
test_ppo_speed[reduce-overhead-backward] 10.9783ms 10.5490ms 94.7956 Ops/s 101.4052 Ops/s $\textbf{\color{#d91a1a}-6.52\%}$
test_reinforce_speed[False-None] 9.1804ms 7.0931ms 140.9819 Ops/s 151.3790 Ops/s $\textbf{\color{#d91a1a}-6.87\%}$
test_reinforce_speed[False-backward] 11.5825ms 10.6034ms 94.3096 Ops/s 98.6859 Ops/s $\color{#d91a1a}-4.43\%$
test_reinforce_speed[True-None] 3.8084ms 3.1945ms 313.0421 Ops/s 363.3004 Ops/s $\textbf{\color{#d91a1a}-13.83\%}$
test_reinforce_speed[True-backward] 10.3246ms 9.6009ms 104.1570 Ops/s 102.8614 Ops/s $\color{#35bf28}+1.26\%$
test_reinforce_speed[reduce-overhead-None] 3.7027ms 3.1348ms 318.9985 Ops/s 364.0058 Ops/s $\textbf{\color{#d91a1a}-12.36\%}$
test_reinforce_speed[reduce-overhead-backward] 9.8102ms 9.5877ms 104.3008 Ops/s 112.1654 Ops/s $\textbf{\color{#d91a1a}-7.01\%}$
test_iql_speed[False-None] 35.7807ms 34.0762ms 29.3460 Ops/s 30.5821 Ops/s $\color{#d91a1a}-4.04\%$
test_iql_speed[False-backward] 48.7970ms 47.5594ms 21.0263 Ops/s 21.4990 Ops/s $\color{#d91a1a}-2.20\%$
test_iql_speed[True-None] 12.4710ms 11.3203ms 88.3369 Ops/s 89.5011 Ops/s $\color{#d91a1a}-1.30\%$
test_iql_speed[True-backward] 36.4678ms 23.3526ms 42.8218 Ops/s 43.9619 Ops/s $\color{#d91a1a}-2.59\%$
test_iql_speed[reduce-overhead-None] 12.4365ms 11.2649ms 88.7714 Ops/s 88.9455 Ops/s $\color{#d91a1a}-0.20\%$
test_iql_speed[reduce-overhead-backward] 24.1537ms 23.1376ms 43.2198 Ops/s 44.1145 Ops/s $\color{#d91a1a}-2.03\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 6.2420ms 5.7552ms 173.7545 Ops/s 192.9153 Ops/s $\textbf{\color{#d91a1a}-9.93\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.1950ms 0.5462ms 1.8308 KOps/s 1.8871 KOps/s $\color{#d91a1a}-2.99\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.7769ms 0.5134ms 1.9478 KOps/s 1.9954 KOps/s $\color{#d91a1a}-2.39\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.4144ms 5.1928ms 192.5751 Ops/s 207.4674 Ops/s $\textbf{\color{#d91a1a}-7.18\%}$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.4286s 0.8600ms 1.1628 KOps/s 1.9517 KOps/s $\textbf{\color{#d91a1a}-40.42\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.7338ms 0.4979ms 2.0083 KOps/s 2.0391 KOps/s $\color{#d91a1a}-1.51\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 2.2411ms 1.7058ms 586.2474 Ops/s 608.0592 Ops/s $\color{#d91a1a}-3.59\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 4.4115ms 1.6965ms 589.4471 Ops/s 621.4287 Ops/s $\textbf{\color{#d91a1a}-5.15\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0643ms 5.4645ms 183.0004 Ops/s 201.8328 Ops/s $\textbf{\color{#d91a1a}-9.33\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.0966ms 0.6794ms 1.4718 KOps/s 1.5191 KOps/s $\color{#d91a1a}-3.12\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.9854ms 0.6439ms 1.5530 KOps/s 1.5881 KOps/s $\color{#d91a1a}-2.21\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 8.4761ms 5.4423ms 183.7453 Ops/s 206.1280 Ops/s $\textbf{\color{#d91a1a}-10.86\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.7926ms 0.5359ms 1.8660 KOps/s 1.9295 KOps/s $\color{#d91a1a}-3.29\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 8.2145ms 0.5300ms 1.8869 KOps/s 1.9333 KOps/s $\color{#d91a1a}-2.40\%$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 9.5370ms 5.1276ms 195.0216 Ops/s 209.6211 Ops/s $\textbf{\color{#d91a1a}-6.96\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.5788ms 0.5320ms 1.8795 KOps/s 1.9692 KOps/s $\color{#d91a1a}-4.55\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6769ms 0.4989ms 2.0044 KOps/s 2.0641 KOps/s $\color{#d91a1a}-2.89\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0470ms 5.5712ms 179.4959 Ops/s 201.6207 Ops/s $\textbf{\color{#d91a1a}-10.97\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 3.0220ms 0.6830ms 1.4642 KOps/s 1.5072 KOps/s $\color{#d91a1a}-2.86\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.8996ms 0.6384ms 1.5663 KOps/s 1.5898 KOps/s $\color{#d91a1a}-1.48\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 0.4957s 14.5968ms 68.5080 Ops/s 231.3112 Ops/s $\textbf{\color{#d91a1a}-70.38\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 3.5557ms 2.0961ms 477.0783 Ops/s 417.6889 Ops/s $\textbf{\color{#35bf28}+14.22\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 10.1201ms 1.4640ms 683.0612 Ops/s 802.7186 Ops/s $\textbf{\color{#d91a1a}-14.91\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 7.1617ms 4.7575ms 210.1957 Ops/s 240.1531 Ops/s $\textbf{\color{#d91a1a}-12.47\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 7.3368ms 2.4449ms 409.0209 Ops/s 429.6482 Ops/s $\color{#d91a1a}-4.80\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 6.2355ms 1.4877ms 672.1780 Ops/s 698.2684 Ops/s $\color{#d91a1a}-3.74\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 0.4765s 14.6490ms 68.2639 Ops/s 236.9049 Ops/s $\textbf{\color{#d91a1a}-71.19\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 10.2426ms 2.6209ms 381.5551 Ops/s 382.1982 Ops/s $\color{#d91a1a}-0.17\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 6.6552ms 1.6566ms 603.6325 Ops/s 649.6024 Ops/s $\textbf{\color{#d91a1a}-7.08\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.8953ms 11.3995ms 87.7232 Ops/s 86.8798 Ops/s $\color{#35bf28}+0.97\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 19.1487ms 14.6609ms 68.2087 Ops/s 68.9008 Ops/s $\color{#d91a1a}-1.00\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 21.5156ms 20.3601ms 49.1157 Ops/s 49.5527 Ops/s $\color{#d91a1a}-0.88\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 15.6839ms 14.7155ms 67.9558 Ops/s 68.7207 Ops/s $\color{#d91a1a}-1.11\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 21.1917ms 20.2886ms 49.2888 Ops/s 50.0684 Ops/s $\color{#d91a1a}-1.56\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 17.9612ms 15.8628ms 63.0406 Ops/s 61.3220 Ops/s $\color{#35bf28}+2.80\%$
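
For readers checking the Change column by hand, here is a minimal sketch of how it appears to be derived from the two throughput columns. It assumes Change is the relative difference between "Ops" and "Ops on Repo HEAD", and that a row is bolded (and counted as Improved or Worsened) when the magnitude reaches roughly 5%. Both are inferences from the rows above, not something the benchmark bot states. The function names and the `threshold` parameter are hypothetical.

```python
def relative_change(ops: float, head_ops: float) -> float:
    """Percent change of this PR's throughput relative to repo HEAD.

    Both arguments must be in the same unit (convert KOps/s to Ops/s first
    when a row mixes units).
    """
    return (ops - head_ops) / head_ops * 100.0


def classify(change_pct: float, threshold: float = 5.0) -> str:
    """Label a row; the 5% threshold is an assumption inferred from the bold rows."""
    if change_pct >= threshold:
        return "improved"
    if change_pct <= -threshold:
        return "worsened"
    return "unchanged"


# (Ops, Ops on Repo HEAD) pairs copied from three rows of the CPU table.
rows = {
    "test_simple": (2.3210, 2.2665),
    "test_values[td0_return_estimate-False-False]": (5.4253, 5.1463),
    "test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400]": (68.5080, 231.3112),
}

for name, (ops, head_ops) in rows.items():
    pct = relative_change(ops, head_ops)
    print(f"{name}: {pct:+.2f}% ({classify(pct)})")
```

Because the table shows rounded throughput values, recomputing a row this way can differ from the printed Change in the last digit.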


$\color{#D29922}\textsf{\Large⚠\kern{0.2cm}\normalsize Warning}$ Result of GPU Benchmark Tests

Total Benchmarks: 149. Improved: $\large\color{#35bf28}13$. Worsened: $\large\color{#d91a1a}8$.

Name Max Mean Ops Ops on Repo HEAD Change
test_simple 0.7276s 0.7256s 1.3781 Ops/s 1.3394 Ops/s $\color{#35bf28}+2.89\%$
test_transformed 0.9634s 0.9625s 1.0390 Ops/s 1.0291 Ops/s $\color{#35bf28}+0.96\%$
test_serial 2.0868s 2.0795s 0.4809 Ops/s 0.4800 Ops/s $\color{#35bf28}+0.18\%$
test_parallel 2.0863s 1.9951s 0.5012 Ops/s 0.5179 Ops/s $\color{#d91a1a}-3.22\%$
test_step_mdp_speed[True-True-True-True-True] 0.1890ms 34.0951μs 29.3297 KOps/s 28.7424 KOps/s $\color{#35bf28}+2.04\%$
test_step_mdp_speed[True-True-True-True-False] 52.6830μs 19.9968μs 50.0080 KOps/s 51.3172 KOps/s $\color{#d91a1a}-2.55\%$
test_step_mdp_speed[True-True-True-False-True] 49.6020μs 19.0968μs 52.3647 KOps/s 52.2679 KOps/s $\color{#35bf28}+0.19\%$
test_step_mdp_speed[True-True-True-False-False] 39.0920μs 11.1877μs 89.3837 KOps/s 91.4170 KOps/s $\color{#d91a1a}-2.22\%$
test_step_mdp_speed[True-True-False-True-True] 71.0840μs 36.8130μs 27.1643 KOps/s 26.5759 KOps/s $\color{#35bf28}+2.21\%$
test_step_mdp_speed[True-True-False-True-False] 46.1430μs 21.3770μs 46.7793 KOps/s 46.0110 KOps/s $\color{#35bf28}+1.67\%$
test_step_mdp_speed[True-True-False-False-True] 65.1030μs 20.5875μs 48.5731 KOps/s 46.6780 KOps/s $\color{#35bf28}+4.06\%$
test_step_mdp_speed[True-True-False-False-False] 46.2220μs 12.9069μs 77.4779 KOps/s 77.1027 KOps/s $\color{#35bf28}+0.49\%$
test_step_mdp_speed[True-False-True-True-True] 73.3340μs 38.5383μs 25.9482 KOps/s 25.4747 KOps/s $\color{#35bf28}+1.86\%$
test_step_mdp_speed[True-False-True-True-False] 49.7130μs 23.9536μs 41.7473 KOps/s 42.3922 KOps/s $\color{#d91a1a}-1.52\%$
test_step_mdp_speed[True-False-True-False-True] 55.6730μs 21.0043μs 47.6092 KOps/s 47.3724 KOps/s $\color{#35bf28}+0.50\%$
test_step_mdp_speed[True-False-True-False-False] 42.1520μs 13.2349μs 75.5579 KOps/s 76.2341 KOps/s $\color{#d91a1a}-0.89\%$
test_step_mdp_speed[True-False-False-True-True] 81.2740μs 41.0128μs 24.3826 KOps/s 24.1688 KOps/s $\color{#35bf28}+0.88\%$
test_step_mdp_speed[True-False-False-True-False] 56.5130μs 25.5867μs 39.0828 KOps/s 39.1994 KOps/s $\color{#d91a1a}-0.30\%$
test_step_mdp_speed[True-False-False-False-True] 51.3430μs 22.6981μs 44.0565 KOps/s 43.7689 KOps/s $\color{#35bf28}+0.66\%$
test_step_mdp_speed[True-False-False-False-False] 47.8720μs 14.9034μs 67.0987 KOps/s 66.9958 KOps/s $\color{#35bf28}+0.15\%$
test_step_mdp_speed[False-True-True-True-True] 74.9840μs 38.8527μs 25.7383 KOps/s 25.9822 KOps/s $\color{#d91a1a}-0.94\%$
test_step_mdp_speed[False-True-True-True-False] 56.4820μs 23.7705μs 42.0690 KOps/s 42.3806 KOps/s $\color{#d91a1a}-0.74\%$
test_step_mdp_speed[False-True-True-False-True] 57.4130μs 24.8549μs 40.2335 KOps/s 39.9970 KOps/s $\color{#35bf28}+0.59\%$
test_step_mdp_speed[False-True-True-False-False] 44.2020μs 14.7835μs 67.6431 KOps/s 69.0946 KOps/s $\color{#d91a1a}-2.10\%$
test_step_mdp_speed[False-True-False-True-True] 76.3540μs 41.2773μs 24.2264 KOps/s 24.2648 KOps/s $\color{#d91a1a}-0.16\%$
test_step_mdp_speed[False-True-False-True-False] 55.9430μs 25.8517μs 38.6821 KOps/s 38.5715 KOps/s $\color{#35bf28}+0.29\%$
test_step_mdp_speed[False-True-False-False-True] 3.5512ms 26.7534μs 37.3785 KOps/s 36.8937 KOps/s $\color{#35bf28}+1.31\%$
test_step_mdp_speed[False-True-False-False-False] 44.5020μs 16.5906μs 60.2749 KOps/s 60.6936 KOps/s $\color{#d91a1a}-0.69\%$
test_step_mdp_speed[False-False-True-True-True] 79.6740μs 42.5857μs 23.4821 KOps/s 23.1866 KOps/s $\color{#35bf28}+1.27\%$
test_step_mdp_speed[False-False-True-True-False] 52.7130μs 27.6955μs 36.1070 KOps/s 35.9773 KOps/s $\color{#35bf28}+0.36\%$
test_step_mdp_speed[False-False-True-False-True] 58.5630μs 25.9242μs 38.5740 KOps/s 37.1573 KOps/s $\color{#35bf28}+3.81\%$
test_step_mdp_speed[False-False-True-False-False] 39.3920μs 16.4483μs 60.7964 KOps/s 60.6824 KOps/s $\color{#35bf28}+0.19\%$
test_step_mdp_speed[False-False-False-True-True] 68.6240μs 43.3071μs 23.0909 KOps/s 22.5302 KOps/s $\color{#35bf28}+2.49\%$
test_step_mdp_speed[False-False-False-True-False] 60.8930μs 29.4766μs 33.9252 KOps/s 34.1798 KOps/s $\color{#d91a1a}-0.75\%$
test_step_mdp_speed[False-False-False-False-True] 64.1830μs 28.2973μs 35.3391 KOps/s 35.3768 KOps/s $\color{#d91a1a}-0.11\%$
test_step_mdp_speed[False-False-False-False-False] 45.3030μs 18.2593μs 54.7666 KOps/s 54.6477 KOps/s $\color{#35bf28}+0.22\%$
test_values[generalized_advantage_estimate-True-True] 24.4975ms 24.0054ms 41.6574 Ops/s 41.8457 Ops/s $\color{#d91a1a}-0.45\%$
test_values[vec_generalized_advantage_estimate-True-True] 0.1018s 2.9221ms 342.2194 Ops/s 358.4787 Ops/s $\color{#d91a1a}-4.54\%$
test_values[td0_return_estimate-False-False] 89.7450μs 66.2227μs 15.1006 KOps/s 15.1232 KOps/s $\color{#d91a1a}-0.15\%$
test_values[td1_return_estimate-False-False] 54.1826ms 53.8824ms 18.5589 Ops/s 18.5872 Ops/s $\color{#d91a1a}-0.15\%$
test_values[vec_td1_return_estimate-False-False] 1.3436ms 1.0724ms 932.5136 Ops/s 936.7289 Ops/s $\color{#d91a1a}-0.45\%$
test_values[td_lambda_return_estimate-True-False] 94.3547ms 86.8423ms 11.5151 Ops/s 11.6596 Ops/s $\color{#d91a1a}-1.24\%$
test_values[vec_td_lambda_return_estimate-True-False] 1.3965ms 1.0683ms 936.0869 Ops/s 934.1530 Ops/s $\color{#35bf28}+0.21\%$
test_gae_speed[generalized_advantage_estimate-False-1-512] 23.9360ms 23.7173ms 42.1633 Ops/s 42.1843 Ops/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-1-512] 1.0105ms 0.7330ms 1.3642 KOps/s 1.3483 KOps/s $\color{#35bf28}+1.18\%$
test_gae_speed[vec_generalized_advantage_estimate-False-1-512] 0.7541ms 0.6501ms 1.5383 KOps/s 1.5391 KOps/s $\color{#d91a1a}-0.05\%$
test_gae_speed[vec_generalized_advantage_estimate-True-32-512] 1.5208ms 1.4629ms 683.5734 Ops/s 684.1504 Ops/s $\color{#d91a1a}-0.08\%$
test_gae_speed[vec_generalized_advantage_estimate-False-32-512] 0.7509ms 0.6663ms 1.5009 KOps/s 1.5067 KOps/s $\color{#d91a1a}-0.39\%$
test_dqn_speed[False-None] 0.1014s 1.4108ms 708.8295 Ops/s 767.3344 Ops/s $\textbf{\color{#d91a1a}-7.62\%}$
test_dqn_speed[False-backward] 2.1650ms 1.8005ms 555.3997 Ops/s 559.2641 Ops/s $\color{#d91a1a}-0.69\%$
test_dqn_speed[True-None] 0.8121ms 0.5599ms 1.7860 KOps/s 1.7482 KOps/s $\color{#35bf28}+2.16\%$
test_dqn_speed[True-backward] 1.0468ms 0.9916ms 1.0085 KOps/s 822.1699 Ops/s $\textbf{\color{#35bf28}+22.67\%}$
test_dqn_speed[reduce-overhead-None] 0.6932ms 0.5614ms 1.7813 KOps/s 1.7170 KOps/s $\color{#35bf28}+3.74\%$
test_dqn_speed[reduce-overhead-backward] 1.0617ms 0.9981ms 1.0019 KOps/s 998.8215 Ops/s $\color{#35bf28}+0.30\%$
test_ddpg_speed[False-None] 3.0864ms 2.5850ms 386.8516 Ops/s 380.9973 Ops/s $\color{#35bf28}+1.54\%$
test_ddpg_speed[False-backward] 4.0739ms 3.8666ms 258.6249 Ops/s 258.1007 Ops/s $\color{#35bf28}+0.20\%$
test_ddpg_speed[True-None] 1.5996ms 1.2379ms 807.8011 Ops/s 811.9775 Ops/s $\color{#d91a1a}-0.51\%$
test_ddpg_speed[True-backward] 2.2308ms 2.1822ms 458.2479 Ops/s 364.2847 Ops/s $\textbf{\color{#35bf28}+25.79\%}$
test_ddpg_speed[reduce-overhead-None] 1.5854ms 1.2168ms 821.8099 Ops/s 808.8893 Ops/s $\color{#35bf28}+1.60\%$
test_ddpg_speed[reduce-overhead-backward] 2.2734ms 2.1781ms 459.1138 Ops/s 453.7098 Ops/s $\color{#35bf28}+1.19\%$
test_sac_speed[False-None] 8.1099ms 7.3300ms 136.4259 Ops/s 133.7170 Ops/s $\color{#35bf28}+2.03\%$
test_sac_speed[False-backward] 11.2089ms 10.6570ms 93.8350 Ops/s 92.0609 Ops/s $\color{#35bf28}+1.93\%$
test_sac_speed[True-None] 2.1013ms 1.9457ms 513.9548 Ops/s 513.3868 Ops/s $\color{#35bf28}+0.11\%$
test_sac_speed[True-backward] 5.3969ms 3.9705ms 251.8594 Ops/s 255.5312 Ops/s $\color{#d91a1a}-1.44\%$
test_sac_speed[reduce-overhead-None] 2.3536ms 1.9508ms 512.6085 Ops/s 507.4220 Ops/s $\color{#35bf28}+1.02\%$
test_sac_speed[reduce-overhead-backward] 3.9712ms 3.8257ms 261.3900 Ops/s 257.2440 Ops/s $\color{#35bf28}+1.61\%$
test_redq_speed[False-None] 15.1479ms 9.7103ms 102.9832 Ops/s 96.0133 Ops/s $\textbf{\color{#35bf28}+7.26\%}$
test_redq_speed[False-backward] 17.5287ms 16.7624ms 59.6573 Ops/s 59.6605 Ops/s $-0.01\%$
test_redq_speed[True-None] 4.2156ms 3.2588ms 306.8598 Ops/s 291.8541 Ops/s $\textbf{\color{#35bf28}+5.14\%}$
test_redq_speed[True-backward] 8.5718ms 8.1302ms 122.9977 Ops/s 114.6810 Ops/s $\textbf{\color{#35bf28}+7.25\%}$
test_redq_speed[reduce-overhead-None] 3.6205ms 3.2349ms 309.1253 Ops/s 294.4810 Ops/s $\color{#35bf28}+4.97\%$
test_redq_speed[reduce-overhead-backward] 8.2592ms 7.9829ms 125.2674 Ops/s 120.6472 Ops/s $\color{#35bf28}+3.83\%$
test_redq_deprec_speed[False-None] 10.7264ms 10.2890ms 97.1907 Ops/s 93.3380 Ops/s $\color{#35bf28}+4.13\%$
test_redq_deprec_speed[False-backward] 15.7845ms 15.1169ms 66.1510 Ops/s 64.9203 Ops/s $\color{#35bf28}+1.90\%$
test_redq_deprec_speed[True-None] 3.3197ms 3.1741ms 315.0461 Ops/s 312.1485 Ops/s $\color{#35bf28}+0.93\%$
test_redq_deprec_speed[True-backward] 7.5413ms 7.1007ms 140.8319 Ops/s 138.6009 Ops/s $\color{#35bf28}+1.61\%$
test_redq_deprec_speed[reduce-overhead-None] 3.6626ms 3.1854ms 313.9356 Ops/s 310.7960 Ops/s $\color{#35bf28}+1.01\%$
test_redq_deprec_speed[reduce-overhead-backward] 7.3158ms 7.1305ms 140.2420 Ops/s 140.5062 Ops/s $\color{#d91a1a}-0.19\%$
test_td3_speed[False-None] 7.3865ms 7.2261ms 138.3878 Ops/s 134.4522 Ops/s $\color{#35bf28}+2.93\%$
test_td3_speed[False-backward] 10.3833ms 10.0288ms 99.7130 Ops/s 95.6328 Ops/s $\color{#35bf28}+4.27\%$
test_td3_speed[True-None] 1.9034ms 1.8491ms 540.8071 Ops/s 538.2110 Ops/s $\color{#35bf28}+0.48\%$
test_td3_speed[True-backward] 3.7409ms 3.5874ms 278.7559 Ops/s 274.0143 Ops/s $\color{#35bf28}+1.73\%$
test_td3_speed[reduce-overhead-None] 1.9214ms 1.8484ms 540.9963 Ops/s 541.9616 Ops/s $\color{#d91a1a}-0.18\%$
test_td3_speed[reduce-overhead-backward] 3.7105ms 3.5726ms 279.9098 Ops/s 274.0119 Ops/s $\color{#35bf28}+2.15\%$
test_cql_speed[False-None] 26.9576ms 23.5648ms 42.4363 Ops/s 40.5129 Ops/s $\color{#35bf28}+4.75\%$
test_cql_speed[False-backward] 38.8485ms 33.9223ms 29.4792 Ops/s 28.9555 Ops/s $\color{#35bf28}+1.81\%$
test_cql_speed[True-None] 11.3307ms 10.3373ms 96.7368 Ops/s 96.5114 Ops/s $\color{#35bf28}+0.23\%$
test_cql_speed[True-backward] 16.4323ms 16.0301ms 62.3827 Ops/s 62.7168 Ops/s $\color{#d91a1a}-0.53\%$
test_cql_speed[reduce-overhead-None] 10.7134ms 10.4347ms 95.8338 Ops/s 95.4783 Ops/s $\color{#35bf28}+0.37\%$
test_cql_speed[reduce-overhead-backward] 16.2891ms 15.8246ms 63.1927 Ops/s 61.4311 Ops/s $\color{#35bf28}+2.87\%$
test_a2c_speed[False-None] 5.5075ms 5.1217ms 195.2494 Ops/s 190.8766 Ops/s $\color{#35bf28}+2.29\%$
test_a2c_speed[False-backward] 11.6619ms 11.3419ms 88.1683 Ops/s 85.5567 Ops/s $\color{#35bf28}+3.05\%$
test_a2c_speed[True-None] 3.1699ms 2.9596ms 337.8799 Ops/s 335.1468 Ops/s $\color{#35bf28}+0.82\%$
test_a2c_speed[True-backward] 8.6185ms 8.2110ms 121.7879 Ops/s 122.1942 Ops/s $\color{#d91a1a}-0.33\%$
test_a2c_speed[reduce-overhead-None] 3.6590ms 2.9755ms 336.0732 Ops/s 336.1552 Ops/s $\color{#d91a1a}-0.02\%$
test_a2c_speed[reduce-overhead-backward] 8.2862ms 8.1281ms 123.0299 Ops/s 123.1938 Ops/s $\color{#d91a1a}-0.13\%$
test_ppo_speed[False-None] 5.8305ms 5.3888ms 185.5711 Ops/s 179.9875 Ops/s $\color{#35bf28}+3.10\%$
test_ppo_speed[False-backward] 12.2887ms 11.8285ms 84.5419 Ops/s 83.2986 Ops/s $\color{#35bf28}+1.49\%$
test_ppo_speed[True-None] 3.6716ms 3.3539ms 298.1612 Ops/s 297.0436 Ops/s $\color{#35bf28}+0.38\%$
test_ppo_speed[True-backward] 8.1802ms 8.0173ms 124.7304 Ops/s 120.2512 Ops/s $\color{#35bf28}+3.72\%$
test_ppo_speed[reduce-overhead-None] 3.4920ms 3.3291ms 300.3800 Ops/s 297.9970 Ops/s $\color{#35bf28}+0.80\%$
test_ppo_speed[reduce-overhead-backward] 8.1834ms 7.9627ms 125.5858 Ops/s 124.3011 Ops/s $\color{#35bf28}+1.03\%$
test_reinforce_speed[False-None] 5.8020ms 4.2919ms 232.9989 Ops/s 221.9531 Ops/s $\color{#35bf28}+4.98\%$
test_reinforce_speed[False-backward] 7.4703ms 7.1125ms 140.5982 Ops/s 137.8143 Ops/s $\color{#35bf28}+2.02\%$
test_reinforce_speed[True-None] 2.2706ms 2.1395ms 467.4082 Ops/s 446.5096 Ops/s $\color{#35bf28}+4.68\%$
test_reinforce_speed[True-backward] 7.0693ms 6.8509ms 145.9666 Ops/s 131.3728 Ops/s $\textbf{\color{#35bf28}+11.11\%}$
test_reinforce_speed[reduce-overhead-None] 2.5705ms 2.1296ms 469.5679 Ops/s 458.1359 Ops/s $\color{#35bf28}+2.50\%$
test_reinforce_speed[reduce-overhead-backward] 7.1261ms 6.8413ms 146.1714 Ops/s 145.5587 Ops/s $\color{#35bf28}+0.42\%$
test_iql_speed[False-None] 19.1369ms 18.6915ms 53.5002 Ops/s 51.5047 Ops/s $\color{#35bf28}+3.87\%$
test_iql_speed[False-backward] 29.9513ms 29.2874ms 34.1444 Ops/s 33.6794 Ops/s $\color{#35bf28}+1.38\%$
test_iql_speed[True-None] 7.1220ms 6.4075ms 156.0662 Ops/s 150.6512 Ops/s $\color{#35bf28}+3.59\%$
test_iql_speed[True-backward] 15.2125ms 14.8447ms 67.3640 Ops/s 66.8093 Ops/s $\color{#35bf28}+0.83\%$
test_iql_speed[reduce-overhead-None] 6.9670ms 6.3882ms 156.5387 Ops/s 149.2621 Ops/s $\color{#35bf28}+4.88\%$
test_iql_speed[reduce-overhead-backward] 15.0569ms 14.7901ms 67.6130 Ops/s 66.9841 Ops/s $\color{#35bf28}+0.94\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 7.3763ms 6.0039ms 166.5597 Ops/s 164.8308 Ops/s $\color{#35bf28}+1.05\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 0.2899s 0.5648ms 1.7705 KOps/s 3.1228 KOps/s $\textbf{\color{#d91a1a}-43.30\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.5240ms 0.3404ms 2.9380 KOps/s 3.0596 KOps/s $\color{#d91a1a}-3.97\%$
test_rb_sample[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 6.0352ms 5.8136ms 172.0104 Ops/s 174.0424 Ops/s $\color{#d91a1a}-1.17\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 0.8511ms 0.3198ms 3.1268 KOps/s 3.0246 KOps/s $\color{#35bf28}+3.38\%$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.5345ms 0.2951ms 3.3891 KOps/s 3.3572 KOps/s $\color{#35bf28}+0.95\%$
test_rb_sample[TensorDictReplayBuffer-LazyMemmapStorage-sampler6-10000] 1.6484ms 1.3781ms 725.6528 Ops/s 687.6915 Ops/s $\textbf{\color{#35bf28}+5.52\%}$
test_rb_sample[TensorDictReplayBuffer-LazyTensorStorage-sampler7-10000] 1.5717ms 1.3363ms 748.3482 Ops/s 713.2914 Ops/s $\color{#35bf28}+4.91\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.0999ms 5.9793ms 167.2423 Ops/s 170.9773 Ops/s $\color{#d91a1a}-2.18\%$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.1070ms 0.4207ms 2.3773 KOps/s 1.9333 KOps/s $\textbf{\color{#35bf28}+22.96\%}$
test_rb_sample[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.7306ms 0.3982ms 2.5116 KOps/s 2.0110 KOps/s $\textbf{\color{#35bf28}+24.89\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-RandomSampler-4000] 5.9701ms 5.8606ms 170.6296 Ops/s 176.2964 Ops/s $\color{#d91a1a}-3.21\%$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-10000] 1.8447ms 0.3000ms 3.3339 KOps/s 3.4499 KOps/s $\color{#d91a1a}-3.36\%$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-10000] 0.6028ms 0.3492ms 2.8635 KOps/s 3.6430 KOps/s $\textbf{\color{#d91a1a}-21.40\%}$
test_rb_iterate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-4000] 13.0515ms 5.9843ms 167.1031 Ops/s 176.5582 Ops/s $\textbf{\color{#d91a1a}-5.36\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-10000] 1.0147ms 0.3003ms 3.3302 KOps/s 3.0286 KOps/s $\textbf{\color{#35bf28}+9.96\%}$
test_rb_iterate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-10000] 0.6274ms 0.3093ms 3.2333 KOps/s 4.0983 KOps/s $\textbf{\color{#d91a1a}-21.11\%}$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-ListStorage-None-4000] 6.1296ms 5.9915ms 166.9032 Ops/s 172.1584 Ops/s $\color{#d91a1a}-3.05\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-10000] 2.4384ms 0.4596ms 2.1757 KOps/s 2.1293 KOps/s $\color{#35bf28}+2.18\%$
test_rb_iterate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-10000] 0.6339ms 0.4190ms 2.3864 KOps/s 2.2810 KOps/s $\color{#35bf28}+4.62\%$
test_rb_populate[TensorDictReplayBuffer-ListStorage-RandomSampler-400] 6.7820ms 5.1992ms 192.3374 Ops/s 38.1427 Ops/s $\textbf{\color{#35bf28}+404.26\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-RandomSampler-400] 6.1212ms 1.9354ms 516.6869 Ops/s 468.4382 Ops/s $\textbf{\color{#35bf28}+10.30\%}$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-RandomSampler-400] 8.7014ms 1.2450ms 803.2074 Ops/s 904.3232 Ops/s $\textbf{\color{#d91a1a}-11.18\%}$
test_rb_populate[TensorDictReplayBuffer-ListStorage-SamplerWithoutReplacement-400] 0.4061s 13.2692ms 75.3626 Ops/s 188.4219 Ops/s $\textbf{\color{#d91a1a}-60.00\%}$
test_rb_populate[TensorDictReplayBuffer-LazyMemmapStorage-SamplerWithoutReplacement-400] 9.2645ms 2.0084ms 497.9079 Ops/s 493.8031 Ops/s $\color{#35bf28}+0.83\%$
test_rb_populate[TensorDictReplayBuffer-LazyTensorStorage-SamplerWithoutReplacement-400] 7.5799ms 1.1818ms 846.1913 Ops/s 792.4736 Ops/s $\textbf{\color{#35bf28}+6.78\%}$
test_rb_populate[TensorDictPrioritizedReplayBuffer-ListStorage-None-400] 7.3539ms 5.4528ms 183.3907 Ops/s 183.9529 Ops/s $\color{#d91a1a}-0.31\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyMemmapStorage-None-400] 8.9511ms 2.1653ms 461.8203 Ops/s 472.2048 Ops/s $\color{#d91a1a}-2.20\%$
test_rb_populate[TensorDictPrioritizedReplayBuffer-LazyTensorStorage-None-400] 7.2252ms 1.3870ms 720.9679 Ops/s 750.5564 Ops/s $\color{#d91a1a}-3.94\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-True] 13.3377ms 12.5063ms 79.9600 Ops/s 79.0274 Ops/s $\color{#35bf28}+1.18\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-10000-10000-100-False] 17.1343ms 16.1365ms 61.9714 Ops/s 61.4578 Ops/s $\color{#35bf28}+0.84\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-True] 17.9712ms 17.0427ms 58.6761 Ops/s 57.6995 Ops/s $\color{#35bf28}+1.69\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-100000-10000-100-False] 17.2463ms 16.6717ms 59.9819 Ops/s 59.7653 Ops/s $\color{#35bf28}+0.36\%$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-True] 20.6562ms 17.8803ms 55.9274 Ops/s 59.2475 Ops/s $\textbf{\color{#d91a1a}-5.60\%}$
test_rb_extend_sample[ReplayBuffer-LazyTensorStorage-RandomSampler-1000000-10000-100-False] 22.9765ms 18.0264ms 55.4743 Ops/s 56.9961 Ops/s $\color{#d91a1a}-2.67\%$
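
The percentage in the last column of each row appears to be the relative difference between the two throughput columns, with the second one taken as the baseline. This reading is inferred from the numbers in the rows above, not from the benchmark tooling itself. A minimal sketch of that arithmetic, where the helper name `relative_change` is hypothetical:

```python
def relative_change(ops_new: float, ops_base: float) -> float:
    """Return the relative throughput change in percent.

    Both arguments must use the same unit (for example Ops/s).
    Assumption: the baseline is the second throughput column of a row.
    """
    if ops_base == 0:
        raise ValueError("baseline throughput must be non-zero")
    return (ops_new - ops_base) / ops_base * 100.0


# Values copied from two rows of the table above.
# test_redq_deprec_speed[reduce-overhead-None]: 313.9356 vs 310.7960 Ops/s
redq_change = round(relative_change(313.9356, 310.7960), 2)
# test_rb_sample[...LazyMemmapStorage-RandomSampler-10000]: 1.7705 vs 3.1228 KOps/s
rb_change = round(relative_change(1.7705, 3.1228), 2)

print(f"{redq_change:+.2f}%")
print(f"{rb_change:+.2f}%")
```

Rows with a large gap between the first two timing columns (for example 0.2899s against 0.5648ms) may owe much of their reported change to a single slow iteration, so the percentage alone is a weak signal for those cases.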
