
Experiment Results #3

Open · Itsukara opened this issue Sep 7, 2016 · 6 comments

Itsukara commented Sep 7, 2016

This thread is for sharing experiment results. I'd appreciate it if you could post your results here when you try my code. The following messages are sample reports.

Itsukara commented Sep 7, 2016

(Sample report 1)

Summary

  • Total run step: 48M STEP
  • Average Score (last 16 lines): 610-710
  • Learning curve: https://cdn-ak.f.st-hatena.com/images/fotolife/I/Itsukara/20160907/20160907105007.png
  • Experimenter: Itsukara
  • Web site: http://itsukara.hateblo.jp/

Run Structure

  • 0M - 48M STEP: --train-episode-steps=30 --lives-lost-reward=-0.03 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True
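For anyone reproducing this run, the flags above are passed directly to the training script. A minimal launch sketch follows; the entry-point name a3c.py is an assumption, so substitute the actual training script from this repository.

```python
# Minimal launch sketch for the run above.
# ASSUMPTION: the entry-point name "a3c.py" is hypothetical; use the
# actual training script from this repository.
import subprocess

flags = [
    "--train-episode-steps=30",
    "--lives-lost-reward=-0.03",
    "--reset-max-reward=True",
    "--psc-use=True",
    "--randomness-time=3000",
    "--color-maximizing-in-gs=False",
    "--color-averaging-in-ale=True",
]
subprocess.run(["python", "a3c.py", *flags], check=True)
```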

Details

  • 0M - 48M STEP: --train-episode-steps=30 --lives-lost-reward=-0.03 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=80, end_time_step=80000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.03, lives_lost_rratio=1.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)

<Average Episode score (last 16 lines)>
@@@ Average Episode score = 698.000000, s= 47597186,th=3
@@@ Average Episode score = 641.000000, s= 47601045,th=2
@@@ Average Episode score = 665.000000, s= 47609651,th=4
@@@ Average Episode score = 675.000000, s= 47616326,th=7
@@@ Average Episode score = 610.000000, s= 47674768,th=0
@@@ Average Episode score = 692.000000, s= 47705725,th=5
@@@ Average Episode score = 698.000000, s= 47758888,th=3
@@@ Average Episode score = 661.000000, s= 47791572,th=1
@@@ Average Episode score = 701.000000, s= 47798075,th=4
@@@ Average Episode score = 672.000000, s= 47815701,th=6
@@@ Average Episode score = 700.000000, s= 47821598,th=2
@@@ Average Episode score = 610.000000, s= 47858878,th=0
@@@ Average Episode score = 689.000000, s= 47859319,th=7
@@@ Average Episode score = 710.000000, s= 47928128,th=5
@@@ Average Episode score = 661.000000, s= 47993088,th=3
@@@ Average Episode score = 665.000000, s= 47993704,th=1
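The "Average Score (last 16 lines): 610-710" figure in the summary is simply the min/max over score lines like the ones above. A small sketch that recovers it from a log file; the line format is taken from the output above, while the path is an assumption based on the log_file='tmp/a3c_log' option, so adjust it to your setup.

```python
import re

# Matches lines like:
# @@@ Average Episode score = 698.000000, s= 47597186,th=3
PATTERN = re.compile(r"@@@ Average Episode score = ([\d.]+)")

def score_range(path, last_n=16):
    """Return (min, max) over the last `last_n` average-score lines."""
    with open(path) as f:  # requires Python 3.8+ (walrus operator below)
        scores = [float(m.group(1)) for line in f
                  if (m := PATTERN.search(line))]
    tail = scores[-last_n:]
    return min(tail), max(tail)

# print(score_range("tmp/a3c_log"))  # path is an assumption; adjust as needed
```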

Itsukara commented Sep 7, 2016

(Sample report 2)

Summary

  • Total run step: 26M STEP
  • Average Score (last 16 lines): 359 - 424
  • Learning curve: https://cdn-ak.f.st-hatena.com/images/fotolife/I/Itsukara/20160907/20160907105132.png
  • Experimenter: Itsukara
  • Web site: http://itsukara.hateblo.jp/

Run Structure

  • 0M - 3M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True
  • 3M - 26M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=True --color-averaging-in-ale=False

Details

  • 0M - 3M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=False --color-averaging-in-ale=True

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=3, end_time_step=3000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.02, lives_lost_rratio=1.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


  • 3M - 26M STEP: --train-episode-steps=30 --lives-lost-reward=-0.02 --reset-max-reward=True --psc-use=True --randomness-time=3000 --color-maximizing-in-gs=True --color-averaging-in-ale=False

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=False, color_maximizing_in_gs=True, display=True, end_mega_step=80, end_time_step=80000000, entropy_beta=0.01, frames_skip_in_ale=1, frames_skip_in_gs=4, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.02, lives_lost_rratio=1.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)
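Only a handful of fields differ between the two Namespace dumps above; besides the color_maximizing_in_gs / color_averaging_in_ale switch, the frame-skip settings swap (frames_skip_in_ale 4 to 1, frames_skip_in_gs 1 to 4) and the end step moves from 3M to 80M. A quick way to check what actually changed between phases is to diff the dumps; a sketch, assuming each dump is saved as the plain printed text:

```python
import re

def parse_namespace(dump):
    """Parse a printed argparse Namespace(...) repr into a dict of raw
    value strings. Requires Python 3.9+ for removeprefix/removesuffix."""
    body = dump.strip().removeprefix("Namespace(").removesuffix(")")
    return dict(re.findall(r"(\w+)=('[^']*'|[^,]+)(?:, |$)", body))

def diff_namespaces(a, b):
    """Return {key: (value_in_a, value_in_b)} for every differing field."""
    da, db = parse_namespace(a), parse_namespace(b)
    return {k: (da.get(k), db.get(k))
            for k in sorted(da.keys() | db.keys())
            if da.get(k) != db.get(k)}

# diff_namespaces(dump_0m_3m, dump_3m_26m) reports, among others:
# 'color_averaging_in_ale': ('True', 'False')
# 'frames_skip_in_ale':     ('4', '1')
```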


<Average Episode score (last 16 lines)>
@@@ Average Episode score = 400.000000, s= 25752794,th=6
@@@ Average Episode score = 419.000000, s= 25781176,th=1
@@@ Average Episode score = 409.000000, s= 25791626,th=2
@@@ Average Episode score = 362.000000, s= 25795054,th=3
@@@ Average Episode score = 381.000000, s= 25810532,th=4
@@@ Average Episode score = 376.000000, s= 25812920,th=7
@@@ Average Episode score = 396.000000, s= 25814558,th=5
@@@ Average Episode score = 424.000000, s= 25949909,th=1
@@@ Average Episode score = 359.000000, s= 25950264,th=0
@@@ Average Episode score = 373.000000, s= 25951475,th=7
@@@ Average Episode score = 397.000000, s= 25957375,th=6
@@@ Average Episode score = 369.000000, s= 25963558,th=3
@@@ Average Episode score = 386.000000, s= 25996207,th=5
@@@ Average Episode score = 392.000000, s= 26015968,th=4
@@@ Average Episode score = 388.000000, s= 26053439,th=6
@@@ Average Episode score = 415.000000, s= 26065448,th=2

tflare commented Sep 11, 2016

Summary

Details

  • --train-episode-steps=30 --lives-lost-reward=-0.02 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --repeat-action-probability=0.0 --randomness-time=3000 --end-mega-step=80 --log-interval=800 --max-to-keep=5 --color-maximizing-in-gs=False --color-averaging-in-ale=True

options

Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=80, end_time_step=80000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=-0.02, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=100, max_play_steps=4500, max_play_time=300, max_time_step=100000000, max_to_keep=5, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)

<Average Episode score (last 20 lines)>

@@@ Average Episode score = 317.000000, s= 43205428,th=5
@@@ Average Episode score = 332.000000, s= 43229035,th=4
@@@ Average Episode score = 321.000000, s= 43232792,th=6
@@@ Average Episode score = 312.000000, s= 43250069,th=3
@@@ Average Episode score = 336.000000, s= 43256872,th=2
@@@ Average Episode score = 298.000000, s= 43301059,th=0
@@@ Average Episode score = 350.000000, s= 43349114,th=7
@@@ Average Episode score = 331.000000, s= 43352754,th=1
@@@ Average Episode score = 315.000000, s= 43362133,th=6
@@@ Average Episode score = 307.000000, s= 43393477,th=5
@@@ Average Episode score = 329.000000, s= 43425115,th=4
@@@ Average Episode score = 340.000000, s= 43439061,th=2
@@@ Average Episode score = 301.000000, s= 43456306,th=3
@@@ Average Episode score = 329.000000, s= 43512264,th=1
@@@ Average Episode score = 307.000000, s= 43519510,th=0
@@@ Average Episode score = 298.000000, s= 43525898,th=6
@@@ Average Episode score = 337.000000, s= 43529651,th=7
@@@ Average Episode score = 323.000000, s= 43626017,th=4
@@@ Average Episode score = 346.000000, s= 43626579,th=2
@@@ Average Episode score = 312.000000, s= 43634385,th=5

Itsukara commented Sep 12, 2016

Summary

Run Structure

  • 0M - 84M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --repeat-action-probability=0.0 --randomness-time=3000 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200

Details

  • 0M - 84M STEP: --train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --repeat-action-probability=0.0 --randomness-time=3000 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=100, end_time_step=100000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=0.0, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=200, max_play_steps=4500, max_play_time=300, max_time_step=200000000, max_to_keep=None, no_reward_steps=225, no_reward_time=15, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=2.2222222222222223e-05, randomness_log_interval=1500.0, randomness_log_num=30, randomness_steps=45000, randomness_time=3000.0, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, tes_extend=True, tes_extend_ratio=5.0, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


<Average Episode score (last 16 lines just before 84M STEP)>
@@@ Average Episode score = 1114.000000, s= 83401163,th=4
@@@ Average Episode score = 1280.000000, s= 83447970,th=3
@@@ Average Episode score = 1130.000000, s= 83486089,th=7
@@@ Average Episode score = 1193.000000, s= 83510661,th=6
@@@ Average Episode score = 1213.000000, s= 83570385,th=2
@@@ Average Episode score = 1081.000000, s= 83572954,th=5
@@@ Average Episode score = 1023.000000, s= 83623526,th=1
@@@ Average Episode score = 1012.000000, s= 83633493,th=0
@@@ Average Episode score = 1156.000000, s= 83691654,th=4
@@@ Average Episode score = 1265.000000, s= 83771600,th=3
@@@ Average Episode score = 1007.000000, s= 83772564,th=1
@@@ Average Episode score = 1169.000000, s= 83777307,th=6
@@@ Average Episode score = 1054.000000, s= 83779671,th=5
@@@ Average Episode score = 1065.000000, s= 83787417,th=7
@@@ Average Episode score = 1255.000000, s= 83913801,th=2
@@@ Average Episode score = 1017.000000, s= 83935966,th=0

Itsukara commented Sep 18, 2016

Summary

  • Total run step: 65.2M STEP
  • Average Score (last 16 lines): 1090 - 1573
  • Learning curve: https://cdn-ak.f.st-hatena.com/images/fotolife/I/Itsukara/20160916/20160916141555.png
  • Checkpoints data:
  • Experimenter: Itsukara
  • Web site: http://itsukara.hateblo.jp/

Run Structure

  • 0M - 62.5M STEP: option=--train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

Details

  • 0M - 62.5M STEP: option=--train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=100, end_time_step=100000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, greediness=0.01, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=0.0, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=200, max_play_steps=4500, max_play_time=300, max_time_step=200000000, max_to_keep=None, no_reward_steps=9000, no_reward_time=600, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=0.00022222222222222223, randomness_log_interval=150.0, randomness_log_num=30, randomness_steps=4500, randomness_time=300, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, repeat_action_ratio=0.25, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, terminate_on_lives_lost=False, tes_extend=True, tes_extend_ratio=5.0, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


<Average Episode score (last 16 lines, up to 65.2M STEP)>
@@@ Average Episode score = 1369.000000, s= 64695930,th=2
@@@ Average Episode score = 1441.000000, s= 64752913,th=6
@@@ Average Episode score = 1183.000000, s= 64781579,th=5
@@@ Average Episode score = 1217.000000, s= 64783400,th=3
@@@ Average Episode score = 1431.000000, s= 64823778,th=0
@@@ Average Episode score = 1252.000000, s= 64847953,th=4
@@@ Average Episode score = 1508.000000, s= 64858656,th=7
@@@ Average Episode score = 1090.000000, s= 64887300,th=1
@@@ Average Episode score = 1522.000000, s= 64988743,th=6
@@@ Average Episode score = 1349.000000, s= 65035796,th=2
@@@ Average Episode score = 1202.000000, s= 65065045,th=5
@@@ Average Episode score = 1201.000000, s= 65143090,th=3
@@@ Average Episode score = 1492.000000, s= 65156749,th=0
@@@ Average Episode score = 1573.000000, s= 65180094,th=7
@@@ Average Episode score = 1357.000000, s= 65206441,th=4
@@@ Average Episode score = 1191.000000, s= 65217488,th=1

Itsukara commented Oct 7, 2016

Summary

Run Structure

  • 0M - 100M STEP: option=--train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

Details

  • 0M - 100M STEP: option=--train-episode-steps=30 --tes-extend=True --lives-lost-reward=0.0 --lives-lost-rratio=0.0 --reset-max-reward=True --psc-use=True --no-reward-time=600 --end-mega-step=100 --log-interval=800 --color-maximizing-in-gs=False --color-averaging-in-ale=True --max-mega-step=200 --greediness=0.01 --repeat-action-ratio=0.25

******************** options ********************
Namespace(action_size=18, average_score_log_interval=10, basic_income=0.0, basic_income_time=100000000000000000000, checkpoint_dir='checkpoints', color_averaging_in_ale=True, color_maximizing_in_gs=False, display=True, end_mega_step=100, end_time_step=100000000, entropy_beta=0.01, frames_skip_in_ale=4, frames_skip_in_gs=1, gamma=0.99, grad_norm_clip=40.0, greediness=0.01, initial_alpha_high=0.01, initial_alpha_log_rate=0.4226, initial_alpha_low=0.0001, lives_lost_reward=0.0, lives_lost_rratio=0.0, lives_lost_weight=1.0, local_t_max=5, log_file='tmp/a3c_log', log_interval=800, max_mega_step=200, max_play_steps=4500, max_play_time=300, max_time_step=200000000, max_to_keep=None, no_reward_steps=9000, no_reward_time=600, num_experiments=1, parallel_size=8, performance_log_interval=1500, psc_beta=0.01, psc_frsize=42, psc_maxval=127, psc_use=True, randomness=0.00022222222222222223, randomness_log_interval=150.0, randomness_log_num=30, randomness_steps=4500, randomness_time=300, record_gs_screen_dir=None, record_new_record_dir=None, record_screen_dir=None, repeat_action_probability=0.0, repeat_action_ratio=0.25, reset_max_reward=True, reward_clip=1.0, rmsp_alpha=0.99, rmsp_epsilon=0.1, rom='montezuma_revenge.bin', save_mega_interval=3, save_time_interval=3000000, score_averaging_length=100, score_highest_ratio=0.5, score_log_interval=900, stack_frames_in_gs=False, terminate_on_lives_lost=False, tes_extend=True, tes_extend_ratio=5.0, train_episode_steps=30, train_in_eval=False, use_gpu=True, use_lstm=False, verbose=True)


<Average Episode score (36 lines @ 78.459M - 79.864M STEP)>
@@@ Average Episode score = 1683.000000, s= 78459385,th=2
@@@ Average Episode score = 1570.000000, s= 78474074,th=5
@@@ Average Episode score = 1600.000000, s= 78534232,th=3
@@@ Average Episode score = 1528.000000, s= 78534463,th=1
@@@ Average Episode score = 1561.000000, s= 78602286,th=0
@@@ Average Episode score = 1730.000000, s= 78649996,th=7
@@@ Average Episode score = 1596.000000, s= 78763734,th=4
@@@ Average Episode score = 1539.000000, s= 78787510,th=6
@@@ Average Episode score = 1585.000000, s= 78790811,th=5
@@@ Average Episode score = 1723.000000, s= 78799980,th=2
@@@ Average Episode score = 1532.000000, s= 78864929,th=1
@@@ Average Episode score = 1561.000000, s= 78889484,th=3
@@@ Average Episode score = 1541.000000, s= 78962613,th=0
@@@ Average Episode score = 1754.000000, s= 78976248,th=7
@@@ Average Episode score = 1621.000000, s= 78996270,th=4
@@@ Average Episode score = 1619.000000, s= 79100351,th=5
@@@ Average Episode score = 1664.000000, s= 79115006,th=2
@@@ Average Episode score = 1529.000000, s= 79128643,th=6
@@@ Average Episode score = 1552.000000, s= 79202649,th=3
@@@ Average Episode score = 1573.000000, s= 79225666,th=1
@@@ Average Episode score = 1604.000000, s= 79308819,th=4
@@@ Average Episode score = 1734.000000, s= 79311025,th=7
@@@ Average Episode score = 1561.000000, s= 79323126,th=0
@@@ Average Episode score = 1680.000000, s= 79414686,th=5
@@@ Average Episode score = 1703.000000, s= 79427641,th=2
@@@ Average Episode score = 1600.000000, s= 79437383,th=6
@@@ Average Episode score = 1569.000000, s= 79558727,th=1
@@@ Average Episode score = 1592.000000, s= 79562370,th=3
@@@ Average Episode score = 1562.000000, s= 79625564,th=4
@@@ Average Episode score = 1520.000000, s= 79657699,th=0
@@@ Average Episode score = 1636.000000, s= 79668274,th=7
@@@ Average Episode score = 1678.000000, s= 79721917,th=5
@@@ Average Episode score = 1553.000000, s= 79739129,th=6
@@@ Average Episode score = 1668.000000, s= 79739390,th=2
@@@ Average Episode score = 1520.000000, s= 79784754,th=1
@@@ Average Episode score = 1546.000000, s= 79864012,th=3
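The th field in these lines is the worker-thread index (parallel_size=8 in the options, hence th=0..7), so per-thread averages over a window like this one can be recovered by grouping on it; a sketch:

```python
import re
from collections import defaultdict

# Matches lines like:
# @@@ Average Episode score = 1683.000000, s= 78459385,th=2
LINE = re.compile(r"score = ([\d.]+), s= *(\d+),th=(\d+)")

def per_thread_means(lines):
    """Group average-score lines by thread index and return the mean
    score for each thread."""
    by_thread = defaultdict(list)
    for line in lines:
        m = LINE.search(line)
        if m:
            by_thread[int(m.group(3))].append(float(m.group(1)))
    return {th: sum(v) / len(v) for th, v in sorted(by_thread.items())}
```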
