When I run the CQL algorithm, I find that it only ever executes behavior cloning. I checked the config used: the number of training steps is 100 and 'num_bc_iters' is set to 50, so the run should move past behavior cloning after the first 50 steps.
Digging into the source code of CQLLearner, I found that 'counts' in the function 'step' has two keys, "steps" and "walltime".
However, the implementation of 'step' reads the key "learner_steps".
Because "learner_steps" is not a key in 'counts', "cur_step" is always 0, so the algorithm never leaves the behavior cloning phase.
After I changed the key from "learner_steps" to "steps", the problem was solved.
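To make the failure mode concrete, here is a minimal sketch, not the actual CQLLearner code. It assumes that 'counts' behaves like a plain dict, that a missing key falls back to 0, and that behavior cloning is used while "cur_step" is below 'num_bc_iters'. The helper name `uses_behavior_cloning` and the example counter values are hypothetical.

```python
NUM_BC_ITERS = 50  # 'num_bc_iters' from the config described above


def uses_behavior_cloning(counts: dict, step_key: str) -> bool:
    """Hypothetical helper: True while the learner is still in the BC phase."""
    # Assumption: a key that is absent from 'counts' silently falls back to 0.
    cur_step = counts.get(step_key, 0)
    return cur_step < NUM_BC_ITERS


# Example counter contents after 75 learner steps (values are made up).
counts = {"steps": 75, "walltime": 12.3}

# Wrong key: the lookup misses, cur_step stays 0, BC is never left.
print(uses_behavior_cloning(counts, "learner_steps"))  # True

# Correct key: cur_step is 75, so the BC phase has ended.
print(uses_behavior_cloning(counts, "steps"))  # False
```

Under these assumptions the wrong key fails silently because of the default value. Indexing with `counts[step_key]` would raise a `KeyError` instead and surface the mismatch immediately.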