
[Feature Request] ACME Integration #60

Closed
5 tasks
Trinkle23897 opened this issue Feb 17, 2022 · 7 comments · Fixed by #157
Assignees
Labels
enhancement New feature or request

Comments

@Trinkle23897
Collaborator

Trinkle23897 commented Feb 17, 2022

https://github.com/deepmind/acme

Road Map:

@TianyiSun316

  • Go through ACME codebase and integrate vector_env to the available algorithms;
  • Write Atari examples;
  • Check Atari performance: Pong and Breakout;
  • Submit PR;
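The vector_env integration in the first item amounts to making the algorithms consume a batched step interface, where one call advances many environments at once. A minimal sketch of that interface, using a hypothetical MockVectorEnv stand-in rather than EnvPool's real bindings:

```python
# Hypothetical sketch of the batched (vectorized) environment interface
# that EnvPool exposes and the ACME algorithms would need to consume.
# MockVectorEnv is a stand-in for illustration, not part of EnvPool.

class MockVectorEnv:
    """Mimics a vectorized env: every call handles num_envs envs at once."""

    def __init__(self, num_envs):
        self.num_envs = num_envs
        self._steps = [0] * num_envs

    def reset(self):
        # Returns one observation per sub-environment.
        self._steps = [0] * self.num_envs
        return [0.0] * self.num_envs

    def step(self, actions):
        # Takes a batch of actions, returns batched obs/rewards/dones.
        assert len(actions) == self.num_envs
        self._steps = [s + 1 for s in self._steps]
        obs = [float(s) for s in self._steps]
        rewards = [1.0] * self.num_envs
        dones = [s >= 10 for s in self._steps]
        return obs, rewards, dones, {}


env = MockVectorEnv(num_envs=4)
obs = env.reset()                              # batch of 4 observations
obs, rewards, dones, info = env.step([0, 1, 0, 1])
```

The key design point is that the loop never touches a single environment: observations, actions, rewards, and done flags are all length-num_envs batches.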

@LeoGuo98

  • Run some sample-efficiency experiments (you can try different libraries, either ACME, tianshou, or sb3; this doesn't depend on the previous item)

Resources:

tianshou: #51
stable-baselines3: #39
cleanrl: #48 #53

cc @zhongwen

@Trinkle23897 Trinkle23897 added the enhancement New feature or request label Apr 27, 2022
@zhongwen

@TianyiSun316
Collaborator

Use Acme JAX agents instead of the TF agents.

References: https://github.com/deepmind/acme/tree/master/acme/agents/jax

Example scripts; you can just replace the agent in them with a DQN agent.

https://github.com/deepmind/acme/blob/master/examples/atari/run_impala.py

https://github.com/deepmind/acme/blob/master/examples/atari/run_r2d2.py

I tried, but it seems the JAX DQN network is not usable. The current version is based on TensorFlow. Maybe we can discuss this offline?

@zhongwen

zhongwen commented May 9, 2022

Sure!

@zhongwen

Let's use the R2D2 example first, since Tianyi has had significant difficulties and has spent little time understanding the acme codebase.

@zhongwen

Steps:

  • Implement a new EnvironmentLoop to integrate with EnvPool;
  • Implement a new adder to integrate with batched inputs;
  • Integrate with the R2D2 example;
  • Run experiments.
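The first two steps above can be sketched roughly as follows. BatchedEnvironmentLoop, ListAdder, and ConstantEnv are illustrative names for this sketch, not real ACME classes; in ACME the adder would wrap Reverb and the loop would mirror acme's EnvironmentLoop.

```python
# Hypothetical sketch of steps 1-2: an environment loop that drives a
# batched (EnvPool-style) environment and forwards whole batches of
# transitions to an adder.

class ListAdder:
    """Toy adder: buffers batched transitions in memory."""

    def __init__(self):
        self.items = []

    def add(self, obs, actions, rewards, dones):
        self.items.append((obs, actions, rewards, dones))


class BatchedEnvironmentLoop:
    """Toy loop: one step() call advances all sub-environments at once."""

    def __init__(self, env, policy, adder):
        self._env = env        # batched env with reset()/step()
        self._policy = policy  # maps an observation batch to an action batch
        self._adder = adder

    def run(self, num_steps):
        obs = self._env.reset()
        total_reward = 0.0
        for _ in range(num_steps):
            actions = self._policy(obs)
            obs, rewards, dones, _ = self._env.step(actions)
            self._adder.add(obs, actions, rewards, dones)
            total_reward += sum(rewards)
        return total_reward


class ConstantEnv:
    """Stand-in batched env: every step yields reward 1 per sub-env."""

    def __init__(self, num_envs):
        self.num_envs = num_envs

    def reset(self):
        return [0.0] * self.num_envs

    def step(self, actions):
        obs = [0.0] * self.num_envs
        return obs, [1.0] * self.num_envs, [False] * self.num_envs, {}


adder = ListAdder()
loop = BatchedEnvironmentLoop(ConstantEnv(4), lambda obs: [0] * len(obs), adder)
total = loop.run(num_steps=5)  # 4 envs * 5 steps * reward 1.0 = 20.0
```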

@zhongwen

How is the progress going? You have only one day left for the task.

@TianyiSun316
Collaborator

Let's use the R2D2 example first, since Tianyi has had significant difficulties and has spent little time understanding the acme codebase.

I was eager to hear your solution for writing the example with JAX DQN when we discussed offline, and there are several things I forgot to mention during that discussion:

  1. I have already finished step 1 in your "Steps"; you will find it in my PR Add acme JAX R2D2 example #104.
  2. Using JAX DQN is more complex than the example scripts you mentioned above suggest. There is no pre-written DQN network that we can use directly; we may need to write functions like make_atari_networks to build a usable DQN network, while the example scripts simply call an API that has already been implemented.
  3. Implementing R2D2 is also more complex than the example scripts, since the R2D2 agent expects the observation to use the OAR struct. Our EnvWrapper needs to adapt to it accordingly.
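The OAR struct mentioned in point 3 bundles the current observation with the previous action and reward, which R2D2's recurrent network consumes at every step. A minimal local sketch of the packing: the real struct is defined in acme.wrappers.observation_action_reward, while the OARWrapper class and its env interface here are hypothetical.

```python
import collections

# Local stand-in for ACME's OAR struct (the real one lives in
# acme.wrappers.observation_action_reward). R2D2 expects each
# observation bundled with the previous action and reward.
OAR = collections.namedtuple("OAR", ["observation", "action", "reward"])


class OARWrapper:
    """Hypothetical env wrapper that packs observations into OAR structs."""

    def __init__(self, env, initial_action=0, initial_reward=0.0):
        self._env = env
        self._initial_action = initial_action
        self._initial_reward = initial_reward

    def reset(self):
        obs = self._env.reset()
        # No previous action/reward at episode start, so use zeros.
        return OAR(observation=obs, action=self._initial_action,
                   reward=self._initial_reward)

    def step(self, action):
        obs, reward, done, info = self._env.step(action)
        # Pair the *new* observation with the action/reward that produced it.
        return OAR(observation=obs, action=action, reward=reward), done, info


class EchoEnv:
    """Toy base env: observation echoes the last action taken."""

    def reset(self):
        return [0.0]

    def step(self, action):
        return [float(action)], 1.0, False, {}


wrapped = OARWrapper(EchoEnv())
first = wrapped.reset()        # OAR with zero action/reward
oar, done, info = wrapped.step(3)
```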

I have finished the example using JAX R2D2 agents. Please review the code and let me know if there are any modifications needed.

@Trinkle23897 Trinkle23897 linked a pull request Jun 22, 2022 that will close this issue