support for google cloud TPU #24

Closed
4 of 8 tasks
QingyuanWang opened this issue Apr 7, 2020 · 6 comments
Labels
enhancement Feature that is not a new algorithm or an algorithm enhancement

Comments

@QingyuanWang

  • I have marked all applicable categories:
    • exception-raising bug
    • RL algorithm bug
    • documentation request (i.e. "X is missing from the documentation.")
    • new feature request
  • I have visited the [source website], and in particular read the [known issues]
  • I have searched through the [issue tracker] for duplicates
  • I have mentioned version numbers, operating system and environment, where applicable:

Hi,
I see distributed training in your todo list; does that include support for Google Cloud TPU?

@Trinkle23897
Collaborator

Trinkle23897 commented Apr 8, 2020

I don't use TPUs myself, but @fengredrum said he would like to do this work in the future.

@fengredrum
Contributor

Hi, @QingyuanWang
Google just announced an offline RL paradigm, which is well suited to training on hardware like Cloud TPUs.

They've released the corresponding dataset and code; you can check the details at this URL: https://ai.googleblog.com/2020/04/an-optimistic-perspective-on-offline.html

Maybe some day I'll write some code for it, just maybe :)

@QingyuanWang
Author

Thank you for the information, I will check it out.

@DrJimFan

I think you can always use tianshou to roll out the env and collect samples, and then train the network with PyTorch compiled with TPU support: https://medium.com/pytorch/get-started-with-pytorch-cloud-tpus-and-colab-a24757b8f7fc
It seems quite straightforward and doesn't require changes to tianshou itself.
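
To make the suggested split concrete, here is a minimal, untested sketch of the training half: rollouts and the replay buffer stay on the CPU (e.g. via tianshou's Collector), and only the gradient step runs on a single TPU core through torch_xla. The network, minibatch, and loss below are stand-ins for illustration, not tianshou's actual policy code.

```python
import torch
import torch_xla.core.xla_model as xm  # requires a TPU runtime with torch_xla installed

device = xm.xla_device()  # the single TPU core exposed by torch_xla

# Stand-in Q-network; in practice this would be the network inside the policy.
model = torch.nn.Sequential(
    torch.nn.Linear(4, 64), torch.nn.ReLU(), torch.nn.Linear(64, 2)
).to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Stand-in minibatch; in practice it would be sampled from the replay buffer
# that tianshou's Collector fills on the CPU side, then moved to `device`.
obs = torch.randn(32, 4, device=device)
act = torch.randint(0, 2, (32,), device=device)
returns = torch.randn(32, device=device)

q = model(obs).gather(1, act.unsqueeze(1)).squeeze(1)
loss = torch.nn.functional.mse_loss(q, returns)

optimizer.zero_grad()
loss.backward()
xm.optimizer_step(optimizer, barrier=True)  # flushes the lazily built XLA graph to the TPU
```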

@QingyuanWang
Author

I think you can always use tianshou to roll out the env and collect samples, and then train the network with PyTorch compiled with TPU support: https://medium.com/pytorch/get-started-with-pytorch-cloud-tpus-and-colab-a24757b8f7fc
It seems quite straightforward and doesn't require changes to tianshou itself.

Hi, it is not difficult to use PyTorch on a single TPU core. However, a TPU has 8 cores with 8 GB of memory each, and that parallelism is one important area where TPUs outperform GPUs. What I hope is that tianshou will support distributed training across all 8 cores. I'm not sure whether that would differ from distributed training on GPUs.
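
For reference, the 8-core side would probably look something like the rough, untested outline below, using torch_xla's multiprocessing helper rather than anything in tianshou: one process is spawned per core, and xm.optimizer_step() all-reduces gradients across the replicas before each update. The model and data are stand-ins; how environment rollouts and the replay buffer would be shared across the processes is exactly the open design question here.

```python
import torch
import torch_xla.core.xla_model as xm
import torch_xla.distributed.xla_multiprocessing as xmp


def _train_fn(index):
    device = xm.xla_device()  # the TPU core assigned to this process
    model = torch.nn.Linear(4, 2).to(device)  # stand-in for the policy network
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    for _ in range(10):
        # Stand-in data; a real setup would pull batches from a (shared) replay buffer.
        obs = torch.randn(32, 4, device=device)
        target = torch.randn(32, 2, device=device)
        loss = torch.nn.functional.mse_loss(model(obs), target)
        optimizer.zero_grad()
        loss.backward()
        xm.optimizer_step(optimizer)  # all-reduces gradients across the cores before stepping


if __name__ == '__main__':
    # One process per TPU core; nprocs=8 matches a full Cloud TPU v2/v3 device.
    xmp.spawn(_train_fn, args=(), nprocs=8, start_method='fork')
```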

@Trinkle23897 Trinkle23897 added the enhancement Feature that is not a new algorithm or an algorithm enhancement label May 5, 2020
@duburcqa
Collaborator

@Trinkle23897 I don't think it is still relevant to keep this issue open.
