
About dataset #1

Closed · penghao-wu opened this issue on Aug 8, 2022 · 16 comments

@penghao-wu

Thanks for sharing your great work! I was wondering whether it would be possible to share the zip file of the frames sampled from the YouTube videos? Downloading all of the videos is a bit time- and space-consuming.
Thank you in advance.

@zqh0253
Collaborator

zqh0253 commented Aug 9, 2022

Hi, thanks for your interest in our work. You can find the frames here: OneDrive link.
After downloading, you can run

cat sega* > frames.zip

to get the zip file.
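If `cat` is not available (for example on Windows), a minimal Python equivalent might look like this, assuming the downloaded parts are named `sega*` and sit in the current directory:

```python
# Reassemble the split archive, equivalent to `cat sega* > frames.zip`.
import glob
import zipfile

with open("frames.zip", "wb") as out:
    for part in sorted(glob.glob("sega*")):
        with open(part, "rb") as f:
            out.write(f.read())

# Sanity check that the reassembled archive is a valid zip.
print(zipfile.is_zipfile("frames.zip"))
```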

@penghao-wu
Author

Thank you very much! Another question about the downstream tasks: do you also use the DI-drive engine for imitation learning and collect data with the default settings? Also, which Carla version is used for training and evaluation?

@zqh0253
Collaborator

zqh0253 commented Aug 9, 2022

Yes, I use the DI-drive engine for imitation learning, and the Carla version I use is 0.9.9.4.

Starting from the default settings, I change a few things, including the camera setting (to match the pretrained image resolution).

@penghao-wu
Author

Thanks. Could you provide the camera settings you used, including the size, position, and FOV? Also, could you share more details about the data-collection and evaluation suites? For example, do you use the default 'FullTown01-v1' suite and weather to collect data, and which evaluation suite and weather are used (straight / one turn / navigation / navigation with dynamics)? That would be very helpful.
Thanks a lot.

@zqh0253
Collaborator

zqh0253 commented Aug 10, 2022

The camera setting I use is dict(size=[320, 180], position=[2.0, 0.0, 1.4], rotation=[0, 0, 0], fov=100). I use the default FullTown01-v1 suite to collect data and FullTown02-v2 for evaluation.
Note that these settings are based on an old version of DI-drive (commit ID: f532c9e9a6b26386a933049c1754ca5262d76e0a).
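For reference, the same setting written out as a plain Python dict with annotated fields (the field meanings follow the usual Carla camera convention and are my reading of them, not something documented in this thread):

```python
# Camera setting as quoted above. How it is wired into the DI-drive config
# may differ between versions, so treat this as a sketch of the values only.
camera_setting = dict(
    size=[320, 180],           # rendered image width x height in pixels
    position=[2.0, 0.0, 1.4],  # offset from the ego vehicle in meters (forward, lateral, up)
    rotation=[0, 0, 0],        # camera rotation in degrees (no tilt)
    fov=100,                   # horizontal field of view in degrees
)
```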

@penghao-wu
Author

Thanks for your help. I appreciate it a lot.

@penghao-wu
Author

penghao-wu commented Aug 10, 2022

Sorry to bother you again. I have a few questions about the training to confirm:

  • Why do we have a rindex of 6510787 in the dataset? Is it used to fix the index problem in dir-65?
  • There are 8,642,040 frames in total, and if we sample them at an interval of 10, there would be about 0.8M samples, which does not match 1.3M.
  • For pre-training, do you take the checkpoint with the best accuracy on the 30% test set, or the last-epoch one?
  • For imitation learning training, do you use the last-epoch (epoch 100) checkpoint for evaluation?

Thanks in advance.

@zqh0253
Collaborator

zqh0253 commented Aug 10, 2022

Why do we have a rindex of 6510787 in the dataset?

This number is used to fix a bug in the index. That part of the data is not ready yet, and I am still working on it.

There are 8,642,040 frames in total, and if we sample them at an interval of 10, there would be about 0.8M samples, which does not match 1.3M.

The current 0.8M frames are a small portion of the full set of YouTube frames I experimented with. I picked the video clips whose visual appearance is close to Carla to form these 0.8M frames. This does not affect the pretraining quality on Carla downstream tasks. I will consider uploading the whole dataset in the future.

For pre-training, do you take the checkpoint with the best accuracy on the 30% test set, or the last-epoch one?

I use the last epoch. I found the test accuracy stable during training, so I simply pick the last checkpoint.

For imitation learning training, do you use the last-epoch (epoch 100) checkpoint for evaluation?

This one is a little tricky. Due to IL's distribution-shift problem, I found that the test performance varies greatly between epochs. It is also hard to decide which epoch to pick, since we do not have a validation environment. So for each pretrained weight, I evaluate the (i*10)-th checkpoints for i = 3, ..., 10 and report the highest success rate.
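In pseudocode, that selection protocol looks roughly like this (a sketch only; `evaluate_success_rate` is a hypothetical helper standing in for a full evaluation run on the test suite, and the checkpoint naming is assumed):

```python
# Sketch of the checkpoint-selection protocol described above:
# evaluate the epoch-30, 40, ..., 100 checkpoints and keep the best one.
def pick_best_checkpoint(evaluate_success_rate):
    results = {}
    for i in range(3, 11):
        epoch = i * 10
        ckpt_path = f"checkpoints/epoch_{epoch}.pth"  # hypothetical naming scheme
        results[epoch] = evaluate_success_rate(ckpt_path)
    best_epoch = max(results, key=results.get)
    return best_epoch, results[best_epoch]
```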

@penghao-wu
Author

penghao-wu commented Aug 12, 2022

Hi, I followed your instructions and trained an agent (pretrained on ImageNet) using 4K data. However, its success rate evaluated on the FullTown02-v2 suite is 37.3 ± 3.1, which is higher than the 21.3 ± 7.5 reported in the paper. Am I missing any details, or are other modifications needed? What are the possible reasons, in your opinion? I am using Carla 0.9.9.4 and the same DI-drive version as you. I sample 10% of the data uniformly (e.g. data_4K = data_40K[::10]); is this the same way you do it, or should I choose data_4K = data_40K[:4000]?

Besides, do you plan to release the pre-calculated steering values for the uploaded 80K frames, or the code for the inverse dynamics model? If not, could you please share more details about the model structure so that I can implement and train it myself?

Also, since DI-drive only contains a PPO model with BEV input, could you provide your model file or model details for PPO training?

Thanks a lot!

@zqh0253
Collaborator

zqh0253 commented Aug 15, 2022

Hi,

I sample 10% of the data uniformly (e.g. data_4K = data_40K[::10]); is this the same way you do it, or should I choose data_4K = data_40K[:4000]?

I reduce the dataset size at the trajectory level, i.e. data_40K[:4000]. Since adjacent frames are redundant, reducing at the trajectory level creates a harder problem than reducing at the frame level (data_40K[::10]). So I think the performance gap you report is expected.
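To make the distinction concrete, a sketch assuming the 40K frames are stored in trajectory order (consecutive frames come from the same episode):

```python
# Two ways of shrinking a 40K-frame dataset to 4K frames.
data_40K = list(range(40000))  # stand-in for the full, trajectory-ordered frame list

# Frame-level reduction: every 10th frame, so every trajectory is still
# represented, just more sparsely.
data_4K_frame_level = data_40K[::10]

# Trajectory-level reduction: only the first 4000 frames, i.e. only the
# first few trajectories, giving much narrower coverage and a harder task.
data_4K_trajectory_level = data_40K[:4000]
```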

Do you plan to release the pre-calculated steering values?

Yes, I am working on this part and will release it in the future. Stay tuned.

PPO training.

I did not experiment much with the PPO model design. A ResNet-34 backbone is used to extract the visual feature. The feature then goes through an MLP and is concatenated with the velocity to form the encoder output.
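A minimal PyTorch sketch of an encoder along those lines (the hidden size and other details are my assumptions, not the exact configuration used in the paper):

```python
# Sketch of the described PPO encoder: ResNet-34 visual features -> MLP,
# concatenated with the scalar speed. Sizes are illustrative assumptions.
import torch
import torch.nn as nn
import torchvision

class PPOEncoder(nn.Module):
    def __init__(self, feat_dim=256):
        super().__init__()
        backbone = torchvision.models.resnet34(weights=None)
        backbone.fc = nn.Identity()  # keep the 512-d pooled feature
        self.backbone = backbone
        self.mlp = nn.Sequential(nn.Linear(512, feat_dim), nn.ReLU())

    def forward(self, image, speed):
        # image: (B, 3, H, W) RGB observation, speed: (B, 1) ego velocity
        feat = self.mlp(self.backbone(image))
        return torch.cat([feat, speed], dim=1)  # (B, feat_dim + 1)
```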

@zqh0253
Collaborator

zqh0253 commented Aug 15, 2022

Let's keep this issue focused on the dataset. If you have further questions about training, feel free to open a new one.

zqh0253 closed this as completed on Aug 15, 2022
@SiyuanHuang95

Hi, thanks for sharing your great work!

  • In this issue, you mentioned that you picked the clips with a close visual appearance to Carla, so what are the criteria for visual-appearance similarity? Does picking these 0.8M frames bring better performance, or does it just save training cost?
  • Did you follow the default suite of DI-Drive for the IL Carla dataset generation?

Best,

@zqh0253
Collaborator

zqh0253 commented Dec 1, 2022

Hi, thanks for your interest in our work.

  1. There are no clear criteria; I simply removed some driving videos with extreme weather to save training cost. More carefully designed measures would help, for example, calculating the feature distance between Carla frames and the frames of a particular video and then sorting all the videos by that distance (see the sketch below).
  2. Yes, I follow the default suite of DI-Drive for IL dataset generation, except for the settings mentioned earlier in this thread.
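One possible sketch of that ranking idea, using an ImageNet-pretrained backbone as a generic feature extractor (this is not the procedure used in the paper; the model choice and distance are illustrative assumptions):

```python
# Rank YouTube videos by how close their mean visual feature is to Carla frames.
import torch
import torchvision

model = torchvision.models.resnet34(weights="IMAGENET1K_V1")
model.fc = torch.nn.Identity()  # use the 512-d pooled feature
model.eval()

@torch.no_grad()
def mean_feature(frames):
    # frames: (N, 3, H, W) tensor of preprocessed images
    return model(frames).mean(dim=0)

def rank_videos(carla_frames, videos):
    # videos: dict mapping video id -> (N, 3, H, W) tensor of its frames
    carla_feat = mean_feature(carla_frames)
    dists = {vid: torch.norm(mean_feature(f) - carla_feat).item()
             for vid, f in videos.items()}
    return sorted(dists, key=dists.get)  # video ids, closest to Carla first
```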

@SiyuanHuang95

SiyuanHuang95 commented Dec 1, 2022 via email

@zqh0253
Collaborator

zqh0253 commented Dec 7, 2022

  1. We didn't conduct experiments comparing different dataset sizes. Indeed, with more diversity the pre-trained model could be even stronger.
  2. The newer versions of DI-drive do not support Carla 0.9.9, and that is why we used an old version.

@SiyuanHuang95

  1. Okay, thanks.
  2. Thanks for the information.
