Sub-optimal results on customized KITTI dataset #14
Comments
Thanks a lot for your message! It is difficult to say whether the method will work on KITTI, as the dataset is quite a bit smaller, and the larger diversity of Waymo Open might be helpful in stabilizing the approach. Regarding your questions:
Hope this helps!
Thank you for your feedback! Do you have any suggestions on the image resolution? In our implementation, the input/output image size is set to 192x640, which differs from the Waymo setting. I noticed that the broadcast decoder upsamples the resolution by a factor of 16, and that a fixed patch size (i.e., 8) is used during the encoding stage. In your experience, is it necessary to tune these parameters according to the input resolution? Thank you very much!
Thanks for reaching out. In the Waymo experiments, we used a resolution of 128x192 and a high-resolution version at 256x384. There are two parameters to adapt the decoder for different resolutions:
For the resolution you mentioned (192x640), I believe changing the decoder resolution parameter from (8, 12) to (12, 40) may be worth trying. I expect this to be computationally expensive given the high resolution.
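For reference, the arithmetic behind that suggestion can be sketched as follows. This is only an illustration, assuming the decoder upsamples its starting grid by a fixed factor of 16 as described earlier in this thread; `decoder_grid` and `relative_pixel_cost` are hypothetical helper names, not functions from the repository.

```python
def decoder_grid(image_hw, upsample_factor=16):
    """Return the (height, width) grid a decoder would start from.

    Assumes the decoder upsamples by `upsample_factor` in each dimension,
    so the image size must be divisible by it.
    """
    height, width = image_hw
    if height % upsample_factor or width % upsample_factor:
        raise ValueError(
            f"image size {image_hw} is not divisible by {upsample_factor}"
        )
    return height // upsample_factor, width // upsample_factor


def relative_pixel_cost(image_hw, reference_hw=(128, 192)):
    """Ratio of output pixels compared with a reference resolution."""
    return (image_hw[0] * image_hw[1]) / (reference_hw[0] * reference_hw[1])


print(decoder_grid((128, 192)))         # (8, 12)
print(decoder_grid((192, 640)))         # (12, 40)
print(relative_pixel_cost((192, 640)))  # 5.0
```

The last line shows that 192x640 has five times as many pixels as 128x192, which is a rough indication of why the higher resolution is expected to be expensive.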
Hi!
Thank you for sharing the code of this impressive work.
Since there is no config file or training list for the Waymo dataset, I built a KITTI benchmark based on the KITTI_STEP dataset as a workaround. It has annotated instance labels for the train/val sets, with ~5000 images usable for evaluating the ARI metric. I chose ~3500 for training and ~1500 for testing, using the depth supervision of SAVi++. Since there is no optical flow annotation, I use only LiDAR points for supervision.
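To make the supervision setup concrete, here is a minimal sketch of a sparse depth loss. It is not the repository's code: it assumes that pixels without a LiDAR return are stored as 0, that targets are transformed with log(d+1), and that a squared error is averaged over valid pixels only. The function name `masked_log_depth_loss` is hypothetical.

```python
import math


def masked_log_depth_loss(pred_log_depth, lidar_depth):
    """Mean squared error in log(d + 1) space over pixels with a LiDAR return.

    pred_log_depth: flat list of predicted values, already in log(d + 1) space.
    lidar_depth:    flat list of metric depths; 0 marks "no return" (assumption).
    """
    errors = [
        (pred - math.log1p(depth)) ** 2
        for pred, depth in zip(pred_log_depth, lidar_depth)
        if depth > 0  # skip pixels without a LiDAR measurement
    ]
    if not errors:
        raise ValueError("no valid LiDAR points in this frame")
    return sum(errors) / len(errors)


# The second pixel has no LiDAR return, so its prediction is ignored.
loss = masked_log_depth_loss(
    [math.log1p(10.0), 3.0, math.log1p(20.0)],
    [10.0, 0.0, 20.0],
)
print(loss)  # 0.0
```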
After training the network, neither the depth nor the segmentations are as good as reported in the paper. The FG-ARI is around 8.0, which is quite low. Visualizations for a single frame of the sequence are shown below (from top to bottom: image, interpolated_depth_gt, segmentations_gt, depth_pred, segmentations_pred).
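For clarity on the metric, this is a minimal sketch of how FG-ARI is commonly computed: the adjusted Rand index restricted to pixels whose ground-truth label is foreground. It assumes background pixels carry label 0 and that labels are given as flat per-pixel lists; both are assumptions for illustration, not the evaluation code used here. The sketch returns a value of at most 1, so a score of 8.0 presumably corresponds to a 0-100 scale.

```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index between two flat label lists of equal length."""
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0
    # Count pixel pairs that share a cluster jointly, in GT only, in pred only.
    joint = sum(comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values())
    true_pairs = sum(comb(c, 2) for c in Counter(labels_true).values())
    pred_pairs = sum(comb(c, 2) for c in Counter(labels_pred).values())
    expected = true_pairs * pred_pairs / total_pairs
    maximum = 0.5 * (true_pairs + pred_pairs)
    if maximum == expected:
        return 1.0  # degenerate case: both partitions are trivial
    return (joint - expected) / (maximum - expected)


def fg_ari(gt_labels, pred_labels, background_id=0):
    """ARI over pixels whose ground-truth label is not `background_id`."""
    pairs = [(g, p) for g, p in zip(gt_labels, pred_labels) if g != background_id]
    if not pairs:
        raise ValueError("no foreground pixels in ground truth")
    gt_fg, pred_fg = zip(*pairs)
    return adjusted_rand_index(gt_fg, pred_fg)


gt = [0, 0, 1, 1, 2, 2]
print(fg_ari(gt, [5, 5, 7, 7, 9, 9]))  # 1.0 (objects separated perfectly)
print(fg_ari(gt, [3, 3, 3, 3, 3, 3]))  # 0.0 (everything in one segment)
```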
Considering the large discrepancy between the KITTI and Waymo datasets, I have some questions about the settings you used in the Waymo experiments:
log(d+1)
)? It would be great if you could offer any other insights on how to achieve comparable results on this outdoor dataset. Thank you!