wyndwarrior/autoregressive-bbox

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction

PyTorch implementation of our autoregressive model formulation for 3D bounding-box estimation and detection.

Autoregressive Uncertainty Modeling for 3D Bounding Box Prediction
YuXuan Liu1,2, Nikhil Mishra1,2, Maximilian Sieb1, Yide Shentu1,2, Pieter Abbeel1,2, Xi Chen1
1Covariant.ai, 2UC Berkeley
in ECCV 2022



Autoregressive 3D Bounding Box Estimation

3D bounding-box estimation assumes that 2D object segmentation has already been performed by a segmentation model, e.g. Mask R-CNN. Our autoregressive bounding-box estimation model can be found under autoreg-bbox.

Python dependencies are listed in requirements.txt and can be installed via pip install -r requirements.txt. We provide two Jupyter notebooks:

  1. visualize_data.ipynb which lets you visualize data samples from our new dataset COB-3D. We provide code to visualize 2D masks and 3D bounding boxes.
  2. inference_example.ipynb which lets you run inference with our newly proposed model architecture for the 3D Bounding Box Estimation task. We provide trained model weights which you can download here. Any use of the dataset, code, and weights is subject to our CC Attribution-NonCommercial-ShareAlike License.

Autoregressive 3D Bounding Box Detection

3D bounding-box detection predicts 3D bounding boxes directly from a point cloud.

We forked the repositories of two state-of-the-art detection methods, FCAF3D and PVRCNN, and implemented our autoregressive head on top of each. The augmented code can be found under the respective folders autoreg-fcaf3d and autoreg-pvrcnn.

COB-3D Dataset

You can download COB-3D, our newly published dataset of common objects in bins for robotic picking applications, here. Any use of the dataset, code, and weights is subject to our CC Attribution-NonCommercial-ShareAlike License. All of the data was created by Theory Studios.

Each data point contains the following:

  • RGB image of shape (H, W, 3)
  • Depth map of shape (H, W)
  • Intrinsic Matrix of the camera (3, 3)
  • Normals Map of shape (H, W, 3)
  • Instance Masks of shape (N, H, W) where N is the number of objects
  • Amodal Instance masks of shape (N, H, W) which includes the occluded regions of the object
  • 3D Bounding Box of each object (N, 9) as determined by dimensions, center, and rotation.
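The shapes listed above can be checked programmatically before a sample is fed to a model. The following is a minimal sketch, not the dataset's actual loader: it assumes a data point has already been loaded into a dictionary of NumPy arrays, and the key names (rgb, depth, intrinsic, normals, masks, amodal_masks, boxes) are hypothetical placeholders chosen for illustration. It also shows one way the visible and amodal masks might be combined to estimate how much of each object is occluded.

```python
import numpy as np


def check_sample(sample):
    """Validate the array shapes of one data point and return (H, W, N).

    The dictionary keys are hypothetical; adapt them to the real loader.
    """
    h, w, c = sample["rgb"].shape
    if c != 3:
        raise ValueError(f"rgb must have 3 channels, got {c}")
    n = sample["masks"].shape[0]
    expected = {
        "depth": (h, w),
        "intrinsic": (3, 3),
        "normals": (h, w, 3),
        "masks": (n, h, w),
        "amodal_masks": (n, h, w),
        "boxes": (n, 9),
    }
    for key, shape in expected.items():
        if sample[key].shape != shape:
            raise ValueError(
                f"{key}: expected shape {shape}, got {sample[key].shape}"
            )
    return h, w, n


def occluded_fraction(masks, amodal_masks):
    """Per-object fraction of the amodal mask that is not visible."""
    visible = masks.astype(bool)
    amodal = amodal_masks.astype(bool)
    hidden = (amodal & ~visible).sum(axis=(1, 2))
    total = amodal.sum(axis=(1, 2))
    # Guard against empty amodal masks to avoid division by zero.
    return hidden / np.maximum(total, 1)
```

The authoritative loading code is in visualize_data.ipynb; this sketch only restates the shape contract from the list above.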

For more info and example code on how to load & interact with the data, refer to the visualize_data.ipynb Jupyter notebook.
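Because each data point carries a depth map and the camera intrinsics, a per-object point cloud can be recovered with the standard pinhole back-projection. The sketch below is illustrative only and rests on several assumptions that the dataset documentation should be checked against: the intrinsic matrix has the usual layout [[fx, 0, cx], [0, fy, cy], [0, 0, 1]] with zero skew, the depth map stores distance along the optical axis (not along the viewing ray), and a depth of zero marks an invalid pixel. The function name backproject is hypothetical.

```python
import numpy as np


def backproject(depth, intrinsic, mask=None):
    """Back-project a depth map to an (M, 3) point cloud in the camera frame.

    Assumes a zero-skew pinhole camera and z-depth; pixels with depth <= 0
    are treated as invalid. If `mask` (H, W) is given, only pixels inside
    the mask are kept, e.g. one instance mask for one object.
    """
    h, w = depth.shape
    fx, fy = intrinsic[0, 0], intrinsic[1, 1]
    cx, cy = intrinsic[0, 2], intrinsic[1, 2]

    # v indexes rows (image y), u indexes columns (image x).
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")

    valid = depth > 0
    if mask is not None:
        valid &= mask.astype(bool)

    z = depth[valid]
    x = (u[valid] - cx) * z / fx
    y = (v[valid] - cy) * z / fy
    return np.stack([x, y, z], axis=-1)
```

Passing one of the N instance masks as mask yields the points belonging to a single object, which is the kind of per-object input a segmentation-conditioned estimator would consume.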

License

Shield: CC BY-NC-SA 4.0

This work, including the paper, code, weights, and dataset, is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
