MultiScan: Scalable RGBD scanning for 3D environments with articulated objects

MultiScan is a scalable RGBD dataset construction pipeline leveraging commodity mobile devices to scan indoor scenes with articulated objects and web-based semantic annotation interfaces to efficiently annotate object and part semantics and part mobility parameters.

The repository includes:

  • Source code of iOS and Android scanning apps
  • Processing server for 3D reconstruction, texturing and segmentation
  • Web interface for browsing scans and initiating processing
  • Source code of benchmark dataset preparation
  • Source code of data visualization

MultiScan Dataset

Download the MultiScan dataset download script, copy it to the [PROJECT_ROOT]/dataset directory, and run it to download the dataset:

./dataset/download.sh <output_dir>

Unzip files:

cd <output_dir>
unzip "*.zip"

The downloaded dataset follows this file system structure.

The MultiScan dataset includes:

  1. Acquired data from scanner app: doc
  2. Output data from processing server: doc
  3. Annotation data: doc

MultiScan Benchmark Dataset

Download the MultiScan benchmark dataset download script, copy it to the [PROJECT_ROOT]/dataset directory, and follow the instructions below to download the MultiScan benchmark dataset:

Object instance segmentation

Preprocessed object instance segmentation data download:

./dataset/download_benchmark_dataset.sh -o <output_dir>

Part instance segmentation

Preprocessed part instance segmentation data download:

./dataset/download_benchmark_dataset.sh -p <output_dir>

Mobility prediction

Preprocessed articulated objects dataset download:

./dataset/download_benchmark_dataset.sh -a <output_dir>

Unzip files with:

cd <output_dir>
unzip "*.zip"

Please check out the benchmark dataset doc for information about downloading the preprocessed datasets and about the preprocessing scripts.
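
The three download commands above can be combined in a small wrapper. The following is only a minimal sketch: it assumes it is run from [PROJECT_ROOT] and that the `-o`, `-p`, and `-a` flags behave as shown above. The function name `download_benchmark_all` and the `DRY_RUN` and `DOWNLOAD_SCRIPT` variables are hypothetical and not part of the repository.

```shell
# Sketch: fetch all three benchmark subsets with one call.
# download_benchmark_all, DRY_RUN and DOWNLOAD_SCRIPT are illustrative names,
# not part of the MultiScan repository.
download_benchmark_all() {
    local output_dir="$1"
    local script="${DOWNLOAD_SCRIPT:-./dataset/download_benchmark_dataset.sh}"
    local flag
    # -o: object instance segmentation, -p: part instance segmentation,
    # -a: articulated objects (mobility prediction)
    for flag in -o -p -a; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only print the command that would be run.
            echo "$script $flag $output_dir"
        else
            "$script" "$flag" "$output_dir" || return 1
        fi
    done
}
```

Setting `DRY_RUN=1` before calling `download_benchmark_all <output_dir>` prints the three commands instead of running them, which can help to check paths before starting a long download.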


Scanner App

The Scanner App collects data using the sensors on an Android or iOS device. The user moves around the scene while holding a device with the Scanner App installed. Once scanning is complete, the user can upload the data to the processing server.

  • source code for iOS scanning app: iOS code
  • documentation for iOS scanning app: iOS doc
  • source code for Android scanning app: Android code
  • documentation for Android scanning app: Android doc

Processing Server

The processing (staging) server has three main functions:

  1. Stage scans uploaded by the devices (iOS or Android) and trigger scan processing. So that scans can be processed automatically, they should be placed in a directory that has ample free space and is accessible to the scan processor.
  2. Process staged scans. Handle reconstruction processing requests from the Web-UI, issued when the user presses the interactive buttons on the Web-UI.
  3. Index staged scans. Go through the scan folders and collate information about the scans.

  • source code for processing server: server code
  • installation doc for processing server: install
  • configuration and documentation for processing server: doc
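
The indexing step (going through scan folders and collating information) can be illustrated with a short shell sketch. This is not the server's implementation: it assumes one subdirectory per scan under a staging directory, and the function name `index_staged_scans` and the two-column output are hypothetical choices made for illustration.

```shell
# Sketch: list each scan folder under a staging directory with its file count.
# Assumes one subdirectory per scan; index_staged_scans is an illustrative
# name, not a command provided by the MultiScan processing server.
index_staged_scans() {
    local staging_dir="$1"
    local scan_dir
    # Header row of the tab-separated index.
    printf 'scan_id\tnum_files\n'
    for scan_dir in "$staging_dir"/*/; do
        # Skip the unexpanded pattern when there are no subdirectories.
        [ -d "$scan_dir" ] || continue
        printf '%s\t%s\n' "$(basename "$scan_dir")" \
            "$(find "$scan_dir" -type f | wc -l | tr -d ' ')"
    done
}
```

Calling `index_staged_scans <staging_dir>` prints one tab-separated row per scan folder. The index built by the actual server collates more information; see the server doc for details.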

Staging Data Formats

Details about the formats of the uploaded files and of the data generated by the processing server are available here.


Web-UI

The Web-UI is an interactive interface for providing an overview of staged scan data, managing scan data, and controlling the reconstruction and mesh annotation pipeline.


Benchmark

With the MultiScan dataset, we carry out a series of benchmark experiments to evaluate methods from recent work on object instance segmentation, part instance segmentation, and mobility prediction.

Please check out the benchmark dataset doc for information about downloading the preprocessed datasets and about the preprocessing scripts.

Benchmark train/val/test split and selected object/part semantic labels and IDs:


Visualization

Annotations visualization

Turntable video visualizations of the semantic label annotation, semantic OBB annotation, articulation annotation, and textured mesh of the scans. Please check out the visualization doc for more information.


Citation

If you use the MultiScan data or code, please cite:

@inproceedings{mao2022multiscan,
    author = {Mao, Yongsen and Zhang, Yiming and Jiang, Hanxiao and Chang, Angel X and Savva, Manolis},
    title = {MultiScan: Scalable RGBD scanning for 3D environments with articulated objects},
    booktitle = {Advances in Neural Information Processing Systems},
    year = {2022}
}

References

Our work is built on top of the ScanNet dataset acquisition framework, and uses Open3D and MVS-Texturing for 3D reconstruction. We use Open3D, Pyrender, MeshLab, and Instant Meshes for rendering and post-processing.

@misc{dai2017scannet,
    title={ScanNet: Richly-annotated 3D Reconstructions of Indoor Scenes}, 
    author={Angela Dai and Angel X. Chang and Manolis Savva and Maciej Halber and Thomas Funkhouser and Matthias Nießner},
    year={2017},
    eprint={1702.04405},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}

@article{Zhou2018,
    author    = {Qian-Yi Zhou and Jaesik Park and Vladlen Koltun},
    title     = {{Open3D}: {A} Modern Library for {3D} Data Processing},
    journal   = {arXiv:1801.09847},
    year      = {2018},
}

@inproceedings{Waechter2014Texturing,
    title    = {Let There Be Color! --- {L}arge-Scale Texturing of {3D} Reconstructions},
    author   = {Waechter, Michael and Moehrle, Nils and Goesele, Michael},
    booktitle= {Proceedings of the European Conference on Computer Vision},
    year     = {2014},
    publisher= {Springer},
}

@article{Jakob2015Instant,
    author = {Wenzel Jakob and Marco Tarini and Daniele Panozzo and Olga Sorkine-Hornung},
    title = {Instant Field-Aligned Meshes},
    journal = {ACM Transactions on Graphics (Proceedings of SIGGRAPH ASIA)},
    volume = {34},
    number = {6},
    year = {2015},
    month = nov,
    doi = {10.1145/2816795.2818078},
}

@inproceedings{LocalChapterEvents:ItalChap:ItalianChapConf2008:129-136,
    booktitle = {Eurographics Italian Chapter Conference},
    editor = {Vittorio Scarano and Rosario De Chiara and Ugo Erra},
    title = {{MeshLab: an Open-Source Mesh Processing Tool}},
    author = {Cignoni, Paolo and Callieri, Marco and Corsini, Massimiliano and Dellepiane, Matteo and Ganovelli, Fabio and Ranzuglia, Guido},
    year = {2008},
    publisher = {The Eurographics Association},
    ISBN = {978-3-905673-68-5},
    DOI = {10.2312/LocalChapterEvents/ItalChap/ItalianChapConf2008/129-136}
}