A modular PyTorch library for multi-view research on 3D understanding and 3D generation.
MVTorch provides efficient, reusable components for 3D computer vision and graphics research based on multi-view representations, built with PyTorch and PyTorch3D.
- Render differentiable multi-view images from meshes and point clouds with 3D-2D correspondences.
- Data loaders for 3D data and multi-view images (posed or unposed).
- Visualizations of 3D meshes, point clouds, and multi-view images.
- Modular training of multi-view networks for different 3D tasks.
- I/O for 3D data and multi-view images.
Operators in MVTorch:
- Are implemented using PyTorch tensors on top of PyTorch3D.
- Can handle minibatches of heterogeneous data.
- Can be differentiated for input gradients.
- Can utilize GPUs for acceleration.
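To make "minibatches of heterogeneous data" concrete, here is a minimal sketch of one common approach: padding point clouds of different sizes to a shared length and keeping a validity mask. It uses plain Python lists so it runs without dependencies; the function name `pad_point_clouds` is hypothetical and is not part of the MVTorch API, and the library's actual batching may differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def pad_point_clouds(
    clouds: Sequence[Sequence[Point]], pad_value: float = 0.0
) -> Tuple[List[List[Point]], List[List[bool]]]:
    """Pad variable-size point clouds to a common length.

    Returns the padded clouds and a mask that is True for real points
    and False for padding. (Hypothetical helper, for illustration only.)
    """
    max_len = max((len(cloud) for cloud in clouds), default=0)
    padded: List[List[Point]] = []
    mask: List[List[bool]] = []
    for cloud in clouds:
        n_missing = max_len - len(cloud)
        # Append dummy points so every cloud has max_len entries.
        padded.append(list(cloud) + [(pad_value,) * 3] * n_missing)
        mask.append([True] * len(cloud) + [False] * n_missing)
    return padded, mask


# Two clouds with 2 and 3 points respectively.
clouds = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (0.0, 0.0, 1.0), (1.0, 1.0, 1.0)],
]
padded, mask = pad_point_clouds(clouds)
print([len(cloud) for cloud in padded])
print(mask[0])
```

With tensors, the same idea yields a dense `(batch, max_points, 3)` array plus a boolean mask, so downstream operations can ignore the padded entries.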
For detailed instructions refer to INSTALL.md.
- After installing mvtorch, download common 3D datasets (ModelNet40, ScanObjectNN, ShapeNet Parts, nerf_synthetic) and unzip them inside the data directory.
cd data/
wget https://shapenet.cs.stanford.edu/media/shapenet_part_seg_hdf5_data.zip --no-check-certificate # download ShapeNet Parts
# download the other datasets from the browser
- Run any example from the examples directory.
cd examples/ && python classification.py
Get started with MVTorch by trying one of the following tutorials.
Training MVCNN in 10 lines of code for 3D Classification | Training 3D Part Segmentation with Multi-View DeepLabV3 |
Fit A Simple Neural Radiance Field | Create Textured Meshes from Text |
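As background for the MVCNN classification tutorial, the core idea of MVCNN is to pool per-view features into a single global descriptor, typically with an element-wise max over views. The sketch below shows that pooling step in plain Python; `pool_views` is a hypothetical name used for illustration and is not MVTorch's API.

```python
from typing import List, Sequence


def pool_views(view_features: Sequence[Sequence[float]], mode: str = "max") -> List[float]:
    """Aggregate per-view feature vectors into one global feature vector.

    view_features holds one feature vector per rendered view.
    (Hypothetical helper, for illustration only.)
    """
    if not view_features:
        raise ValueError("at least one view is required")
    dim = len(view_features[0])
    if any(len(feat) != dim for feat in view_features):
        raise ValueError("all views must have the same feature dimension")

    # zip(*...) groups the values of each feature dimension across views.
    columns = list(zip(*view_features))
    if mode == "max":
        return [max(col) for col in columns]
    if mode == "mean":
        return [sum(col) / len(col) for col in columns]
    raise ValueError(f"unknown pooling mode: {mode}")


# Three views, each described by a 3-dimensional feature vector.
features = [
    [0.25, 1.0, -0.5],
    [0.75, 0.0, 0.5],
    [0.5, 0.5, 0.0],
]
print(pool_views(features, mode="max"))
print(pool_views(features, mode="mean"))
```

Max pooling keeps the strongest response per feature across views, while mean pooling weighs every view equally; both give a result that does not depend on the order of the views.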
- MVRenderer (renders multi-view images of both point clouds and meshes)
- MVNetwork (takes any 2D network as input and outputs its multi-view features)
- Visualizer (handles multi-view and 3D visualization, both for saving on a server and for interactive viewing)
- data I/O (loads common datasets: ModelNet, ShapeNet, ScanObjectNN, ShapeNet Parts, S3DIS, NeRF; also saves multi-view datasets)
- ViewSelector (selects M viewpoints to render: random, circular, spherical, MVTN, etc.)
- MVAggregate (a super model that accepts any 2D network as input and outputs the global multi-view features of the input multi-view images, e.g. MeanPool, MaxPool)
- MVLifting (aggregates dense multi-view pixel features into 3D features, e.g. LabelPool, MeanPool, Voint aggregation and lifting)
- Other useful utility functions and operations.
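To illustrate what a circular view selection could look like, the sketch below places M cameras at evenly spaced azimuth angles with a fixed elevation and distance, then converts them to Cartesian positions. The function names, default values, and the y-up axis convention are assumptions made for this example; they are not taken from the MVTorch API.

```python
import math
from typing import List, Tuple


def circular_views(
    num_views: int, elevation_deg: float = 30.0, distance: float = 2.0
) -> List[Tuple[float, float, float]]:
    """Return (azimuth_deg, elevation_deg, distance) for evenly spaced views.

    (Hypothetical helper, for illustration only.)
    """
    if num_views < 1:
        raise ValueError("num_views must be at least 1")
    step = 360.0 / num_views
    return [(i * step, elevation_deg, distance) for i in range(num_views)]


def camera_position(
    azim_deg: float, elev_deg: float, distance: float
) -> Tuple[float, float, float]:
    """Convert spherical camera parameters to a Cartesian position.

    Assumes a y-up convention with azimuth measured from the +z axis.
    """
    azim = math.radians(azim_deg)
    elev = math.radians(elev_deg)
    x = distance * math.cos(elev) * math.sin(azim)
    y = distance * math.sin(elev)
    z = distance * math.cos(elev) * math.cos(azim)
    return (x, y, z)


views = circular_views(4)
print([azim for azim, _, _ in views])  # → [0.0, 90.0, 180.0, 270.0]
positions = [camera_position(*view) for view in views]
```

A random selector would sample the azimuth and elevation instead of spacing them evenly, and a spherical selector would spread the cameras over the whole sphere rather than a single ring.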
We welcome new contributions to MVTorch by following this procedure for pull requests:
- For code modifications, create an issue with the tag request and wait 10 days for the issue to be resolved.
- If the issue is not resolved in 10 days, fork the repo and create a pull request on a new branch. Please make sure the main examples still run after your changes to the core library.
- For additional examples, just create a pull request without creating an issue.
- If you can contribute regularly to the library, please contact Abdullah to be added to the contributors list.
If you find mvtorch useful in your research, please cite the library paper:
@misc{hamdi2022mvtn,
title={MVTN: Learning Multi-View Transformations for 3D Understanding},
author={Abdullah Hamdi and Faisal AlZahrani and Silvio Giancola and Bernard Ghanem},
year={2022},
eprint={2212.13462},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
[July 23 2022]: MVTorch repo created
[December 26 2022]: MVTorch made public
Projects that MVTorch benefited from during development: MVTN, Voint Cloud, Text2Mesh, and NeRF.
Detailed documentation of the library is coming soon.
Coming soon ...
MVTorch is released under the BSD License.