Kevin Chen, Christopher Choy, Manolis Savva, Angel Chang, Thomas Funkhouser, Silvio Savarese
Citing
If you find this code useful in your work, please cite us:
@article{chen2018text2shape,
  title={Text2Shape: Generating Shapes from Natural Language by Learning Joint Embeddings},
  author={Chen, Kevin and Choy, Christopher B and Savva, Manolis and Chang, Angel X and Funkhouser, Thomas and Savarese, Silvio},
  journal={arXiv preprint arXiv:1803.08495},
  year={2018}
}
Code for the turking interface can be found here.
Data
The data files required for running the code can be downloaded from the project webpage.
Due to the complicated and highly variable nature of the meshes collected in ShapeNet, a small number of models produce unexpected voxelizations. The affected models are noted in the text2shape data zip. If you find any additional such models, please report them to kevin.chen@cs.stanford.edu. Thank you!
Third Party
Please download and build the SmartScenes Toolkit.
Then, in tools/scripts/render.sh, set $TOOLKIT_PATH to the path of the SmartScenes Toolkit.
Additionally, set up the configuration file in lib/config.py. The following fields need to be edited:

General
- __C.DIR.DATA_PATH: Directory containing the train/val/test splits and JSON files.
- __C.DIR.TOOLKIT_PATH: Path to the SmartScenes Toolkit.

ShapeNet Paths
- __C.DIR.RGB_VOXEL_PATH: Directory containing the RGB voxelizations.
- __C.DIR.RAW_CAPTION_CSV: ShapeNet captions CSV.

Primitives Paths
- __C.DIR.PRIMITIVES_RGB_VOXEL_PATH: Directory containing the RGB voxelizations and descriptions.
We store our voxelizations in NRRD format. We read the data in Python using pynrrd.
To visualize the voxels, you can render the NRRD files with the ssc/render-voxels.js script from the SmartScenes Toolkit, or use your own method (e.g., save the data in a different format and visualize it as a point cloud).
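For a quick sanity check without the toolkit, you can load a voxelization directly with pynrrd. A minimal sketch, where the filename is a placeholder and the exact array layout may differ from what is shown in the comments:

```python
import nrrd  # provided by the pynrrd package

# Load one voxelization; nrrd.read returns the raw array plus the header.
# 'model.nrrd' is a placeholder for one of the downloaded voxel files.
voxels, header = nrrd.read('model.nrrd')

print(voxels.shape)      # e.g. channels x depth x height x width
print(header['sizes'])   # dimensions as recorded in the NRRD header
print(voxels.dtype, voxels.min(), voxels.max())
```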
# Train the encoder on ShapeNet
./run_lba_encoder.sh 0 LBA1 'shapenet/encoder_logdir' 'train encoder on shapenet' '--dataset shapenet --validation --visit_weight 0.25 --learning_rate 2e-4 --lba_mode MM --num_epochs 100 --decay_steps 2500 --lba_test_mode shape --batch_size 100 --lba_unnormalize'
# Generate text and shape embeddings; outputs are written to train/val/test subdirectories under the model path
./tools/scripts/generate_text_embeddings.sh LBA1 outputs/shapenet/encoder_logdir/model.ckpt-50 '--dataset shapenet --visit_weight 0.25 --lba_mode MM --num_epochs 10000 --lba_test_mode text --lba_unnormalize'
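The generated text and shape embeddings live in a joint space, so text-to-shape retrieval reduces to a nearest-neighbor search. Below is a minimal, self-contained sketch of that search using stand-in arrays; the on-disk format of the generated embedding files is not specified here, so loading them is left out:

```python
import numpy as np

def retrieve_shapes(text_emb, shape_embs, k=5):
    """Return indices of the k shape embeddings closest to a text
    embedding under cosine similarity."""
    t = text_emb / np.linalg.norm(text_emb)
    s = shape_embs / np.linalg.norm(shape_embs, axis=1, keepdims=True)
    return np.argsort(-(s @ t))[:k]

# Toy usage with random stand-in embeddings (dimension 128 is arbitrary).
rng = np.random.default_rng(0)
print(retrieve_shapes(rng.normal(size=128), rng.normal(size=(100, 128))))
```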
# Train the encoder on the primitives dataset
./run_lba_encoder.sh 0 LBA1 'primitives/encoder_logdir' 'our full method on primitives' '--dataset primitives --validation --visit_weight 0.25 --learning_rate 2e-4 --lba_mode MM --num_epochs 100 --decay_steps 5000 --lba_test_mode shape --batch_size 100 --lba_unnormalize'
# Generate text embeddings
./tools/scripts/generate_text_embeddings.sh LBA1 outputs/primitives/encoder_logdir/model.ckpt-500 '--dataset primitives --visit_weight 0.25 --lba_mode MM --num_epochs 10000 --lba_test_mode text --lba_unnormalize'
# Generate shape embeddings
./tools/scripts/generate_text_embeddings.sh LBA1 outputs/primitives/encoder_logdir/model.ckpt-500 '--dataset primitives --visit_weight 0.25 --lba_mode MM --num_epochs 10000 --lba_test_mode shape --lba_unnormalize'
# Train the CWGAN on ShapeNet
./run.sh 0 shapenet/cwgan_logdir "CWGAN1 (improved WGAN) on ShapeNet" "--model CWGAN1 --dataset shapenet --cfg ./cfgs/improved_wgan.yaml --shapenet_ct_classifier --learning_rate 5e-5 --queue_capacity 20 --noise_size 8 --uniform_max 0.5 --decay_steps 10000"
# Render
./tools/scripts/render.sh CWGAN1 outputs/shapenet/cwgan_logdir
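The --noise_size and --uniform_max flags above suggest the generator is conditioned on a text embedding concatenated with noise drawn uniformly from [-0.5, 0.5]. How that vector is assembled inside the model is not shown here, but a plausible sketch under that assumption looks like this:

```python
import numpy as np

def generator_input(text_emb, noise_size=8, uniform_max=0.5, rng=None):
    """Concatenate a text embedding with uniform noise, mirroring the
    --noise_size and --uniform_max flags (the exact wiring inside the
    CWGAN model is an assumption)."""
    if rng is None:
        rng = np.random.default_rng()
    noise = rng.uniform(-uniform_max, uniform_max, size=noise_size)
    return np.concatenate([text_emb, noise])

# Toy usage: a 128-dim stand-in text embedding plus 8 noise dims -> 136 dims.
z = generator_input(np.zeros(128))
print(z.shape)  # (136,)
```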
# Train the classifier on ShapeNet
python main.py --shapenet_ct_classifier --model Classifier128 --classifier --dataset shapenet --validation --log_path outputs/shapenet/shapenet_ct_classifier --batch_size 64
# Train on all splits
./run_classifier.sh 0 primitives/classifier128 'classifier on primitives dataset (full dataset with train/val/test split)' '--dataset primitives'
# or run:
python main.py --model Classifier128 --batch_size 64 --num_epochs 10000 --learning_rate 1e-3 --decay_steps 10000 --log_path ./outputs/primitives/classifier128 --dataset primitives --validation --classifier
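The classifier is typically used downstream to score shapes. As a generic illustration of the kind of accuracy computation involved (this is not the repository's evaluation code, and the output format of the model is an assumption):

```python
import numpy as np

def batch_accuracy(logits, labels):
    """Fraction of samples whose highest-scoring class matches the label.

    logits: (n, num_classes) array of classifier outputs.
    labels: (n,) array of integer class labels.
    """
    return float(np.mean(np.argmax(logits, axis=1) == labels))

# Toy usage with random stand-ins for a binary (e.g. table/chair) task.
rng = np.random.default_rng(0)
print(batch_accuracy(rng.normal(size=(64, 2)), rng.integers(0, 2, size=64)))
```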