CapsNet-Keras

A Keras implementation of CapsNet in the paper:
Sara Sabour, Nicholas Frosst, Geoffrey E Hinton. Dynamic Routing Between Capsules. NIPS 2017
The current average test error = 0.34% and best test error = 0.30%.

Differences with the paper:

We use the learning rate decay with decay factor = 0.9 and step = 1 epoch,
while the paper did not give the detailed parameters (or they didn't use it?).
We only report the test errors after 50 epochs training.
In the paper, I suppose they trained for 1250 epochs according to Figure A.1? Sounds crazy, maybe I misunderstood.
We use MSE (mean squared error) as the reconstruction loss and the coefficient for the loss is lam_recon=0.0005*784=0.392.
This should be equivalent with using SSE (sum squared error) and lam_recon=0.0005 as in the paper.

Recent updates:

Correct the Mask operation. Now all digit capsules are connected to the decoder simultaneously.
Change default epochs from 30 to 50.
Define a separate model for test phase which does not require ground truth y as input.
Fix #13
Report test errors on MNIST

TODO

~~The model has 8M parameters, while the paper said it should be 11M.~~
I have figured out the reason: 11M parameters are for the CapsuleNet on MultiMNIST where the image size is 36x36. The CapsuleNet on MNIST should indeed have 8M parameters.
I'll stop pursuing higher accuracy on MNIST. It is time to explore the interacting characteristics of CapsuleNet.

Contacts

Your contributions to the repo are always welcome. Open an issue or contact me with E-mail guoxifeng1990@163.com or WeChat wenlong-guo.

Usage

Step 1. Install Keras with TensorFlow backend.

pip install tensorflow-gpu
pip install keras

Step 2. Clone this repository to local.

git clone https://github.com/XifengGuo/CapsNet-Keras.git
cd CapsNet-Keras

Step 3. Train a CapsNet on MNIST

Training with default settings:

$ python capsulenet.py

Training with one routing iteration (default 3).

$ python capsulenet.py --num_routing 1

Other parameters include batch_size, epochs, lam_recon, shift_fraction, save_dir can be passed to the function in the same way. Please refer to capsulenet.py

Step 4. Test a pre-trained CapsNet model

Suppose you have trained a model using the above command, then the trained model will be saved to result/trained_model.h5. Now just launch the following command to get test results.

$ python capsulenet.py --is_training 0 --weights result/trained_model.h5

It will output the testing accuracy and show the reconstructed images. The testing data is same as the validation data. It will be easy to test on new data, just change the code as you want.

You can also just download a model I trained from https://pan.baidu.com/s/1nv9SMFn

Results

Test Errors

CapsNet classification test error on MNIST. Average and standard deviation results are reported by 3 trials. The results can be reproduced by launching the following commands.

python capsulenet.py --num_routing 1 --lam_recon 0.0    #CapsNet-v1   
python capsulenet.py --num_routing 1 --lam_recon 0.392  #CapsNet-v2
python capsulenet.py --num_routing 3 --lam_recon 0.0    #CapsNet-v3 
python capsulenet.py --num_routing 3 --lam_recon 0.392  #CapsNet-v4

Method	Routing	Reconstruction	MNIST (%)	Paper
Baseline	--	--	--	0.39
CapsNet-v1	1	no	0.39 (0.024)	0.34 (0.032)
CapsNet-v2	1	yes	0.36 (0.009)	0.29 (0.011)
CapsNet-v3	3	no	0.40 (0.016)	0.35 (0.036)
CapsNet-v4	3	yes	0.34 (0.016)	0.25 (0.005)

Losses and accuracies:

Training Speed

About 110s / epoch on a single GTX 1070 GPU.

Reconstruction result

The result of CapsNet-v4 by launching

python capsulenet.py --is_training 0 --weights result/trained_model.h5

Digits at top 5 rows are real images from MNIST and digits at bottom are corresponding reconstructed images.

The model structure:

Other Implementations

Kaggle (this version as self-contained notebook):
- MNIST Dataset running on the standard MNIST and predicting for test data
- MNIST Fashion running on the more challenging Fashion images.
TensorFlow:
- naturomics/CapsNet-Tensorflow
  Very good implementation. I referred to this repository in my code.
- InnerPeace-Wu/CapsNet-tensorflow
  I referred to the use of tf.scan when optimizing my CapsuleLayer.
- LaoDar/tf_CapsNet_simple
PyTorch:
MXNet:
- AaronLeong/CapsNet_Mxnet
Lasagne (Theano):
- DeniskaMazur/CapsNet-Lasagne
Chainer:
- soskek/dynamic_routing_between_capsules

Name		Name	Last commit message	Last commit date
Latest commit History 23 Commits
result		result
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
capsulelayers.py		capsulelayers.py
capsulenet.py		capsulenet.py
real_and_recon.png		real_and_recon.png
utils.py		utils.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

CapsNet-Keras

Usage

Results

Other Implementations

About

Releases

Packages

Languages

License

wballard/CapsNet-Keras

Folders and files

Latest commit

History

Repository files navigation

CapsNet-Keras

Usage

Results

Other Implementations

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages