Is cifar10 num_samples correct? #408
https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/ecbb82a057c9be6caa06b0b8885743db8a1a752a/pl_bolts/datamodules/cifar10_datamodule.py#L98
The line self.num_samples = 60000 - val_split may be wrong. The full CIFAR-10 dataset does contain 60000 examples, but that count includes the test set, which has 10000 examples.
So I think the correct size would be 50000 - val_split.
I checked it with val_split=5000, and the train_length here is 50000:
https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/ecbb82a057c9be6caa06b0b8885743db8a1a752a/pl_bolts/datamodules/cifar10_datamodule.py#L122
and len(dataset_train) is 45000 (not 55000) here:
https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/ecbb82a057c9be6caa06b0b8885743db8a1a752a/pl_bolts/datamodules/cifar10_datamodule.py#L123
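To make the arithmetic explicit, here is a minimal sketch of the two computations. The helper num_train_samples and the constant names are hypothetical and are not part of pl_bolts; only the sizes (50000 train, 10000 test) and val_split=5000 come from the check above.

```python
# CIFAR-10 ships as 50,000 training images plus 10,000 test images.
CIFAR10_TRAIN_SIZE = 50_000
CIFAR10_TEST_SIZE = 10_000


def num_train_samples(val_split: int, base: int = CIFAR10_TRAIN_SIZE) -> int:
    """Number of training examples left after holding out val_split of them.

    Hypothetical helper for illustration; not the datamodule's real API.
    """
    if not 0 <= val_split <= base:
        raise ValueError(f"val_split must be in [0, {base}], got {val_split}")
    return base - val_split


val_split = 5000

# What the current line reports: train and test counted together.
reported = num_train_samples(val_split, base=CIFAR10_TRAIN_SIZE + CIFAR10_TEST_SIZE)

# What the proposed fix would report: the training set only.
expected = num_train_samples(val_split)

print(reported, expected, reported - expected)
```

The difference is exactly the size of the test set, which matches the 55000 vs. 45000 discrepancy observed above.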
This may affect SimCLR performance, because num_samples is used here:
https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/ecbb82a057c9be6caa06b0b8885743db8a1a752a/pl_bolts/models/self_supervised/simclr/simclr_module.py#L400
and then here:
https://github.com/PyTorchLightning/pytorch-lightning-bolts/blob/ecbb82a057c9be6caa06b0b8885743db8a1a752a/pl_bolts/models/self_supervised/simclr/simclr_module.py#L136
So an incorrect num_samples influences the lr schedule.
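As a rough illustration of the effect, the sketch below assumes the schedule length is derived from num_samples // batch_size and uses a generic linear-warmup plus cosine-decay multiplier. The batch size (256), epoch counts (10 warmup, 100 total), and the function lr_scale are assumptions chosen for the example; this is not the code from simclr_module.py.

```python
import math


def lr_scale(step: int, warmup_steps: int, total_steps: int) -> float:
    """Linear warmup followed by cosine decay; returns a multiplier in [0, 1]."""
    if step < warmup_steps:
        return step / warmup_steps
    progress = (step - warmup_steps) / (total_steps - warmup_steps)
    return 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumed hyperparameters, for illustration only.
batch_size = 256
warmup_epochs = 10
max_epochs = 100

# Steps per epoch as planned from each num_samples value.
iters_reported = 55_000 // batch_size  # from 60000 - val_split
iters_actual = 45_000 // batch_size    # from 50000 - val_split

# Training really runs for this many steps (dropping the last partial batch).
last_step = iters_actual * max_epochs

# Schedule multiplier at the final training step under each plan.
final_with_reported = lr_scale(
    last_step, warmup_epochs * iters_reported, max_epochs * iters_reported
)
final_with_actual = lr_scale(
    last_step, warmup_epochs * iters_actual, max_epochs * iters_actual
)

print(iters_reported, iters_actual)
print(final_with_reported, final_with_actual)
```

Under these assumptions, the schedule planned from the larger num_samples is longer than the run itself, so the warmup lasts more steps than intended and the cosine decay has not finished when training stops.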