SimCLR: add new trainer #1195
Conversation
```python
Args:
    model: Name of the timm model to use.
    in_channels: Number of input channels to model.
    version: Version of SimCLR, 1--2.
```
This isn't really used at the moment since layers and weight_decay are also parameters, but it could be used to control other things in the future (see TODOs).
```python
# TODO
# v1+: add global batch norm
# v2: add selective kernels, channel-wise attention mechanism, memory bank
```
Not sure exactly how to make these changes, and I don't really want to change the architecture too much to ensure that our pre-trained weights can be loaded in a vanilla model. The memory bank only adds +1% performance, so I don't really think it's worth the complexity.
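For what it's worth, one possible route for the v1+ "global batch norm" TODO (an assumption on my part, not something settled in this thread) is PyTorch's built-in SyncBatchNorm, which computes batch statistics across all processes under distributed training:

```python
import torch.nn as nn
import torchvision.models as models

# Minimal sketch: convert every BatchNorm layer in a backbone to
# SyncBatchNorm so statistics are computed over the global batch.
# This only has an effect when training with DistributedDataParallel.
backbone = models.resnet18()
backbone = nn.SyncBatchNorm.convert_sync_batchnorm(backbone)
```

Lightning also exposes this as `Trainer(sync_batchnorm=True)`, which performs the same conversion automatically.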
I would say that when most papers use SimCLR, they use v1 without all the tricks that get the ~1% improvement. I think it would be better to keep it simple.
The majority of the performance bump in v2 is thanks to the deeper projection head, which we have, so we should be good.
Actually, in terms of performance bump:
- Bigger ResNets, SK, channel-wise attention: +29%
- Deeper projection head: +14%
- Memory bank: +1%
So memory bank isn't high on my priority list, but adding SK and channel-wise attention may be worth it.
```python
# Find positive example -> batch_size // 2 away from the original example
pos_mask = self_mask.roll(shifts=cos_sim.shape[0] // 2, dims=0)

# NT-Xent loss (aka InfoNCE loss)
```
Both SimCLR and MoCo use InfoNCE loss, but there is no implementation in PyTorch. There are many libraries that implement it, but I'd rather not add yet another dependency.
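For readers following along, here is a minimal self-contained sketch of the full NT-Xent computation the snippet above is part of. It follows the masking approach shown here and assumes exactly one positive pair per example, with the two augmented views concatenated along the batch dimension; it is not the verbatim trainer code:

```python
import torch
import torch.nn.functional as F


def nt_xent_loss(z: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """NT-Xent (InfoNCE) loss for projected features.

    Args:
        z: concatenated projections of both views, shape (2 * B, D),
            where rows i and i + B are two views of the same image.
        temperature: softmax temperature.
    """
    # Pairwise cosine similarity matrix, shape (2B, 2B)
    cos_sim = F.cosine_similarity(z[:, None, :], z[None, :, :], dim=-1)

    # Mask out each example's similarity to itself
    self_mask = torch.eye(cos_sim.shape[0], dtype=torch.bool, device=cos_sim.device)
    cos_sim.masked_fill_(self_mask, -9e15)

    # The positive example sits batch_size // 2 rows away
    pos_mask = self_mask.roll(shifts=cos_sim.shape[0] // 2, dims=0)

    cos_sim = cos_sim / temperature
    # InfoNCE: -log(exp(sim_pos) / sum_j exp(sim_j))
    nll = -cos_sim[pos_mask] + torch.logsumexp(cos_sim, dim=-1)
    return nll.mean()
```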
```python
def test_step(self, batch: Dict[str, Tensor], batch_idx: int) -> None:
    """No-op, does nothing."""
    # TODO
    # v2: add distillation step
```
This would actually be very useful to add someday. Both using a large model to better train a small model, and self-distillation, have been found to greatly improve performance. I didn't do this because I'm not super familiar with teacher-student distillation methods.
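For a rough idea of what such a step could look like, here is a generic teacher-student distillation loss in the style of Hinton et al. (temperature-scaled KL divergence between logits). This is an illustrative sketch, not the exact SimCLRv2 distillation procedure:

```python
import torch
import torch.nn.functional as F


def distillation_loss(
    student_logits: torch.Tensor,
    teacher_logits: torch.Tensor,
    temperature: float = 4.0,
) -> torch.Tensor:
    """Soften both distributions with a temperature, then minimize their KL divergence."""
    log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
    p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
    # The T**2 factor keeps gradient magnitudes comparable across temperatures
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature**2
```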
Overall LGTM
torchgeo/trainers/simclr.py
```python
# Data augmentation
# https://github.com/google-research/simclr/blob/master/data_util.py
self.aug = K.AugmentationSequential(
    K.RandomResizedCrop(size=(96, 96)),
```
Should we be hardcoding this?
It's hardcoded in BYOL (should actually be 224, not 96, let me fix this). We can make it a parameter if you want, but at the moment I don't know if we need it to be.
I'm okay with fixing it for now but it's only set to 224 because that's what imagenet experiments use. It's probably better to not restrict to 224 in case we use higher res imagery.
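If we ever do want to support higher-res imagery, exposing the crop size as a constructor argument is straightforward; a sketch (the `size` parameter and class name here are hypothetical, not part of this PR):

```python
import kornia.augmentation as K


class SimCLRAugmentations:
    """Illustrative wrapper that builds the SimCLR augmentation pipeline
    for a configurable input size instead of a hardcoded 224."""

    def __init__(self, size: int = 224) -> None:
        self.aug = K.AugmentationSequential(
            K.RandomResizedCrop(size=(size, size)),
            K.RandomHorizontalFlip(),
            data_keys=["input"],
        )
```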
```python
cos_sim = F.cosine_similarity(x[:, None, :], x[None, :, :], dim=-1)

# Mask out cosine similarity to itself
self_mask = torch.eye(cos_sim.shape[0], dtype=torch.bool, device=cos_sim.device)
```
We could make the NT-Xent loss its own nn.Module so we can test it and reuse it. Maybe in a future PR.
I found several repos with their own InfoNCE loss implementation, but they all implement it differently, and I don't know the math well enough to decide which is best. The implementation here assumes that there is exactly 1 positive pair and everything else is a negative pair. A more general implementation, or a faster one, is a lot more work to get right.
I think the implementation is fine. I was suggesting that we make it a separate module since other SSL methods use it as well. But until we have another SSL method that uses it, I think it's fine to leave as is.
Well, I'm about to add MoCo, which also uses it, although their implementation is completely different, and I have no idea what the difference is.
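For contrast, the MoCo reference implementation phrases InfoNCE as a cross-entropy over explicit positive/negative logits rather than masking a full similarity matrix. A hedged paraphrase of that formulation (variable names are illustrative):

```python
import torch
import torch.nn.functional as F


def moco_infonce(
    q: torch.Tensor,        # query features, (B, D), L2-normalized
    k: torch.Tensor,        # key features from the momentum encoder, (B, D)
    queue: torch.Tensor,    # memory queue of negative keys, (D, K)
    temperature: float = 0.07,
) -> torch.Tensor:
    # Positive logits: one per query, against its matching key
    l_pos = torch.einsum("nc,nc->n", q, k).unsqueeze(-1)
    # Negative logits: each query against every queued key
    l_neg = torch.einsum("nc,ck->nk", q, queue)
    logits = torch.cat([l_pos, l_neg], dim=1) / temperature
    # The positive is always at index 0, so this is ordinary cross-entropy
    labels = torch.zeros(q.shape[0], dtype=torch.long, device=q.device)
    return F.cross_entropy(logits, labels)
```

Despite the different shapes, both formulations reduce to the same -log(exp(pos/T) / sum exp(all/T)) objective; the main practical difference is where the negatives come from (the rest of the batch in SimCLR vs. a momentum-encoder queue in MoCo).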
Can you test this on a real dataset (maybe eurosat100?) before merging?
```python
    Optimizer and learning rate scheduler.
    """
    # Original paper uses LARS optimizer, but this is not defined in PyTorch
    optimizer = AdamW(
```
Should the optimizer choice also be user-definable, as different model architectures work better with certain optimizers? Or would you expect/want a user to override the `configure_optimizers` method in their inherited trainer class?
Our BYOL trainer supports specifying an optimizer, but none of the other trainers do. For now, I'm just using the optimizer from the original paper. The only difficulty with making it user-configurable is that each optimizer has different arguments. We could add a `**kwargs` that is passed to the optimizer to handle this, but then we can't use it anywhere else (without a bit of hacking like we did in `NAIPChesapeakeDataModule`).
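For users who need a different optimizer today, the subclassing route should work without any trainer changes; a sketch (it assumes the task stores max_epochs in self.hparams, which is an assumption about the trainer's internals):

```python
from torch.optim import SGD
from torch.optim.lr_scheduler import CosineAnnealingLR

from torchgeo.trainers import SimCLRTask


class SimCLRWithSGD(SimCLRTask):
    """Illustrative subclass that swaps AdamW for SGD by overriding
    configure_optimizers; everything else is inherited unchanged."""

    def configure_optimizers(self):
        optimizer = SGD(self.parameters(), lr=0.1, momentum=0.9)
        # Assumption: max_epochs was saved to self.hparams in __init__
        scheduler = CosineAnnealingLR(optimizer, T_max=self.hparams["max_epochs"])
        return [optimizer], [scheduler]
```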
```python
# For the middle layers, use bias and ReLU
self.model.fc = nn.Sequential(
    self.model.fc,
    nn.ReLU(inplace=True),
```
Is ReLU the desired/required activation function choice here or should that be more flexible?
It's just what the original paper used. Depends on how much customization we want to support.
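If we ever want that flexibility, the projection head could be factored into a small builder; a sketch with a hypothetical `activation` parameter (the PR itself hardcodes ReLU, as in the paper):

```python
import torch.nn as nn


def projection_head(
    in_dim: int,
    hidden_dim: int,
    out_dim: int,
    layers: int = 2,
    activation: type[nn.Module] = nn.ReLU,
) -> nn.Sequential:
    """Build an MLP projection head: middle layers use bias + activation,
    the final layer is a plain linear projection."""
    modules: list[nn.Module] = []
    dim = in_dim
    for _ in range(layers - 1):
        modules += [nn.Linear(dim, hidden_dim), activation()]
        dim = hidden_dim
    modules.append(nn.Linear(dim, out_dim))
    return nn.Sequential(*modules)
```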
The following script runs without crashing:

```python
from lightning.pytorch import Trainer

from torchgeo.datamodules import EuroSAT100DataModule
from torchgeo.trainers import SimCLRTask

datamodule = EuroSAT100DataModule(
    root="data/eurosat",
    batch_size=2,
    download=True,
)
model = SimCLRTask(
    model="resnet18",
    in_channels=13,
    max_epochs=1,
)
trainer = Trainer(
    accelerator="cpu",
    max_epochs=1,
)
trainer.fit(model=model, datamodule=datamodule)
```

I'm kind of trusting our tests to make sure things "work". Once I add all of these trainers and @isaaccorley finishes the pretrain+train pipeline, I'm planning on testing all of them on SSL4EO-S12 to make sure they actually work.
To be a little more specific, I would expect that if you used this trainer with the default settings, you would at least observe the loss decreasing. Tests will check whether the code executes, but not whether it is doing what you would expect in an ML training sense -- I'm interested in whether this actually does self-supervised learning! Regardless, I think you'll figure that out in short order if you're running experiments.
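One lightweight way to check that (a sketch: it reuses the model and datamodule from the script above and assumes the task logs a metric named `train_loss`; adjust the name to whatever the trainer actually logs):

```python
import pandas as pd
from lightning.pytorch import Trainer
from lightning.pytorch.loggers import CSVLogger

logger = CSVLogger(save_dir="logs", name="simclr_sanity")
trainer = Trainer(accelerator="cpu", max_epochs=5, logger=logger)
trainer.fit(model=model, datamodule=datamodule)

# Compare the first and last logged values of the training loss
metrics = pd.read_csv(f"{logger.log_dir}/metrics.csv")
loss = metrics["train_loss"].dropna()
assert loss.iloc[-1] < loss.iloc[0], "loss did not decrease"
```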
This reverts commit 39d6941.
* SimCLR: add new trainer
* Add tests
* Support custom number of MLP layers
* Change default params, add TODOs
* Fix mypy
* Fix docs and most of tests
* Fix all tests
* Fix support for older Kornia versions
* Fix support for older Kornia versions
* Crop should be 224, not 96
@nilsleh this is the new LightningModule template I would like to use. If this looks good to people, I'll update our older LightningModules too. Summary of differences:

- No *args or **kwargs (hides argument typos, no default values, no type hints)
- No typing.Any (disables type checking)
- No typing.cast or # type: ignore (not necessary in latest Lightning version)
- __init__ documented first (should be first thing in docs)
- Simpler configure_optimizers (no need for fancy dictionary)