
Add AIM Model from Scalable Pre-training of Large Autoregressive Image Models #1479

Merged
merged 71 commits into master on Jan 23, 2024

Conversation

guarin
Contributor

@guarin guarin commented Jan 19, 2024

This PR implements the AIM model proposed in Scalable Pre-training of Large Autoregressive Image Models. The implementation is based on the original code but uses a modified version of the vision transformer from timm as the backbone. The backbone is fully compatible with the timm vision transformer, and pretrained weights from our backbone should be loadable with the timm vision transformer (the state dicts are identical).
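Because the state dicts are meant to be identical, weight interchangeability can be sanity-checked by comparing parameter names and shapes. A minimal sketch (the function and the example dicts are illustrative, not part of this PR):

```python
def state_dicts_interchangeable(ours, theirs):
    """Illustrative check: two models can swap weights if their state
    dicts expose the same parameter names with the same tensor shapes.

    `ours` and `theirs` map parameter names to shape tuples, e.g. built
    via {k: tuple(v.shape) for k, v in model.state_dict().items()}.
    """
    if set(ours) != set(theirs):
        return False
    return all(ours[name] == theirs[name] for name in ours)
```

In practice one would build the two shape dicts from the lightly backbone and the corresponding timm vision transformer and expect the check to pass.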

The implementation is a best effort: the paper and reference code omit some crucial details, specifically the prefix length and a detailed description of the MLP architecture for the prediction head. Nevertheless, the current implementation runs and is hopefully a good starting point. I checked with the authors; the head and prefix masking should be correct now :)
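For readers unfamiliar with prefix masking: in AIM-style prefix attention, tokens inside a leading prefix attend to each other bidirectionally, while the remaining tokens attend causally (to the prefix and to earlier tokens), and the autoregressive loss is computed only on the non-prefix tokens. A pure-Python sketch of such a mask, for illustration only (not the PR's implementation):

```python
def prefix_causal_mask(seq_len, prefix_len):
    """Build a boolean attention mask where mask[q][k] is True if the
    query token at position q may attend to the key token at position k.

    Prefix tokens (positions < prefix_len) see each other bidirectionally;
    all other tokens see the full prefix plus earlier tokens (causal).
    """
    mask = [[False] * seq_len for _ in range(seq_len)]
    for q in range(seq_len):
        for k in range(seq_len):
            if k < prefix_len:
                mask[q][k] = True   # every token sees the prefix
            elif k <= q:
                mask[q][k] = True   # causal attention for the rest
    return mask
```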

Changes

  • Add MaskedCausalVisionTransformer
  • Add AIMPredictionHead
  • Add AIMTransform
  • Add AIM benchmark module

TODO:

  • causal vision transformer
  • prediction head
  • benchmark model
  • run benchmark
  • add evaluation code

We also have to figure out whether we want to add this to benchmarks/imagenet/vitb16 because the backbone is clearly not vitb16 😅

How was it tested?

  • Manually: I tested a smaller version of the model and it runs well. I couldn't benchmark the full model due to compute limitations, as the smallest model version requires 600M parameters for the backbone and 400M parameters for the head.
  • Will add unit tests in a follow-up PR.

For Review

Review is only required for the following files/functions:

  • benchmarks/imagenet/vitb16/aim.py
  • lightly/models/modules/__init__.py
  • lightly/models/modules/heads_timm.py
  • lightly/models/modules/masked_causal_vision_transformer.py
  • lightly/models/utils.py -> random_prefix_mask function
  • lightly/transforms/aim_transform.py

The other files/functions have already been reviewed in other PRs but are not yet on master.
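As a rough illustration of what a `random_prefix_mask` helper might do (the actual signature in `lightly/models/utils.py` may differ; this sketch is an assumption), here is a version that draws a per-sample prefix length at random and marks the prefix positions:

```python
import random

def random_prefix_mask(seq_len, min_prefix, max_prefix, batch_size, rng=random):
    """Sketch: for each sample, draw a prefix length uniformly from
    [min_prefix, max_prefix) and return one boolean mask per sample,
    where True marks a prefix position (excluded from the loss)."""
    masks = []
    for _ in range(batch_size):
        prefix_len = rng.randrange(min_prefix, max_prefix)
        masks.append([i < prefix_len for i in range(seq_len)])
    return masks
```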

guarin and others added 30 commits May 30, 2023 14:25
* This is required as torch.no_grad doesn't change the model configuration
  while manual gradient deactivation/activation can have unintended
  consequences. For example, MAE ViT positional embeddings are parameters
  with requires_grad=False that should never receive an update. But if
  we use activate_requires_grad for finetuning we break those
  parameters.
…om:lightly-ai/lightly into guarin-lig-3056-add-mae-imagenet-benchmark
@guarin guarin mentioned this pull request Jan 22, 2024
@guarin guarin changed the base branch from ersi-lig-3910-update-mae-benchmark-code to master January 23, 2024 07:38
@guarin guarin marked this pull request as ready for review January 23, 2024 08:04
@guarin guarin changed the title Add AIM Add AIM model from Scalable Pre-training of Large Autoregressive Image Models Jan 23, 2024
@guarin guarin changed the title Add AIM model from Scalable Pre-training of Large Autoregressive Image Models Add AIM Model from Scalable Pre-training of Large Autoregressive Image Models Jan 23, 2024
@guarin guarin merged commit 87be5a1 into master Jan 23, 2024
7 of 9 checks passed
@guarin guarin deleted the aim branch January 23, 2024 15:24
@adamjstewart
Contributor

FWIW, this PR caused a bit of a headache for us in TorchGeo: microsoft/torchgeo#1824

At the moment, the changes here make lightly v1.4.26 incompatible with any version of segmentation-models-pytorch. This isn't necessarily your fault, but it would help if you could check the version of timm available before importing everything else.

@guarin guarin mentioned this pull request Jan 25, 2024
@guarin guarin mentioned this pull request Feb 12, 2024