[Flava] Add ckpt loading and accuracy metric to finetuning #119
Conversation
ghstack-source-id: 08766d32244fc64c31049cd8003299124c48e66c Pull Request resolved: #119
Codecov Report

@@           Coverage Diff            @@
##   gh/ankitade/7/base    #119   +/- ##
=========================================
  Coverage           ?   88.59%
=========================================
  Files              ?       40
  Lines              ?     2087
  Branches           ?        0
=========================================
  Hits               ?     1849
  Misses             ?      238
  Partials           ?        0
=========================================

Continue to review the full report at Codecov.
@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
- Accuracy metric for finetuning
- Add checkpoint saving and best checkpoint loading based on validation accuracy
- Load the pretrained checkpoint by default in the classification model
- Set num gpus to 1 in qnli.yaml

Test plan: python -m flava.finetune config=flava/configs/finetuning/qnli.yaml (val acc: 0.8651; full output in the Test plan section below)

Differential Revision: [D37444938](https://our.internmc.facebook.com/intern/diff/D37444938)
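For context, best-checkpoint saving keyed on validation accuracy is typically wired up in PyTorch Lightning with a ModelCheckpoint callback. A minimal sketch follows; the module and datamodule names are hypothetical stand-ins, not the actual classes in this PR:

```python
from pytorch_lightning import Trainer
from pytorch_lightning.callbacks import ModelCheckpoint

# Hypothetical stand-ins for the FLAVA finetuning LightningModule and
# QNLI datamodule; the real class names in this repo may differ.
model = FLAVAClassificationModule()
datamodule = QNLIDataModule()

# Keep only the checkpoint with the highest validation accuracy; the
# monitored key matches the metric logged during validation here.
checkpoint_callback = ModelCheckpoint(
    monitor="validation/accuracy/classification",
    mode="max",
    save_top_k=1,
)

trainer = Trainer(callbacks=[checkpoint_callback])
trainer.fit(model, datamodule=datamodule)
```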
ghstack-source-id: 39c8d11af6ffae071159f6594a5b24c6a12909bb Pull Request resolved: #119
@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
ghstack-source-id: 2f01a3335be79be13f128455ce898fce8469a185 Pull Request resolved: #119
@@ -209,6 +209,7 @@ def flava_model_for_classification(
     classifier_activation: Callable[..., nn.Module] = nn.ReLU,
     classifier_normalization: Optional[Callable[..., nn.Module]] = None,
     loss_fn: Optional[Callable[..., Tensor]] = None,
+    pretrained_model_key: Optional[str] = "flava_full",
Just wondering, why do we want to add this as a param to `flava_model_for_classification`? Feels to me like this class should not care about the pretrained weights. I like TorchVision's approach for handling this; maybe we can do something similar?
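For reference, TorchVision exposes pretrained weights through explicit weights enums passed to the model builder rather than a constructor default; a minimal sketch with a real TorchVision model:

```python
from torchvision.models import resnet50, ResNet50_Weights

# Weights are an explicit argument of the builder: weights=None gives a
# randomly initialized model, while an enum value selects a specific
# pretrained checkpoint.
model = resnet50(weights=ResNet50_Weights.DEFAULT)
```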
I am just doing this for "short term uniformity" with `flava_for_pretraining`. But yeah, I agree, TorchVision's approach is nicer and something we can potentially follow.
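For now, with the parameter defaulting to "flava_full", constructing the classification model loads pretrained weights unless the key is set to None. A rough usage sketch; the import path is an assumption and may differ across torchmultimodal versions:

```python
from torchmultimodal.models.flava.model import flava_model_for_classification

# Loads the "flava_full" pretrained checkpoint by default (per this PR);
# pass pretrained_model_key=None to skip loading pretrained weights.
# Import path above is an assumption, not confirmed by this PR.
model = flava_model_for_classification(num_classes=2)
```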
@apsdehal I am going to merge this today. Please feel free to leave comments either way and ping me; I can address them in a separate PR if I end up merging before you review.
Test plan
python -m flava.finetune config=flava/configs/finetuning/qnli.yaml
(val acc: 0.8651)
Loaded model weights from checkpoint at /data/home/deankita/torchmultimodal/examples/flava-epoch=03-step=10000.ckpt
/data/home/deankita/miniconda/envs/flava/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:330: PossibleUserWarning: Using `DistributedSampler` with the dataloaders. During `trainer.validate()`, it is recommended to use `Trainer(devices=1)` to ensure each sample/batch gets evaluated exactly once. Otherwise, multi-device settings use `DistributedSampler` that replicates some samples to make sure all devices have same batch size in case of uneven inputs.
  rank_zero_warn(
Validation DataLoader 0: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 171/171 [00:54<00:00, 3.15it/s]
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ validation/accuracy/classification │ 0.8651315569877625 │
│ validation/losses/classification │ 0.4168359339237213 │
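As the PossibleUserWarning above notes, single-device validation avoids DistributedSampler replicating samples to even out batches. A minimal sketch; model and datamodule are the hypothetical stand-ins from the earlier sketch, and the checkpoint path is illustrative:

```python
from pytorch_lightning import Trainer

# One device guarantees each validation sample is evaluated exactly once;
# multi-device runs pad uneven batches by replicating samples.
trainer = Trainer(devices=1)
trainer.validate(
    model,
    datamodule=datamodule,
    ckpt_path="flava-epoch=03-step=10000.ckpt",  # illustrative path
)
```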
Stack from ghstack (oldest at bottom):
Differential Revision: D37444938