
Can't TorchScript LightningModule when using Metric #4416

Closed
hudeven opened this issue Oct 28, 2020 · 8 comments · Fixed by #4428
Labels: bug (Something isn't working), help wanted (Open to be worked on)

Comments


hudeven commented Oct 28, 2020

🐛 Bug

Please reproduce using the BoringModel and post here

Able to reproduce it in https://colab.research.google.com/drive/1MscNHxIc_LIbZxALHbZOAkooNu0TzVly?usp=sharing

To Reproduce

Expected behavior

Able to TorchScript a LightningModule regardless of whether a Metric is used.

It seems hard to make Metric torchscriptable, as *args and **kwargs are useful in Python but not supported in TorchScript.
Since Metric is not needed for inference, I think it should be excluded when calling LightningModule.to_torchscript().
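To illustrate the *args/**kwargs limitation, here is a minimal stand-alone sketch (KwargsModule is a hypothetical stand-in, not the BoringModel from the notebook):

```python
import torch
import torch.nn as nn

class KwargsModule(nn.Module):
    # hypothetical stand-in for a Metric-like module that accepts **kwargs
    def forward(self, x: torch.Tensor, **kwargs):
        return x * 2

try:
    torch.jit.script(KwargsModule())
    scriptable = True
except Exception:
    # TorchScript's frontend rejects variable-length argument lists
    scriptable = False

print(scriptable)  # False
```

Any module holding such a submodule hits the same error when scripted, even if forward() never calls it.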

Environment

Note: Bugs with code are solved faster! Colab notebooks should be made public!

You can get the script and run it with:

wget https://raw.githubusercontent.com/PyTorchLightning/pytorch-lightning/master/tests/collect_env_details.py
# For security purposes, please check the contents of collect_env_details.py before running it.
python collect_env_details.py
  • CUDA:
    • GPU:
      • Tesla T4
    • available: True
    • version: 10.1
  • Packages:
    • numpy: 1.18.5
    • pyTorch_debug: False
    • pyTorch_version: 1.6.0+cu101
    • pytorch-lightning: 0.10.0
    • tqdm: 4.41.1
  • System:
    • OS: Linux
    • architecture:
      • 64bit
    • processor: x86_64
    • python: 3.6.9
    • version: #1 SMP Thu Jul 23 08:00:38 PDT 2020

Additional context

@hudeven added the bug and help wanted labels on Oct 28, 2020
@github-actions
Contributor

Hi! Thanks for your contribution, great first issue!

@ananthsub
Contributor

@hudeven as a workaround can you override to_torchscript in your LightningModule?


hudeven commented Oct 29, 2020

@hudeven as a workaround can you override to_torchscript in your LightningModule?

Yeah, it works by overriding to_torchscript() and deleting the metric attributes there.
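The workaround can be sketched in plain PyTorch (UnscriptableMetric and the deepcopy-then-delete approach are illustrative, not Lightning's actual implementation):

```python
import copy
import torch
import torch.nn as nn

class UnscriptableMetric(nn.Module):
    # hypothetical stand-in for a Metric whose forward takes *args/**kwargs
    def forward(self, *args, **kwargs):
        return torch.tensor(0.0)

class Model(nn.Module):
    def __init__(self):
        super().__init__()
        self.linear = nn.Linear(4, 2)
        self.metric = UnscriptableMetric()  # only needed during training

    def forward(self, x: torch.Tensor):
        return self.linear(x)

    def to_torchscript(self):
        # script a copy with the metric attribute deleted, mirroring the
        # idea of overriding LightningModule.to_torchscript()
        model = copy.deepcopy(self)
        del model.metric
        return torch.jit.script(model)

scripted = Model().to_torchscript()
out = scripted(torch.randn(1, 4))
print(tuple(out.shape))  # (1, 2)
```

Deep-copying first keeps the original training module intact; nn.Module's __delattr__ removes the submodule cleanly before scripting.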


snisarg commented Oct 29, 2020

A controversial suggestion but could we use another method name instead of forward? I understand using the class name directly is convenient.

Also curious about why the metrics class is an nn.Module? Is it to avoid the pains of syncing across distributed envs instead of using torch.distributed?


NumesSanguis commented Oct 29, 2020

Having Metrics treated specially would also be helpful when saving/loading .ckpt files. While this is intended behavior (a Metric is an nn.Module), I also run into trouble: #4361

Metrics are not necessary for inference, but if you use them, you need to set strict=False when loading a .ckpt. This creates a risk that actual problems (the ones you want loading to fail on) will be ignored.


NumesSanguis commented Oct 29, 2020

@hudeven I don't know if you need to use TorchScript in combination with method='script', but with method='trace' your code does work.

Add this to your script:

def training_step(self, batch, batch_idx):
    # use first batch to create an example input
    if self.example_input_array is None:
        # we only need 1 sample, not a whole batch, but keep the batch dimension
        self.example_input_array = batch[0, :].unsqueeze(dim=0)

And change the call to model.to_torchscript(method='trace') (instead of method='script').

This does not solve the actual issue at hand, but can be a workaround for some here.

@teddykoker
Contributor

Also curious about why the metrics class is an nn.Module? Is it to avoid the pains of syncing across distributed envs instead of using torch.distributed?

@snisarg, we are using an nn.Module for the metrics so that the state of the metric can be moved .to() different devices, which is necessary if you want to use them in both Lightning and plain PyTorch. We are still using torch.distributed to sync the metric states across GPUs.
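A minimal sketch of why the nn.Module base helps (CountMetric is a hypothetical stand-in, not the real Metric API): state registered as a buffer travels with .to() like any other module state.

```python
import torch
import torch.nn as nn

class CountMetric(nn.Module):
    # hypothetical stand-in for a Metric: state lives in a registered buffer
    def __init__(self):
        super().__init__()
        self.register_buffer("count", torch.tensor(0.0))

    def update(self, x: torch.Tensor):
        self.count += x.numel()

m = CountMetric()
m.update(torch.randn(3))
print(float(m.count))  # 3.0
# because `count` is a buffer, m.to("cuda") would move it along with any
# model that holds this metric as a submodule
```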


hudeven commented Nov 2, 2020

@NumesSanguis thanks for the workaround! I intend to use 'script'. This issue is fixed in #4428
