🐛 [Bug] global partitioner does not work while compiling with dynamo #3157

Open · seymurkafkas opened this issue Sep 11, 2024 · 3 comments
Labels: bug (Something isn't working)

seymurkafkas (Contributor) commented Sep 11, 2024

Hi all! Compiling with the global partitioner (use_fast_partitioner=False) fails with the dynamo backend, and I couldn't pinpoint why (I tried various compilation parameters).

How to Reproduce:

System:

CUDA driver version: 535.104.12
GPU: NVIDIA Tesla T4
Python: 3.11.10

Dependencies (wheels):

https://download.pytorch.org/whl/cu121/torch-2.4.1%2Bcu121-cp311-cp311-linux_x86_64.whl
https://download.pytorch.org/whl/cu121/torch_tensorrt-2.4.0%2Bcu121-cp311-cp311-linux_x86_64.whl

Script to reproduce:

import torch
import torch_tensorrt
import torch.nn as nn
import torch.nn.functional as F

class SimpleCNN(nn.Module):
    def __init__(self):
        super(SimpleCNN, self).__init__()
        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1)
        self.conv2 = nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1)
        self.fc1 = nn.Linear(32 * 134 * 134, 10)  # 32 channels x 134 x 134 after two 2x2 max-pools of a 538x538 input

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        x = F.relu(self.conv2(x))
        x = F.max_pool2d(x, kernel_size=2, stride=2)
        x = x.view(x.size(0), -1)
        x = self.fc1(x)
        return x

def compile_to_tensorrt() -> None:
    batch_size, tile_size = 1, 538
    model = SimpleCNN().to(dtype=torch.float16, device=torch.device("cuda"))
    model.eval()
    with torch.no_grad():
        inputs = torch.randn(
            batch_size, 3, tile_size, tile_size, device="cuda", dtype=torch.float16
        )
        print("Compiling model...")
        _trt_graph_module = torch_tensorrt.compile(
            model,
            ir="dynamo",
            inputs=[inputs],
            enabled_precisions={torch.float16},
            use_fast_partitioner=False,
        )

if __name__ == "__main__":
    compile_to_tensorrt()

Error and traceback:

Traceback (most recent call last):
    compile_to_tensorrt()
  File "reproduce.py", line 37, in compile_to_tensorrt
    _trt_graph_module = torch_tensorrt.compile(
                        ^^^^^^^^^^^^^^^^^^^^^^^
  File "site-packages/torch_tensorrt/_compile.py", line 249, in compile
    trt_graph_module = dynamo_compile(
                       ^^^^^^^^^^^^^^^
  File "site-packages/torch_tensorrt/dynamo/_compiler.py", line 230, in compile
    trt_gm = compile_module(gm, inputs, settings)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "site-packages/torch_tensorrt/dynamo/_compiler.py", line 365, in compile_module
    for node in submodule.graph.nodes
                ^^^^^^^^^^^^^^^
  File "site-packages/torch/nn/modules/module.py", line 1729, in __getattr__
    raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'Module' object has no attribute 'graph'
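
The failure appears to come from compile_module assuming that every child returned by named_children() carries an FX graph. A minimal, self-contained sketch of the same AttributeError pattern (the toy container below is illustrative, not the actual torch_tensorrt internals):

import torch.fx
import torch.nn as nn

# Build a container whose children mix a plain nn.Module and an
# fx.GraphModule, mirroring what named_children() reportedly returns
# after global partitioning.
root = nn.Module()
root.plain = nn.Linear(4, 4)                     # plain module: no .graph
root.fused = torch.fx.symbolic_trace(nn.ReLU())  # GraphModule: has .graph

for name, submodule in root.named_children():
    # Iterating every child's graph, as compile_module does, raises
    # AttributeError: 'Linear' object has no attribute 'graph'
    for node in submodule.graph.nodes:
        print(name, node.op)
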
dgcnz (Contributor) commented Sep 18, 2024

It seems that gm.named_children() returns both the original module (a plain nn.Module that has no graph attribute) and the fused fx.GraphModules. But even after filtering out the original Module, you cannot save the exported model, because fuse_partitions inserts call_module nodes, which is not compliant with the IR Spec (see this).

Update: filtering out the unfused child module and saving the model with TorchScript works.
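
A minimal sketch of that workaround, assuming trt_gm is the module returned by torch_tensorrt.compile(...) in the repro script and inputs is the example tensor from the same script:

import torch
import torch.fx

# Keep only the children that actually carry an FX graph; the unfused
# original nn.Module has no .graph attribute and would otherwise crash
# any graph traversal.
fx_children = [
    (name, child)
    for name, child in trt_gm.named_children()
    if isinstance(child, torch.fx.GraphModule)
]

# Exporting fails because fuse_partitions inserts call_module nodes that
# the export IR spec disallows, but TorchScript accepts them, so trace
# and save via torch.jit instead.
scripted = torch.jit.trace(trt_gm, [inputs])
torch.jit.save(scripted, "trt_model.ts")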

orioninthesky98 commented

Yes, I am also getting this error.

lanluo-nvidia (Collaborator) commented

Fix: #3195

lanluo-nvidia added a commit that referenced this issue Oct 4, 2024