TroyGarden
Contributor

Summary:

context

  • After fixing the GitHub CI workflow (GPU unit tests), we found many errors that share the same root cause:
torchrec/test_utils/__init__.py:129: in _wrapper
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torchrec/distributed/test_utils/multi_process.py:131: in setUp
    torch.backends.cudnn.allow_tf32 = False
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <torch.backends.ContextProp object at 0x7f4e8bb3ba10>
obj = <module 'torch.backends.cudnn' from '/opt/conda/envs/build_binary/lib/python3.11/site-packages/torch/backends/cudnn/__init__.py'>
val = False

    def __set__(self, obj, val):
        if not flags_frozen():
            self.setter(val)
        else:
>           raise RuntimeError(
                f"not allowed to set {obj.__name__} flags "
                "after disable_global_flags; please use flags() context manager instead"
            )
E           RuntimeError: not allowed to set torch.backends.cudnn flags after disable_global_flags; please use flags() context manager instead
  • This could be due to D77758554.

Rollback Plan:

Differential Revision: D81529616

@meta-cla meta-cla bot added the CLA Signed label Sep 2, 2025
@facebook-github-bot
Contributor

This pull request was exported from Phabricator. Differential Revision: D81529616

TroyGarden added a commit to TroyGarden/torchrec that referenced this pull request Sep 2, 2025
… disable_global_flags (meta-pytorch#3343)

Summary:

# context
* After fixing the GitHub CI workflow (GPU unit tests), we found many errors that share the same root cause:
```
torchrec/test_utils/__init__.py:129: in _wrapper
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torchrec/distributed/test_utils/multi_process.py:131: in setUp
    torch.backends.cudnn.allow_tf32 = False
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <torch.backends.ContextProp object at 0x7f4e8bb3ba10>
obj = <module 'torch.backends.cudnn' from '/opt/conda/envs/build_binary/lib/python3.11/site-packages/torch/backends/cudnn/__init__.py'>
val = False

    def __set__(self, obj, val):
        if not flags_frozen():
            self.setter(val)
        else:
>           raise RuntimeError(
                f"not allowed to set {obj.__name__} flags "
                "after disable_global_flags; please use flags() context manager instead"
            )
E           RuntimeError: not allowed to set torch.backends.cudnn flags after disable_global_flags; please use flags() context manager instead
```
* According to D77758554, this could be due to D78326114.

Rollback Plan:

Differential Revision: D81529616

TroyGarden added a commit to TroyGarden/torchrec that referenced this pull request Sep 2, 2025
… disable_global_flags (meta-pytorch#3343)

Summary:

# context
* After fixing the GitHub CI workflow (GPU unit tests), we found many errors that share the same root cause:
```
torchrec/test_utils/__init__.py:129: in _wrapper
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torchrec/distributed/test_utils/multi_process.py:131: in setUp
    torch.backends.cudnn.allow_tf32 = False
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

self = <torch.backends.ContextProp object at 0x7f4e8bb3ba10>
obj = <module 'torch.backends.cudnn' from '/opt/conda/envs/build_binary/lib/python3.11/site-packages/torch/backends/cudnn/__init__.py'>
val = False

    def __set__(self, obj, val):
        if not flags_frozen():
            self.setter(val)
        else:
>           raise RuntimeError(
                f"not allowed to set {obj.__name__} flags "
                "after disable_global_flags; please use flags() context manager instead"
            )
E           RuntimeError: not allowed to set torch.backends.cudnn flags after disable_global_flags; please use flags() context manager instead
```
* According to D77758554, the issue is caused by D78326114, which introduced `torch.testing._internal.common_utils`:
```
# torch/testing/_internal/common_utils.py calls `disable_global_flags()`
# workaround RuntimeError: not allowed to set ... after disable_global_flags
```

Differential Revision: D81529616
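
The error message recommends a `flags()` context manager instead of direct assignment. The sketch below illustrates that pattern in a torch-free form: a scoped override that is applied on entry and restored on exit, even though direct assignment is rejected. It is not the change made in this PR and not PyTorch's code; `_flags`, `_frozen`, `set_flag`, and this `flags` function are hypothetical stand-ins.

```python
from contextlib import contextmanager

# Hypothetical stand-in state; not PyTorch code.
_flags = {"allow_tf32": True}
_frozen = True  # pretend disable_global_flags() has already been called


def set_flag(name, value):
    """Direct assignment path: rejected once flags are frozen."""
    if _frozen:
        raise RuntimeError("not allowed to set flags after disable_global_flags")
    _flags[name] = value


@contextmanager
def flags(**overrides):
    """Apply overrides for the duration of the block, then restore the old values."""
    original = {name: _flags[name] for name in overrides}
    _flags.update(overrides)  # scoped path deliberately bypasses the frozen check
    try:
        yield
    finally:
        _flags.update(original)


with flags(allow_tf32=False):
    inside = _flags["allow_tf32"]  # override is active here
after = _flags["allow_tf32"]       # original value is restored here
```

In a `unittest.TestCase`, a context manager like this can be held open for the whole test by entering it in `setUp` with `self.enterContext(...)` (Python 3.11+), which exits it automatically during cleanup.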

TroyGarden added a commit to TroyGarden/torchrec that referenced this pull request Sep 3, 2025
… disable_global_flags (meta-pytorch#3343)

Summary:
Pull Request resolved: meta-pytorch#3343

# context
* After fixing the GitHub CI workflow (GPU unit tests), we found many errors that share the same root cause:
```
torchrec/test_utils/__init__.py:129: in _wrapper
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torchrec/distributed/test_utils/multi_process.py:131: in setUp
    torch.backends.cudnn.allow_tf32 = False
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <torch.backends.ContextProp object at 0x7f4e8bb3ba10>
obj = <module 'torch.backends.cudnn' from '/opt/conda/envs/build_binary/lib/python3.11/site-packages/torch/backends/cudnn/__init__.py'>
val = False

    def __set__(self, obj, val):
        if not flags_frozen():
            self.setter(val)
        else:
>           raise RuntimeError(
                f"not allowed to set {obj.__name__} flags "
                "after disable_global_flags; please use flags() context manager instead"
            )
E           RuntimeError: not allowed to set torch.backends.cudnn flags after disable_global_flags; please use flags() context manager instead
```
* According to D77758554, the issue is caused by D78326114, which introduced `torch.testing._internal.common_utils`:
```
# torch/testing/_internal/common_utils.py calls `disable_global_flags()`
# workaround RuntimeError: not allowed to set ... after disable_global_flags
```

Differential Revision: D81529616
… disable_global_flags (meta-pytorch#3343)

Summary:
Pull Request resolved: meta-pytorch#3343

# context
* After fixing the GitHub CI workflow (GPU unit tests), we found many errors that share the same root cause:
```
torchrec/test_utils/__init__.py:129: in _wrapper
    return wrapped_func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
torchrec/distributed/test_utils/multi_process.py:131: in setUp
    torch.backends.cudnn.allow_tf32 = False
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _

self = <torch.backends.ContextProp object at 0x7f4e8bb3ba10>
obj = <module 'torch.backends.cudnn' from '/opt/conda/envs/build_binary/lib/python3.11/site-packages/torch/backends/cudnn/__init__.py'>
val = False

    def __set__(self, obj, val):
        if not flags_frozen():
            self.setter(val)
        else:
>           raise RuntimeError(
                f"not allowed to set {obj.__name__} flags "
                "after disable_global_flags; please use flags() context manager instead"
            )
E           RuntimeError: not allowed to set torch.backends.cudnn flags after disable_global_flags; please use flags() context manager instead
```
* According to D77758554, the issue is caused by D78326114, which introduced `torch.testing._internal.common_utils`:
```
# torch/testing/_internal/common_utils.py calls `disable_global_flags()`
# workaround RuntimeError: not allowed to set ... after disable_global_flags
```

Reviewed By: aporialiao

Differential Revision: D81529616

@TroyGarden TroyGarden deleted the export-D81529616 branch September 19, 2025 22:02
Labels

CLA Signed, fb-exported
