[WIP] PyTorch-based training on Gaudi HPU #309

Draft: wants to merge 6 commits into main

jmduarte (Contributor) commented Apr 9, 2024

  • PyTorch-based training on Gaudi HPU
    Issue: importing torch_cluster fails with an undefined-symbol error:
>>> import torch_cluster
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.10/dist-packages/torch_cluster/__init__.py", line 18, in <module>
    torch.ops.load_library(spec.origin)
  File "/usr/local/lib/python3.10/dist-packages/torch/_ops.py", line 852, in load_library
    ctypes.CDLL(path)
  File "/usr/lib/python3.10/ctypes/__init__.py", line 374, in __init__
    self._handle = _dlopen(self._name, mode)
OSError: /usr/local/lib/python3.10/dist-packages/torch_cluster/_version_cpu.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs
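
One possible reading of the undefined symbol above, offered as a hedged diagnostic rather than a confirmed cause: the trailing `Ss` in `_ZN5torch3jit17parseSchemaOrNameERKSs` is the Itanium C++ ABI abbreviation for `std::string` under the older libstdc++ string ABI. That would suggest the torch_cluster binary was built against a PyTorch compiled with a different `_GLIBCXX_USE_CXX11_ABI` setting than the build installed here. The sketch below uses hypothetical helper names (`strip_source_names`, `guess_std_string_abi`) and is a heuristic on the mangled name, not a demangler.

```python
def strip_source_names(mangled: str) -> str:
    """Drop length-prefixed identifiers (e.g. '5torch') from a mangled name.

    Heuristic only: every digit run is treated as an identifier length,
    which is good enough for simple symbols like the one in the traceback.
    """
    out = []
    i = 0
    while i < len(mangled):
        if mangled[i].isdigit():
            j = i
            while j < len(mangled) and mangled[j].isdigit():
                j += 1
            # Skip the digits and the identifier that follows them.
            i = j + int(mangled[i:j])
        else:
            out.append(mangled[i])
            i += 1
    return "".join(out)


def guess_std_string_abi(mangled: str) -> str:
    """Guess which libstdc++ std::string ABI a mangled symbol refers to."""
    # With the newer ABI, std::string lives in the std::__cxx11 namespace.
    if "7__cxx11" in mangled:
        return "cxx11"
    # 'Ss' is the Itanium abbreviation for the older std::string.
    if "Ss" in strip_source_names(mangled):
        return "pre-cxx11"
    return "unknown"


symbol = "_ZN5torch3jit17parseSchemaOrNameERKSs"
print(guess_std_string_abi(symbol))  # prints "pre-cxx11"
```

In an environment where PyTorch imports cleanly, `torch.compiled_with_cxx11_abi()` reports the setting of the installed build, which can be compared against this guess.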

jpata (Owner) commented Apr 9, 2024

torch_cluster is pulled in as a dependency of torch_geometric.

We could drop the torch_geometric dependency by removing the GravNet code and always just using 3D padding: https://github.com/jpata/particleflow/blob/main/mlpf/pyg/PFDataset.py#L131.
Technically we don't need it; it's a leftover.
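
For readers unfamiliar with the idea, here is a minimal sketch of 3D padding, assuming it means packing variable-length events into a dense `[batch, max_elements, features]` array plus a mask. This is not the code at the PFDataset.py link above; the function name `pad_events` and the toy data are hypothetical, and plain Python lists are used only to keep the sketch dependency-free.

```python
from typing import List, Tuple

Event = List[List[float]]  # one event: a list of elements, each a feature vector


def pad_events(
    events: List[Event], pad_value: float = 0.0
) -> Tuple[List[Event], List[List[bool]]]:
    """Pad every event to the longest event's length and return a validity mask."""
    max_len = max((len(ev) for ev in events), default=0)
    # Feature width is taken from the first element found (assumed uniform).
    num_feat = next((len(el) for ev in events for el in ev), 0)
    padded, mask = [], []
    for ev in events:
        n_pad = max_len - len(ev)
        padded.append(
            [list(el) for el in ev] + [[pad_value] * num_feat for _ in range(n_pad)]
        )
        # True marks real elements, False marks padding.
        mask.append([True] * len(ev) + [False] * n_pad)
    return padded, mask


events = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],  # event with 3 elements
    [[7.0, 8.0]],                          # event with 1 element
]
padded, mask = pad_events(events)
print(len(padded), len(padded[0]), len(padded[0][0]))  # prints "2 3 2"
print(mask[1])  # prints "[True, False, False]"
```

With tensors, `torch.nn.utils.rnn.pad_sequence(..., batch_first=True)` produces the same dense layout without needing any graph library.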

jpata (Owner) commented Apr 10, 2024

FYI, I removed the torch_geometric dependency in #310.

jpata (Owner) commented Apr 11, 2024

#310 is merged; you can try updating.

jpata marked this pull request as draft on June 12, 2024, 11:40