Please document that PJRT_DEVICE=CPU is required #326
Comments
Hi @artemisart, I am able to run the conversion example without explicitly setting this variable on the latest code. I do see this warning: WARNING:root:Defaulting to PJRT_DEVICE=CPU. What versions of ai-edge-torch and torch-xla are you using? Also, please describe your CPU/GPU/TPU environment. Thanks.
The package versions are in the <details> tag of my first message. I'm on a GCP n1-standard-8 VM with a T4 (NVIDIA-SMI 535.86.10, Driver Version 535.86.10, CUDA Version 12.2).
Hi @artemisart, it seems you are using the latest stable version (0.2.0). Can you try the nightly builds? If it works there, the fix has already landed and will be included in the next release.
Mitigation for #326 PiperOrigin-RevId: 692227947
Mitigation for #326 PiperOrigin-RevId: 692234642
Description of the bug:
All examples crash with a log like the one below (this one produced by running the README code to convert resnet18):
After digging through other issues I discovered that

os.environ["PJRT_DEVICE"] = "CUDA"

absolutely must be set before doing the conversion; I could not find any mention of this in the documentation.

Actual vs expected behavior:
Actual: any model conversion, including the official examples, crashes with no explanation.
Expected: they should work.
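For reference, a minimal sketch of the workaround described in this issue. The environment variable is set before any torch_xla / ai-edge-torch code runs; the actual import and conversion call (shown only as comments) are assumed to follow the README example and are not part of the fix itself:

```python
import os

# Workaround: set PJRT_DEVICE before the conversion runs.
# Use "CUDA" on a GPU VM, "CPU" otherwise.
os.environ.setdefault("PJRT_DEVICE", "CUDA")

# ...then the usual README conversion flow, e.g. (assumed, per the README):
# import ai_edge_torch
# edge_model = ai_edge_torch.convert(model.eval(), sample_inputs)
```

Using setdefault rather than a plain assignment keeps an explicitly exported PJRT_DEVICE (e.g. from the shell) from being silently overwritten.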
Any other information you'd like to share?
Tried with both the stable and nightly releases; I think I hit the same issue (but I'm not sure, as I also had many ipykernel crashes and segfaults that I still don't really understand).
uv pip tree --package ai-edge-torch