
compile_pytorch_model.py compile failures (model_path/constants.pkl not found) #22

Open
ljkeller opened this issue Jun 4, 2024 · 7 comments


ljkeller commented Jun 4, 2024

Hello,

I'm having compile failures with compile_pytorch_model.py. Here's my failure:

/drp-ai_tvm/tutorials# python3 compile_pytorch_model.py /home/models/spark_torch.pt -o spark_torch -s 1,3,28,28
[Check arguments]
  Input AI model         :  /home/models/spark_torch.pt
  SDK path               :  /opt/poky/3.1.21
  DRP-AI Translator path :  /opt/drp-ai_translator_release
  Output dir             :  spark_torch
  Input shape            :  (1, 3, 28, 28)
Traceback (most recent call last):
  File "compile_pytorch_model.py", line 69, in <module>
    model = torch.jit.load(model_file)
  File "/usr/local/lib/python3.8/dist-packages/torch/jit/_serialization.py", line 161, in load
    cpp_module = torch._C.import_ir_module(cu, str(f), map_location, _extra_files)
RuntimeError: [enforce fail at inline_container.cc:222] . file not found: v1.0.1_9_epochs_no_norm_97.27/constants.pkl

Interestingly, I trained this model and exported it to both torch and ONNX formats. The ONNX export works: `python3 compile_onnx_model.py /home/models/spark.onnx -o spark -s 1,3,28,28 -i input`.

I'm guessing there is a version incompatibility between the torch I trained/exported with and the torch used here for the conversion? I don't see any documentation about expected torch training versions. I don't have my model-training PC with me right now, or I'd report the torch version.

Here are the models I've tried: spark.zip

Environment

As far as I know, I'm running out of a Docker container I built ~6 months ago with `docker build -t rzv2l_ai_sdk_image --build-arg SDK="/opt/poky/3.1.21" --build-arg PRODUCT="V2L"`.

@matinlotfali

@ljkeller I have the exact same issue. Were you able to fix it?


ljkeller commented Oct 17, 2024

@ljkeller I have the exact same issue. Were you able to fix it?

I remember having multiple issues that day.

I know I opened a PR; I don't think it's related, but #23 is worth looking at.

Otherwise, IIRC, this was a torch or Python versioning issue. Unfortunately, the best advice I have is to binary-search through Python/torch versions; I think one of them changed model exporting. I think a shortcut is to check the model file for constants.pkl, but I don't remember very well.
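A sketch of that constants.pkl shortcut (the helper name is mine, not from the repo; it relies on both save formats being zip archives, which holds for torch.jit.save and for zip-format torch.save checkpoints):

```python
import zipfile

def is_torchscript_archive(path):
    """Return True if `path` looks like a TorchScript archive.

    TorchScript models written by torch.jit.save contain a `constants.pkl`
    entry in their zip archive; plain torch.save checkpoints (zip format)
    carry `data.pkl` instead, which is why torch.jit.load fails with
    "file not found: .../constants.pkl" on them.
    """
    if not zipfile.is_zipfile(path):
        return False  # old-style pickle checkpoint, definitely not TorchScript
    with zipfile.ZipFile(path) as zf:
        return any(name.endswith("constants.pkl") for name in zf.namelist())
```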

I've found ONNX to be much friendlier to use, but even that has an implicit versioning requirement.

@matinlotfali please let me know if you get around the issue.

@matinlotfali

I just learned that the PyTorch model file should be converted to a TorchScripted model via torch.jit.trace to work.
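For reference, a minimal sketch of that conversion (TinyNet is a stand-in module with this issue's 1,3,28,28 input shape, not the actual model): torch.jit.trace records the forward pass on an example input, and torch.jit.save writes the TorchScript archive that torch.jit.load expects.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Stand-in model with the 1,3,28,28 input shape from this issue."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 8, kernel_size=3, padding=1)
        self.fc = nn.Linear(8 * 28 * 28, 10)

    def forward(self, x):
        x = torch.relu(self.conv(x))
        return self.fc(x.flatten(1))

model = TinyNet().eval()
example = torch.randn(1, 3, 28, 28)

# torch.save(model.state_dict(), ...) would produce the checkpoint that
# torch.jit.load rejects; trace + jit.save produces a TorchScript archive.
traced = torch.jit.trace(model, example)
torch.jit.save(traced, "spark_torch_scripted.pt")

# Round-trip: torch.jit.load now succeeds and matches the eager model.
reloaded = torch.jit.load("spark_torch_scripted.pt")
assert torch.allclose(model(example), reloaded(example))
```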


ljkeller commented Oct 23, 2024

I just learned that the PyTorch model file should be converted to a TorchScripted model via torch.jit.trace to work.

I've definitely compiled torch models without doing this explicitly. Do you have a link so I can read up on this? That's frustrating.

@matinlotfali

I think this is a nice reading material: https://www.geeksforgeeks.org/what-are-torch-scripts-in-pytorch/


ljkeller commented Oct 24, 2024

Yes, but I was looking for an explicit callout of the necessity of the jit. Even the TVM docs don't appear to say much from what I've seen.

We grab the TorchScripted model via tracing

is all their torch compile guide says.

I think this is a recipe for wasted developer time. I could add a failure-log warning on the jit.load in the tutorials/compile_pytorch_model*.py files, if anyone thinks that would be useful? That, or some documentation could be updated; I'm not sure where.
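Something like this is what I have in mind (function name and message wording are just a sketch, not what's in the repo): catch the opaque inline_container error and turn it into an actionable hint.

```python
import torch

def load_torchscript_model(model_file):
    """Load a TorchScript model, with a clearer error for eager checkpoints.

    Sketch of the proposed warning; the message text is illustrative.
    """
    try:
        return torch.jit.load(model_file)
    except RuntimeError as err:
        if "constants.pkl" in str(err):
            raise RuntimeError(
                f"{model_file} does not look like a TorchScript archive. "
                "Export it with torch.jit.trace(model, example_input) and "
                "torch.jit.save(...) before compiling."
            ) from err
        raise
```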

@hiroyuki-sakamoto seems to be listening. What do you think?

@hiroyuki-sakamoto
Collaborator

Sorry for the delay in responding, due to the confusion that accompanied the v2.4.0 release.
I have been looking at this issue (#22) and feel it is necessary to add a note that a TorchScript model is required. Also, for various reasons, we are not accustomed to receiving contributions and incorporating them into the Quick, but we appreciate your offer. The next update is scheduled for the end of this year, when we hope to include documentation improvements, etc.
