
Llama 3 conversion to TFLite gets prematurely killed when running the convert_to_tflite.py script #300

Open
Arya-Hari opened this issue Oct 17, 2024 · 5 comments

Comments

@Arya-Hari

Description of the bug:

I installed the library and all the requirements to try out converting the Llama 3 1B model to TFLite format. However, whenever I run the convert_to_tflite.py script, the process ends up getting killed. Any idea why? This is what gets printed on the console. Btw, I'm working on WSL.

```
2024-10-17 14:06:29.116876: I tensorflow/core/util/port.cc:153] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable TF_ENABLE_ONEDNN_OPTS=0.
2024-10-17 14:06:29.978986: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations. To enable the following instructions: AVX2 AVX_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
/home/venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:362: UserWarning: At pre-dispatch tracing, we will assume that any custom op that is marked with CompositeImplicitAutograd and functional are safe to not decompose. We found xla.mark_tensor.default to be one such op.
  warnings.warn(
/home/venv/lib/python3.11/site-packages/torch/_subclasses/functional_tensor.py:362: UserWarning: At pre-dispatch tracing, we will assume that any custom op that is marked with CompositeImplicitAutograd and functional are safe to not decompose. We found xla.mark_tensor.default to be one such op.
  warnings.warn(
WARNING:root:PJRT is now the default runtime. For more information, see https://github.com/pytorch/xla/blob/master/docs/pjrt.md
WARNING:root:Defaulting to PJRT_DEVICE=CPU
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1729174062.687510 372 cpu_client.cc:467] TfrtCpuClient created.
Killed
```
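
For what it's worth, a bare "Killed" with no Python traceback usually means the Linux out-of-memory (OOM) killer terminated the process, which points at RAM rather than a bug in the script. A minimal pre-flight sanity check (a sketch only; psutil is an extra dependency assumed here, not something convert_to_tflite.py requires):

```python
# Check how much RAM is actually available inside WSL before converting.
# psutil is an assumed helper dependency: pip install psutil
import psutil

avail_gb = psutil.virtual_memory().available / 1024**3
print(f"Available RAM: {avail_gb:.1f} GB")
# If this is well below the converter's working set, expect the OOM killer.
```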

Actual vs expected behavior:

Expected behavior: it should run without issues and produce a .tflite file after conversion.
Actual behavior: the conversion process gets prematurely killed.

Any other information you'd like to share?

No response

@Arya-Hari added the type:bug label Oct 17, 2024
@haozha111
Contributor

The converter may need RAM equal to about 3x the model weights' size on your machine. What's your machine's CPU RAM size?
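
As a rough back-of-the-envelope check (my own numbers, assuming fp32 weights get materialized during conversion):

```python
# Rough memory estimate for a ~1B-parameter model (assumption: fp32 during tracing).
params = 1_000_000_000       # ~1B parameters
bytes_per_param = 4          # float32
weights_gb = params * bytes_per_param / 1024**3
print(f"weights: ~{weights_gb:.1f} GB, 3x working set: ~{3 * weights_gb:.1f} GB")
# -> weights: ~3.7 GB, 3x working set: ~11.2 GB, before TensorFlow/XLA overhead
```

That 3x working set alone is already uncomfortably close to a 16 GB machine's limit once the OS and framework overhead are counted.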

@haozha111 self-assigned this Oct 17, 2024
@Arya-Hari
Author

Hello. My machine has 16 GB of installed RAM.

@haozha111
Contributor

I see. The current conversion flow might require a machine with 32 GB of RAM; we are working on improvements.
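
One possible workaround on WSL in the meantime (a sketch, untested here): give the WSL 2 VM a large swap file via %UserProfile%\.wslconfig on the Windows side, so the conversion can spill to disk instead of being killed. The values below are illustrative only:

```
[wsl2]
memory=14GB
swap=32GB
```

Run wsl --shutdown from Windows afterwards so the new limits take effect; expect the conversion to be much slower once it starts swapping.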

@Arya-Hari
Author

Okay I see. Are there any cloud-based alternatives that I can use to run the scripts?

@haozha111
Contributor

Yes, can you try Colab Pro? Or, if you have a remote cloud instance with sufficient memory, that will work too.
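
If you go the Colab route, it may help to first validate the environment with the basic ai_edge_torch conversion flow from the project README before attempting the 1B model (resnet18 here is just a small stand-in model):

```python
import torch
import torchvision
import ai_edge_torch

# Small stand-in model to validate the conversion toolchain end to end.
model = torchvision.models.resnet18(
    weights=torchvision.models.ResNet18_Weights.IMAGENET1K_V1
).eval()
sample_inputs = (torch.randn(1, 3, 224, 224),)

# Trace the PyTorch model and export a TFLite flatbuffer.
edge_model = ai_edge_torch.convert(model, sample_inputs)
edge_model.export("resnet18.tflite")
```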
