
Argument passing for TP degree #772

Open

VenkateshPasumarti opened this issue Jan 29, 2025 · 5 comments

Comments

VenkateshPasumarti commented Jan 29, 2025

Feature request

I was trying to generate text embeddings for a Mistral-based model using sentence-transformers, but I am facing a memory issue: the complete model is loaded onto a single NeuronCore, which triggers out-of-memory errors, since the Mistral model requires 16 GB and a single NeuronCore has only 16 GB of memory. So I wanted an argument to activate multiple cores when generating with optimum-neuron.

Motivation

I need to activate multiple cores, and also to be able to run two models in parallel on different cores.

Your contribution

I was able to run smaller models, but I am facing issues with larger ones.

dacorvo (Collaborator) commented Jan 29, 2025

@VenkateshPasumarti thank you for your feedback. You can use the num_cores argument to increase the number of cores on which the model is deployed.
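For reference, a minimal sketch of such an export using the CLI documented in the guide linked below; the model id, shapes, and core count here are illustrative:

```shell
optimum-cli export neuron \
  --model mistralai/Mistral-7B-Instruct-v0.2 \
  --batch_size 1 \
  --sequence_length 4096 \
  --auto_cast_type bf16 \
  --num_cores 8 \
  mistral_neuron/
```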

dacorvo (Collaborator) commented Jan 29, 2025

https://huggingface.co/docs/optimum-neuron/guides/export_model#exporting-llms-to-neuron

VenkateshPasumarti (Author)

Thanks for the reply @dacorvo. Can you tell me how I can run different models in parallel, something like locking certain cores to a certain task?
For example, on an inf2.48xlarge I have multiple cores, so I want to run a sentence-transformers-based embeddings model on the first two cores and a re-ranking model on the next few cores.

dacorvo (Collaborator) commented Jan 30, 2025

@VenkateshPasumarti you can restrict the number of visible cores by using environment variables, but for that each model must run in a separate process (please refer to the AWS Neuron SDK documentation to see how).
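A sketch of the separate-process approach: the serving scripts here are hypothetical, and the core ranges assume an inf2.48xlarge. `NEURON_RT_VISIBLE_CORES` restricts which NeuronCores the Neuron runtime in a given process can see.

```shell
# Pin each model to a disjoint set of NeuronCores.
NEURON_RT_VISIBLE_CORES=0-1 python serve_embeddings.py &  # hypothetical script
NEURON_RT_VISIBLE_CORES=2-5 python serve_reranker.py &    # hypothetical script
```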
Alternatively, you can run each model in a container, mapping only some devices to each container (you can see an example of that using docker compose in the benchmark/text-generation-inference/performance subdirectory).
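The container variant of the same idea, sketched with plain docker run rather than the docker compose file referenced above (the image names are placeholders; on inf2, each /dev/neuronX device exposes two NeuronCores):

```shell
# Embeddings model sees only device 0 (cores 0-1).
docker run -d --device=/dev/neuron0 my-embeddings-image

# Re-ranking model sees devices 1 and 2 (cores 2-5).
docker run -d --device=/dev/neuron1 --device=/dev/neuron2 my-reranker-image
```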

VenkateshPasumarti (Author)

Thanks for the reply and information @dacorvo
