Add Neuron backend #3033
Open: dacorvo wants to merge 20 commits into main from neuron_backend
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
This is the continuation of #3018.
The base image used to compile the rust components seems to have a low ulimit for open files, which leads to errors during compilation.
The neuron tests require models to have been previously exported and cached on the hub. This is done automatically by the neuron.model fixture the first time the tests are run for a specific version. This fixture used to export the models using optimum-neuron directly, but this package is not necessarily present on the system. Instead, the export is now done through the neuron TGI itself, since it contains all the tools required to export the models. Note that since the CI runs docker in docker (dind), it does not seem possible to share a volume between the CI container and the container used to export the model. For that reason, a specific image with a modified entrypoint is built on the fly when a model export is required.
The SageMaker image is built differently anyway.
We now manually evaluate the apparent hash of the neuron backend by combining the hash of the neuron backend directory and Dockerfile. This new hash is used to identify exported neuron models instead of the image sha. This has two benefits:
- it changes less frequently (only when the neuron backend changes), which means fewer neuron models being pushed to the hub,
- it can be evaluated locally, meaning that running the tests once locally will export the models before the CI uses them.
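As an illustration of the idea in this commit message, here is a minimal sketch of a content hash that combines a backend directory and a Dockerfile. It is not the code from this PR: the function name backend_hash, the choice of SHA-256, and the way file paths are mixed into the digest are all assumptions made for the example.

```python
import hashlib
from pathlib import Path


def backend_hash(backend_dir: Path, dockerfile: Path) -> str:
    """Return a hex digest covering every file under backend_dir plus the Dockerfile.

    Hypothetical helper: the actual PR may combine the hashes differently.
    """
    digest = hashlib.sha256()
    # Sort the files so the result does not depend on filesystem ordering.
    files = sorted(p for p in backend_dir.rglob("*") if p.is_file())
    for path in files:
        # Include the relative path so that renaming a file changes the hash.
        digest.update(path.relative_to(backend_dir).as_posix().encode("utf-8"))
        digest.update(path.read_bytes())
    # The Dockerfile contributes to the same digest.
    digest.update(dockerfile.read_bytes())
    return digest.hexdigest()
```

Because such a hash depends only on file contents, it can be computed on a developer machine as well as in CI, which is what makes the local pre-export described above possible.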
What does this PR do?
This adds the neuron backend that was previously maintained in the optimum-neuron repository.
This backend is built on top of the AWS Neuron SDK, and comprises:
Documentation
A dedicated documentation page has been added in the backends subsection.
Tests
The backend comes with some dedicated tests:
Both sets of tests require some models to be pre-exported and cached to test:
The server tests are not run for the moment, only the integration tests.
Since these tests are very specific to neuron, they can only be activated by specifying the new --neuron python option.
Conversely, as soon as the --neuron option is set, all tests that do not have the neuron marker are disabled.
The neuron integration tests also use specific fixtures that have been added as local plugins:
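For readers unfamiliar with this kind of gating, the --neuron behaviour described above can be sketched with standard pytest hooks in a conftest.py. This is an illustration only, not the code from this PR: the helper is_selected is hypothetical, and whether the PR deselects or skips the disabled tests is an assumption (deselection is used here).

```python
def pytest_addoption(parser):
    # Register the command-line flag; it is off by default.
    parser.addoption(
        "--neuron",
        action="store_true",
        default=False,
        help="Run only the tests carrying the neuron marker.",
    )


def pytest_configure(config):
    # Declare the marker so pytest does not warn about an unknown mark.
    config.addinivalue_line("markers", "neuron: test specific to the neuron backend")


def is_selected(has_neuron_marker: bool, neuron_enabled: bool) -> bool:
    # Hypothetical helper: neuron tests run only with --neuron,
    # and all other tests run only without it.
    return has_neuron_marker == neuron_enabled


def pytest_collection_modifyitems(config, items):
    neuron_enabled = config.getoption("--neuron")
    selected, deselected = [], []
    for item in items:
        has_marker = item.get_closest_marker("neuron") is not None
        if is_selected(has_marker, neuron_enabled):
            selected.append(item)
        else:
            deselected.append(item)
    if deselected:
        # Report the dropped tests as deselected and keep only the rest.
        config.hook.pytest_deselected(items=deselected)
        items[:] = selected
```

With such a conftest.py, a plain pytest run would ignore tests decorated with @pytest.mark.neuron, while pytest --neuron would run only those.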
Next steps