Habana to main
#1884
* added cr
* added accelerator profile crd
* added to kustomize
add copy to clipboard icon to tooltips
fixed detected accelerator count
    connected accelerator detection
    added accelerator UI user flow
    hide accelerator dropdown when empty
    switched the format of the notebook identifier
    added accelerator name to serving runtime resource
    added serving runtimes accelerators
commit 9387956
Author: Gage Krumbach <gkrumbach@gmail.com>
Date: Fri Jun 30 14:56:37 2023 -0500

    added accelerator UI user flow
    fixed detected accelerator count
    connected accelerator detection
    added accelerator UI user flow
    hide accelerator dropdown when empty
    switched the format of the notebook identifier
    added accelerator name to serving runtime resource
    added serving runtimes accelerators
commit 26da289
Author: Gage Krumbach <gkrumbach@gmail.com>
Date: Tue Aug 1 16:40:25 2023 -0500

    fix error state in migration

commit 391cbca
Author: Gage Krumbach <gkrumbach@gmail.com>
Date: Tue Aug 1 15:09:25 2023 -0500

    added accelerator detection line

commit 50839ac
Author: Gage Krumbach <gkrumbach@gmail.com>
Date: Thu Jul 27 13:52:24 2023 -0500

    added gpu migration
Accelerator user flow
added gpu migration
added accelerator detection
fix lint errors in accelerator support
update deployed notebooks and sr on migrate
    fixed error logging
    remove container migration
    Added support for "keep what i have"
    soft migrate nvidia gpus to profiles
    fix handle existing settings
    refactored hooks
    remove useRef
    simplify functions
    small changes to hook
    merge hooks together
    update cluster role
    small changes
    bug fixes
    small type fix
    fixed type issues
Fix bug in migration for GPUs
revert add rbac accelerator role
add rbac accelerator role
making image/servingruntime naming dynamic
    fix count going back to 0
    prevent 0 count and ux style fix
    removed usage of unknown when not needed
    remove double array usage
    improved backend logging
    fix logging undefined error
    make ?? consistent
    fixed "||" and fixed unknown / none details
Minor accelerator fixes
fix accelerator detection logic
update cluster role to allow accelerator profile creation
move from cluster role to role for accelerator create
@andrewballantyne added testing instructions
/lgtm
/approve Going ahead with the merge. We'll have another week before code freeze / release. We can adjust small efforts along the way. QE has been verifying the feature for the past couple weeks.
[APPROVALNOTIFIER] This PR is APPROVED
This pull-request has been approved by: andrewballantyne
Closes: #1450
Description
Merging Habana Feature into main.
How Has This Been Tested?
How I am testing Habana
NOTE: Migration and accelerator detection cannot be tested without a GPU cluster.
Test Impact
Tests will be added after the feature is merged.
Request review criteria:
Self checklist (all need to be checked):
If you have UI changes:
After the PR is posted & before it merges: