Commit
Cherry-pick Kserve with IRSA and Notebook culling Doc (#512)
* Cherry-pick Kserve with IRSA and Notebook culling Doc into v1.6.1 release
Showing 11 changed files with 163 additions and 0 deletions.
87 changes: 87 additions & 0 deletions
website/content/en/docs/component-guides/kserve/access-aws-services-from-kserve.md
@@ -0,0 +1,87 @@
+++
title = "Configure InferenceService to Access AWS Services from KServe"
description = "Configure inference services to access AWS services, such as pulling images from a private ECR repository and downloading models from an S3 bucket."
weight = 10
+++

## Access AWS Services from KServe with IAM Roles for Service Accounts (IRSA)
1. Export the environment variables:
```bash
export CLUSTER_NAME="<>"
export CLUSTER_REGION="<>"
export PROFILE_NAMESPACE=kubeflow-user-example-com
export SERVICE_ACCOUNT_NAME=aws-sa
# Example: 123456789.dkr.ecr.us-west-2.amazonaws.com/kserve/sklearnserver:v0.8.0
export ECR_IMAGE_URL="<>"
# Example: s3://your-s3-bucket/model
export S3_BUCKET_URL="<>"
```

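Because the later commands expand these variables inside heredocs, a value left at the `<>` placeholder would be written silently into the generated manifests. The following is a minimal sketch of a pre-flight check; `require_vars` is a hypothetical helper introduced here for illustration and is not part of the original guide.

```sh
# Hypothetical helper (not part of the original guide): report every listed
# variable that is unset, empty, or still holds the "<>" placeholder.
# The function body runs in a subshell, so its variables do not leak and
# "exit" only ends the function, not the calling shell.
require_vars() (
  missing=0
  for name in "$@"; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ] || [ "$value" = "<>" ]; then
      echo "missing value for $name" >&2
      missing=1
    fi
  done
  exit "$missing"
)

require_vars CLUSTER_NAME CLUSTER_REGION PROFILE_NAMESPACE \
  SERVICE_ACCOUNT_NAME ECR_IMAGE_URL S3_BUCKET_URL \
  || echo "Set the variables listed above before continuing." >&2
```

The helper only prints a warning here. In a script, you could replace the final `echo` with `exit 1` to stop before any resources are created.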
1. Create a service account with an IAM role using [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html). The following command attaches both the `AmazonEC2ContainerRegistryReadOnly` and `AmazonS3ReadOnlyAccess` IAM policies:
```
eksctl create iamserviceaccount --name ${SERVICE_ACCOUNT_NAME} --namespace ${PROFILE_NAMESPACE} --cluster ${CLUSTER_NAME} --region ${CLUSTER_REGION} --attach-policy-arn=arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly --attach-policy-arn=arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess --override-existing-serviceaccounts --approve
```
> NOTE: This command uses the read-only managed policies for ECR (`AmazonEC2ContainerRegistryReadOnly`) and S3 (`AmazonS3ReadOnlyAccess`). We recommend creating fine-grained policies for production use cases.

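As one possible starting point for the fine-grained policy mentioned in the note, the sketch below writes an IAM policy document that grants read access to a single bucket only. It assumes that listing the bucket and reading its objects is all the model download needs; the helper `s3_bucket_from_url` and the file name `s3-read-policy.json` are hypothetical names chosen for this example.

```sh
# Hypothetical helper: extract the bucket name from an s3://bucket/prefix URL.
# The body runs in a subshell so the temporary variable stays local.
s3_bucket_from_url() (
  rest="${1#s3://}"              # drop the scheme
  printf '%s\n' "${rest%%/*}"    # keep everything before the first slash
)

# Falls back to the example value from the guide when S3_BUCKET_URL is unset.
bucket="$(s3_bucket_from_url "${S3_BUCKET_URL:-s3://your-s3-bucket/model}")"

# Assumption: s3:ListBucket and s3:GetObject are sufficient for your workload.
cat <<EOF > s3-read-policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::${bucket}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject"],
      "Resource": "arn:aws:s3:::${bucket}/*"
    }
  ]
}
EOF
```

You could then register the document with `aws iam create-policy --policy-name <your-policy-name> --policy-document file://s3-read-policy.json` and pass the resulting ARN to `--attach-policy-arn` in place of the managed S3 policy. Review the permissions against your own requirements before using them.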
### Deploy models from an S3 bucket
1. Create a Secret with empty AWS credentials:
```sh
cat <<EOF > secret.yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-secret
  namespace: ${PROFILE_NAMESPACE}
  annotations:
    serving.kserve.io/s3-endpoint: s3.amazonaws.com
    serving.kserve.io/s3-usehttps: "1"
    serving.kserve.io/s3-region: ${CLUSTER_REGION}
type: Opaque
data:
  AWS_ACCESS_KEY_ID: ""
  AWS_SECRET_ACCESS_KEY: ""
EOF
kubectl apply -f secret.yaml
```
> NOTE: The **empty** values for `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` force KServe to add the environment variables to the init containers without overriding the actual credentials from the IAM role (which happens if you add dummy values). These **empty** keys are required for IRSA to work in the current version and will not be needed in a future release.
1. Attach the secret to the IRSA service account in your profile namespace:
```
kubectl patch serviceaccount ${SERVICE_ACCOUNT_NAME} -n ${PROFILE_NAMESPACE} -p '{"secrets": [{"name": "aws-secret"}]}'
```
### Create an InferenceService
1. Specify the service account in the model server spec:
> NOTE: Make sure you have a working image in `${ECR_IMAGE_URL}` and a model in `${S3_BUCKET_URL}` for the InferenceService to work. The model and image versions must be consistent: for example, you cannot use a v1 model with a v2 image.
```sh
cat <<EOF > inferenceService.yaml
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: "sklearn-iris"
  namespace: ${PROFILE_NAMESPACE}
  annotations:
    sidecar.istio.io/inject: "false"
spec:
  predictor:
    serviceAccountName: ${SERVICE_ACCOUNT_NAME}
    model:
      modelFormat:
        name: sklearn
      image: ${ECR_IMAGE_URL}
      storageUri: ${S3_BUCKET_URL}
EOF
kubectl apply -f inferenceService.yaml
```
1. Check the InferenceService status:
```sh
kubectl get inferenceservices sklearn-iris -n ${PROFILE_NAMESPACE}
NAME           URL                                                             READY   PREV   LATEST   PREVROLLEDOUTREVISION   LATESTREADYREVISION                    AGE
sklearn-iris   http://sklearn-iris.kubeflow-user-example-com.example.com      True           100                              sklearn-iris-predictor-default-00001   105s
```
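
The InferenceService may take some time before `READY` shows `True`. The sketch below polls for that state instead of re-running the command by hand. Both `wait_until` and `isvc_ready` are hypothetical helpers written for this example, and the JSONPath expression assumes the resource exposes a `Ready` condition in `.status.conditions`.

```sh
# Hypothetical helper: run a command until it succeeds.
# Usage: wait_until <attempts> <delay-seconds> <command> [args...]
# The body runs in a subshell, so "exit" only ends the function.
wait_until() (
  attempts="$1"; delay="$2"; shift 2
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then exit 0; fi
    i=$((i + 1))
    # Sleep only if another attempt follows.
    [ "$i" -le "$attempts" ] && sleep "$delay"
  done
  exit 1
)

# Hypothetical check: succeeds when the Ready condition of the named
# InferenceService ($1) in the given namespace ($2) reports "True".
isvc_ready() {
  [ "$(kubectl get inferenceservice "$1" -n "$2" \
      -o jsonpath='{.status.conditions[?(@.type=="Ready")].status}')" = "True" ]
}
```

For example, `wait_until 30 10 isvc_ready sklearn-iris "${PROFILE_NAMESPACE}"` would check every 10 seconds, up to 30 times, and return a non-zero status if the service never becomes ready.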
47 changes: 47 additions & 0 deletions
website/content/en/docs/deployment/configure-notebook-culling.md
@@ -0,0 +1,47 @@
+++
title = "Configure Culling for Notebooks"
description = "Automatically stop your notebooks based on idleness"
weight = 80
+++

The culling feature allows you to stop a Notebook Server based on its **Last Activity**. The Notebook Controller updates the `notebooks.kubeflow.org/last-activity` annotation of each Notebook resource according to the execution state of its kernels. When this feature is enabled, a notebook instance is "culled" (scaled to zero) if none of its kernels perform computations for a specified period of time (`CULL_IDLE_TIME`). For more information about this feature, see the [Jupyter notebook idleness proposal](https://github.com/kubeflow/kubeflow/blob/master/components/proposals/20220121-jupyter-notebook-idleness.md).

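To make the idleness rule concrete, here is a small worked example of the comparison described above. It is an illustration only, not the controller's implementation: it assumes the last-activity timestamp has already been converted to Unix epoch seconds, and the exact comparison used by the controller may differ. `should_cull` is a hypothetical name.

```sh
# Illustration only: succeeds (status 0) when a notebook would count as idle.
# Arguments: last activity and current time as Unix epoch seconds, then the
# idle threshold in minutes (the CULL_IDLE_TIME of the paragraph above).
should_cull() {
  [ $(( $2 - $1 )) -ge $(( $3 * 60 )) ]
}

now=1700000000                      # arbitrary example timestamp
last_activity=$((now - 45 * 60))    # last kernel activity 45 minutes earlier

if should_cull "$last_activity" "$now" 30; then
  echo "idle for at least 30 minutes: the notebook would be culled"
else
  echo "still within the idle window: the notebook keeps running"
fi
```

With a 30 minute threshold and 45 minutes of inactivity, the first branch is taken. Raising the threshold to 60 would select the second branch.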
1. Export the following values to configure the [culling policy parameters](https://github.com/kubeflow/kubeflow/blob/master/components/proposals/20220121-jupyter-notebook-idleness.md#api-changes):
```bash
# Whether to enable the culling feature (true/false). ENABLE_CULLING must be set to "true" for this feature to take effect.
export ENABLE_CULLING="true"
# Idle time (in minutes) since the last activity after which a notebook instance is culled
export CULL_IDLE_TIMEOUT="30"
# The controller updates each notebook's LAST_ACTIVITY_ANNOTATION every IDLENESS_CHECK_PERIOD (minutes)
export IDLENESS_CHECK_PERIOD="5"
```

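Since these values are later written verbatim into configuration files, it can help to check their shape first. The sketch below assumes that `ENABLE_CULLING` should be exactly `true` or `false` and that both durations should be positive whole numbers of minutes; `is_bool` and `is_positive_int` are hypothetical helpers, not part of the deployment tooling.

```sh
# Hypothetical check: accept only the literal strings "true" and "false".
is_bool() {
  [ "$1" = "true" ] || [ "$1" = "false" ]
}

# Hypothetical check: accept only digit strings with a value greater than zero.
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit character
  esac
  [ "$1" -gt 0 ]
}

is_bool "${ENABLE_CULLING:-}" \
  || echo "ENABLE_CULLING must be true or false" >&2
is_positive_int "${CULL_IDLE_TIMEOUT:-}" \
  || echo "CULL_IDLE_TIMEOUT must be a positive number of minutes" >&2
is_positive_int "${IDLENESS_CHECK_PERIOD:-}" \
  || echo "IDLENESS_CHECK_PERIOD must be a positive number of minutes" >&2
```

The checks only print warnings, so they are safe to run before the commands in the next step.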
1. The following commands inject those values into a configuration file for setting up Notebook culling:
Select the package manager of your choice.
- For Kustomize and Helm:
{{< tabpane persistLang=false >}}
{{< tab header="Kustomize" lang="sh" >}}
printf '
enableCulling='$ENABLE_CULLING'
cullIdleTime='$CULL_IDLE_TIMEOUT'
idlenessCheckPeriod='$IDLENESS_CHECK_PERIOD'
' > awsconfigs/apps/notebook-controller/params.env
{{< /tab >}}
{{< tab header="Helm" lang="sh" >}}
yq e '.cullingPolicy.enableCulling = env(ENABLE_CULLING)' -i charts/apps/notebook-controller/values.yaml
yq e '.cullingPolicy.cullIdleTime = env(CULL_IDLE_TIMEOUT)' -i charts/apps/notebook-controller/values.yaml
yq e '.cullingPolicy.idlenessCheckPeriod = env(IDLENESS_CHECK_PERIOD)' -i charts/apps/notebook-controller/values.yaml
{{< /tab >}}
{{< /tabpane >}}

- For Terraform, append the notebook culling parameters to the `sample.auto.tfvars` file for your chosen deployment option: [Vanilla]({{< ref "/docs/deployment/vanilla/guide-terraform.md#" >}}), [Cognito]({{< ref "/docs/deployment/cognito/guide-terraform.md#" >}}), [RDS-S3]({{< ref "/docs/deployment/rds-s3/guide-terraform.md#" >}}), or [Cognito-RDS-S3]({{< ref "/docs/deployment/cognito-rds-s3/guide-terraform.md#" >}}).

```sh
cat <<EOT >> sample.auto.tfvars
notebook_enable_culling="${ENABLE_CULLING}"
notebook_cull_idle_time="${CULL_IDLE_TIMEOUT}"
notebook_idleness_check_period="${IDLENESS_CHECK_PERIOD}"
EOT
```
1. Continue deploying Kubeflow based on your [Deployment Option]({{< ref "/docs/deployment/_index.md#" >}}).