From cb464e399be43341f145265630d437c86e50025d Mon Sep 17 00:00:00 2001 From: jsitu777 <59303945+jsitu777@users.noreply.github.com> Date: Tue, 22 Nov 2022 17:35:34 -0800 Subject: [PATCH 1/2] Kserve Doc for configuring IRSA to access AWS services (#501) -add documentation for accessing aws service in Kserve with IRSA https://github.com/awslabs/kubeflow-manifests/issues/436 https://github.com/kserve/kserve/issues/2003#issuecomment-1053530385 Reference: https://github.com/kserve/website/blob/main/docs/modelserving/storage/s3/s3.md (cherry picked from commit 8e2ffc5958da6fd6e17790236c484938c20e318b) --- .../kserve/access-aws-services-from-kserve.md | 87 +++++++++++++++++++ 1 file changed, 87 insertions(+) create mode 100644 website/content/en/docs/component-guides/kserve/access-aws-services-from-kserve.md diff --git a/website/content/en/docs/component-guides/kserve/access-aws-services-from-kserve.md b/website/content/en/docs/component-guides/kserve/access-aws-services-from-kserve.md new file mode 100644 index 0000000000..0d72219520 --- /dev/null +++ b/website/content/en/docs/component-guides/kserve/access-aws-services-from-kserve.md @@ -0,0 +1,87 @@ ++++ +title = "Configure inferenceService to Access AWS Services from KServe" +description = "Configuration for accessing AWS services for inference services such as pulling images from private ECR and downloading models from S3 bucket." +weight = 10 ++++ + +## Access AWS Service from Kserve with IAM Roles for ServiceAccount(IRSA) +1. Export env values: + ```bash + export CLUSTER_NAME="<>" + export CLUSTER_REGION="<>" + export PROFILE_NAMESPACE=kubeflow-user-example-com + export SERVICE_ACCOUNT_NAME=aws-sa + # 123456789.dkr.ecr.us-west-2.amazonaws.com/kserve/sklearnserver:v0.8.0 + export ECR_IMAGE_URL="<>" + # s3://your-s3-bucket/model + export S3_BUCKET_URL="<>" + ``` + + +1. Create Service Account with IAM Role using [IRSA](https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html). 
The following command attaches both `AmazonEC2ContainerRegistryReadOnly` and `AmazonS3ReadOnlyAccess` IAM policies: + ``` + eksctl create iamserviceaccount --name ${SERVICE_ACCOUNT_NAME} --namespace ${PROFILE_NAMESPACE} --cluster ${CLUSTER_NAME} --region ${CLUSTER_REGION} --attach-policy-arn=arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly --attach-policy-arn=arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess --override-existing-serviceaccounts --approve + ``` + > NOTE: You can use ECR (`AmazonEC2ContainerRegistryReadOnly`) and S3 (`AmazonS3ReadOnlyAccess`) ReadOnly managed policies. We recommend creating fine-grained policies for production use cases. + +### Deploy models from S3 Bucket +1. Create a Secret with empty AWS credentials: + ```sh + cat <<EOF > secret.yaml + apiVersion: v1 + kind: Secret + metadata: + name: aws-secret + namespace: ${PROFILE_NAMESPACE} + annotations: + serving.kserve.io/s3-endpoint: s3.amazonaws.com + serving.kserve.io/s3-usehttps: "1" + serving.kserve.io/s3-region: ${CLUSTER_REGION} + type: Opaque + data: + AWS_ACCESS_KEY_ID: "" + AWS_SECRET_ACCESS_KEY: "" + EOF + + kubectl apply -f secret.yaml + ``` + > NOTE: The **empty** keys for `AWS_ACCESS_KEY_ID/AWS_SECRET_ACCESS_KEY` force the env vars to be added to the init containers without overriding the actual credentials from the IAM role (which happens if you add dummy values). These **empty** keys are needed for IRSA to work in the current version and will not be needed in a future release. + +1. Attach the secret to the IRSA service account in your profile namespace: + ``` + kubectl patch serviceaccount ${SERVICE_ACCOUNT_NAME} -n ${PROFILE_NAMESPACE} -p '{"secrets": [{"name": "aws-secret"}]}' + ``` + + +### Create an InferenceService +1. Specify the service account in the model server spec: +> NOTE: Make sure you have a working image in `${ECR_IMAGE_URL}` and a model in `${S3_BUCKET_URL}` for the InferenceService to work. The model and image versions must be consistent: e.g., you cannot use a v1 model with a v2 image. 
+ + ```sh + cat <<EOF > inferenceService.yaml + apiVersion: serving.kserve.io/v1beta1 + kind: InferenceService + metadata: + name: "sklearn-iris" + namespace: ${PROFILE_NAMESPACE} + annotations: + sidecar.istio.io/inject: "false" + spec: + predictor: + serviceAccountName: ${SERVICE_ACCOUNT_NAME} + model: + modelFormat: + name: sklearn + image: ${ECR_IMAGE_URL} + storageUri: ${S3_BUCKET_URL} + EOF + + kubectl apply -f inferenceService.yaml + ``` + +1. Check the InferenceService status: + ```sh + kubectl get inferenceservices sklearn-iris -n ${PROFILE_NAMESPACE} + NAME URL READY PREV LATEST PREVROLLEDOUTREVISION LATESTREADYREVISION AGE + sklearn-iris http://sklearn-iris.kubeflow-user-example-com.example.com True 100 sklearn-iris-predictor-default-00001 105s + ``` From 4a20506cea002f9f4a29b6dfee686109c1401bcb Mon Sep 17 00:00:00 2001 From: jsitu777 <59303945+jsitu777@users.noreply.github.com> Date: Tue, 22 Nov 2022 15:15:44 -0800 Subject: [PATCH 2/2] Add Notebook Culling Feature Doc (#490) **Which issue is resolved by this Pull Request:** Resolves # https://github.com/awslabs/kubeflow-manifests/issues/439 **Description of your changes:** -Added documentation in notebook component guide for culling feature (cherry picked from commit 3e605fb44c6c0bdfcacba0e87b6682f608ff2685) --- .../cognito-rds-s3/guide-terraform.md | 3 ++ .../docs/deployment/cognito-rds-s3/guide.md | 4 ++ .../deployment/cognito/guide-terraform.md | 3 ++ .../cognito/manifest/guide-automated.md | 3 ++ .../docs/deployment/cognito/manifest/guide.md | 3 ++ .../deployment/configure-notebook-culling.md | 47 +++++++++++++++++++ .../docs/deployment/rds-s3/guide-terraform.md | 3 ++ .../en/docs/deployment/rds-s3/guide.md | 3 ++ .../deployment/vanilla/guide-terraform.md | 3 ++ .../en/docs/deployment/vanilla/guide.md | 4 ++ 10 files changed, 76 insertions(+) create mode 100644 website/content/en/docs/deployment/configure-notebook-culling.md diff --git a/website/content/en/docs/deployment/cognito-rds-s3/guide-terraform.md 
b/website/content/en/docs/deployment/cognito-rds-s3/guide-terraform.md index 769cfdf53f..66254cd2a7 100644 --- a/website/content/en/docs/deployment/cognito-rds-s3/guide-terraform.md +++ b/website/content/en/docs/deployment/cognito-rds-s3/guide-terraform.md @@ -102,6 +102,9 @@ pwd EOF ``` +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + ### View all Configurations View [all possible configuration options of the terraform stack](https://github.com/awslabs/kubeflow-manifests/blob/main/deployments/cognito-rds-s3/terraform/variables.tf) in the `variables.tf` file. diff --git a/website/content/en/docs/deployment/cognito-rds-s3/guide.md b/website/content/en/docs/deployment/cognito-rds-s3/guide.md index 3184890053..ed8eb9d6d8 100644 --- a/website/content/en/docs/deployment/cognito-rds-s3/guide.md +++ b/website/content/en/docs/deployment/cognito-rds-s3/guide.md @@ -26,6 +26,10 @@ Refer to the [general prerequisites guide]({{< ref "/docs/deployment/prerequisit 1. Create TLS certificates for the domain 1. Create a Cognito Userpool 1. Configure Ingress + +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + 2. Deploy Kubeflow. 1. 
Install Kubeflow using the following command: {{< tabpane persistLang=false >}} diff --git a/website/content/en/docs/deployment/cognito/guide-terraform.md b/website/content/en/docs/deployment/cognito/guide-terraform.md index eed0b9854d..e0695ed980 100644 --- a/website/content/en/docs/deployment/cognito/guide-terraform.md +++ b/website/content/en/docs/deployment/cognito/guide-terraform.md @@ -80,6 +80,9 @@ pwd EOF ``` +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + ### View all configurations View [all possible configuration options of the terraform stack](https://github.com/awslabs/kubeflow-manifests/blob/main/deployments/cognito/terraform/variables.tf) in the `variables.tf` file. diff --git a/website/content/en/docs/deployment/cognito/manifest/guide-automated.md b/website/content/en/docs/deployment/cognito/manifest/guide-automated.md index 797b3c872d..e08e54a18d 100644 --- a/website/content/en/docs/deployment/cognito/manifest/guide-automated.md +++ b/website/content/en/docs/deployment/cognito/manifest/guide-automated.md @@ -102,6 +102,9 @@ Each section is detailed in [Cognito Manual Deployment Guide]({{< ref "/docs/dep us-east-1-certARN: arn:aws:acm:us-east-1:123456789012:certificate/373cc726-f525-4bc7-b7bf-d1d7b641c238 ``` +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. 
+ ## 2.0 Install Kubeflow Install Kubeflow using the following command: diff --git a/website/content/en/docs/deployment/cognito/manifest/guide.md b/website/content/en/docs/deployment/cognito/manifest/guide.md index 6e22b256f7..72c5dab347 100644 --- a/website/content/en/docs/deployment/cognito/manifest/guide.md +++ b/website/content/en/docs/deployment/cognito/manifest/guide.md @@ -157,6 +157,9 @@ yq e '.LOGOUT_URL = env(CognitoLogoutURL)' -i charts/common/aws-authservice/valu 1. Follow the [Configure Load Balancer Controller]({{< ref "/docs/add-ons/load-balancer/guide.md#configure-load-balancer-controller" >}}) section of the load balancer guide to setup the resources required by the load balancer controller. +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + ## 4.0 Build the manifests and deploy Kubeflow {{< tabpane persistLang=false >}} diff --git a/website/content/en/docs/deployment/configure-notebook-culling.md b/website/content/en/docs/deployment/configure-notebook-culling.md new file mode 100644 index 0000000000..8f6aa88f59 --- /dev/null +++ b/website/content/en/docs/deployment/configure-notebook-culling.md @@ -0,0 +1,47 @@ ++++ +title = "Configure Culling for Notebooks" +description = "Automatically stop your notebooks based on idleness" +weight = 80 ++++ + + The culling feature allows you to stop a Notebook Server based on its **Last Activity**. The Notebook Controller updates the respective `notebooks.kubeflow.org/last-activity` annotation of each Notebook resource according to the execution state of the kernels. When this feature is enabled, the notebook instances will be "culled" (scaled to zero) if none of the kernels are performing computations for a specified period of time (`CULL_IDLE_TIME`). 
More information about this feature can be found in the [Jupyter notebook idleness proposal](https://github.com/kubeflow/kubeflow/blob/master/components/proposals/20220121-jupyter-notebook-idleness.md). + +1. Export the following values to configure the [culling policy parameters](https://github.com/kubeflow/kubeflow/blob/master/components/proposals/20220121-jupyter-notebook-idleness.md#api-changes): + ```bash + # whether to enable the culling feature (true/false). ENABLE_CULLING must be set to "true" for this feature to work + export ENABLE_CULLING="true" + # idle time (minutes) since last activity after which a notebook instance is culled + export CULL_IDLE_TIMEOUT="30" + # the controller updates each notebook's LAST_ACTIVITY_ANNOTATION every IDLENESS_CHECK_PERIOD (minutes) + export IDLENESS_CHECK_PERIOD="5" + ``` + +1. The following commands inject those values into the configuration file used to set up Notebook culling. + Select the package manager of your choice: + - For Kustomize and Helm: + {{< tabpane persistLang=false >}} + {{< tab header="Kustomize" lang="sh" >}} +printf ' +enableCulling='$ENABLE_CULLING' +cullIdleTime='$CULL_IDLE_TIMEOUT' +idlenessCheckPeriod='$IDLENESS_CHECK_PERIOD' +' > awsconfigs/apps/notebook-controller/params.env + {{< /tab >}} + {{< tab header="Helm" lang="sh" >}} +yq e '.cullingPolicy.enableCulling = env(ENABLE_CULLING)' -i charts/apps/notebook-controller/values.yaml +yq e '.cullingPolicy.cullIdleTime = env(CULL_IDLE_TIMEOUT)' -i charts/apps/notebook-controller/values.yaml +yq e '.cullingPolicy.idlenessCheckPeriod = env(IDLENESS_CHECK_PERIOD)' -i charts/apps/notebook-controller/values.yaml + {{< /tab >}} + {{< /tabpane >}} + + - For Terraform, append the notebook culling parameters to the `sample.auto.tfvars` file of your chosen deployment option: [Vanilla]({{< ref "/docs/deployment/vanilla/guide-terraform.md#" >}}), [Cognito]({{< ref "/docs/deployment/cognito/guide-terraform.md#" >}}), [RDS-S3]({{< ref 
"/docs/deployment/rds-s3/guide-terraform.md#" >}}), and [Cognito-RDS-S3]({{< ref "/docs/deployment/cognito-rds-s3/guide-terraform.md#" >}}). + + ```sh + cat <<EOT >> sample.auto.tfvars + notebook_enable_culling="${ENABLE_CULLING}" + notebook_cull_idle_time="${CULL_IDLE_TIMEOUT}" + notebook_idleness_check_period="${IDLENESS_CHECK_PERIOD}" + EOT + ``` + +1. Continue deploying Kubeflow based on your [Deployment Option]({{< ref "/docs/deployment/_index.md#" >}}). diff --git a/website/content/en/docs/deployment/rds-s3/guide-terraform.md b/website/content/en/docs/deployment/rds-s3/guide-terraform.md index 4cde5868b8..a8255fe3fd 100644 --- a/website/content/en/docs/deployment/rds-s3/guide-terraform.md +++ b/website/content/en/docs/deployment/rds-s3/guide-terraform.md @@ -77,6 +77,9 @@ pwd EOF ``` +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + ### All Configurations A full list of inputs for the terraform stack can be found [here](https://github.com/awslabs/kubeflow-manifests/blob/main/deployments/rds-s3/terraform/variables.tf). diff --git a/website/content/en/docs/deployment/rds-s3/guide.md b/website/content/en/docs/deployment/rds-s3/guide.md index d047fd48d6..72fdd95df5 100644 --- a/website/content/en/docs/deployment/rds-s3/guide.md +++ b/website/content/en/docs/deployment/rds-s3/guide.md @@ -238,6 +238,9 @@ yq e '.s3.minioServiceRegion = env(CLUSTER_REGION)' -i charts/apps/kubeflow-pipe {{< /tab >}} {{< /tabpane >}} +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. 
+ ## 3.0 Build Manifests and install Kubeflow diff --git a/website/content/en/docs/deployment/vanilla/guide-terraform.md b/website/content/en/docs/deployment/vanilla/guide-terraform.md index b2d860cce0..0b423bfc75 100644 --- a/website/content/en/docs/deployment/vanilla/guide-terraform.md +++ b/website/content/en/docs/deployment/vanilla/guide-terraform.md @@ -54,6 +54,9 @@ pwd EOF ``` +### (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + ### All Configurations A full list of inputs for the terraform stack can be found [here](https://github.com/awslabs/kubeflow-manifests/blob/main/deployments/vanilla/terraform/variables.tf). diff --git a/website/content/en/docs/deployment/vanilla/guide.md b/website/content/en/docs/deployment/vanilla/guide.md index cca9c6c0f9..5a59be9af3 100644 --- a/website/content/en/docs/deployment/vanilla/guide.md +++ b/website/content/en/docs/deployment/vanilla/guide.md @@ -14,6 +14,10 @@ Be sure that you have satisfied the installation prerequisites before working th - [Set up your deployment environment]({{< ref "prerequisites.md" >}}) - [Create an EKS Cluster]({{< ref "create-eks-cluster.md" >}}) +## (Optional) Configure Culling for Notebooks +Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide. + + ## Build Manifests and install Kubeflow > ⚠️ Warning: We use a default email (`user@example.com`) and password (`12341234`) for our guides. For any production Kubeflow deployment, you should change the default password by following the steps in [Change default user password]({{< ref "../connect-kubeflow-dashboard#change-the-default-user-password-kustomize" >}}).
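
The culling guide added in `configure-notebook-culling.md` writes the three exported values verbatim into `params.env`, `values.yaml`, or `sample.auto.tfvars`, so a mistyped value may only surface after deployment. A minimal sketch of a pre-flight check follows. The helper names `is_positive_int` and `validate_culling` are hypothetical and not part of kubeflow-manifests, and the accepted values (`true`/`false`, positive whole minutes) are an assumption based on the comments in the guide rather than on the notebook controller's own validation.

```sh
# Hypothetical helpers, not part of kubeflow-manifests.

# Succeeds only when $1 is a positive whole number (digits only, greater than zero).
is_positive_int() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ "$1" -gt 0 ]
}

# Arguments: ENABLE_CULLING, CULL_IDLE_TIMEOUT, IDLENESS_CHECK_PERIOD.
# Prints "ok" and returns 0 when all three look sane; otherwise reports
# the first problem on stderr and returns 1.
validate_culling() {
  case "$1" in
    true|false) ;;
    *) echo "ENABLE_CULLING must be true or false, got '$1'" >&2; return 1 ;;
  esac
  if ! is_positive_int "$2"; then
    echo "CULL_IDLE_TIMEOUT must be a positive number of minutes, got '$2'" >&2
    return 1
  fi
  if ! is_positive_int "$3"; then
    echo "IDLENESS_CHECK_PERIOD must be a positive number of minutes, got '$3'" >&2
    return 1
  fi
  echo "ok"
}
```

After exporting the values, one way to use it is `validate_culling "${ENABLE_CULLING}" "${CULL_IDLE_TIMEOUT}" "${IDLENESS_CHECK_PERIOD}" || exit 1`, run before the `printf`, `yq`, or `cat` step of the chosen deployment option.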