Add documentation about creating Pipeline Profiles (awslabs#700)
**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**
Add information about how to create Profiles that use IRSA and have
correct s3 bucket access for Pipelines.

**Testing:**
- [ ] Unit tests pass
- [ ] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
ryansteakley authored and jsitu777 committed Jun 27, 2023
1 parent b1d5e1d commit 6c8dfa0
Showing 7 changed files with 157 additions and 10 deletions.
@@ -144,6 +144,12 @@ Run the following command:
make deploy
```
## Creating Profiles
A default profile named `kubeflow-user-example-com` for the email `user@example.com` is configured with this deployment. If you are using IRSA as your `PIPELINE_S3_CREDENTIAL_OPTION`, any additional profiles that you create must also be configured with IRSA and S3 bucket access. Follow the [pipeline profiles]({{< ref "/docs/deployment/create-profiles-with-iam-role.md" >}}) guide for instructions on how to create additional profiles.
If you are not using IRSA, you can create a profile by specifying only the user's email address.
## Connect to your Kubeflow dashboard
1. Head over to your user pool in the Cognito console and create a user with email `user@example.com` in `Users and groups`.
18 changes: 15 additions & 3 deletions website/content/en/docs/deployment/cognito-rds-s3/guide.md
@@ -34,9 +34,16 @@ Enable culling for notebooks by following the [instructions]({{< ref "/docs/depl
2. Deploy Kubeflow.

1. Export your pipeline-s3-credential-option
```bash
export PIPELINE_S3_CREDENTIAL_OPTION=<IRSA/STATIC>
```
{{< tabpane persistLang=false >}}
{{< tab header="IRSA" lang="toml" >}}
# Pipeline S3 Credential Option to configure
export PIPELINE_S3_CREDENTIAL_OPTION="irsa"
{{< /tab >}}
{{< tab header="IAM User" lang="toml" >}}
# Pipeline S3 Credential Option to configure
export PIPELINE_S3_CREDENTIAL_OPTION="static"
{{< /tab >}}
{{< /tabpane >}}

1. Install Kubeflow using the following command:

@@ -56,6 +63,11 @@ make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=cognito-rds-s3 P
1. Create a profile for the user from the user pool
1. Connect to the central dashboard

## Creating Profiles
A default profile named `kubeflow-user-example-com` for the email `user@example.com` is configured with this deployment. If you are using IRSA as your `PIPELINE_S3_CREDENTIAL_OPTION`, any additional profiles that you create must also be configured with IRSA and S3 bucket access. Follow the [pipeline profiles]({{< ref "/docs/deployment/create-profiles-with-iam-role.md" >}}) guide for instructions on how to create additional profiles.

If you are not using IRSA, you can create a profile by specifying only the user's email address.
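
As a rough illustration of the non-IRSA case, the sketch below writes a Profile manifest that names only the owner's email and omits the `AwsIamForServiceAccount` plugin. The values `team-a` and `alice@example.com` and the file name `profile_basic.yaml` are hypothetical placeholders, not values from this deployment.

```bash
# Hypothetical example values; substitute your own profile name and user email.
PROFILE_NAME="team-a"
PROFILE_USER="alice@example.com"

# Write a Profile manifest with an owner and no plugins section.
cat <<EOF > profile_basic.yaml
apiVersion: kubeflow.org/v1
kind: Profile
metadata:
  name: ${PROFILE_NAME}
spec:
  owner:
    kind: User
    name: ${PROFILE_USER}
EOF

# Review the generated manifest before applying it.
cat profile_basic.yaml

# Apply it once the manifest looks right (requires cluster access):
# kubectl apply -f profile_basic.yaml
```

The `kubectl apply` line is left commented out so the manifest can be inspected first.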

## Uninstall Kubeflow
> Note: Delete all the resources you might have created in your profile namespaces before running these steps.
1. Run the following commands to delete the profiles, ingress and corresponding ingress managed load balancer
4 changes: 2 additions & 2 deletions website/content/en/docs/deployment/cognito/manifest/guide.md
@@ -204,8 +204,8 @@ make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=cognito
Before connecting to the dashboard:

* Go to the Cognito console and create some users in `Users and groups`. These are the users who will log in to the central dashboard.
![cognito-user-pool-created](https://raw.githubusercontent.com/awslabs/kubeflow-manifests/main/website/content/en/docs/images/cognito/cognito-user-pool-created.png)

- Create a user with email address `user@example.com`. This user and email address come preconfigured and have a Profile created by default.
![cognito-user-pool-created](https://raw.githubusercontent.com/awslabs/kubeflow-manifests/main/website/content/en/docs/images/cognito/cognito-user-pool-created.png)
* Create a Profile for a user by following the steps in the [Manual Profile Creation](https://www.kubeflow.org/docs/components/multi-tenancy/getting-started/#manual-profile-creation).
The following is a Profile example for reference:
```bash
120 changes: 120 additions & 0 deletions website/content/en/docs/deployment/create-profiles-with-iam-role.md
@@ -0,0 +1,120 @@
+++
title = "Create Profiles with IAM role"
description = "Use AWS IAM roles for service accounts with Kubeflow Profiles"
weight = 70
+++

In a multi-tenant Kubeflow installation, the pods created by pipeline workflows and the pipelines frontend services run in a user's profile namespace. The service account (`default-editor`) used for these pods needs permissions on the S3 bucket that Pipelines uses to read and write artifacts. When using IRSA (IAM roles for service accounts) as your `PIPELINE_S3_CREDENTIAL_OPTION`, any additional profiles created as part of a multi-user deployment, besides the preconfigured `kubeflow-user-example-com`, must be configured with permissions to the S3 bucket using IRSA.

The `default-editor` service account must be annotated with an IAM role that has sufficient permissions to access your S3 bucket to run your pipelines. The steps below configure a Profile with an IAM role that has restricted access to a specific S3 bucket, using the `AwsIamForServiceAccount` plugin for Profiles. To learn more about the `AwsIamForServiceAccount` plugin, read the [Profiles component guide]({{< ref "/docs/component-guides/profiles.md" >}}).

> Note: If you choose to run your pipeline with a service account other than the default which is `default-editor`, you must make sure to annotate that service account with an IAM role with sufficient S3 permissions.
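
For a non-default service account, the annotation could be composed along the lines of the sketch below. It assumes the standard IRSA annotation key `eks.amazonaws.com/role-arn`; the account ID, namespace, service account name, and role name are hypothetical placeholders. The script only prints the `kubectl annotate` command so that it can be reviewed before it is run.

```bash
# Hypothetical example values; none of these come from the guide.
AWS_ACCOUNT_ID="111122223333"
NAMESPACE="team-a"
SERVICE_ACCOUNT="pipeline-runner-custom"
ROLE_NAME="team-a-pipeline-s3-role"

# Build the role ARN and the IRSA annotation (key=value form).
ROLE_ARN="arn:aws:iam::${AWS_ACCOUNT_ID}:role/${ROLE_NAME}"
ANNOTATION="eks.amazonaws.com/role-arn=${ROLE_ARN}"

# Print the command instead of executing it, so it can be checked first.
echo "kubectl annotate serviceaccount ${SERVICE_ACCOUNT} -n ${NAMESPACE} ${ANNOTATION} --overwrite"
```

The role's trust policy would also need to allow that service account as the `sub` claim, in the form `system:serviceaccount:<namespace>:<service-account-name>`.
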
## Create a Profile

After installing Kubeflow on AWS with one of the available [deployment options]({{< ref "/docs/deployment" >}}), you can configure Kubeflow Profiles with the following steps:

1. Define the following environment variables:

The `S3_BUCKET` that is exported should be the same bucket that is used by Kubeflow Pipelines.
```bash
# Your cluster name
export CLUSTER_NAME=
# Your cluster region
export CLUSTER_REGION=
# The S3 Bucket that is used by Kubeflow Pipelines
export S3_BUCKET=
# Your AWS Account ID
export AWS_ACCOUNT_ID=$(aws sts get-caller-identity --query "Account" --output text)
# Name of the profile to create
export PROFILE_NAME=
```
2. Retrieve OIDC Provider URL

```bash
aws --region $CLUSTER_REGION eks update-kubeconfig --name $CLUSTER_NAME

export OIDC_URL=$(aws eks describe-cluster --region $CLUSTER_REGION --name $CLUSTER_NAME --query "cluster.identity.oidc.issuer" --output text | cut -c9-)
```

3. Create an IAM trust policy to authorize federated requests from the OIDC provider.

```bash
# The "sub" condition must name the default-editor service account
# in the namespace of the Profile being created.
cat <<EOF > trust.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::${AWS_ACCOUNT_ID}:oidc-provider/${OIDC_URL}"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "${OIDC_URL}:aud": "sts.amazonaws.com",
          "${OIDC_URL}:sub": "system:serviceaccount:${PROFILE_NAME}:default-editor"
        }
      }
    }
  ]
}
EOF
```
4. [Create an IAM policy](https://docs.aws.amazon.com/IAM/latest/UserGuide/access_policies_create.html) with access to the S3 bucket where pipeline artifacts will be stored. The following policy grants full access to the S3 bucket; you can scope it down to read, write, and `GetBucketLocation` permissions.
```bash
# An unquoted heredoc is used so that ${S3_BUCKET} is expanded by the shell.
cat <<EOF > ./s3_policy.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": "s3:*",
      "Resource": [
        "arn:aws:s3:::${S3_BUCKET}",
        "arn:aws:s3:::${S3_BUCKET}/*"
      ]
    }
  ]
}
EOF
```
5. [Create an IAM role](https://docs.aws.amazon.com/IAM/latest/UserGuide/id_roles_create.html) for the Profile using the scoped policy from the previous step.
```bash
aws iam create-role --role-name $PROFILE_NAME-$CLUSTER_NAME-role --assume-role-policy-document file://trust.json
aws --region $CLUSTER_REGION iam put-role-policy --role-name $PROFILE_NAME-$CLUSTER_NAME-role --policy-name kf-$PROFILE_NAME-pipeline-s3 --policy-document file://s3_policy.json
```
6. Create a user in your configured auth provider (e.g. Cognito or Dex).
Export the user as an environment variable.
```bash
export PROFILE_USER=""
```
7. Create a Profile using the `PROFILE_NAME`.
> Note: annotateOnly has been set to true. This means that the Profile Controller will not mutate your IAM Role and Policy.
```bash
cat <<EOF > profile_iam.yaml
apiVersion: kubeflow.org/v1
kind: Profile
metadata:
  name: ${PROFILE_NAME}
spec:
  owner:
    kind: User
    name: ${PROFILE_USER}
  plugins:
  - kind: AwsIamForServiceAccount
    spec:
      awsIamRole: $(aws iam get-role --role-name $PROFILE_NAME-$CLUSTER_NAME-role --output text --query 'Role.Arn')
      annotateOnly: true
EOF
kubectl apply -f profile_iam.yaml
```
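
Step 4 mentions that the full-access policy can be scoped down. One possible shape for such a policy is sketched below: bucket-level actions on the bucket ARN and object-level actions on the objects under it. The exact set of actions that Kubeflow Pipelines needs is an assumption here (in particular `s3:ListBucket`), and the bucket name and the file name `s3_policy_scoped.json` are hypothetical.

```bash
# Hypothetical bucket name; in the steps above this is the exported S3_BUCKET.
S3_BUCKET="example-pipeline-bucket"

# Unquoted heredoc so that ${S3_BUCKET} is expanded into the policy document.
cat <<EOF > s3_policy_scoped.json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:ListBucket", "s3:GetBucketLocation"],
      "Resource": "arn:aws:s3:::${S3_BUCKET}"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::${S3_BUCKET}/*"
    }
  ]
}
EOF

# Sanity check: fail loudly if any variable was left unexpanded.
if grep -q '\${' s3_policy_scoped.json; then
  echo "policy still contains an unexpanded variable" >&2
  exit 1
fi
cat s3_policy_scoped.json
```

If pipelines fail with access-denied errors under a policy like this, the missing action can be added to the relevant statement rather than reverting to `s3:*`.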
5 changes: 5 additions & 0 deletions website/content/en/docs/deployment/rds-s3/guide-terraform.md
@@ -119,6 +119,11 @@ Run the following command:
make deploy
```
## Creating Profiles
A default profile named `kubeflow-user-example-com` for the email `user@example.com` is configured with this deployment. If you are using IRSA as your `PIPELINE_S3_CREDENTIAL_OPTION`, any additional profiles that you create must also be configured with IRSA and S3 bucket access. Follow the [pipeline profiles]({{< ref "/docs/deployment/create-profiles-with-iam-role.md" >}}) guide for instructions on how to create additional profiles.
If you are not using IRSA, you can create a profile by specifying only the user's email address.
## Connect to your Kubeflow dashboard
For information on connecting to your Kubeflow dashboard depending on your deployment environment, see [Port-forward (Terraform deployment)]({{< ref "../connect-kubeflow-dashboard/#port-forward-terraform-deployment" >}}). Then, [log into the Kubeflow UI]({{< ref "../connect-kubeflow-dashboard/#log-into-the-kubeflow-ui" >}}).
14 changes: 9 additions & 5 deletions website/content/en/docs/deployment/rds-s3/guide.md
@@ -403,7 +403,6 @@ yq e '.s3.minioServiceRegion = env(CLUSTER_REGION)' -i charts/apps/kubeflow-pipe
### (Optional) Configure Culling for Notebooks
Enable culling for notebooks by following the [instructions]({{< ref "/docs/deployment/configure-notebook-culling.md#" >}}) in configure culling for notebooks guide.
## 3.0 Build Manifests and install Kubeflow
Once you have the resources ready, you can deploy the Kubeflow manifests for one of the following deployment options:
@@ -458,9 +457,14 @@ Once everything is installed successfully, you can access the Kubeflow Central D
You can now start experimenting and running your end-to-end ML workflows with Kubeflow!
## 4.0 Verify the installation
## 4.0 Creating Profiles
A default profile named `kubeflow-user-example-com` for the email `user@example.com` is configured with this deployment. If you are using IRSA as your `PIPELINE_S3_CREDENTIAL_OPTION`, any additional profiles that you create must also be configured with IRSA and S3 bucket access. Follow the [pipeline profiles]({{< ref "/docs/deployment/create-profiles-with-iam-role.md" >}}) guide for instructions on how to create additional profiles.
If you are not using IRSA, you can create a profile by specifying only the user's email address.
## 5.0 Verify the installation
### 4.1 Verify RDS
### 5.1 Verify RDS
1. Connect to your RDS instance from a pod within the cluster with the following command:
```bash
@@ -536,7 +540,7 @@ mysql> use kubeflow; show tables;
mysql> select * from observation_logs;
```
### 4.2 Verify S3
### 5.2 Verify S3
1. Access the Kubeflow Central Dashboard [by logging in to your cluster]({{< ref "/docs/deployment/vanilla/guide.md#connect-to-your-kubeflow-cluster" >}}) and navigate to Kubeflow Pipelines (under Pipelines).
@@ -546,7 +550,7 @@ mysql> select * from observation_logs;
4. Verify that the bucket is not empty and was populated by the outputs of the experiment.
## 5.0 Uninstall Kubeflow
## 6.0 Uninstall Kubeflow
Run the following command to uninstall your Kubeflow deployment: