Katib S3-only Helm Path cherrypick #507

Merged
jsitu777 merged 1 commit into release-v1.6.1-aws-b1.0.0 from katib-s3-cherrypick on Nov 15, 2022

Conversation

jsitu777
Contributor

Cherry-pick the Helm installation path config fix for the s3-only deployment.

**Which issue is resolved by this Pull Request:**
N.A

**Description of your changes:**
The Helm installation path for Katib in the s3-only deployment is incorrect, which results in a `Chart.yaml file is missing` error:
```
==========Installing katib==========
Error: Chart.yaml file is missing
Traceback (most recent call last):
  File "utils/kubeflow_installation.py", line 249, in <module>
    install_kubeflow(
```
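
The failure above is what Helm reports when the path it is given is not a chart directory. A minimal sketch of a pre-flight check follows, assuming a hypothetical helper `missing_chart_yaml` (not part of `utils/kubeflow_installation.py`) and caller-supplied chart paths. It reports which configured paths lack a `Chart.yaml` before any install is attempted:

```python
from pathlib import Path
from typing import Iterable, List


def missing_chart_yaml(chart_dirs: Iterable[str]) -> List[str]:
    """Return the chart directories that do not contain a Chart.yaml file.

    Hypothetical helper for illustration only; the directories passed in are
    whatever the installation config points at, not paths taken from this PR.
    """
    missing = []
    for chart_dir in chart_dirs:
        # Helm expects Chart.yaml at the root of the chart directory.
        if not (Path(chart_dir) / "Chart.yaml").is_file():
            missing.append(chart_dir)
    return missing
```

Running a check like this over every configured chart path would surface a wrong path for all components at once, rather than failing at the first broken install.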

**Testing:**
- [ N.A] Unit tests pass
- [ N.A] e2e tests pass

`make deploy-kubeflow INSTALLATION_OPTION=helm DEPLOYMENT_OPTION=s3-only` now runs without errors.

```
Waiting for katib pods to be ready ...
running command: kubectl wait --for=condition=ready pod -l 'katib.kubeflow.org/component in (controller, db-manager, ui)' --timeout=240s -n kubeflow
pod/katib-controller-75b988dccc-4r5j8 condition met
pod/katib-db-manager-5d46869758-4lvvp condition met
pod/katib-ui-766d5dc8ff-47md2 condition met
All katib pods are running!
```
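
For reference, the `kubectl wait` command in the log above can be assembled programmatically. This is a sketch under assumptions: `build_kubectl_wait` is a hypothetical helper, not the function used by the installation script; only the flags and values visible in the log are taken from the source.

```python
from typing import List, Sequence


def build_kubectl_wait(label_key: str, values: Sequence[str],
                       namespace: str, timeout_seconds: int) -> List[str]:
    """Build the argument list for `kubectl wait` on pods matching a set-based selector.

    Hypothetical helper; returns a list so it can be passed to subprocess.run
    without shell quoting.
    """
    # Set-based label selector, e.g. "key in (a, b, c)".
    selector = f"{label_key} in ({', '.join(values)})"
    return [
        "kubectl", "wait", "--for=condition=ready", "pod",
        "-l", selector,
        f"--timeout={timeout_seconds}s",
        "-n", namespace,
    ]


command = build_kubectl_wait(
    "katib.kubeflow.org/component",
    ["controller", "db-manager", "ui"],
    namespace="kubeflow",
    timeout_seconds=240,
)
```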

By submitting this pull request, I confirm that my contribution is made
under the terms of the Apache 2.0 license.
@jsitu777 jsitu777 merged commit 5cb2557 into release-v1.6.1-aws-b1.0.0 Nov 15, 2022
@jsitu777 jsitu777 deleted the katib-s3-cherrypick branch November 16, 2022 18:27
styoung89 added a commit to Hyperfine/kubeflow-manifests that referenced this pull request Jun 15, 2023
* Update website links to v1.6.1 (awslabs#487)

Update website links to v1.6.1

* Update git clone command (awslabs#488)

Update git clone command

* Katib S3-only Helm Path cherrypick (awslabs#507)

Cherry pick the helm path installation config fix for s3-only
deployment.

Co-authored-by: Pei Ran Li <prli@users.noreply.github.com>

* Cherry-pick Kserve with IRSA and Notebook culling Doc (awslabs#512)

* Cherry-pick Kserve with IRSA and Notebook culling Doc into v1.6.1
release

* Cherrypick: Add permission configuration steps to SageMaker KFP docs (awslabs#506) (awslabs#528)

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**

- Adds the required permission steps for using the SageMaker v1/v2 integration
with KFP

**Testing:**
- [ ] Unit tests pass
- [ ] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed


* New blog/workshop update - 1.6 (awslabs#533)


* docsearch backport 1.6 (awslabs#534)

docsearch first integration

* 1.6

* helm provider

* blueprints

* fix oidc

* Cherry-Pick terraform bug fixes (awslabs#590)

**Which issue is resolved by this Pull Request:**

**Description of your changes:**
- Update RDS version (awslabs#584)
- Update AWS blueprints - gpu bug (awslabs#516)

**Testing:**
Cognito-rds-s3 passed `6 passed, 13 warnings in 4594.63s (1:16:34)`
rds-s3 passed `7 passed, 49 warnings in 4490.15s (1:14:50)`

- [ ] Unit tests pass
- [x] e2e tests pass


---------

Co-authored-by: Shadab Hussain <shadab.entrepreneur@outlook.com>
Co-authored-by: Gerhard Häring <gh@ghaering.de>

* Updating website to reflect latest version (awslabs#596)

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**


**Testing:**
- [ ] Unit tests pass
- [ ] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed


---------

Co-authored-by: Ubuntu <ubuntu@ip-172-31-58-39.us-west-2.compute.internal>

* Cherry pick sagemaker fix and move of kfp SM test (awslabs#619)

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**
Cherry pick sagemaker fix and move of kfp SM test

**Testing:**
- [ ] Unit tests pass
- [ ] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed


* fix secrets

* [Cherry pick] Change Dex Service from Nodeport to ClusterIP (awslabs#630)

**Description of your changes:**
Installing Dex occasionally fails with the following error message:
```
The Service "dex" is invalid: spec.ports[0].nodePort: Invalid value: 32000: provided port is already allocated
```

The error indicates that port 32000 has already been allocated to another
service when the "dex" service is created.
In the KF v1.7.0 release, the Dex service was changed to be exposed as
ClusterIP instead of NodePort, along with the Istio security upgrade
(kubeflow/manifests#2357).

This patch mirrors that PR, changing the Dex service from NodePort to
ClusterIP to resolve the error message above.
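
The effect of the change on a Service manifest can be sketched as follows. This is an illustration, not the patch itself: `to_cluster_ip` is a hypothetical helper, and in the sample manifest only the service name `dex` and `nodePort: 32000` come from the error message; the other field values are assumptions.

```python
import copy
from typing import Any, Dict


def to_cluster_ip(service: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of a Service manifest exposed as ClusterIP with no fixed nodePort."""
    patched = copy.deepcopy(service)
    spec = patched.setdefault("spec", {})
    spec["type"] = "ClusterIP"
    for port in spec.get("ports", []):
        # nodePort only applies to NodePort/LoadBalancer services, so drop it.
        port.pop("nodePort", None)
    return patched


# Illustrative manifest; port numbers other than nodePort 32000 are assumed.
dex_service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "dex"},
    "spec": {
        "type": "NodePort",
        "ports": [{"name": "dex", "port": 5556, "targetPort": 5556, "nodePort": 32000}],
    },
}
patched = to_cluster_ip(dex_service)
```

With no `nodePort` requested, the service no longer competes for port 32000, which is the conflict reported in the error.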



**Testing:**
- [ ] Unit tests pass
- [x] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed


* [Cherry-Pick] increase cert-manager wait time for kubeflow-issuer to be installed (awslabs#632)

Cherry pick this commit from main to temporarily solve:
```
: Internal error occurred: failed calling webhook "webhook.cert-manager.io": failed to call webhook: Post "https://cert-manager-webhook.cert-manager.svc:443/mutate?timeout=10s": context deadline exceeded
```
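
The cherry-picked commit lengthens a wait so that the cert-manager webhook is reachable before `kubeflow-issuer` is created; the change itself is not shown here. As a related illustration, a minimal sketch of polling for readiness with a deadline instead of sleeping for a fixed time, assuming a hypothetical `wait_until` helper and a caller-supplied readiness check:

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool], timeout_seconds: float,
               interval_seconds: float = 1.0,
               sleep: Callable[[float], None] = time.sleep) -> bool:
    """Poll `check` until it returns True or `timeout_seconds` elapses.

    Hypothetical helper; `check` would wrap whatever readiness probe the
    installer uses (for example, a kubectl call), which is not shown here.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_seconds)
```

Polling returns as soon as the dependency is ready and fails explicitly on timeout, whereas a fixed sleep either wastes time or is still too short on a slow cluster.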

* variables

* [Cherry Pick] Enable canary report generation (awslabs#634)

- Enable canary report generation after each canary run.

* Update blog content (awslabs#638)

**Description of your changes:**
- Remove Trainium (pending future release)
- Update link to AWS Docs (was localhost)
- Update ordered list numbering

**Testing:**
- Tested local Hugo build


---------

Co-authored-by: Suraj Kota <surakota@amazon.com>

* add pipeline pd

* [Cherry-pick] Push Canary Success Rate to Cloudwatch Metric (awslabs#662)

- cherry-pick pushing success_rate into cloudwatch metrics
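
A sketch of how a success rate might be shaped into a CloudWatch metric datum. The metric name `success_rate` is from the commit message; the helper name, the percentage scale, and the choice of the `Percent` unit are assumptions, and the canary's real computation is not shown in this PR.

```python
from typing import Any, Dict


def success_rate_datum(passed: int, total: int) -> Dict[str, Any]:
    """Build one CloudWatch metric datum for a canary success rate.

    Hypothetical helper; reports the rate as a percentage (an assumption).
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return {
        "MetricName": "success_rate",
        "Value": 100.0 * passed / total,
        "Unit": "Percent",
    }
```

A dict of this shape can be placed in the `MetricData` list passed to boto3's CloudWatch `put_metric_data` call, together with a namespace of the caller's choosing.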

* Release v1.6.1 aws b1.0.2 (awslabs#666)

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**

Note: As a result of the deprecation of k8s.gcr.io, images have moved
to registry.k8s.io. This only affects csi-secrets-driver for those who
have installed either RDS-S3 or Cognito-RDS-S3. To update the source of
your pulled images, run the following kubectl commands:

```
kubectl set image daemonset/csi-secrets-store node-driver-registrar=registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.3.0 -n kube-system
kubectl set image daemonset/csi-secrets-store secrets-store=registry.k8s.io/csi-secrets-store/driver:v1.0.0-rc.1 -n kube-system
kubectl set image daemonset/csi-secrets-store liveness-probe=registry.k8s.io/sig-storage/livenessprobe:v2.4.0 -n kube-system
```
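
The three commands above follow one pattern, so they can be generated from a container-to-image mapping. A minimal sketch, assuming a hypothetical `set_image_commands` helper; the DaemonSet name, namespace, container names, and image references are copied from the commands above.

```python
from typing import Dict, List


def set_image_commands(daemonset: str, namespace: str, images: Dict[str, str]) -> List[str]:
    """Return one `kubectl set image` command per container in `images`."""
    return [
        f"kubectl set image daemonset/{daemonset} {container}={image} -n {namespace}"
        for container, image in images.items()
    ]


# Container name -> image on registry.k8s.io, as listed in the release note.
CSI_IMAGES = {
    "node-driver-registrar": "registry.k8s.io/sig-storage/csi-node-driver-registrar:v2.3.0",
    "secrets-store": "registry.k8s.io/csi-secrets-store/driver:v1.0.0-rc.1",
    "liveness-probe": "registry.k8s.io/sig-storage/livenessprobe:v2.4.0",
}

commands = set_image_commands("csi-secrets-store", "kube-system", CSI_IMAGES)
```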

**Testing:**
- [ ] Unit tests pass
- [ ] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed


* update website to release-v1.6.1-aws-b1.0.2 (awslabs#667)

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**
Points toml and documentation to latest release version

**Testing:**
- [ ] Unit tests pass
- [ ] e2e tests pass
- Details about new tests (If this PR adds a new feature)
- Details about any manual tests performed


* Update config.toml for v1.6.1 (awslabs#705)

- change version from `latest` to `1.6`

* [Cherry pick] Kserve Indentation doc fix v1.6.1 (awslabs#721)

- Cherry-pick of awslabs#719
- Keep empty AWS credential strings

* fix evictions

* value needed for defaults

* Website changes for release branch (awslabs#758)

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**
- Same as title


* refactor
---------

Co-authored-by: Kartik Kalamadi <kalamadi@amazon.com>
Co-authored-by: jsitu777 <59303945+jsitu777@users.noreply.github.com>
Co-authored-by: Pei Ran Li <prli@users.noreply.github.com>
Co-authored-by: ryansteakley <37981995+ryansteakley@users.noreply.github.com>
Co-authored-by: Nadege PEPIN <5490706+npepin-hub@users.noreply.github.com>
Co-authored-by: ananth102 <abashyam@amazon.com>
Co-authored-by: Shadab Hussain <shadab.entrepreneur@outlook.com>
Co-authored-by: Gerhard Häring <gh@ghaering.de>
Co-authored-by: Ubuntu <ubuntu@ip-172-31-58-39.us-west-2.compute.internal>
Co-authored-by: Kevin Hoyt <parkerkrhoyt@gmail.com>
Co-authored-by: Suraj Kota <surakota@amazon.com>