TACODEV-909: workaround for using machinepool on CAPA #68

intelliguy · 2021-08-11T02:56:47Z

add job to get subnet and add a machinepool with the info

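A k8s cluster can now be deployed to AWS in a single step.
Just supply, from the start, the values that previously had to be applied in a final update.
The Dockerfile for the image and the code it uses are all in the artifacts directory.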

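In testing, the install has to run for more than 10 minutes and the default timeout is 5 minutes, so a timeout setting must be added.
Deploy with --timeout=20 added.

Confirmed that deployment through argocd completes fine within the timeout.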

cluster-api-aws/templates/job-generate-machine-pool.yaml

cluster-api-aws/artifacts/generate_machine_pool.py

cluster-api-aws/artifacts/Dockerfile

cluster-api-aws/values.yaml

ktkfree · 2021-08-11T08:34:25Z

Since this is a major change, it would be good to bump the version in Chart.yaml to something like 0.3.1.

ktkfree · 2021-08-11T08:37:34Z

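Since the 2-step installation is no longer needed, it would be better to remove all of the earlier changes made for it.
Please let me know whether there is any value in keeping them; I will take care of removing them after this PR is merged.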

cluster-api-aws/artifacts/generate_machine_pool.py

- add job to get subnet and add a machinepool with the info

github-actions · 2021-08-15T13:32:57Z

This PR is stale because it has been open 3 days with no activity. Remove stale label or comment or this will be closed in 3 days.

github-actions · 2021-08-18T13:34:25Z

This PR was closed because it has been stalled for 10 days with no activity.

cluster-api-aws/templates/kubeadm-config.yaml

Jaesang

Aside from the code review: the name cluster-api-aws/templates/mt-control.yaml appears to stand for Machine Template Control, and the file name does not seem appropriate.

Jaesang · 2021-08-25T07:27:37Z

cluster-api-aws/templates/job-generate-machine-pool.yaml

@@ -0,0 +1,36 @@
+{{- if .Values.machinePool }}


During helm install, the command stays blocked until this Job finishes. Would it be worth making this async?

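A job that labels the nodes once they are created also has to be added, so this will likely take even longer.
Making it async would mean splitting it out into a separate helm chart,
but since these tasks run as one sequence, it seems better to add them to this chart.
So I don't think async is feasible.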

Jaesang · 2021-08-25T07:46:41Z

cluster-api-aws/templates/mp/_mp.yaml

+{{- $envAll := . }}
+{{- range .Values.machinePool }}
+{{ .name }}:
+  MP:


If the MachinePool, AWSMachinePool, and KubeadmConfig values are nested under MP, AMP, and KCP like this, how does K8s recognize them?

Jaesang · 2021-08-25T09:21:03Z

@intelliguy I have left comments.

Jaesang

helm install fails with a timeout error and the Job does not run.

$ helm install jaesang-909-2 cluster-api-aws -f cluster-api-aws/values-tacodev-909.yaml
Error: failed post-install: timed out waiting for the condition

After setting the timeout to 10 minutes and running it again, the Job fails with a BackoffLimitExceeded error.

$ time helm install jaesang-909-3 cluster-api-aws -f cluster-api-aws/values-tacodev-909.yaml --timeout 10m --debug
install.go:173: [debug] Original chart version: ""
install.go:190: [debug] CHART PATH: /home/ubuntu/helm-charts/cluster-api-aws

client.go:122: [debug] creating 8 resource(s)
client.go:122: [debug] creating 1 resource(s)
client.go:493: [debug] Watching for changes to Job jaesang-909-3-cluster-api-aws with timeout of 10m0s
client.go:521: [debug] Add/Modify event for jaesang-909-3-cluster-api-aws: ADDED
client.go:560: [debug] jaesang-909-3-cluster-api-aws: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:521: [debug] Add/Modify event for jaesang-909-3-cluster-api-aws: MODIFIED
client.go:560: [debug] jaesang-909-3-cluster-api-aws: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:521: [debug] Add/Modify event for jaesang-909-3-cluster-api-aws: MODIFIED
Error: failed post-install: job failed: BackoffLimitExceeded
helm.go:81: [debug] failed post-install: job failed: BackoffLimitExceeded

real    5m16.054s
user    0m0.894s
sys     0m0.322s

job describe

$ kubectl describe jobs jaesang-909-3-cluster-api-aws
Events:
  Type     Reason                Age                From            Message
  ----     ------                ----               ----            -------
  Normal   SuccessfulCreate      6m36s              job-controller  Created pod: jaesang-909-3-cluster-api-aws-w6fzm
  Normal   SuccessfulDelete      82s                job-controller  Deleted pod: jaesang-909-3-cluster-api-aws-w6fzm
  Warning  BackoffLimitExceeded  82s (x2 over 82s)  job-controller  Job has reached the specified backoff limit

github-actions · 2021-08-30T13:34:21Z

This PR is stale because it has been open 3 days with no activity. Remove stale label or comment or this will be closed in 3 days.

github-actions · 2021-09-03T13:34:22Z

This PR is stale because it has been open 3 days with no activity. Remove stale label or comment or this will be closed in 3 days.

github-actions · 2021-09-09T01:48:25Z

This PR was closed because it has been stalled for 10 days with no activity.

intelliguy · 2021-09-10T12:24:01Z

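In testing, the install has to run for more than 10 minutes and the default timeout is 5 minutes, so a timeout setting must be added.
Deploy with --timeout=20 added.

Confirmed that deployment through argocd completes fine within the timeout.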
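
A minimal sketch of the timeout advice above, assuming Helm 3, where --timeout is parsed as a duration (as in the `--timeout 10m` run earlier in this thread). Reading `--timeout=20` as 20 minutes is an assumption, and `deploy_cluster` is a hypothetical wrapper name, not something from this PR.

```shell
# deploy_cluster RELEASE VALUES_FILE [TIMEOUT]
# Hypothetical wrapper: installs the cluster-api-aws chart with an explicit
# timeout so the post-install Job is not cut off at Helm's 5-minute default.
# The 20m default is an assumption based on the "--timeout=20" note above.
deploy_cluster() {
  helm install "$1" cluster-api-aws -f "$2" --timeout "${3:-20m}"
}
```

For example, `deploy_cluster my-release my-values.yaml` would run the install with a 20-minute limit; both argument values here are placeholders.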

Jaesang · 2021-09-13T01:56:16Z

cluster-api-aws/artifacts/wait_and_k8s_init.sh

+while [ $(kubectl get machinepool -n $3 $1-$2-mp-0 --ignore-not-found | wc -l) == 0 ]
+do
+  echo "> Wait for machinepools deployed (30s)"
+  sleep 30


The sleep interval is too long, which makes the helm installation take much longer. How about sleep 1 or sleep 2?

Jaesang · 2021-09-13T01:56:34Z

cluster-api-aws/artifacts/wait_and_k8s_init.sh

+while [ $(kubectl get machinepool -n $3 $1-$2-mp-0 -o=jsonpath='{.status.nodeRefs}' | wc -c) == 0 ]
+do
+  echo "> Wait for instance is ready (20s)"
+  sleep 20


The sleep interval is too long, which makes the helm installation take much longer. How about sleep 1 or sleep 2?

Jaesang · 2021-09-13T01:57:25Z

cluster-api-aws/artifacts/wait_for_kubeconfig.sh

+set -ex
+
+while  [ $(kubectl get secret -n $2 $1-kubeconfig --ignore-not-found | wc -l) == 0 ]; do
+  echo "sleep 30 second"


The sleep interval is too long, which makes the helm installation take much longer. How about sleep 1 or sleep 2?
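
The three polling loops quoted above share one pattern, so the shorter interval suggested in these comments could be applied in one place. The following is only a sketch under stated assumptions: `wait_until` and `machinepool_exists` are hypothetical helper names that do not appear in this PR, and the deadline is an addition the original scripts do not have. The elapsed time is approximate because it counts only the sleeps, not the time the check itself takes.

```shell
# wait_until INTERVAL TIMEOUT CMD [ARGS...]
# Hypothetical helper: re-runs CMD every INTERVAL seconds until it succeeds,
# and gives up with status 1 once roughly TIMEOUT seconds have been slept.
wait_until() {
  interval="$1"; timeout="$2"; shift 2
  elapsed=0
  until "$@"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "> Timed out after ${timeout}s" >&2
      return 1
    fi
    echo "> Not ready yet, retrying in ${interval}s"
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}

# Hypothetical check mirroring the first loop in wait_and_k8s_init.sh:
# succeeds once kubectl prints anything for the machinepool.
# Takes the same three positional arguments as the quoted script.
machinepool_exists() {
  [ -n "$(kubectl get machinepool -n "$3" "$1-$2-mp-0" --ignore-not-found)" ]
}
```

Inside the script, the first loop would then become something like `wait_until 2 600 machinepool_exists "$1" "$2" "$3"`, where the 2-second interval follows the suggestion above and the 600-second limit is an illustrative value.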

intelliguy requested review from Jaesang, ktkfree, robertchoi80, seungkyua and zugwan August 11, 2021 03:11

zugwan requested changes Aug 11, 2021

ktkfree reviewed Aug 11, 2021

cluster-api-aws/artifacts/generate_machine_pool.py (outdated)

TACODEV-909: workaround for using machinepool on CAPA

b7ec9bb

- add job to get subnet and add a machinepool with the info

intelliguy force-pushed the TACODEV-909 branch from b3ccdd1 to b7ec9bb Compare August 12, 2021 07:06

github-actions bot added the Stale label Aug 15, 2021

github-actions bot closed this Aug 18, 2021

intelliguy removed the Stale label Aug 23, 2021

TACODEV909: support multiple machinepool

5ff7058

intelliguy reopened this Aug 23, 2021

bluejayA assigned intelliguy Aug 25, 2021

Jaesang reviewed Aug 25, 2021

cluster-api-aws/templates/kubeadm-config.yaml (outdated)

TACODEV-909: support multiple deploy

fdf1dbe

intelliguy force-pushed the TACODEV-909 branch from eaf92c5 to fdf1dbe Compare August 25, 2021 04:52

Jaesang suggested changes Aug 25, 2021

TACODEV-909: add argo registration and node labling

bc1900e

Jaesang self-requested a review August 27, 2021 09:09

Jaesang suggested changes Aug 27, 2021

github-actions bot added the Stale label Aug 30, 2021

intelliguy removed the Stale label Aug 31, 2021

github-actions bot added the Stale label Sep 3, 2021

Jaesang mentioned this pull request Sep 6, 2021

[Dev] Implement a way to register a new Target cluster in argocd openinfradev/decapod-issues#19

Closed

github-actions bot closed this Sep 9, 2021

intelliguy reopened this Sep 9, 2021

intelliguy requested a review from Jaesang September 10, 2021 12:17

TACODEV-909: prevent error during wating for cluster

6028e15

intelliguy force-pushed the TACODEV-909 branch from 8b9d03d to 6028e15 Compare September 10, 2021 12:19

intelliguy mentioned this pull request Sep 10, 2021

TACODEV-909: create user cluster and other openinfradev/tks-flow#4

Merged

github-actions bot removed the Stale label Sep 10, 2021

Jaesang suggested changes Sep 13, 2021

seungkyua approved these changes Sep 14, 2021

seungkyua merged commit 75c698b into main Sep 15, 2021

zugwan deleted the TACODEV-909 branch September 16, 2021 08:47
