TACODEV-909: workaround for using machinepool on CAPA #68
Since this is a major change, I suggest bumping the version in Chart.yaml to around 0.3.1.
Since the two-step installation is no longer needed, I suggest removing all of the earlier changes made for it.
- add job to get subnet and add a machinepool with the info
b3ccdd1 to b7ec9bb
This PR is stale because it has been open 3 days with no activity. Remove stale label or comment or this will be closed in 3 days.
This PR was closed because it has been stalled for 10 days with no activity.
eaf92c5 to fdf1dbe
In addition to the code review: the name cluster-api-aws/templates/mt-control.yaml reads as "Machine Template Control", which does not seem like an appropriate file name.
@@ -0,0 +1,36 @@
{{- if .Values.machinePool }}
helm install blocks until this Job finishes. Shouldn't we make it asynchronous?
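We also need to add a job that labels the nodes once they are created, so this will likely take even longer.
Making it asynchronous would require splitting it out into a separate helm chart,
but since these tasks run as a single sequence, adding them to this chart seems better.
So I don't think async is an option.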
{{- $envAll := . }}
{{- range .Values.machinePool }}
{{ .name }}:
MP:
If the MachinePool, AWSMachinePool, and KubeadmConfig values are nested under MP, AMP, and KCP like this, how does K8s recognize them?
@intelliguy I've left comments.
helm install fails with a timeout error and the Job does not run.
$ helm install jaesang-909-2 cluster-api-aws -f cluster-api-aws/values-tacodev-909.yaml
Error: failed post-install: timed out waiting for the condition
After setting the timeout to 10 minutes and rerunning, the Job fails with a BackoffLimitExceeded error.
$ time helm install jaesang-909-3 cluster-api-aws -f cluster-api-aws/values-tacodev-909.yaml --timeout 10m --debug
install.go:173: [debug] Original chart version: ""
install.go:190: [debug] CHART PATH: /home/ubuntu/helm-charts/cluster-api-aws
client.go:122: [debug] creating 8 resource(s)
client.go:122: [debug] creating 1 resource(s)
client.go:493: [debug] Watching for changes to Job jaesang-909-3-cluster-api-aws with timeout of 10m0s
client.go:521: [debug] Add/Modify event for jaesang-909-3-cluster-api-aws: ADDED
client.go:560: [debug] jaesang-909-3-cluster-api-aws: Jobs active: 0, jobs failed: 0, jobs succeeded: 0
client.go:521: [debug] Add/Modify event for jaesang-909-3-cluster-api-aws: MODIFIED
client.go:560: [debug] jaesang-909-3-cluster-api-aws: Jobs active: 1, jobs failed: 0, jobs succeeded: 0
client.go:521: [debug] Add/Modify event for jaesang-909-3-cluster-api-aws: MODIFIED
Error: failed post-install: job failed: BackoffLimitExceeded
helm.go:81: [debug] failed post-install: job failed: BackoffLimitExceeded
real 5m16.054s
user 0m0.894s
sys 0m0.322s
job describe
$ kubectl describe jobs jaesang-909-3-cluster-api-aws
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal SuccessfulCreate 6m36s job-controller Created pod: jaesang-909-3-cluster-api-aws-w6fzm
Normal SuccessfulDelete 82s job-controller Deleted pod: jaesang-909-3-cluster-api-aws-w6fzm
Warning BackoffLimitExceeded 82s (x2 over 82s) job-controller Job has reached the specified backoff limit
8b9d03d to 6028e15
Testing needs more than 10 minutes to run and the default is 5 minutes, so a timeout setting must be added. Confirmed that deployment through argocd completes within the timeout.
while [ $(kubectl get machinepool -n $3 $1-$2-mp-0 --ignore-not-found | wc -l) == 0 ]
do
  echo "> Wait for machinepools deployed (30s)"
  sleep 30
The sleep interval is so long that the helm install takes a long time. How about sleep 1 or sleep 2?
while [ $(kubectl get machinepool -n $3 $1-$2-mp-0 -o=jsonpath='{.status.nodeRefs}' | wc -c) == 0 ]
do
  echo "> Wait for instance is ready (20s)"
  sleep 20
The sleep interval is so long that the helm install takes a long time. How about sleep 1 or sleep 2?
set -ex

while [ $(kubectl get secret -n $2 $1-kubeconfig --ignore-not-found | wc -l) == 0 ]; do
  echo "sleep 30 second"
The sleep interval is so long that the helm install takes a long time. How about sleep 1 or sleep 2?
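The review comments on the three wait loops ask for a shorter polling interval. A minimal sketch of one way to do that, assuming a POSIX shell: poll every couple of seconds, but give up at an overall deadline so a resource that never appears cannot hang the Job indefinitely. The helper names wait_until and machinepool_exists are hypothetical and not part of this PR; the kubectl call is copied from the first loop in the diff.

```shell
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND every INTERVAL_SECONDS until it succeeds or the
# deadline passes. Returns 0 on success, 1 on timeout.
# (Hypothetical helper, not part of the PR.)
wait_until() {
  _wu_timeout=$1
  _wu_interval=$2
  shift 2
  _wu_deadline=$(( $(date +%s) + _wu_timeout ))
  until "$@"; do
    if [ "$(date +%s)" -ge "$_wu_deadline" ]; then
      echo "> Timed out after ${_wu_timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$_wu_interval"
  done
}

# Hypothetical probe: succeeds once the MachinePool object exists.
# Arguments: namespace, MachinePool name.
machinepool_exists() {
  [ -n "$(kubectl get machinepool -n "$1" "$2" --ignore-not-found 2>/dev/null)" ]
}

# Example call (not executed here, it needs a cluster):
#   wait_until 600 2 machinepool_exists "$3" "$1-$2-mp-0"
```

The 600-second deadline and 2-second interval are illustrative values only. A short interval means more API requests, so the right number depends on how much load the management cluster can tolerate.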
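A k8s cluster can now be deployed to AWS in a single step.
Just supply, from the start, the values that were previously applied in the final update step.
The Dockerfile for the image and the code it uses have all been placed in the artifacts directory.
Testing needs more than 10 minutes to run and the default is 5 minutes, so a timeout setting must be added.
Deploy with --timeout=20 added.
Confirmed that deployment through argocd completes within the timeout.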
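For reference, a hedged sketch of what the install command might look like with a longer timeout. It assumes the 20 in the description means 20 minutes and uses the duration form seen in the earlier debug run in this thread (--timeout 10m). The release name my-release is a placeholder, and the helm_install_cmd helper is hypothetical; it only prints the command so the sketch can be checked without a cluster.

```shell
# Chart directory and values file as they appear earlier in this thread.
RELEASE=${RELEASE:-my-release}   # placeholder release name (assumption)
CHART=cluster-api-aws
VALUES=cluster-api-aws/values-tacodev-909.yaml
TIMEOUT=${TIMEOUT:-20m}          # assumes "20" means 20 minutes

# Print the helm command instead of running it (hypothetical helper).
helm_install_cmd() {
  echo helm install "$RELEASE" "$CHART" -f "$VALUES" --timeout "$TIMEOUT"
}

helm_install_cmd
```

Removing the echo inside helm_install_cmd would run the install for real.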