Fix downtime during zero-downtime deployment #794
base: develop-backend
Changes from all commits
fe9ad0f
4f5c9d3
936134c
515108a
dd82b18
037bffa
5d7f017
c7a13f4
84aebc6
514ddc7
@@ -14,6 +14,10 @@ jobs:
       - name: Checkout code
         uses: actions/checkout@v2

+      - name: Prepare Deploy
+        run: |
+          cd ~/deploy && ./prepare-deploy.sh
+
       - name: Run Prod1 instance deploy script
         run: |
           cd ~/deploy && ./deploy.sh
@@ -25,16 +29,24 @@ jobs:

     steps:
       - name: Wait for Prod1 instance to be ready
-        run: sleep 30
+        run: sleep 25

       - name: Health check for Prod1 instance
         run: |
-          RESPONSE=$(curl --write-out '%{http_code}' --silent --output /dev/null http://localhost:8080/health)
-          if [ $RESPONSE -ne 200 ]; then
-            echo "Prod1 instance deployment failed."
-            exit 1
-          fi
-          echo "Prod1 instance is healthy."
+          ATTEMPTS=0
+          MAX_ATTEMPTS=3
+          until [ $ATTEMPTS -ge $MAX_ATTEMPTS ]; do
+            RESPONSE=$(curl --write-out '%{http_code}' --silent --output /dev/null http://localhost:8080/health)
+            if [ $RESPONSE -eq 200 ]; then
+              echo "Prod1 instance is healthy."
+              exit 0
+            fi
+            echo "Health check failed, attempt $((ATTEMPTS+1))/$MAX_ATTEMPTS. Retrying in 5 seconds..."
+            ATTEMPTS=$((ATTEMPTS+1))
+            sleep 5
+          done
+          echo "Prod1 instance deployment failed after $MAX_ATTEMPTS attempts."
+          exit 1
Comment on lines +36 to +49

I see the health check runs against localhost.

I figured startup could take longer than 25 seconds once the project grows. Adjusting the wait time every time the project size changes would be tedious, so after the initial 25 seconds the health check runs every 5 seconds. Once a check succeeds, no further checks are made.

   deploy-prod2:
     name: Deploy to Prod2 Instance
@@ -45,6 +57,10 @@ jobs:
       - name: Checkout code
         uses: actions/checkout@v2

+      - name: Prepare Deploy
+        run: |
+          cd ~/deploy && ./prepare-deploy.sh
+
       - name: Run Prod2 instance deploy script
         run: |
           cd ~/deploy && ./deploy.sh
@@ -56,13 +72,21 @@ jobs:

     steps:
       - name: Wait for Prod2 instance to be ready
-        run: sleep 30
+        run: sleep 25

       - name: Health check for Prod2 instance
         run: |
-          RESPONSE=$(curl --write-out '%{http_code}' --silent --output /dev/null http://localhost:8080/health)
-          if [ $RESPONSE -ne 200 ]; then
-            echo "Prod2 instance deployment failed."
-            exit 1
-          fi
-          echo "Prod2 instance is healthy."
+          ATTEMPTS=0
+          MAX_ATTEMPTS=3
+          until [ $ATTEMPTS -ge $MAX_ATTEMPTS ]; do
+            RESPONSE=$(curl --write-out '%{http_code}' --silent --output /dev/null http://localhost:8080/health)
+            if [ $RESPONSE -eq 200 ]; then
+              echo "Prod2 instance is healthy."
+              exit 0
+            fi
+            echo "Health check failed, attempt $((ATTEMPTS+1))/$MAX_ATTEMPTS. Retrying in 5 seconds..."
+            ATTEMPTS=$((ATTEMPTS+1))
+            sleep 5
+          done
+          echo "Prod2 instance deployment failed after $MAX_ATTEMPTS attempts."
+          exit 1
Comment on lines +79 to +92

prod1 and prod2 are different instances, but since both are deployed from the same code, I think that once the health check has passed on prod1 we don't need to run it separately on prod2.

Because the instances are different, I think it's worth keeping the health check (though it doesn't have to be mandatory!). The advantage is that we can confirm the result in one place in Actions, rather than checking the LB or the logs after deployment.
@@ -1,14 +1,39 @@
package mouda.backend.common;

import java.util.concurrent.atomic.AtomicBoolean;

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

import jakarta.servlet.http.HttpServletRequest;

@RestController
public class HealthCheckController {

    private static final String HOST_IPV4 = "127.0.0.1";
    private static final String HOST_IPV6 = "0:0:0:0:0:0:0:1";
    private static final String HOST_NAME = "localhost";

    private final AtomicBoolean isTerminating = new AtomicBoolean(false);

    @GetMapping("/health")
    public ResponseEntity<Void> checkHealth() {
        if (isTerminating.get()) {
            return ResponseEntity.status(HttpStatus.BAD_GATEWAY).build();
        }
        return ResponseEntity.ok().build();
    }

    @PostMapping("/termination")
    public ResponseEntity<Void> terminate(HttpServletRequest request) {
        String remoteHost = request.getRemoteHost();
Is this value guaranteed to be localhost only when the request originates from inside the instance?

I hadn't considered that case! I'll look into it.

Did you look into it? Haha

I did, but unless we change the nginx configuration ourselves, can localhost even show up here? I'm honestly not sure, haha.
        if (HOST_IPV6.equals(remoteHost) || HOST_IPV4.equals(remoteHost) || HOST_NAME.equals(remoteHost)) {
            isTerminating.set(true);
            return ResponseEntity.ok().build();
        }
Comment on lines +33 to +36

Blocking it in the application is good, but to block external requests reliably I think we should also add a rule for this API to the nginx configuration:

location /termination {
    return 403;  # or 404
}

Great suggestion!! I'll apply it.
        return ResponseEntity.status(HttpStatus.FORBIDDEN).build();
    }
}
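On the open question of whether `getRemoteHost()` is localhost only for requests made from inside the instance: by default the servlet container reports the immediate TCP peer. If nginx runs on the same instance and proxies to `localhost:8080` (an assumption about this deployment, not something the PR confirms), proxied external requests would also arrive from a loopback address unless forwarded headers are processed, which is one reason the nginx-level block suggested in the review is a useful second layer. Separately, comparing against three fixed strings misses equivalent spellings such as `::1`. A minimal sketch of a more tolerant check using only `java.net.InetAddress` follows; the class and method names are hypothetical, and the input is assumed to be the numeric address from `request.getRemoteAddr()`.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper for illustration; not part of the PR.
final class LoopbackCheck {

    private LoopbackCheck() {
    }

    /**
     * Returns true if remoteAddr parses to a loopback address
     * (127.0.0.0/8 or ::1, in any textual form).
     */
    static boolean isLoopback(String remoteAddr) {
        // InetAddress.getByName(null) and getByName("") both return the
        // loopback address, so missing input must be rejected explicitly.
        if (remoteAddr == null || remoteAddr.isBlank()) {
            return false;
        }
        try {
            // For IP literals this only parses the text. A host name would
            // trigger a DNS lookup, so pass the numeric address.
            return InetAddress.getByName(remoteAddr).isLoopbackAddress();
        } catch (UnknownHostException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String[] samples = {"127.0.0.1", "0:0:0:0:0:0:0:1", "::1", "10.0.0.5"};
        for (String addr : samples) {
            System.out.println(addr + " -> " + isLoopback(addr));
        }
    }
}
```

This check only tells you that the direct peer is local. It cannot distinguish a script running on the instance from a request that a local reverse proxy forwarded on behalf of an external client.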
@@ -64,3 +64,4 @@ server:
   tomcat:
     mbeanregistry:
       enabled: true
+  shutdown: graceful
What does this setting do? Does it shut the server down gracefully?

@Teni's TecoTalk

So you've forgotten my TecoTalk entirely. With this setting, if the server is asked to shut down while requests are still being processed, it finishes those requests before going down.

Right now it looks like we wait 20 seconds after sending POST /termination. Instead of waiting a fixed 20 seconds, I think we could set the deregistration delay and the graceful shutdown timeout to the same value.

The graceful shutdown timeout is how long the process waits when deploy.sh terminates it, whereas the 20 seconds is the wait after the /termination request (when prepare-deploy.sh runs). How are these two related, such that you suggest setting them to the same value? I think the graceful shutdown timeout should be based on the APIs that take longest to respond, such as those that call external APIs. What do you think?

Looking at the current deployment order,
With the current structure, by the time prepare-deploy finishes and deploy.sh runs, the ELB should no longer be sending traffic to that instance, so I don't think configuring graceful shutdown adds much. So the way I'd properly verify graceful shutdown is
With this approach there is no need to call sleep 20 in prepare-deploy; Spring Boot should shut down cleanly in the time between stopping it and finishing the build of the new application version (in my experience the Gradle build for our project takes about 20 to 30 seconds). For now this is only my own guess. I'm trying it out on my personal server, so let's check it together once that's done!

That's right. To be honest, I used graceful shutdown because I had just been studying it!! Haha
Does this mean doing it manually (logging in to AWS and doing it by hand)?
Please let me know if I've misunderstood anything.

It's nothing elaborate: deregistering can be done with a command once the CLI is installed on the instance, so that step can go into the script. Let's go over the details next time.
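To make the distinction in this thread concrete: graceful shutdown means the server stops accepting new work and lets requests that are already in flight finish, up to a timeout. The plain-Java sketch below models that behaviour with an `ExecutorService` purely as an analogy. It is not how Spring Boot implements `server.shutdown: graceful`, and the class name, method name, and durations are made up for illustration.

```java
import java.time.Duration;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration; not part of the PR.
final class GracefulShutdownSketch {

    private GracefulShutdownSketch() {
    }

    /**
     * Stops accepting new tasks, then waits up to the timeout for running ones.
     * Returns true if everything finished in time.
     */
    static boolean shutdownGracefully(ExecutorService workers, Duration timeout)
            throws InterruptedException {
        workers.shutdown(); // new submissions are rejected from here on
        boolean drained = workers.awaitTermination(timeout.toMillis(), TimeUnit.MILLISECONDS);
        if (!drained) {
            workers.shutdownNow(); // timeout elapsed: interrupt whatever is left
        }
        return drained;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService workers = Executors.newFixedThreadPool(2);

        // An "in-flight request" that needs 200 ms to finish.
        Future<String> inFlight = workers.submit(() -> {
            Thread.sleep(200);
            return "done";
        });

        boolean drained = shutdownGracefully(workers, Duration.ofSeconds(2));
        System.out.println("drained=" + drained + ", result=" + inFlight.get());

        try {
            workers.submit(() -> "late request");
        } catch (RejectedExecutionException e) {
            System.out.println("new work rejected after shutdown");
        }
    }
}
```

The timeout here plays the role of the graceful shutdown timeout discussed above (in Spring Boot, `spring.lifecycle.timeout-per-shutdown-phase`), which is why sizing it to the slowest request makes sense. Whether any requests are still arriving when shutdown starts depends on when the load balancer stops routing to the instance, which is the separate concern the deregistration delay addresses.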
Right now we use a separately built health check API, but we could also make use of the recently added Spring Actuator.