
Existing A record error #1562

Closed
bertinatto opened this issue Apr 9, 2019 · 7 comments

Comments

@bertinatto
Member

Version

$ bin/openshift-install version
bin/openshift-install unreleased-master-729-g83ca64d34e7e16935cbaa39fb42a6c8ba9d58c33
built from commit 83ca64d34e7e16935cbaa39fb42a6c8ba9d58c33
release image registry.svc.ci.openshift.org/openshift/origin-release:v4.0

What happened?

I've seen this error occasionally:

ERROR                                              
ERROR Error: Error applying plan:                  
ERROR                                              
ERROR 1 error occurred:                            
ERROR 	* module.dns.aws_route53_record.api_external: 1 error occurred: 
ERROR 	* aws_route53_record.api_external: [ERR]: Error building changeset: InvalidChangeBatch: [Tried to create resource record set [name='api.myname.mydevcluster.com.', type='A'] but it already exists] 
ERROR 	status code: 400, request id: 2315fe75-5ac2-11e9-9982-7b4f4ee60637 
ERROR                                              
ERROR                                              
ERROR                                              
ERROR                                              
ERROR                                              
ERROR Terraform does not automatically rollback in the face of errors. 
ERROR Instead, your Terraform state file has been partially updated with 
ERROR any resources that successfully completed. Please address the error 
ERROR above and apply again to incrementally change your infrastructure. 
ERROR                                              
ERROR                                              
FATAL failed to fetch Cluster: failed to generate asset "Cluster": failed to create cluster: failed to apply using Terraform 

Then I head to the AWS Console and delete the records manually.

I'm not sure how often this happens out there, but it might be worthwhile to parse this AWS error and print a more helpful message to the user.

Ideally, the message would instruct the user to delete the records manually and then proceed with the installation.
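For illustration, here is a minimal Go sketch of the kind of error wrapping suggested here. diagnoseRoute53Error is a hypothetical helper, not part of the installer; it simply matches the text of the AWS error above and appends a hint:

package main

import (
	"fmt"
	"strings"
)

// diagnoseRoute53Error is a hypothetical helper (not the installer's actual code)
// that recognizes the AWS "already exists" failure and appends a suggestion.
func diagnoseRoute53Error(err error) error {
	msg := err.Error()
	if strings.Contains(msg, "InvalidChangeBatch") && strings.Contains(msg, "already exists") {
		return fmt.Errorf("%s\n\nhint: a Route53 record for this cluster name already exists, "+
			"probably left behind by an earlier cluster that was not fully destroyed; "+
			"run 'openshift-install destroy cluster' for the old cluster or delete the stale "+
			"record in the Route53 console, then retry the installation", msg)
	}
	return err
}

func main() {
	// The raw error text is taken from the log above.
	raw := fmt.Errorf("InvalidChangeBatch: [Tried to create resource record set " +
		"[name='api.myname.mydevcluster.com.', type='A'] but it already exists]")
	fmt.Println(diagnoseRoute53Error(raw))
}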

@wking
Member

wking commented Apr 9, 2019

Ideally, the message would instruct the user to delete the records manually and then proceed with the installation.

That's not always the appropriate response. See rhbz#1659970 (fixed by #1442) for why we error out in this case. Your issue is almost certainly a failed/forgotten uninstall of an earlier cluster; have you been running destroy cluster when you're done with old clusters? But I'm ok with elaborating on this and other Terraform errors (see my stub in #1452) if we wanted to pursue that direction.

@bertinatto
Member Author

Your issue is almost certainly a failed/forgotten uninstall of an earlier cluster

Yep, that's my case. No doubt it was my fault, but since we want to make the installer very accessible to our users, it might be worthwhile to point them in the right direction for solving common errors (if that's possible). #1452 looks very useful; perhaps we could have a pointer to it in the error message?

@chmouel
Member

chmouel commented May 4, 2019

I have this issue happening to me every time (openshift-dev us-east-2 account), even though I run a "destroy cluster"

@wking
Member

wking commented May 4, 2019

I have the issue happening to me every time...

Have you removed the leaked A record? You need to recover this manually after the buggy openshift-dev reaper partially removes the cluster. Running destroy cluster before the reaper gets to your cluster will keep this from happening, but it won't help after the reaper removes your private zone.
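For anyone cleaning this up by hand, here is a rough sketch with the aws-sdk-go v1 client of how one might check whether the leaked A record is still sitting in the public zone. The zone ID and record name below are placeholders, and this is not installer code:

package main

import (
	"fmt"
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/route53"
)

func main() {
	// Placeholder values: the public hosted zone ID for the base domain and the
	// cluster's API record name; substitute your own.
	zoneID := "Z3EXAMPLE"
	apiName := "api.myname.mydevcluster.com."

	sess := session.Must(session.NewSession())
	svc := route53.New(sess)

	// List record sets starting at the API name; Route53 returns them in order,
	// so the first entry tells us whether the leaked A record is still there.
	out, err := svc.ListResourceRecordSets(&route53.ListResourceRecordSetsInput{
		HostedZoneId:    aws.String(zoneID),
		StartRecordName: aws.String(apiName),
		StartRecordType: aws.String("A"),
		MaxItems:        aws.String("1"),
	})
	if err != nil {
		log.Fatal(err)
	}
	for _, rr := range out.ResourceRecordSets {
		if aws.StringValue(rr.Name) == apiName && aws.StringValue(rr.Type) == "A" {
			fmt.Printf("leaked record still present: %s (%s)\n",
				aws.StringValue(rr.Name), aws.StringValue(rr.Type))
		}
	}
}

Once located, the stale record can be removed in the Route53 console, as described earlier in the thread.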

@abhinavdahiya
Contributor

The openshift-dev account should be cleaning these up correctly. If the installer doesn't find the private zone, the public records cannot be deleted, for safety.

/close
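A rough, self-contained illustration of the safety rule described above (not the installer's actual destroy code; names and behavior are simplified):

package main

import "fmt"

// cleanupDNS illustrates the rule: public records are only deleted when the
// cluster's private hosted zone is still present, since that zone is what ties
// the public records to this particular cluster.
func cleanupDNS(privateZoneID string, publicRecords []string) {
	if privateZoneID == "" {
		fmt.Println("private zone not found; leaving public records alone for safety")
		return
	}
	for _, r := range publicRecords {
		fmt.Printf("deleting public record %s (private zone %s confirmed)\n", r, privateZoneID)
	}
}

func main() {
	// After a partial teardown the private zone is gone but the A record remains,
	// so nothing is deleted automatically and the record must be removed by hand.
	cleanupDNS("", []string{"api.myname.mydevcluster.com."})
	// A normal destroy still sees the private zone and removes the records.
	cleanupDNS("Z1PRIVATEEXAMPLE", []string{"api.myname.mydevcluster.com."})
}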

@openshift-ci-robot
Contributor

@abhinavdahiya: Closing this issue.

In response to this:

The openshift-dev account should be cleaning these up correctly. If the installer doesn't find the private zone, the public records cannot be deleted, for safety.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@lukas-vlcek

lukas-vlcek commented Aug 16, 2019

I am running into a similar issue, but I am unable to find my domain in the Hosted zones list, so I cannot delete my Route53 record in the AWS console. Am I looking at the wrong dashboard?

Never mind, I found it. I needed to click the parent domain first, and then it showed up...
