Wait cluster responsive #639
Changes from all commits
32ab0ec
c0748bc
6d011d3
ed6a51e
1b07322
67afd55
5f6f7a3
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -31,6 +31,11 @@ resource "aws_eks_cluster" "this" { | |
aws_iam_role_policy_attachment.cluster_AmazonEKSServicePolicy, | ||
aws_cloudwatch_log_group.this | ||
] | ||
provisioner "local-exec" { | ||
Any chance to avoid

Ah ... missed that part of your comment; sorry! Hmm... I'd worry about three things: (1) the http provider would sometimes time out, and it doesn't provide any control over timeouts; (2) when not combined with the cluster resource itself, Terraform could schedule other resources that depend on the cluster before it's ready, and outside the module it's a bit opaque to know what to do (though it could be documented); (3) the docs say they verify the chain of trust for https, and the certificate is likely to be unchained. I guess (2) could possibly be solved by having (say) cluster_id depend on an expression involving some output from the http resource. For (3), just testing will confirm or deny :). As for (1), I'm a Terraform newbie ... do you know anything about http data source timeout behavior? It seems like a risk to me, but I'll go ahead and try it if you guys want to go that way.

UPDATE: Unfortunately, test gives:

Doesn't say whether it failed because the cluster is not up yet or because the connection is insecure. Hmm... could the local-exec provisioner branch by OS and install curl as appropriate? Or perhaps it should use a tool ("curl" or an alternative appropriate for the OS) passed in via a variable (including a no-op)? Your call -- I can give it a whirl if not too complex.

The most general option would be to have the variable for the command be a template required to embed (say) ${URL}. The shell varies by platform, but I guess this type of variable substitution should be broadly compatible? I'm not sure whether the "until" loop is available "everywhere". In any case, AFAIK the options are to code up a custom provider (or patch the http provider to accept options for retry on error and for insecure https), or to try to make local-exec as cross-platform as possible.

UPDATE ... If you are going to patch/create providers, then perhaps changing the kubernetes provider to wait until the endpoint is up would also be an option.

I'm not sure what we should do now but open to opinions. Either merge with a new local-exec or think of something else 🙂

I would think the easiest way to fix something without upstream changes would be to use local-exec, but make it configurable for cross-platform support. Do you have an idea how you'd like to configure it? Do you want me to propose something? (Adding variables for configuration: what program to use for "curl", for instance, or possibly whether to install the curl appropriate for the platform.) Together with this you could open ticket(s) for upstream change(s), for instance if the kubernetes or the http provider had a "wait for cluster"/"wait for url" configuration.

From what I'm reading on Stack Overflow, it sounds like since PowerShell 3.0, Any Windows guy around?

I think we forget about Windows. It's proved to be too much effort to support both platforms natively in the past. Windows users can use Docker, I think.
||
command = <<EOT | ||
until curl -k ${aws_eks_cluster.this[0].endpoint}/healthz >/dev/null; do sleep 4; done | ||
This with no limit is OK?

(I thought the overall Terraform timeout would, by default, be how the user controls this.)
||
EOT | ||
} | ||
} | ||
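For the question above about the loop having no limit, and the suggestion of passing the probing tool in through a variable, here is a minimal sketch of a bounded, configurable wait loop. It is not part of this PR: the function name `wait_for_url` and the `PROBE_CMD` variable are hypothetical, and the default timeout of 300 seconds is an arbitrary assumption. Like the command in the diff, it only checks that the probe command exits successfully.

```shell
#!/bin/sh
# Hypothetical helper (not part of the module): poll a URL until the probe
# command succeeds or a deadline passes.
#   $1 = URL to probe
#   $2 = total seconds to wait (assumed default: 300)
#   $3 = seconds between attempts (default 4, as in the diff)
wait_for_url() {
  url=$1
  timeout=${2:-300}
  interval=${3:-4}
  elapsed=0
  # PROBE_CMD is a hypothetical override so another tool can replace curl.
  # The default mirrors the diff: -k skips certificate verification.
  probe=${PROBE_CMD:-curl -k -s -o /dev/null}
  # $probe is intentionally unquoted so the command and its flags split.
  until $probe "$url"; do
    if [ "$elapsed" -ge "$timeout" ]; then
      echo "timed out after ${elapsed}s waiting for $url" >&2
      return 1
    fi
    sleep "$interval"
    elapsed=$((elapsed + interval))
  done
}
```

If something like this were embedded in the `local-exec` heredoc, the shell's own `${...}` expansions would need to be escaped as `$${...}` so that Terraform does not try to interpolate them.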
|
||
resource "aws_security_group" "cluster" { | ||
|
Is this explicit depends_on required? The provider doesn't work before the cluster's creation.

It depends on how the kubernetes provider is set up(?). I wasn't sure that the dependence on the cluster couldn't end up opaque to Terraform, depending on how the provider is set up, so I thought an explicit dependency would be useful to avoid a bug in an edge case.

Very true. It's not necessarily implicit from other variables.