Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add StatusCheck field to skaffold.yaml (#4904) #5706

Merged
merged 5 commits into from
Apr 30, 2021

Conversation

maggieneterval
Copy link
Contributor

@maggieneterval maggieneterval commented Apr 19, 2021

Fixes: #4904
Merge before/after: Requires rebase after #5700, #5712, #5748 are merged.

Description
Add StatusCheck field to skaffold.yaml.

User facing changes

Before:

--status-check flag healthcheck enabled?
unset yes
true yes
false no

After:

--status-check flag StatusCheck in skaffold.yaml healthcheck enabled?
unset unset yes
unset false no
unset true yes
false unset no
false false no
false true no
true unset yes
true false yes
true true yes

@gsquared94
Copy link
Contributor

From the linked issue:

Upon further investigation, it appears that there is no way to distinguish a true from an unset --status-check flag value (unset defaults to true).

Can we change the type of status-check from a boolean to a custom type with the three states True, False and UnSpecified. Otherwise I think that this status-check stands out as an exception to the behavior of all the other flags that override what is specified in the skaffold.yaml.

@MarlonGamez
Copy link
Contributor

@maggieneterval @gsquared94 there's a newly introduced Nillable interface to use for types similar to what you're suggesting

// Nillable is used to reset objects that implement pflag's `Value` and `SliceValue`.
// Some flags, like `--default-repo`, use nil to indicate that they are unset, which
// is different from the empty string.
type Nillable interface {
SetNil() error
}

You can see an example implementation on the StringOrUndefined type in pkg/skaffold/config/types.go:

// StringOrUndefined holds the value of a flag of type `string`,
// that's by default `undefined`.
// We use this instead of just `string` to differentiate `undefined`
// and `empty string` values.
type StringOrUndefined struct {
value *string
}

@maggieneterval
Copy link
Contributor Author

Thanks for the comments, @gsquared94 and @marlon-gamez! In the spirit of small PRs, I will open a separate PR to make the status-check flag implement Nillable and then rebase this PR on that commit.

@maggieneterval
Copy link
Contributor Author

Hi @marlon-gamez & @gsquared94, this should nearly be ready for review as I've rebased it off #5712. Just need to resolve an integration test error: Type mismatch in profile overlay for field 'statusCheck' with type bool; falling back to original config values that occurs when running skaffold while specifying a profile. Will continue to debug this but please let me know if I've overlooked something obvious! Thank you!

@MarlonGamez
Copy link
Contributor

Hey @maggieneterval, I think this issue is caused by this function not having a case for handling bools.

func overlayProfileField(fieldName string, config interface{}, profile interface{}) interface{} {
v := reflect.ValueOf(profile) // the profile itself
t := reflect.TypeOf(profile) // the type of the profile, used for getting struct field types
logrus.Debugf("overlaying profile on config for field %s", fieldName)
switch v.Kind() {
case reflect.Struct:
// check the first field of the struct for a oneOf yamltag.
if util.IsOneOfField(t.Field(0)) {
return overlayOneOfField(config, profile)
}
return overlayStructField(config, profile)
case reflect.Slice:
// either return the values provided in the profile, or the original values if none were provided.
if v.Len() == 0 {
return config
}
return v.Interface()
case reflect.Ptr:
// either return the values provided in the profile, or the original values if none were provided.
if v.IsNil() {
return config
}
return v.Interface()
case reflect.Int:
if v.Interface() == reflect.Zero(v.Type()).Interface() {
return config
}
return v.Interface()
case reflect.String:
if reflect.DeepEqual("", v.Interface()) {
return config
}
return v.Interface()
default:
logrus.Fatalf("Type mismatch in profile overlay for field '%s' with type %s; falling back to original config values", fieldName, v.Kind())
return config
}
}

I'm actually curious about how this wasn't caught before with other bool fields. I'm asking the team about why this is and I'll get back to you

@MarlonGamez
Copy link
Contributor

It seems like we'll need to update this function to handle bool values as well. Trying locally, it seems that just adding

	case reflect.Bool:
		if v.Interface() == reflect.Zero(v.Type()).Interface() {
			return config
		}
		return v.Interface()

should be good

@@ -497,6 +497,9 @@ type TestCase struct {
type DeployConfig struct {
DeployType `yaml:",inline"`

// StatusCheck *beta* enables waiting for deployments to stabilize.
StatusCheck bool `yaml:"statusCheck,omitempty"`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think omitempty only works if this is *bool

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, this needs to be *bool anyways. Otherwise it'll get a value of false by default and you can't distinguish if that value was set by the user or not (same problem as with the status-check flag that you just fixed).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Updated! (My initial implementation had skaffold fix automatically add a StatusCheck: true stanza to the skaffold.yaml to preserve the existing enabled-by-default behavior, so I also removed that logic.)

Comment on lines 161 to 166
scOpts := rc.Opts.StatusCheck.Value()
scPipelines := rc.Pipelines.StatusCheck()
if scOpts == nil {
return scPipelines
}
return *sc
return *scOpts
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this logic breaks current behavior?

eg: If I don't define statusCheck in my skaffold.yaml it'll be set to false. Since I don't set the flag either, it'll skip status check (line 164).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it should rather be:

	scOpts := rc.Opts.StatusCheck.Value()
	scConfig := rc.Pipelines.StatusCheck()
	switch {
	case scOpts != nil:
		return *scOpts
	case scConfig != nil:
		return *scConfig
	default:
		return true
	}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Updated! (My initial implementation had skaffold fix automatically add a StatusCheck: true stanza to the skaffold.yaml in order to not break the enabled-by-default behavior, but changing its type to *bool instead to distinguish false from unspecified is much better, thanks!)

@codecov
Copy link

codecov bot commented Apr 26, 2021

Codecov Report

Merging #5706 (75ace97) into master (03e9a20) will decrease coverage by 0.02%.
The diff coverage is 62.50%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #5706      +/-   ##
==========================================
- Coverage   70.68%   70.65%   -0.03%     
==========================================
  Files         423      428       +5     
  Lines       16168    16193      +25     
==========================================
+ Hits        11428    11441      +13     
- Misses       3900     3908       +8     
- Partials      840      844       +4     
Impacted Files Coverage Δ
pkg/skaffold/schema/latest/v1/config.go 58.82% <ø> (ø)
pkg/skaffold/schema/latest/v2/config.go 0.00% <0.00%> (ø)
pkg/skaffold/schema/latest/v2/upgrade.go 0.00% <0.00%> (ø)
pkg/skaffold/schema/v3alpha1/config.go 0.00% <0.00%> (ø)
pkg/skaffold/schema/v3alpha1/upgrade.go 0.00% <0.00%> (ø)
pkg/skaffold/runner/deploy.go 53.00% <50.00%> (-0.61%) ⬇️
pkg/skaffold/runner/runcontext/context.go 83.82% <81.81%> (-0.93%) ⬇️
pkg/skaffold/util/tar.go 50.66% <0.00%> (-5.34%) ⬇️
pkg/skaffold/kubernetes/log.go
pkg/skaffold/kubernetes/image_list.go 100.00% <0.00%> (ø)
... and 8 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d5acad2...75ace97. Read the comment docs.

func (ps Pipelines) StatusCheck() *bool {
var sc *bool
// set the group status check to disabled if any pipeline has StatusCheck
// set to false.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was not sure what would be reasonable here; how/why would a user set only a single pipeline's StatusCheck value to false? (Not 100% clear on the relationship between a skaffold.yaml file, a profile, and a pipeline, to be honest...)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not completely sure the reason a user would disable it in one, @gsquared94 might know a situation tho.

a skaffold.yaml can contain one or more skaffod configs. A config defines a pipeline so that skaffold knows how to build, test, deploy, and do statuscheck for a project. A config can also define profiles that replace parts of main config when the user activates a profile

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a skaffold.yaml can contain one or more skaffod configs. A config defines a pipeline

Ah, that makes sense. Thank you for clarifying! Let me know if it would be more reasonable to (1) disable healthcheck if any pipeline has StatusCheck set to false, or (2) enable healthcheck unless all pipelines have StatusCheck set to false. Thanks again for all your help with this!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

had a small discussion on this out of band, but this is probably a new scenario where we want to have values set on a per-config basis. there are surely situations where users want to enable health check for a single module, but leave it disabled for others (e.g. if one module is deploying something long running background task, health checks aren't useful), so it seems strange to disable it altogether if one particular module wants it off.

looking at how we have the code structured around the RunContext right now, this isn't a trivial change, so in interest of not completely blocking this implementation I think we can do this in a follow up PR. in the meantime, we should do two things here:

  1. disable the health check if any pipelines explicitly disable it, since the alternative would be enabling it for long-running deployments that could cause frustrating UX
  2. error out if users are explicitly setting status check configuration in one pipeline, and explicitly disabling it in any other. this will prevent surprising behavior, and at least give users a thread they can follow to get things to work in a less confusing way (we should probably point to docs)

@maggieneterval would you mind also creating an issue to track the follow up work here, and tie it back to this PR? something like "Support configuring status check independently between modules"

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sorry I was out for a few days, but @nkubala's point above is exactly what I wanted to say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you both so much for your help with this. I opened #5746 to document this follow-up work.

@MarlonGamez MarlonGamez added the kokoro:run runs the kokoro jobs on a PR label Apr 26, 2021
@kokoro-team kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Apr 26, 2021
@MarlonGamez
Copy link
Contributor

hey @maggieneterval, we just did a release and we need to generate a new schema version. Once #5748 goes in you'll just need to rebase on it and I think this PR should pretty much be good to merge after that!

@maggieneterval
Copy link
Contributor Author

This should be ready for a final look. Thank you! 😃

@tejal29 tejal29 added the kokoro:run runs the kokoro jobs on a PR label Apr 30, 2021
@kokoro-team kokoro-team removed the kokoro:run runs the kokoro jobs on a PR label Apr 30, 2021
Copy link
Contributor

@tejal29 tejal29 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a small nit, you can short circuit the enabled && disabled check and return early if both are set in the for loop.

@maggieneterval
Copy link
Contributor Author

CI is now failing with the following error:

anyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

Is this a known issue? I can try re-triggering the CI on Monday.

@tejal29
Copy link
Contributor

tejal29 commented Apr 30, 2021

CI is now failing with the following error:

anyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

Is this a known issue? I can try re-triggering the CI on Monday.

I will try retrying now. But yes this has been happening a lot lately and @nkubala did some digging into it.

@tejal29 tejal29 enabled auto-merge (squash) April 30, 2021 21:13
@tejal29 tejal29 added kokoro:force-run forces a kokoro re-run on a PR kokoro:run runs the kokoro jobs on a PR labels Apr 30, 2021
@kokoro-team kokoro-team removed kokoro:run runs the kokoro jobs on a PR kokoro:force-run forces a kokoro re-run on a PR labels Apr 30, 2021
@tejal29 tejal29 merged commit ebe2f6d into GoogleContainerTools:master Apr 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add status-check option to the skaffold.yaml
6 participants