-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Server Side Field Validation #2
Server Side Field Validation #2
Conversation
fbe25a5
to
434ce43
Compare
f0caae4
to
8bc736a
Compare
@@ -855,6 +855,7 @@ var defaultKubernetesFeatureGates = map[featuregate.Feature]featuregate.FeatureS | |||
genericfeatures.WarningHeaders: {Default: true, PreRelease: featuregate.GA, LockToDefault: true}, // remove in 1.24 | |||
genericfeatures.OpenAPIEnums: {Default: false, PreRelease: featuregate.Alpha}, | |||
genericfeatures.CustomResourceValidationExpressions: {Default: false, PreRelease: featuregate.Alpha}, | |||
genericfeatures.FieldValidation: {Default: false, PreRelease: featuregate.Alpha}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not completely sure about the name yet, what about ... StrictValidation
, or both StrictFieldsValidation
?
return obj, gvk, err | ||
} | ||
return nil, gvk, err |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've never seen this type of return signature where one doesn't check for err != nil
but obj != nil
and I can see how it'd be confusing for people.
staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go
Show resolved
Hide resolved
@@ -1321,10 +1327,12 @@ func (v *unstructuredSchemaCoercer) apply(u *unstructured.Unstructured) error { | |||
if err != nil { | |||
return err | |||
} | |||
|
|||
pruned := map[string]bool{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could possibly have used an empty struct here, struct{}
rather than bool, though I don't know if it matters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also prefer var pruned map[string]...
which avoids an unused map allocation
pruned = structuralpruning.Prune(u.Object, v.structuralSchemas[gv.Version], false) | ||
structuraldefaulting.PruneNonNullableNullsWithoutDefaults(u.Object, v.structuralSchemas[gv.Version]) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So if we're not pruning, we can't fail on unknown fields?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems correct... if we're preserving fields not mentioned in the schema, we're effectively saying all field names are permissible
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the other hand, this mostly exists for compatibility reason, doesn't mean people necessarily wanted arbitrary fields? I think it's useful even as a warning for example.
scope.err(err, w, req) | ||
return | ||
} | ||
|
||
decoder := scope.Serializer.DecoderToVersion(s.Serializer, scope.HubGroupVersion) | ||
trace.Step("About to convert to expected version") | ||
obj, gvk, err := decoder.Decode(body, &defaultGVK, original) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That answers one of my question above. It doesn't look like ignore and warn are fundamentally different, do we need to keep ignore then? Or if we keep it, does it need to be the default?
return | ||
} | ||
if validationDirective == runtime.WarnFieldValidation { | ||
warning.AddWarning(req.Context(), "", err.Error()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have expected one warning per unknown field, this looks like it would build a very large warning
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made an attempt at doing that now. Let me know if it's correct
|
||
// smpTestSetup applies an object that will later be patched | ||
// in the actual test/benchmark. | ||
func smpTestSetup(t testing.TB, client clientset.Interface) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a big fan of this function for tests in general
// smpTestSetup applies an object that will later be patched | ||
// in the actual test/benchmark. | ||
func smpTestSetup(t testing.TB, client clientset.Interface) { | ||
bodyBase, err := os.ReadFile("./testdata/deploy-small.json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd also ideally avoid having files read from disk for tests, since it makes it harder to understand what the test does.
var testcases = []struct { | ||
name string | ||
opts metav1.UpdateOptions | ||
errContains string | ||
warnContains string |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this test (and another), I don't know if I enjoy the value of table-driven testing here, as I feel this obfuscates what the test does for very limited code re-use.
I don't have a strong opinion about any of my comments and this is super good and could probably be merged as is (except maybe we should answer to a few TODOs ;-). Thanks Kevin! |
I spent all of today focused on the SMP unstructured converter because that is where I felt the least comfortable and wanted to get some resolution on first. In addressing your feedback @apelisse , I think I finally understood the "pre-flattening" algo we had discussed before that I was previously failing to grok. I've updated the converter to use this now which gets rid of the wonky "fields" field we had to add to the various method signatures that you had commented on (and clears up some of the other concerns you had). On the performance side , the way I had implemented converter strict validation was about 300% slower and more memory intensive than what's currently in master "no validation" (and Ignore also suffered the penalty because it was doing the unknown fields check and just not returning the error). Now, I'm seeing roughly parity between strict validation and the old "no validation" (actually, strict appears to be slightly faster and less memory (which is weird and I need to look into it on monday)). One thing I did not implement was the "full path to unknown field" that we discussed offline. I noticed that the json deserializer also does not indicate the full path, so I wasn't sure if we actually wanted to do that here in SMP. I think it's easy enough to add (although I think it will require changing the method signatures to store the path down successive recursive calls of Anyways, I know this is a lot of text, so let's sync on Monday. If you want, feel free to take another look at |
kubernetes#105030 is merged if you want to transition this to k/k before the next round of feedback |
if len(pruned) > 0 && v.persistStrictDecodingErrors { | ||
allStrictErrs := make([]error, len(pruned)) | ||
i := 0 | ||
for unknownField, _ := range pruned { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
iterating over a map makes the order non-deterministic, which is not ideal
also see suggestion above about making this low-level function just return the list of unknown field paths and letting the caller turn that into warnings/messages
@@ -1321,10 +1327,12 @@ func (v *unstructuredSchemaCoercer) apply(u *unstructured.Unstructured) error { | |||
if err != nil { | |||
return err | |||
} | |||
|
|||
pruned := map[string]bool{} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also prefer var pruned map[string]...
which avoids an unused map allocation
pruned = structuralpruning.Prune(u.Object, v.structuralSchemas[gv.Version], false) | ||
structuraldefaulting.PruneNonNullableNullsWithoutDefaults(u.Object, v.structuralSchemas[gv.Version]) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems correct... if we're preserving fields not mentioned in the schema, we're effectively saying all field names are permissible
structuralSchemas map[string]*structuralschema.Structural | ||
structuralSchemaGK schema.GroupKind | ||
preserveUnknownFields bool | ||
persistStrictDecodingErrors bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"persist" reads strangely... maybe returnUnknownFieldPaths
?
structuralSchemas map[string]*structuralschema.Structural | ||
structuralSchemaGK schema.GroupKind | ||
preserveUnknownFields bool | ||
persistStrictDecodingErrors bool | ||
} | ||
|
||
func (v *unstructuredSchemaCoercer) apply(u *unstructured.Unstructured) error { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since this signature is private, would it be clearer to make the list of unknown fields be returned as a separate return value?
func (v *unstructuredSchemaCoercer) apply(u *unstructured.Unstructured) (unknownFieldPaths []string, err error)
That would let the caller determine how to use the list of unknown field paths, would preserve order, and would distinguish between validation errors and unknown fields
trace.Step("About to convert to expected version") | ||
obj, gvk, err := decoder.Decode(body, &defaultGVK, original) | ||
|
||
validationDirective, err := fieldValidation(req) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there a reason not to do the validation in ValidateCreateOptions above (which presumably already extracted the desired directive from the req parameters), and use options.FieldValidation
instead of a second local var? same question applies to all the places fieldValidation(req) is being used
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Somehow I completely missed the Validate*Options
. I've done that now
@@ -424,7 +438,7 @@ type applyPatcher struct { | |||
userAgent string | |||
} | |||
|
|||
func (p *applyPatcher) applyPatchToCurrentObject(obj runtime.Object) (runtime.Object, error) { | |||
func (p *applyPatcher) applyPatchToCurrentObject(_ context.Context, obj runtime.Object) (runtime.Object, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a TODO? duplicate and unknown fields are detectable here, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was not meant as a TODO, because SSA already defaults to strict validation on unknown fields. But I just realized that we had spoke about how SSA does not error on duplicate fields, so I’ve added a TODO to capture this.
if contentType != "" { | ||
supported = false | ||
for _, v := range strings.Split(contentType, ",") { | ||
t, _, err := mime.ParseMediaType(v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ParseMediaType has non-trivial cost and is already done in NegotiateInputSerializer for create/update, and is unused for patch (we require Content-Type to be exactly one of the supported patch types) ... I didn't expect to be doing this parsing a second time here
is there a reason we can't define a strict version of each deserializer and select that in create.go / update.go if FieldValidation is set to Warn or Strict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this as part of the Validate*Options
work
} | ||
} | ||
|
||
validationParam := req.URL.Query()["fieldValidation"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
pull this from create/update/patch options rather than re-extracting from the query
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
case 0: | ||
return runtime.IgnoreFieldValidation, nil | ||
case 1: | ||
switch strings.ToLower(validationParam[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure why we'd want to be case-insensitive here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to be case-sensitive. Wasn’t sure what the right way was based on this comment kubernetes#104433 (comment)
Has this been re-opened in k/k? I wanted to do another pass. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Addressed most of the feedback. Still a couple TODOs around test formatting and defaulting to warning, but wanted to get this up for another round of review ASAP.
Currently opening up a PR on k/k, will have it up in a few minutes.
// - Ignore: ignore's unknown/duplicate fields | ||
// - Strict: fail the request on unknown/duplicate fields | ||
// - Warn: respond with a warning, but successfully serve the request. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reordered the values. Waiting on another round of feedback before defaulting to warn to confirm that is what we want to do
staging/src/k8s.io/apiextensions-apiserver/pkg/apiserver/customresource_handler.go
Show resolved
Hide resolved
@@ -23,7 +23,8 @@ import ( | |||
// Prune removes object fields in obj which are not specified in s. It skips TypeMeta and ObjectMeta fields | |||
// if XEmbeddedResource is set to true, or for the root if isResourceRoot=true, i.e. it does not | |||
// prune unknown metadata fields. | |||
func Prune(obj interface{}, s *structuralschema.Structural, isResourceRoot bool) { | |||
// It returns the set of fields that it prunes. | |||
func Prune(obj interface{}, s *structuralschema.Structural, isResourceRoot bool) map[string]bool { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
@@ -424,7 +438,7 @@ type applyPatcher struct { | |||
userAgent string | |||
} | |||
|
|||
func (p *applyPatcher) applyPatchToCurrentObject(obj runtime.Object) (runtime.Object, error) { | |||
func (p *applyPatcher) applyPatchToCurrentObject(_ context.Context, obj runtime.Object) (runtime.Object, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was not meant as a TODO, because SSA already defaults to strict validation on unknown fields. But I just realized that we had spoke about how SSA does not error on duplicate fields, so I’ve added a TODO to capture this.
return runtime.IgnoreFieldValidation, nil | ||
} | ||
|
||
// TODO: Should we blocklist unsupportedContentTypes (just protobuf) rather than allowlisting everything that isn't protobuf? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed as part of the Validation*Options
work
if contentType != "" { | ||
supported = false | ||
for _, v := range strings.Split(contentType, ",") { | ||
t, _, err := mime.ParseMediaType(v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
removed this as part of the Validate*Options
work
} | ||
} | ||
|
||
validationParam := req.URL.Query()["fieldValidation"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
case 0: | ||
return runtime.IgnoreFieldValidation, nil | ||
case 1: | ||
switch strings.ToLower(validationParam[0]) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed it to be case-sensitive. Wasn’t sure what the right way was based on this comment kubernetes#104433 (comment)
This is the commit message for patch #2 (refresh-temp):
Signed-off-by: Nilekh Chaudhari <1626598+nilekhc@users.noreply.github.com> fix (#2) Signed-off-by: Monis Khan <mok@microsoft.com>
What type of PR is this?
/kind feature
What this PR does / why we need it:
Performs strict server side schema validation via the
fieldValidation=[Strict,Warn,Ignore]
query parameter.This change is dependent on
liggitt/json-stdlib
and so this PR is opened on this fork in order to begin collecting feedback on it untilliggitt/json-stdlib
is merged into upstream k/k.It introduces the
fieldValidation
query parameter that when set toStrict
causes requests to error when unknown fields are present, when set toWarn
succeeds the request but returns a warning, and when set toIgnore
it ignores unknown fields.For create/update requests, we use strict json unmarshalling to determine the unknown fields and error/warn based on those fields.
For JSON Patch requests (i.e. CRDs), we use the prune mechanism in the customresource handler to resolve any unknown fields.
For SMP Patch request, we use the the unstructured converter to resolve unknown fields.
Apply Patch requests already require strict schemas and will error on unknown fields.
Which issue(s) this PR fixes:
First step in solving kubernetes#39434 and kubernetes#5889
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
KEP-2885