
omitempty corrections #2255

Merged

Conversation

Tom-Newton
Contributor

@Tom-Newton Tom-Newton commented Oct 14, 2024

Purpose of this PR

Some corrections to make a couple of required fields required and optional fields genuinely optional.

Proposed changes:

  • Fix [BUG] minor: Several sparkUIOptions have become required #2215. This problem was caused by some optional fields in the CRD not using omitempty, which meant that valid SparkApplication configs had nulls added where they weren't allowed during encoding and decoding in the SparkApplication defaulting webhook.
  • Use the regex \+optional\n^(?!.*omitempty).*$ to find all the optional parameters that don't use omitempty, and add omitempty to them to avoid the same problem elsewhere.
  • Remove omitempty on the metadata and spec fields of SparkApplication and ScheduledSparkApplication. I think these should be required, and they used to be in the past; I suspect this may have been changed accidentally. Obviously let me know if there is a good reason to keep omitempty on these.
  • Make mainApplicationFile a required field.
  • Write unit tests for both of the above. I wasn't really sure how to do this, but I came up with an approach that adds the required coverage: read the openapiv3 schema from the CRD and validate against it. I ended up removing this; maybe we can add something better as a follow-up.

Change Category

Indicate the type of change by marking the applicable boxes:

  • Bugfix (non-breaking change which fixes an issue)
  • Feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that could affect existing functionality)
  • Documentation update

Checklist

Before submitting your PR, please review the following:

  • I have conducted a self-review of my own code.
  • I have updated documentation accordingly - Ran make build-api-docs to capture making mainApplicationFile required.
  • I have added tests that prove my changes are effective or that my feature works.
  • Existing unit tests pass locally with my changes.

Additional Notes

I'm still a golang noob.

@Tom-Newton Tom-Newton force-pushed the tomnewton/omitempty_corrections branch from 6b26711 to 284d5ab Compare October 14, 2024 22:36
@ChenYi015
Contributor

Remove omitempty on the metadata and spec fields of SparkApplication and ScheduledSparkApplication. I think these should be required, and they used to be in the past; I suspect this may have been changed accidentally. Obviously let me know if there is a good reason to keep omitempty on these.

@Tom-Newton Since v2, we use kubebuilder to create the API for SparkApplication/ScheduledSparkApplication with a command like the following:

kubebuilder create api --version v1beta2 --kind SparkApplication

and the omitempty annotation exists by default.

@Tom-Newton
Contributor Author

@Tom-Newton Since v2, we use kubebuilder to create the API for SparkApplication/ScheduledSparkApplication with a command like the following:

kubebuilder create api --version v1beta2 --kind SparkApplication

and the omitempty annotation exists by default.

Thanks for the info. Does that mean we should keep them as they were? Or is there somewhere else that is the source of truth for marking them as required when running kubebuilder?

@ChenYi015
Contributor

Does that mean we should keep them as they were?

I think it is better to keep the metadata and spec annotations as default unless there is a good reason. For many other types, such as the ones in the kubeflow/training-operator, the omitempty annotation is retained:

https://github.com/kubeflow/training-operator/blob/6965c1a92462d46981071748936eb135a4584f3d/pkg/apis/kubeflow.org/v1/pytorch_types.go#L56-L68

@ChenYi015
Contributor

The serviceType in driverIngressConfigurations also needs the omitempty annotation:

// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *corev1.ServiceType `json:"serviceType"`

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
@Tom-Newton
Contributor Author

The serviceType in driverIngressConfigurations also needs the omitempty annotation:

// ServiceType allows configuring the type of the service. Defaults to ClusterIP.
// +optional
ServiceType *corev1.ServiceType `json:"serviceType"`

Yes, I think I got that one https://github.com/kubeflow/spark-operator/pull/2255/files#diff-fae12edea9174bb072c60830180dc6e65aeaf098de9dfb7ac69240c3e1c347b1R315

@Tom-Newton
Contributor Author

I think it is better to keep the metadata and spec annotations as default unless there is a good reason.

My reasoning is:

  1. It seems wrong to make something optional in the CRD schema when we know it's actually required. Currently, if I submit a SparkApplication with no spec, it fails in the defaulting webhook with * spec.type: Unsupported value: "": supported values: "Java", "Python", "Scala", "R". This is better than just admitting it and causing a panic in the controller, but it does show that spec really is a required field.
  2. Other code that interacts with SparkApplications using the CRD schema now has to add extra checks for whether these fields are null. We have a case like this.

Let me know if you think these constitute good reason - if not I'll revert them.

@Tom-Newton Tom-Newton force-pushed the tomnewton/omitempty_corrections branch from 9674cfc to 18e0e3d Compare October 18, 2024 08:37
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
@Tom-Newton Tom-Newton force-pushed the tomnewton/omitempty_corrections branch from 18e0e3d to df0cc95 Compare October 18, 2024 08:38
@google-oss-prow google-oss-prow bot added size/M and removed size/L labels Oct 18, 2024
@ChenYi015
Contributor

ChenYi015 commented Oct 18, 2024

I think it is better to keep the metadata and spec annotations as default unless there is a good reason.

My reasoning is:

  1. It seems wrong to make something optional in the CRD schema when we know it's actually required. Currently, if I submit a SparkApplication with no spec, it fails in the defaulting webhook with * spec.type: Unsupported value: "": supported values: "Java", "Python", "Scala", "R". This is better than just admitting it and causing a panic in the controller, but it does show that spec really is a required field.
  2. Other code that interacts with SparkApplications using the CRD schema now has to add extra checks for whether these fields are null. We have a case like this.

@Tom-Newton

  1. When creating a SparkApplication with no spec, the webhook will not be called because the API server will first fail to validate against the CRD schema. The omitempty annotation does not imply that a field is optional; for fields that are optional, we use the kubebuilder marker +optional to indicate this.

  2. Since the metadata and spec fields are structs, not pointers, there is no need to consider the null problem when interacting with the CRD schema.

@Tom-Newton
Contributor Author

Tom-Newton commented Oct 18, 2024

When creating a SparkApplication with no spec, webhook will not be called because the API server will first fail to validate the CRD schema.

From my testing that is not correct. If I use the following config

apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: test
  namespace: default

and test it against a kind cluster running the Spark operator built from the latest master, then I get

$ kubectl apply -f ~/Downloads/config.yml 
The SparkApplication "test" is invalid: 
* spec.type: Unsupported value: "": supported values: "Java", "Python", "Scala", "R"
* spec.mainApplicationFile: Invalid value: "null": spec.mainApplicationFile in body must be of type string: "null"

and I see from the webhook logs that the webhook was in fact called

2024-10-18T09:30:08.421Z	INFO	webhook/sparkapplication_defaulter.go:57	Defaulting SparkApplication	{"name": "test", "namespace": "default", "state": ""}

BTW with this change the second of those validation errors would go away.

the API server will first fail to validate the CRD schema.

This is exactly what removing the omitempty achieves. With omitempty in place, the generated CRD schema will not mark the spec as required. However, after removing omitempty and re-running make update-crd (as I've done in this PR), the CRD does mark the spec as required https://github.com/kubeflow/spark-operator/pull/2255/files#diff-929d75537a6ddea49aa2609c3f42e8bf15c99f31a82aae9b8c5a7933d78199ccR11558-R11560.

@ChenYi015
Contributor

I have just tested it and you are right. The webhook is called and the metadata and spec fields become required after removing the omitempty annotation.

Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
@Tom-Newton Tom-Newton force-pushed the tomnewton/omitempty_corrections branch from d1c4946 to a1c249c Compare October 18, 2024 10:40
Signed-off-by: Thomas Newton <thomas.w.newton@gmail.com>
@Tom-Newton Tom-Newton force-pushed the tomnewton/omitempty_corrections branch from 8f5b549 to 8a591fd Compare October 18, 2024 10:56
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ChenYi015

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ChenYi015
Contributor

@Tom-Newton Well done! Thanks for reporting the issue and fixing it.
/lgtm

@google-oss-prow google-oss-prow bot added the lgtm label Oct 18, 2024
@google-oss-prow google-oss-prow bot merged commit 5ff8dcf into kubeflow:master Oct 18, 2024
8 checks passed
@Tom-Newton
Contributor Author

@Tom-Newton Well done! Thanks for reporting the issue and fixing it.

Thanks for reviewing 🙂


Successfully merging this pull request may close these issues.

[BUG] minor: Several sparkUIOptions have become required