🐛 fix: Don't flatten schemas for type idents we don't know about #627

@abayer abayer commented Sep 29, 2021

I hit this problem with cases where there are multiple packages with the same group and version. The schema patcher would hit a nil pointer trying to flatten a schema for a kind that exists in one package, but not another, because it seems to assume that everything's in one package. By skipping cases where the parser hasn't already found the given kind in the given package, we got past that panic and everything works correctly.

This adds an option for schemapatch, allowMultiPackageGroup - when that is true, we won't error out in scenarios where a group is used in multiple packages, while in the default case, we'll error out with the package path and kind that couldn't be found.

I believe this fixes #624, though I can't be sure if this is the only path that can lead to that panic.

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>

@k8s-ci-robot added the do-not-merge/invalid-commit-message (indicates that a PR should not merge because it has an invalid commit message) and cncf-cla: yes (indicates the PR's author has signed the CNCF CLA) labels on Sep 29, 2021
@k8s-ci-robot
Contributor

Welcome @abayer!

It looks like this is your first PR to kubernetes-sigs/controller-tools 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes-sigs/controller-tools has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot added the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on Sep 29, 2021
@k8s-ci-robot
Contributor

Hi @abayer. Thanks for your PR.

I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: abayer
To complete the pull request process, please assign pwittrock after the PR has been reviewed.
You can assign the PR to them by writing /assign @pwittrock in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

I hit this problem with cases where there are multiple packages with the
same group and version. The schema patcher would hit a nil pointer
trying to flatten a schema for a kind that exists in one package, but
not another, because it seems to assume that everything's in one
package. By skipping cases where the parser hasn't already found the
given kind in the given package, we got past that panic and everything
works correctly.

I believe this should address kubernetes-sigs#624, though I can't be sure if this is
the only path that can lead to that panic.

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
@abayer force-pushed the ignore-groupkind-from-other-package branch from 9e0d3f0 to 8eda5d7 on September 29, 2021 20:04
@k8s-ci-robot removed the do-not-merge/invalid-commit-message label (indicates that a PR should not merge because it has an invalid commit message) on Sep 29, 2021
@alvaroaleman
Member

/ok-to-test

@k8s-ci-robot added the ok-to-test label (indicates a non-member PR verified by an org member that is safe to test) and removed the needs-ok-to-test label (indicates a PR that requires an org member to verify it is safe to test) on Sep 30, 2021
@alvaroaleman
Member

So you have multiple go packages that all have types for the same api group and version? Why?

@abayer
Author

abayer commented Sep 30, 2021

@alvaroaleman Honestly? The best explanation is probably that we screwed up. =)

@alvaroaleman
Member

> @alvaroaleman Honestly? The best explanation is probably that we screwed up. =)

So what keeps you from reorganizing them?

I am fine with handling this case better and emitting a proper error rather than panicking. But I am not sure we want to actually support it.

@abayer
Author

abayer commented Sep 30, 2021

That's a good question - Tekton's pretty established at this point, which obviously would make any significant reorganization more difficult. I'll raise the topic, but given that this is the first time I'm aware of us hitting problems due to our quirky structure, and we're only hitting the problem now that we're trying to add auto-generated OpenAPI schemas to our previously-handcrafted CRDs, I'm not sure if there'll be much, if any, willingness to shift things around.

@abayer
Author

abayer commented Sep 30, 2021

Ok, so https://github.com/tektoncd/pipeline/tree/main/pkg/apis has three relevant subdirectories - pipeline, run, and resource. run and resource just have v1alpha1 subdirs/versions, while pipeline has both v1alpha1 and v1beta1. I'm trying to see if pkg/apis/resource/v1alpha1's contents can be moved to pkg/apis/pipeline/v1alpha1, but due to how we have pipeline's v1alpha1 and v1beta1 set up, with many types in v1alpha1 just being aliases for what's in v1beta1, and both pipeline/v1alpha1 and pipeline/v1beta1 having dependencies on resource/v1alpha1, everything gets really messy trying to move it around...

So yeah, we definitely screwed up - since Tekton came out of Knative, I assume our original pkg/apis/... layout was based on how it was done in Knative projects - i.e., https://github.com/knative/serving/tree/main/pkg/apis has autoscaling and serving (and around the time Tekton started as Knative Build, it also had networking). Then, when we added .../pipeline/v1beta1, we probably didn't do that quite right either (the type aliasing to v1beta1 from v1alpha1 like this doesn't feel right to me, but I can't claim to be anything like an expert!), which just made things more confusing, while we left run and resource at v1alpha1 with another bunch of type aliasing into pipeline/v1beta1 (and pipeline/v1alpha1, for that matter). It's definitely a mess.

I'll continue to see if we can reorganize, and definitely understand not wanting to support a weird edge case of a layout like Tekton's, but it'd be really handy for us (and possibly for others) if we could at least have an option to enable the behavior I add in this PR (as well as default behavior that errors out nicely rather than panicking).
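
For readers unfamiliar with the aliasing mentioned above, here is a minimal single-file sketch of what a Go type alias means. The names `taskV1beta1`, `taskV1alpha1`, and `describe` are hypothetical and are not Tekton's actual types; in the layout discussed here the alias and its target would live in different packages.

```go
package main

import "fmt"

// taskV1beta1 stands in for a type defined in a v1beta1 package.
type taskV1beta1 struct {
	Name string
}

// taskV1alpha1 is an alias, not a new type: both names denote the same type,
// so there is only one underlying declaration.
type taskV1alpha1 = taskV1beta1

// describe accepts the v1beta1 type; alias values can be passed unchanged.
func describe(t taskV1beta1) string {
	return "task " + t.Name
}

func main() {
	old := taskV1alpha1{Name: "build"}
	fmt.Println(describe(old))
	// %T reports the declared type name, since the alias introduces no new type.
	fmt.Printf("%T\n", old)
}
```

Because an alias introduces no new declaration, the struct is declared in only one package even though it is reachable under a name in another.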

…llow instead

Signed-off-by: Andrew Bayer <andrew.bayer@gmail.com>
@abayer
Author

abayer commented Sep 30, 2021

I've updated the PR to add an allowMultiPackageGroup option, defaulting to erroring out.

@vdemeester

We did not screw up per se. It was done on purpose, with only Go types (and aliasing) in mind, to ease the migration of the whole code base from v1alpha1 to v1beta1. The assumption we made (or at least I made) was that nothing would or should prevent us from having multiple packages presenting one API version. I would still make that assumption, but I also understand that it can be trickier to support.

The real trick in our case is the resources deprecated in alpha that we didn't want to move to the v1beta1 package (and API version), while still allowing our users to refer to these resources from v1beta1 for the time being.

The plan is to get rid of v1alpha1 at some point. I also see no real problem with removing the type aliasing, etc., and having some duplication to allow us to fix this (sometimes duplication is better than the wrong abstraction).

@@ -75,6 +75,10 @@ type Generator struct {
 	// GenerateEmbeddedObjectMeta specifies if any embedded ObjectMeta in the CRD should be generated
 	GenerateEmbeddedObjectMeta *bool `marker:",optional"`
+
+	// AllowMultiPackageGroups specifies whether cases where a group is used for multiple packages should be allowed,
+	// rather than the default behavior of failing.
Member

After this PR, the default behaviour would not be to fail, right?
Could you please fix the comment so that it describes what the variable represents and when it is used, as we do for the other ones?

Author

The default behavior would still be failing in the multiple-packages-for-a-single-group scenario. The new option allows you to change that behavior, but the default stays the same.
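
As a rough sketch of why the default stays the same: an optional `*bool` option that is left unset is nil, and treating nil as false keeps the failing behavior unless the user opts in. The `options` struct and the `allowMultiPackageGroups` helper below are hypothetical and only mirror the shape of the field in the diff above; they are not the PR's actual code.

```go
package main

import "fmt"

// options is a hypothetical stand-in for the generator's option struct.
type options struct {
	// AllowMultiPackageGroups is optional; nil means the user did not set it.
	AllowMultiPackageGroups *bool
}

// allowMultiPackageGroups reports the effective setting. An unset option
// falls back to false, so the default remains to fail.
func (o options) allowMultiPackageGroups() bool {
	return o.AllowMultiPackageGroups != nil && *o.AllowMultiPackageGroups
}

func main() {
	enabled := true

	// Unset: the default applies and multi-package groups are rejected.
	fmt.Println("unset:", options{}.allowMultiPackageGroups())

	// Explicitly enabled: multi-package groups are tolerated.
	fmt.Println("enabled:", options{AllowMultiPackageGroups: &enabled}.allowMultiPackageGroups())
}
```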

@@ -41,7 +41,7 @@ var _ = Describe("CRD Patching From Parsing to Editing", func() {
 	var crdSchemaGen genall.Generator = &Generator{
 		ManifestsPath: "./invalid",
 	}
-	rt, err := genall.Generators{&crdSchemaGen}.ForRoots("./...")
+	rt, err := genall.Generators{&crdSchemaGen}.ForRoots("./apis/kubebuilder/...", "./apis/legacy/...")
Member

@camilamacedo86 camilamacedo86 Dec 10, 2021

Why did you need to change this to the specific packages?
What starts failing after these changes, and where?

Author

This is because I made the (probably wrong!) decision to reuse the contents of pkg/schemapatcher/testdata for both the existing tests and the new testing of multiple-packages-per-group. I'll separate the test data for those situations.

@@ -67,7 +67,7 @@ var _ = Describe("CRD Patching From Parsing to Editing", func() {
 	var crdSchemaGen genall.Generator = &Generator{
 		ManifestsPath: "./valid",
 	}
-	rt, err := genall.Generators{&crdSchemaGen}.ForRoots("./...")
+	rt, err := genall.Generators{&crdSchemaGen}.ForRoots("./apis/kubebuilder/...", "./apis/legacy/...")
Member

Same here. I think this needs to be reverted to ensure that the changes made here do not break any current test. Could you please clarify why this change was required?

Member

@camilamacedo86 camilamacedo86 left a comment

#627 (comment)
+1
Hi @abayer, @vdemeester,

IMHO the following is missing (probably from the issue):

  • An example of the scenario
  • A description of when it is required and why

Also, we would need to check whether this is a valid scenario, or whether this PR is trying to add support for a scenario that ought not to be supported.

Are you trying to allow the same GVK to be specified more than once in the same project?
If you deprecated the version, should you not have a different GVK, since the version would not be the same?

If I understood your scenario correctly, I think controller-gen ought to raise an error saying that the user is trying to specify the same GVK more than once, instead of adding support for it and beginning to allow it.

@abayer
Author

abayer commented Jan 19, 2022

@camilamacedo86 Sorry for the delay! I missed the notifications originally and just remembered to check this PR. 🤦 Thanks for your comments, and I'll address them today.

@abayer
Author

abayer commented Jan 19, 2022

@camilamacedo86 So the situation in Tekton Pipeline is that we have pkg/apis/pipeline/v1alpha1, pkg/apis/pipeline/v1beta1, pkg/apis/resource/v1alpha1, and pkg/apis/run/v1alpha1. They all have tekton.dev as their group. See above for some details on why we have things that way. schema-gen fails because it doesn't expect to see the same group and version for different packages. It fails with the same panic that's in #624, but I don't know if that's actually caused by the same thing or not. I should make a new issue, and will do that.
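
To make the layout concrete, the following sketch groups packages by group and version and reports any pair served by more than one package. The `groupVersion` type and the `multiPackageGroupVersions` helper are hypothetical and are not part of controller-tools. The package paths are the ones listed above, and the versions are assumed from the directory names.

```go
package main

import (
	"fmt"
	"sort"
)

// groupVersion is a hypothetical stand-in for an API group and version pair.
type groupVersion struct {
	Group, Version string
}

// multiPackageGroupVersions maps each group/version that is declared by more
// than one package to the sorted list of package paths declaring it.
func multiPackageGroupVersions(pkgs map[string]groupVersion) map[groupVersion][]string {
	byGV := make(map[groupVersion][]string)
	for path, gv := range pkgs {
		byGV[gv] = append(byGV[gv], path)
	}
	for gv, paths := range byGV {
		if len(paths) < 2 {
			// Served by a single package: not a multi-package group/version.
			delete(byGV, gv)
			continue
		}
		sort.Strings(paths)
	}
	return byGV
}

func main() {
	// Versions here are inferred from the directory names (an assumption).
	pkgs := map[string]groupVersion{
		"pkg/apis/pipeline/v1alpha1": {Group: "tekton.dev", Version: "v1alpha1"},
		"pkg/apis/pipeline/v1beta1":  {Group: "tekton.dev", Version: "v1beta1"},
		"pkg/apis/resource/v1alpha1": {Group: "tekton.dev", Version: "v1alpha1"},
		"pkg/apis/run/v1alpha1":      {Group: "tekton.dev", Version: "v1alpha1"},
	}
	for gv, paths := range multiPackageGroupVersions(pkgs) {
		fmt.Printf("%s/%s is declared by %d packages: %v\n", gv.Group, gv.Version, len(paths), paths)
	}
}
```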

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle stale
  • Mark this issue or PR as rotten with /lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot added the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) on Apr 19, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Close this issue or PR with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot added the lifecycle/rotten label (denotes an issue or PR that has aged beyond stale and will be auto-closed) and removed the lifecycle/stale label (denotes an issue or PR that has remained open with no activity and has become stale) on May 19, 2022
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs.

This bot triages issues and PRs according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Reopen this issue or PR with /reopen
  • Mark this issue or PR as fresh with /remove-lifecycle rotten
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/close

@k8s-ci-robot
Contributor

@k8s-triage-robot: Closed this PR.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Labels
cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Development

Successfully merging this pull request may close these issues.

panic: runtime error: invalid memory address or nil pointer dereference
6 participants