Refactor err check in GS controller a bit. Add event if Pod was not created #1499

Merged
merged 1 commit into from
Apr 29, 2020

Conversation

aLekSer
Collaborator

@aLekSer aLekSer commented Apr 29, 2020

Fix a panic when the error is nil. Propagate the error to an event, so it is visible which validation failed.
Example of an error that reaches this if branch:

Operation cannot be fulfilled on resourcequotas "gke-resource-quotas"

What type of PR is this?

Uncomment only one /kind <> line, hit enter to put that in a new line, and remove leading whitespace from that line:

/kind breaking
/kind bug
/kind cleanup
/kind documentation

/kind feature

/kind hotfix

What this PR does / Why we need it:

Which issue(s) this PR fixes:

For #1475.

Special notes for your reviewer:

@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 4f8f474a-a085-4606-b9bd-f5a5abe05796

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1499/head:pr_1499 && git checkout pr_1499
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.6.0-4e1a7d0

@aLekSer aLekSer force-pushed the event-on-pod-error branch from 4e1a7d0 to 6072398 Compare April 29, 2020 15:31
@aLekSer aLekSer changed the title Add event if Pod was not created Fix possible panic in GS controller. Add event if Pod was not created due to not handled type of error Apr 29, 2020
@aLekSer aLekSer changed the title Fix possible panic in GS controller. Add event if Pod was not created due to not handled type of error Fix possible panic in GS controller. Add event if Pod was not created Apr 29, 2020
@aLekSer
Collaborator Author

aLekSer commented Apr 29, 2020

Stackdriver logs for this situation:

D 2020-04-09T14:17:19.853226091Z Creating Pod for GameServer 
D 2020-04-09T14:17:19.872478854Z Enqueuing 
D 2020-04-09T14:17:22.045607572Z Event(v1.ObjectReference{Kind:"GameServer", Namespace:"default", Name:"scale-fleet-0-2w6nx-4snpm-rgcwn", UID:"d233233b-7a6c-11ea-9275-42010a8a0150", APIVersion:"agones.dev/v1", ResourceVersion:"848172", FieldPath:""}): type: 'Normal' reason: 'Creating' Pod scale-fleet-0-2w6nx-4snpm-rgcwn created 
D 2020-04-09T14:17:22.467890467Z Syncing Starting GameServerState 
E 2020-04-09T14:17:22.468075187Z {"queue":"agones.dev.GameServerControllerCreation","error":"pod \"scale-fleet-0-2w6nx-4snpm-rgcwn\" not found","source":"*gameservers.Controller","message":"","subqueue":"creation","gsKey":"default/scale-fleet-0-2w6nx-4snpm-rgcwn"} 
E 2020-04-09T14:17:22.468116569Z pod "scale-fleet-0-2w6nx-4snpm-rgcwn" not found 
D 2020-04-09T14:17:22.468178046Z Processing 
D 2020-04-09T14:17:22.468220119Z Synchronising 
D 2020-04-09T14:17:22.468280433Z Syncing Create State 
D 2020-04-09T14:17:22.468501433Z Creating Pod for GameServer 
D 2020-04-09T14:17:22.601539991Z Enqueuing 
D 2020-04-09T14:17:22.624646062Z Processing 
D 2020-04-09T14:17:22.624668062Z Synchronising 
D 2020-04-09T14:17:22.624690106Z Syncing Starting GameServerState 
E 2020-04-09T14:17:22.624768285Z {"gsKey":"default/scale-fleet-0-2w6nx-4snpm-rgcwn","error":"pod \"scale-fleet-0-2w6nx-4snpm-rgcwn\" not found","source":"*gameservers.Controller","queue":"agones.dev.GameServerController","message":""} 
E 2020-04-09T14:17:22.624798856Z pod "scale-fleet-0-2w6nx-4snpm-rgcwn" not found 
D 2020-04-09T14:17:22.664804712Z Processing 
D 2020-04-09T14:17:22.664935379Z Synchronising 
D 2020-04-09T14:17:22.665038199Z Syncing Starting GameServerState 
...

2020-04-09 06:17:40.670 UTC-8 Event(v1.ObjectReference{Kind:"GameServer", Namespace:"default", Name:"scale-fleet-0-2w6nx-4snpm-rgcwn", UID:"d233233b-7a6c-11ea-9275-42010a8a0150", APIVersion:"agones.dev/v1", ResourceVersion:"849521", FieldPath:""}): type: 'Normal' reason: 'Starting' Deleting Pod scale-fleet-0-2w6nx-4snpm-rgcwn
2020-04-09 06:17:40.674 UTC-8 Processing
2020-04-09 06:17:40.674 UTC-8 Synchronising
2020-04-09 06:17:40.811 UTC-8 Processing
2020-04-09 06:17:40.812 UTC-8 Synchronising
2020-04-09 06:17:40.812 UTC-8 Syncing with Deletion Timestamp
2020-04-09 06:17:40.812 UTC-8 No pods found, removing finalizer agones.dev
2020-04-09 06:17:40.949 UTC-8 Syncing deleted GameServer

@aLekSer
Collaborator Author

aLekSer commented Apr 29, 2020

How to reproduce the panic:
https://github.com/kubernetes/apimachinery/blob/master/pkg/api/errors/errors_test.go#L37-L46
Change the test to the following:

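func TestErrorNew(t *testing.T) {
	// err is inferred as *StatusError here, not as the error interface.
	err := NewAlreadyExists(resource("tests"), "1")
	if !IsAlreadyExists(err) {
		t.Errorf("expected to be %s", metav1.StatusReasonAlreadyExists)
	}
	// This is a nil *StatusError; passing it below wraps it in a non-nil error interface.
	err = nil
	if IsAlreadyExists(err) {
		t.Errorf("expected not to be %s", metav1.StatusReasonAlreadyExists)
	}
}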

But the following is OK and does not panic, as in Agones master:

func TestErrorNew(t *testing.T) {
	// err has the interface type error, so nil here is a true nil interface.
	var err error
	err = nil
	if IsAlreadyExists(err) {
		t.Errorf("expected not to be %s", metav1.StatusReasonAlreadyExists)
	}
	err = NewAlreadyExists(resource("tests"), "1")
	if !IsAlreadyExists(err) {
		t.Errorf("expected to be %s", metav1.StatusReasonAlreadyExists)
	}
}
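
The difference between the two tests above looks like Go's typed-nil behaviour: a nil pointer stored in an error interface is not a nil interface. A minimal standalone sketch of that effect follows. The statusError type and the isAlreadyExists and panics helpers are hypothetical stand-ins written for this illustration, not the apimachinery implementation.

```go
package main

import "fmt"

// statusError is a hypothetical stand-in for a pointer-receiver error type.
type statusError struct {
	reason string
}

// Error implements the error interface.
func (e *statusError) Error() string { return e.reason }

// Reason dereferences the receiver, so it panics when e is a nil pointer.
func (e *statusError) Reason() string { return e.reason }

// isAlreadyExists mimics the shape of a reason check: assert the type, then call a method.
func isAlreadyExists(err error) bool {
	if se, ok := err.(*statusError); ok {
		return se.Reason() == "AlreadyExists"
	}
	return false
}

// panics reports whether calling isAlreadyExists(err) panicked.
func panics(err error) (panicked bool) {
	defer func() {
		if r := recover(); r != nil {
			panicked = true
		}
	}()
	isAlreadyExists(err)
	return false
}

func main() {
	var untyped error     // nil interface: no dynamic type, the type assertion fails
	var ptr *statusError  // nil pointer
	var typed error = ptr // non-nil interface holding a nil pointer

	fmt.Println(untyped == nil, panics(untyped)) // prints "true false"
	fmt.Println(typed == nil, panics(typed))     // prints "false true"
}
```

Declaring the variable as error from the start, as in the second test, avoids the typed nil entirely.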

@agones-bot
Collaborator

Build Failed 😱

Build Id: d2cb0f04-1c15-4548-a381-626d906836b6

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@aLekSer aLekSer changed the title Fix possible panic in GS controller. Add event if Pod was not created Add err check in GS controller. Add event if Pod was not created Apr 29, 2020
@aLekSer aLekSer force-pushed the event-on-pod-error branch from 6072398 to 29116c3 Compare April 29, 2020 15:53
@aLekSer aLekSer changed the title Add err check in GS controller. Add event if Pod was not created Check err before using in GS controller. Add event if Pod was not created Apr 29, 2020
@aLekSer aLekSer changed the title Check err before using in GS controller. Add event if Pod was not created Refactor err check in GS controller a bit. Add event if Pod was not created Apr 29, 2020
@aLekSer aLekSer marked this pull request as ready for review April 29, 2020 15:56
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: fc168da0-17e6-44f9-aa70-e8781181b224

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1499/head:pr_1499 && git checkout pr_1499
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.6.0-29116c3

if err != nil {
if k8serrors.IsInvalid(err) {
switch {
Collaborator

Niiiiice.

c.loggerForGameServer(gs).WithField("pod", pod).Errorf("Pod created is forbidden")
gs, err = c.moveToErrorState(gs, err.Error())
return gs, err
}
c.loggerForGameServer(gs).WithField("pod", pod).WithError(err)
c.recorder.Eventf(gs, corev1.EventTypeWarning, string(gs.Status.State), "error creating Pod for GameServer %s", gs.Name)
Collaborator

Should this be the default: case of the switch, rather than sitting outside the switch?

Since moveToErrorState will create an event, won't we get two events with the same Error warning if the error type is AlreadyExists, Invalid or Forbidden?

Collaborator Author

Yes, that is true

@aLekSer aLekSer force-pushed the event-on-pod-error branch from 29116c3 to d6151ba Compare April 29, 2020 18:39
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: e48a50c6-c640-4402-b4ef-50704890d5f8

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/GoogleCloudPlatform/agones.git pull/1499/head:pr_1499 && git checkout pr_1499
  • helm install ./install/helm/agones --namespace agones-system --name agones --set agones.image.tag=1.6.0-d6151ba

Collaborator

@markmandel markmandel left a comment

👍

@google-oss-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: aLekSer, markmandel

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@markmandel markmandel merged commit cd50fca into googleforgames:master Apr 29, 2020
@markmandel markmandel added the kind/cleanup Refactoring code, fixing up documentation, etc label Apr 29, 2020
@markmandel markmandel added this to the 1.6.0 milestone Apr 29, 2020
ilkercelikyilmaz pushed a commit to ilkercelikyilmaz/agones that referenced this pull request Oct 23, 2020
Labels
approved kind/cleanup Refactoring code, fixing up documentation, etc lgtm size/S
4 participants