Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[release-v1.9] Expose init offset and schedule metrics for ConsumerGroup reconciler #790

Conversation

pierDipi
Copy link
Member

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
@openshift-ci openshift-ci bot requested review from lberk and matzew August 22, 2023 10:32
@pierDipi
Copy link
Member Author

/cherry-pick release-v1.10

@openshift-cherrypick-robot

@pierDipi: once the present PR merges, I will cherry-pick it on top of release-v1.10 in a new PR and assign it to you.

In response to this:

/cherry-pick release-v1.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@pierDipi
Copy link
Member Author

/cherry-pick release-v1.11

@openshift-cherrypick-robot

@pierDipi: once the present PR merges, I will cherry-pick it on top of release-v1.11 in a new PR and assign it to you.

In response to this:

/cherry-pick release-v1.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Copy link
Member

@matzew matzew left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm
/approve

@openshift-ci
Copy link

openshift-ci bot commented Aug 22, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: matzew, pierDipi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-merge-robot openshift-merge-robot merged commit 882a6a2 into openshift-knative:release-v1.9 Aug 22, 2023
@openshift-cherrypick-robot

@pierDipi: #790 failed to apply on top of branch "release-v1.10":

Applying: Expose init offset and schedule metrics for ConsumerGroup reconciler
Using index info to reconstruct a base tree...
M	go.mod
Falling back to patching base and 3-way merge...
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Expose init offset and schedule metrics for ConsumerGroup reconciler
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-v1.10

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-cherrypick-robot

@pierDipi: #790 failed to apply on top of branch "release-v1.11":

Applying: Expose init offset and schedule metrics for ConsumerGroup reconciler
Using index info to reconstruct a base tree...
M	control-plane/pkg/reconciler/consumergroup/consumergroup_test.go
M	go.mod
Falling back to patching base and 3-way merge...
Auto-merging go.mod
CONFLICT (content): Merge conflict in go.mod
Auto-merging control-plane/pkg/reconciler/consumergroup/consumergroup_test.go
error: Failed to merge in the changes.
hint: Use 'git am --show-current-patch=diff' to see the failed patch
Patch failed at 0001 Expose init offset and schedule metrics for ConsumerGroup reconciler
When you have resolved this problem, run "git am --continue".
If you prefer to skip this patch, run "git am --skip" instead.
To restore the original branch and stop patching, run "git am --abort".

In response to this:

/cherry-pick release-v1.11

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

openshift-merge-robot pushed a commit that referenced this pull request Aug 22, 2023
…790) (#791)

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
openshift-merge-robot pushed a commit that referenced this pull request Sep 20, 2023
* Make SecretSpec field of consumers Auth omitempty (#780)

* Expose init offset and schedule metrics for ConsumerGroup reconciler (#790) (#791)

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Fix channel finalizer logic (knative-extensions#3295) (#795)

Signed-off-by: Calum Murray <cmurray@redhat.com>
Co-authored-by: Calum Murray <cmurray@redhat.com>
Co-authored-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* [release-v1.10] SRVKE-958: Cache init offsets results (#817)

* Cache init offsets results

When there is high load and multiple consumer group schedule calls,
we get many `dial tcp 10.130.4.8:9092: i/o timeout` errors when
trying to connect to Kafka.
This leads to increased "time to readiness" for consumer groups.

The downside of caching is that, in the case, partitions
increase while the result is cached we won't initialize
the offsets of the new partitions.

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Add autoscaler leader log patch

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>

---------

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Co-authored-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Scheduler handle overcommitted pods (#820)

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Co-authored-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Set consumer and consumergroups finalizers when creating them (#823)

It is possible that a delete consumer or consumergroup might
be reconciled and never finalized when it is deleted before
the finalizer is set.
This happens because the Knative generated reconciler uses
patch (as opposed to using update) for setting the finalizer
and patch doesn't have any optimistic concurrency controls.

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Co-authored-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Clean up reserved from resources that have been scheduled (#830)

In a recent testing run, we've noticed we have have a scheduled
`ConsumerGroup` [1] (see placements) being considered having
reserved replicas in a different pod [2].

That makes the scheduler think that there is no space but the
autoscaler says we have enough space to hold every virtual
replica.

[1]
```
$ k describe consumergroups -n ks-multi-ksvc-0 c9ee3490-5b4b-4d11-87af-8cb2219d9fe3
Name:         c9ee3490-5b4b-4d11-87af-8cb2219d9fe3
Namespace:    ks-multi-ksvc-0
...
Status:
  Conditions:
    Last Transition Time:  2023-09-06T19:58:27Z
    Reason:                Autoscaler is disabled
    Status:                True
    Type:                  Autoscaler
    Last Transition Time:  2023-09-06T21:41:13Z
    Status:                True
    Type:                  Consumers
    Last Transition Time:  2023-09-06T19:58:27Z
    Status:                True
    Type:                  ConsumersScheduled
    Last Transition Time:  2023-09-06T21:41:13Z
    Status:                True
    Type:                  Ready
  Observed Generation:     1
  Placements:
    Pod Name:      kafka-source-dispatcher-6
    Vreplicas:     4
    Pod Name:      kafka-source-dispatcher-7
    Vreplicas:     4
  Replicas:        8
  Subscriber Uri:  http://receiver5-2.ks-multi-ksvc-0.svc.cluster.local
Events:            <none>
```

[2]
```
    "ks-multi-ksvc-0/c9ee3490-5b4b-4d11-87af-8cb2219d9fe3": {
      "kafka-source-dispatcher-3": 8
    },
```

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Co-authored-by: Pierangelo Di Pilato <pierdipi@redhat.com>

* Ignore unknown fields in data plane contract (knative-extensions#3335) (#828)

Signed-off-by: Calum Murray <cmurray@redhat.com>

---------

Signed-off-by: Pierangelo Di Pilato <pierdipi@redhat.com>
Signed-off-by: Calum Murray <cmurray@redhat.com>
Co-authored-by: Martin Gencur <mgencur@redhat.com>
Co-authored-by: Matthias Wessendorf <mwessend@redhat.com>
Co-authored-by: Calum Murray <cmurray@redhat.com>
Co-authored-by: OpenShift Cherrypick Robot <openshift-cherrypick-robot@redhat.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants