*: avoid notify privilege update for all users #57042

Open
wants to merge 34 commits into
base: master

tiancaiamao
Contributor

@tiancaiamao tiancaiamao commented Oct 31, 2024

What problem does this PR solve?

Issue Number: ref #55563

Problem Summary:

In the previous commit, I maintained the active user list; this commit intends to fix the notification part.
When privileges change, notify only the changed users and update data for them, instead of for all users.
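
As an illustration of the idea in the summary, here is a minimal sketch of a notification payload that carries the changed users, and of a consumer that reloads only the users active on its node. The `All` and `UserList` field names appear in the review thread; the type name `privilegeEvent`, the JSON tags, and the helpers `encodeEvent` and `usersToReload` are hypothetical and are not taken from the PR's code.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
)

// privilegeEvent is a hypothetical payload. Only the All and UserList
// field names come from the review thread; the JSON shape is assumed.
type privilegeEvent struct {
	All      bool     `json:"all,omitempty"`
	UserList []string `json:"user_list,omitempty"`
}

// encodeEvent builds the payload a node would publish after changing
// privileges for the given users. An empty list means "reload everyone".
func encodeEvent(changed []string) ([]byte, error) {
	ev := privilegeEvent{UserList: changed}
	if len(changed) == 0 {
		ev.All = true
	}
	return json.Marshal(ev)
}

// usersToReload returns, in sorted order, the users this node should
// refresh: every active user when All is set, otherwise only the changed
// users that are active here.
func usersToReload(ev privilegeEvent, active map[string]bool) []string {
	var out []string
	if ev.All {
		for u := range active {
			out = append(out, u)
		}
	} else {
		for _, u := range ev.UserList {
			if active[u] {
				out = append(out, u)
			}
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	payload, err := encodeEvent([]string{"alice", "carol"})
	if err != nil {
		panic(err)
	}
	var ev privilegeEvent
	if err := json.Unmarshal(payload, &ev); err != nil {
		panic(err)
	}
	// carol changed too, but she is not active on this node.
	active := map[string]bool{"alice": true, "bob": true}
	fmt.Println(string(payload), usersToReload(ev, active))
}
```

In this sketch a node with many active users touches only the intersection of "changed" and "active", which is the saving the summary describes.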

What changed and how does it work?

The code changes include:

  • the privilege-update notification now encodes the list of changed users into the etcd message
  • the domain privilege loop decodes the user list and updates only the active users among them
  • fix a bug in the privilege handle merge operation: the old "append diff + sort + dedup" approach cannot handle a revoke privilege operation
  • the roles of a user are also considered privilege data, so ensureActiveUser() needs to load them
  • ensureActiveUser() should load the privilege data of the user, the roles of the user, plus the data of those roles recursively
  • ensureActiveUser() is now called more widely to make CI pass (we had better optimize this later)
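
To make the merge bug in the list above concrete, here is a small sketch of why "append diff + sort + dedup" cannot express a revoke, next to one possible alternative that replaces all rows of the reloaded users. The `record` type and both functions are hypothetical illustrations; the PR's actual data structures and its actual fix are not shown on this page and may differ.

```go
package main

import (
	"fmt"
	"sort"
)

// record is a hypothetical stand-in for one cached privilege row.
type record struct {
	User string
	Priv string
}

func sortRecords(rs []record) {
	sort.Slice(rs, func(i, j int) bool {
		if rs[i].User != rs[j].User {
			return rs[i].User < rs[j].User
		}
		return rs[i].Priv < rs[j].Priv
	})
}

// appendSortDedup mimics the old merge: append the reloaded rows, sort,
// and drop exact duplicates. Rows that no longer exist are never removed.
func appendSortDedup(old, diff []record) []record {
	merged := append(append([]record{}, old...), diff...)
	sortRecords(merged)
	out := make([]record, 0, len(merged))
	for _, r := range merged {
		if len(out) == 0 || out[len(out)-1] != r {
			out = append(out, r)
		}
	}
	return out
}

// replaceUsers (an assumed alternative) drops every cached row of the
// reloaded users first, then adds the freshly loaded rows, so revoked
// rows disappear from the result.
func replaceUsers(old, fresh []record, reloaded []string) []record {
	drop := make(map[string]bool, len(reloaded))
	for _, u := range reloaded {
		drop[u] = true
	}
	out := make([]record, 0, len(old)+len(fresh))
	for _, r := range old {
		if !drop[r.User] {
			out = append(out, r)
		}
	}
	out = append(out, fresh...)
	sortRecords(out)
	return out
}

func main() {
	cached := []record{{"alice", "INSERT"}, {"alice", "SELECT"}, {"bob", "SELECT"}}
	// After INSERT is revoked from alice, reloading alice yields one row.
	fresh := []record{{"alice", "SELECT"}}

	fmt.Println(len(appendSortDedup(cached, fresh)))            // stale INSERT row survives
	fmt.Println(len(replaceUsers(cached, fresh, []string{"alice"}))) // stale row is gone
}
```

The append-based merge can only ever grow or keep the cache, which is why a revoke needs a merge that knows which users were reloaded.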

Check List

Tests

  • Unit test

  • Integration test

  • Manual test (add detailed scripts or steps below)

  • No need to test

    • I checked and no code files have been changed.

Side effects

  • Performance regression: Consumes more CPU
  • Performance regression: Consumes more Memory
  • Breaking backward compatibility

Documentation

  • Affects user behaviors
  • Contains syntax changes
  • Contains variable changes
  • Contains experimental features
  • Changes MySQL compatibility

Release note

Please refer to Release Notes Language Style Guide to write a quality release note.

None

@ti-chi-bot ti-chi-bot bot added release-note-none Denotes a PR that doesn't merit a release note. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Oct 31, 2024

tiprow bot commented Oct 31, 2024

Hi @tiancaiamao. Thanks for your PR.

PRs from untrusted users cannot be marked as trusted with /ok-to-test in this repo meaning untrusted PR authors can never trigger tests themselves. Collaborators can still trigger tests on the PR using /test all.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 75.27675% with 67 lines in your changes missing coverage. Please review.

Project coverage is 74.8968%. Comparing base (ad5ca42) to head (99cd464).
Report is 27 commits behind head on master.

Additional details and impacted files
@@               Coverage Diff                @@
##             master     #57042        +/-   ##
================================================
+ Coverage   72.8329%   74.8968%   +2.0639%     
================================================
  Files          1676       1721        +45     
  Lines        463539     482927     +19388     
================================================
+ Hits         337609     361697     +24088     
+ Misses       105125      98841      -6284     
- Partials      20805      22389      +1584     
Flag          Coverage Δ
integration   49.2693% <72.3247%> (?)
unit          72.1978% <73.4317%> (-0.0215%) ⬇️

Components    Coverage Δ
dumpling      52.7673% <ø> (ø)
parser        ∅ <ø> (∅)
br            60.9041% <0.0000%> (+15.5474%) ⬆️

@ti-chi-bot ti-chi-bot bot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 4, 2024
@tiancaiamao
Contributor Author

/retest

tiprow bot commented Nov 5, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

@tiancaiamao
Contributor Author

/test check-dev2

tiprow bot commented Nov 7, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test check-dev2

@tiancaiamao
Contributor Author

/test pull-br-integration-test

tiprow bot commented Nov 8, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test pull-br-integration-test

@tiancaiamao
Contributor Author

/retest

tiprow bot commented Nov 20, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

pkg/domain/domain.go (outdated, resolved)
pkg/domain/domain.go (resolved)
pkg/domain/domain.go (resolved)
@tiancaiamao
Contributor Author

/retest

tiprow bot commented Nov 21, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/retest

@tiancaiamao
Contributor Author

/test unit-test

tiprow bot commented Nov 21, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test unit-test

Contributor

@D3Hunter D3Hunter left a comment

rest lgtm

Comment on lines +1818 to +1831
        if len(val) > 0 {
            err := json.Unmarshal(val, &msg)
            if err == nil {
                break
            }
            logutil.BgLogger().Warn("decodePrivilegeEvent unmarshal fail", zap.Error(err))
        }
    }
}
// In case something is wrong, for example, old version tidb mixed with newer, the unmarshal would fail.
// Then we fallback to the old way: reload all the users.
if len(msg.UserList) == 0 {
    msg.All = true
}
Contributor

Suggested change
        if len(val) > 0 {
            err := json.Unmarshal(val, &msg)
            if err == nil {
                break
            }
            logutil.BgLogger().Warn("decodePrivilegeEvent unmarshal fail", zap.Error(err))
        } else {
            // In case old version triggers the event, the event value is empty,
            // Then we fall back to the old way: reload all the users.
            if len(msg.UserList) == 0 {
                msg.All = true
            }
        }
    }
}

Contributor Author

This differs from the old logic. @D3Hunter
What happens if something goes wrong? What if val does have data, but the data is not what we expected?
Then neither does UserList contain data nor is the All flag set.
Our intended behavior is to reload all users on error so that such cases are tolerated, but the suggested change ignores the event on error.

Contributor

> What happens if something goes wrong? What if val does have data, but the data is not what we expected?

If no third party is involved, an older TiDB version writes an empty value and a new version writes valid JSON, so when would the data be invalid? If it's caused by a bug, then let's fix it; otherwise I think the event consumer should do what the producer tells it to do. Maybe you can add an intest.Assert to check that either All=true or len(userList) > 0; the producer side should make sure no such invalid data is sent to etcd.

And as you have mentioned, the event is not that critical, so I think it's acceptable.
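
The producer-side invariant suggested above could look roughly like the following sketch. It uses a plain error return instead of TiDB's `intest.Assert` so that it stays self-contained; `privilegeEvent`, `validateEvent`, and `errEmptyEvent` are hypothetical names, not identifiers from the PR.

```go
package main

import (
	"errors"
	"fmt"
)

// privilegeEvent is a hypothetical payload type; only the All and
// UserList field names come from the review thread.
type privilegeEvent struct {
	All      bool
	UserList []string
}

var errEmptyEvent = errors.New("privilege event sets neither All nor a user list")

// validateEvent checks the invariant discussed in the review: before the
// producer publishes an event, either All is true or UserList is non-empty.
func validateEvent(ev privilegeEvent) error {
	if ev.All || len(ev.UserList) > 0 {
		return nil
	}
	return errEmptyEvent
}

func main() {
	fmt.Println(validateEvent(privilegeEvent{All: true}))
	fmt.Println(validateEvent(privilegeEvent{UserList: []string{"alice"}}))
	fmt.Println(validateEvent(privilegeEvent{}))
}
```

Checking on the producer side means a consumer that receives a well-formed message can simply do what it is told.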

Collaborator

@lcwangchao lcwangchao Nov 22, 2024

> What happens if something goes wrong? What if val does have data, but the data is not what we expected?
>
> If no third party is involved, an older TiDB version writes an empty value and a new version writes valid JSON, so when would the data be invalid? If it's caused by a bug, then let's fix it; otherwise I think the event consumer should do what the producer tells it to do. Maybe you can add an intest.Assert to check that either All=true or len(userList) > 0; the producer side should make sure no such invalid data is sent to etcd.
>
> And as you have mentioned, the event is not that critical, so I think it's acceptable.

I agree that we should ignore the message and skip updating the privilege cache when decoding the JSON fails. We only need to handle the case val == "" to stay compatible with old versions (please add some clear comments to explain why we have this logic). And because we have a loop that guarantees the cache is updated within a certain time, it is not a big deal.

We can also log a warning when we encounter invalid JSON.
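
The behavior the reviewers converge on can be sketched as follows: an empty value (written by an older version) means "reload everyone", invalid JSON is logged and skipped, and a valid message is followed as sent. This is a standalone illustration, not the PR's code: `privilegeEvent`, its JSON tags, and `decodeEvent` are hypothetical, and the standard `log` package stands in for TiDB's logger.

```go
package main

import (
	"encoding/json"
	"fmt"
	"log"
)

// privilegeEvent is a hypothetical payload type; the JSON tags are assumed.
type privilegeEvent struct {
	All      bool     `json:"all,omitempty"`
	UserList []string `json:"user_list,omitempty"`
}

// decodeEvent returns the decoded event and whether the consumer should
// act on it.
func decodeEvent(val []byte) (privilegeEvent, bool) {
	if len(val) == 0 {
		// Compatibility: an older version publishes an empty value,
		// so fall back to reloading all users.
		return privilegeEvent{All: true}, true
	}
	var ev privilegeEvent
	if err := json.Unmarshal(val, &ev); err != nil {
		// Invalid JSON: warn and skip. The periodic reload loop mentioned
		// in the review is relied on to refresh the cache later.
		log.Printf("skip privilege event with invalid JSON: %v", err)
		return privilegeEvent{}, false
	}
	return ev, true
}

func main() {
	for _, val := range [][]byte{nil, []byte(`{"user_list":["alice"]}`), []byte("not json")} {
		ev, ok := decodeEvent(val)
		fmt.Printf("%q -> %+v act=%v\n", val, ev, ok)
	}
}
```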

pkg/domain/domain.go (resolved)
pkg/privilege/privileges/cache.go (outdated, resolved)
pkg/privilege/privileges/cache.go (outdated, resolved)
tiancaiamao and others added 2 commits November 21, 2024 17:43
Co-authored-by: D3Hunter <jujj603@gmail.com>
@tiancaiamao
Contributor Author

/test unit-test

tiprow bot commented Nov 21, 2024

@tiancaiamao: Cannot trigger testing until a trusted user reviews the PR and leaves an /ok-to-test message.

In response to this:

/test unit-test

Collaborator

@lcwangchao lcwangchao left a comment

ti-chi-bot bot commented Nov 22, 2024

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: lcwangchao
Once this PR has been reviewed and has the lgtm label, please assign bornchanger, d3hunter for approval, ensuring that each of them provides their approval before proceeding. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ti-chi-bot ti-chi-bot bot added the needs-1-more-lgtm Indicates a PR needs 1 more LGTM. label Nov 22, 2024

ti-chi-bot bot commented Nov 22, 2024

[LGTM Timeline notifier]

Timeline:

  • 2024-11-22 03:22:10.09138556 +0000 UTC m=+174717.711040075: ☑️ agreed by lcwangchao.

Labels
needs-1-more-lgtm Indicates a PR needs 1 more LGTM. release-note-none Denotes a PR that doesn't merit a release note. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.