*: make conversions between string and usernames more explicit #55398
NB: I'm not sure this change needs to be backported; however:
@@ -849,7 +849,7 @@ func (sc *SchemaChanger) rollbackSchemaChange(ctx context.Context, err error) er
// Queue a GC job.
jobRecord := CreateGCJobRecord(
"ROLLBACK OF "+sc.job.Payload().Description,
- sc.job.Payload().Description,
+ sc.job.Payload().UsernameProto.Decode(),
jobspb.SchemaChangeGCDetails{
Tables: []jobspb.SchemaChangeGCDetails_DroppedID{
{
Thanks
dcdbfae
to
a3d6b20
Compare
IMO none of the userfile implications are severe enough to call for a 20.2.0 backport -- worst case you get an error if the default table name picked for your user is non-conforming, right? And then you just try again with a valid table name explicitly specified? That seems slightly annoying but IMO not a release blocker.
Thanks David! (Still interested in the implication for the schema change diff but maybe that's more a question for Andrew)
This looks good modulo @solongordon's feedback. I like pulling these principles into the security package as it's great hygiene and will help with tracking usage as @knz mentioned.
I have a nit around zero length usernames that I'd like others' thoughts on. It shouldn't block this PR but I think it's worth discussing.
I have a nit around zero length usernames that I'd like others' thoughts on. It shouldn't block this PR but I think it's worth discussing.
And it looks like reviewable ate my nit comment. I promise it was better written the first time.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @knz, @miretskiy, and @solongordon)
pkg/security/username.go, line 119 at r1 (raw file):
func MakeSQLUsernameFromUserInput(u string, purpose UsernamePurpose) (res SQLUsername, err error) {
	// Perform case folding and NFC normalization.
	res.u = lex.NormalizeName(u)
Nit: We still allow zero length usernames to be used within the application. Do we need/want to support this going forward?
It looks like NormalizeName will happily return a SQLUsername{""} if provided a zero length string and while we check length during user creation as part of the regex in ValidateForCreation() we don't check it outside of this case.
In addition, Undefined() returns true when provided a zero length SQLUsername, meaning that user1.Undefined() == user2.Undefined() can evaluate true, which unsettles the math student in me.
Do we use/rely on this being able to accommodate a zero length field within the application? Given how many of the string tests we run will evaluate to vacuously true this makes me queasy. Would it be unreasonable to check min/max length in the constructor after normalization to protect against this case if we don't actively use this as a zero length value?
I agree the validation function should check the username is non-empty. I will make this change.
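The concern in this thread can be illustrated with a minimal Go sketch. It is not the code from `pkg/security/username.go`: `makeUsernameFromInput` and `errEmptyUsername` are hypothetical names, and `strings.ToLower` stands in for the real case folding plus NFC normalization so that the example needs only the standard library.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// SQLUsername is a simplified stand-in for security.SQLUsername.
type SQLUsername struct{ u string }

// errEmptyUsername is a hypothetical error value for this sketch.
var errEmptyUsername = errors.New("username is empty")

// makeUsernameFromInput is a hypothetical constructor. It lower-cases the
// input (a stand-in for case folding plus NFC normalization) and then
// rejects the empty string, so an empty username can never be constructed
// through this path.
func makeUsernameFromInput(raw string) (SQLUsername, error) {
	n := strings.ToLower(raw)
	if len(n) == 0 {
		return SQLUsername{}, errEmptyUsername
	}
	return SQLUsername{u: n}, nil
}

// Undefined reports whether the username is the zero value.
func (s SQLUsername) Undefined() bool { return s.u == "" }

func main() {
	if _, err := makeUsernameFromInput(""); err != nil {
		fmt.Println("rejected:", err)
	}
	u, _ := makeUsernameFromInput("Alice")
	fmt.Println(u.u, u.Undefined())
}
```

With the check in the constructor, `Undefined()` can only be true for a zero value that was never built from input, which is the separation the reviewer asks for.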
Change looks good, and thanks for laying the foundations for #54696 (among other issues). I left mostly minor comments and a couple questions (mostly for my understanding). I'll let @solongordon have the final review on this, though.
Reviewed 158 of 231 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @knz, @miretskiy, and @solongordon)
pkg/cli/cli_test.go, line 1392 at r1 (raw file):
// cert create-client ,foo
// ERROR: failed to generate client certificate and key: username ",foo" invalid
// SQLSTATE: 42602
Not sure if these comments have any bearing on the tests, but was this change intentional?
pkg/security/username.go, line 128 at r1 (raw file):
// UsernamePurpose indicates the purpose of the resulting
// SQLUsername in MakeSQLUsernameFromUserInput.
type UsernamePurpose bool
nit: int (and iota below) instead of bool maybe?
pkg/security/username.go, line 180 at r1 (raw file):
// a canonical username. The caller of this promises that the
// argument is pre-normalized.
func MakeSQLUsernameFromPreNormalizedString(u string) SQLUsername {
Can we check the "pre-normalization" explicitly here instead of relying on the caller knowing what they're doing?
pkg/security/username.go, line 219 at r1 (raw file):
// EncodeProto turns a username into its proto representation.
func (u SQLUsername) EncodeProto() SQLUsernameProto { return SQLUsernameProto(u.u) }
super nit: my IDE is complaining about receiver names being different, can we change this to s?
pkg/server/testserver.go, line 71 at r1 (raw file):
// TestUser is a fixed user used in unittests.
// It has valid embedded client certs.
TestUser = security.TestUser
Can we get rid of this completely? I tried replacing all usages of server.TestUser with security.TestUser and didn't run into any dependency errors, so I wonder if we no longer need this.
pkg/server/serverpb/status.proto, line 547 at r1 (raw file):
// Username of the user making this request.
// The caller is responsible to normalize the username
// (= case fold and perform unicode NFC normalization).
Instead of relying on the caller, can we be a bit more prescriptive about this without breaking stuff?
What if we replace the type of this to be SQLUsernameProto instead of string, and enforce that the username is normalized in the EncodeProto and Decode methods?
pkg/sql/grant_revoke.go, line 65 at r1 (raw file):
// REASSIGN / OWNER TO do normalize.
// Related: https://github.com/cockroachdb/cockroach/issues/54696
grantees := make([]security.SQLUsername, len(n.Grantees))
What's the reason to consider not normalizing here? Backwards compat?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @arulajmani, @miretskiy, and @solongordon)
pkg/cli/cli_test.go, line 1392 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Not sure if these comments have any bearing on the tests, but was this change intentional?
yes, the comment is the test (expected test output)
the cert commands are not SQL client commands, so having a SQLSTATE for them is nonsensical
pkg/security/username.go, line 119 at r1 (raw file):
We still allow zero length usernames to be used within the application.
No we don't. Added the check here.
It looks like NormalizeName will happily return a SQLUsername{""} if provided a zero length string and while we check length during user creation as part of the regex in ValidateForCreation() we don't check it outside of this case.
Done
In addition, Undefined() returns true when provided a zero length SQLUsername, meaning that user1.Undefined() == user2.Undefined() can evaluate true, which unsettles the math student in me.
These comparisons do not occur in the source code.
Do we use/rely on this being able to accommodate a zero length field within the application?
No
pkg/security/username.go, line 128 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
nit: int (and iota below) instead of bool maybe?
Nah, as long as there are just 2 values bool just works fine.
pkg/security/username.go, line 180 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Can we check the "pre-normalization" explicitly here instead of relying on the caller knowing what they're doing?
Discussed off the issue: I don't think we want this for performance reasons. The usernames fly over the wire for every SQL query.
pkg/security/username.go, line 219 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
super nit: my IDE is complaining about receiver names being different, can we change this to s?
Done.
pkg/server/testserver.go, line 71 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Can we get rid of this completely? I tried replacing all usages of server.TestUser with security.TestUser and didn't run into any dependency errors, so I wonder if we no longer need this.
Done.
pkg/server/serverpb/status.proto, line 547 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Instead of relying on the caller, can we be a bit more prescriptive about this without breaking stuff?
What if we replace the type of this to be SQLUsernameProto instead of string, and enforce that the username is normalized in the EncodeProto and Decode methods?
Discussed this out of review: I am not keen to force checks in the proto conversion, for performance reasons.
However, I think we can be prescriptive here by adding a check when receiving an API request. Done.
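The trade-off discussed here (no check on the hot path, an explicit check at the API boundary) might look roughly like the following Go sketch. The function names `fromPreNormalized` and `fromPreNormalizedChecked` are hypothetical, and `normalize` only lower-cases as a stand-in for case folding plus NFC normalization.

```go
package main

import (
	"fmt"
	"strings"
)

// SQLUsername is a simplified stand-in for security.SQLUsername.
type SQLUsername struct{ u string }

// normalize is a placeholder for the real normalization (case folding
// plus Unicode NFC). It only lower-cases, to stay within the standard library.
func normalize(s string) string { return strings.ToLower(s) }

// fromPreNormalized trusts the caller's promise and does no work,
// which suits paths that run for every SQL query.
func fromPreNormalized(s string) SQLUsername { return SQLUsername{u: s} }

// fromPreNormalizedChecked verifies the promise and returns an error
// when the input is not already in normalized form. This suits less
// frequent entry points such as API requests.
func fromPreNormalizedChecked(s string) (SQLUsername, error) {
	if normalize(s) != s {
		return SQLUsername{}, fmt.Errorf("username %q is not normalized", s)
	}
	return SQLUsername{u: s}, nil
}

func main() {
	for _, in := range []string{"alice", "Alice"} {
		if u, err := fromPreNormalizedChecked(in); err != nil {
			fmt.Println("rejected:", err)
		} else {
			fmt.Println("accepted:", u.u)
		}
	}
	// The unchecked variant accepts anything, by design.
	fmt.Println("unchecked:", fromPreNormalized("Alice").u)
}
```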
pkg/sql/grant_revoke.go, line 65 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
What's the reason to consider not normalizing here? Backwards compat?
Fixed.
4dde711
to
a74a778
Compare
RFAL
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @arulajmani, @miretskiy, and @solongordon)
a74a778
to
691f128
Compare
Had to split package
9bb4a52
to
1b7dc43
Compare
1b7dc43
to
ca1a0c8
Compare
Could you also update the commit message to include a reference to MakeSQLUsernameFromPreNormalizedStringChecked as well?
Reviewed 6 of 69 files at r2, 53 of 88 files at r3, 11 of 15 files at r4.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @aaron-crl, @arulajmani, @knz, @miretskiy, and @solongordon)
pkg/sql/grant_revoke.go, line 65 at r1 (raw file):
Previously, knz (kena) wrote…
Fixed.
Which layer was changed to fix this, I wasn't fully able to follow what happened here.
Also, could you update the TODO (above and below)
pkg/sql/grant_revoke.go, line 163 at r4 (raw file):
if _, ok := users[grantee]; !ok {
	sqlName := tree.Name(n.grantees[i].Normalized())
	return errors.Errorf("user or role %s does not exist", &sqlName)
nit: you can just use grantee.Normalized() here directly instead of this sqlName stuff
6d88855
to
414f647
Compare
Reviewable status: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @aaron-crl, @arulajmani, @miretskiy, and @solongordon)
pkg/sql/grant_revoke.go, line 65 at r1 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
Which layer was changed to fix this, I wasn't fully able to follow what happened here.
Also, could you update the TODO (above and below)
Sorry I did not change anything here. So the TODO stands.
All in all this PR pushes the ball down the road on this question, which is why the linked issue remains open for now.
pkg/sql/grant_revoke.go, line 163 at r4 (raw file):
Previously, arulajmani (Arul Ajmani) wrote…
nit: you can just use grantee.Normalized() here directly instead of this sqlName stuff
No, that's incorrect: tree.Name has different quoting rules, those of identifiers. For example, the username foo-bar would quote as "foo-bar". I did not want to change this behavior in this PR.
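The difference between the raw normalized string and its identifier-quoted form can be sketched in Go as follows. This is a simplified model, not the `tree.Name` formatter: `quoteIdent` and `bareIdent` are hypothetical names, and the rule used here (quote anything that is not a lowercase bare identifier) is an assumption that ignores reserved keywords.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// bareIdent matches names that need no quoting in this simplified model.
// Assumption: a real SQL formatter also quotes reserved keywords, which
// is omitted here.
var bareIdent = regexp.MustCompile(`^[a-z_][a-z0-9_$]*$`)

// quoteIdent renders name as a SQL identifier: bare when it is safe,
// otherwise wrapped in double quotes with embedded quotes doubled.
func quoteIdent(name string) string {
	if bareIdent.MatchString(name) {
		return name
	}
	return `"` + strings.ReplaceAll(name, `"`, `""`) + `"`
}

func main() {
	for _, u := range []string{"foo", "foo-bar", `we"ird`} {
		fmt.Printf("raw=%s ident=%s\n", u, quoteIdent(u))
	}
}
```

Under these rules `foo-bar` is rendered with surrounding double quotes while `foo` is left bare, which is why swapping one formatter for the other would change the text of the error message.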
done
96dd09d
to
a3ae4e2
Compare
tldr: the conversions between "external" strings and internal usernames were unprincipled and, it turns out, incorrect in some cases. This patch cleans this up by introducing a strict conversion API.

**Background**

CockroachDB currently performs case folding and unicode NFC normalization upon receiving usernames, specifically usernames received as SQL login principals.

Internally, usernames are often, but not always, considered pre-normalized for the purpose of comparisons, privilege checks, role membership checks and the like.

Finally, sometimes usernames are reported "back to outside". In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`. It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that its argument has already been previously normalized.
- `security.MakeSQLUsernameFromPreNormalizedStringChecked` also checks that its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without decorations. This corresponds to the raw string (after normalization).
- `username.SQLIdentifier()` produces the username in valid SQL syntax, so that it can be injected safely in a SQL statement.
- `(*tree.FmtCtx).FormatUsername()` takes a username and properly handles quoting and anonymization, like `FormatName()` does for `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new APIs `username.EncodeProto()` and `usernameproto.Decode()`.
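As a rough illustration of the API shape described above, the following Go sketch wraps the string in a struct so that it cannot be assigned from or converted to a plain `string`, and routes the wire format through a separate proto type. It is a simplified model, not the contents of `pkg/security`: `makeFromUserInput` is a hypothetical name and `strings.ToLower` stands in for case folding plus NFC normalization.

```go
package main

import (
	"fmt"
	"strings"
)

// SQLUsername is a simplified stand-in for security.SQLUsername. Because
// the string lives in an unexported struct field, Go refuses both
// `var u SQLUsername = "alice"` and `string(u)`.
type SQLUsername struct{ u string }

// SQLUsernameProto models the wire representation.
type SQLUsernameProto string

// makeFromUserInput is a hypothetical constructor for "outside" strings.
// It lower-cases as a stand-in for the real normalization.
func makeFromUserInput(s string) SQLUsername {
	return SQLUsername{u: strings.ToLower(s)}
}

// Normalized returns the raw normalized string, without decorations.
func (s SQLUsername) Normalized() string { return s.u }

// EncodeProto turns a username into its proto representation.
func (s SQLUsername) EncodeProto() SQLUsernameProto { return SQLUsernameProto(s.u) }

// Decode turns the proto representation back into a username without
// re-normalizing it.
func (p SQLUsernameProto) Decode() SQLUsername { return SQLUsername{u: string(p)} }

func main() {
	u := makeFromUserInput("Alice")
	wire := u.EncodeProto()
	back := wire.Decode()
	fmt.Println(u.Normalized(), back == u)
}
```

Every crossing between `string` and the username type then goes through a named function, which is what makes the call sites searchable during an audit.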
**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not consistently*:
  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was not changed.
  2. they were not consistently converted in `cockroach cert`. This was corrected.
  3. they were not in the `cockroach userfile` commands. This has been adjusted with a reference to issue cockroachdb#55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change this behavior, but highlights it by spelling out `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not changed.
  7. they were not in the argument of the other API requests that allow usernames, for example `ListSessions` or `CancelQuery`. This was not changed, but is now documented in the API.
- the usernames "inside" were incorrectly directly injected in SQL statements, even though they may contain special characters that create SQL syntax errors. This has been corrected by suitable uses of the new `SQLIdentifier()` method.
- There was an outright bug in a call to `CreateGCJobRecord` (something about GCing jobs), where a `Description` field was passed in lieu of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive" username/string conversion occurs. After this patch, we find the following uses:

- `MakeSQLUsernameFromUserInput`:
  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar
- `MakeSQLUsernameFromPreNormalizedString`:
  - role membership checks inside SQL based on data read from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)
- `MakeSQLUsernameFromPreNormalizedStringChecked`:
  - when interpreting the username in API query parameters, for those API documented as using pre-normalized usernames.

Release note: None
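The GC job bug mentioned in this commit message is the kind of mistake a distinct username type turns into a compile error. The Go sketch below shows the idea under simplified assumptions: `jobRecord` and `newGCJobRecord` are hypothetical stand-ins, not the real `jobs` package API.

```go
package main

import "fmt"

// SQLUsername is a simplified stand-in for security.SQLUsername.
type SQLUsername struct{ u string }

// jobRecord is a hypothetical, reduced job record for this sketch.
type jobRecord struct {
	Description string
	Username    SQLUsername
}

// newGCJobRecord is a hypothetical stand-in for a GC job constructor.
// Because the second parameter is a SQLUsername rather than a string,
// passing a description there no longer type-checks.
func newGCJobRecord(description string, user SQLUsername) jobRecord {
	return jobRecord{Description: description, Username: user}
}

func main() {
	desc := "ALTER TABLE t DROP COLUMN c"
	user := SQLUsername{u: "alice"}

	// newGCJobRecord("ROLLBACK OF "+desc, desc)
	// The line above would not compile: desc is a string, not a SQLUsername.

	rec := newGCJobRecord("ROLLBACK OF "+desc, user)
	fmt.Printf("%s (user %s)\n", rec.Description, rec.Username.u)
}
```

When both parameters were plain strings, the compiler had no way to tell the two arguments apart.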
a3ae4e2
to
1d05979
Compare
bors r=aaron-crl,arulajmani
Build succeeded:
We picked up a dependency on pkg/security within the parser package after cockroachdb#55398. We have to pin go dependencies by hand if they're only present in auto-generated code. It's because it's not otherwise visible to bazel/gazelle when generating the BUILD files (during the analysis phase). Release note: None
56064: storage: Bring back MaxSyncDuration env var, fix disk-stalled roachtest r=itsbilal a=itsbilal

The disk-stalled roachtest relies on the ability to control disk stall detection / fatal intervals, as charybdefs only injects 50ms of delay per syscall. This change adds an env variable, similar to the one removed in #55186 to set max sync duration, except now it governs the default of the cluster setting. The roachtest now modifies that env variable to let disk stall detection trip on short syscall delays.

Fixes #54332.

Release note: None.

56092: parser: pin pkg/security as a bazel dep for pkg/parser r=irfansharif a=irfansharif

We picked up a dependency on pkg/security within the parser package after #55398. We have to pin go dependencies by hand if they're only present in auto-generated code. It's because it's not otherwise visible to bazel/gazelle when generating the BUILD files (during the analysis phase).

Release note: None

56095: partialidx: add benchmarks for two-variable comparisons r=RaduBerinde a=mgartner

Two-variable comparison implication performs similarly to other types of implications.

BenchmarkImplicator/single-exact-match-16                        76.5 ns/op
BenchmarkImplicator/single-inexact-match-16                       342 ns/op
BenchmarkImplicator/range-inexact-match-16                        782 ns/op
BenchmarkImplicator/two-var-comparison-16                         302 ns/op
BenchmarkImplicator/single-exact-match-extra-filters-16           310 ns/op
BenchmarkImplicator/single-inexact-match-extra-filters-16         609 ns/op
BenchmarkImplicator/multi-column-and-exact-match-16              82.4 ns/op
BenchmarkImplicator/multi-column-and-inexact-match-16             722 ns/op
BenchmarkImplicator/multi-column-and-two-var-comparisons-16       611 ns/op
BenchmarkImplicator/multi-column-or-exact-match-16               76.1 ns/op
BenchmarkImplicator/multi-column-or-exact-match-reverse-16        595 ns/op
BenchmarkImplicator/multi-column-or-inexact-match-16             1081 ns/op
BenchmarkImplicator/in-implies-or-16                              976 ns/op
BenchmarkImplicator/and-filters-do-not-imply-pred-16             3710 ns/op
BenchmarkImplicator/or-filters-do-not-imply-pred-16               917 ns/op
BenchmarkImplicator/many-columns-exact-match10-16                 296 ns/op
BenchmarkImplicator/many-columns-inexact-match10-16              6853 ns/op
BenchmarkImplicator/many-columns-exact-match100-16              19817 ns/op
BenchmarkImplicator/many-columns-inexact-match100-16           447894 ns/op

Release note: None

Co-authored-by: Bilal Akhtar <bilal@cockroachlabs.com>
Co-authored-by: irfan sharif <irfanmahmoudsharif@gmail.com>
Co-authored-by: Marcus Gartner <marcus@cockroachlabs.com>
Informs (but does not fix) #55396
Informs #54696
Informs #55389