Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

importccl: the userfile table name generation algorithm breaks with special usernames, and confused client-side and server-side usernames #55389

Closed
knz opened this issue Oct 9, 2020 · 4 comments · Fixed by #56046
Assignees
Labels
A-disaster-recovery A-security C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.

Comments

@knz
Copy link
Contributor

knz commented Oct 9, 2020

This code:

 qualifiedTableName = DefaultQualifiedNamePrefix + user

(external_storage.go)

breaks when user starts with digits, or when it contains periods. Username consisting of only digits, or containing periods, have been allowed since crdb v19.2.

The code should instead quote the username properly.

@knz knz added C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption. A-disaster-recovery labels Oct 9, 2020
@knz knz self-assigned this Oct 9, 2020
@knz
Copy link
Contributor Author

knz commented Oct 10, 2020

Additionally, the userfile CLI commands takes the username in the connection URL, and uses it to assemble a destination file name in the COPY IN statement in SQL.

This is conceptually mistaken, because the client-side username may not be the same as the server-side username.

The code should be modified to extract the server-side username using the session_user session variable.

@knz knz added the A-security label Oct 10, 2020
@knz knz changed the title importccl: the userfile table name generation algorithm breaks with special usernames importccl: the userfile table name generation algorithm breaks with special usernames, and confused client-side and server-side usernames Oct 10, 2020
knz added a commit to knz/cockroach that referenced this issue Oct 10, 2020
tldr: the conversions between "external" strings and internal
usernames was unprincipled, and it turns out, incorrect in some
cases. This patch cleans this up by introducing a strict conversion
API.

**Background**

CockroachDB currently performs case folding and unicode NFC
normalization upon receiving usernames, specifically usernames received
from as SQL login principals.

Internally, usernames are often—but not always—considered
pre-normalized for the purpose of comparisons, privilege checks, role
membership checks and the like.

Finally, sometimes usernames are reported "back to outside".
In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`.
It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string
  to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that
  its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without
  decorations. These corresponds to the raw string (after
  normalization).

- `username.SQLIdentifier()` produces the username in valid
  SQL syntax, so that it can be injected safely in a SQL
  statement.

- `(*tree.FmtCtx).FormatUsername()` takes a username and properly
  handles quoting and anonymization, like `FormatName()` does for
  `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new
APIs `username.EncodeProto()` and `usernameproto.Decode()`.

**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not
  consistently*:

  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was
     not changed.
  2. they were not consistently converted in `cockroach cert`. This was
     corrected.
  3. they were not in the `cockroach userfile` commands. This
     has been adjusted with a reference to issue cockroachdb#55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change
     this behavior, but highlights it by spelling out
     `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not
     changed.
  7. they were not in the argument of the other API requests that
     allow usernames, for example `ListSessions` or `CancelQuery`.
     This was not changed, but is now documented in the API.

- the usernames "inside" were incorrectly directly injected
  in SQL statements, even though they may contain special
  characters that create SQL syntax errors.

  This has been corrected by suitable uses of the new
  `SQLIdentifier()` method.

- There was an outright bug in a call to `CreateGCJobRec` (something
  about GCing jobs), where a `Description` field was passed in lieu
  of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive"
username/string conversion occurs. After this patch, we find the
following uses:

- `MakeSQLUsernameFromUserInput`:

  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar

- `MakeSQLUsernameFromPreNormalizedString`:

  - when intepreting the username in API query parameters, for
    those API documented as using pre-normalized usernames.
  - role membership checks inside SQL based on data read
    from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)

Release note: None
@mwang1026 mwang1026 assigned adityamaru and unassigned miretskiy Oct 13, 2020
@knz
Copy link
Contributor Author

knz commented Oct 20, 2020

Had some exchange with @dt on this:

@knz the logic creates invalid table names / identifiers given some input usernames. Suppose I was a user with such a special username. Can you enumerate exactly what would break for me? What would I not be able to do?

@dt I think you’d not be able to use userfile without supplying the optional table name argument. We default to a username-based table name but you can specify any table if you want.

@knz what is the lifetime of "userfile" data? Is it transient? Does it last for more than a single SQL statement or job?

@dt Yes, we do not use temp tables or anything with a lifetime tied to a session or anything, since we upload in a separate session and jobs survive eg restarts. Currently only the user chooses when / if to delete userfile data, but 21.1 will include a cockroach import path-to-file.sql which would automate a) calling upload b) sending the IMPORT stmt of the uploaded file c) rm’ing the uploaded file.

@knz the previous question is really a proxy for this one: if I were to change the name generation algorithm, do I need to care about cross-version compatibility?

@dt Yeah, it would be nice if no-name-provided file uploads were still findable at the default name in a future version but if we have to break it for everyone and go to a totally new name format, I wouldn’t be devastated

@knz is the name embedded inside the job payload? so that if we change the gen algo, all current and past jobs are still attached to the right name?

@dt Yeah, gen algo should be nearly entirely at cli -- my mental model is that it is basically the default value for a cli flag but that substitution should happen very early and use the actual value after that

@knz
Copy link
Contributor Author

knz commented Oct 20, 2020

Ok so a good next step here would be:

  1. factor all the code paths that create a table name from a username inside a helper function (I counted at least 4 occurrences, there may be more?)

  2. define a new algorithm that works OK for any username, but does a best effort at preserving compatibility with the current one.

I am thinking about an algorithm that looks like this:

  1. compose "userfile_" + username.
  2. is the result a valid unquoted SQL identifier and it is normalized already? If yes, then keep it.
  3. otherwise, compose "userfilex_" + fmt.Sprintf("%x", username) and use that

This would escape the entire username to hexadecimal if not a lowercase normalized identifier already.

Examples:

Username Userfile table name
root userfile_root
ro.ot userfilex_726f6f74
726f6f74 userfile_726f6f74

Regardless of the algorithm used, the end result must ensure that:

  • if 2 usernames are different, their mapping to a userfile table are also different.
  • the set of allowable characters in usernames can grow/evolve without requiring changes to the bulk i/o code

The hexification of the username combined with a x_ guarantee this. Other schemes may also be possible.

@adityamaru
Copy link
Contributor

Thanks for this @knz, I haven't been able to get to this yet but I'm clear on the behaviour we expect now. I'll write up a PR and add you to it!

knz added a commit to knz/cockroach that referenced this issue Oct 22, 2020
tldr: the conversions between "external" strings and internal
usernames was unprincipled, and it turns out, incorrect in some
cases. This patch cleans this up by introducing a strict conversion
API.

**Background**

CockroachDB currently performs case folding and unicode NFC
normalization upon receiving usernames, specifically usernames received
from as SQL login principals.

Internally, usernames are often—but not always—considered
pre-normalized for the purpose of comparisons, privilege checks, role
membership checks and the like.

Finally, sometimes usernames are reported "back to outside".
In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`.
It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string
  to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that
  its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without
  decorations. These corresponds to the raw string (after
  normalization).

- `username.SQLIdentifier()` produces the username in valid
  SQL syntax, so that it can be injected safely in a SQL
  statement.

- `(*tree.FmtCtx).FormatUsername()` takes a username and properly
  handles quoting and anonymization, like `FormatName()` does for
  `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new
APIs `username.EncodeProto()` and `usernameproto.Decode()`.

**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not
  consistently*:

  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was
     not changed.
  2. they were not consistently converted in `cockroach cert`. This was
     corrected.
  3. they were not in the `cockroach userfile` commands. This
     has been adjusted with a reference to issue cockroachdb#55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change
     this behavior, but highlights it by spelling out
     `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not
     changed.
  7. they were not in the argument of the other API requests that
     allow usernames, for example `ListSessions` or `CancelQuery`.
     This was not changed, but is now documented in the API.

- the usernames "inside" were incorrectly directly injected
  in SQL statements, even though they may contain special
  characters that create SQL syntax errors.

  This has been corrected by suitable uses of the new
  `SQLIdentifier()` method.

- There was an outright bug in a call to `CreateGCJobRec` (something
  about GCing jobs), where a `Description` field was passed in lieu
  of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive"
username/string conversion occurs. After this patch, we find the
following uses:

- `MakeSQLUsernameFromUserInput`:

  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar

- `MakeSQLUsernameFromPreNormalizedString`:

  - when intepreting the username in API query parameters, for
    those API documented as using pre-normalized usernames.
  - role membership checks inside SQL based on data read
    from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)

Release note: None
knz added a commit to knz/cockroach that referenced this issue Oct 24, 2020
tldr: the conversions between "external" strings and internal
usernames was unprincipled, and it turns out, incorrect in some
cases. This patch cleans this up by introducing a strict conversion
API.

**Background**

CockroachDB currently performs case folding and unicode NFC
normalization upon receiving usernames, specifically usernames received
from as SQL login principals.

Internally, usernames are often—but not always—considered
pre-normalized for the purpose of comparisons, privilege checks, role
membership checks and the like.

Finally, sometimes usernames are reported "back to outside".
In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`.
It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string
  to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that
  its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without
  decorations. These corresponds to the raw string (after
  normalization).

- `username.SQLIdentifier()` produces the username in valid
  SQL syntax, so that it can be injected safely in a SQL
  statement.

- `(*tree.FmtCtx).FormatUsername()` takes a username and properly
  handles quoting and anonymization, like `FormatName()` does for
  `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new
APIs `username.EncodeProto()` and `usernameproto.Decode()`.

**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not
  consistently*:

  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was
     not changed.
  2. they were not consistently converted in `cockroach cert`. This was
     corrected.
  3. they were not in the `cockroach userfile` commands. This
     has been adjusted with a reference to issue cockroachdb#55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change
     this behavior, but highlights it by spelling out
     `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not
     changed.
  7. they were not in the argument of the other API requests that
     allow usernames, for example `ListSessions` or `CancelQuery`.
     This was not changed, but is now documented in the API.

- the usernames "inside" were incorrectly directly injected
  in SQL statements, even though they may contain special
  characters that create SQL syntax errors.

  This has been corrected by suitable uses of the new
  `SQLIdentifier()` method.

- There was an outright bug in a call to `CreateGCJobRec` (something
  about GCing jobs), where a `Description` field was passed in lieu
  of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive"
username/string conversion occurs. After this patch, we find the
following uses:

- `MakeSQLUsernameFromUserInput`:

  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar

- `MakeSQLUsernameFromPreNormalizedString`:

  - when intepreting the username in API query parameters, for
    those API documented as using pre-normalized usernames.
  - role membership checks inside SQL based on data read
    from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)

Release note: None
knz added a commit to knz/cockroach that referenced this issue Oct 25, 2020
tldr: the conversions between "external" strings and internal
usernames was unprincipled, and it turns out, incorrect in some
cases. This patch cleans this up by introducing a strict conversion
API.

**Background**

CockroachDB currently performs case folding and unicode NFC
normalization upon receiving usernames, specifically usernames received
from as SQL login principals.

Internally, usernames are often—but not always—considered
pre-normalized for the purpose of comparisons, privilege checks, role
membership checks and the like.

Finally, sometimes usernames are reported "back to outside".
In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`.
It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string
  to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that
  its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without
  decorations. These corresponds to the raw string (after
  normalization).

- `username.SQLIdentifier()` produces the username in valid
  SQL syntax, so that it can be injected safely in a SQL
  statement.

- `(*tree.FmtCtx).FormatUsername()` takes a username and properly
  handles quoting and anonymization, like `FormatName()` does for
  `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new
APIs `username.EncodeProto()` and `usernameproto.Decode()`.

**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not
  consistently*:

  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was
     not changed.
  2. they were not consistently converted in `cockroach cert`. This was
     corrected.
  3. they were not in the `cockroach userfile` commands. This
     has been adjusted with a reference to issue cockroachdb#55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change
     this behavior, but highlights it by spelling out
     `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not
     changed.
  7. they were not in the argument of the other API requests that
     allow usernames, for example `ListSessions` or `CancelQuery`.
     This was not changed, but is now documented in the API.

- the usernames "inside" were incorrectly directly injected
  in SQL statements, even though they may contain special
  characters that create SQL syntax errors.

  This has been corrected by suitable uses of the new
  `SQLIdentifier()` method.

- There was an outright bug in a call to `CreateGCJobRec` (something
  about GCing jobs), where a `Description` field was passed in lieu
  of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive"
username/string conversion occurs. After this patch, we find the
following uses:

- `MakeSQLUsernameFromUserInput`:

  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar

- `MakeSQLUsernameFromPreNormalizedString`:

  - when intepreting the username in API query parameters, for
    those API documented as using pre-normalized usernames.
  - role membership checks inside SQL based on data read
    from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)

Release note: None
knz added a commit to knz/cockroach that referenced this issue Oct 27, 2020
tldr: the conversions between "external" strings and internal
usernames was unprincipled, and it turns out, incorrect in some
cases. This patch cleans this up by introducing a strict conversion
API.

**Background**

CockroachDB currently performs case folding and unicode NFC
normalization upon receiving usernames, specifically usernames received
from as SQL login principals.

Internally, usernames are often—but not always—considered
pre-normalized for the purpose of comparisons, privilege checks, role
membership checks and the like.

Finally, sometimes usernames are reported "back to outside".
In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`.
It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string
  to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that
  its argument has already been previously normalized.
- `security.MakeSQLUsernameFromPreNormalizedStringChecked` also checks
  that its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without
  decorations. These corresponds to the raw string (after
  normalization).

- `username.SQLIdentifier()` produces the username in valid
  SQL syntax, so that it can be injected safely in a SQL
  statement.

- `(*tree.FmtCtx).FormatUsername()` takes a username and properly
  handles quoting and anonymization, like `FormatName()` does for
  `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new
APIs `username.EncodeProto()` and `usernameproto.Decode()`.

**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not
  consistently*:

  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was
     not changed.
  2. they were not consistently converted in `cockroach cert`. This was
     corrected.
  3. they were not in the `cockroach userfile` commands. This
     has been adjusted with a reference to issue cockroachdb#55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change
     this behavior, but highlights it by spelling out
     `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not
     changed.
  7. they were not in the argument of the other API requests that
     allow usernames, for example `ListSessions` or `CancelQuery`.
     This was not changed, but is now documented in the API.

- the usernames "inside" were incorrectly directly injected
  in SQL statements, even though they may contain special
  characters that create SQL syntax errors.

  This has been corrected by suitable uses of the new
  `SQLIdentifier()` method.

- There was an outright bug in a call to `CreateGCJobRec` (something
  about GCing jobs), where a `Description` field was passed in lieu
  of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive"
username/string conversion occurs. After this patch, we find the
following uses:

- `MakeSQLUsernameFromUserInput`:

  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar

- `MakeSQLUsernameFromPreNormalizedString`:

  - role membership checks inside SQL based on data read
    from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)

- `MakeSQLUsernameFromPreNormalizedStringChecked`:

  - when intepreting the username in API query parameters, for
    those API documented as using pre-normalized usernames.

Release note: None
craig bot pushed a commit that referenced this issue Oct 27, 2020
55398:  *: make conversions between `string` and usernames more explicit  r=aaron-crl,arulajmani a=knz

Informs (but does not fix) #55396

Informs #54696
Informs  #55389

tldr: the conversions between "external" strings and internal
usernames was unprincipled, and it turns out, incorrect in some
cases. This patch cleans this up by introducing a strict conversion
API.

**Background**

CockroachDB currently performs case folding and unicode NFC
normalization upon receiving usernames, specifically usernames received
from as SQL login principals.

Internally, usernames are often—but not always—considered
pre-normalized for the purpose of comparisons, privilege checks, role
membership checks and the like.

Finally, sometimes usernames are reported "back to outside".
In error messages, log files etc, but also:

- in the SQL syntax produced by SHOW CREATE, SHOW SYNTAX etc.
- to generate OS-level file names for exported files.

**New API**

This patch introduces a new data type `security.SQLUsername`.
It is incomparable and non-convertible with the Go `string`.

Engineers must now declare intent about how to do the conversion:

- `security.MakeSQLUsernameFromUserInput` converts an "outside" string
  to a username.
- `security.MakeSQLUsernameFromPreNormalizedString` promises that
  its argument has already been previously normalized.
- `security.MakeSQLUsernameFromPreNormalizedStringChecked` also checks
  that its argument has already been previously normalized.

To output usernames, the following APIs are also available.

- `username.Normalized()` produces the username itself, without
  decorations. These corresponds to the raw string (after
  normalization).

- `username.SQLIdentifier()` produces the username in valid
  SQL syntax, so that it can be injected safely in a SQL
  statement.

- `(*tree.FmtCtx).FormatUsername()` takes a username and properly
  handles quoting and anonymization, like `FormatName()` does for
  `tree.Name` already.

Likewise, conversion from/to protobuf is now regulated, via the new
APIs `username.EncodeProto()` and `usernameproto.Decode()`.

**Problems being solved**

- the usernames "from outside" were normalized sometimes, *but not
  consistently*:

  1. they were in the arguments of CREATE/DROP/ALTER ROLE. This was
     not changed.
  2. they were not consistently converted in `cockroach cert`. This was
     corrected.
  3. they were not in the `cockroach userfile` commands. This
     has been adjusted with a reference to issue #55389.
  4. they are *not* in GRANT/REVOKE. This patch does not change
     this behavior, but highlights it by spelling out
     `MakeSQLUsernameFromPreNormalizedString()` in the implementation.
  5. ditto for CREATE SCHEMA ... AUTHORIZATION and ALTER ... OWNER TO
  6. they were in the argument to `LoginRequest`. This was not
     changed.
  7. they were not in the argument of the other API requests that
     allow usernames, for example `ListSessions` or `CancelQuery`.
     This was not changed, but is now documented in the API.

- the usernames "inside" were incorrectly directly injected
  in SQL statements, even though they may contain special
  characters that create SQL syntax errors.

  This has been corrected by suitable uses of the new
  `SQLIdentifier()` method.

- There was an outright bug in a call to `CreateGCJobRec` (something
  about GCing jobs), where a `Description` field was passed in lieu
  of a username for a `User` field. The implications of this are unclear.

**Status after this change**

The new API makes it possible to audit exactly where "sensitive"
username/string conversion occurs. After this patch, we find the
following uses:

- `MakeSQLUsernameFromUserInput`:

  - pgwire user auth
  - CLI URL parsing
  - `cockroach userfile`
  - `cockroach cert`
  - `(*rpc.SecurityContext).PGURL()` (unsure whether that's a good thing)
  - CREATE/DROP/ALTER ROLE
  - when using literal strings as `role_spec` in the SQL grammar

- `MakeSQLUsernameFromPreNormalizedString`:

  - role membership checks inside SQL based on data read
    from `system` tables.
  - in GRANT/REVOKE (this is surprising, see above)

- `MakeSQLUsernameFromPreNormalizedStringChecked`:

  - when intepreting the username in API query parameters, for
    those API documented as using pre-normalized usernames.

Release note: None

55986: opt: use paired joins for left semi inverted joins r=sumeerbhola a=sumeerbhola

Semi joins using the inverted index used to be converted
to an inner inverted join followed by an inner lookup join
and a distinct for de-duplication. They are now converted
to paired-joins consisting of an inner inverted join
followed by a left semi lookup join, which is more
efficient.

Release note: None

55997: sql: fix object lookup for ALTER TABLE ... RENAME TO ... r=ajwerner a=jayshrivastava

Previously, ALTER TABLE schema.table RENAME TO schema.existing_table
would fail because the statement would look up the destination table in
the public schema only. This commit fixes this behavior by using
the correct destination schema.

Fixes #55985

Release note: None


Co-authored-by: Raphael 'kena' Poss <knz@thaumogen.net>
Co-authored-by: sumeerbhola <sumeer@cockroachlabs.com>
Co-authored-by: Jayant Shrivastava <jayants@cockroachlabs.com>
adityamaru added a commit to adityamaru/cockroach that referenced this issue Oct 28, 2020
Previously, userfile would break if a user had a username with special
characters, which is otherwise supported by CRDB.
This is because, unless specified, userfile uses the username to
generate the underlying storage table names.

This change introduces a new name generation scheme which accounts for
all usernames supported by the database. The details are explained in:
cockroachdb#55389 (comment)

Fixes: cockroachdb#55389

Release note: None
adityamaru added a commit to adityamaru/cockroach that referenced this issue Oct 28, 2020
Previously, userfile would break if a user had a username with special
characters, which is otherwise supported by CRDB.
This is because, unless specified, userfile uses the username to
generate the underlying storage table names.

This change introduces a new name generation scheme which accounts for
all usernames supported by the database. The details are explained in:
cockroachdb#55389 (comment)

Fixes: cockroachdb#55389

Release note: None
craig bot pushed a commit that referenced this issue Oct 29, 2020
56046: cli: fix username semantics for userfile r=knz a=adityamaru

Previously, userfile would break if a user had a username with special
characters, which is otherwise supported by CRDB.
This is because, unless specified, userfile uses the username to
generate the underlying storage table names.

This change introduces a new name generation scheme which accounts for
all usernames supported by the database. The details are explained in:
#55389 (comment)

Fixes: #55389

Release note: None

Co-authored-by: Aditya Maru <adityamaru@gmail.com>
@craig craig bot closed this as completed in e2130f7 Oct 29, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-disaster-recovery A-security C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. S-3-ux-surprise Issue leaves users wondering whether CRDB is behaving properly. Likely to hurt reputation/adoption.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants