-
Notifications
You must be signed in to change notification settings - Fork 3.9k
changefeedccl: allow users to alter changefeed options #76583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
changefeedccl: allow users to alter changefeed options #76583
Conversation
3bd57a5
to
36eb164
Compare
36eb164
to
156f625
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 4 of 8 files at r1, 1 of 1 files at r2, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB, @sherman-grewal, and @shermanCRL)
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 88 at r2 (raw file):
targetsChanged := false oldStmt, err := parser.ParseOne(job.Payload().Description)
Ooof... Arguably, I should have picked this earlier (during previous version)... But here we are.
Unfortunately creating changefeed statement node based on payload description is unsafe and
is broken. Basically -- Description contains redacted string representation of the original statement,
and thus cannot be reversed (e.g. you will loose any security related information -- keys, etc). -- changefeedJobDescription
function.
What to do:
- Please add a test (which I expect to fail) where you create changefeed into e.g. s3, and include some of the redactable fields there (
create changefeed ... into 's3://....?AWS_SECRET_ACCESS_KEY=my_secret'
). Verify that
after an alter statement you still see AWS_SECRET_ACCESS_KEY=my_secret and not AWS_SECRET_ACCESS_KEY=redacted. I expect this test to fail. - You can (and probably should) fix this issue as a separate PR on top of existing alter implementation.
- The changefeed statement must be synthesized based on ChangefeedDetails fields. Now, you don't actually have
to regenerate exact statement as it was typed when changefeed was created. In particular, the thing you might have problem with is the target list. You have target id, but the name available in ChangeFeedTarget field might no longer be available -- because the table could have been renamed. So, you can't just assume you can put statement_names in the changefeed statement. It would be nice if we can use tree.AliasedTableExpr in changefeed statement -- but as things are right now, that's not the case (I believe TablePattern is not the same as an alias table expr). Perhaps the changefeed statement can be changed somehow to also allow specification of targets as table expressions? If so, you could generate table expression in the form of[id AS name]
(this is something that select supports... you can doselect * from [30 as namespace]
-- try running this from cockroach demo). Other possible hacks include resolving current names for each descriptor, re-synthesizing changefeed statement using current names, and then, "renaming" them to the original statement_time_name.
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 123 at r2 (raw file):
} if opts.Options != nil {
So, now that we've fixed how the statement is re-generated (above), I have to wonder... Would we be better off if we just take changefeed statement, and append all options there?
As far as I know, appending the same option multiple times overrides the previous occurrences.
If you refactor changefeed statement plan hook so that you have a separate function to compute the job record basically, then you could simply call into that function, and have it perform all of the validation, and return you the "new" job record which you will handle here.
The only thing is that function would probably need to take in a boolean argument indicating if it should update telemetry data (yes, from changefeed, and now from here).
pkg/sql/parser/testdata/alter_changefeed, line 99 at r2 (raw file):
ALTER CHANGEFEED (123) ADD (foo) SET bar = ('baz'), qux = ('quux') DROP (corge) -- fully parenthesized ALTER CHANGEFEED _ ADD foo SET bar = '_', qux = '_' DROP corge -- literals removed ALTER CHANGEFEED 123 ADD _ SET _ = 'baz', _ = 'quux' DROP _ -- identifiers removed
❤️ these tests.
pkg/ccl/changefeedccl/show_changefeed_jobs_test.go, line 450 at r2 (raw file):
t.Fatalf("Expected more rows when querying and none found for query: %s", query) } }
This block is repeated too many times; pull it into helper function?
Suggestion:
if !rowResults.Next() {
err := rowResults.Err()
if err != nil {
t.Fatalf("Error encountered while querying the next row: %v", err)
} else {
t.Fatalf("Expected more rows when querying and none found for query: %s", query)
}
}
156f625
to
e50cb0b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB, @miretskiy, and @shermanCRL)
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 88 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Ooof... Arguably, I should have picked this earlier (during previous version)... But here we are.
Unfortunately creating changefeed statement node based on payload description is unsafe and
is broken. Basically -- Description contains redacted string representation of the original statement,
and thus cannot be reversed (e.g. you will loose any security related information -- keys, etc). --changefeedJobDescription
function.What to do:
- Please add a test (which I expect to fail) where you create changefeed into e.g. s3, and include some of the redactable fields there (
create changefeed ... into 's3://....?AWS_SECRET_ACCESS_KEY=my_secret'
). Verify that
after an alter statement you still see AWS_SECRET_ACCESS_KEY=my_secret and not AWS_SECRET_ACCESS_KEY=redacted. I expect this test to fail.- You can (and probably should) fix this issue as a separate PR on top of existing alter implementation.
- The changefeed statement must be synthesized based on ChangefeedDetails fields. Now, you don't actually have
to regenerate exact statement as it was typed when changefeed was created. In particular, the thing you might have problem with is the target list. You have target id, but the name available in ChangeFeedTarget field might no longer be available -- because the table could have been renamed. So, you can't just assume you can put statement_names in the changefeed statement. It would be nice if we can use tree.AliasedTableExpr in changefeed statement -- but as things are right now, that's not the case (I believe TablePattern is not the same as an alias table expr). Perhaps the changefeed statement can be changed somehow to also allow specification of targets as table expressions? If so, you could generate table expression in the form of[id AS name]
(this is something that select supports... you can doselect * from [30 as namespace]
-- try running this from cockroach demo). Other possible hacks include resolving current names for each descriptor, re-synthesizing changefeed statement using current names, and then, "renaming" them to the original statement_time_name.
As discussed offline, there was a misunderstanding when this comment was published. In particular, @miretskiy thought that we were using the CREATE CHANGEFEED AST node (parsed by the old job description) to update the details of the existing changefeed. However, this is not the case. The AST node is only used to help generate the new job description of the updated changefeed. As a result, the SinkURI of the ChangefeedDetails remains unchanged after the alteration (leaving the un-redacted information), but sensitive information in the SinkURI remains redacted in the job description as desired.
That being said, this comment helped us catch the fact that we use the SinkURI from the AST node to perform validations, but it would be preferred to use the actual SinkURI (from the ChangefeedDetails) to perform validations instead. Thank you @miretskiy for this catch! 🙌
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 123 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
So, now that we've fixed how the statement is re-generated (above), I have to wonder... Would we be better off if we just take changefeed statement, and append all options there?
As far as I know, appending the same option multiple times overrides the previous occurrences.
If you refactor changefeed statement plan hook so that you have a separate function to compute the job record basically, then you could simply call into that function, and have it perform all of the validation, and return you the "new" job record which you will handle here.
The only thing is that function would probably need to take in a boolean argument indicating if it should update telemetry data (yes, from changefeed, and now from here).
See the comment above.
pkg/sql/parser/testdata/alter_changefeed, line 99 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
❤️ these tests.
:)
pkg/ccl/changefeedccl/show_changefeed_jobs_test.go, line 450 at r2 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
This block is repeated too many times; pull it into helper function?
Awesome suggestion, the code looks a lot more clean now. Thanks!! 💯
1271e04
to
2e2ae92
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r1, 1 of 3 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB, @miretskiy, @sherman-grewal, and @shermanCRL)
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 123 at r3 (raw file):
} if opts.Options != nil {
Still, I would love to see if there is a way to avoid having to do this options parsing by hand so to speak;
Consider that
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 142 at r3 (raw file):
if _, ok := changefeedbase.OptionsWithNoValue[key]; ok { switch v := changefeedbase.AlterNoValueOptionType(value); v { case changefeedbase.OptUnsetValue:
So, is the expectation is that the user will specify foo=false
if they want to unset something?
This probably needs to be comment in on (in the changefeed.Base); And probably in the PR description.
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 193 at r3 (raw file):
if _, shouldProtect := details.Opts[changefeedbase.OptProtectDataFromGCOnPause]; shouldProtect && !p.ExecCfg().Codec.ForSystemTenant() { return errorutil.UnsupportedWithMultiTenancy(67271) }
I don't know; something about this code where we are re-doing work already done in create changefeed statement just doesn't sit right with me. I do think that re-generating changefeed statement, and letting all of the validation checks happen in 1 place is a way to go. Is there any way we can get to that point?
I honestly think the answer is ... yes. CreateChangefeed is a simple(ish) statement:
type CreateChangefeed struct {
Targets TargetList
SinkURI Expr
Options KVOptions
}
The source of truth during alter changefeed is the ChangeFeedDetails:
message ChangefeedDetails {
map<uint32, ChangefeedTarget> targets
string sink_uri = 3 [(gogoproto.customname) = "SinkURI"];
map<string, string> opts = 4;
}
I believe each field in changefeed details can be converted to tree.CreateChangefeed format.
E.g. if you need to convert a stree (sink_uri) to Expr -- that's simple, tree.NewDString("....") is an expression
that evaluates to string value. Similar for KVOptions. Just add the new options to the list.
TargetList -- that's a bit more involved. Basically, you don't know what the table name is right now.
But what you do know is the descritor ID. So, you could resolve those descriptors to be fully qualified names right now. Now, having created the tree.CreateChangefeed statement you should be able to call into basically the
same code as used in changefeed_stmt. Your goal is to get job.Record out of the function, w/out actually writing new job record. The new job record will have correctly redacted information, etc. The only thing you need to do after that is go over updated changefeedDetails.targets , and replace the names with the original names.
This may sound like a lot of work, but I think it's actually not. YOu're using all of the existing validation function, you're keeping the code in 1 place; and I just think it'll make more sense.
Let me know what you think and if you agree or not.
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 291 at r3 (raw file):
jobDescription := tree.AsString(createChangefeedStmt) newPayload.Description = jobDescription
If I understand correctly, we don't need to call changefeedJobDescription
because createChagefeedStmt was generated previous description, which should have had URI redacted.
At the very least, this deserves a comment; Better yet, I think I would prefer the entirety of taking Description, and parsing into the changefeed statement to be moved into a helper function (with comments). something like getUpdatedChangefeedDescription(....)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @miretskiy, @sherman-grewal, and @shermanCRL)
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 193 at r3 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
I don't know; something about this code where we are re-doing work already done in create changefeed statement just doesn't sit right with me. I do think that re-generating changefeed statement, and letting all of the validation checks happen in 1 place is a way to go. Is there any way we can get to that point?
I honestly think the answer is ... yes. CreateChangefeed is a simple(ish) statement:
type CreateChangefeed struct { Targets TargetList SinkURI Expr Options KVOptions }
The source of truth during alter changefeed is the ChangeFeedDetails:
message ChangefeedDetails { map<uint32, ChangefeedTarget> targets string sink_uri = 3 [(gogoproto.customname) = "SinkURI"]; map<string, string> opts = 4; }
I believe each field in changefeed details can be converted to tree.CreateChangefeed format.
E.g. if you need to convert a stree (sink_uri) to Expr -- that's simple, tree.NewDString("....") is an expression
that evaluates to string value. Similar for KVOptions. Just add the new options to the list.
TargetList -- that's a bit more involved. Basically, you don't know what the table name is right now.
But what you do know is the descritor ID. So, you could resolve those descriptors to be fully qualified names right now. Now, having created the tree.CreateChangefeed statement you should be able to call into basically the
same code as used in changefeed_stmt. Your goal is to get job.Record out of the function, w/out actually writing new job record. The new job record will have correctly redacted information, etc. The only thing you need to do after that is go over updated changefeedDetails.targets , and replace the names with the original names.This may sound like a lot of work, but I think it's actually not. YOu're using all of the existing validation function, you're keeping the code in 1 place; and I just think it'll make more sense.
Let me know what you think and if you agree or not.
Agree, syncing validations in multiple places seems unnecessarily painful. And hard--I think this is missing at least one, validating that you're providing a schema registry while setting the format to avro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 6 of 8 files at r5, 3 of 3 files at r6, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @miretskiy, @sherman-grewal, and @shermanCRL)
-- commits, line 16 at r6:
I think this needs to be updated to also mention UNSET
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 176 at r6 (raw file):
} if _, ok := changefeedbase.AlterChangefeedUnsupportedOptions[key]; ok { return errors.Errorf(`cannot alter option %s`, key)
you could probably wrap these errors w/ e.g.
return pgerror.Newf(pgcode.InvalidParameterValue, ..._
(see examples in changefeeed_stmt )
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 240 at r6 (raw file):
) if err != nil { return err
I think we should wrap this error to make it clear that the failure is due to "alter" ....
errors.Wrap(err, "alter changefeed failed") or some such.
pkg/ccl/changefeedccl/changefeedbase/options.go, line 246 at r6 (raw file):
// AlterChangefeedUnsupportedOptions are changefeed options that we do not allow // users to alter var AlterChangefeedUnsupportedOptions = makeStringSet(OptCursor, OptInitialScan, OptNoInitialScan)
Cursor?
pkg/ccl/changefeedccl/alter_changefeed_test.go, line 126 at r6 (raw file):
waitForJobStatus(sqlDB, t, feed.JobID(), `paused`) sqlDB.Exec(t, fmt.Sprintf(`ALTER CHANGEFEED %d SET diff`, feed.JobID()))
probably need to add unset test too.
pkg/ccl/changefeedccl/alter_changefeed_test.go, line 252 at r6 (raw file):
details, ok := job.Details().(jobspb.ChangefeedDetails) require.True(t, ok)
please add a comment on what it is we are testing (i.e. that alter changefeed does not accidently redacts secret keys)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great job. I have few minor nits. Perhaps @HonoreDB might have more comments.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @miretskiy, @sherman-grewal, and @shermanCRL)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 8 files at r1, 6 of 8 files at r5, 2 of 3 files at r6, all commit messages.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @miretskiy, @sherman-grewal, and @shermanCRL)
eb508c3
to
bc8be04
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB, @miretskiy, @sherman-grewal, and @shermanCRL)
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 176 at r6 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
you could probably wrap these errors w/ e.g.
return pgerror.Newf(pgcode.InvalidParameterValue, ..._
(see examples in changefeeed_stmt )
Done.
pkg/ccl/changefeedccl/alter_changefeed_stmt.go, line 240 at r6 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
I think we should wrap this error to make it clear that the failure is due to "alter" ....
errors.Wrap(err, "alter changefeed failed") or some such.
Done.
pkg/ccl/changefeedccl/changefeedbase/options.go, line 246 at r6 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
Cursor?
Yeah, it's in the deny list so it will not be allowed to be altered
pkg/ccl/changefeedccl/alter_changefeed_test.go, line 126 at r6 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
probably need to add unset test too.
Done.
pkg/ccl/changefeedccl/alter_changefeed_test.go, line 252 at r6 (raw file):
Previously, miretskiy (Yevgeniy Miretskiy) wrote…
please add a comment on what it is we are testing (i.e. that alter changefeed does not accidently redacts secret keys)
Done.
d44a73a
to
ab16ed4
Compare
with the ALTER CHANGEFEED statement Currently, with the ALTER CHANGEFEED statement users can only add or drop targets from an existing changefeed. In this PR, we would like to extend this functionality so that an user can edit and unset the options of an existing changefeed as well. The syntax of this addition is the following: ALTER CHANGEFEED <job_id> SET <options> UNSET <opt_list> Note that the <options> must follow the same syntax that is used when creating a changefeed with options. In particular, if you would like to set an option that requires a value, you must write SET opt = 'value' On the other hand, if you would like to set an option that requires no value, you must write SET opt Furthermore, this PR allows users to unset options. This can be achieved by writing UNSET <opt_list> Where <opt_list> is a list of options that you would like to unset. For example, if we would like to unset the diff and resolved options for changefeed 123, we would achieve this by writing ALTER CHANGEFEED 123 UNSET diff, resolved Release note (enterprise change): Added support to the ALTER CHANGEFEED statement so that users can edit and unset the options of an existing changefeed. The syntax of this addition is the following: ALTER CHANGEFEED <job_id> SET <options> UNSET <opt_list>
ab16ed4
to
75dc9fe
Compare
bors r+ |
Build failed (retrying...): |
Build failed (retrying...): |
Build succeeded: |
changefeedccl: allow users to alter changefeed options
with the ALTER CHANGEFEED statement
References #75895
Currently, with the ALTER CHANGEFEED statement users
can only add or drop targets from an existing
changefeed. In this PR, we would like to extend this
functionality so that an user can edit and unset
the options of an existing changefeed as well.
The syntax of this addition is the following:
ALTER CHANGEFEED <job_id> SET <options> UNSET <opt_list>
Note that the must follow the same syntax
that is used when creating a changefeed with options.
In particular, if you would like to set an option
that requires a value, you must write
SET opt = 'value'
On the other hand, if you would like to set an
option that requires no value, you must write
SET opt
Furthermore, this PR allows users to unset options.
This can be achieved by writing
UNSET <opt_list>
Where <opt_list> is a list of options that you
would like to unset. For example, if we would like
to unset the diff and resolved options for
changefeed 123, we would achieve this by writing
ALTER CHANGEFEED 123 UNSET diff, resolved
Release note (enterprise change): Added support to
the ALTER CHANGEFEED statement so that users can edit
and unset the options of an existing changefeed. The
syntax of this addition is the following:
ALTER CHANGEFEED <job_id> SET <options> UNSET <opt_list>