-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
importccl,sql: support importing non-public schemas from pgdump #57183
Conversation
I'm still working on adding some more tests and maybe cleaning up some comments, but I wanted to start getting some review passes. cc: @mokaixu |
8c23bf5
to
f8a87e1
Compare
This should be RFAL! I also tested this using one of the PGDUMP files which have been provided to us from a customer demo after commenting out unsupported PGDUMP statements. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This approach generally looks good to me. I'm sending out a few comments to start, but I want to do another pass.
Reviewed 3 of 17 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @miretskiy, and @pbardea)
pkg/ccl/importccl/import_stmt.go, line 1223 at r1 (raw file):
txn *kv.Txn, ) error { if descriptor.GetID() == 0 {
Can we use sqlbase.InvalidID
here?
pkg/ccl/importccl/import_stmt.go, line 1227 at r1 (raw file):
} if descriptor.GetID() != id { log.Fatalf(ctx, "%v", errors.AssertionFailedf("cannot create descriptor with an unexpected (%v) ID: %v", id, descriptor))
nit: this error messages is a bit confusing to me. Perhaps something like cannot create descriptor with an expected ID %v; expected %v
?
pkg/ccl/importccl/read_import_pgdump.go, line 252 at r1 (raw file):
) ([]*schemadesc.Mutable, error) { var dbDesc *dbdesc.Immutable var err error
Do we need this err
declaration?
pkg/ccl/importccl/read_import_pgdump.go, line 387 at r1 (raw file):
var mockDescID = defaultCSVTableID func getNextMockDescID() descpb.ID {
Can we add a comment here explaining why we need these mock IDs (why we can't allocate the "real" ID initially)? Also as a nit, perhaps we can rename this to something like placeholderDescID
? I tend to associate mock
with objects that simulate a production variant to be used in tests.
f8a87e1
to
46b1866
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, and @pbardea)
pkg/ccl/importccl/import_stmt.go, line 1223 at r1 (raw file):
Previously, pbardea (Paul Bardea) wrote…
Can we use
sqlbase.InvalidID
here?
Done.
pkg/ccl/importccl/import_stmt.go, line 1227 at r1 (raw file):
Previously, pbardea (Paul Bardea) wrote…
nit: this error messages is a bit confusing to me. Perhaps something like
cannot create descriptor with an expected ID %v; expected %v
?
modified the message.
pkg/ccl/importccl/read_import_pgdump.go, line 252 at r1 (raw file):
Previously, pbardea (Paul Bardea) wrote…
Do we need this
err
declaration?
we need dbDesc and so I needed to declare err as well. But I've moved the err declaration inside the db.Txn scope to make it clear that it is not used outside the scope of that closure.
pkg/ccl/importccl/read_import_pgdump.go, line 387 at r1 (raw file):
Previously, pbardea (Paul Bardea) wrote…
Can we add a comment here explaining why we need these mock IDs (why we can't allocate the "real" ID initially)? Also as a nit, perhaps we can rename this to something like
placeholderDescID
? I tend to associatemock
with objects that simulate a production variant to be used in tests.
done!
I also remembered having a discussion re: setting the schema descs to Offline, and then publishing them similar to how we deal with tables. I added some logic around that too in this iteration. |
ab79837
to
68bb5e9
Compare
@miretskiy @pbardea just putting this on your radar, apologies for the slightly bloated PR. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @miretskiy, and @pbardea)
pkg/ccl/importccl/import_processor_test.go, line 941 at r2 (raw file):
if format.Format == roachpb.IOFileFormat_PgDump { spec.tables = map[string]*execinfrapb.ReadImportDataSpec_ImportTable{
nit: slight preference on this conditional to only set the table name for readability, e.g.
fullTableName := "simple"
if pgDump {
fullTableName = "public.simple"
}
spec.tables = ... // using fullTableName here.
pkg/ccl/importccl/import_stmt.go, line 1127 at r2 (raw file):
// schema ID. tableRewrites := schemaRewrites if schemaRewrites == nil {
nit: although it's the same, I think checking for tableRewrites
being nil makes more sense here since that's what we're setting in the next line.
pkg/ccl/importccl/import_stmt.go, line 1305 at r2 (raw file):
) error { if descriptor.GetID() == descpb.InvalidID { log.Fatalf(ctx, "%v", errors.AssertionFailedf("cannot create descriptor with an empty ID: %v", descriptor))
Should we be returning an error here? If not, should we have a comment explaining why? Same directly below.
pkg/ccl/importccl/import_stmt.go, line 2039 at r2 (raw file):
schema.Desc.GetID(), newDesc) } newSchemaDesc.State = descpb.DescriptorState_PUBLIC
Does newSchemaDesc
have a setPublic()
method? If so, we should use that.
pkg/ccl/importccl/import_stmt.go, line 2182 at r2 (raw file):
// TODO(adityamaru): Do we fail the import rollback, think about what // implication that has. return err
I think it's okay to error here, especially if it's after we've dropped the table. I think that the error here will return an error along the lines of "be careful, we couldn't clean up and manual cleanup may be necessary". Which is exactly what would be necessary here since the user would need to drop these newly created schemas.
pkg/ccl/importccl/import_stmt.go, line 2285 at r2 (raw file):
Description: "dropping schemas as part of an import job rollback", Username: p.User(), DescriptorIDs: nil,
Should this have the schema descriptor's IDs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @adityamaru, @ajwerner, @miretskiy, and @pbardea)
pkg/ccl/importccl/import_stmt.go, line 1332 at r2 (raw file):
mutDesc, ok := descriptor.(catalog.MutableDescriptor) if !ok { log.Fatalf(ctx, "unexpected type %T when creating descriptor", descriptor)
Do we have a rule of thumb on when we use errors.AssertionFailedf
as we did above?
pkg/ccl/importccl/import_stmt.go, line 1346 at r2 (raw file):
Quoted 4 lines of code…
if err := txn.Run(ctx, b); err != nil { return err } return nil
trivial nit: I don't know how wee feel about return txn.Run(ctx, b)
in cases like this.
pkg/ccl/importccl/import_stmt.go, line 1852 at r2 (raw file):
if len(details.Schemas) != 0 { schemaPreparedDetails, schemaRewrites, newSchemaIDToName, oldSchemaIDToName, queuedSchemaJobs, err = r.prepareSchemasForIngestion(ctx, p, curDetails, txn, descsCol)
💭 I wonder if there is a type/struct that would could extract here to pass between these functions. Perhaps that is overkill in this case though.
68bb5e9
to
a27b28c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, @pbardea, and @stevendanna)
pkg/ccl/importccl/import_processor_test.go, line 941 at r2 (raw file):
Previously, pbardea (Paul Bardea) wrote…
nit: slight preference on this conditional to only set the table name for readability, e.g.
fullTableName := "simple" if pgDump { fullTableName = "public.simple" } spec.tables = ... // using fullTableName here. </blockquote></details> Done. ___ *[pkg/ccl/importccl/import_stmt.go, line 1127 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9EFgpCTfhlCCzZS8R:-MUK60BK0T4BlP8pSqei:b-896fix) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L1127)):* <details><summary><i>Previously, pbardea (Paul Bardea) wrote…</i></summary><blockquote> nit: although it's the same, I think checking for `tableRewrites` being nil makes more sense here since that's what we're setting in the next line. </blockquote></details> Done. ___ *[pkg/ccl/importccl/import_stmt.go, line 1305 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9EYuz86qfnDJKL0OA:-MUK6TEw2SfkPcHP8wWZ:br96x0p) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L1305)):* <details><summary><i>Previously, pbardea (Paul Bardea) wrote…</i></summary><blockquote> Should we be returning an error here? If not, should we have a comment explaining why? Same directly below. </blockquote></details> I think we should be returning an error, changed. ___ *[pkg/ccl/importccl/import_stmt.go, line 1332 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9C1739XZTwe8JfGsP:-MUKEeymBY22SwCez0k1:b-kimflq) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L1332)):* <details><summary><i>Previously, stevendanna (Steven Danna) wrote…</i></summary><blockquote> Do we have a rule of thumb on when we use `errors.AssertionFailedf` as we did above? </blockquote></details> I took these statements to mirror what we already do in `descriptor.go`. Import is unfortunate in the sense that it needs to replicate a lot of the SQL code. I think AssertionFailedf are to be used in code paths that we don't think users will/should hit under "normal circumstances" and they should be reported via sentry. I can definitely see an argument that this cast failure fits into that category but I just went off of precedent. Switched to returning them though, so that we can rollback the import. ___ *[pkg/ccl/importccl/import_stmt.go, line 1346 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9Cdb65am2XICe9hZv:-MUK77KLDKduV9cF472y:beonz4l) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L1346)):* <details><summary><i>Previously, stevendanna (Steven Danna) wrote…</i></summary><blockquote> > ``` > if err := txn.Run(ctx, b); err != nil { > return err > } > return nil > ``` trivial nit: I don't know how wee feel about `return txn.Run(ctx, b)` in cases like this. </blockquote></details> good point, cleaned up. ___ *[pkg/ccl/importccl/import_stmt.go, line 1852 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9IhuX0kGA43LM-BeT:-MUKRQRx7qitr6AcAPkE:bhxld0c) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L1852)):* <details><summary><i>Previously, stevendanna (Steven Danna) wrote…</i></summary><blockquote> :thought_balloon: I wonder if there is a type/struct that would could extract here to pass between these functions. Perhaps that is overkill in this case though. </blockquote></details> Good idea, the return values are getting out of hand. Added a struct, though there is some subtlety of what we can and cannot have nil between the two methods, without special-casing in a bunch of places for PGDUMP. ___ *[pkg/ccl/importccl/import_stmt.go, line 2039 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9Esdo8aYM-foG9IHI:-MUK7H966ui-tMsWPGQY:bjbhv4w) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L2039)):* <details><summary><i>Previously, pbardea (Paul Bardea) wrote…</i></summary><blockquote> Does `newSchemaDesc` have a `setPublic()` method? If so, we should use that. </blockquote></details> good point, fixed. ___ *[pkg/ccl/importccl/import_stmt.go, line 2182 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9FFoW42FHFRXK06kn:-MUKDw8D6zOwBf2HS_yY:bcsym3n) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L2182)):* <details><summary><i>Previously, pbardea (Paul Bardea) wrote…</i></summary><blockquote> I think it's okay to error here, especially if it's after we've dropped the table. I think that the error here will return an error along the lines of "be careful, we couldn't clean up and manual cleanup may be necessary". Which is exactly what would be necessary here since the user would need to drop these newly created schemas. </blockquote></details> makes sense, thanks. ___ *[pkg/ccl/importccl/import_stmt.go, line 2285 at r2](https://reviewable.io/reviews/cockroachdb/cockroach/57183#-MU9FYzLDJY3FQvI-Z0k:-MUKEdQu0nLoKsKAaB9B:b-rmsntx) ([raw file](https://github.com/cockroachdb/cockroach/blob/68bb5e9bc56addb355e903144df38d69073927bb/pkg/ccl/importccl/import_stmt.go#L2285)):* <details><summary><i>Previously, pbardea (Paul Bardea) wrote…</i></summary><blockquote> Should this have the schema descriptor's IDs? </blockquote></details> nice catch, done. <!-- Sent from Reviewable.io -->
I have no idea what reviewable did to my responses...you might have to open reviewable to see them, but I think I addressed all the comments. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @ajwerner, @miretskiy, @pbardea, and @stevendanna)
Previously, a PGDUMP import did not support any non-public schema statements. Now that CRDB has user defined schemas, bundle format IMPORTs need to be taught how to parse, create and cleanup schema related PGDUMP operations. Note this PR only adds support for `CREATE SCHEMA` and usage of the schema in `CREATE TABLE/SEQUENCE` PGDUMP statements. `ALTER SCHEMA` statements are ignored and support might be added in a follow up. Release justification (bug fix): Import PGDUMP does not support user defined schemas. Release note (sql change): IMPORT PGDUMP can now import dump files with non-public schemas.
a27b28c
to
f36a95a
Compare
TFTR! bors r+ |
Build succeeded: |
Previously, a PGDUMP import did not support any non-public schema
statements. Now that CRDB has user defined schemas, bundle format
IMPORTs need to be taught how to parse, create and cleanup schema
related PGDUMP operations.
Note this PR only adds support for
CREATE SCHEMA
and usage of theschema in
CREATE TABLE/SEQUENCE
PGDUMP statements.ALTER SCHEMA
statements are ignored and support might be added in a follow up.
Release justification: low risk, high reward. Import PGDUMP does not support user
defined schemas.
Release note (sql change): IMPORT PGDUMP can now import dump files with
non-public schemas.