-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
cli/dump: support columns less tables #46406
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM -- if it is easy to do (don't recall what the current dump tests look like), I'd also include a test showing that it now works
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- this needs unit tests
- there needs to be a release note
put into work in progress, there are previous unit tests I need to take care of |
999c3ae
to
ec87c93
Compare
Thank you for updating your pull request. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok so I can see how this is moving forward.
Regarding the testing, I think we're not yet half way there. It's important not just to check that dump
produces something, but also that that something is actually valid. The behavior is create table -> populate data -> dump -> re-load from dump -> compare data. The data must match.
The other thing I'm not yet 100% about is whether this approach is sufficent. This current approach only works if the only hidden columns are specifically those that are auto-created by crdb. There are multiple ways this can break:
-
if CockroachDB currently makes certain columns non-visible for other reasons, this will not work. (I don't know about this. @rohany can you comment on this? does the ALTER PK story make this happen?)
-
if CockroachDB even in the future makes certain columns non-visible, this will start breaking without anyone noticing.
-
in particular, if CockroachDB in later version keeps the auto-generated rowid column, but gives it a different name (e.g. "timestamp" or "$internal_rowid" or whaterver), then current dumps with your patch will not work anymore.
I think the root of my concern is twofold:
-
the
CREATE TABLE
statement thatdump
produces in version X should always work at version Y>X. If you rely on current implicit behavior, but we never really documented/specified that behavior, the dump may non-work with another version.Now, suppose that the pretty-printer for CREATE TABLE was always making the existence of the
rowid
column (albeit hidden) explicit, in a way that future crdb versions would be guaranteed to honor, then there would be no surprise. -
the INSERT statements generated for data rows in the dump should only refer to columns created explicitly by CREATE TABLE. The reason for this is the following. dump is the way that users get confidence they are not locked into cockroachdb -- they can restore their dumps on a different SQL database. But that does not work if the INSERT refers to "magic" columns, where the user has no idea when looking at CREATE TABLE how to create a similar column in their own database. By making the hidden columns visible in the generated CREATE TABLE statement, we'd be informing the user how to "do the same" in their non-crdb database.
Again I encourage you to look at #26644 which provides a possible/candidate approach that addresses both goals.
Another possible approach would be to display the hidden column definitions in a comment; this would help with goal (2) but not with goal (1).
Reviewed 4 of 4 files at r1.
Reviewable status: complete! 0 of 0 LGTMs obtained
primary key changes make some new and non-public indexes, but don't touch the columns in the table. However if a user wants to switch to a hash sharded primary key it will add a new hidden column. |
Although I was following pattern of tests I've encountered in the
I think this is already broken, since if there is table w/o explicit primary key then
This is very fair point, but I would also expect that going forward and changing format or some internal structures the backward campatibility would be preserved.
to me this sounds a bit contradictive to #26644, and by the way can you give a hint why it was closed and not merged? Is still pending to be completed? At any rate I really appreciate your, @knz, feedback and would be glad to understand whenever I'd better to abandon current direction? And in case you think I'd better to close current PR, I think it will beneficial to understand a bit more about current status of #26644. |
I think there's precedent for this in
Oh haha I had missed that. That's not great, but indeed nothing new for you to care about in this PR.
Can you explain more?
It was closed because there as no priority at that time to work on that problem.
Given that you've identified (above) that the dump already generates CREATE TABLE statements that refer to non-existent columns, there's nothing worse introduced by your PR here so I'd encourage you to continue this work. However I think some additional mileage would be obtained wrt vendor lock-in (or the lack thereof) by revealing to the user what's going on. Is there a way perhaps to generate SQL comments (starting with |
Yes, I was just referring to the fact that since at any rate dump already reveals non visible columns extra work has to be done to maintain backward compatibility.
In such a case I will fix unit test and push updated commit.
I think that this could be done, while might require a bit more intrusive changes. |
Thank you for updating your pull request. My owl senses detect your PR is good for review. Please keep an eye out for any test failures in CI. 🦉 Hoot! I am a Blathers, a bot for CockroachDB. I am experimental - my owner is otan. |
@knz I've pushed revisited unit tests to make sure end-to-end functionality, include applying dumps back, also few more test cases. |
This commit allows to dump and treat columnsless tables by explicitly allowing to explicitly specify hidden/reserved columns. Signed-off-by: Artem Barger <bartem@il.ibm.com> Release note (bug fix): providing with ability to dump columns less tables
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice test!
LGTM though I'll let @knz take one more look before we tell bors to merge it |
And thanks again for picking this up and for working though all the review suggestions! |
bors r+ |
Build succeeded |
This commit allows to dump and treat columns less tables by explicitly
allowing to explicitly specify hidden/reserved columns.
Fixes #37768.
Signed-off-by: Artem Barger bartem@il.ibm.com
Release note (bug fix): providing with ability to dump columns less tables