Test deleting column from logical replication #1164
Comments
OK, the first test was to set up two replicating docker images on ports 5432 and 5433 and then use a Once set up, I began by deleting the column from the subscriber. Once deleted, I started getting this log error on the subscriber:
And mirroring it on the publisher:
No surprises there, I suppose. So then I deleted the column from the publisher. With that done, the
So yeah, that shit didn't work. It's pretty clear that although we're sending good data now, the fact that we accumulated data on the publisher while the subscriber lacked the column means that that data needs to get flushed. And the only way to flush it is to have all the right places to put it in the subscriber. Fixing this can be done by adding the column back to the subscriber, waiting for the changes to flush, and then removing it again.
OK, the next test was to delete the column from the publisher first. This worked better than I thought it would. Weirdly, the subscriber had no issues and the publisher was fine. What's weird though is to look at the column on the subscriber that is no longer getting data from the publisher. It was a varchar(n) field and it just contains blanks. What? So replication is setting those values??? I'm going to try this with a non-nullable integer field and see what tomfoolery it gets up to in that instance.
OK, this test took a few tweaks to get right, but I began by adding a new column on both sides with:
And then running on both:
I then set up the watch to do the following inside the 5432 docker image:
That got things rolling, so I dropped the column:
Then we got this in the publisher's logs:
So I killed the watch and started a fresh one to run:
That just updates the PK field at this point. At that point, I got the following errors on the subscriber:
Which is fair. We had a NOT NULL column on the subscriber that was getting null values from the publisher. I was then able to fix these by dropping the column on the subscriber as well. After that point things flushed and we were in business.
So...takeaways:
Welp, this is annoying because adding columns should be done at the subscriber first, and dropping should be done at the publisher first. Great.
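To make that ordering concrete, here's a minimal sketch in Python. The `plan_ddl` helper, the node labels, and the table and column names are all made up for illustration; it only builds the statements in the order suggested above and doesn't connect to or run anything against a database.

```python
def plan_ddl(action, table, column, column_type=None):
    """Return (node, sql) steps in the order the takeaway above suggests.

    Hypothetical helper: adds go to the subscriber first,
    drops go to the publisher first.
    """
    if action == "add":
        if column_type is None:
            raise ValueError("column_type is required when adding a column")
        sql = f"ALTER TABLE {table} ADD COLUMN {column} {column_type};"
        order = ["subscriber", "publisher"]
    elif action == "drop":
        sql = f"ALTER TABLE {table} DROP COLUMN {column};"
        order = ["publisher", "subscriber"]
    else:
        raise ValueError(f"unknown action: {action!r}")
    # The same statement runs on both nodes; only the order differs.
    return [(node, sql) for node in order]


if __name__ == "__main__":
    # Placeholder names, not the ones used in the tests above.
    for node, sql in plan_ddl("drop", "my_table", "test_col"):
        print(f"{node}: {sql}")
```

Based on the tests above, it's probably worth letting replication catch up between the two steps before running the second one.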
If you delete columns from tables with REPLICA IDENTITY FULL, you will run into trouble. If the tuple on the master looks like this: Before deleting columns, you must set a PRIMARY KEY or REPLICA IDENTITY USING INDEX.
Yeah, that makes sense, but sure looks nasty.
The documentation is entirely unclear on how to delete a logically replicated column. Do you do it:
on the publisher first
You'll be sending incomplete tuples to the subscriber. The subscriber will attempt to add these to the DB, but won't be able to unless it has a default value for the not-yet-deleted column, so it'll probably start failing and flailing around.
on the subscriber first
If you do it on the subscriber first, you'll be getting tuples with extra values that the subscriber doesn't know what to do with. Maybe it'll just ignore them? Unclear. The docs say:
But who knows what to make of that.
Note that I think it's best practice to make additive changes on the subscriber first.
So this will follow the approach of tests that I did in #1115. Basically, I'll create two DBs with docker. Make them replicate, and then try making these tweaks while a watch command is busily adding data to the system.