max_replication_slots = 0; can't set up replication #1530

Closed
mlissner opened this issue Jan 3, 2021 · 3 comments


mlissner commented Jan 3, 2021

Not sure what's going on here. I'm doing all my usual stuff on PostgreSQL 11.9 in AWS. It has the same parameter group as another server (PostgreSQL 11.8), but somehow one has max_replication_slots=0 and the other has max_replication_slots=10. The config shows that it should be set to 10 (the default since PostgreSQL 10).
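For anyone comparing two servers that are supposed to share a parameter group, here's a rough sketch of one way to do it. `SETTINGS_QUERY` reads the standard `pg_settings` view and would be run on each server. `diff_settings` is a hypothetical helper, not something from this project, and the example values are just the two numbers reported above.

```python
# Run this on each server (e.g. from psql) to see the live value and where it came from.
SETTINGS_QUERY = """
SELECT name, setting, source
FROM pg_settings
WHERE name = 'max_replication_slots';
"""


def diff_settings(a, b):
    """Return {name: (value_a, value_b)} for every setting that differs."""
    names = sorted(set(a) | set(b))
    return {n: (a.get(n), b.get(n)) for n in names if a.get(n) != b.get(n)}


# Values as reported in this issue (PostgreSQL 11.9 server vs. 11.8 server).
broken = {"max_replication_slots": "0"}
working = {"max_replication_slots": "10"}

for name, (left, right) in diff_settings(broken, working).items():
    print(f"{name}: {left} != {right}")
```

The `source` column is included in the query because it can hint at whether a value came from a config file, the command line, or the built-in default.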

The way I discovered this was from the following error on the subscriber when I launched logical replication:

ERROR: cannot start logical replication workers when max_replication_slots = 0

Nobody else seems to have had this error, ever, so I'm going to try to document things I try.

First thing to note is that there's a blog post that indicates that you can really shoot yourself in the foot if you're not careful when dealing with a slot shortage. It says that if you don't have enough slots while rebooting the instance (say, to increase the number of slots), then you'll get something like this:

2019-07-05 13:16:18 UTC::@:[6165]:PANIC: could not find free replication state, increase max_replication_slots

If that happens, you're stuck. You need a reboot to apply a higher slot count, but the instance can't finish rebooting because it needs more slots to start. Apparently, you have to reach out to AWS support for help. So let's try to avoid that!

First thing I'm going to try is a simple reboot, after deleting our subscription so the above isn't an issue. The way this is happening is so weird, I'm hopeful a reboot will fix it. TBD.


mlissner commented Jan 3, 2021

Well, reboot didn't do it:

courtlistener=> show max_replication_slots;
 max_replication_slots 
-----------------------
 0
(1 row)

I...don't have any better ideas. I guess I'll try just, umm, changing it from the default of 10 to 11?


mlissner commented Jan 3, 2021

OK, this thread looks really good:

https://forums.aws.amazon.com/thread.jspa?threadID=233510

From 2016, but it says:

Have you disabled automated backups on the instance?

We have, as usual...

If automated backups were disabled, the parameter 'wal_level' is set to 'minimal', which is not sufficient for using replication slots.

Interesting. Sure enough:

courtlistener=> show wal_level;
 wal_level 
-----------
 minimal
(1 row)

Ugh.
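To make the two symptoms easier to spot together, here's a minimal sketch of a preflight check. It assumes you've already collected the `SHOW` output into a dict of strings. `replication_blockers` is a made-up helper name, and the rules it encodes are only the two observations from this thread: `wal_level` of `minimal` isn't enough for replication slots, and logical replication workers won't start with zero slots.

```python
def replication_blockers(settings):
    """List reasons replication can't start, given SHOW values as a dict of strings."""
    problems = []
    if settings.get("wal_level") == "minimal":
        problems.append("wal_level is minimal; replication slots need a higher level")
    if int(settings.get("max_replication_slots", "0")) == 0:
        problems.append("max_replication_slots is 0")
    return problems


# The values seen on the affected server in this issue.
observed = {"wal_level": "minimal", "max_replication_slots": "0"}

for problem in replication_blockers(observed):
    print(problem)
```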

Looking in the parameters, it doesn't look like AWS customers get to tweak wal_level. It's just not there.

Looking at automated backups in the configuration page, it says they're disabled. On the "modify" page, the box is checked saying they're enabled, and it's grayed out so I can't tweak it. But they're set to be kept for zero days, so I changed that to 7 days, and now we'll see if that helps.
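I made this change in the console, but the same change could probably be scripted. Below is a sketch using boto3's RDS `modify_db_instance` call, assuming boto3 is installed and AWS credentials are configured. The helper names and the `example-db` identifier are placeholders, not anything from this project.

```python
def retention_kwargs(instance_id, days=7):
    """Build the arguments for boto3's RDS modify_db_instance call."""
    if days < 1:
        # A retention period of 0 days is what disables automated backups.
        raise ValueError("retention must be at least 1 day to enable automated backups")
    return {
        "DBInstanceIdentifier": instance_id,
        "BackupRetentionPeriod": days,
        "ApplyImmediately": True,
    }


def enable_backups(instance_id, days=7):
    """Apply the retention change; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so retention_kwargs works without boto3 installed

    rds = boto3.client("rds")
    return rds.modify_db_instance(**retention_kwargs(instance_id, days))


# "example-db" is a placeholder instance identifier.
print(retention_kwargs("example-db"))
```

After the change applies, rerunning `show wal_level;` and `show max_replication_slots;` is the way to confirm it took effect.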


mlissner commented Jan 3, 2021

Yep, that did it. Ugh.
