Description
Steps to reproduce
a. I do not have a firm reproducer, but I ran into this issue while upgrading from rev 429 to rev 468 in a Charmed Landscape deployment. I originally encountered the issue on rev 429 and, based on a prior bug, expected that refreshing to 468 would fix it. However, my pg units still do not start and remain in an "awaiting for member to start" state.
b. I did not encounter this issue on another cluster in an identical environment, so it seems somewhat random. The machines in the juju model are manual machines in Azure.
- Essentially, 2 of the 3 postgres units stay stuck in an "awaiting for member to start" state. They cycle through different waiting and executing states, but the PG units never actually start.
Expected behavior
I expect the other 2 units to start and enter an active/idle state. Instead, they have been stuck in the waiting state for more than 48 hours.
Actual behavior
See the logs below. The units cycle through waiting/executing states but never enter active/idle as expected.
Versions
Operating system: 22.04.4
Juju CLI: 3.5.4
Juju agent: 3.5.4
Charm revision: 468
LXD: n/a
Log output
juju debug log: https://paste.ubuntu.com/p/FzXnjMpNYz/
snap logs from one unit failing to start: https://paste.ubuntu.com/p/St8WZNn4GT/ (restart at the end of the log file)
snap logs from other unit failing to start: https://paste.ubuntu.com/p/BH3RXfZrTW/
snap logs from healthy unit: https://paste.ubuntu.com/p/b6bgSVZKYm/
pg snap services config: https://paste.ubuntu.com/p/xJJq6ktXm9/
Happy to provide more logs, details or access to the environment. Thanks.
