Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use flock for the upgrade lock #1903

Closed
remram44 opened this issue Jan 27, 2023 · 3 comments · Fixed by #1905
Closed

Use flock for the upgrade lock #1903

remram44 opened this issue Jan 27, 2023 · 3 comments · Fixed by #1905

Comments

@remram44
Copy link
Contributor

In #1757, an attempt was made at replacing the locking mechanism for upgrade (when running multiple containers) from simply creating a file to using Linux advisory locks with flock. Admittedly the patch had a few flaws.

Eventually the MR was closed in favor of making the locking optional with an environment variable in #1760. This is a very different approach, but it resolves the issue you had at the time, #1742.

I am opening this issue to gauge whether the is still interest in using flock rather than the mere presence of the file, e.g. did you close #1757 because you don't want to do that, or simply because it was not ready and another patch was able to fix the pressing upgrade issue in #1742?

Pros of using flock:

  • If a container crashes or is stopped during upgrade, another container can do the upgrade. Right now if locking is enabled, you will have to manually go into the volume and remove the file.
  • A concurrent container waiting to do the upgrade will be able to get the lock immediately, rather than the next time it "polls" after sleeping for a while.

Cons of using flock:

  • If a container crashes during upgrade, another container can do the upgrade. I don't know if it is safe to resume an aborted upgrade (but at the same time, what else can you do?)
  • It requires the volume to support locks (but even NFS, Cephfs, ... support it, so you'd have to use a weirdo non-POSIX-compliant filesystem)
  • A bit more complexity in the entrypoint script
@remram44
Copy link
Contributor Author

remram44 commented Feb 8, 2023

I would love a "yay" and "nay" on this before I dig into the test failure at #1905 😅

@meonkeys
Copy link
Contributor

meonkeys commented Feb 8, 2023

I closed #1757 because @skjnldsv's change meant I could just ignore / work around the problem by not using NEXTCLOUD_INIT_LOCK.

"Yay" from me to continue work on this. I'm not a maintainer, just a Nextcloud enthusiast/contributor. It would be a Good Thing to use some kind of ephemeral lock rather than the one requiring user intervention. If flock works let's do it (assuming the maintainers agree/approve).

but even NFS, Cephfs, ... support it

Ah, this removes some of the uncertainty around my patch in #1757, thanks.

Good call to put the lockfile in /var/www/html, I messed that up in #1757.

meonkeys added a commit to meonkeys/nextcloud-docker that referenced this issue Feb 8, 2023
fix nextcloud#1756

fix nextcloud#1903

Signed-off-by: Adam Monsen <haircut@gmail.com>
@meonkeys
Copy link
Contributor

meonkeys commented Feb 8, 2023

re: this con:

If a container crashes during upgrade, another container can do the upgrade. I don't know if it is safe to resume an aborted upgrade (but at the same time, what else can you do?)

Agreed. In either case you'll have a broken server and would need to inspect the logs for errors.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants