Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lockfiles don't rely on PID (#1834) #1915

Merged
merged 2 commits into from
Oct 4, 2022

Conversation

Klaim
Copy link
Member

@Klaim Klaim commented Sep 12, 2022

This should remove the possibility of #1834 to ever happen, but does not attempt to fix the "clean" issue.

This change removes reliance on pid written in a lockfile for the lockfile mechanism to know if the same process is attempting to lock the same file twice. This case is currently an error (throws std::logic_error) but this will probably change later. Meanwhile this fix changes how lockfiles are implemented so that they could share ownership of the lock if the process already did lock that file. I kept the check for double locking for backward compatibility.

@Klaim
Copy link
Member Author

Klaim commented Sep 12, 2022

Not ready yet, tests don't pass locally but this is the general idea. I think there is something fishy file-concurrency-wise with how the system file lock functions are called, but the mutex lock in the registry should help with that too.

@Klaim
Copy link
Member Author

Klaim commented Sep 12, 2022

Also: I'm open to suggestions for a better name than LockFileOwner 😬

@Klaim Klaim force-pushed the klaim/lockfile-no-pid-check branch 3 times, most recently from 863edaa to ff3c1f5 Compare September 15, 2022 15:45
@Klaim Klaim marked this pull request as ready for review September 15, 2022 16:52
mutex; // TODO: replace by synchronized_value once available
};

static LockedFilesRegistry files_locked_by_this_process;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to declare this variable static since it is defined in an anonymous namespace?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct that this is redundant. I added it in case it was not clear to everybody, but can remove it without change.

{
int ret = 0;

// POSIX systems automatically remove locks when closing any file
// descriptor related to the file
#ifdef _WIN32
LOG_TRACE << "Removing lock on '" << m_lock.string() << "'";
LOG_TRACE << "Removing lock on '" << m_lockfile_path.string() << "'";
_lseek(m_fd, MAMBA_LOCK_POS, SEEK_SET);
ret = _locking(m_fd, LK_UNLCK, 1 /*lock_file_contents_length()*/);
#endif
remove_lockfile();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
This may just be "shouting from the sidelines", so take this with a heavy grain of salt 😉

Removing the lock file seems a bit odd here. We ran into a similar problem in our development a while back.
Having the lock file in place enabled us to see what is locking it, and removing the file kind of made this hidden and caused some extra hassle.
..but if you can provide some test build we can test this change with our RCC builds (https://github.com/robocorp/rcc)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi, thanks for pointing this. However I'm not sure what you find odd exactly as this is the unlock function called when the lock have to be released, so removing the lockfile is kind of mandatory? Or did I misunderstand something?
Also I removed the pid written in the lockfile, do you think it should be kept for debugging purpose? We don't need it anymore here but it's not clear to me if it's still useful for users.

As for the build I'm not sure how I can provide it based on the CI output yet but I'm kind of new around here, I'll ask around and let you know.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi,
As said, I might be way off base, but the way we ended up doing the file locks (a simplification as I was not the one implementing 😉) is just to have the file there, take the lock from the process, and ask the OS if the file has locks or not.

That way, the lock file always exists, and in debugging cases you can go in and check with other tools also what is holding the lock.

Copy link
Member Author

@Klaim Klaim Sep 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'll discuss about it with the others, I'll consider making a separate PR with that logic (as this one only treats about not relying on pid). We have plans to refactor/review/revamp this lockfile system anyway so that's good input. 👍🏽

@Klaim Klaim force-pushed the klaim/lockfile-no-pid-check branch from ff3c1f5 to b201ef2 Compare September 23, 2022 12:50
@Klaim Klaim force-pushed the klaim/lockfile-no-pid-check branch from b201ef2 to b151bba Compare October 3, 2022 11:40
@Klaim Klaim force-pushed the klaim/lockfile-no-pid-check branch from b151bba to 77bb95b Compare October 3, 2022 15:40
@wolfv wolfv merged commit 4faa748 into mamba-org:main Oct 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants