-
Notifications
You must be signed in to change notification settings - Fork 379
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Lockfiles don't rely on PID (#1834) #1915
Conversation
Not ready yet, tests don't pass locally but this is the general idea. I think there is something fishy file-concurrency-wise with how the system file lock functions are called, but the mutex lock in the registry should help with that too. |
Also: I'm open to suggestions for a better name than |
863edaa
to
ff3c1f5
Compare
mutex; // TODO: replace by synchronized_value once available | ||
}; | ||
|
||
static LockedFilesRegistry files_locked_by_this_process; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to declare this variable static since it is defined in an anonymous namespace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct that this is redundant. I added it in case it was not clear to everybody, but can remove it without change.
{ | ||
int ret = 0; | ||
|
||
// POSIX systems automatically remove locks when closing any file | ||
// descriptor related to the file | ||
#ifdef _WIN32 | ||
LOG_TRACE << "Removing lock on '" << m_lock.string() << "'"; | ||
LOG_TRACE << "Removing lock on '" << m_lockfile_path.string() << "'"; | ||
_lseek(m_fd, MAMBA_LOCK_POS, SEEK_SET); | ||
ret = _locking(m_fd, LK_UNLCK, 1 /*lock_file_contents_length()*/); | ||
#endif | ||
remove_lockfile(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
This may just be "shouting from the sidelines", so take this with a heavy grain of salt 😉
Removing the lock file seems a bit odd here. We ran into a similar problem in our development a while back.
Having the lock file in place enabled us to see what is locking it, and removing the file kind of made this hidden and caused some extra hassle.
..but if you can provide some test build we can test this change with our RCC builds (https://github.com/robocorp/rcc)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, thanks for pointing this. However I'm not sure what you find odd exactly as this is the unlock
function called when the lock have to be released, so removing the lockfile is kind of mandatory? Or did I misunderstand something?
Also I removed the pid written in the lockfile, do you think it should be kept for debugging purpose? We don't need it anymore here but it's not clear to me if it's still useful for users.
As for the build I'm not sure how I can provide it based on the CI output yet but I'm kind of new around here, I'll ask around and let you know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
As said, I might be way off base, but the way we ended up doing the file locks (a simplification as I was not the one implementing 😉) is just to have the file there, take the lock from the process, and ask the OS if the file has locks or not.
That way, the lock file always exists, and in debugging cases you can go in and check with other tools also what is holding the lock.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I'll discuss about it with the others, I'll consider making a separate PR with that logic (as this one only treats about not relying on pid). We have plans to refactor/review/revamp this lockfile system anyway so that's good input. 👍🏽
ff3c1f5
to
b201ef2
Compare
b201ef2
to
b151bba
Compare
b151bba
to
77bb95b
Compare
This should remove the possibility of #1834 to ever happen, but does not attempt to fix the "clean" issue.
This change removes reliance on pid written in a lockfile for the lockfile mechanism to know if the same process is attempting to lock the same file twice. This case is currently an error (throws
std::logic_error
) but this will probably change later. Meanwhile this fix changes how lockfiles are implemented so that they could share ownership of the lock if the process already did lock that file. I kept the check for double locking for backward compatibility.