Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Database is often lost #198

Closed
balta2ar opened this issue Jan 30, 2017 · 21 comments · Fixed by #247
Closed

Database is often lost #198

balta2ar opened this issue Jan 30, 2017 · 21 comments · Fixed by #247

Comments

@balta2ar
Copy link

I've been suffering from this for a while now, and today decided that I should report. Basically, the issue is described here:

One thing that really annoys me though is that z looses its history when I press Enter rapidly multiple times (I sometimes clear my window this way), before zsh is able to prepare another prompt. This is 100% reproducible on my home machine, but not on my work machine. I don't know whether the culprit is z, fzf wrapper or zsh async, I'm just mentioning it here as zsh async may be related (I'm not sure).

And the following comment:

I've experienced the same issue with z. It's definitely a bug/race-condition in z. My theory is that one instance of z manages to read the database at some point where it's being replaced by another instance, resulting in an empty db.

For this reason I made sure that zsh-async does not run any hooks inside the worker (async.zsh#L75-L80).

I believe this issue was aggravated by either 4a8b741 or 1a361f6 or both.

@rupa Could you please look into this?

@balta2ar
Copy link
Author

Just for reference and history, related old issue: #191

@mafredri
Copy link

mafredri commented Feb 21, 2017

I did a quick hack for zsh to use advisory file locking, should prevent any database corruption. Furthermore, using $RANDOM for the temporary file prefix is not sufficient, switched to mktemp here to avoid collisions. It's entirely possible that the issue described here is simply due to the use of $RANDOM as well.

@balta2ar feel free to try it out if you're interested:

EDIT: Not sure this works as intended, will look closer tomorrow.

diff --git a/z.sh b/z.sh
diff --git a/z.sh b/z.sh
index 24e1272..e7cadbb 100644
--- a/z.sh
+++ b/z.sh
@@ -23,6 +23,8 @@
 #     * z -l foo  # list matches instead of cd
 #     * z -c foo  # restrict matches to subdirs of $PWD
 
+zmodload zsh/system
+
 [ -d "${_Z_DATA:-$HOME/.z}" ] && {
     echo "ERROR: z.sh's datafile (${_Z_DATA:-$HOME/.z}) is a directory."
 }
@@ -56,7 +58,9 @@ _z() {
         done
 
         # maintain the data file
-        local tempfile="$datafile.$RANDOM"
+        local tempfile="$(mktemp "${datafile}.XXXXXXXX")"
+        local lockfd
+        zsystem flock -t 2 -f lockfd "$datafile" || return
         awk < <(_z_dirs 2>/dev/null) -v path="$*" -v now="$(date +%s)" -F"|" '
             BEGIN {
                 rank[path] = 1
@@ -80,13 +84,9 @@ _z() {
                 } else for( x in rank ) print x "|" rank[x] "|" time[x]
             }
         ' 2>/dev/null >| "$tempfile"
-        # do our best to avoid clobbering the datafile in a race condition.
-        if [ $? -ne 0 -a -f "$datafile" ]; then
-            env rm -f "$tempfile"
-        else
-            [ "$_Z_OWNER" ] && chown $_Z_OWNER:$(id -ng $_Z_OWNER) "$tempfile"
-            env mv -f "$tempfile" "$datafile" || env rm -f "$tempfile"
-        fi
+        [ "$_Z_OWNER" ] && chown $_Z_OWNER:$(id -ng $_Z_OWNER) "$tempfile"
+        env mv -f "$tempfile" "$datafile" || env rm -f "$tempfile"
+        zsystem flock -u $lockfd
 
     # tab completion
     elif [ "$1" = "--complete" -a -s "$datafile" ]; then

@balta2ar
Copy link
Author

EDIT: Not sure this works as intended, will look closer tomorrow

Sure, I'm waiting for your signal.

@mafredri
Copy link

mafredri commented Mar 1, 2017

@balta2ar here's an updated patch. My uncertainty from before was with regards to using flock and then replacing the file with another (via mv). I confirmed that a move causes any pending process waiting to lock to fail. In this version we never replace the db and as such flock will work as expected.

diff --git a/z.sh b/z.sh
diff --git a/z.sh b/z.sh
index 24e1272..da4f411 100644
--- a/z.sh
+++ b/z.sh
@@ -23,6 +23,8 @@
 #     * z -l foo  # list matches instead of cd
 #     * z -c foo  # restrict matches to subdirs of $PWD
 
+zmodload zsh/system
+
 [ -d "${_Z_DATA:-$HOME/.z}" ] && {
     echo "ERROR: z.sh's datafile (${_Z_DATA:-$HOME/.z}) is a directory."
 }
@@ -55,8 +57,14 @@ _z() {
             case "$*" in "$exclude*") return;; esac
         done
 
+        # make sure datafile exists (for locking)
+        [[ -f "$datafile" ]] || touch "$datafile"
+
         # maintain the data file
-        local tempfile="$datafile.$RANDOM"
+        local tempfile="$(mktemp "${datafile}.XXXXXXXX")"
+        local lockfd
+        # lock datafile, released when function exits
+        zsystem flock -f lockfd "$datafile" || return
         awk < <(_z_dirs 2>/dev/null) -v path="$*" -v now="$(date +%s)" -F"|" '
             BEGIN {
                 rank[path] = 1
@@ -80,13 +88,10 @@ _z() {
                 } else for( x in rank ) print x "|" rank[x] "|" time[x]
             }
         ' 2>/dev/null >| "$tempfile"
-        # do our best to avoid clobbering the datafile in a race condition.
-        if [ $? -ne 0 -a -f "$datafile" ]; then
-            env rm -f "$tempfile"
-        else
-            [ "$_Z_OWNER" ] && chown $_Z_OWNER:$(id -ng $_Z_OWNER) "$tempfile"
-            env mv -f "$tempfile" "$datafile" || env rm -f "$tempfile"
-        fi
+        # replace contents of datafile with tempfile
+        env cat "$tempfile" >| "$datafile"
+        [ "$_Z_OWNER" ] && chown $_Z_OWNER:$(id -ng $_Z_OWNER) "$datafile"
+        env rm -f "$tempfile"
 
     # tab completion
     elif [ "$1" = "--complete" -a -s "$datafile" ]; then

@balta2ar
Copy link
Author

balta2ar commented Mar 1, 2017

@mafredri Thank you very much! I've applied the patch and did a short experiment trying to overwrite the DB. So far it seems to be acting good. But I'll keep an eye on it for while to be more sure.

I wonder, do you think flock command could be used here so that it's not zsh-specific and could be used in bash as well?

@mafredri
Copy link

mafredri commented Mar 1, 2017

@balta2ar that command does not exist on macOS, for example, so it wouldn't really be cross-platform :(.

@balta2ar
Copy link
Author

balta2ar commented Mar 1, 2017

@mafredri Could this be an option? Or this. It's not looking pretty though...

@mafredri
Copy link

mafredri commented Mar 1, 2017

It kind of beats the purpose of keeping z as a shell script if it relies on additional software being installed :/. This can of course be cleaned up to still work with bash as well, but unfortunately, bash would not enjoy the benefits of locking the DB.

@balta2ar
Copy link
Author

balta2ar commented Mar 1, 2017

bash would not enjoy the benefits of locking the DB

Sorry, I'm not following... Why?

if it relies on additional software

Yeah, you're right. But on the second hand, feature detection could be performed and it would still work without flock. It would work more reliably with it, though.

Anyway, I'm just thinking of a way to have it merged. Now, if I get it right, this won't work in bash as there is no zmodload in bash, obviously. Right, @mafredri?

@mafredri
Copy link

mafredri commented Mar 1, 2017

Sorry, I'm not following... Why?

As in, we would only use zsystem flock in zsh. When I talked about cleaning it up, I did not mean adding support for the flock-command on systems that have it available. In this scenario, bash would have no way to lock.

But on the second hand, feature detection could be performed and it would still work without flock. It would work more reliably with it, though.

Would be a possibility, only beneficial for bash though, since zsh can use zsystem flock.

[...]. Now, if I get it right, this won't work in bash [...]

Yes, correct, would not work in bash as it's a zsh only feature.

@balta2ar
Copy link
Author

balta2ar commented Mar 1, 2017

Yes, correct, would not work in bash as it's a zsh only feature

Thanks for clarifications! What I meant is that at the moment this patch renders z unusable in bash if applied as is, while it could be structured to detect whether we're running in zsh or bash and use zmodload only in zsh.

Anyway, looks like there is a working solution and I'm ok with patching z manually. Let's see how @rupa decides to integrate it.

@dfaure-kdab
Copy link

Ping? This makes z unusable for me.

@mafredri
Copy link

@dfaure-kdab it doesn't seem like rupa is active at the moment, but feel free to use #199 if you're using zsh.

@rupa
Copy link
Owner

rupa commented Apr 12, 2017

i'm aware of and thinking about this.

I'm definitely not in any hurry to add dependencies, although I'm considering bringin back mktemp if that would help.

It's tough for me to evaluate zsh stuff because: it's so configurable, so many people use it with an opinionated framework, and I don't use it.

Anyone that can post a nice setup that reliably clobbers the db, that would be cool

@mafredri
Copy link

@rupa I can take a stab at creating a repro for you.

PS. My PR does introduce the mktemp dependency, but no other. Uses feature detection to use flock if available (available in default configurations on systems that support flock). I can't think of many other methods that aren't racey in one form or another.

@mcornella
Copy link
Contributor

mcornella commented Aug 29, 2018

The culprit of this is the switch to using a background process inside a subshell, coupled with the fact that _z_precmd is called inside the precmd hook, instead of the chpwd one (proposed fix).

I've layed down the basics here: ohmyzsh/ohmyzsh#7094 (comment), but the basic gist is that in zsh, $RANDOM will give the same value on subshells, unless it is referenced or seeded in the parent process (see its documentation). This remarkably doesn't happen in bash.

You can reproduce this issue from a directory other than $HOME, with the following (save your .z file before attempting):

repeat 100 { (_z --add "${PWD:a}" &) }

It should give many "no such file or directory" errors, while this one:

repeat 100 { (_z --add "${PWD:a}" &); : $RANDOM }

shouldn't. Notice that I'm referencing $RANDOM after each subshell creation, thus changing its value in each subshell. We don't need mktemp or other dependencies to ensure a unique value then.

The downside of that is that $RANDOM in zsh is not entirely reliable. I made a basic test to check the uniqueness of a random-number sequence using $RANDOM and after 120+ generated numbers the sequence starts to have repeated numbers. Not to worry though, because that means that there'd have to be more than 120 concurrent _z executions for it cause trouble. Even less worrying given that I haven't been able to reproduce any collisions even after 1000 simultaneous runs.

That said, using a precmd hook means that each time the prompt is drawn _z will be called, with the associated cost in computation and IO that aren't doing anything useful.

So here's my proposed fix: use the chpwd hook instead of the precmd one, which only runs after changing to another directory. This is the easiest, most direct fix, and it also makes sense. It will also have the effect that holding down the enter key will not spam _z with added work.

Finally, there's an edge case even running it in a chwpd hook, which is cd-ing into two directories at once:

mkdir ~/{01,02}
cd ~/01; cd ~/02

This will again use the same $RANDOM number. This will probably never happen, as nobody does this, but adding the reference to $RANDOM at the end of the function will fix it for any use case nonetheless.

TL;DR Proposed change:

_z_chpwd() {
	(_z --add "${PWD:a}" &)
	: $RANDOM
}
chpwd_functions+=(_z_chpwd)

I hope that wasn't too verbose. Cheers!

@mcornella
Copy link
Contributor

Ok, so now that #215 is closed because using chpwd is not suitable, the proposed fix is just referencing $RANDOM after each subshell invocation:

_z_precmd() {
	(_z --add "${PWD:a}" &)
	: $RANDOM
}

Let me know if you want me to submit a PR.

@balta2ar
Copy link
Author

@mcornella Please do, I'd like to give it a try.

@mcornella
Copy link
Contributor

I've submitted #247

@dmd
Copy link

dmd commented Oct 24, 2019

Rupa: This would be worth looking into and dealing with, as use of zsh is going to go way up as Mac people move to Catalina, which is zsh by default (and actively deprecates bash).

@Sti2nd
Copy link

Sti2nd commented Mar 18, 2020

I experience that z will clear all visited paths from time to time. I don't know how to reproduce this, have happened three times the last three months.

Using z on Git Bash on Windows 10.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants