ipfs commands such as `id` fail when daemon has been started and stopped. #5784
Sad face, see ipfs/kubo#5784 for details.
Thanks for the bug report, and the script to reproduce. My `api` file does get cleaned out on macOS, but I see the same behavior you experienced on Linux. That's definitely messed up.
The ipfs daemon may take a little while to shut down after you kill it with a signal, so if you try to use a command while it is shutting down you will get this error. Try adding a sleep after the kill and it should work. A better error message could be useful, though.
@kevina thanks for the suggestion, but the `$IPFS_PATH/api` file is still there.

A simple work-around in Peergos (where we manage an ipfs runtime programmatically) is to remove the `$IPFS_PATH/api` file before starting. However, it is surprising behaviour that takes time to understand, hence the bug report.
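That work-around can be sketched as a small shell helper; this is an illustration only (the function name `clear_stale_api` is hypothetical, but the `$IPFS_PATH/api` path is the one discussed in this thread):

```shell
#!/bin/sh
# clear_stale_api: remove a leftover $IPFS_PATH/api file before (re)starting
# the daemon, so client commands don't try to reach a dead HTTP endpoint.
# Hypothetical helper name; the file layout matches this issue.
clear_stale_api() {
  repo="${1:-$IPFS_PATH}"
  if [ -f "$repo/api" ]; then
    echo "removing stale $repo/api"
    rm -f "$repo/api"
  fi
}

# Typical use before launching the daemon:
#   clear_stale_api "$IPFS_PATH"
#   ipfs daemon &
```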
@cboddy Thanks so much for reporting this in such a clear and descriptive way. I'm assigning myself here because this was the error I encountered when trying IPFS for the very first time, and it drove me (out of my frustration as a newbie who wants it all and wants it now) pretty close to leaving the project altogether.
This works for me:

```bash
#!/bin/bash
set -x
set -e
IPFS=ipfs

function run_it() {
    export IPFS_PATH=$(mktemp -d)
    $IPFS init
    $IPFS id        # completes fine
    $IPFS daemon &
    sleep 5
    kill $!
    #$IPFS id       # will fail
    sleep 10
    $IPFS id        # ok
}
run_it
```
@cboddy, @kevina is right in my case: my `api` file is consistently deleted. @cboddy, if you run the daemon (in your script, for instance) with the -D option and look at the last dozen lines of output, do you see logging from the cleanup code?
Yes, if we really kill the process it won't clean up the `api` file:

```bash
#!/bin/bash
set -x
set -e
IPFS=ipfs

function run_it() {
    export IPFS_PATH=$(mktemp -d)
    $IPFS init
    $IPFS id                # completes fine
    $IPFS daemon &
    sleep 10
    kill -SIGKILL $!        # Die fast, don't wait on any cleanup.
    ! $IPFS id              # Allow this command to bypass `-e` check.
    while [ $? -eq 0 ]; do
        sleep 3
        ! $IPFS id
    done                    # This never ends.
}
run_it
```
I'd say that if SIGTERM (or, likewise, SIGINT) doesn't clean up the `api` file, that's a bug; if SIGKILL doesn't clean it up, we could do better, but that's debatable. What's the right thing? Should we treat `api` like a pidfile and hold an exclusive lock on it to indicate the presence of the daemon, instead of create/delete?
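In shell terms the lock idea could look roughly like this. This is a sketch only: go-ipfs does not work this way today, `api.lock` is an invented file name, and `flock(1)` stands in for whatever locking mechanism the daemon would actually use:

```shell
#!/bin/sh
# Sketch: instead of create/delete of the api file signalling daemon presence,
# the daemon would hold an exclusive lock (here on an invented api.lock file)
# for its whole lifetime. Clients probe the lock to learn whether a daemon is
# alive; a crash or SIGKILL releases the lock automatically.
daemon_running() {
  repo="${1:-$IPFS_PATH}"
  if flock -n "$repo/api.lock" true 2>/dev/null; then
    return 1   # we grabbed the lock: nothing else holds it, no daemon
  else
    return 0   # lock is busy: something (presumably the daemon) is alive
  fi
}
```

The appeal of this design is exactly the point raised above: the kernel releases the lock no matter how the daemon dies, so there is no stale state to clean up.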
I removed the bug label and changed this to an enhancement. It is a bug if killing with SIGTERM/SIGINT doesn't (eventually) remove the $IPFS_PATH/api file. When killed with SIGKILL there is no way to clean this up; we should be more intelligent about what to do afterwards. As previously mentioned, if
Actually @cboddy, if you can confirm again that your `api` file isn't removed on SIGTERM, then it's a bug of some sort. We just can't reproduce it ourselves; at least I can't. Did you check the debug logging output to see whether the cleanup operations are taking place?
This is basically a duplicate of #2395.
If this issue (which is labeled as a bug in the duped issue) is so painful that some of our dedicated community members almost dropped the project over it, imagine how many people ran into it and didn't have the perseverance to push through that frustration and become long-term contributors. That is sufficient reason to prioritize resolving this issue (i.e. cleaning this file up consistently), or at least to add much better error handling, so this rough edge stops burning contributor time.
So, the bug here isn't that the API file isn't getting cleaned up on KILL (@kevina's right, there's nothing we can do about that). The bug is that we're not cleaning it up later. This came up in ipfs/ipfs-desktop#722 as well. The tricky part is that having an API file without a running daemon is actually a feature: it's how users tell IPFS to use a remote IPFS instance. However, while fixing this in ipfs-desktop is a bit tricky, it's actually pretty easy to fix here:

1. When `$IPFS_PATH/api` exists, try to connect to the daemon at that address.
2. If that fails, remove the `api` file and fall back to using the local repo directly.
Step 2 will fail if the local repo is either locked (a daemon is running) or hasn't been initialized (we're using a remote daemon). The downside is that this prevents users from switching back and forth between a local and a remote daemon; I'm not sure that's something we really need to support.
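At the shell level, the probe-then-fall-back behaviour could be sketched as below. This is an illustration, not the real fix (which would live in fsrepo.go): the multiaddr parsing and the `/api/v0/version` probe path are assumptions, and any HTTP response at all is treated as "daemon alive".

```shell
#!/bin/sh
# Sketch of the two-step fix: (1) try to reach the daemon named in
# $IPFS_PATH/api; (2) if nothing answers, remove the stale file so commands
# fall back to the local repo. Assumes the api file holds a multiaddr such
# as /ip4/127.0.0.1/tcp/5001.
api_alive() {
  repo="${1:-$IPFS_PATH}"
  [ -f "$repo/api" ] || return 1
  addr=$(cat "$repo/api")
  host=$(echo "$addr" | cut -d/ -f3)
  port=$(echo "$addr" | cut -d/ -f5)
  # Without -f, curl exits 0 on any HTTP response, which is all we need:
  # a response of any kind means something is listening there.
  if curl -s -m 2 -o /dev/null "http://$host:$port/api/v0/version"; then
    return 0
  fi
  rm -f "$repo/api"   # nobody answered: drop the stale file
  return 1
}
```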
Yes, the KILL was just an example; the system can also crash and we would be in the same scenario.

That may be so, but the first steps any user performs don't acknowledge that, nor the importance of having a

From my own personal interpretation of reading the docs, the idea we're normally selling here (and that's something we may need to revise) is that you can just run

If we all agree on this, I'll try to apply this fix.
In all cases where we have a valid local repo, yes.
We can instruct users to create a separate
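The remote-daemon use case discussed above can be sketched as a repo path that holds nothing but an `api` file. This is an illustration of the convention described in this issue; the helper name and the addresses are placeholders:

```shell
#!/bin/sh
# make_remote_path: build a minimal IPFS_PATH whose only content is an api
# file, so commands run against a remote daemon instead of a local repo.
# Hypothetical helper; the api-file convention is the one this issue describes.
make_remote_path() {
  dir=$(mktemp -d)
  echo "$1" > "$dir/api"
  echo "$dir"
}

# Usage (placeholder address):
#   IPFS_PATH=$(make_remote_path "/ip4/10.0.0.5/tcp/5001") ipfs id
```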
We should also prevent
@kevina thanks for that; yes, that works for me as well. I'd also suggest pinging the remote endpoint defined in the api file to ensure it is available (as first suggested by @Stebalien).
It shouldn't really add any overhead. If the
Unfortunately, I don't understand enough about the

Hopefully, ipfs/go-ipfs-cmds#138 should mitigate this issue by providing enough information to the user to work around this (blocking) problem.
@schomatis, the only way you're going to better understand how all this works is by reading the code and trying to fix bugs like this.
Version information:

Output from `ipfs version --all`

Type:

Bug

Description:

When running an ipfs command other than `ipfs daemon`, the existence of the file `$IPFS_PATH/api` decides whether or not to use the HTTP API to complete the command (e.g. `ipfs id` does this). This happens in fsrepo.go.

The file `$IPFS_PATH/api` is created when the `ipfs daemon` command is started. The problem is that when the daemon process is shut down via a signal, it does not remove `$IPFS_PATH/api`.

This leads to the behaviour that after an `ipfs daemon` process has been started and terminated, other ipfs commands that can be fulfilled with or without the HTTP API (e.g. `ipfs id`) will always fail with an error message. A minimal example that reproduces the failure is shown in the scripts above.

I don't think this is the intended behaviour (?).

If there isn't already a signal handler to remove `$IPFS_PATH/api` (via FSRepo.Close) when the daemon is interrupted, could we add one for this?

Alternatively, if the command fails in this manner but it can be fulfilled without the ipfs HTTP API, why not do it without HTTP and remove `$IPFS_PATH/api` at that point?
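Until such a handler exists in the daemon itself, a user-side wrapper can approximate it, with the caveat that SIGKILL can never be trapped. A minimal sketch (`clean_on_exit` is a hypothetical name):

```shell
#!/bin/sh
# clean_on_exit REPO CMD...: run CMD and remove REPO/api when CMD finishes or
# the wrapper receives TERM/INT. Note: KILL cannot be trapped, so this only
# covers the signals the daemon could in principle handle itself.
clean_on_exit() {
  repo="$1"; shift
  trap 'rm -f "$repo/api"' TERM INT EXIT
  "$@"
}

# Typical use:
#   clean_on_exit "$IPFS_PATH" ipfs daemon
```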