Shutdown issue with badgerDS - keeps reading from the disk #7283
Comments
Ah, interesting. Badger is garbage collecting at that point. Or, to be accurate, it's scanning to see if there's anything that needs to be garbage collected. I've filed dgraph-io/badger#1324. However, for now, we should probably do the same systemd-notify dance on shutdown.
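For context, a minimal sketch of what such a "systemd-notify dance" could look like in Go. The use of the go-systemd package, the ticker interval, and the 30-second extension are assumptions for illustration, not the actual go-ipfs code:

package main

import (
	"time"

	"github.com/coreos/go-systemd/v22/daemon"
)

// notifyStopping tells systemd that shutdown has started and keeps extending
// the stop timeout until done is closed, so a long badger close or GC scan
// is not cut short by SIGKILL.
func notifyStopping(done <-chan struct{}) {
	daemon.SdNotify(false, daemon.SdNotifyStopping) // sends "STOPPING=1"
	ticker := time.NewTicker(10 * time.Second)
	defer ticker.Stop()
	for {
		select {
		case <-done:
			return
		case <-ticker.C:
			// Ask systemd for 30 more seconds; the value is in microseconds.
			daemon.SdNotify(false, "EXTEND_TIMEOUT_USEC=30000000")
		}
	}
}

func main() {
	done := make(chan struct{})
	go notifyStopping(done)
	// ... close the datastore here, then:
	close(done)
}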
...? I had to wait a while to get all the data collected for the other bug report, but I guess badger is doing the same thing on startup, since the second startup completes within 1 second.
Sorry, dangling edit.
Well, on startup badger may need to clean something if it was killed on shutdown. Otherwise, I'm not sure what it's doing.
Yeah, the stack trace suggests something like that is happening: "valuelog open, valuelog replayLog, valuelog iterate". But it's strange that the same datastore can be opened within a second if the first opening process is killed. Maybe there's some detection for an unclean recovery which avoids this work on the second attempt?
Ah, badger may then recognize that the datastore is corrupted and, instead of trying to fix it, just truncate the unsynced changes (we've configured it to do that because we explicitly call …).
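A minimal sketch of that truncate-on-open behaviour, assuming badger v2 and its WithTruncate option, which is roughly what go-ds-badger enables; the path and the rest of the options here are made up:

package main

import (
	"log"

	badger "github.com/dgraph-io/badger/v2"
)

func main() {
	// With Truncate enabled, badger drops the unsynced tail of the value log
	// after an unclean shutdown instead of refusing to open the datastore.
	opts := badger.DefaultOptions("/path/to/badgerds").WithTruncate(true)
	db, err := badger.Open(opts)
	if err != nil {
		log.Fatal(err)
	}
	defer db.Close()
}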
I thought about that again... what triggers this garbage collection in the first place? Shouldn't we only start to garbage collect right after the IPFS GC has run (which is not active in this setup)? 🤔 I mean, is there anything badger could clean up if we haven't run our own GC?
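For reference, badger's value-log GC has to be driven by the caller via RunValueLogGC; badger never rewrites value-log files on its own. A rough sketch of a periodic trigger, with an illustrative interval and discard ratio rather than the real go-ds-badger settings:

package gcdemo

import (
	"time"

	badger "github.com/dgraph-io/badger/v2"
)

// gcLoop periodically asks badger to reclaim space from the value log.
func gcLoop(db *badger.DB, stop <-chan struct{}) {
	ticker := time.NewTicker(15 * time.Minute)
	defer ticker.Stop()
	for {
		select {
		case <-stop:
			return
		case <-ticker.C:
			// Rewrite value-log files that are at least 50% stale; the loop
			// ends once badger returns ErrNoRewrite (nothing left to reclaim).
			for db.RunValueLogGC(0.5) == nil {
			}
		}
	}
}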
@Stebalien what data exactly is stored temporarily or semi-temporarily in the datastore that would accumulate if we didn't run the badger GC? DHT data? If so, can I avoid this background GC by switching to DHTclient? I'm just looking for a temporary solution so that my shutdowns don't crash :) Regarding the badger GC, I wrote in ipfs/go-ds-badger#54 (comment):
This would make the behavior a bit more transparent.
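For reference, the DHT client switch itself is just a config change in go-ipfs; whether that actually avoids the background GC is exactly the open question above:

ipfs config Routing.Type dhtclient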
DHT data, local provider records, other misc stuff? I'd extend your shutdown timer for now.
Also, how much data do you have?
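A sketch of what extending the shutdown timer could look like as a systemd drop-in; the unit name ipfs.service and the 10-minute value are assumptions, not taken from this setup:

# /etc/systemd/system/ipfs.service.d/override.conf  (hypothetical unit name)
[Service]
TimeoutStopSec=10min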
That is the real database, not the test database:
But I plan to use a lot more storage on this server for another cluster... something like 1-1.5 TB. I mean, reading up to 2 TB at 2 MB/s isn't going to finish any time soon (if all the data really gets read). And shutting down hard on every security update isn't a good option either.
Version information:
Commit 591c541
Description:
ipfs init --profile=badgerds
QmdB8kVBeWvLKyZrvxAAzrVfkLZC3zqcu6o7twLAqUcC67
Then I tried to shut down the daemon. Unexpectedly, IPFS started reading from the disk while nothing was being written (according to iotop, for minutes):
The following experimental features were activated at the time in the config:
Datastore config:
cap_net_bind_service=+ep is set on the ipfs binary to be able to run on port 443. LIBP2P_SWARM_FD_LIMIT was set to 1000.
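For reference, that capability is typically granted with setcap; the binary path here is taken from the daemon command below:

sudo setcap cap_net_bind_service=+ep /usr/bin/ipfs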
The daemon is started with: /usr/bin/ipfs daemon --init --migrate
I fetched the debug data and killed it with SIGABRT to get the stack trace - both are attached.
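For anyone reproducing this, assuming a single running ipfs process, the abort signal can be sent with:

kill -ABRT $(pidof ipfs)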
stacktrace.txt
debug.tar.gz