Skip any erroring entry in FilesystemStore::list
#3799
Conversation
👋 Thanks for assigning @TheBlueMatt as a reviewer!
What do we want to do about `MigratableKVStore::list_all_keys`?
I also feel like we need to specify the expected semantics at the API level here. This change can allow for an inconsistent list view - e.g., if someone wrote key A and then deleted key B, the list API could return a result without A and without B, which is weird in some cases.
I don't think we should change anything about that, as for migration we'd want to be as strict as possible? For one, users really shouldn't run migration in parallel to anything else (which is basically already documented on ...)
Are we sure this would be the case? I'm not sure if it is honestly. Before we change the docs, we should probably first try to understand better what exactly happened in the user-reported instance.
I think so. We'd first list the directory entries (at that point containing A but not B), then the other thread can remove A and write B, and the loop will go through, detect that A is missing and ignore it, but never look for B. Absent a full restart of the whole loop and giving up after a while I'm not sure how we'd fix it, though, honestly.
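To make the race described above concrete, here is a minimal, purely illustrative sketch (hypothetical helper, not the actual `FilesystemStore` code) of how a `read_dir` snapshot plus fresh per-path metadata lookups can yield a result containing neither A nor B:

```rust
use std::fs;
use std::io::ErrorKind;
use std::path::Path;

// Hypothetical listing helper, for illustration only.
fn list_keys(dir: &Path) -> std::io::Result<Vec<String>> {
	let mut keys = Vec::new();
	// (1) The directory snapshot is taken here; assume it contains key A but not B.
	for entry in fs::read_dir(dir)? {
		let entry = entry?;
		// (2) Another thread may now delete A and write B.
		// (3) The fresh metadata lookup for A then fails with `NotFound`, so A is
		//     skipped; B was never part of the snapshot, so neither key is returned.
		match entry.path().metadata() {
			Ok(m) if m.is_file() =>
				keys.push(entry.file_name().to_string_lossy().into_owned()),
			Ok(_) => {}, // directories represent sub-namespaces, not keys
			Err(e) if e.kind() == ErrorKind::NotFound => {}, // entry vanished underneath us
			Err(e) => return Err(e),
		}
	}
	Ok(keys)
}
```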
Issues like this remind me that we still need to provide a better default ...
✅ Added second reviewer: @valentinewallace
Coming back to this: I don't think we'd want to change any ...
The same issue applies in any KVStore once we're looking at multi-level issues.
Actually, we should be able to fix this by calling https://doc.rust-lang.org/std/fs/struct.DirEntry.html#method.file_type instead.
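For reference, a rough sketch of that suggestion (hypothetical helper name, and simpler than the real `dir_entry_is_key`): `DirEntry::file_type` can, on many platforms, be answered from the data the original `read_dir` call already returned, so no separate lookup on the entry's path is required.

```rust
use std::fs;
use std::io;

// Illustrative only: classify the entry via the `DirEntry` itself instead of a
// fresh `Path::metadata` call that can race with a concurrent deletion.
fn entry_is_regular_file(entry: &fs::DirEntry) -> io::Result<bool> {
	Ok(entry.file_type()?.is_file())
}
```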
How come? IIUC, ... So why would this reliably solve the issue at hand?
Currently the code calls ...
Not sure if I follow here. Now pushed a fixup using ...
Oh, hmm, my memory was fuzzy on what ... Sadly, per ...
Also, actually, ...
Because “directories” are a concept that is only known to the specific implementation that is ...
Right, but generally they indicate the presence of a sub-namespace. Which presumably we should return per the docs on ...
No, we discussed this several times in the past, and IIRC we can't just require that a) due to how our upgrade path from the original ... So, TLDR: yes, it would have been great, but we didn't/couldn't do it this way, which is also why we had to introduce a separate ...
See above: directories are a filesystem-only concept. And yes, technically we do treat them differently in that we don't return them at all, which we can't easily change without breaking backwards compatibility.^^
Not the biggest fan of adding a loop there as it might be unexpected for users that we just retry in this particular case.
Not sure how we can fix this issue without either (a) looping until we succeed or (b) changing the KVStore API so that we're allowed to return partial results in case of "tears" here. The current PR fixes the logic for ext4/btrfs, but not every FS, and e.g. I doubt it fixes it for c=.
It really shows that using the filesystem simply has some shortcomings and doesn't provide basic ACID guarantees, especially across different platforms. In my eyes neither a) nor b) is an option:

a) If we'd add a loop, we def. can't loop forever. So we'll have to add proper retry logic on our end, aborting the operation after N retries. But any magic number N here could lead to behavior that the user doesn't expect, especially if the API contract is that we bubble up IO errors. We'd need to make N configurable and document it, which is in turn an impossible API since no user will really understand what this parameter is for.

b) I don't think this is a good option either. IMO we fixed what was immediately addressable in this PR, but the real fix will be to add fully-supported SQLite/Postgres backends to ...
They already implement retry logic around the ...
Thanks for the ping.
If I get the time and mandate, something like this would be fun to write. Now that you all use rustfmt, I bet it wouldn't even get 500 comments.
This isn't true. We screwed up the directory structure and are hitting this issue because of it. If we put top-level keys in a directory instead we'd be just fine here. Plenty of large applications use filesystems as K-V stores and they generally are really good at it!
Given we've hit this only recently in prod even with a large node that does high-latency persistence, I imagine a few loops will ~always work in practice unless there's something really bad going on. Indeed, we might want a limit, but that limit can be fairly high and I don't think we need to worry about that too much.
Or not?
I tend to agree. Hence why we should add a loop :)
This is news to me? What requirements are we adding and why would they be problematic here?
Which probably means we really need to take that logic upstream and add a loop?
Force-pushed from 1039663 to fba1095.
As mentioned elsewhere, I still think retry logic shouldn't necessarily be the concern of ...
Force-pushed from 5d9b3b1 to 989cc6f.
Codecov Report
❌ Patch coverage is ...
Additional details and impacted files:
@@ Coverage Diff @@
## main #3799 +/- ##
==========================================
- Coverage 88.93% 88.93% -0.01%
==========================================
Files 174 174
Lines 123876 123875 -1
==========================================
- Hits 110173 110170 -3
- Misses 11250 11251 +1
- Partials 2453 2454 +1
==========================================
Previously, the internal `dir_entry_is_key` method would take a path argument, which would risk a race when the respective entry was modified in the time between the original `fs::read_dir` call and the `Path::metadata` call. Here, we instead have it take a `DirEntry` argument, which--at least on some platforms--should allow us to avoid this race as `DirEntry::metadata` would return the original (cached) metadata.
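A simplified sketch of the change described in this commit message (the real `dir_entry_is_key` performs additional checks beyond this):

```rust
use std::fs;
use std::io;

// Take the `DirEntry` rather than a `Path`: on platforms where
// `DirEntry::metadata` is served from data cached by the original `read_dir`
// call, this avoids a second, racy lookup on the entry's path.
fn dir_entry_is_key(dir_entry: &fs::DirEntry) -> io::Result<bool> {
	let metadata = dir_entry.metadata()?;
	// Directories represent sub-namespaces rather than key files.
	Ok(!metadata.is_dir())
}
```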
Previously, we might hit a certain race condition between `fs::read_dir` and actually accessing the directory entry in `FilesystemStore::list`. Here, we introduce a `loop` retrying the listing (by default 10 times) when we suddenly hit a `NotFound` error, to hopefully reach a consistent view on the second/third time around.
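And a condensed sketch of the retry approach from this commit (the constant name matches the PR diff further below, but the surrounding code is illustrative rather than the actual `FilesystemStore::list` implementation). Once the retries are exhausted, the `NotFound` error falls through to the final `Err` arm and is bubbled up as before:

```rust
use std::fs;
use std::io::{Error, ErrorKind};
use std::path::Path;

const LIST_DIR_CONSISTENCY_RETRIES: usize = 10;

// Illustrative listing with a consistency-retry loop.
fn list_keys(prefixed_dest: &Path) -> Result<Vec<String>, Error> {
	let mut retries = LIST_DIR_CONSISTENCY_RETRIES;
	'retry: loop {
		let mut keys = Vec::new();
		for entry_res in fs::read_dir(prefixed_dest)? {
			let entry = entry_res?;
			match entry.metadata() {
				Ok(m) if m.is_file() =>
					keys.push(entry.file_name().to_string_lossy().into_owned()),
				Ok(_) => {}, // sub-namespace directory, not a key
				// An entry vanished between `read_dir` and the metadata lookup:
				// restart the whole listing to hopefully reach a consistent view.
				Err(e) if e.kind() == ErrorKind::NotFound && retries > 0 => {
					retries -= 1;
					continue 'retry;
				},
				Err(e) => return Err(e),
			}
		}
		return Ok(keys);
	}
}
```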
Force-pushed from 989cc6f to 9d32931.
Amended with the following changes:
diff --git a/lightning-persister/src/fs_store.rs b/lightning-persister/src/fs_store.rs
index 735b68165..5580382b3 100644
--- a/lightning-persister/src/fs_store.rs
+++ b/lightning-persister/src/fs_store.rs
@@ -36,5 +36,5 @@ const GC_LOCK_INTERVAL: usize = 25;
// The number of times we retry listing keys in `FilesystemStore::list` before we give up reaching
// a consistent view and error out.
-const LIST_DIR_CONSISTENCY_RETRIES: usize = 3;
+const LIST_DIR_CONSISTENCY_RETRIES: usize = 10;
/// A [`KVStoreSync`] implementation that writes to and reads from the file system.
@@ -328,5 +328,5 @@ impl KVStoreSync for FilesystemStore {
match res {
Ok(true) => {
- let key = git_key_from_dir_entry_path(&p, &prefixed_dest)?;
+ let key = get_key_from_dir_entry_path(&p, &prefixed_dest)?;
keys.push(key);
},
@@ -405,5 +405,5 @@ fn dir_entry_is_key(dir_entry: &fs::DirEntry) -> Result<bool, lightning::io::Err
}
-fn git_key_from_dir_entry_path(p: &Path, base_path: &Path) -> Result<String, lightning::io::Error> {
+fn get_key_from_dir_entry_path(p: &Path, base_path: &Path) -> Result<String, lightning::io::Error> {
match p.strip_prefix(&base_path) {
Ok(stripped_path) => {
@@ -469,5 +469,5 @@ impl MigratableKVStore for FilesystemStore {
let primary_namespace = String::new();
let secondary_namespace = String::new();
- let key = git_key_from_dir_entry_path(&primary_path, prefixed_dest)?;
+ let key = get_key_from_dir_entry_path(&primary_path, prefixed_dest)?;
keys.push((primary_namespace, secondary_namespace, key));
continue 'primary_loop;
@@ -481,7 +481,7 @@ impl MigratableKVStore for FilesystemStore {
if dir_entry_is_key(&secondary_entry)? {
let primary_namespace =
- git_key_from_dir_entry_path(&primary_path, prefixed_dest)?;
+ get_key_from_dir_entry_path(&primary_path, prefixed_dest)?;
let secondary_namespace = String::new();
- let key = git_key_from_dir_entry_path(&secondary_path, &primary_path)?;
+ let key = get_key_from_dir_entry_path(&secondary_path, &primary_path)?;
keys.push((primary_namespace, secondary_namespace, key));
continue 'secondary_loop;
@@ -495,8 +495,8 @@ impl MigratableKVStore for FilesystemStore {
if dir_entry_is_key(&tertiary_entry)? {
let primary_namespace =
- git_key_from_dir_entry_path(&primary_path, prefixed_dest)?;
+ get_key_from_dir_entry_path(&primary_path, prefixed_dest)?;
let secondary_namespace =
- git_key_from_dir_entry_path(&secondary_path, &primary_path)?;
- let key = git_key_from_dir_entry_path(&tertiary_path, &secondary_path)?;
+ get_key_from_dir_entry_path(&secondary_path, &primary_path)?;
+ let key = get_key_from_dir_entry_path(&tertiary_path, &secondary_path)?;
keys.push((primary_namespace, secondary_namespace, key));
} else {
continue 'skip_entry;
},
Err(e) => {
	if e.kind() == lightning::io::ErrorKind::NotFound && retries > 0 {
I don't think this is right because the `dir_entry_is_key` `metadata()` call gets its error mapped to an `ErrorKind::Other`.
Ah, good catch, I had forgotten we started to do that at some point. Now added a commit reverting this, as remapping to `ErrorKind::Other` is nonsense, IMO, if we already have a more strongly typed `io::Error`.
If we already have an `io::Error`, we shouldn't remap to `Other`, as it would have us lose type information.
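As a minimal illustration of the point (hypothetical helper, not the actual code), propagating the existing error with `?` keeps its `ErrorKind` intact, so callers can still match on e.g. `NotFound` the way the retry loop above does:

```rust
use std::io::Error;
use std::path::Path;

// Illustrative only. Previously an error from `metadata()` might have been
// remapped via `.map_err(|_| Error::new(ErrorKind::Other, "..."))`, erasing its
// original kind; simply using `?` preserves it.
fn entry_len(path: &Path) -> Result<u64, Error> {
	let metadata = std::fs::metadata(path)?;
	Ok(metadata.len())
}
```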
Gonna go ahead and land this since it's pretty straightforward, whether it suffices or not I dunno.
Closes #3795.
Previously, we would bubble up any error that we'd encounter while retrieving the metadata for all listed entries. Here we relax this somewhat to allow for minor inconsistencies between reading the directory entries and checking whether they are valid key entries.