-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow searchable snapshot cache service to periodically fsync cache files #64696
Changes from 8 commits
012eee2
499f12b
31299c1
bf0e660
9085f81
fb10c6f
f5d9559
684e01e
6b238d6
eabe562
af47aaa
437c8ea
c20bf75
365812d
8fe920b
74f728f
5091a5d
f69dde9
395845d
082b76d
ea34570
54d051d
e8ad9be
6066924
e9721c4
bd1ce13
46adb3c
04a4d77
db24043
10216ea
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -650,6 +650,36 @@ public void remove() { | |
}; | ||
} | ||
|
||
/** | ||
* An LRU sequencing of the entries in the cache. This sequence is not protected from mutations | ||
* to the cache (except for {@link Iterator#remove()}. The result of iteration under any other mutation is | ||
* undefined. | ||
* | ||
* @return an LRU-ordered {@link Iterable} over the entries in the cache | ||
*/ | ||
public Iterable<Tuple<K,V>> entries() { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This looks unused now? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Just a drive by comment @tlrx this is still unused and can probably go away, just in case you missed it :) I'll take a proper look at this PR in general now There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Thanks! I'll remove it.
I was about to ping you once Henning approved the direction :) There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I removed the unused |
||
return () -> new Iterator<>() { | ||
|
||
private final CacheIterator iterator = new CacheIterator(head); | ||
|
||
@Override | ||
public boolean hasNext() { | ||
return iterator.hasNext(); | ||
} | ||
|
||
@Override | ||
public Tuple<K,V> next() { | ||
final Cache.Entry<K, V> next = iterator.next(); | ||
return Tuple.tuple(next.key, next.value); | ||
} | ||
|
||
@Override | ||
public void remove() { | ||
iterator.remove(); | ||
} | ||
}; | ||
} | ||
|
||
private class CacheIterator implements Iterator<Entry<K, V>> { | ||
private Entry<K, V> current; | ||
private Entry<K, V> next; | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -77,6 +77,13 @@ protected void closeInternal() { | |
**/ | ||
private final AtomicBoolean needsFsync = new AtomicBoolean(); | ||
|
||
/** | ||
* A consumer that accepts the current {@link CacheFile} instance every time the {@link #needsFsync} flag is toggled to {@code true}, | ||
* which indicates that the cache file has been updated. | ||
* See {@link #markAsNeedsFSync()} method. | ||
*/ | ||
private final Consumer<CacheFile> needsFsyncListener; | ||
|
||
/** | ||
* A reference counted holder for the current channel to the physical file backing this cache file instance. | ||
* By guarding access to the file channel by ref-counting and giving the channel its own life-cycle we remove all need for | ||
|
@@ -117,10 +124,11 @@ protected void closeInternal() { | |
@Nullable | ||
private volatile FileChannelReference channelRef; | ||
|
||
public CacheFile(String description, long length, Path file) { | ||
public CacheFile(String description, long length, Path file, @Nullable Consumer<CacheFile> fsyncListener) { | ||
this.tracker = new SparseFileTracker(file.toString(), length); | ||
this.description = Objects.requireNonNull(description); | ||
this.file = Objects.requireNonNull(file); | ||
this.needsFsyncListener = fsyncListener != null ? fsyncListener : cacheFile -> {}; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Looks like we're only ever passing There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done in af47aaa |
||
assert invariant(); | ||
} | ||
|
||
|
@@ -320,7 +328,7 @@ protected void doRun() throws Exception { | |
reference.decRef(); | ||
} | ||
gap.onCompletion(); | ||
needsFsync.set(true); | ||
markAsNeedsFSync(); | ||
} | ||
|
||
@Override | ||
|
@@ -420,6 +428,15 @@ boolean needsFsync() { | |
return needsFsync.get(); | ||
} | ||
|
||
/** | ||
* Marks the current cache file as "fsync needed" and notifies the corresponding listener. | ||
*/ | ||
private void markAsNeedsFSync() { | ||
if (needsFsync.getAndSet(true) == false) { | ||
needsFsyncListener.accept(this); | ||
} | ||
} | ||
|
||
/** | ||
* Ensure that all ranges of data written to the cache file are written to the storage device that contains it. This method performs | ||
* synchronization only if data has been written to the cache since the last time the method was called. If calling this method | ||
|
@@ -444,12 +461,12 @@ public SortedSet<Tuple<Long, Long>> fsync() throws IOException { | |
assert completedRanges != null; | ||
assert completedRanges.isEmpty() == false; | ||
|
||
IOUtils.fsync(file, false, false); // TODO don't forget to fsync parent directory | ||
IOUtils.fsync(file, false, false); | ||
success = true; | ||
return completedRanges; | ||
} finally { | ||
if (success == false) { | ||
needsFsync.set(true); | ||
markAsNeedsFSync(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will add it back on the queue, so we continually try to fsync a bad file. I saw the comments Armin made in other places about this, we can tackle this in the same follow-up. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Let's do that 👍 |
||
} | ||
} | ||
} | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think we can but the existing tests assume that fsync can be executed at any time even when fsync is not needed. Those tests should be adapted as well, maybe in a follow up? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 👍 |
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is true. In fact, it looks like the lru-chain would be unsafely published with no happens-before to the reader.
I think the iteration of the entries done in this PR is thus unsafe, at least it might skip some entries.
I think the same applies to iterating over
keys()
, which we do a few places.Together with other comments in this PR, this makes me think that perhaps it was easier to let
CacheFile
's that need fsync register with theCacheService
(or a registry in between)?