Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CacheFile#fsync() method to ensure cached data are written on disk #64201

Merged
merged 9 commits into from
Nov 3, 2020

Conversation

tlrx
Copy link
Member

@tlrx tlrx commented Oct 27, 2020

This pull request adds a new CacheFile.fsync() method to fsync the cache file on disk. This method uses the utility method IOUtils.fsync(Path, boolean) that executes a FileChannel.force() call under the hood.

@tlrx tlrx added >enhancement :Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs v8.0.0 v7.11.0 labels Oct 27, 2020
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-distributed (:Distributed/Snapshot/Restore)

@elasticmachine elasticmachine added the Team:Distributed Meta label for distributed team (obsolete) label Oct 27, 2020
try {
// check again if the file is evicted before doing expensive I/O
ensureOpen();
IOUtils.fsync(file, false); // TODO don't forget to fsync parent directory
Copy link
Member Author

@tlrx tlrx Oct 27, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method is intended to be called periodically by a task that will fsync all cache files of a given shard before fsyncing the parent directory.

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just two questions, looks good overall

try {
// check again if the file is evicted before doing expensive I/O
ensureOpen();
IOUtils.fsync(file, false); // TODO don't forget to fsync parent directory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder, can't we just call force on the file channel that we already have open if we have one and avoid fsync if we don't have one by fsyncing before closing the channel (would have to enhance acquireFileChannelReference to return null in case there's no eviction listeners on the current instance I guess)? No need to get a new file descriptor here maybe is there?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I also thought about borrowing the current file channel if we have one but I found that it added more complexity (mostly in incref/decref the file channel another time) to save a file descriptor that would be quickly released.

I can maybe give it another try by implementing something that use the file channel if there's one or use IOUtils.fsync() otherwise. Deferring the fsync at closing time does not seem the right thing to do as we don't want to fsync everytime a chunk is written but instead control when cache files are fsynced.

@@ -305,6 +312,7 @@ protected void doRun() throws Exception {
reference.decRef();
}
gap.onCompletion();
fsynced.set(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be better to set this false as early as possible (i.e. directly after the write to the file) to reduce contention statistically speaking? (since fsync and write will always be blocking each other anyway even if we do use multiple FileChannel instances I think).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like that fsynced not only indicate that something was written to the file but that ranges are also updated - although it is not a requirement. I'm not sure if it makes a big difference but I can move this earlier.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah that's a good point :)

Copy link
Member

@original-brownbear original-brownbear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either way (regarding the file channels) LGTM, that comment does only matter if the fsync interval we will use is small I guess :)

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few comments, looking good otherwise.

final List<Tuple<Long, Long>> completedRanges = tracker.getCompletedRanges();
assert completedRanges != null;

if (fsynced.compareAndSet(false, true)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an assumption of only ever going here in one thread, since otherwise we risk a concurrent thread seeing the completedRanges. I think that is fair, but perhaps for safety we should either lock the entire method or assert something to capture this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is OK, but your comment and your suggestion about returning an empty set of ranges if the cache file got evicted made me change a bit the semantic of this new fsync method.

I pushed d55278e which makes the method return a non-empty set of completed ranges only if the cache file was actually fsynced on disk; otherwise it returns an empty collection. This should address the case of concurrent threads and also simplify the logic to know if new ranges of data were effectively synchronized on disk or not.

try {
// check again if the file is evicted before doing expensive I/O
ensureOpen();
IOUtils.fsync(file, false); // TODO don't forget to fsync parent directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also fsync's the metadata of the file. I think that is unnecessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed - it is an extra I/O that we don't need. I pushed 6eea900

public void testFSync() throws Exception {
try (FSyncTrackingFileSystemProvider fileSystem = setupFSyncCountingFileSystem()) {
final CacheFile cacheFile = new CacheFile("test", randomLongBetween(100, 1000), fileSystem.resolve("test"));
assertFalse(cacheFile.isFSynced());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think counting the file as not fsync'ed when just created could lead to fsync'ing the empty file - which I think is unnecessary? We care mostly about the data, not the files.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right - I pushed 2f8b0e7

if (randomBoolean()) {
final long position = i * 10L; // simplify the test by completing small non-contiguous ranges
final Tuple<Long, Long> range = Tuple.tuple(position, position + 1L);
cacheFile.populateAndRead(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be good to also verify multiple populateAndRead calls within one fsync?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I randomized this in the test now.

* @throws AlreadyClosedException if the cache file is evicted
* @throws java.nio.file.NoSuchFileException if the cache file does not exist
*/
public List<Tuple<Long, Long>> fsync() throws IOException {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this return a SortedSet instead to signal the expected order verified by the test, as well as uniqueness?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good suggestion, thanks. I pushed 7b3590e

}

cacheFile.startEviction();
expectThrows(AlreadyClosedException.class, cacheFile::fsync);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can definitely be handled in a follow-up instead when you start using it. But it might be more appropriate to just return the empty set of ranges in this case, since we essentially having nothing cached for it, rather than have to exception handle in the fsync'er thread?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree. In case of an evicted file the method will now return an empty collection. See #64201 (comment)

if (fsynced.compareAndSet(false, true)) {
boolean success = false;
try {
// check again if the file is evicted before doing expensive I/O
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this is necessary, since there is no waiting above?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed it.

@tlrx
Copy link
Member Author

tlrx commented Nov 2, 2020

@henningandersen Thanks for your review. I updated the code and changed a bit the semantic of the fsync method. Can you please have another look and tell me what you think? Thanks!

Copy link
Contributor

@henningandersen henningandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for the extra iteration.

* considered as fsynced (the initialization value is {@code true}) when it is created; and writing new data to the cache file
* will toggle the flag to {@code false}.
**/
private final AtomicBoolean fsynced = new AtomicBoolean(true);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if calling this needsFsync (and reverting the boolean) would be slightly more intuitive?

With the current name, a reader would be inclined to think the boolean is only true when the file is fsync'ed, but this is not true (it is also still true while writing and completing the gap).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, it makes sense. I pushed d8fdaaa2aab25e78033906f2fa8c956a6e340441.

@tlrx tlrx merged commit 34fc303 into elastic:master Nov 3, 2020
@tlrx tlrx deleted the add-cache-file-fsync branch November 3, 2020 14:08
@tlrx
Copy link
Member Author

tlrx commented Nov 3, 2020

Thanks Armin and Henning! I'm merging this only on master for now and will backport this change and the futures one about the persistent cache in one go.

tlrx added a commit that referenced this pull request Nov 23, 2020
…iles (#64696)

This committ changes the searchable snapshot's CacheService so that 
it now periodically fsync cache files using the method introduced in #64201.

The synchronization is executed every 10 seconds by default (this interval 
can be changed using a new xpack.searchable.snapshot.cache.sync.interval
setting).
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Dec 11, 2020
elastic#64201)

This commit adds a new CacheFile.fsync() method to fsync the cache
file on disk. This method uses the utility method IOUtils.fsync(Path,
boolean) that executes a FileChannel.force() call under the hood.
tlrx added a commit that referenced this pull request Dec 12, 2020
#66200)

This commit adds a new CacheFile.fsync() method to fsync the cache
file on disk. This method uses the utility method IOUtils.fsync(Path,
boolean) that executes a FileChannel.force() call under the hood.

Backport of #64201 for 7.11
tlrx added a commit to tlrx/elasticsearch that referenced this pull request Dec 12, 2020
…iles (elastic#64696)

This committ changes the searchable snapshot's CacheService so that
it now periodically fsync cache files using the method introduced in elastic#64201.

The synchronization is executed every 10 seconds by default (this interval
can be changed using a new xpack.searchable.snapshot.cache.sync.interval
setting).
tlrx added a commit that referenced this pull request Dec 14, 2020
…files #64696  (#66216)

This committ changes the searchable snapshot's CacheService so that
it now periodically fsync cache files using the method introduced in #64201.

The synchronization is executed every 10 seconds by default (this interval
can be changed using a new xpack.searchable.snapshot.cache.sync.interval
setting).

Backport of #64696 for 7.11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Distributed Coordination/Snapshot/Restore Anything directly related to the `_snapshot/*` APIs >enhancement Team:Distributed Meta label for distributed team (obsolete) v7.11.0 v8.0.0-alpha1
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants