Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow merge operations to yield to a concurrent close #185

Merged
merged 2 commits into from
Jun 26, 2020

Conversation

craigfe
Copy link
Member

@craigfe craigfe commented Jun 25, 2020

Fix #180.

The change is in two parts:

  1. generalise threads to return results from children;

  2. implement a pre-emptable merge operation (using thread results to propagate the return status of asynchronous merges back to the parent thread).

@craigfe craigfe force-pushed the cancellable-merge branch from 1ad552a to c1da2e6 Compare June 25, 2020 12:34
@icristescu
Copy link
Contributor

Maybe it is a stupid question, but why not use thread.kill? We can keep the thread that is doing the merge as mutable field in index and kill it in the close operation, no?

@craigfe
Copy link
Member Author

craigfe commented Jun 25, 2020

Not a stupid question at all. I considered it, but there are a few reasons for not going with that approach:

  • We need to be sure the merge lock is treated properly. Thread.kill might kill the child while it's still holding the merge lock, without a good way to determine when that's happened.

  • Thread.kill can't distinguish an already-terminated thread, making it impossible to return [ `Completed | `Aborted ]. That's not a huge problem, but is harder to test.

  • (bikeshedding) For blocking merges, there is no child thread to kill. It's technically possible for that to be a problem when using hooks.

  • (bikeshedding) The Lwt equivalent of Thread.kill is Lwt.cancel, which has the same ergonomics problems and also doesn't always work (threads can be made 'uncancelable').

@craigfe craigfe merged commit 768813f into mirage:master Jun 26, 2020
craigfe added a commit to craigfe/index that referenced this pull request Jun 26, 2020
craigfe added a commit that referenced this pull request Jun 26, 2020
Fix broken rebase during merge of #185
craigfe added a commit to craigfe/opam-repository that referenced this pull request Oct 21, 2020
CHANGES:

## Added

- Added `flush_callback` parameter to the creation of a store, to register
  a callback before a flush. This callback can be temporarily disabled by
  `~no_callback:()` to `flush`. (mirage/index#189, mirage/index#216)

- Added `Stats.merge_durations` to list the duration of the last 10 merges.
  (mirage/index#193)

- Added `is_merging` to detect if a merge is running. (mirage/index#192)

- New `IO.Header.{get,set}` functions to read and write the file headers
  atomically (mirage/index#175, mirage/index#204, @icristescu, @craigfe, @samoht)

- Added a `throttle` configuration option to select the strategy to use
  when the cache are full and an async merge is already in progress. The
  current behavior is the (default) [`Block_writes] strategy. The new
  [`Overcommit_memory] does not block but continue to fill the cache instead.
  (mirage/index#209, @samoht)

- Add `IO.exists` obligation for IO implementations, to be used for lazy
  creation of IO instances. (mirage/index#233, @craigfe)

- `Index.close` now takes an `~immediately:()` argument. When passed, this
  causes `close` to terminate any ongoing asynchronous merge operation, rather
  than waiting for it to finish. (mirage/index#185, mirage/index#234)

- Added `Index.Checks.cli`, which provides offline integrity checking of Index
  stores. (mirage/index#236)

## Changed

- `sync` has to be called by the read-only instance to synchronise with the
  files on disk. (mirage/index#175)

- Caching of `Index` instances is now explicit: `Index.Make` requires a cache
  implementation, and `Index.v` may be passed a cache to be used for instance
  sharing. The default behaviour is _not_ to share instances. (mirage/index#188)

## Fixed

- Added values after a clear are found by read-only instances. (mirage/index#168)
- Fix a race between `merge` and `sync` (mirage/index#203, @samoht, @craigfe)
- Fix a potential loss of data if a crash occurs at the end of a merge (mirage/index#232)
craigfe added a commit to craigfe/opam-repository that referenced this pull request Jan 5, 2021
CHANGES:

## Added

- Added `flush_callback` parameter to the creation of a store, to register
  a callback before a flush. This callback can be temporarily disabled by
  `~no_callback:()` to `flush`. (mirage/index#189, mirage/index#216)

- Added `Stats.merge_durations` to list the duration of the last 10 merges.
  (mirage/index#193)

- Added `is_merging` to detect if a merge is running. (mirage/index#192)

- New `IO.Header.{get,set}` functions to read and write the file headers
  atomically (mirage/index#175, mirage/index#204, @icristescu, @craigfe, @samoht)

- Added a `throttle` configuration option to select the strategy to use
  when the cache are full and an async merge is already in progress. The
  current behavior is the (default) `` `Block_writes`` strategy. The new
  `` `Overcommit_memory`` does not block but continue to fill the cache instead.
  (mirage/index#209, @samoht)

- Add `IO.exists` obligation for IO implementations, to be used for lazy
  creation of IO instances. (mirage/index#233, @craigfe)

- `Index.close` now takes an `~immediately:()` argument. When passed, this
  causes `close` to terminate any ongoing asynchronous merge operation, rather
  than waiting for it to finish. (mirage/index#185, mirage/index#234)

- Added `Index.Checks.cli`, which provides offline integrity checking of Index
  stores. (mirage/index#236)

- `Index.replace` now takes a `~overcommit` argument to postpone a merge. (mirage/index#253)

- `Index.merge` is now part of the public API. (mirage/index#253)

- `Index.try_merge` is now part of the public API. `try_merge' is a no-op if
  the number of entries in the write-ahead log is smaller than `log_size`,
  otherwise it's `merge'. (mirage/index#253 @samoht)

## Changed

- `sync` has to be called by the read-only instance to synchronise with the
  files on disk. (mirage/index#175)
- Caching of `Index` instances is now explicit: `Index.Make` requires a cache
  implementation, and `Index.v` may be passed a cache to be used for instance
  sharing. The default behaviour is _not_ to share instances. (mirage/index#188)

## Fixed

- Added values after a clear are found by read-only instances. (mirage/index#168)
- Fix a race between `merge` and `sync` (mirage/index#203, @samoht, @craigfe)
- Fix a potential loss of data if a crash occurs at the end of a merge (mirage/index#232)
- Fix `Index.iter` to only iterate once over elements persisted on the disk
  (mirage/index#260, @samoht, @icristescu)
kit-ty-kate pushed a commit to craigfe/opam-repository that referenced this pull request Jan 6, 2021
CHANGES:

## Added

- Added `flush_callback` parameter to the creation of a store, to register
  a callback before a flush. This callback can be temporarily disabled by
  `~no_callback:()` to `flush`. (mirage/index#189, mirage/index#216)

- Added `Stats.merge_durations` to list the duration of the last 10 merges.
  (mirage/index#193)

- Added `is_merging` to detect if a merge is running. (mirage/index#192)

- New `IO.Header.{get,set}` functions to read and write the file headers
  atomically (mirage/index#175, mirage/index#204, @icristescu, @craigfe, @samoht)

- Added a `throttle` configuration option to select the strategy to use
  when the cache are full and an async merge is already in progress. The
  current behavior is the (default) `` `Block_writes`` strategy. The new
  `` `Overcommit_memory`` does not block but continue to fill the cache instead.
  (mirage/index#209, @samoht)

- Add `IO.exists` obligation for IO implementations, to be used for lazy
  creation of IO instances. (mirage/index#233, @craigfe)

- `Index.close` now takes an `~immediately:()` argument. When passed, this
  causes `close` to terminate any ongoing asynchronous merge operation, rather
  than waiting for it to finish. (mirage/index#185, mirage/index#234)

- Added `Index.Checks.cli`, which provides offline integrity checking of Index
  stores. (mirage/index#236)

- `Index.replace` now takes a `~overcommit` argument to postpone a merge. (mirage/index#253)

- `Index.merge` is now part of the public API. (mirage/index#253)

- `Index.try_merge` is now part of the public API. `try_merge' is a no-op if
  the number of entries in the write-ahead log is smaller than `log_size`,
  otherwise it's `merge'. (mirage/index#253 @samoht)

## Changed

- `sync` has to be called by the read-only instance to synchronise with the
  files on disk. (mirage/index#175)
- Caching of `Index` instances is now explicit: `Index.Make` requires a cache
  implementation, and `Index.v` may be passed a cache to be used for instance
  sharing. The default behaviour is _not_ to share instances. (mirage/index#188)

## Fixed

- Added values after a clear are found by read-only instances. (mirage/index#168)
- Fix a race between `merge` and `sync` (mirage/index#203, @samoht, @craigfe)
- Fix a potential loss of data if a crash occurs at the end of a merge (mirage/index#232)
- Fix `Index.iter` to only iterate once over elements persisted on the disk
  (mirage/index#260, @samoht, @icristescu)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allow Index.close to abort ongoing merges
3 participants