WIP: content: flush before checkpointing #6260

Open
wants to merge 11 commits into
base: master
from

@chu11 chu11 commented Sep 5, 2024

Problem: Before checkpointing, users need to remember to call content.flush, to ensure data has been flushed to the backing store. It is easy to forget this.

Within the content module, call content.flush before checkpointing.

Fixes #6242
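
To illustrate the ordering problem described above, here is a minimal toy model in Python. It is not flux-core code: `BackingStore`, `ContentCache`, the key `"example-key"`, and the reference `"ref-abc"` are all hypothetical names made up for this sketch. It only assumes that the content module behaves like a write-back cache in front of a backing store, and shows `checkpoint_put` flushing on the caller's behalf.

```python
class BackingStore:
    """Hypothetical stand-in for a content backing store."""

    def __init__(self):
        self.blobs = {}        # ref -> data that has reached the store
        self.checkpoints = {}  # checkpoint key -> root reference


class ContentCache:
    """Toy write-back cache in front of a backing store (assumed model)."""

    def __init__(self, backing):
        self.backing = backing
        self.dirty = {}  # entries not yet written to the backing store

    def store(self, ref, data):
        # New data lands in the cache first and is only "dirty" so far.
        self.dirty[ref] = data

    def flush(self):
        # Write every dirty entry to the backing store.
        self.backing.blobs.update(self.dirty)
        self.dirty.clear()

    def checkpoint_put(self, key, rootref):
        # Flush first, so the checkpoint never points at unflushed data.
        self.flush()
        self.backing.checkpoints[key] = rootref


backing = BackingStore()
cache = ContentCache(backing)
cache.store("ref-abc", b"root object")
# The caller does not call flush(); checkpoint_put() does it internally.
cache.checkpoint_put("example-key", "ref-abc")
```

With the flush folded into `checkpoint_put`, the recorded reference always resolves in the backing store, even if the caller forgets to flush.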


This is the last PR in the chain; it is built on top of #6255 and then #6240, so setting WIP for now.

chu11 added 11 commits December 18, 2024 10:31
Problem: An accidental 'd' was added to remove, making it "removed".

Fix spelling.
Problem: A test in t0028-content-backing-none.t incorrectly
calls checkpoint_put when it should call checkpoint_get.

Fix invalid test.
Problem: The typical message unpack style is to place key names
and storage pointers on the same line, but that is not done in
several locations in the content and content backing modules.

Correct the code style to be consistent with the rest of flux-core.
Problem: A backing store is required for content.flush but it
is not required for content.checkpoint-put.  This is inconsistent
and can lead to checkpointing problems down the line.

Require content.checkpoint-put to only work if there is a backing
store available.  As a consequence, remove code that handled
"cached" checkpoints when a backing store is not available.

Fixes flux-framework#6251
Problem: Now that the content backing store is required for checkpoints,
many tests fail.

Remove tests that previously assumed that checkpointing worked without
a content backing store.  Adjust some tests that now produce a new
error message.
Problem: There is no coverage to ensure that the "none" backing
store behaves identically to the case where a backing store is never loaded.

Add coverage in t0028-content-backing-none.t.
Problem: There is no coverage to ensure FLUX_KVS_SYNC fails when
there is no longer space on disk.

Add coverage to t0090-content-enospc.t.
Problem: There is no coverage to ensure FLUX_KVS_SYNC does not
work if there is no backing store.

Add coverage in t1010-kvs-commit-sync.t.
Problem: When the KVS module is unloaded, a checkpoint of the root
reference is attempted.  However, a content.flush is not done
beforehand.  This could result in an invalid checkpoint reference
as data is not guaranteed to be flushed to the backing store.

Solution: Call content.flush before checkpointing.

Fixes flux-framework#6237
Problem: Before checkpointing, users need to remember to call
content.flush, to ensure data has been flushed to the backing store.
It is easy to forget this.

Within the content module, call content.flush before checkpointing.

Fixes flux-framework#6242
Problem: Calls to content.checkpoint-put (via kvs_checkpoint_commit())
now automatically call content.flush.  Therefore the calls to
content.flush in the kvs are redundant.

Remove calls to content.flush in the kvs module.
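
Taken together, the commits above make a checkpoint require a backing store and make the explicit flush in the KVS redundant. The following is a rough Python sketch of that combined behavior, not the actual implementation: `Content`, `NoBackingStoreError`, `checkpoint_on_unload`, and the key `"example-key"` are hypothetical, and the real modules communicate via RPCs rather than method calls.

```python
class NoBackingStoreError(Exception):
    """Hypothetical error type used only in this sketch."""


class Content:
    """Toy content module; `backing` is a dict, or None if no store is loaded."""

    def __init__(self, backing=None):
        self.backing = backing
        self.dirty = {}
        self.flush_count = 0  # counts flushes, to show none are duplicated

    def store(self, ref, data):
        self.dirty[ref] = data

    def flush(self):
        if self.backing is None:
            raise NoBackingStoreError("flush requires a backing store")
        self.flush_count += 1
        self.backing.update(self.dirty)
        self.dirty.clear()

    def checkpoint_put(self, key, rootref):
        # No "cached" checkpoint fallback: fail without a backing store.
        if self.backing is None:
            raise NoBackingStoreError("checkpoint requires a backing store")
        self.flush()
        self.backing[("checkpoint", key)] = rootref


def checkpoint_on_unload(content, rootref):
    # The caller no longer flushes explicitly; one call is enough.
    content.checkpoint_put("example-key", rootref)


with_store = Content(backing={})
with_store.store("ref-1", b"data")
checkpoint_on_unload(with_store, "ref-1")

without_store = Content()
without_store.store("ref-2", b"data")
try:
    checkpoint_on_unload(without_store, "ref-2")
    failed = False
except NoBackingStoreError:
    failed = True
```

In this model the unload path triggers exactly one flush when a store is present, and the checkpoint fails outright when none is loaded.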
@chu11 chu11 force-pushed the issue6242_content_flush_in_checkpoint branch from 8e9d319 to 647340f Compare December 18, 2024 18:31
codecov bot commented Dec 18, 2024

Codecov Report

Attention: Patch coverage is 77.77778% with 12 lines in your changes missing coverage. Please review.

Project coverage is 83.62%. Comparing base (c9eb3a8) to head (647340f).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/modules/content/checkpoint.c | 77.35% | 12 Missing ⚠️ |

Additional details and impacted files
@@           Coverage Diff           @@
##           master    #6260   +/-   ##
=======================================
  Coverage   83.61%   83.62%           
=======================================
  Files         522      522           
  Lines       87734    87659   -75     
=======================================
- Hits        73356    73301   -55     
+ Misses      14378    14358   -20     

| Files with missing lines | Coverage Δ |
|---|---|
| src/modules/content-files/content-files.c | 73.91% <ø> (ø) |
| src/modules/content-sqlite/content-sqlite.c | 72.26% <ø> (-1.62%) ⬇️ |
| src/modules/content/cache.c | 85.43% <ø> (-0.03%) ⬇️ |
| src/modules/kvs/kvs.c | 72.50% <ø> (+0.09%) ⬆️ |
| src/modules/kvs/kvstxn.c | 79.70% <100.00%> (+0.24%) ⬆️ |
| src/modules/content/checkpoint.c | 73.29% <77.35%> (-5.74%) ⬇️ |

... and 9 files with indirect coverage changes

Successfully merging this pull request may close these issues.

content: call content.flush within content-checkout.put