Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

kvs: refactor kvs cache to handle raw data as "primary" data #1246

Merged
merged 13 commits into from
Oct 24, 2017

Conversation

chu11
Copy link
Member

@chu11 chu11 commented Oct 20, 2017

This PR refactors the KVS cache to favor the storage of raw data over json objects. This accomplishes several goals.

  1. before this PR, KVS cache entries had a data "type" associated with it. This can lead to a "race" in which how the data was loaded into the KVS cache the first time and could lead to errors/bugs later. For example, if data was incorrectly loaded into the KVS cache as type json, a later attempt to read the data out of the KVS cache as raw data would fail, leading to an error.

  2. This should clean up the code and hopefully make it less confusing. Many unit tests were removed in this PR, so I feel the less confusing part was accomplished.

The primary idea behind this refactor to remove the "type" system with KVS cache entries and make the KVS cache primarily for storing raw data.

Users can set get/set json objects in the KVS cache, but the API is simply a set of convenience functions converting those json objects to/from their raw string form.

As an aside, I am sometimes anal when it comes to code "ordering". In commit chu11@368f858 I literally just move "json" code below "raw data" code, b/c I want "raw data" code to be listed first as it is now the "primary" way the KVS cache works. I know its more code change than may be necessary.

chu11 added 2 commits October 20, 2017 15:57
Remove cache entry "type" parameter / concept from external facing
API.  It is still used internally for several purposes.

Adjust / remove many cache unit tests as a result.
@chu11
Copy link
Member Author

chu11 commented Oct 20, 2017

One of the builds hung and timed out, restarted it. One hung and timed out in #1244 earlier today and @garlick suggested one did in #1243. I'm concerned there is a racy-hang somewhere in the tests. But it's hard to see where it could be given the output of travis.

@garlick
Copy link
Member

garlick commented Oct 20, 2017

This is a high value cleanup! I will try provide some review commands before monday.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.04%) to 78.538% when pulling 3cc234f on chu11:issue1239 into b31a944 on flux-framework:master.

@codecov-io
Copy link

codecov-io commented Oct 20, 2017

Codecov Report

Merging #1246 into master will decrease coverage by 0.05%.
The diff coverage is 71.26%.

@@            Coverage Diff             @@
##           master    #1246      +/-   ##
==========================================
- Coverage   77.99%   77.93%   -0.06%     
==========================================
  Files         154      154              
  Lines       28964    28902      -62     
==========================================
- Hits        22590    22525      -65     
- Misses       6374     6377       +3
Impacted Files Coverage Δ
src/modules/kvs/lookup.c 82.72% <50%> (-1.46%) ⬇️
src/modules/kvs/kvs.c 62.7% <53.12%> (-1.02%) ⬇️
src/modules/kvs/cache.c 88.83% <83.72%> (-3.54%) ⬇️
src/modules/kvs/commit.c 76.98% <87.5%> (+0.33%) ⬆️
src/common/libkvs/kvs_lookup.c 75.19% <0%> (-1.56%) ⬇️
src/common/libutil/blobref.c 97.22% <0%> (-1.39%) ⬇️
src/common/libkvs/kvs.c 64.11% <0%> (-0.97%) ⬇️
src/broker/overlay.c 73.88% <0%> (-0.32%) ⬇️
... and 9 more

@chu11
Copy link
Member Author

chu11 commented Oct 20, 2017

Just missed having coverage on the diff. Looking through coverage diff, I don't think I can eek out a few more lines. It's all bad error path scenarios.

Copy link
Member

@garlick garlick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I made a few comments inline.

@@ -59,8 +59,9 @@ struct cache_entry {
waitqueue_t *waitlist_valid;
void *data; /* value object/data */
int len;
bool data_valid; /* flag indicating if data set, don't use
* data == NULL as test */
bool entry_valid; /* flag indicating if data set, don't use
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: rename to valid and use same declaration style as dirty (either bool or bitfield is fine with me).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agreed that this should be consistent, however I'll make a new issue and clean up in another PR. Some functions (such as cache_entry_clear_dirty()) return the current setting of the dirty bit as a 1 or 0. This should be a bool, not a 1 or 0. It'll be a bit more cleanup than I'd like in this PR.

else if (hp->type == CACHE_DATA_TYPE_RAW)
free (hp->data);
}
if (hp->o)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

test for null is redundant. both free() and json_decref() are no-ops if argument is NULL.

@@ -84,42 +84,42 @@ struct cache_entry *cache_entry_create (void)
return hp;
}

struct cache_entry *cache_entry_create_json (json_t *o)
struct cache_entry *cache_entry_create_raw (void *data, int len)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would appear this function has no users?

return hp;
}

struct cache_entry *cache_entry_create_raw (void *data, int len)
struct cache_entry *cache_entry_create_json (json_t *o)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would appear this function has only one user in setroot_event_cb().

suggest changing that user to cache_entry_create() + cache_entry_set_json()

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This sounds like a good idea. Was honestly surprised the cache_entry_create_raw() and cache_entry_create_json functions had almost no use.

else if (!hp->data) {
/* attempt to change already valid cache entry,
* cannot, must call cache_entry_clear_data() */
errno = EBADE;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function doesn't set errno in all the -1 return cases (e.g. if args are invalid)

the second test for !hp->data is redundant with the first leg of the conditional.

maybe it would be better to change the hp->entry_valid leg of the outer conditional to simply

if (hp->entry_valid) {
    if (hp->o)
        json_decref (o); // already stored
    else
        hp->o = o;
    assert (hp->len > 0 && hp->data != NULL); // o/w would not be valid json
} else ...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ugh, why did I add the redundant check :-)

let me think about how to handle the logic here, as I seemed to have copied some logic from the set_raw() function, but perhaps it shouldn't have been copied over in that way.

json_decref (o);
else if (!hp->data) {
/* attempt to change already valid cache entry,
* cannot, must call cache_entry_clear_data() */
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache_entry_clear_data() doesn't appear to have any users

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah created this function as a "just in case", knowing there were no real use cases at the time. Perhaps would be wise to remove. If anything, users can cache_remove_entry then recreate if they are in dire straits to put new data in the cache entry.

@chu11
Copy link
Member Author

chu11 commented Oct 23, 2017

just re-pushed with some fixes. Removed cache_entry_clear_data(), cache_entry_create_raw(), and cache_entry_create_json().

Changed the logic in cache_entry_set_json() a bit to hopefully make more sense.

Removed unnecessary checks for NULL in cache_entry_destroy()

Will do the cleanup regarding consistent dirty bit and valid bit in a separate PR (#1247).

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.08%) to 78.494% when pulling c1f8414 on chu11:issue1239 into b31a944 on flux-framework:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.07%) to 78.507% when pulling 67ca6a9 on chu11:issue1239 into b31a944 on flux-framework:master.

@chu11
Copy link
Member Author

chu11 commented Oct 24, 2017

coverage diff fell from about 77% to about 71%, I believe mostly due to high coverage functions that were removed.

@garlick
Copy link
Member

garlick commented Oct 24, 2017

IF you're feeling OK with this, I'd say squash it down and we can merge it.

chu11 added 11 commits October 23, 2017 17:39
Internally refactor KVS cache code to store json object and
raw data memory in separate variables instead of one.
Make 'raw' KVS cache functions the primary function over 'json'
functions by moving them to be the first function listed over the
'json' equivalent.  There is no functional change in this commit,
only the movement and re-ordering of code.
Refactor cache API to make all cache entries store raw data instead
of raw or json data (but not both).  All json functions are now convenience
functions operating under the assumption of raw data underneath.  For example,
cache_entry_create_json() and cache_entry_set_json() are convenience
functions that take a json object and extract the raw string out of
it.  The raw string is now the primary data of the cache entry.  The
cache_entry_get_json() function is a convenience function that returns
the json object equivalent of the raw data string stored within a
cache entry.

To avoid regularly encoding/decoding raw data into/from json objects,
a json object is cached in the cache entry.

Internally, the cache_data_type_t is no longer needed and has been
removed.

Update and add unit tests appropriately.
Update internal lookup API for KVS cache changes.

Adjust callback lookup_ref_f to not pass back raw_data flag, as it
is no longer relevant.  Adjust several log error messages to make more
sense given changes.

Update tests appropriately.  Most notably, valref's that previously
had a (at the time invalid) dirref reference within it, will now pass.
Valref blobrefs can now to point to anything.  If user wishes to
point to a random treeobj, they can.
Remove cache type usage in internal commit API.  Take advantage
of fact all cache entries have raw data.

Adjust unit tests appropriately for loss of cache types.
Handle the fact that KVS cache now automatically converts raw data
to json and json to raw data, so that the conversion no longer
needs to be done at this level.

Also adjust to removal of cache entry types.

Fixes flux-framework#1239
Most notably, valrefs that previously had a (at the time invalid)
dirref reference within it, will now pass.  Valref blobrefs can
now point to anything.
Function was unused except in tests, so remove it.

Remove / adjust unit tests appropriately.
Remove cache_entry_create_raw(), which was unused except in tests.

Remove cache_entry_create_json(), which was unused except for only
1 location.

Replace calls to these functions with cache_entry_create() and
call to either cache_entry_set_raw() or cache_entry_set_json().

Adjust unit tests appropriately.
In cache_entry_set_raw() and cache_entry_set_json(), set errno = EINVAL
when input is invalid.

Update unit tests appropriately.
@chu11
Copy link
Member Author

chu11 commented Oct 24, 2017

squashed and re-pushed

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.03%) to 78.546% when pulling b8b8c87 on chu11:issue1239 into b31a944 on flux-framework:master.

@garlick garlick merged commit 0dfdcfc into flux-framework:master Oct 24, 2017
@grondo grondo mentioned this pull request May 10, 2018
@chu11 chu11 deleted the issue1239 branch June 5, 2021 17:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants