Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a config option to prefix keys in Redis stores #981

Merged
merged 1 commit into from
Jun 7, 2024

Conversation

caass
Copy link
Contributor

@caass caass commented Jun 7, 2024

Description

Add a configuration option that allows for setting an optional prefix to all the keys used with a Redis store. I added some documentation to the RedisStore configuration struct, but I'm not sure if there's docs I should update elsewhere.

Fixes #977

Type of change

Please delete options that aren't relevant.

  • New feature (non-breaking change which adds functionality)
  • This change requires a documentation update

How Has This Been Tested?

I added duplicates of existing tests of the Redis store to use prefixes

Checklist

  • Updated documentation if needed
  • Tests added/amended
  • bazel test //... passes locally
  • PR is contained in a single commit, using git amend see some docs

This change is Reviewable

Copy link
Contributor Author

@caass caass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+@allada

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, and 3 discussions need to be resolved (waiting on @allada)


nativelink-config/src/stores.rs line 889 at r1 (raw file):

    ///
    /// Default: `""`
    pub key_prefix: Option<String>,

FYI I thought about making this just a String with #[serde(default)], but later in the implementation it wound up adding an allocation in cases where there wasn't a prefix.


nativelink-store/src/redis_store.rs line 42 at r1 (raw file):

/// e.g. attach a prefix to every key. Instead, you can use [`RedisStore::key`] to create a
/// `RedisStoreKey`.
struct RedisStoreKey<'a, 's> {

I decided to write a custom newtype here because I felt that it was an appropriate level of abstraction.

In one extreme, we just manually call what became the implementation of ToRedisArgs every time we pass a key to Redis. I felt that was unmaintainable -- what if we add other configuration options later that also impact how keys are encoded? It would be pretty nightmarish to manually ensure every time we talked to Redis we ran the logic for encoding the keys properly.

On the other hand, we could define a whole layer on top of the Redis library that enforces at compile time that we only use a processed key (i.e. RedisStoreKey) instead of StoreKey.as_str().as_ref(), but that felt pretty brittle in the other direction -- what if we add more functionality to the StoreDriver trait? Then we'd need to update our intermediate layer as well.

Here, we do define a newtype to use, and describe with doc comments how it's supposed to be used. The way to construct a RedisStoreKey is to use RedisStore::key (or the RedisStoreKey { prefix: Some("prefix"), body: Cow::Borrowed("body") } syntax, which we could prevent by moving this into a child module...), and since it implements ToRedisArgs you can pass that directly to calls to Redis.


nativelink-store/src/redis_store.rs line 52 at r1 (raw file):

        W: ?Sized + redis::RedisWrite,
    {
        let body = self.body.as_bytes();

In the Cow::Owned case, this doesn't actually call String::as_bytes -- is calls String::as_ref().as_bytes, which delegates to str::as_bytes. Ultimately, these are all zero-cost abstractions that come out in the wash, but I thought it was worth mentioning.


nativelink-store/src/redis_store.rs line 52 at r1 (raw file):

        W: ?Sized + redis::RedisWrite,
    {
        let body = self.body.as_bytes();

Nit: This doesn't need to be its own declaration, we can inline it.


nativelink-store/src/redis_store.rs line 131 at r1 (raw file):

            lazy_conn,
            || uuid::Uuid::new_v4().to_string(),
            config.key_prefix.clone(),

The alternative to cloning here is to make RedisStore dependent on the lifetime of Config, which technically should be possible (as a config is likely to live for the rest of the program), but since this isn't in the hot path and only happens when a RedisStore is constructed I figure this is better for ergonomics.


nativelink-store/src/redis_store.rs line 160 at r1 (raw file):

    /// Wrap a [`StoreKey`] in a [`RedisStoreKey`] which can be sent to Redis.
    fn key<'a, 's, 'k: 's>(&'a self, key: &'k StoreKey<'s>) -> RedisStoreKey<'a, 's> {

Holy lifetimes...these all need to be explicit so the compiler can work out what's happening, which is that:

  1. The key prefix is guaranteed to be valid for the lifetime of the borrow of self (the RedisStore)
  2. The key body is guaranteed to be valid for the lifetime of the StoreKey (since it's referencing the str slice inside it)
  3. The borrow of the StoreKey is guaranteed to be valid for at least the lifetime of the StoreKey (by the rules of the borrow checker -- since if the lifetime of StoreKey could be shorter than its own borrow, we'd get unsoundness).

nativelink-store/src/redis_store.rs line 160 at r1 (raw file):

    /// Wrap a [`StoreKey`] in a [`RedisStoreKey`] which can be sent to Redis.
    fn key<'a, 's, 'k: 's>(&'a self, key: &'k StoreKey<'s>) -> RedisStoreKey<'a, 's> {

Maybe this method should be named something else? encode_key?

@caass caass force-pushed the caass/redis-prefix branch from 71f8260 to 9aa5782 Compare June 7, 2024 02:27
Copy link
Contributor Author

@caass caass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 LGTMs obtained, and pending CI: Analyze (javascript-typescript), Analyze (python), Bazel Dev / ubuntu-22.04, Cargo Dev / macos-13, Cargo Dev / ubuntu-22.04, Installation / macos-13, Installation / macos-14, Installation / ubuntu-22.04, Local / ubuntu-22.04, Publish image, Publish nativelink-worker-lre-cc, Remote / large-ubuntu-22.04, asan / ubuntu-22.04, docker-compose-compiles-nativelink (20.04), docker-compose-compiles-nativelink (22.04), integration-tests (20.04), integration-tests (22.04), macos-13, pre-commit-checks, ubuntu-20.04 / stable, ubuntu-22.04, ubuntu-22.04 / stable, vale, windows-2022 / stable, and 2 discussions need to be resolved (waiting on @allada)


nativelink-store/src/redis_store.rs line 52 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

Nit: This doesn't need to be its own declaration, we can inline it.

Done

Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good :-)

Reviewed 1 of 3 files at r1, all commit messages.
Reviewable status: 0 of 1 LGTMs obtained, and 5 discussions need to be resolved


nativelink-config/src/stores.rs line 889 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

FYI I thought about making this just a String with #[serde(default)], but later in the implementation it wound up adding an allocation in cases where there wasn't a prefix.

It shouldn't. Empty strings don't have any data on the heap.

See: https://doc.rust-lang.org/std/string/struct.String.html#method.new


nativelink-config/src/stores.rs line 888 at r2 (raw file):

    /// organize your data according to the shared prefix.
    ///
    /// Default: `""`

nit: Default: (Empty String / No prefix)


nativelink-store/src/redis_store.rs line 42 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

I decided to write a custom newtype here because I felt that it was an appropriate level of abstraction.

In one extreme, we just manually call what became the implementation of ToRedisArgs every time we pass a key to Redis. I felt that was unmaintainable -- what if we add other configuration options later that also impact how keys are encoded? It would be pretty nightmarish to manually ensure every time we talked to Redis we ran the logic for encoding the keys properly.

On the other hand, we could define a whole layer on top of the Redis library that enforces at compile time that we only use a processed key (i.e. RedisStoreKey) instead of StoreKey.as_str().as_ref(), but that felt pretty brittle in the other direction -- what if we add more functionality to the StoreDriver trait? Then we'd need to update our intermediate layer as well.

Here, we do define a newtype to use, and describe with doc comments how it's supposed to be used. The way to construct a RedisStoreKey is to use RedisStore::key (or the RedisStoreKey { prefix: Some("prefix"), body: Cow::Borrowed("body") } syntax, which we could prevent by moving this into a child module...), and since it implements ToRedisArgs you can pass that directly to calls to Redis.

To be honest, I think this is one step too far for readability. This is adding enough complexity that we should just allocate a new string for every key if that's what it requires. I mean, we are already doing an allocation in BytesMut, so I don't see a reason to do a string allocation instead.


nativelink-store/src/redis_store.rs line 131 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

The alternative to cloning here is to make RedisStore dependent on the lifetime of Config, which technically should be possible (as a config is likely to live for the rest of the program), but since this isn't in the hot path and only happens when a RedisStore is constructed I figure this is better for ergonomics.

No, it's a one time cost. It's fine to copy.


nativelink-store/src/redis_store.rs line 160 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

Holy lifetimes...these all need to be explicit so the compiler can work out what's happening, which is that:

  1. The key prefix is guaranteed to be valid for the lifetime of the borrow of self (the RedisStore)
  2. The key body is guaranteed to be valid for the lifetime of the StoreKey (since it's referencing the str slice inside it)
  3. The borrow of the StoreKey is guaranteed to be valid for at least the lifetime of the StoreKey (by the rules of the borrow checker -- since if the lifetime of StoreKey could be shorter than its own borrow, we'd get unsoundness).

Use key: StoreKey<'_>

StoreKey only owns data if it is required. The caller would use .borrow() if they don't want to pass ownership of the heap.


nativelink-store/src/redis_store.rs line 92 at r2 (raw file):

    ///
    /// See [`RedisStore::key_prefix`](`nativelink_config::stores::RedisStore::key_prefix`).
    key_prefix: Option<String>,

nit: Use String w/o Option.


nativelink-store/tests/redis_store_test.rs line 300 at r2 (raw file):

#[nativelink_test]
async fn test_uploading_large_data_with_prefix() -> Result<(), Error> {

nit: This test is a bit too much. We are testing the important stuff with the above tests.

Copy link
Contributor Author

@caass caass left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@allada Ok, I'll push some changes and re-request review.

Reviewable status: 0 of 1 LGTMs obtained, and 2 discussions need to be resolved (waiting on @allada)


nativelink-config/src/stores.rs line 889 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

It shouldn't. Empty strings don't have any data on the heap.

See: https://doc.rust-lang.org/std/string/struct.String.html#method.new

I misspoke, what I meant is that later in RedisStoreKey::write_redis_args we would basically try to copy an empty buffer (i.e. String.as_bytes()), which could be avoided if we knew up from the string was empty.


nativelink-store/src/redis_store.rs line 42 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

To be honest, I think this is one step too far for readability. This is adding enough complexity that we should just allocate a new string for every key if that's what it requires. I mean, we are already doing an allocation in BytesMut, so I don't see a reason to do a string allocation instead.

Got it 👍 I think also an easier way to read this would be to do like if string.is_empty() { /* just encode the key body */ } else { encode the prefix }.


nativelink-store/src/redis_store.rs line 131 at r1 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

No, it's a one time cost. It's fine to copy.

Yea that's what I figured 👍


nativelink-store/tests/redis_store_test.rs line 300 at r2 (raw file):

Previously, allada (Nathan (Blaise) Bruer) wrote…

nit: This test is a bit too much. We are testing the important stuff with the above tests.

👍

Per TraceMachina#977, we've added a configuration option that allows for setting an
optional prefix to all the keys used with a Redis store.
@caass caass force-pushed the caass/redis-prefix branch from 9aa5782 to f06dc85 Compare June 7, 2024 14:26
Copy link
Member

@allada allada left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 3 of 3 files at r3, all commit messages.
Reviewable status: 1 of 1 LGTMs obtained, and pending CI: pre-commit-checks


nativelink-store/src/redis_store.rs line 52 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

In the Cow::Owned case, this doesn't actually call String::as_bytes -- is calls String::as_ref().as_bytes, which delegates to str::as_bytes. Ultimately, these are all zero-cost abstractions that come out in the wash, but I thought it was worth mentioning.

Yeah, maybe there's a little more stack used, but it should hit L1 cache each time (if the optimizer doesn't optimize it away).


nativelink-store/src/redis_store.rs line 160 at r1 (raw file):

Previously, caass (Cass Fridkin) wrote…

Maybe this method should be named something else? encode_key?

I try to be more explicit, like: redis_args_for_key. But not strict on this here.

@caass caass enabled auto-merge (squash) June 7, 2024 15:57
@caass caass merged commit b7a7e36 into TraceMachina:main Jun 7, 2024
28 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Redis store should support prefixing
2 participants