c-s mixed operation #63

muzarski · 2023-11-28T13:28:43Z

Motivation

Cassandra-stress supports mixed workloads, which run several kinds of operations within a single run. The allowed operations are:

read
write
counter_read
counter_write

The parameters supported by the mixed command are:

all of the common parameters (e.g. used by read)
all of the counter_write parameters
the ratio() parameter, which the tool uses to sample the operation to execute. E.g. ratio(read=2, write=1) means that there will be approximately 2 read operations per 1 write operation.
the clustering distribution parameter, which tells the tool how many times to run the operation that was just sampled.
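
To make the interplay of ratio() and clustering concrete, here is a minimal std-only sketch. It is not the code from this PR: the `Op` enum and the `pick` and `schedule` helpers are hypothetical names, and the random draws are passed in explicitly so the behaviour is easy to follow.

```rust
/// Hypothetical stand-in for the operations a mixed workload can run.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Read,
    Write,
}

/// Picks an operation with probability proportional to its weight.
/// `u` is assumed to be a uniform draw from [0, 1).
fn pick(weights: &[(Op, f64)], u: f64) -> Op {
    let total: f64 = weights.iter().map(|(_, w)| w).sum();
    let mut threshold = u * total;
    for &(op, w) in weights {
        if threshold < w {
            return op;
        }
        threshold -= w;
    }
    // Only reachable through floating-point rounding; fall back to the last item.
    weights.last().expect("at least one weight is required").0
}

/// Expands a list of (uniform draw, cluster size) pairs into the sequence
/// of operations that would be executed. Each sampled operation is repeated
/// `cluster` times; a size of 0 is bumped to 1 (an assumption of this sketch).
fn schedule(weights: &[(Op, f64)], draws: &[(f64, usize)]) -> Vec<Op> {
    let mut out = Vec::new();
    for &(u, cluster) in draws {
        let op = pick(weights, u);
        out.extend(std::iter::repeat(op).take(cluster.max(1)));
    }
    out
}
```

With weights `[(Op::Read, 2.0), (Op::Write, 1.0)]`, draws below 2/3 select a read and the rest select a write, which is where the "approximately 2 reads per 1 write" comes from.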

Changes

Prepared settings module for MixedOperation
Implemented MixedOperation

piodul · 2023-12-04T09:39:49Z

src/bin/cql-stress-cassandra-stress/java_generate/distribution/enumerated.rs

+    pub fn sample(&self) -> T {
+        self.items[self.dist.sample(&mut rand::thread_rng())].0
+    }


Is thread_rng intended here? We have ThreadLocalRandom, maybe it should be used here instead? Not sure, just asking.

We do not depend on any determinism here, so we don't need to use a java_random::Random.

In addition, rand_distr::WeightedIndex::sample accepts &mut R where R: rand::Rng. We would need to implement rand::RngCore for java_random::Random (which is impossible, so we would need to wrap it with yet another type).

I just decided to keep it simple and use the rng provided by rand crate.

piodul · 2023-12-04T11:16:29Z

src/bin/cql-stress-cassandra-stress/settings/command/mixed.rs

+impl OperationRatio {
+    fn parse_command_weight(s: &str) -> Result<(MixedSubcommand, f64)> {
+        let (cmd, weight) = {
+            let mut iter = s.split('=').fuse();
+            match (iter.next(), iter.next(), iter.next()) {
+                (Some(cmd), Some(w), None) => (cmd, w),
+                _ => anyhow::bail!(
+                    "Command weight specification should match pattern <command>=<f64>"
+                ),
+            }
+        };
+
+        let command = match Command::parse(cmd)? {
+            Command::Read => MixedSubcommand::Read,
+            Command::Write => MixedSubcommand::Write,
+            Command::CounterRead => MixedSubcommand::CounterRead,
+            Command::CounterWrite => MixedSubcommand::CounterWrite,
+            _ => anyhow::bail!("Invalid command for mixed workload: {}", cmd),
+        };
+        let weight = weight.parse::<f64>()?;
+        Ok((command, weight))
+    }
+
+    fn do_parse(s: &str) -> Result<Self> {
+        // Remove wrapping parenthesis.
+        let arg = {
+            let mut chars = s.chars();
+            anyhow::ensure!(
+                chars.next() == Some('(') && chars.next_back() == Some(')'),
+                "Invalid operation ratio specification: {}",
+                s
+            );
+            chars.as_str()
+        };
+
+        let mut command_set = HashSet::<MixedSubcommand>::new();
+        let weights = arg
+            .split(',')
+            .map(|s| -> Result<(MixedSubcommand, f64)> {
+                let (command, weight) = Self::parse_command_weight(s)?;
+                anyhow::ensure!(
+                    !command_set.contains(&command),
+                    "{} command has been specified more than once",
+                    command
+                );
+                command_set.insert(command);
+                Ok((command, weight))
+            })
+            .collect::<Result<Vec<_>, _>>()?;
+
+        Self::new(&weights)
+    }
+}


Side note: this falls under the "custom syntax" mentioned in #52 (no need to do anything right now).
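
For readers who want to try the grammar in isolation, the following is a rough std-only sketch of the same parsing rules. It is an approximation, not the PR's code: `parse_ratio` is a hypothetical name, errors are plain `String`s instead of `anyhow` errors, and the accepted command names are hard-coded here, whereas the real code delegates to `Command::parse` (which may accept other spellings).

```rust
use std::collections::HashSet;

/// Parses a specification such as "(read=2,write=1)" into (command, weight) pairs.
fn parse_ratio(s: &str) -> Result<Vec<(String, f64)>, String> {
    // Remove the wrapping parentheses.
    let inner = s
        .strip_prefix('(')
        .and_then(|rest| rest.strip_suffix(')'))
        .ok_or_else(|| format!("Invalid operation ratio specification: {s}"))?;

    let mut seen = HashSet::new();
    let mut weights = Vec::new();
    for item in inner.split(',') {
        // Exactly one '=' is allowed per item: <command>=<f64>.
        let (cmd, weight) = item
            .split_once('=')
            .filter(|(_, w)| !w.contains('='))
            .ok_or_else(|| format!("expected <command>=<f64>, got: {item}"))?;

        // Assumption: only these four spellings are valid in a mixed workload.
        if !matches!(cmd, "read" | "write" | "counter_read" | "counter_write") {
            return Err(format!("Invalid command for mixed workload: {cmd}"));
        }
        // `insert` returns false when the command was already present.
        if !seen.insert(cmd) {
            return Err(format!("{cmd} command has been specified more than once"));
        }
        let weight: f64 = weight
            .parse()
            .map_err(|e| format!("invalid weight {weight}: {e}"))?;
        weights.push((cmd.to_string(), weight));
    }
    Ok(weights)
}
```

Note that, like the code under review, this sketch does not trim whitespace, so `(read=2, write=1)` with a space after the comma would be rejected.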

piodul · 2023-12-04T11:39:42Z

src/bin/cql-stress-cassandra-stress/operation/mixed.rs

+        let mixed_params = settings.command_params.mixed.as_ref().unwrap();
+        let read_statement =
+            prepare_read_statement(regular_table_name, &session, &settings).await?;
+        let counter_read_statement =
+            prepare_read_statement(counter_table_name, &session, &settings).await?;
+        let write_statement = WriteOperationFactory::prepare_statement(&session, &settings).await?;
+        let counter_write_statement =
+            CounterWriteOperationFactory::prepare_statement(&session, &settings).await?;
+        let max_operations = settings.command_params.common.operation_count;
+        let operation_ratio = Arc::new(mixed_params.operation_ratio.clone());


I'm not a fan of this approach... You are exposing some details of the existing operations and using them to implement MixedOperation.

I think that the operations should be self-contained. MixedOperation should just encapsulate them and not care about their internals apart from the fact that they implement some trait. Would it be possible to implement it like that?

I thought about introducing some trait, but don't we end up with the same performance issue as in #34 (async-trait boxed futures)?

I haven't compared both of the approaches when it comes to performance, but to me, it looks exactly the same - we end up with per-operation allocation from boxing the futures. I can still do the comparison, if necessary.

I see. AFAIK async in trait methods is going to be stabilized in Rust 1.75, so maybe we can use async_trait until that happens.

> I see. AFAIK async in trait methods is going to be stabilized in Rust 1.75, so maybe we can use async_trait until that happens.

They indeed got stabilized, but unfortunately such traits are not object-safe, and so cannot be used in trait objects. I think the issue still holds. Do you still think we should use async_trait?

> They indeed got stabilized, but unfortunately such traits are not object-safe

Doesn't the usual trick in the form of putting where Self: Sized at the end of the method definition work? This makes it impossible to call this method if you have a trait object, but in other contexts you can still call it.
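
The `where Self: Sized` trick mentioned above can be sketched as follows. This is an illustration only, assuming Rust 1.75 or newer: the `Operation` trait, `ReadOp`, `describe` and the tiny `block_on` helper are hypothetical and are not taken from the PR or from any async runtime.

```rust
use std::future::Future;
use std::pin::pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical operation trait. The async method is excluded from
/// trait objects via `where Self: Sized`, so `dyn Operation` stays usable.
trait Operation {
    fn name(&self) -> &'static str;

    async fn execute(&mut self) -> bool
    where
        Self: Sized;
}

struct ReadOp {
    executed: u64,
}

impl Operation for ReadOp {
    fn name(&self) -> &'static str {
        "read"
    }

    async fn execute(&mut self) -> bool {
        self.executed += 1;
        true
    }
}

/// Works through a trait object, but only the non-async method can be called here.
fn describe(op: &dyn Operation) -> &'static str {
    op.name()
}

/// Waker that does nothing; enough for futures that are ready on first poll.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Minimal executor for this sketch only; it busy-polls until the future completes.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(value) = fut.as_mut().poll(&mut cx) {
            return value;
        }
    }
}
```

The limitation raised in the thread still applies: `execute` cannot be called on a `&mut dyn Operation`, so a wrapper that wants to run the sub-operations has to hold concrete types (for example via generics or an enum) rather than boxed trait objects.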

roydahan · 2024-02-12T13:43:06Z

@muzarski let's try to move this one forward. Once this option is supported, we will be able to replace some of the current longevities that use cassandra-stress with cql-stress.

muzarski · 2024-02-22T16:03:56Z

v2:

addressed review comments
rebased on top of "c-s operation: Extract common operation logic" #68. It may be useful to review that PR to see how its changes interact with the mixed command.

piodul

Left some nitpicks. Once they are fixed and the branch is rebased, it will LGTM.

piodul · 2024-02-26T07:03:23Z

src/bin/cql-stress-cassandra-stress/operation/mixed.rs

+        self.stats.get_shard_mut().account_operation(ctx, &result);
+
+        if result.is_ok() {
+            self.current_operation_count = self.current_operation_count.saturating_sub(1);


Is there a particular reason here for using saturating_sub? We check for self.current_operation_count == 0 earlier in the function, so the current operation count here should not be 0, right?

Yes, it's pointless to have saturating_sub here. It remains from the previous version where we would have:

self.current_operation_count = self.clustering_distribution.next_i64() as usize;

instead of

self.current_operation_count = (self.clustering_distribution.next_i64() as usize).max(1);

In the previous version there was a possibility that self.current_operation_count is 0 after taking the sample. Now we are guaranteed that it's at least 1 (thanks to usize::max method).

I'll get rid of the saturating_sub.
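
The invariant described above can be illustrated with a small sketch. The `Cluster` type and its `step` method are hypothetical and only model the counter, not the actual MixedOperation; the sample is passed in as an argument instead of coming from a distribution.

```rust
/// Tracks how many more times the currently selected operation should run.
struct Cluster {
    remaining: usize,
}

impl Cluster {
    /// Models one execution attempt. `next_sample` stands in for
    /// `clustering_distribution.next_i64()` and is only used when a new
    /// cluster starts. Returns true if a new cluster was started.
    fn step(&mut self, next_sample: i64, succeeded: bool) -> bool {
        let resampled = self.remaining == 0;
        if resampled {
            // `.max(1)` guarantees the counter is at least 1 from here on,
            // so the plain subtraction below cannot underflow.
            self.remaining = (next_sample as usize).max(1);
        }
        if succeeded {
            self.remaining -= 1;
        }
        resampled
    }
}
```

Because the counter is either non-zero on entry or reset to at least 1, `saturating_sub` adds nothing over a plain `-= 1` in this flow.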

piodul · 2024-02-26T07:04:20Z

src/bin/cql-stress-cassandra-stress/operation/mixed.rs

+            operation_ratio: Arc::clone(&self.operation_ratio),
+            clustering_distribution: mixed_params.clustering.create(),
+            current_operation: MixedSubcommand::Read,
+            current_operation_count: 0,


How about current_operation_remaining? It will be more descriptive IMO.

muzarski requested a review from piodul November 28, 2023 13:28

muzarski self-assigned this Dec 1, 2023

piodul requested changes Dec 4, 2023

muzarski mentioned this pull request Feb 21, 2024

c-s operation: Extract common operation logic #68

Merged

muzarski force-pushed the c-s-mixed branch from a9d7866 to 013fc33 Compare February 21, 2024 22:55

piodul reviewed Feb 26, 2024

muzarski added 7 commits February 26, 2024 16:32

rng: introduce enumerated distribution

7193534

Implemented `EnumeratedDistribution`, which samples items from a vector with a probability based on the weight assigned to each item.

counter: expose counter parameters grouping

f596b37

The `mixed` command makes use of the `counter_write` command parameters, which is why we make public the function that prepares the parameter grouping for the `counter_write` command.

mixed: parsing logic for OperationRatio

1625813

OperationRatio is an alias for `EnumeratedDistribution<MixedSubcommand>`. In this commit we implement the parsing logic for this type and some printing functionality.

settings: add mixed command params

f4a6ffb

c-s operation: implement MixedOperation

ad909c6

c-s main: add mixed operation

b0f9ac5

c-s args: add test cases for mixed command

fa3dc28

muzarski force-pushed the c-s-mixed branch from 013fc33 to fa3dc28 Compare February 26, 2024 15:35

piodul approved these changes Feb 27, 2024

piodul merged commit d343e28 into scylladb:master Feb 27, 2024
1 check passed
