-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
c-s mixed operation #63
Conversation
pub fn sample(&self) -> T { | ||
self.items[self.dist.sample(&mut rand::thread_rng())].0 | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is thread_rng
intended here? We have ThreadLocalRandom
, maybe it should be used here instead? Not sure, just asking.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We do not depend on any determinism here, so we don't need to use a java_random::Random
.
In addition, rand_distr::WeightedIndex::sample
accepts &mut R
where R: rand::Rng
. We would need to implement rand::RngCore
for java_random::Random
(which is impossible, so we would need to wrap it with yet another type).
I just decided to keep it simple and use the rng provided by rand
crate.
impl OperationRatio { | ||
fn parse_command_weight(s: &str) -> Result<(MixedSubcommand, f64)> { | ||
let (cmd, weight) = { | ||
let mut iter = s.split('=').fuse(); | ||
match (iter.next(), iter.next(), iter.next()) { | ||
(Some(cmd), Some(w), None) => (cmd, w), | ||
_ => anyhow::bail!( | ||
"Command weight specification should match pattern <command>=<f64>" | ||
), | ||
} | ||
}; | ||
|
||
let command = match Command::parse(cmd)? { | ||
Command::Read => MixedSubcommand::Read, | ||
Command::Write => MixedSubcommand::Write, | ||
Command::CounterRead => MixedSubcommand::CounterRead, | ||
Command::CounterWrite => MixedSubcommand::CounterWrite, | ||
_ => anyhow::bail!("Invalid command for mixed workload: {}", cmd), | ||
}; | ||
let weight = weight.parse::<f64>()?; | ||
Ok((command, weight)) | ||
} | ||
|
||
fn do_parse(s: &str) -> Result<Self> { | ||
// Remove wrapping parenthesis. | ||
let arg = { | ||
let mut chars = s.chars(); | ||
anyhow::ensure!( | ||
chars.next() == Some('(') && chars.next_back() == Some(')'), | ||
"Invalid operation ratio specification: {}", | ||
s | ||
); | ||
chars.as_str() | ||
}; | ||
|
||
let mut command_set = HashSet::<MixedSubcommand>::new(); | ||
let weights = arg | ||
.split(',') | ||
.map(|s| -> Result<(MixedSubcommand, f64)> { | ||
let (command, weight) = Self::parse_command_weight(s)?; | ||
anyhow::ensure!( | ||
!command_set.contains(&command), | ||
"{} command has been specified more than once", | ||
command | ||
); | ||
command_set.insert(command); | ||
Ok((command, weight)) | ||
}) | ||
.collect::<Result<Vec<_>, _>>()?; | ||
|
||
Self::new(&weights) | ||
} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Side note: this falls under the "custom syntax" mentioned in #52 (no need to do anything right now).
src/bin/cql-stress-cassandra-stress/java_generate/distribution/enumerated.rs
Outdated
Show resolved
Hide resolved
src/bin/cql-stress-cassandra-stress/java_generate/distribution/enumerated.rs
Outdated
Show resolved
Hide resolved
let mixed_params = settings.command_params.mixed.as_ref().unwrap(); | ||
let read_statement = | ||
prepare_read_statement(regular_table_name, &session, &settings).await?; | ||
let counter_read_statement = | ||
prepare_read_statement(counter_table_name, &session, &settings).await?; | ||
let write_statement = WriteOperationFactory::prepare_statement(&session, &settings).await?; | ||
let counter_write_statement = | ||
CounterWriteOperationFactory::prepare_statement(&session, &settings).await?; | ||
let max_operations = settings.command_params.common.operation_count; | ||
let operation_ratio = Arc::new(mixed_params.operation_ratio.clone()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a fan of the approach... You are exposing some details of the existing operations and use them to implement MixedOperation
.
I think that the operations should be self-contained. MixedOperation
should just encapsulate them and not care about their internals apart from the fact that they implement some trait. Would it be possible to implement it like that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about introducing some trait, but don't we end up with the same performance issue as in #34 (async-trait
boxed futures)?
I haven't compared both of the approaches when it comes to performance, but to me, it looks exactly the same - we end up with per-operation allocation from boxing the futures. I can still do the comparison, if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. AFAIK async
in trait methods is going to be stabilized in Rust 1.75, so maybe we can use async_trait
until that happens.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see. AFAIK
async
in trait methods is going to be stabilized in Rust 1.75, so maybe we can useasync_trait
until that happens.
They indeed got stabilized, but unfortunately such traits are not object-safe, and so cannot be used in trait objects. I think the issue still holds. Do you still think we should use async_trait
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
They indeed got stabilized, but unfortunately such traits are not object-safe
Doesn't the usual trick in the form of putting where Self: Sized
at the end of the method definition work? This makes it impossible to call this method if you have a trait object, but in other contexts you can still call it.
@muzarski let's try to move this one forward, once we will have this option supported we wlll be able to replace some of the currentl longevities that are using cassandra-stress with cql-stress. |
v2:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some nitpicks, after fixing them and rebasing it will LGTM.
self.stats.get_shard_mut().account_operation(ctx, &result); | ||
|
||
if result.is_ok() { | ||
self.current_operation_count = self.current_operation_count.saturating_sub(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a particular reason here for using saturating_sub? We check for self.current_operation_count == 0
earlier in the function, so the current operation count here should not be 0
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's pointless to have saturating_sub here. It remains from the previous version where we would have:
self.current_operation_count = self.clustering_distribution.next_i64() as usize;
instead of
self.current_operation_count =
(self.clustering_distribution.next_i64() as usize).max(1);
In the previous version there was a possibility that self.current_operation_count
is 0 after taking the sample. Now we are guaranteed that it's at least 1 (thanks to usize::max
method).
I'll get rid of the saturating_sub
.
operation_ratio: Arc::clone(&self.operation_ratio), | ||
clustering_distribution: mixed_params.clustering.create(), | ||
current_operation: MixedSubcommand::Read, | ||
current_operation_count: 0, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about current_operation_remaining
? It will be more descriptive IMO.
Implemented `EnumeratedDistribution` which samples the items from the vector with the probability based on the weight assigned to each item.
`mixed` command makes use of `counter_write` command parameters, which is why we expose function that groups prepares parameter grouping for `counter_write` command and make it public.
OperationRatio is an alias for `EnumeratedDistribution<MixedSubcommand>`. In this commit we implement parsing logic for this type and some printing functionalities.
Motivation
Cassandra-stress supports mixed workloads. It allows to run multiple kinds of operations. Allowed operations are:
The parameters supported by
mixed
command are:read
)counter_write
parametersratio()
parameter based on which tool samples an operation to execute. E.g.ratio(read=2, write=1)
means that there will be approximately 2 read operations per 1 write operation.clustering
distribution parameter which tells the tool how many times to run the operation that just got sampled.Changes
settings
module forMixedOperation
MixedOperation