Fix driver defaults - cluster & exec profiles #332

Open
wants to merge 18 commits into
base: master
from

Conversation

wprzytula
Collaborator

@wprzytula wprzytula commented Jun 29, 2025

Stacked on: #330
Start review from 62b1a05 - exec_profile: style & typo fixes.

Generated with GPT-4o and manually (heavily) edited

Problem Overview

The Rust wrapper diverges from the CPP Driver in several areas of configuration, including:

  1. default settings,
  2. execution profile behavior, and
  3. load balancing policies.

These inconsistencies can lead to confusion for users migrating from the CPP Driver or relying on its documented behavior. Additionally, certain code patterns in the Rust wrapper could benefit from refactoring for improved clarity and maintainability.

Solution

All changes are made to align the Scylla Rust wrapper with the CPP Driver's behavior while improving code organization and defensive programming practices.

The commits do the following:

  1. Paste CPP Driver defaults in a comment:

    • Adds comments listing the CPP Driver's default settings and configuration code to ensure no defaults are accidentally omitted in the Rust wrapper.
  2. Style and typo fixes:

    • Fixes minor style and typographical issues in exec_profile.rs.
  3. Use MonotonicTimestampGenerator by default:

    • Ensures the default timestamp generator matches the CPP Driver's behavior.
  4. Enable TCP keepalive by default:

    • Aligns with the CPP Driver, but sets the keepalive interval to 2 seconds instead of 0 seconds due to libuv's rejection of 0-second configurations (and Rust driver's complaints that 1 sec is too short, whatever).
  5. Refactor build_session_builder as a method:

    • Moves build_session_builder to be a method of CassCluster, as it makes perfect sense there.
  6. Set default policies explicitly:

    • Explicitly sets default retry and speculative execution policies in the default execution profile builder to prevent future changes from breaking functionality.
  7. Describe load balancing policy divergence:

    • Adds comments explaining the divergence in load balancing policy behaviour between the CPP Driver and the Rust wrapper. The Rust wrapper defaults to RoundRobin, requiring explicit local DC configuration for DC-aware policies.
  8. Use cluster defaults for unset settings in execution profiles:

    • Modifies CassExecProfile to use the cluster's default profile for settings not explicitly set in the execution profile, ensuring consistency with the CPP Driver.

Discussion Needed

Point 7: Should we reconsider the default load balancing policy to match the CPP Driver's behavior (DCAware with automatic local DC initialization)? While the current approach avoids potential pitfalls, it diverges from the CPP Driver's default.

Summary

These changes improve the correctness, maintainability, and alignment of the Scylla Rust wrapper with the CPP Driver's behavior. They also enhance code clarity and defensive programming practices.

Testing

Integration tests for consistency and serial consistency settings are planned as a follow-up. Unit tests for the introduced changes are TODO.

Pre-review Checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • PR description sums up the changes and reasons why they should be introduced.
  • I have implemented Rust unit tests for the features/changes introduced.
  • [ ] I have enabled appropriate tests in Makefile in {SCYLLA,CASSANDRA}_(NO_VALGRIND_)TEST_FILTER.
  • [ ] I added appropriate Fixes: annotations to PR description.

@wprzytula wprzytula self-assigned this Jun 29, 2025
@wprzytula wprzytula added bug Something isn't working P1 P1 priority item - very important labels Jun 29, 2025
@wprzytula wprzytula added this to the 0.6 milestone Jun 29, 2025
@wprzytula wprzytula force-pushed the fix-driver-defaults branch 3 times, most recently from 68d51d0 to 3880377 Compare June 30, 2025 12:38
wprzytula added 16 commits June 30, 2025 18:45
CPP Driver allows setting serial consistency to `ANY`, which results in
not setting serial consistency at all. Also, `CASS_CONSISTENCY_UNKNOWN`
can be used on statement or batch to indicate that serial consistency is
not set, which means "use whatever the execution profile/cluster
specifies". Let's ensure consistent behaviour in CPP-Rust Driver.

Unfortunately, the Rust Driver does not support "unsetting" serial
consistency from a statement at the moment. For now, we will throw
an error upon `CASS_CONSISTENCY_UNKNOWN` provided, in order to warn
the user about this limitation.
CPP Driver allows setting `CASS_CONSISTENCY_UNKNOWN` on statement
or batch to indicate that consistency is not set, which means
"use whatever the execution profile/cluster specifies". Let's
ensure consistent behaviour in CPP-Rust Driver.

Unfortunately, the Rust Driver does not support "unsetting" consistency
from a statement at the moment. For now, we will throw an error upon
`CASS_CONSISTENCY_UNKNOWN` provided, in order to warn the user about
this limitation.
Similarly to the previous commit, this change ensures that the
serial consistency is set to `None` when the value is
`CASS_CONSISTENCY_ANY`. This aligns with the behavior of the CPP Driver.

Note that I previously mistreated the `CASS_CONSISTENCY_UNKNOWN` value
as an "any" value. This is incorrect: `CASS_CONSISTENCY_UNKNOWN` denotes
"ignore me", which means that if for example a statement has
`CASS_CONSISTENCY_UNKNOWN` set, then the execution profile should be
considered instead. To compare, `CASS_CONSISTENCY_ANY` set on a
statement overrides the execution profile.

Actually, if a CPP execution profile specifies
`CASS_CONSISTENCY_UNKNOWN` for its setting, then this setting is taken
from the cluster = from the default execution profile (which forbids
`CASS_CONSISTENCY_UNKNOWN`, which is ensured in the next commit).
I'm going to introduce the described mechanism in a follow-up PR.
For now, `CASS_CONSISTENCY_UNKNOWN` passed to
`cass_execution_profile_set_serial_consistency` results in an error
`CASS_ERROR_LIB_BAD_PARAMS` returned.
This commit refactors `cass_cluster_set_serial_consistency` to use the
newly introduced `get_serial_consistency_from_cass_consistency`.
This makes clear what the differences are in the handling of serial
consistency between statement/batch, execution profile and cluster.
`impl TryFrom<CassConsistency> for SerialConsistency` is removed,
as it's:
- Not used anymore in the codebase.
- Prone to misuse, because there are two different semantics of such
  conversion:
  1. For statement/batch, where `CASS_CONSISTENCY_UNKNOWN` is allowed
     and must be handled.
  2. For execution profiles/cluster, where `CASS_CONSISTENCY_UNKNOWN`
     is not allowed, and must result in an error.

This commit finishes corrections made to serial consistency handling.
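The two conversion semantics described in the commits above can be sketched roughly as follows. This is a minimal illustration, not the wrapper's actual code: the enum definitions, `BadParams`, and both function names are hypothetical stand-ins, and only the mapping rules (`UNKNOWN` means "unset", `ANY` means "no serial consistency", `UNKNOWN` is rejected at cluster level) come from the commit messages.

```rust
/// Hypothetical stand-in for the relevant C enum values.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum CassConsistency {
    Unknown,
    Any,
    One,
    Quorum,
    Serial,
    LocalSerial,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SerialConsistency {
    Serial,
    LocalSerial,
}

/// A setting that is either left unset (defer to the next level) or set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MaybeUnset<T> {
    Unset,
    Set(T),
}

/// Hypothetical error type, standing in for `CASS_ERROR_LIB_BAD_PARAMS`.
#[derive(Debug, PartialEq, Eq)]
pub struct BadParams;

/// Statement/batch semantics: UNKNOWN is allowed and means "unset";
/// ANY means "explicitly no serial consistency".
pub fn serial_consistency_for_statement(
    c: CassConsistency,
) -> Result<MaybeUnset<Option<SerialConsistency>>, BadParams> {
    match c {
        CassConsistency::Unknown => Ok(MaybeUnset::Unset),
        CassConsistency::Any => Ok(MaybeUnset::Set(None)),
        CassConsistency::Serial => Ok(MaybeUnset::Set(Some(SerialConsistency::Serial))),
        CassConsistency::LocalSerial => {
            Ok(MaybeUnset::Set(Some(SerialConsistency::LocalSerial)))
        }
        // Regular (non-serial) consistencies are not valid here.
        _ => Err(BadParams),
    }
}

/// Cluster semantics: UNKNOWN is not allowed and results in an error.
pub fn serial_consistency_for_cluster(
    c: CassConsistency,
) -> Result<Option<SerialConsistency>, BadParams> {
    match serial_consistency_for_statement(c)? {
        MaybeUnset::Unset => Err(BadParams),
        MaybeUnset::Set(sc) => Ok(sc),
    }
}
```

Since the Rust Driver cannot yet unset serial consistency on a statement, a caller on the statement/batch path would currently turn `MaybeUnset::Unset` into an error as well; keeping the two functions separate makes that a deliberate choice at the call site rather than a property of the conversion.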
The default serial consistency in the Rust driver is LocalSerial, which
is different to Any in the CPP driver. This commit sets the default
serial consistency to None, which is equivalent to Any.
This makes CPP-Rust driver consistent with the CPP driver in terms
of the default serial consistency.

Note: Rust driver purposefully set the default serial consistency to
LocalSerial instead of None. The rationale taken from the issue
(scylladb/scylla-rust-driver#277):

> Using lightweight transactions in CQL requires setting an additional
> consistency level, the so called serial consistency level, on top of
> the regular one. The serial consistency level can only take two
> values: SERIAL and LOCAL_SERIAL. There's currently no default, so
> using lwt results in an error message: `Consistency level for LWT is
> missing for a request with conditions`. We should consider picking
> a sensible default value in order to improve the user experience
> of the driver.

Given this, we should consider changing the default serial
consistency in CPP-Rust driver to LocalSerial, too.
The relevant issue: scylladb/scylla-ccm#646
has been fixed some time ago.
The number of defaults in the CPP Driver is quite large, so not to
accidentally omit any, the whole list of them, as well as the
configuration code, is pasted in a comment in `cass_cluster_new`.
This is in line with the CPP Driver.
The keepalive is enabled by default in the CPP driver.
There is one difference: the default TCP keepalive interval (delay
after the last message sent/received before the first keepalive probe
is sent) is set to 0 sec in the CPP driver. However, libuv started
to reject such configuration on purpose since 1.49, so let's follow
its reasoning and set it to 2 sec (Rust driver warns that 1 sec
is too short and would drag a performance penalty, whatever).
See:
- <https://docs.libuv.org/en/v1.x/tcp.html#c.uv_tcp_keepalive>
- <https://github.com/libuv/libuv/blob/513751e2fcf1b5be5238de66b2c06f3e0623aca0/src/unix/tcp.c#L591>
- <https://github.com/libuv/libuv/blob/513751e2fcf1b5be5238de66b2c06f3e0623aca0/src/unix/tcp.c#L545>
`build_session_builder` makes perfect sense as a method of `CassCluster`
rather than a standalone function.

This commit is viewed best without whitespace changes.
The default execution profile in the Scylla Rust wrapper
has been setting default retry policy and speculative execution policy
correctly (in line with the CPP Driver). However, it's a better
defensive pattern to set these explicitly in the default execution
profile builder, so that any future changes to the default execution
profile in the Rust Driver do not break the Scylla Rust wrapper.
CPP Driver uses DCAware load balancing policy by default.
The tricky part is that it initializes the local DC to the first node
the client connects to. This is tricky, a possible footgun, and hard to
do with the current Rust driver implementation, which is why we use
RoundRobin by default (we turn off DC awareness by default, requiring
the user to provide the local DC explicitly if they want to use it).

The comments describing this behaviour are added to the `cluster.rs`
and `load_balancing.rs` files for better discoverability and
understanding of this divergence.
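The default described in this commit message amounts to the following decision. This is a sketch only: `LoadBalancingKind` and `choose_load_balancing` are hypothetical names, not part of the wrapper or the Rust Driver.

```rust
/// Hypothetical description of the effective load balancing mode.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum LoadBalancingKind {
    /// No DC awareness: all nodes are treated equally.
    RoundRobin,
    /// DC awareness with a local DC that the user named explicitly.
    DcAware { local_dc: String },
}

/// Pick the mode from an optional, user-supplied local DC name.
/// The local DC is never guessed from the first contacted node.
pub fn choose_load_balancing(local_dc: Option<&str>) -> LoadBalancingKind {
    match local_dc {
        Some(dc) => LoadBalancingKind::DcAware {
            local_dc: dc.to_owned(),
        },
        None => LoadBalancingKind::RoundRobin,
    }
}
```

With nothing configured, `choose_load_balancing(None)` yields `RoundRobin`; DC awareness appears only once a name such as `"dc1"` is passed in.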
This commit modifies the `CassExecProfile` to use the defaults from
the cluster's default profile for settings that are not explicitly
set in the execution profile. This is consistent with the behavior
of the CPP driver.
Note that the consistency is handled separately, as it has a separate
logic in CPP Driver (whether it makes sense or not is another topic
for discussion): if consistency is not set in the execution profile,
it is set to the hardcoded default consistency, not the one from
the cluster's default profile.
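The resolution rule described above can be sketched as below. All names (`ProfileOverrides`, `ResolvedProfile`, `resolve`, `or_default`) are hypothetical, only two settings are modelled, and the value chosen for `DEFAULT_CONSISTENCY` is an assumption made for the example.

```rust
use std::time::Duration;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Consistency {
    One,
    LocalOne,
    Quorum,
}

/// Hardcoded fallback; the concrete value is an assumption of this sketch.
pub const DEFAULT_CONSISTENCY: Consistency = Consistency::LocalOne;

/// A setting that is either left unset or explicitly set.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum MaybeUnset<T> {
    Unset,
    Set(T),
}

impl<T> MaybeUnset<T> {
    /// Return the set value, or `default` if the setting was left unset.
    pub fn or_default(self, default: T) -> T {
        match self {
            MaybeUnset::Unset => default,
            MaybeUnset::Set(value) => value,
        }
    }
}

/// Fully resolved settings, as the cluster's default profile holds them.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct ResolvedProfile {
    pub consistency: Consistency,
    pub request_timeout: Duration,
}

/// What the user explicitly set on an execution profile.
#[derive(Debug, Clone, Copy)]
pub struct ProfileOverrides {
    pub consistency: MaybeUnset<Consistency>,
    pub request_timeout: MaybeUnset<Duration>,
}

/// Resolve an execution profile against the cluster's default profile.
pub fn resolve(overrides: ProfileOverrides, cluster_default: ResolvedProfile) -> ResolvedProfile {
    ResolvedProfile {
        // Special case: unset consistency falls back to the hardcoded
        // default, not to the cluster's default profile.
        consistency: overrides.consistency.or_default(DEFAULT_CONSISTENCY),
        // Every other unset setting is inherited from the cluster.
        request_timeout: overrides
            .request_timeout
            .or_default(cluster_default.request_timeout),
    }
}
```

Resolving at session-build time, rather than storing already-merged values in the profile, is one way to keep "unset" observable until the cluster defaults are final.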
According to the CPP Driver's code, passing a null retry policy pointer
to `cass_execution_profile_set_retry_policy` shall unset the retry
policy in the execution profile, making the driver use the retry
policy from the cluster's default profile.
This has been different in the Rust wrapper, which has returned
`CASS_ERROR_LIB_BAD_PARAMS` when passed a null pointer.
This commit fixes this, using the new execution profile overrides
mechanism.
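A simplified model of the "null pointer unsets the policy" behaviour is sketched below. Here `Option` stands in for the nullable C pointer, and the trait, policy types, and `ExecProfile` are hypothetical stand-ins rather than the wrapper's real types.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the driver's retry policy trait.
pub trait RetryPolicy {
    fn name(&self) -> &'static str;
}

pub struct DefaultRetryPolicy;
impl RetryPolicy for DefaultRetryPolicy {
    fn name(&self) -> &'static str {
        "default"
    }
}

pub struct FallthroughRetryPolicy;
impl RetryPolicy for FallthroughRetryPolicy {
    fn name(&self) -> &'static str {
        "fallthrough"
    }
}

/// Hypothetical execution profile that remembers only explicit overrides.
pub struct ExecProfile {
    retry_policy: Option<Arc<dyn RetryPolicy>>,
}

impl ExecProfile {
    pub fn new() -> Self {
        Self { retry_policy: None }
    }

    /// `None` models a null pointer passed from C: it unsets the override
    /// instead of being rejected as a bad parameter.
    pub fn set_retry_policy(&mut self, policy: Option<Arc<dyn RetryPolicy>>) {
        self.retry_policy = policy;
    }

    /// Use the profile's own policy if set, else the cluster's default one.
    pub fn effective_retry_policy(
        &self,
        cluster_default: &Arc<dyn RetryPolicy>,
    ) -> Arc<dyn RetryPolicy> {
        self.retry_policy
            .clone()
            .unwrap_or_else(|| Arc::clone(cluster_default))
    }
}
```

Setting a policy and later passing `None` returns the profile to the cluster's policy, which mirrors the set-then-unset scenario that the unit test mentioned later in this PR is said to cover.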
@wprzytula wprzytula force-pushed the fix-driver-defaults branch from f9ba8ff to 19b229e Compare June 30, 2025 16:46
@wprzytula wprzytula marked this pull request as ready for review June 30, 2025 17:24
@wprzytula wprzytula requested review from Copilot and Lorak-mmk June 30, 2025 17:24

@Copilot Copilot AI left a comment


Pull Request Overview

This PR aligns the Rust driver’s configuration defaults and execution profile behavior with the CPP Driver, while refactoring and improving code clarity.

  • Refactors consistency and serial consistency handling using new Rust syntax and a MaybeUnset enum
  • Moves build_session_builder to a method on CassCluster and updates related default configurations
  • Adjusts default TCP keepalive, timestamp generator settings, and enhances execution profile defaults management

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| scylla-rust-wrapper/src/statement.rs | Updates consistency handling with new Result pattern and explicit error paths |
| scylla-rust-wrapper/src/session.rs | Changes build_session_builder usage to a method on CassCluster |
| scylla-rust-wrapper/src/load_balancing.rs | Updates load balancing comments to explain deviation from CPP Driver defaults |
| scylla-rust-wrapper/src/exec_profile.rs | Refactors execution profile settings inheritance and overrides tracking |
| scylla-rust-wrapper/src/cluster.rs | Adjusts cluster defaults including TCP keepalive and timestamp generator |
| Other files | Various style, typo fixes and API consistency improvements |

Comments suppressed due to low confidence (2)

scylla-rust-wrapper/src/statement.rs:337

  • Ensure to revisit the FIXME comments when unsetting consistency becomes supported, and consider adding a note in the public API documentation to explain the current limitation.
    let Ok(maybe_set_consistency) = get_consistency_from_cass_consistency(consistency) else {

scylla-rust-wrapper/src/cluster.rs:304

  • [nitpick] The new TCP keepalive interval of 2 seconds is well-documented; verify cross-environment support to prevent unexpected connection issues.
            .tcp_keepalive_interval(DEFAULT_TCP_KEEPALIVE_INTERVAL)

Comment on lines 294 to 297
.local_ip_address(DEFAULT_LOCAL_IP_ADDRESS)
.shard_aware_local_port_range(DEFAULT_SHARD_AWARE_LOCAL_PORT_RANGE)
.timestamp_generator(Arc::new(MonotonicTimestampGenerator::new()))
};
Collaborator


In cpp-driver I see:

  void set_timestamp_gen(TimestampGenerator* timestamp_gen) {
    if (timestamp_gen == NULL) return;
    timestamp_gen_.reset(timestamp_gen);
  }

Is it even possible to not use client-side timestamps there?
Hmm... I looked at the docs and I see this function: cass_timestamp_gen_server_side_new
So that would imply that either the TimestampGenerator interface in cpp-driver allows for both client-side and server-side timestamps, or there is some other place that controls it.
Interestingly, docs for this function also include: "Note: This is the default timestamp generator."
So are you really sure that Monotonic is the default in cpp-driver and there is no other place in the code that unexpectedly makes the driver use a different one by default?

In general defaulting to client-side timestamps seems weird to me, I'd prefer to default to server-side.

Collaborator Author


Is it even possible to not use client-side timestamps there?

Yes, by passing an instance of the server-side timestamp generator, which may be obtained by cass_timestamp_gen_server_side_new. This API is fully analogous to retry policy API, where you similarly cannot pass a NULL - some retry policy is always needed, which is also the case with timestamp generators. You can also change the used RP/timestamp generator.

So are you really sure that Monotonic is the default in cpp-driver and there is no other place in the code that unexpectedly makes the driver use a different one by default?

CPP Driver, by default, does use client-side timestamps.

Excerpt from topics/basics/client_side_timestamps/README.md:

# Client-side timestamps

**Note**: Cassandra 2.1+ is required.

Cassandra uses timestamps to serialize write operations. That is, values with a
more current timestamp are considered to be the most up-to-date version of that
information. By default, timestamps are assigned by the driver on the
client-side. This behavior can be overridden by configuring the driver to use a
timestamp generator or assigning a timestamp directly to a [`CassStatement`] or
[`CassBatch`].
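For readers unfamiliar with the idea, a monotonic client-side timestamp generator can be sketched as below. This is an illustrative sketch only, not the implementation used by either driver: `MonotonicTimestamps` and its methods are hypothetical names, and features such as clock-skew warnings are left out.

```rust
use std::sync::atomic::{AtomicI64, Ordering};
use std::time::{SystemTime, UNIX_EPOCH};

/// Hands out strictly increasing microsecond timestamps, even if the
/// wall clock stalls or moves backwards.
pub struct MonotonicTimestamps {
    last_micros: AtomicI64,
}

impl MonotonicTimestamps {
    pub fn new() -> Self {
        Self {
            last_micros: AtomicI64::new(0),
        }
    }

    /// Current wall-clock time in microseconds since the Unix epoch
    /// (0 if the clock reports a time before the epoch).
    fn wall_clock_micros() -> i64 {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_micros() as i64)
            .unwrap_or(0)
    }

    /// Core logic with an injectable clock reading, which eases testing.
    pub fn next_with_clock(&self, now_micros: i64) -> i64 {
        let mut last = self.last_micros.load(Ordering::Relaxed);
        loop {
            // Use the clock if it moved forward, else bump the last value.
            let candidate = if now_micros > last { now_micros } else { last + 1 };
            match self.last_micros.compare_exchange_weak(
                last,
                candidate,
                Ordering::Relaxed,
                Ordering::Relaxed,
            ) {
                Ok(_) => return candidate,
                // Another thread won the race; retry with its value.
                Err(observed) => last = observed,
            }
        }
    }

    /// Next timestamp based on the system clock.
    pub fn next(&self) -> i64 {
        self.next_with_clock(Self::wall_clock_micros())
    }
}
```

Two calls that observe the same clock reading still receive distinct, ordered timestamps, which is the property that matters when successive writes to the same cell must not tie.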

Collaborator

@Lorak-mmk Lorak-mmk Jul 1, 2025


So there is a bug in the documentation, right?

Collaborator Author


What bug? I don't see any.

Perhaps you misparse the gen part in cass_timestamp_gen_server_side_new as generate instead of generator.

Collaborator


Documentation for cass_timestamp_gen_server_side_new says that it is the default generator.
Documentation for cass_timestamp_gen_monotonic_new does not say it is the default.

Collaborator Author

@wprzytula wprzytula Jul 1, 2025


Interestingly, docs for this function also include: "Note: This is the default timestamp generator."

Ah, I've missed this.

This is the commit that changed the default timestamp generator to Monotonic:
datastax/cpp-driver@c705ea5
They indeed forgot to update the docs in cassandra.h.

See the related Datastax Jira issue.

Collaborator


This issue says "For consistency with other drivers, the cpp-driver should enable client timestamps by default in the next non-patch release."

Do you know if this is really the case? I was under the impression that drivers generally default to server-side timestamps.

Collaborator


Commit: "exec_profile: use cluster defaults for unset settings "

The behavior this commit fixes looks like a serious bug. Was that really the case before that cluster (=default EP) settings were ignored?

Collaborator Author

@wprzytula wprzytula Jul 1, 2025


They were ignored iff an execution profile was set for the executed statement.
This is the Rust Driver's semantics, that's where the bug comes from.

Collaborator


So in Rust Driver the cluster-level default will be ignored when EP is set on a statement, even if this EP doesn't have some value set?

Collaborator Author


Yes, because in Rust Driver there is no notion of unset setting in an exec profile. This is what I'm referring to with MaybeUnset::Unset - a condition that someone left a setting unset, which is equivalent to the semantics of UNSET in CQL (Sobociński logic).

Collaborator


"(whether it makes sense or not is another topic
for discussion): if consistency is not set in the execution profile,
it is set to the hardcoded default consistency, not the one from
the cluster's default profile."

We are aiming for a "drop-in" replacement, but I think that as with most other things it is a matter of balance.
For example, we are not interested in replicating cpp-driver's bugs.
When migrating to this driver, people will have to pay more attention than with an ordinary update, so I think it is a good idea to have reasonable defaults in cpp-rust-driver, instead of tying everything to cpp-driver.
Note that cpp-driver changed its default consistency multiple times, so why can't we when implementing a new driver?

Comment on lines +751 to +759
let maybe_retry_policy: Option<Arc<dyn RetryPolicy>> =
    ArcFFI::as_ref(retry_policy).map(|rp| match rp {
        CassRetryPolicy::Default(default) => Arc::clone(default) as Arc<dyn RetryPolicy>,
        CassRetryPolicy::Fallthrough(fallthrough) => Arc::clone(fallthrough) as _,
        CassRetryPolicy::DowngradingConsistency(downgrading) => Arc::clone(downgrading) as _,
        CassRetryPolicy::Logging(logging) => Arc::clone(logging) as _,
        #[cfg(cpp_integration_testing)]
        CassRetryPolicy::Ignoring(ignoring) => Arc::clone(ignoring) as _,
    });
Collaborator


Why is the first cast needed? It was not present before.

Collaborator Author


Previously we had Arc<dyn RetryPolicy>. Now we have Option<Arc<dyn RetryPolicy>>. The presence of an Option wrapper somehow alters the type-inferring logic.
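The shape of the code under discussion can be reproduced in isolation as below. This is a minimal sketch with hypothetical types (`AnyPolicy`, `to_dyn`, and the two policy structs); it mirrors the pattern from the diff, with the target type spelled out in the first arm and `as _` in the remaining one, and it makes no claim about the exact inference rules involved.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for the driver's retry policy trait.
pub trait RetryPolicy {
    fn name(&self) -> &'static str;
}

pub struct DefaultPolicy;
impl RetryPolicy for DefaultPolicy {
    fn name(&self) -> &'static str {
        "default"
    }
}

pub struct FallthroughPolicy;
impl RetryPolicy for FallthroughPolicy {
    fn name(&self) -> &'static str {
        "fallthrough"
    }
}

/// Hypothetical enum of concrete policies, similar in spirit to `CassRetryPolicy`.
pub enum AnyPolicy {
    Default(Arc<DefaultPolicy>),
    Fallthrough(Arc<FallthroughPolicy>),
}

/// Convert an optional concrete policy into an optional trait object.
pub fn to_dyn(policy: Option<&AnyPolicy>) -> Option<Arc<dyn RetryPolicy>> {
    policy.map(|p| match p {
        // The first arm names the trait-object type explicitly...
        AnyPolicy::Default(d) => Arc::clone(d) as Arc<dyn RetryPolicy>,
        // ...and the later arm leaves the cast target to be inferred.
        AnyPolicy::Fallthrough(f) => Arc::clone(f) as _,
    })
}
```

Passing `None` models the null-pointer case and simply yields `None`, so the `Option` wrapper carries the "unset" information all the way to the caller.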

CassPtr has two constructors that are useful for testing:
- `null()` for creating a null pointer
- `null_mut()` for creating a null mutable pointer
Both are safe to be used, because the unsafe code that CassPtr uses
always safely dereferences the pointer, i.e., only after nullity checks.
A unit test is added to ensure that when an execution profile does not
set specific settings (serial consistency, request timeout, retry policy,
and speculative execution policy), as well as if it sets them once
and later unsets them, the settings are fetched from the default
execution profile of the cluster.
@wprzytula wprzytula force-pushed the fix-driver-defaults branch from 19b229e to 92576fd Compare July 1, 2025 06:19
@wprzytula wprzytula requested a review from Lorak-mmk July 1, 2025 06:20
Labels
bug Something isn't working P1 P1 priority item - very important
2 participants