Skip to content

deser: lazy CassRow construction #213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Apr 2, 2025

Conversation

muzarski
Copy link
Collaborator

@muzarski muzarski commented Dec 3, 2024

Depends on: #208

Motivation

Currently, we unnecessarily construct CassRows eagerly. What we can do (and what cpp-driver does) is to deserialize rows lazily, during iteration using iterator created via cass_iterator_from_result.

Why?
Notice that CassResult has no methods that request a row at given index. The only exception is cass_result_first_row - but we can simply deserialize only first row, and cache it in CassResult so the call to cass_result_first_row is always fast.

Changes

  • Adjusted CassRowsResult, so now it only holds deserialized first row (instead of all rows), and payload required for deserialization (i.e. DeserializedMetadataAndRawRows and CassResultMetadata).
  • Adjusted CassResultIterator, so now it is an enum - to distinguish whether it's an iterator created from either non-rows result or rows result.
  • Introduced CassRowsResultIterator, which now uses TypedRowIterator under the hood. The rows are now deserialized on the fly, during cass_iterator_next call.
  • Adjusted CassRow, and newly introduced CassRowsResultIterator, so they now borrow CassResultMetadata from CassResult (instead of sharing ownership via Arc). Their lifetime is already bound by the lifetime of CassResult, so there is no point in cloning an Arc.

Pre-review checklist

  • I have split my patch into logically separate commits.
  • All commit messages clearly explain what they change and why.
  • PR description sums up the changes and reasons why they should be introduced.
  • [ ] I have implemented Rust unit tests for the features/changes introduced.
  • [ ] I have enabled appropriate tests in .github/workflows/build.yml in gtest_filter.
  • [ ] I have enabled appropriate tests in .github/workflows/cassandra.yml in gtest_filter.

@muzarski muzarski marked this pull request as draft December 3, 2024 20:32
@muzarski muzarski self-assigned this Dec 3, 2024
@muzarski muzarski mentioned this pull request Dec 9, 2024
@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch 2 times, most recently from 40cf28e to da5ce8e Compare December 16, 2024 01:44
@muzarski muzarski requested a review from Copilot March 4, 2025 16:45
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Overview

This PR updates the FFI interface across multiple modules by replacing raw pointer types with explicit smart pointer wrappers (e.g. CassSharedPtr, CassExclusiveMutPtr, etc.) and updating the associated usage of BoxFFI/ArcFFI throughout the codebase. Key changes include:

  • Converting function signatures from raw pointers (e.g. *mut CassCluster) to well‐typed smart pointers.
  • Updating FFI-related macros and helper calls to use the new pointer types with explicit unwrap calls.
  • Adjusting tests and integration functions to use the new pointer conversion and cloning semantics.

Reviewed Changes

File Description
scylla-rust-wrapper/src/cluster.rs Updated pointer types and conversion logic in cluster setup functions.
scylla-rust-wrapper/src/prepared.rs Converted prepared statement FFI functions to use smart pointers.
scylla-rust-wrapper/src/exec_profile.rs Updated execution profile functions to work with new pointer wrappers.
... (other files changed) Similar changes applied for future, binding, logging, metadata, batch, statement, query_error, ssl, retry_policy, integration_testing, cass_error, and session modules.

Copilot reviewed 27 out of 27 changed files in this pull request and generated 2 comments.

@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch 4 times, most recently from 73d4b45 to 4950df5 Compare March 18, 2025 13:49
@muzarski muzarski added this to the 0.4 milestone Mar 18, 2025
@muzarski muzarski marked this pull request as ready for review March 18, 2025 14:02
@muzarski
Copy link
Collaborator Author

This one is ready for review. #208 needs to be merged first, though.

@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from 4d0f7c5 to e66b4bd Compare April 1, 2025 09:49
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 1, 2025

Rebased on master (to include #232)

@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from e66b4bd to 700cf6e Compare April 1, 2025 10:26
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 1, 2025

v1.1: Adjusted safety comments and commit messages as @Lorak-mmk suggested.

The CI will fail - still waiting for #233 to finish (cmake version bump)

@muzarski
Copy link
Collaborator Author

muzarski commented Apr 1, 2025

WTF, Build / Build, lint and run unit tests passed. It looks like not all GH runners have CMake updated to 4.0 yet.

@muzarski muzarski requested a review from Lorak-mmk April 1, 2025 10:52
Just introducing the config file is unfortunately not enough.
CMakeRust does not seem to pick up the config file.

Even if it did (after some adjustments), it defines the RUSTFLAGS env variable
which takes precedence over the config file. The value of RUSTFLAGS changes
depending on the platform (so we can't just put them in the config file).
@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from 700cf6e to d2d910c Compare April 1, 2025 11:50
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 1, 2025

v1.2: Rebased on master (cmake version bump) and addressed @Lorak-mmk comments

@Lorak-mmk
Copy link
Collaborator

Regarding the self-reference problem: maybe we could use yoke as we did in Rust Driver?

@muzarski
Copy link
Collaborator Author

muzarski commented Apr 1, 2025

Regarding the self-reference problem: maybe we could use yoke as we did in Rust Driver?

I considered this. I'd need to investigate how easy it would be to implement it. I'm not that familiar with yoke traits, but I always have the rust-driver deserialization code as a reference.

muzarski added 3 commits April 2, 2025 04:42
It is to distinguish between the two types of iterators that we have:
- iterator over non-rows result (always returning null)
- iterator over rows result

In one of the next commits, we will modify `CassRowsResultIterator`, so
it holds `TypedRowIterator` instead of reference to CassRowsResult.

This change would be necessary then anyway, so it's better to do it now
and reduce the noise during review in later commit.
If we look at the CassResult methods, there is no method that requests
a CassRow at given index. It means, that we can lazily deserialize rows
during iteration (`cass_iterator_from_result`).

The only exception is `cass_result_first_row`. The result should
hold first deserialized row, and simply return it. We should not
deserialize the row each time this method is called.

This is exactly what this commit does. It adjusts both `CassRowsResult`
and `CassRowsResultIterator`:
- CassRowsResult now holds only first row, and all deserialization payload (i.e.
  metadata and serialized rows).
- CassRowsResultIterator from now on holds a TypedRowIterator, instead
  of reference to CassRowsResult.

This semantic is more efficient, and is employed by cpp-driver as well.
In the later commit, this PR introduces self-borrowing to the codebase.
As of now, it's not possible to express it using safe Rust. Luckily,
there is a yoke crate that will allow us to do that.
@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from d2d910c to 1b4b176 Compare April 2, 2025 04:29
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 2, 2025

v2.0:

@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from 1b4b176 to 1db2e37 Compare April 2, 2025 04:32
@muzarski muzarski requested review from wprzytula and Lorak-mmk April 2, 2025 04:33
@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from 1db2e37 to 4edfb2e Compare April 2, 2025 12:43
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 2, 2025

v2.1: Addressed @wprzytula comments

@muzarski muzarski requested a review from wprzytula April 2, 2025 12:43
@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from 4edfb2e to e8e910f Compare April 2, 2025 12:56
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 2, 2025

v2.2: Renamed the constructor so it indicates that it constructs first row only.

@muzarski muzarski requested a review from wprzytula April 2, 2025 12:56
Copy link
Collaborator

@wprzytula wprzytula left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

muzarski added 3 commits April 2, 2025 15:09
Because of that, we need to introduce a lifetime parameter to CassRow.
CassRow will from now on borrow metadata from CassResult (CassRowsResult).

OTOH, notice that CassRowsResult need to hold the deserialized first row.
This is because `cass_result_first_row` method exists. We don't want to
deserialize the row each time this method is called.

Because of that, the self-borrowing appears in our codebase. We have a
CassRowsResult, which holds a CassRow that borrows CassResultMetadata from the CassRowsResult.
It's not possible to express it in safe Rust, thus we make use of yoke crate
which wraps unsafe code and exposes safe abstractions on top of it.
I extracted the logic for each iterator to corresponding methods.
QueryError was replaced with ExecutionError in Rust driver 1.0.
@muzarski muzarski force-pushed the bump-cpp-rust-cfg branch from e8e910f to b6e1f85 Compare April 2, 2025 13:09
@muzarski
Copy link
Collaborator Author

muzarski commented Apr 2, 2025

v2.3: Reduced visibility of RowWithSelfBorrowedMetadata methods to pub(super).

@muzarski muzarski merged commit b304403 into scylladb:master Apr 2, 2025
11 checks passed
@muzarski muzarski mentioned this pull request Apr 14, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants