Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: sharable reference to storage backend #697

Merged
merged 4 commits into from
Jul 18, 2022
Merged

Conversation

roeap
Copy link
Collaborator

@roeap roeap commented Jul 17, 2022

Description

While working on #632 i saw that a shareable storage backend is also required for #690 - as such i decided to move this to a separate PR, to get things stared for #689.

Related Issue(s)

Documentation

Copy link
Collaborator

@wjones127 wjones127 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for pulling this out into a separate PR. 🙌

This looks great to me!

@houqp
Copy link
Member

houqp commented Jul 18, 2022

In the long run, I wonder if it would be better to add a Clone trait constraint to the StorageBackend trait instead.

@wjones127
Copy link
Collaborator

In the long run, I wonder if it would be better to add a Clone trait constraint to the StorageBackend trait instead.

Is the expectation there that internally they might use an Arc so that cloning is cheap? Not sure which way is more idiomatic in Rust.

@houqp
Copy link
Member

houqp commented Jul 18, 2022

Is the expectation there that internally they might use an Arc so that cloning is cheap? Not sure which way is more idiomatic in Rust.

I think using Arc internally would be the last resort if the upstream sdk client doesn't implement Clone because it incurs extra runtime overheads. All of our current storage backends should be able to derive Clone without using Arcs from a quick glance. Most of the derived clone implementation should be very cheap and if it is not it shouldn't be a problem assuming storage backend clone is a very infrequent operation. On the other hand, accessing the backend itself for read/write is a frequent runtime operation, so it's better to not force the atomic reference counter overhead everywhere only because we need infrequent cheap clones.

If clone is a frequent runtime operation, then wrapping all backend with an Arc is likely the right design choice.

We even have a very old task that talked about getting rid of the trait object constraint for storage backend entirely in #257 :)

@wjones127
Copy link
Collaborator

On the other hand, accessing the backend itself for read/write is a frequent runtime operation, so it's better to not force the atomic reference counter overhead everywhere only because we need infrequent cheap clones.

I was under the impression the increment / decrement overhead just happened in Arc::clone() and Arc::drop(), and thus we shouldn't have any overhead within objects, just a little bit when we are creating new structs where we need to pass a reference to the filesystem. Is it called also when calling methods of T on an Arc<T>?

@houqp
Copy link
Member

houqp commented Jul 18, 2022

I was under the impression the increment / decrement overhead just happened in Arc::clone() and Arc::drop(), and thus we shouldn't have any overhead within objects,

I was wrong and you are right in this case @wjones127 :) Accessing the heap object should have no overhead, Arc's overhead only comes from clone and drop. So it won't make a difference for us compared to Rc because clone is a very infrequent operation.

@houqp houqp merged commit 7f0c91d into delta-io:main Jul 18, 2022
@roeap roeap deleted the arc-store branch July 18, 2022 04:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants