Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replace scopes with guards #20

Merged
4 commits merged into from Nov 20, 2017
Merged

Replace scopes with guards #20

4 commits merged into from Nov 20, 2017

Conversation

ghost
Copy link

@ghost ghost commented Nov 3, 2017

Copy link
Contributor

@jeehoonkang jeehoonkang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the RFC! I'm almost approving this RFC, but I'd like to discuss the section on default_handle() a little bit more. I left two comments.

}
```

This way `default_handle` bears some cost of reference counting, but it should be
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how this solves the problem discussed in crossbeam-rs/crossbeam-epoch#28. What's the benefit of this over the version in the PR? (https://github.com/crossbeam-rs/crossbeam-epoch/pull/28/files#diff-aeb58025d21f8e06a38eae37fad74b8dR26)

Copy link
Author

@ghost ghost Nov 4, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a more elaborate explanation to the text.
The main benefit is that default_handle is a safe function.

```

This way `default_handle` bears some cost of reference counting, but it should be
forgivable, as is demonstrated by the benchmarks...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there's 1ns diff between epoch::pin() and epoch::default_handle().pin() (12ns vs. 13ns). How about specifically pointing out this benchmark result here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.


### Accessing the default handle

Moreover, we'been having difficulties finding a satisfying signature for the
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"we've been"

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

Similarly, each `Handle` and each `Guard` keeps a pointer to the `Local` associated with it.
Handles and guards are counted using fields `handle_count` and `guard_count` inside `Local`.
When both counts reach zero, the `Arc<Global>` inside `Local` is immediately dropped
(using `ManuallyDrop::drop`), and the heap-allocated `Local` is then marked as deleted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If dropping Arc<Global> happens before marking Local as deleted, isn't it possible that Global is dropped so that every associated Local's are dropped, after which marking the Local as deleted is use-after-free?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point!

This is a mistake - it's actually the other way around. First, the Local is marked as deleted, and then the Arc<Global> is dropped. I'll update the text.


But if handles are not `Send`, a fourth option emerges:

4. Same as the third option, but have internal reference counting using `Rc` instead of `Arc`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the options 3. and 4. are as unsafe as 1. and 2., as they may panic: https://doc.rust-lang.org/1.21.0/std/thread/struct.LocalKey.html#panics . For this reason, I think the signature of thread_local!'s HANDLE.with() is misleading: while the signature says it's safe, it is actually quite unsafe. (I'd like to cc @joshlf, who met this panic in the development of elfmalloc.)

That being said, I'd like to advocate for option 1., which doesn't incur runtime performance.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've rewritten this section of the text. Hopefully it's clearer now and objectively lays out the pros & cons of each option. I've also removed the 3. option.

For this reason, I think the signature of thread_local!'s HANDLE.with() is misleading: while the signature says it's safe, it is actually quite unsafe.

Just to be clear: when you say "it's actually quite unsafe", do you mean "it's actually quite dangerous", or are you talking about "unsafe" as in "invokes undefined behavior"?

@ghost
Copy link
Author

ghost commented Nov 5, 2017

@joshlf Could you please take a look at the Accessing the default handle section, since it is related to elfmalloc? You've been dealing with thread-locals and tuning performance a lot, so your opinion would be very helpful here.

@jeehoonkang is advocating for the 1st option, while I want the 3rd. But in the end, I don't feel particularly strongly about it.

However, there is one thing I do strongly care about: let's avoid unsafe APIs whenever possible. In that spirit, the 2nd option would be better than the 1st one.

@ghost
Copy link
Author

ghost commented Nov 6, 2017

@jeehoonkang

I appreciate your concerns about the performance of the 3rd option. However, I'd like put some things into perspective first. Take a look at the following benchmarks:

#[bench]
fn pin_empty(b: &mut Bencher) {
    b.iter(|| epoch::pin());
}

#[bench]
fn default_handle_pin(b: &mut Bencher) {
    b.iter(|| epoch::default_handle().pin());
}

#[bench]
fn local_handle_pin(b: &mut Bencher) {
    let handle = epoch::default_handle();
    b.iter(|| handle.pin());
}

#[bench]
fn thread_local_handle_pin(b: &mut Bencher) {
    thread_local! {
        static MY: Handle = epoch::default_handle();
    }
    b.iter(|| MY.with(|handle| handle.pin()));
}

Results:

test default_handle_pin      ... bench:          13 ns/iter (+/- 1)
test local_handle_pin        ... bench:           9 ns/iter (+/- 1)
test pin_empty               ... bench:          12 ns/iter (+/- 1)
test thread_local_handle_pin ... bench:           9 ns/iter (+/- 0)

Looks like pinning itself costs 9 nanoseconds. Accessing the thread-local (that's LOCAL.with(...) inside epoch::pin()) adds 3 more nanoseconds. Finally, incrementing/decrementing the reference count to access the default handle costs 1 more nanosecond.

However, if you store the handle in an another thread-local (in this example, MY) and access it with MY.with(...), you don't pay the cost of reference counting. This is also faster than epoch::pin(), perhaps due to better inlining opportunities (?).

The point is, that 1 nanosecond paid on reference counting really doesn't matter at all. If you need maximum performance, it's easy to have it by creating a custom thread-local anyway.

@jeehoonkang
Copy link
Contributor

@stjepang thanks for the kind explanation. Now I prefer the 3rd option. Let's go for it :)

For safety, I incorrectly assumed that both 1st and 3rd options invoke undefined behavior. However, the 3rd option seems to raise a specific panic: HANDLE is already destroyed. It's arguably less unsafe, I think.

Copy link
Contributor

@jeehoonkang jeehoonkang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for a great job!

Copy link

@arthurprs arthurprs left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Solid proposal. Thanks for putting all this together.

@ghost ghost merged commit add3c93 into crossbeam-rs:master Nov 20, 2017
This pull request was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants