Refactor the collector to use reference counting #74
base: master
Conversation
```toml
[profile.release]
debug = true
```

This needs to be commented back out.
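That is, the setting should presumably be restored to a commented form so it can be re-enabled for profiling, e.g.:

```toml
[profile.release]
# debug = true  # uncomment when profiling
```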
```rust
use crate::collector::GcData;
use crate::concurrency::cross_thread_buffer::CrossThreadBuffer;

type DropBuffer = CrossThreadBuffer<Arc<GcData>>;

pub(crate) struct BackgroundDropper {
```
The changes in here are pure premature optimization. But I think it is better than the previous design.
A 4x speed up on what?

@jrmuizel a 4x speedup on the benchmark. This benchmark kinda blows, and I've had an issue to create new ones for a while (#7). But I'm pretty sure that this new design is significantly better on a number of axes anyway, as it removes the dumb parts of the collector where we keep a list of every …
This is around a 4x speedup, but required a lot of code changes. In terms of public API, only `AtomicGc` has a significant diff. (The old API would still work, but the changes make it work more efficiently with the new internals.)

See #74
Probably not the most thorough/helpful for critical feedback, but I was able to ask a bunch of questions 😄
```rust
    //
    // Taken together, this means that this pointer is always valid when executing a method on
    // `AtomicGc`.
    atomic_ptr: AtomicPtr<GcData>,
```
Not sure I fully follow this change: why is `atomic_ptr` no longer an `Arc`? Did the underlying assumption about what the `GcData` is change?
I've changed the semantics of `AtomicGc` so it is not sharable by default. Now you need to use `DerefGc<AtomicGc<T>>` if you want to share it. This is much more flexible, and equivalently performant.
```diff
 ///
 /// `AtomicGc` should be fairly fast, but you may not assume it does not block. In fact in the
 /// presence of an active garbage collection operation, all operations will block. Otherwise
 /// it shouldn't block.
-#[derive(Clone, Debug)]
+#[derive(Debug)]
```
Why did you remove the `Clone` trait?
Because the auto-derive doesn't work anymore, due to the changing type of `atomic_ptr`. Will add a manual impl.
```diff
-let data_ptr = Arc::as_ptr(data_arc);
-
-let atomic_ptr = Arc::new(AtomicPtr::new(data_ptr as _));
+let atomic_ptr = AtomicPtr::new(Arc::as_ptr(data_arc) as _);
```
What does the `as _` mean? My rust is (very) rusty :P
"cast this to the right type but don't make me type out what type I need"
```diff
-let atomic_ptr = Arc::new(AtomicPtr::new(data_ptr as _));
+let atomic_ptr = AtomicPtr::new(Arc::as_ptr(data_arc) as _);
+// Forget the initial data, we will absorb its reference counts
+data.drop_preserving_reference_counts();
```
Why are you dropping the initial data?
Basically this method turns a `Gc` into an `AtomicGc`, and we need to get rid of the old `Gc` to do that. `Gc` is reference counted, but instead of incrementing and then decrementing, we can just absorb the reference count. This idea is used throughout this file.
```rust
ptr = gc_data.scan_ptr().cast();
internal_handle = COLLECTOR.handle_from_data(gc_data);
// The count of the data going out decreases
COLLECTOR.decrement_reference_count(&old_arc);
```
Does the count decrease because you've absorbed the other pointer?
```rust
// Then clear and recycle the buffer
to_drop.clear();
// ignore recycling failures
```
why can you ignore them?
Because if the channel is full, we don't want this whole thread to block indefinitely.
```rust
    /// Indicates to the `BackgroundDropper` that it should sync up with the calling code
    SyncUp(Sender<()>),
}

impl BackgroundDropper {
    const RECYCLING_CHANNEL_SIZE: usize = 1;
```
Can you summarize the changes you've made here? I don't understand the overarching idea
Basically this class has learned how to recycle its buffers so the collector can get them back (rather than allocating a big buffer every time the collector runs).
```rust
#[derive(Copy, Clone, Debug, Eq, PartialEq)]
pub(crate) enum RefCountPolicy {
    // A transient handle doesn't increment or decrement the reference count
    TransientHandle,
```
What is the point of this?
It's used in the `Scan` implementations of `AtomicGc` / `DerefGc` to create handles we can report to the collector without having them mess up our reference counts.
```rust
    // `total_handles` = count_positive + count_negative
    // `count_positive` is always >= the actual count >= 0
    // At the beginning of every collection we recalculate `count_positive`
    count_positive: AtomicI64,
    // `count_negative` is always <= 0
    count_negative: AtomicI64,
```
Can you clarify this?
Not a specific enough question. Basically this is a complex way of letting `count_positive` always be equal to or greater than the real count. The reason it is split is so that threads can decrement without taking a lock.
```rust
use std::cell::RefCell;
use thread_local::ThreadLocal;

pub(crate) struct CrossThreadBuffer<T: Send> {
```
What is this used for? (not sure if I missed its usage somewhere else in the PR)
It's used for the buffer that stores dead data in the collector.
```rust
// No need for collection blocker, as we're not modifying the graph
// This is safe. See comment on the `atomic_ptr` field
let gc_data_ptr = self.atomic_ptr.load(ordering);
```
I think this is a data race. What happens if we:
- Read the data, but don't increment the count yet (count = 1)
- Switch to another thread that overwrites the data and decrements the count
- The collector runs and cleans up the "dead" data
- The read completes, incrementing the count, but the data is already cleaned up and invalid