Rewrite the local_data implementation #15399
Conversation
/// let y = x.clone();
/// asesrt_eq!(x.try_unwrap(), Err(Rc::new(4u)));
/// ```
#[inline]
I don't think this `inline` is necessary (and similarly for the other new methods here): these functions are generic anyway and so are in crate metadata. Do they make a measurable difference?
I made it `inline` largely because `make_unique()` is `inline` (and of course `deref()` is `#[inline(always)]`). I thought it was a reasonable precedent to default these new methods to being inlined as well.

Also, argh, that comment example has a typo! It should read `assert_eq!`.
Very cool! There should be a slight unease about adding methods to `Rc`, though -- anything that implements `Deref` will have confusing (for users) conflicts between its methods and the contained value's methods. cc @thestinger, who may want to know about the `Rc` changes.
@blake2-ppc Yeah, that's my concern as well, but the ability to deal with a unique `Rc` is useful. One option is to turn them into functions inside the module.
At @huonw's request I put
I'll have the numbers for 100/1000 for TreeMap soon, but last time I ran them it was 29ns/iter and 38ns/iter, so there's a pretty big difference here.
Well ok, that's some inspiration --
@blake2-ppc Good idea. Is using a trait for this better than just using top-level functions in the module?
@kballard one possible alternative to avoid the deref issue is to make them static methods.
Here are the benchmarks again using `TreeMap`.
Note the 100- and 1000-key behaviors are vastly better than with the `Vec` implementation.
@sfackler I hadn't considered static methods. That's better than free functions in the module, which is what I was thinking of doing.
Updated. r?
And again. Fixed the
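As a side note on the static-method suggestion: here is a minimal sketch of how associated functions sidestep the `Deref` method-conflict problem. It uses today's `Rc::try_unwrap`, which adopted exactly this convention; the example values are invented for illustration.

```rust
use std::rc::Rc;

// Calling try_unwrap as Rc::try_unwrap(x) rather than x.try_unwrap()
// means the name can never collide, via Deref, with a method on the
// contained value.
fn main() {
    // Uniquely owned: unwrapping succeeds and yields the value.
    let x = Rc::new(4u32);
    assert_eq!(Rc::try_unwrap(x), Ok(4));

    // Shared: unwrapping fails because another handle still exists.
    let y = Rc::new(5u32);
    let z = Rc::clone(&y);
    assert!(Rc::try_unwrap(y).is_err());
    assert_eq!(*z, 5);
}
```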
I need to take some more time to read this more thoroughly, but casually adding functions to core modules needs discussion first.
@alexcrichton I am not at all surprised you want to discuss this first, and that's fine (as long as it doesn't end up stalling out like other PRs where you've wanted to discuss first). The alternative to adding the
Previously we rejected the
@pcwalton What sort of warnings are you thinking about adding to the docs? It's a bit of a specialized tool, and you need to be careful about when you assume it will or won't be able to work, but it's not really dangerous in any way. I'm happy to add wording suggesting caution, but I don't know what to actually say that makes sense. "You should be careful using this because..."
Taking a step back, I'm not sure that a shared ownership model is one that we would like to live with moving into the future. Ideally, I would imagine that task-local-data looks something like this:
With a design such as this, all TLD lookups/sets are O(1), unlike the O(log n) in this PR and the O(n) previously. TLD is far less valuable if it has any complexity other than O(1) for the most part. Basically, in an ideal world, there's no space for this shared ownership model. I think it's a great idea to reduce allocations, but I think it's important to allow for references into the actual TLS data rather than handing out `Rc` handles.

I know that @pcwalton has wanted to use TLS for many high-performance use cases in servo, but has been unable to due to the current design.
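To make the O(1) idea concrete, here is a hedged sketch (not the PR's code; the names `TldKey`, `TLD`, and `replace` are invented here) of slot-indexed task-local storage, where each key carries a fixed slot index so lookup and replacement are a single array index instead of a map search:

```rust
use std::any::Any;
use std::cell::RefCell;

// Each key owns a fixed slot index, assigned at key-creation time.
struct TldKey {
    index: usize,
}

thread_local! {
    // One slot vector per thread/task; slots are grown lazily.
    static TLD: RefCell<Vec<Option<Box<dyn Any>>>> = RefCell::new(Vec::new());
}

// Replace the value in a key's slot, returning the previous value.
fn replace<T: 'static>(key: &TldKey, value: Option<T>) -> Option<T> {
    TLD.with(|tld| {
        let mut slots = tld.borrow_mut();
        if slots.len() <= key.index {
            slots.resize_with(key.index + 1, || None); // grow on first use
        }
        let boxed = value.map(|v| Box::new(v) as Box<dyn Any>);
        std::mem::replace(&mut slots[key.index], boxed)
            .and_then(|b| b.downcast::<T>().ok())
            .map(|b| *b)
    })
}

fn main() {
    let key = TldKey { index: 0 };
    assert_eq!(replace(&key, Some(1u32)), None);
    assert_eq!(replace(&key, Some(2u32)), Some(1));
    assert_eq!(replace::<u32>(&key, None), Some(2));
}
```

This only illustrates the constant-time indexing; it still boxes each value, so it does not by itself deliver the "references into the actual TLS data" property described above.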
@alexcrichton I agree that an O(1) approach would be ideal, but as long as we're handing out references to values stored inline in TLD (instead of separately allocated as I'm doing here), we need to effectively be a `RefCell`.

Now of course, if you feel strongly about this, and think that your described TLD scheme will actually be implemented some day (because it sure isn't being implemented today), then I can reintroduce dynamic failure. The only benefit to doing that right now is to preserve API compatibility with this proposed future TLD rewrite (and that's assuming there's no API breakage from this rewrite otherwise). Fundamentally, it would just involve doing that.

FWIW, it's possible to get O(1) lookup and still preserve the shared ownership model. The time complexity does not come from the need for allocations but rather from the fact that we do not in fact have a known offset into TLD for each key, and therefore need to perform some kind of search. And the allocation with this implementation only happens (1) on first insert of a key, or (2) when setting a key value that is not uniquely owned. As long as clients of the TLD API avoid shared ownership, replacing values re-uses the existing allocation. The way this handles shared ownership is compatible with an O(1) TLD scheme like you described, so keeping this API does not preclude fixing lookup. So the only real reason to switch back to a dynamic failure model is if you believe that the allocation will have an important performance impact on TLD (you already have several indirections in your scheme, so adding one more to read the value out of the allocation does not seem necessarily problematic).

I don't know how your scheme is going to handle dynamically loading dylibs at runtime that define new TLD keys. Any existing tasks won't have space in their allocated TLD table for the keys defined by the runtime-loaded dylib. And any resolution to this will involve reallocating the tables somehow, which will break any references to contained values anyway. AFAICT supporting this scenario[1] will require keeping an allocation for each value, which brings us back to the shared-ownership model.

[1] and I think we have to. It's a bit prohibitive to say that e.g. syntax extensions cannot use TLD, because while the syntax extension may not use it directly, it may use a library that itself uses TLD (that the syntax extension author isn't even aware of).
(A hashmap would give (slightly more expensive) O(1) lookups, although the current library organisation doesn't allow that yet, and @kballard had some segfaulting issues when he tried, iirc.)
@huonw Yeah, I think my segfaulting issues stemmed from changing the implementation of local_data in the tests without recompiling everything from scratch. And since it can't be compiled from scratch (because hashmap lives in std), that makes it a bit of a non-starter.
@alexcrichton How do you want to move forward on this? I think this PR needs to land in some form, because it improves the performance of local_data and that's useful. Do you have a strong objection to landing it with the shared ownership model? As already documented, the shared ownership model can still be preserved in a hypothetical future O(1) lookup implementation. Alternatively I could modify the PR to reintroduce the dynamic-failure model.
I personally feel that this is not the right approach for TLS. The primitive that TLS is providing is not shared ownership, but rather task-local storage. The old
I'm fine with landing the addition of
Note: I have a pending rewrite of this PR that addresses @alexcrichton's issues. I hope to find the time to complete it this coming week.
The correct terminology is Task-Local Data, or TLD. Task-Local Storage, or TLS, is the old terminology that was abandoned because of the confusion with Thread-Local Storage (TLS).
Rewritten. r? @alexcrichton
I ditched
@@ -66,33 +69,51 @@ pub type Key<T> = &'static KeyValue<T>;
#[allow(missing_doc)]
pub enum KeyValue<T> { Key }

#[doc(hidden)]
Could you be sure to confirm that this doesn't pop up in the documentation by accident? In theory `grep -R LocalData doc/` should turn up nothing. I forget if this was for a broken link in the past or some weird rustdoc bug which has since been fixed.
Aww crud, it does indeed show up. It's not public, it has no business showing up. What is rustdoc thinking?
Actually, the `LocalData` trait isn't even used anymore; it should just be removed.
let ClearKey(ref key) = *self;
key.clear();
}
}
What's going on here?
Helper type to expunge no-longer-used keys from the map, so that the tests and benchmarks don't interfere with each other (each test and benchmark uses brand-new keys, which, as explained earlier, won't get removed from the map, and this affects the benchmark measurements). I'll add a comment.
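A minimal sketch of such a Drop-based clearing guard (the PR's `ClearKey` wraps an actual TLD key; here a plain `RefCell` stands in for the stored slot, so the types are invented for illustration):

```rust
use std::cell::RefCell;

// A guard that clears a slot when dropped, so each test or benchmark
// scope leaves no residue behind.
struct ClearKey<'a>(&'a RefCell<Option<i32>>);

impl<'a> Drop for ClearKey<'a> {
    fn drop(&mut self) {
        *self.0.borrow_mut() = None; // expunge the entry on scope exit
    }
}

fn main() {
    let slot = RefCell::new(Some(7));
    {
        let _clear = ClearKey(&slot);
        assert_eq!(*slot.borrow(), Some(7)); // still set inside the scope
    } // guard dropped here
    assert_eq!(*slot.borrow(), None); // cleared afterwards
}
```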
Tests are run in their own tasks, so I thought they wouldn't affect other tasks or benchmarks?
Hrm, actually, that's a good point. Benchmarks definitely need to clear the keys (because they're run multiple times) but regular tests do get their own task. I recall measuring a difference in the benchmarks when I added key-clearing to these tests, but it seems like that shouldn't happen. I'll test it again.
Tested again, looks like you're right, the keys from the tests aren't impacting the benchmarks. Which is good, because that's how it should be.
Updated. I believe I've addressed all of the feedback. r? @alexcrichton
Some(newValue) => {
    map.insert(keyval, TLDValue::new(newValue));
}
None => {}
}
Could the `(None, None)` case have a `return None`, the `(None, Some(data))` case return `data` from the `match`, and then the `(Some(slot), data)` case start with `return match (refcount, data) { ... }`?
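A toy rendering of that suggested control flow, with simplified stand-in types (the real code matches on a refcount slot and the incoming data; here both are plain `Option`s, and `replace_toy` is invented for illustration):

```rust
// Early returns handle the "no existing entry" cases before the main
// match on the occupied slot.
fn replace_toy(existing: Option<i32>, data: Option<i32>) -> Option<i32> {
    match (existing, data) {
        (None, None) => return None, // nothing stored, nothing to store
        (None, Some(data)) => {
            // fresh insert; there is no old value to hand back
            let _inserted = data;
            None
        }
        (Some(slot), data) => return match (slot, data) {
            (old, Some(_new)) => Some(old), // overwrite, return the old value
            (old, None) => Some(old),       // remove, return the old value
        },
    }
}

fn main() {
    assert_eq!(replace_toy(None, None), None);
    assert_eq!(replace_toy(None, Some(1)), None);
    assert_eq!(replace_toy(Some(2), Some(3)), Some(2));
    assert_eq!(replace_toy(Some(2), None), Some(2));
}
```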
Just a few minor comments, but otherwise looks good to me! r=me
@kballard You ditched
@forticulous I submitted that 1.5 days ago as #16101.
@kballard Excellent, I was hoping that didn't get lost.
Comments addressed. Thanks for the review, @alexcrichton
Just realized this is no longer a breaking change.
This was motivated by a desire to remove allocation in the common pattern of

```rust
let old = key.replace(None);
do_something();
key.replace(old);
```

This also switched the map representation from a Vec to a TreeMap. A Vec may be reasonable if there's only a couple TLD keys, but a TreeMap provides better behavior as the number of keys increases.

Like the Vec, this TreeMap implementation does not shrink the container when a value is removed. Unlike Vec, this TreeMap implementation cannot reuse an empty node for a different key. Therefore any key that has been inserted into the TLD at least once will continue to take up space in the map until the task ends. The expectation is that the majority of keys inserted into TLD will have a value for most of the rest of the task's lifetime. If this assumption is wrong, there are two reasonable ways to fix this that could be implemented in the future:

1. Provide an API call to either remove a specific key from the TLD and destruct its node (e.g. `remove()`), or instead to explicitly clean up all currently-empty nodes in the map (e.g. `compact()`). This is simple, but requires the user to explicitly call it.
2. Keep track of the number of empty nodes in the map and, when the map is mutated (via `replace()`), compact it automatically if the number of empty nodes passes some threshold. Alternatively, whenever a new key is inserted that hasn't been used before, compact the map at that point.

---

Benchmarks: I ran 3 benchmarks. tld_replace_none just replaces the tld key with None repeatedly. tld_replace_some replaces it with Some repeatedly. And tld_replace_none_some simulates the common behavior of replacing with None, then replacing with the previous value again (which was a Some).

Old implementation:

    test tld_replace_none      ... bench: 20 ns/iter (+/- 0)
    test tld_replace_none_some ... bench: 77 ns/iter (+/- 4)
    test tld_replace_some      ... bench: 57 ns/iter (+/- 2)

New implementation:

    test tld_replace_none      ... bench: 11 ns/iter (+/- 0)
    test tld_replace_none_some ... bench: 23 ns/iter (+/- 0)
    test tld_replace_some      ... bench: 12 ns/iter (+/- 0)
Errors can be printed with `{}`; printing with `{:?}` does not work very well. Not actually related to this PR, but it came up when running the tests, and now is as good a time to fix it as any.