-
Notifications
You must be signed in to change notification settings - Fork 12.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Overhaul of the AllocRef
trait to match allocator-wg's latest consens
#69889
Conversation
AllocRef
trait to match allocator-wg's last consensAllocRef
trait to match allocator-wg's latest consens
Continuing the discussion from #69824:
If we decide to make |
Yes, that's true, but that's basically what we have seen in rust-lang/wg-allocators#38 (comment). While you have evaluated the measurments, you basically asked the same question:
You made the following conclusion:
@Wodann added to the discussion, that
We can conclude, that the current change should not have any performance disadvantages now but simplifies the API of If we should really manage to implement Edit: Please also note that growing/shrinking is usually in the cold path. For performance you will either alloc a memory block, which is large enough, beforehand, or you use a fast allocator like |
I could also imagine, that we could eventually add parameters to |
@bors try @rust-timer queue |
Awaiting bors try build completion |
⌛ Trying commit d4bcad283d39b6a821d68cd10f440b88d207eb81 with merge 34dcc17a44e3eb0718a3877c453516843a8abb13... |
☀️ Try build successful - checks-azure |
Queued 34dcc17a44e3eb0718a3877c453516843a8abb13 with parent 1581278, future comparison URL. |
Finished benchmarking try commit 34dcc17a44e3eb0718a3877c453516843a8abb13, comparison URL. |
Regarding the benchmark results, it makes sense to me that compilation would be slower as a result of these changes. We introduced an additional code path in I am however not sure whether that change would account for the total difference in time. Only more invasive benchmarking would be able to tell.. |
I have also expected a slight regression in compile times, but I did expected more regression in opt-builds than in debug builds. |
Quick caution that I just realized:
I still strongly believe that a zero-sized-allocation-aware |
I don't see where using Box::from_raw like that is actually guaranteed to be sound. Could you quote the spot or walk me through the steps? |
This is working, as |
Ok, I can't actually find wording to back myself up here. I do know I've seen code somewhere that assumed that's how
to be very explicit that only pointers allocated via the global allocator are ok, and not dangling pointers, even though that's how the current implementation works. In fact, given the wording of this (having just gone over it again), I'd argue
So ( |
I don't think there is anything to add there. It's still working for any
Everything listed still holds: |
I'm happy with the resolution of my comments; thanks! The implementation looks good overall to me.
Whatever the eventual guarantees are, I think we're safe for now if for no other reason than the fact that That said, this could be a problem given eventual unification of |
Thanks! I didn't expect that few code change requests with so many LoC changed. |
Yes, I agree that the allocation behavior is not changed by this PR. I just want to make sure the hazzard of eventually allowing the global allocator to be specified in terms of |
No,
The first area is copied (non_overlapping), the third will be zeroed, but it's not clear what happens to the second region. Zeroing it may be useful, but not necessarily faster. |
|
Yes, but |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
Retriggered CI (https://status.crates.io/incidents/ty016p23y1v1):
|
eb29baf
to
51eaa81
Compare
Squashed and rebased onto master. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Finished reviewing the RawVec
changes. I am pretty happy with the reorganization around grow
and shrink
, but I would prefer if the special casing for ZSTs was kept. Since size_of::<T>()
is a compile-time constant, this special casing has no runtime overhead.
/// `Box<[T]>`, since `capacity()` won't yield the length. However, `with_capacity`, | ||
/// `shrink_to_fit`, and `from_box` will actually set `RawVec`'s private capacity | ||
/// field. This allows zero-sized types to not be special-cased by consumers of | ||
/// this type. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am hesistant to change the guarantees provided by the RawVec
type. I would prefer if we only changed the internals without affecting the API guarantees.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually this still holds, the internal capacity field are set with with_capacity
, shrink_to_fit
, and from_box
. I'll re-add this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// * Uses the excess returned from the allocator to use the largest available capacity.
/// ...
/// Note that the excess of a zero-sized types is always infinite, so `capacity()` always returns
/// `usize::MAX`. This means that you need to be careful when round-tripping this type with a
/// `Box<[T]>`, since `capacity()` won't yield the length. However, `with_capacity`,
/// `shrink_to_fit`, and `from_box` will actually set `RawVec`'s private capacity field. This allows
/// zero-sized types to not be special-cased by consumers of this type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally, into_box
should return Box<[MaybeUninit<T>]>
instead of Box<T>
.
I am also concerned about the performance impact of this change: link We should re-run the perf test after the changes, but I would like any performance issues to be resolved before this is merged. |
I have a few things in mind how we could reduce the performance regression. I will give updates as soon as I tested them out. This may need another restructuring of |
Have you made measurements to see exactly what is causing the regressions? |
I compared this branch with the current nightly. The performance of allocators are only affected, if the layout isn't statically created. Passing a layout constructed from Regarding the compile-time regression: on average, we have a performance regression of 1.15% (mean value of all listed average tests), which is IMO accaptable. This is caused by the slightly more complicated logic. In debug builds, this is probably caused by more code to be generated and in opt-builds by more things to be optimized. I suggest, that I'll fix your listed change request, then we merge this. Further, I'll try to optimize |
fd13b29
to
0e2ee30
Compare
The job Click to expand the log.
I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact |
The current implementation of |
Github don't let me reopen this. Follow up: #70362 |
You need to push the same commit as when you closed the PR for github to allow reopening a PR. |
Overhaul of the `AllocRef` trait to match allocator-wg's latest consens; Take 2 GitHub won't let me reopen rust-lang#69889 so I make a new PR. In addition to rust-lang#69889 this fixes the unsoundness of `RawVec::into_box` when using allocators supporting overallocating. Also it uses `MemoryBlock` in `AllocRef` to unify `_in_place` methods by passing `&mut MemoryBlock`. Additionally, `RawVec` now checks for `size_of::<T>()` again and ignore every ZST. The internal capacity of `RawVec` isn't used by ZSTs anymore, as `into_box` now requires a length to be specified. r? @Amanieu fixes rust-lang/wg-allocators#38 fixes rust-lang/wg-allocators#41 fixes rust-lang/wg-allocators#44 fixes rust-lang/wg-allocators#51
This is a major overhaul of the
AllocRef
trait. This PR is split into three parts:AllocRef
Box
andRawVec
core::alloc
into multiple, private submodulesThe new
AllocRef
trait:AllocRef
was changed to only have four methods:alloc
/dealloc
as required methods andgrow
/shrink
as provided methods. Growing and shrinking are mostly combined intorealloc
(like in C) but are actually 2 very different operations.alloc
,grow
,shrink
) now return a pointer and the actual allocation size like in the previously removed_excess
-API to support "overallocating".Click here for a code outline
(slightly stripped)
Guide for reviewing
There are four commits in this PR:
AllocRef
to only usealloc
,dealloc
,grow
, andshrink
(split view) introduces the rewrittenAllocRef
trait and only makes small adjustments to letstd
compile.Box
to use ZST awareAllocRef
is only a pretty small change to adjustBox
to use all features ofAllocRef
RawVec
to use ZST awareAllocRef
(split view) is roughly the same as UseAllocRef
for ZSTs inRawVec
#69824.core::alloc
module splitscore::alloc
private in submodules but in comparison to Splitcore::alloc
module #69863 it renamescore/alloc.rs
tocore/alloc/layout.rs
so other PRs modifyingLayout
should not be affected.Any further commits will probably fixes and small adjustments only. I will only force-push to rebase to keep things clean.
_zeroed
variants and pass a flag instead wg-allocators#44AllocRef
for ZSTs inRawVec
#69824core::alloc
module #69863r? @Amanieu
cc @Lokathor @Wodann