Type aware allocators #91
Comments
I can certainly see the benefit of different allocators for different types. But monomorphizing every allocation site seems quite expensive. With the allocator API you can pass different allocators to different allocating functions, so there would be no need to make it generic, at least not that I can see. |
I agree that it seems to be a potentially expensive compile time cost, and should be kept in mind when exploring this direction. My intuition is that a similar cost is already paid by
While it's true that programmers can manage multiple allocators, I do not think that is the best way to solve the above class of problems. It is not possible to enforce for large groups of programmers over a long period of time. It is not easy to do backwards compatibly if there is a new security/performance technology. There needs to be a mechanism for a type to choose allowable allocators for its containers, or allocators need to be aware of the types they consume. The latter is better because the cost is put on the allocator writer rather than the allocator user. |
Touché. Those are all good points. I am convinced so long as the generic overhead is not too high. |
I think this would prevent The C++ part of my brain says that the way to meet the desire here (assuming for now it's worth doing) is to specialize the choice of allocator, not specialize the allocator. So, for example, it could be |
There has been work with single type "slab allocators" of this sort before. CC @joshlf |
We must still keep the contract of "allocate and deallocate with the same size and alignment" for backwards compatibility, so no. Idea 3 above should not be enforced on existing types. I think a fast
I'm not sure if this will survive a transformation into/from raw parts unless the programmer carries
@Ericson2314 thanks that's a way better performance reason than the one I came up with. I added that as the fifth on the list above. Looks like the prior art APIs include |
I found this earlier issue that discussed a variant of this idea: #27 @Avi-D-coder and @gnzlbg seemed like they gave it some thought. @gnzlbg, I'm curious what you think of the use cases laid out here. |
Actually parameterizing with |
C++ has typed Allocators and historically, they were a mistake. They bring more complexity than they save in terms of safety. Rust's decision to just take a Layout and return a |
@LeonineKing1199 Can you elaborate on what you think these mistakes and complexities are? |
well it would probably break the |
@Lokathor, backwards compatibility is required so of course the contract should be, 'deallocation with a different type than allocated must not fail if the size and alignment match'. I think I alluded to this in point 3 originally and also in response to @scottmcm. I edited the top comment to reiterate the point explicitly and at the top. Let me know if that is clearer. |
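To make the contract concrete, here is a minimal sketch on today's stable `std::alloc` API. It shows two distinct types that share a layout, and memory allocated for one being freed with the layout of the other, which is exactly what a type-aware allocator would still have to accept. The helpers `same_layout` and `alloc_as_t_free_as_u` are hypothetical names made up for illustration.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::any::TypeId;

/// Hypothetical helper: true when `T` and `U` are interchangeable as far as
/// today's size-and-alignment deallocation contract is concerned.
fn same_layout<T, U>() -> bool {
    Layout::new::<T>() == Layout::new::<U>()
}

/// Allocates room for a `T`, then frees it using the layout of `U`.
/// Returns false (without allocating) if the layouts differ or are zero-sized.
fn alloc_as_t_free_as_u<T, U>() -> bool {
    if !same_layout::<T, U>() || Layout::new::<T>().size() == 0 {
        return false;
    }
    unsafe {
        let ptr = alloc(Layout::new::<T>());
        if ptr.is_null() {
            return false;
        }
        // Permitted today: the layout matches even though the type does not.
        dealloc(ptr, Layout::new::<U>());
    }
    true
}

fn main() {
    // Distinct types ...
    assert_ne!(TypeId::of::<u32>(), TypeId::of::<f32>());
    // ... with the same layout, so this must keep working.
    assert!(alloc_as_t_free_as_u::<u32, f32>());
    assert!(!same_layout::<u32, u64>());
}
```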
Is there an example of a type aware allocator where something other than size and alignment would be relevant for deallocation?
|
One that uses the typeid as a key to figure out an allocation pool. |
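A rough sketch of that idea, assuming a simple lookup table from `TypeId` to a pool. `Pool`, `PoolTable`, `ApiKey`, `Node`, and `example_table` are all hypothetical names; a real allocator would hold actual memory pools rather than an enum tag.

```rust
use std::any::TypeId;
use std::collections::HashMap;

/// Hypothetical pool identifiers; a real allocator would hold actual pools.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Pool {
    Default,
    Secret,
    Slab,
}

/// Maps registered types to pools; unregistered types use the default pool.
struct PoolTable {
    by_type: HashMap<TypeId, Pool>,
}

impl PoolTable {
    fn new() -> Self {
        Self { by_type: HashMap::new() }
    }

    fn register<T: 'static>(&mut self, pool: Pool) {
        self.by_type.insert(TypeId::of::<T>(), pool);
    }

    fn pool_for(&self, id: TypeId) -> Pool {
        self.by_type.get(&id).copied().unwrap_or(Pool::Default)
    }
}

// Hypothetical application types used only to populate the table.
#[allow(dead_code)]
struct ApiKey([u8; 32]);
#[allow(dead_code)]
struct Node {
    next: usize,
    value: u64,
}

fn example_table() -> PoolTable {
    let mut table = PoolTable::new();
    table.register::<ApiKey>(Pool::Secret);
    table.register::<Node>(Pool::Slab);
    table
}

fn main() {
    let table = example_table();
    assert_eq!(table.pool_for(TypeId::of::<ApiKey>()), Pool::Secret);
    // A same-layout but different type maps to a different pool, which is
    // why the type hint matters at deallocation time.
    assert_eq!(table.pool_for(TypeId::of::<[u8; 32]>()), Pool::Default);
}
```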
Thanks for the example!
|
C++ has "type-aware Allocators". In general, this is a mistake because it introduces the need for a To do this, the Allocator must be "rebind-able" from In general, a rebinding mechanism only adds extra complexity when in Rust, there's already a simple API for creating the correct layouts, i.e. C++ had to support type-aware Allocators because stdlib containers would default-construct elements, so even element construction became a part of the Allocator API. Rust's stdlib seems to deliberately avoid this issue. From my experience, an Allocator that simply returns a pointer to an array of bytes and deallocates an array of bytes using user-provided layouts is the simplest and most straightforward foundation on which higher-level constructs can be built. The brutal simplicity of Rust's Allocators is what makes them so attractive to me. |
@LeonineKing1199 so I think the |
Can I see some pseudocode? |
impl Layout {
    fn new<T: 'static>() -> Self {
        Self {
            size: mem::size_of::<T>(),
            align: mem::align_of::<T>(),
            // Really this just has to be extensible metadata, not necessarily TypeId.
            typeid: TypeId::of::<T>(),
        }
    }
}

unsafe impl Allocator for MyAllocator {
    fn allocate(&self, layout: Layout) -> Result<NonNull<[u8]>, AllocError> {
        // Choose a suballocator based on the type information.
        match self.choose_suballocator(layout.typeid) {
            SlabAllocator => self.slab_allocator.allocate(layout),
            LikesToRealloc => self.realloc_optimized_allocator.allocate(layout),
            ExtraSecret => self.secret_allocator.allocate(layout),
            Normal => self.default_allocator.allocate(layout),
        }
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        match self.choose_suballocator(layout.typeid) {
            // Fast path: try to deallocate with the same suballocator that
            // allocation would have chosen given the type hint.
        }
        // If that does not work, try deallocating with every suballocator.
        // If ptr was not allocated by any of them...
        std::process::abort()
    }
}
|
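For anyone who wants to experiment, the pseudocode can be approximated on stable Rust. This is only a sketch under several assumptions: `TypedLayout` is a hypothetical wrapper (std's `Layout` carries no type metadata), `Pool` is a toy suballocator that forwards to the global allocator and merely tracks ownership, and `deallocate` returns `false` where the pseudocode would abort.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::any::TypeId;
use std::cell::RefCell;
use std::collections::HashSet;
use std::ptr::NonNull;

/// Hypothetical layout extended with a type hint (not std's `Layout`).
#[derive(Clone, Copy)]
struct TypedLayout {
    layout: Layout,
    type_id: TypeId,
}

impl TypedLayout {
    fn new<T: 'static>() -> Self {
        Self { layout: Layout::new::<T>(), type_id: TypeId::of::<T>() }
    }
}

/// Toy suballocator: forwards to the global allocator and remembers what it owns.
#[derive(Default)]
struct Pool {
    live: RefCell<HashSet<usize>>,
}

impl Pool {
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>> {
        assert!(layout.size() > 0, "sketch does not handle zero-sized types");
        let ptr = NonNull::new(unsafe { alloc(layout) })?;
        self.live.borrow_mut().insert(ptr.as_ptr() as usize);
        Some(ptr)
    }

    /// Frees `ptr` if this pool owns it; returns whether it did.
    fn try_deallocate(&self, ptr: NonNull<u8>, layout: Layout) -> bool {
        let owned = self.live.borrow_mut().remove(&(ptr.as_ptr() as usize));
        if owned {
            unsafe { dealloc(ptr.as_ptr(), layout) };
        }
        owned
    }
}

/// Hypothetical type that should live in the isolated pool.
#[allow(dead_code)]
struct Secret([u8; 16]);

#[derive(Default)]
struct MyAllocator {
    secret: Pool,
    default: Pool,
}

impl MyAllocator {
    /// Returns the hinted pool first, followed by the fallback pool.
    fn pools_for(&self, id: TypeId) -> [&Pool; 2] {
        if id == TypeId::of::<Secret>() {
            [&self.secret, &self.default]
        } else {
            [&self.default, &self.secret]
        }
    }

    fn allocate(&self, l: TypedLayout) -> Option<NonNull<u8>> {
        self.pools_for(l.type_id)[0].allocate(l.layout)
    }

    /// Fast path: the hinted pool. Slow path: every other pool.
    /// Returns false if no pool owns `ptr` (the pseudocode aborts instead).
    fn deallocate(&self, ptr: NonNull<u8>, l: TypedLayout) -> bool {
        self.pools_for(l.type_id)
            .iter()
            .any(|pool| pool.try_deallocate(ptr, l.layout))
    }
}

fn main() {
    let a = MyAllocator::default();
    let p = a.allocate(TypedLayout::new::<Secret>()).unwrap();
    assert_eq!(a.secret.live.borrow().len(), 1);
    // Deallocating with a different type of the same size and alignment
    // misses the fast path but still succeeds via the fallback.
    assert!(a.deallocate(p, TypedLayout::new::<[u8; 16]>()));
    assert!(a.secret.live.borrow().is_empty());
}
```

The fallback loop is what keeps the "same size and alignment" contract intact: a mismatched type hint costs a slower lookup, not a failure.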
Ah, this looks like you're after a more generic way of attaching metadata to a layout kind of struct. That's an interesting idea. I feel like it should theoretically live at a higher abstraction level, no? As it is now, Allocators could be a low-level trait that really just handles allocate and deallocate calls, with something higher up the abstraction stack deciding which allocator should be used based on pertinent metadata. |
@CasperN big and bold at the top helps, yes :3 |
Potentially: The reason I think it's a good idea to focus on |
This is, basically, wanna-be specialization. That's why I mentioned specialization in the default type parameter as a way to address
TBH, I also think that the allocator itself needing to know about your business logic classes is really weird from a layering perspective. Whereas a struct adding a specialization for "hey, I suggest using this allocator with me" seems more reasonable. Though it may cause a bunch of semver hazards from adding them. (Although I'm still skeptical of somehow specializing an allocator for something like u8 to make |
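The "type suggests its allocator" direction can be sketched without specialization by using an opt-in trait with an associated type. Everything here is hypothetical (`PoolMarker`, `SuggestedAlloc`, `GlobalPool`, `SecretPool`, `Password`, `pool_name`); the marker structs stand in for real allocator types. Note that without specialization every element type has to implement the trait explicitly, which is the limitation the specialization remark is about.

```rust
/// Hypothetical marker for an allocator choice; stands in for a real allocator type.
trait PoolMarker {
    const NAME: &'static str;
}

#[allow(dead_code)]
struct GlobalPool;
#[allow(dead_code)]
struct SecretPool;

impl PoolMarker for GlobalPool {
    const NAME: &'static str = "global";
}
impl PoolMarker for SecretPool {
    const NAME: &'static str = "secret";
}

/// Hypothetical opt-in trait: "I suggest using this allocator with me".
trait SuggestedAlloc {
    type Alloc: PoolMarker;
}

/// Hypothetical business type that asks for isolated storage.
#[allow(dead_code)]
struct Password(String);

impl SuggestedAlloc for Password {
    type Alloc = SecretPool;
}

// Without specialization there is no blanket default, so ordinary types
// have to opt in one by one.
impl SuggestedAlloc for u8 {
    type Alloc = GlobalPool;
}

/// What a container could consult when picking its default allocator.
fn pool_name<T: SuggestedAlloc>() -> &'static str {
    <T::Alloc as PoolMarker>::NAME
}

fn main() {
    assert_eq!(pool_name::<Password>(), "secret");
    assert_eq!(pool_name::<u8>(), "global");
}
```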
Yeah, this is an interesting design question and I think the decision ultimately comes down to scale. In a small codebase/team I agree it's reasonable and better to have types choose their allocator everywhere. Especially for readability: this strategy does not have the same spooky action at a distance. For what it's worth, this type-aware-allocators proposal does not rule out the types-choose-allocators approach. It just enables more options. In a large codebase/team I think types-choose-allocators does not scale too well. PartitionAlloc isolates strings and some other types because they're prone to exploitation. You'd need tooling and training to ban the native type in favor of
I agree, but I don't think security and performance features are "business logic" per se. Ideally business logic can be separated, and this is actually easier if the business logic types don't mention the allocator. Granted, the allocator does become business logic for some teams when the performance requirements get steep enough. Those teams should engineer their allocation patterns carefully. However, many (most?) teams don't reach that point and will use default types and allocators (or even slow languages) because developer velocity matters more. But one allocator team might care that hundreds of teams could be using their allocator better. Their work will be easier and less intrusive with type-aware-allocators. So yeah, @scottmcm, I think I see your point and agree that types-choosing-allocators is clearer for small projects; however, I think type-aware-allocators is more manageable at scale. |
I'm still puzzled by this one. |
Okay, so I went and read their source: https://chromium.googlesource.com/chromium/blink/+/refs/heads/main/Source/wtf The TL;DR is that they define their own data structures like
Yeah, I agree that would be the case, so type-ids won't provide that fine-grained metadata in these cases. That is sufficient if you want to achieve something like PartitionAlloc's BufferPartition since they isolate vectors and maps with strings... but it's not ideal. |
For these sorts of things I suspect you want to customize the allocator based on the type doing the allocation rather than the type being allocated, e.g. the allocator should know that it's allocating from a There are reasons this is a pain, though, and I don't know exactly what trying to express this in a Rust API looks like. My hunch is solving this problem perfectly might break object safety, or at least, the ability to use runtime polymorphism with the allocator trait. (Admittedly this is fairly poorly supported now, so maybe it's not worth worrying about here)
I think this might be neither here nor there; it would probably happen regardless even if C++'s
|
I don't know if the ship has sailed on the Allocator trait's API yet, but I think it is a good idea to have an allocate/deallocate variant that is parameterized by T. I'll just call them "Type aware allocators". There are a few security opportunities w.r.t. heap exploits and maybe performance ones too.
Implementations must be backwards compatible, which means if a pointer is allocated with T, deallocating with U where T and U have the same size and alignment must succeed, unless users opt into other behavior.
Benefits:
1. Allocations that contain secret material should be stored on separate pages. With ASLR and guard pages, it's difficult for an attacker with a heap overflow to reliably find and exploit this material. Having T in the API lets libraries hint to a shared allocator to get this separate treatment. Currently, libraries containing secrets would have to manage their own allocator.
2. Similarly, an attacker with a heap overflow may want to exploit function pointers. One mitigation employed by the Scudo hardened allocator involves "quarantining" freed pointers and freeing them at a later point. This makes heap attacks less reliable too but kills memory locality: imagine you're allocating and deallocating a string in a tight loop; tcmalloc and jemalloc will give you the same (hot) memory each time, while quarantining forces you into something colder. So for performance reasons, this security feature is often disabled. In some sense, quarantining all pointers is overkill. Function pointers are more dangerous than, e.g., a vector of ints.
3. It might be a red flag if a program allocates a pointer and deallocates it with a different type. This probably happens a lot already in existing code, but for new opt-in types in completely safe code, this may point to malicious behavior and the allocator should terminate the program.
4. For performance reasons, it might be a good idea to put long-lived shared data on different pages than hot, thread-local data. This reason is a little iffy since a performance-sensitive user should probably go with a locally scoped allocator to keep its memory nearby.
5. More performance reasons: Slab allocators
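One possible shape for the proposed variant, as a sketch only: a hypothetical `TypeAwareAlloc` trait (not the real `Allocator` trait) whose typed methods default to the layout-only ones, so existing implementations keep working and only allocators that care about `T` override them. `Plain`, `Hardened`, and `Key` are made-up names, and `Hardened` merely counts secret allocations where a real allocator would route them to isolated pages.

```rust
use std::alloc::{alloc, dealloc, Layout};
use std::any::TypeId;
use std::cell::Cell;
use std::ptr::NonNull;

/// Hypothetical trait: typed methods forward to the layout-only ones by default.
trait TypeAwareAlloc {
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>>;
    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout);

    fn allocate_typed<T: 'static>(&self) -> Option<NonNull<u8>> {
        self.allocate(Layout::new::<T>())
    }

    unsafe fn deallocate_typed<T: 'static>(&self, ptr: NonNull<u8>) {
        unsafe { self.deallocate(ptr, Layout::new::<T>()) }
    }
}

/// Existing-style allocator: implements only the layout-based methods.
struct Plain;

impl TypeAwareAlloc for Plain {
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>> {
        assert!(layout.size() > 0, "sketch does not handle zero-sized types");
        NonNull::new(unsafe { alloc(layout) })
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { dealloc(ptr.as_ptr(), layout) }
    }
}

/// Hypothetical secret-bearing type.
#[allow(dead_code)]
struct Key([u8; 32]);

/// Type-aware allocator: notices allocations of `Key`.
#[derive(Default)]
struct Hardened {
    secret_allocs: Cell<usize>,
}

impl TypeAwareAlloc for Hardened {
    fn allocate(&self, layout: Layout) -> Option<NonNull<u8>> {
        Plain.allocate(layout)
    }

    unsafe fn deallocate(&self, ptr: NonNull<u8>, layout: Layout) {
        unsafe { Plain.deallocate(ptr, layout) }
    }

    fn allocate_typed<T: 'static>(&self) -> Option<NonNull<u8>> {
        if TypeId::of::<T>() == TypeId::of::<Key>() {
            // A real allocator would route this to isolated pages.
            self.secret_allocs.set(self.secret_allocs.get() + 1);
        }
        self.allocate(Layout::new::<T>())
    }
}

fn main() {
    let h = Hardened::default();
    let p = h.allocate_typed::<Key>().unwrap();
    // Freeing with a different type of the same size and alignment still works.
    unsafe { h.deallocate_typed::<[u8; 32]>(p) };
    assert_eq!(h.secret_allocs.get(), 1);
}
```

One design cost worth noting: generic methods like `allocate_typed<T>` cannot be called through a `dyn` trait object, so a design along these lines would need `where Self: Sized` bounds or a non-generic metadata parameter to stay usable with runtime polymorphism.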