Skip to content
This repository has been archived by the owner on Sep 22, 2023. It is now read-only.

An alternative implementation with a small buffer optimisation (SBO) #124

Open
BengtGustafsson opened this issue Aug 4, 2022 · 12 comments
Assignees

Comments

@BengtGustafsson
Copy link

I made an implementation of a polymorphic_value with SBO. It has a very similar API as this incarnation but removes the possibility to take over ownership of an object already existing on the heap. This was primarily due to the observation that if this object is small enough to fit the SBO it would have to be copied anyway, but it also solved the slicing problem that could so deviously occur in the original design. Instead the contained object is always created directly inside the polymorphic_value, either using an in_place_type constructor, a special emplace method or a make_polymorphic_value function.

I also opted to remove all possibility to assign, copy or cast between polymorphic_value<T> and polymorphic_value<U> to avoid the need for linked lists of this pointer conversion functions. This allows my implementation to use placement new on a member which allows polymorphic_value to work entirely without heap allocations unless the contained object is larger than the SBO buffer size. The performance is also improved as accessing the object only requires one virtual call, no access to control block object(s) on the heap.

Removing the constructors from pre-existing heap object pointers makes polymorphic_value more similar to a regular by value object, but I still opted to have a default constructor that leaves the object empty, an operator bool() to test this condition and a reset() to destroy the contained object. This is for two reasons. One is that it is often convenient to be able to encode an empty state, for instance if the required contained type of a member is not known when the surrounding object is constructed. The other reason is that if the constructor of the contained object throws in the in_place_type constructor the polymorphic_value would still be empty even if no default constructor was provided. std::variant has a special valueless_by_exception state for this situation and suggests adding a std::monostate as the first alternative if the variant is allowed to be empty. But as monostate does not inherit the T in a polymorphic_value this is not a feasible way to create a "nullable" polymorphic_value, so it seems best to keep this aspect of pointerness.

An alternative would be to specialize std::optional for std::polymorphic_value so that a plain polymorphic_value only has valueless_by_exception but an optional<polymorphic_value<T>> has optional::operator bool() implemented without overhead using the handler mechanism in polymorphic_value (via a friend declaration). The drawback with this solution is mainly that operator*() of optional would return the polymorphic_value and another * would have to be applied to get a reference to the contained object.

A third alternative would be to call the entire thing polymorphic_optional or possibly provide both polymorphic_value and polymorphic_optional.

A fourth alternative would be to retain the specialization std::optional<std::polymorphic_value<T>> but adapt the API to become logical in the specialization of std::optional, i.e. let operator* do both steps etc.

My preference would be to just document that std::polymorphic_value has the optional-ness built in and this would basically entail changing the name of get() to value() and let it throw on empty. The monadic operations of C++23 could also easily be added. The drawback with this is that dynamic_casting when the polymorphic_value is empty gets complicated. With get() you can write dynamic_cast<U*>(v.get()) to see if v contains an instance of U while with value you'd have to check for NULL first, then call value and take the address of the reference before doing the dynamic_cast. A U* get<U>() method could be added to simplify this, while a U& value<U>() would throw if the contained value is not a U (or empty).

Here are the files for my polymorphic_value and a simple test/demo program. I don't really know how to handle this "extra code base", maybe I should clone the repo and add a subdirectory? Note that this version still has more of a smart pointer API than an optional API.

polymorphic_value.zip
.

@Twon Twon self-assigned this Aug 4, 2022
@Twon
Copy link
Collaborator

Twon commented Aug 4, 2022

Thanks for sharing your implementation experience, it's really valuable to get feedback like this. While the most recent revisions of the papers for polymorphic_value/indirect_value contain a null state our most recent exploration has focused on making the null-state non-observable and representing via std::optional as you've described in option 2 but there is still some issue to think through as you've pointed out.

I'll have a look at the implementation you've shared and need to think through some of these points before I opine further but will follow up shortly.

@jbcoe
Copy link
Owner

jbcoe commented Aug 28, 2022

We've had a very long think about this and it's forced us to make a decision about polymorphic_value supporting a small-buffer optimisation.

polymorphic_value is designed to do a strictly better job of being a component of a composite object than a unique_ptr to a base class with a (series of manually implemented) clone methods. I'm convinced we've succeeded.

Adding a small buffer optimisation potentially removes the need for memory allocation. We don't know that potentially saving an allocation by making a composite object bigger (big enough for the small buffer) is better than using unique_ptr with clone. It's possible that a small buffer would be useful and better in which case users of this class can reach for allocators (which we now support) and the pointer constructor or use another type.

There's probably space in the world for SmallPolymorphicValue<T, BufferSize> just like there's space in the world for SmallVector<T, BufferSize> but this won't be the type we try to standardize.

We're much less concerned by the slicing problem than you seem to be. std::shared_ptr can incompletely destroy objects in much the same way but it's not a big problem in the wild.

We'll update wording for both polymorphic_value and indirect_value to make it clear that a small buffer optimisation cannot be added (outside of allocators) - the address of the owned object is unchanged after move construction.

@BengtGustafsson
Copy link
Author

As I see it making the null state non-observable would require (ignoring SBO) that the pointee is moved from but not deallocated in the move operations. I guess this will bring polymorphic_value in line with variant, as there must still be a valueless by exception state that can be set by emplace and constructors.

Maybe this is dependent on whether SBO is included: If SBO is included this would be a good drop in replacement for unique_ptr, that is, even if copying is not required the SBO feature itself is a good advantage over unique_ptr in many cases. But for this to be drop in replacement for a unique_ptr it must have an empty state. However, if SBO is not included there is no advantage in replacing unique_ptr with polymorphic_value unless the copyability is important as the number of heap operations is the same anyway. Thus optional<polymorphic_value> could be a better choice assuming there is a specialization which relies on the underlying pointer's null state and some friend declaration to access it. This would however cause problems if said specialization did not change the semantics of operator* to return the contained value of the polymorphic_value, not the polymorphic_value itself (which could be in the impossible null state). Well, maybe not that bad as operator* is UB if the optional is empty anyway.

@jbcoe
Copy link
Owner

jbcoe commented Aug 28, 2022

We're looking at adding a valueless_after_move state which is much easier to get into but invalid to access (other than to destroy/assign/query).

@BengtGustafsson
Copy link
Author

BengtGustafsson commented Aug 28, 2022

On slicing problem: With shared_ptr and unique_ptr you only get partial destruction if the destructor is not virtual, which is such a bad idea that compilers warn if it isn't.

In polymorphic_value the only reason that slicing at copy does not occur is that you didn't provide a subclass object as the constructor parameter via a base class reference/pointer.

I think that common patterns of construction invite errors, for instance in a case like this:

Base* select_policy(std::string policyName);

MyClass {
public:
    MyClass(Base* policy) : m_policy(policy) {}
    std::polymorphic_value<Base> m_policy;
};

MyClass myObject(selectPolicy("fast policy"));

This slices but not until myObject is copied, which makes the error more subtle.

Making the MyClass constructor templated on the pointer type does not help as selectPolicy returns a Base pointer anyway.

The only reasonable return type of selectPolicy is actually polymorphic_value which just moves the problem into that function. By only allowing emplace and/or tagged construction we force selectPolicy to return polymorphic_value so that the MyClass member construction can copy/move it. Furthermore the implementation of select policy is forced to be something like this, i.e. something that actually works:

polymorphic_value<Base> selectPolicy(std::string_view name)
{
    polymorphic_value<Base> ret;

    if (name == "fast policy")
        ret.emplace<FastPolicy>();
    else if (name == "accurate policy")
        ret.emplace<AccuratePolicy>();
    
    return ret;
}

In this example I also made good use (in my opinion) of the null state, which is used to indicate an unknown policy string where error handling is deferred to MyClass. This can of course be reformulated as:

polymorphic_value<Base> selectPolicy(std::string_view name)
{
    if (name == "fast policy")
        return {std::in_place_type_t<FastPolicy>);
    else if (name == "accurate policy")
        return {std::in_place_type_t<AccuratePolicy>);
    else
        ???
}

This can be solved by wrapping in optional, but it gets somewhat clumsy as you need to state the polymorphic_type explicitly in each return statement. And as this is a constructor parameter to std::optional a copy can't be avoided.

std::optional<polymorphic_value<Base>> selectPolicy(std::string_view name)
{
    if (name == "fast policy")
        return polymorphic_value<Base>{std::in_place_type_t<FastPolicy>);
    else if (name == "accurate policy")
        return polymorphic_value<Base>{std::in_place_type_t<AccuratePolicy>);
    else
        return std::nullopt;
}

@jbcoe
Copy link
Owner

jbcoe commented Aug 28, 2022

On slicing: Potential slicing by (incorrect) use of the pointer constructor will be picked up at run time and an exception of type bad_polymorphic_value_construction thrown.

I'd like to better understand your concerns here as I don't share them. Have you use cases where misuse of the pointer constructor was a common issue?

I'd suggest changing your API to:


MyClass {
public:
    template <typename T, typename ...Ts>
    MyClass(Ts&& ...ts) : m_policy(std::in_place_type<T>, std::forward<Ts>(ts)) {}
    std::polymorphic_value<Base> m_policy;
};

or

public:
    MyClass(std::polymorphic_value<Base> policy)) : m_policy(std::move(policy)) {}
    std::polymorphic_value<Base> m_policy;
};

@BengtGustafsson
Copy link
Author

Regarding SBO:

It is true that we would need some data to back up this, and that it depends on the data type and the usage pattern which is best: SBO which forces real moves at moves or pointer which forces allocation on copy. My implementation has a SBO size parameter which you can set to 0 to indicate that you don't want SBO to be involved.

My main reason to argue for SBO is that it widens the use cases to include most cases of unique_ptr, which could potentially reduce allocations by quite a lot.

There may be space for another vocabulary type to handle the SBO enabled case but if it is introduced it will be able to do anything that polymorphic_value does, and future generations will wonder why both exist. I'm not particularly happy with having both thread and jthread so not adding foreseeably useful features from the onset does not feel right.

@jbcoe
Copy link
Owner

jbcoe commented Aug 28, 2022

We need to see if the current design + allocators can do all that your SBO design can.

As a perhaps irrelevant aside, Rust's mature and excellent dyn_clone has no small-buffer optimisation and pretty much does what polymorphic_value does.

@jbcoe jbcoe changed the title Add SBO and make polymorphic_value more like an std::optional than a smart pointer. An alternative implementation with a small buffer optimisation (SBO) Aug 28, 2022
@BengtGustafsson
Copy link
Author

I forgot about bad_polymorphic_value_construction throwing. This reduces my concerns but it doesn't solve my example as the exception would be thrown all the time. It also costs a dynamic_cast to check this, and if it is specified to throw this must also happen in release builds.

Your changed MyClass API doesn't help if there must for some reason be a function selectPolicy that has to return the same type regardless of the selection string. Its return type must still be polymorphic_value.

SBO is not about being able to something new but about doing it more efficiently. What I know is that the gcc gang went to great lengths to be able to introduce SBO in std::string. That story also indicates that if SBO is to be included it should be in from the start, or a totally separate class as you suggested. I'm starting to lean towards that a straw poll in LWG is the right way to resolve this, especially as you have already presented and mostly got ok on the non-SBO version.

It seems more and more a weakness of the entire standardization process that once a paper has been presented the only changes "allowed" are those called for by that meeting. Other changes are seen as "dangerous" to the furthering of the proposal towards standardization, even if they are important improvements. There is of course always the trade off between getting something "out the door" and getting the right thing out the door, and C++ has had plenty of failures in both directions (and plenty of successes).

@BengtGustafsson
Copy link
Author

BengtGustafsson commented Aug 28, 2022

Re: Assignability between different Ts. This is what causes the problems with unbounded number of intermediate control blocks when multiple inheritance is involved.

I was thinking that maybe we don't have to ban this entirely to get rid of the control blocks. Maybe it is enough to ban copying between Ts which have different this pointers. I'm not sure if this can be detected without UB at this time, but it surely should. On the other hand I don't understand the use cases for assigning between polymorphic_values with diffferent T. The main intention was to act like a member of polymorphic type, preserving copyability. In this case the T is surely the same in both parent objects as they are of the same class. What other important enough use cases do you see, that motivates extra allocations for control blocks and so on?`

Come to think of it, doesn't this also conflict with the no-null idea? What happens if a downcast fails, like a dynamic_cast<T*>(base) or were you only thinking about upcasting?

Inspecting the code I see only upcasting, and no function to do downcasting (like these: https://en.cppreference.com/w/cpp/memory/shared_ptr/pointer_cast). No corresponding operator= though, which seems to be an oversight, or is there a reason for this inconsistency?

For completeness maybe there should be static and dynamic cast functions, and for the dynamic case, one that doesn't throw on error but returns some kind of null state. Seems hard for that to be std::optional though.

Note that if you grab the pointer out of a polymorphic value<Base> and cast that to Sub* and construct a polymorphic_value<Sub> this still slices if the original pv contains a subclass of Sub so to avoid that people try this there should be a specific downcast functionality. Unfortunately this creates a situation where a loop can create a chain of control blocks which is longer than the number of inheritance levels, well, unless there is some clever logic that finds an existing control block of the right kind in the incoming chain.

All of this of course only valid concern if constructing/assigning between polymorphic_values of different Ts is allowed, which I don't think is very useful and don't allow in my implementation.

@jbcoe
Copy link
Owner

jbcoe commented Aug 28, 2022

Re: "It seems more and more a weakness of the entire standardisation process that once a paper has been presented the only changes "allowed" are those called for by that meeting. Other changes are seen as "dangerous" to the furthering of the proposal towards standardisation, even if they are important improvements. There is of course always the trade off between getting something "out the door" and getting the right thing out the door, and C++ has had plenty of failures in both directions (and plenty of successes)."

I think as the paper author I'm at liberty to make changes as needed. I'm free to add/remove bits as I like, not only as requested by committee meetings. I'm even free to ignore the advice/polls of the committee sub-groups although getting papers through to a plenary session would then be rather tricky.

Pointer-like casts were removed at the request of the Library Evolution Working group some time ago. They could be re-added at later cost.

Converting (upcasting) operator= was removed intentionally to make upcasting explicit as it's not zero-cost in my reference implementation. If implementers can settle on a technique for zero-cost upcasting then we can re-introduce a converting operator= at a later date.

@BengtGustafsson
Copy link
Author

BengtGustafsson commented Sep 19, 2022

I've improved my alternate implementation and put it in a github repo here: Github
Note that while I obviously think mine is a superior design (or I wouldn't have set out to do it) I'm fully open for discussion and compromises and of course even better solutions. Maybe I'm too scared of extra allocations for instance. That can be discussed. On the other hand I noted that now, for the first time, polymorphism based on virtual methods can be brought to applications where heap allocations are prohibited, which could open interesting improvements for such code.

I'm not entirely happy with how the implementation of optional's monadic API turned out but I would not be happy about a std::optional<std::polymorphic_value> either with its double empty tests and double dereferences, so this is a test to see which is least bad. Maybe there should be a "monadic component" which can be used to implement the API given some way to detect emptiness. This component could then be applied to a optional, polymorphic_value, expected or whatever that has a operator bool to detect emptiness and a operator* to get the contents. I have not looked into this possibility, I just reimplemented the monadic API methods, which was easy enough.

I hope I have at least provided some food for thought to be considered before completing the standardization.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants