heap-use-after-free in ExampleCluster #174
Comments
This seems to be a more fundamental problem that is connected with clearing collections. A minimal reproducer is:
#include "datamodel/ExampleClusterCollection.h"
int main() {
  auto clusters = ExampleClusterCollection();
  auto cluster = clusters.create();
  clusters.clear(); // remove this and everything works
}
This is a very fundamental problem that, as far as I can tell, happens every time we
The underlying problem is that the call to
ExampleCluster::~ExampleCluster() {
  if (m_obj) m_obj->release();
}
We would usually get a double-free here, but since
Lines 22 to 30 in 6b18dfc
we get a use-after-free. The reason why we are not running into a double-free more often is because first Since the
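To make the failure mode concrete, here is a toy model of the reproducer above. It is only a sketch: `ToyObj`, `ToyCollection`, `ToyHandle`, the `live()` registry and `count_stale_releases` are made-up names, not podio classes, and the ref-counting is heavily simplified. Instead of actually touching freed memory (which would be undefined behaviour), the handle destructor only counts the release calls that would hit an already deleted object.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Addresses of all ToyObj instances that are currently alive.
// Hypothetical stand-in for the bookkeeping a tool like AddressSanitizer does.
std::set<const void*>& live() {
  static std::set<const void*> registry;
  return registry;
}

// Toy stand-in for an Obj with an intrusive reference count.
struct ToyObj {
  int ref_count = 0;
  ToyObj() { live().insert(this); }
  ~ToyObj() { live().erase(this); }
};

// Toy collection: owns its objects and deletes them on clear(),
// no matter how many handles still point at them.
struct ToyCollection {
  std::vector<ToyObj*> entries;
  ToyObj* create() {
    entries.push_back(new ToyObj());
    return entries.back();
  }
  void clear() {
    for (ToyObj* obj : entries) delete obj;
    entries.clear();
  }
  ~ToyCollection() { clear(); }
};

// Toy user-facing handle: acquires on construction, releases on destruction.
struct ToyHandle {
  ToyHandle(ToyObj* o, int& stale) : obj(o), stale_count(&stale) {
    if (obj) ++obj->ref_count;
  }
  ToyHandle(const ToyHandle&) = delete;
  ToyHandle& operator=(const ToyHandle&) = delete;
  ~ToyHandle() {
    if (!obj) return;
    if (live().count(obj) == 0) {
      // A real handle would call obj->release() here: heap-use-after-free.
      ++*stale_count;
      return;
    }
    --obj->ref_count;
  }
  ToyObj* obj;
  int* stale_count;
};

// Mirrors the reproducer: create a handle, optionally clear the collection
// while the handle is still alive, and report how many releases were stale.
int count_stale_releases(bool clear_before_handle_dies) {
  int stale = 0;
  ToyCollection clusters;
  {
    ToyHandle cluster(clusters.create(), stale);
    if (clear_before_handle_dies) clusters.clear();
  }  // handle destructor runs here
  return stale;
}
```

In this model `count_stale_releases(true)` reports one stale release (the handle outlives the object it points to), while `count_stale_releases(false)` reports none, matching the observation that removing the `clear()` call makes the problem disappear.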
Coming back to this once more. I think this is even more fundamental than I previously thought. Up until now I was under the impression that this would not be a problem in our "usual workflows" because the objects will (almost) never be created in the same scope in which the collections are cleared. However, the issue goes deeper than that. E.g. the following minimal showcase also has the same heap-use-after-free problem:
#include "datamodel/ExampleWithOneRelationCollection.h"
#include "datamodel/ExampleClusterCollection.h"
int main() {
  auto clusters = ExampleClusterCollection();
  auto oneRelations = ExampleWithOneRelationCollection();
  // introduce a new scope to get around the originally described problem
  {
    auto cluster = clusters.create();
    auto rel = oneRelations.create();
    rel.cluster(cluster);
  }
  clusters.clear(); // clear this second and everything works
  oneRelations.clear();
}
In this case
~ExampleWithOneRelationObj() {
  if (m_cluster) delete m_cluster;
}
Hence, it will call the destructor of
OK. I think we have to go back to the drawing board here and think through all possible corner cases. One possible conclusion is that the ref-counting part and the data itself may need to be separated more clearly, which could make things incredibly complicated.
Following up on our discussion yesterday: could the whole issue be resolved by introducing a new type of object that is not yet managed (owned) by a collection?
auto clucol = event.get<ClusterCollection>("clusters");
for (auto i : range) {
  auto clu = clucol.create();
  clu.setEnergy(42.);
}
// some candidate clusters:
for (auto j : range) {
  ClusterCand clu; // a cluster candidate
  clu.setEnergy(j * 4);
  if (clu.energy() > 42.) {
    clucol.emplace_back(clu);
    // ownership and memory handling moved to collection for these clusters
  }
}
// clusters not passed to collection go out of scope and are deleted
This would greatly simplify the memory handling. Am I missing something?
Another possibility could be to introduce a "semi smart pointer" that is used instead of the raw
template<typename T>
class maybe_shared_ptr {
  // c'tors, d'tors, operator->, get()
private:
  T* ptr;
  struct ControlBlock {
    std::atomic<unsigned> count{1};
    std::atomic<bool> owned{true};
  };
  ControlBlock* ctrl_block{nullptr};
};
There are three different states:
Basically, what would change with respect to the current situation is that the ref-count mainly controls the lifetime of the control block, and only conditionally that of the "managed" pointer. I am not yet entirely sure if this covers all edge cases, but I think it could. One of the drawbacks is that the user-facing classes will double in size (essentially from one pointer to two pointers). However, I think that could be acceptable if it indeed solves all our problems.
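A rough sketch of how such a `maybe_shared_ptr` could behave, to make the idea easier to discuss. Everything beyond the members shown above is an assumption: `release_ownership()`, `use_count()` and `owns()` are hypothetical names, and handing the object over by clearing the `owned` flag is just one possible reading of the proposal. The `Tracked` helper and the two demo functions only exist to count destructor calls.

```cpp
#include <atomic>
#include <cassert>
#include <utility>

template <typename T>
class maybe_shared_ptr {
public:
  maybe_shared_ptr() = default;  // empty state: no object, no control block

  // Take a freshly allocated object; the handles own it until it is handed over.
  explicit maybe_shared_ptr(T* p) : ptr(p), ctrl_block(new ControlBlock()) {}

  maybe_shared_ptr(const maybe_shared_ptr& other)
      : ptr(other.ptr), ctrl_block(other.ctrl_block) {
    if (ctrl_block) ctrl_block->count.fetch_add(1);
  }

  // Copy-and-swap: the by-value parameter releases the old state on exit.
  maybe_shared_ptr& operator=(maybe_shared_ptr other) {
    std::swap(ptr, other.ptr);
    std::swap(ctrl_block, other.ctrl_block);
    return *this;
  }

  ~maybe_shared_ptr() {
    if (!ctrl_block) return;
    if (ctrl_block->count.fetch_sub(1) == 1) {
      // Last handle: the control block always goes away,
      // the object only if nobody else took ownership of it.
      if (ctrl_block->owned.load()) delete ptr;
      delete ctrl_block;
    }
  }

  // Hypothetical: hand the object to an external owner (e.g. a collection).
  // Afterwards no handle will ever delete it.
  T* release_ownership() {
    if (ctrl_block) ctrl_block->owned.store(false);
    return ptr;
  }

  T* get() const { return ptr; }
  T* operator->() const { return ptr; }
  unsigned use_count() const { return ctrl_block ? ctrl_block->count.load() : 0; }
  bool owns() const { return ctrl_block && ctrl_block->owned.load(); }

private:
  T* ptr{nullptr};
  struct ControlBlock {
    std::atomic<unsigned> count{1};
    std::atomic<bool> owned{true};
  };
  ControlBlock* ctrl_block{nullptr};
};

// Helper that counts how often it is destroyed.
struct Tracked {
  explicit Tracked(int& counter) : destroyed(&counter) {}
  ~Tracked() { ++*destroyed; }
  int* destroyed;
};

// The "collection" deletes the object while a handle is still alive.
int destroyed_after_clear_then_handle_dtor() {
  int destroyed = 0;
  {
    maybe_shared_ptr<Tracked> handle(new Tracked(destroyed));
    Tracked* raw = handle.release_ownership();  // collection takes over
    delete raw;                                 // stands in for collection.clear()
  }  // handle dies here and must not delete the object again
  return destroyed;
}

// The object never reaches a collection; the last handle cleans it up.
int destroyed_when_never_handed_over() {
  int destroyed = 0;
  {
    maybe_shared_ptr<Tracked> a(new Tracked(destroyed));
    maybe_shared_ptr<Tracked> b = a;  // second handle, shared control block
  }
  return destroyed;
}
```

Both demo functions return 1, i.e. the object is destroyed exactly once in either ordering. Note that this sketch only makes handle destruction safe: after the external owner has deleted the object, `get()` still returns a dangling pointer, so it does not by itself answer what a handle should do when it is used after its collection was cleared.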
As of 29dfbdd there is at least one heap-use-after-free problem in our tests, and I think this also affects the generated code. It seems to have gone unnoticed until now because for some reason builds with gcc do not seem to suffer from any runtime problems. However, builds with clang do and occasionally lead to runtime problems, i.e. segmentation faults.
After instrumenting the core podio library with AddressSanitizer, running tests/write points to an attempt at releasing an already destroyed Obj again. I have attached the complete output below. I have currently named ExampleCluster in the issue title, since this is the first instance where this problem occurs, but it doesn't have to be the only one: I am not yet sure whether this problem only affects types with relations, or if this is a more general problem that affects instances that have not been created via Collection::create but instead by directly constructing them and then adding them to a collection. I also do not know why gcc builds seem to be less affected by this at runtime compared to clang builds.
address_sanitize_podio.txt