Assertion m_private_counter >= 0 failed (located in the destroy function, line in file: 141) #639

Closed
xiamr opened this issue Nov 5, 2021 · 13 comments · Fixed by #667

@xiamr

xiamr commented Nov 5, 2021

When using a flow graph, the following assertion fails with this stderr output:

Assertion m_private_counter >= 0 failed (located in the destroy function, line in file: 141)
Detailed description: Private counter may not be less than 0

In the destroy function:
[image: 1636081508(1)]

Call Stack:
[image: call stack screenshot]

_raise 0x000014a314b6637f
abort 0x000014a314b50db5
tbb::detail::r1::assertion_failure_impl assert_impl.h:56
operator() assert_impl.h:73
tbb::detail::d0::run_initializer<tbb::detail::r1::assertion_failure(char const*, int, char const*, char const*)::<lambda()> >(const struct {...} &, std::atomic<tbb::detail::d0::do_once_state> &) _utils.h:288
tbb::detail::d0::atomic_do_once<tbb::detail::r1::assertion_failure(char const*, int, char const*, char const*)::<lambda()> >(const struct {...} &, std::atomic<tbb::detail::d0::do_once_state> &) _utils.h:277
tbb::detail::r1::assertion_failure assert_impl.h:73
tbb::detail::r1::small_object_pool_impl::destroy small_object_pool.cpp:141
tbb::detail::r1::thread_data::~thread_data thread_data.h:120
tbb::detail::r1::governor::auto_terminate governor.cpp:221
tbb::detail::r1::market::cleanup market.cpp:611
tbb::detail::r1::rml::private_worker::run private_server.cpp:283
tbb::detail::r1::rml::private_worker::thread_routine private_server.cpp:221
start_thread 0x000014a3156a815a
clone 0x000014a314c2bdd3

@xiamr
Author

xiamr commented Nov 5, 2021

The issue exists in oneTBB versions 2021.3 and 2021.4.

@alexey-katranov
Contributor

The issue seems weird, like a double free or something similar. Can you provide more details about your reproducer? Which flow graph nodes are used, and how? Is it x86?

@xiamr
Author

xiamr commented Nov 9, 2021

It is x86-64, and this is the code that has the issue:
[image: WeCom screenshot_16364373962284]

@pranasge

It seems I have this issue too, and it can be reproduced with a simple test:

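	// Test body (ASSERT_TRUE suggests GoogleTest); needs #include <tbb/flow_graph.h>.
	struct TestStruct
	{
		char bytes[153]{ 0 }; // a 153-byte payload triggers the assertion
		//char bytes[152]{ 0 }; //works
	};

	tbb::flow::graph graph;
	tbb::flow::queue_node<TestStruct> input{ graph };
	// Serial node whose body just returns the message it receives.
	tbb::flow::function_node<TestStruct> func{ graph, tbb::flow::serial, [](const auto & input) { return input; } };

	tbb::flow::make_edge(input, func);

	ASSERT_TRUE(input.try_put(TestStruct{}));

	// Block until every task spawned by the graph has finished.
	graph.wait_for_all();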

Tested with oneTBB version 2021.3.

@alexey-katranov
Contributor

Thank you for the reproducer. The graph_task is really broken because it deallocates a different type than was allocated: _flow_graph_impl.h#L362-L367:

    allocator.deallocate(this, ed);

Here this points to a graph_task, but graph_task is only a base class of the object's real type.

Notify: @aleksei-fedotov
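To make the mismatch above concrete, here is a minimal sketch of how deallocating with a base-class size can drive a pool counter negative. It is a toy model, not oneTBB's code: ToyPool, BaseTask, MessageTask, simulate and the 256-byte threshold are all hypothetical names and values chosen for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Toy model only: every name and the 256-byte threshold are assumptions,
// not oneTBB's actual small object pool. malloc failures are not handled.
class ToyPool {
public:
    static constexpr std::size_t small_limit = 256;

    void* allocate(std::size_t bytes) {
        if (bytes > small_limit) {
            return std::malloc(bytes);       // large path: never counted
        }
        if (free_list_ != nullptr) {         // reuse a cached block
            Node* node = free_list_;
            free_list_ = node->next;
            return node;
        }
        ++counter_;                          // a new cached-size block is created
        return std::malloc(small_limit);
    }

    void deallocate(void* ptr, std::size_t bytes) {
        assert(ptr != nullptr);
        if (bytes > small_limit) {
            std::free(ptr);                  // large path: freed immediately
            return;
        }
        // Small path: the block is cached for reuse, whatever its real origin.
        free_list_ = new (ptr) Node{free_list_};
    }

    // Frees cached blocks and returns the final counter; a consistent run ends at 0.
    long destroy() {
        while (free_list_ != nullptr) {
            Node* node = free_list_;
            free_list_ = node->next;
            std::free(node);
            --counter_;
        }
        return counter_;
    }

private:
    struct Node { Node* next; };
    Node* free_list_ = nullptr;
    long counter_ = 0;
};

// Hypothetical task types: the derived type is above the threshold, the base is not.
struct BaseTask { int id; };
struct MessageTask : BaseTask { char payload[300]; };

// Allocates with the derived size, then deallocates with either the real
// size or the base-class size, and returns the counter seen at destruction.
long simulate(bool deallocate_with_real_size) {
    ToyPool pool;
    void* block = pool.allocate(sizeof(MessageTask));
    const std::size_t reported =
        deallocate_with_real_size ? sizeof(MessageTask) : sizeof(BaseTask);
    pool.deallocate(block, reported);
    return pool.destroy();
}
```

In this model, simulate(true) leaves the counter at 0, while simulate(false) caches a block that was never counted, so the counter ends at -1. That is the kind of state a check like "Private counter may not be less than 0" would catch.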

@pranasge

pranasge commented Nov 18, 2021

This seems to be a blocking issue for me. Is there an easy workaround, or a version that does not have this issue?

@alexey-katranov
Contributor

We are in the process of fixing it. I am not sure about a possible workaround (some crazy variants: reduce the message size, e.g. use a shared pointer, or recompile the oneTBB runtime with the small object cache disabled).
The root cause is that we detect the object size incorrectly during destruction of a graph message. Overall, it is unlikely to introduce any issues in release builds (although there is a possible race condition in a corner case if threads are heavily recreated during graph processing).
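As a rough illustration of the "reduce the message size" idea mentioned above, the payload can be held behind a std::shared_ptr so that the message type itself is only a small handle. This is a sketch using the standard library only; Message, make_message and payload_size are hypothetical names, and whether this actually avoids the assertion in an affected oneTBB version has not been verified here.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Payload type from the reproducer in this thread.
struct TestStruct {
    char bytes[153]{ 0 };
};

// Hypothetical alias: the graph would carry this handle instead of the payload.
using Message = std::shared_ptr<const TestStruct>;

// On common implementations the handle is a couple of pointers wide,
// far smaller than the 153-byte payload.
static_assert(sizeof(Message) < sizeof(TestStruct),
              "handle should be smaller than the payload");

// Creates a zero-initialized payload on the heap and returns a shared handle.
inline Message make_message() {
    return std::make_shared<TestStruct>();
}

// Reads through the handle; copies of the handle share one payload.
inline std::size_t payload_size(const Message& message) {
    assert(message != nullptr);
    return sizeof(message->bytes);
}
```

With this approach the node types from the reproducer would be instantiated with Message instead of TestStruct, so each message copy inside the graph copies the handle rather than the 153 bytes.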

@pranasge

Wrong deallocation sounded bad, but it seems to be less of an issue than I thought. I will use the release version for now. Thanks for the info.

@alexey-katranov
Contributor

@pranasge, could you check whether #667 fixes your original problem?

@pranasge

Yes, my tests pass without problems now.

@inline42

inline42 commented Aug 14, 2022

I am having the same issue. Is the fix in a release, or do I have to check out a specific commit?

Also, can I have some background on why this is happening, please? Which object is deallocated: is it the operator that goes in the function node, or something unrelated to my implementation? It sounds scary :)

@vlserov

vlserov commented Sep 29, 2022

Commit 6edb5c3 did not become part of v2021.5.0.
It should be included in the latest v2021.6.0 release.

@Nekto89
Contributor

Nekto89 commented Aug 3, 2023

Google led me here. I'm getting this assertion non-deterministically on v2021.9.0 on Linux and macOS. I haven't been able to minimize the case yet.
