Fix io_service_fixture_with_threads: create given threads count #75
@@ -57,6 +57,18 @@ TEST_CASE("single waiter")

TEST_CASE_FIXTURE(io_service_fixture_with_threads<3>, "multi-threaded")
{
#if (CPPCORO_CPU_X86)
#if (1)
const int iterations = 10'000;
Do you know how/why this change works around the issue with x86 optimised builds?
I can only assume that, because it's now a variable captured by reference, the code below becomes slightly more complicated and harder to optimise.
I have seen several different tests all fail under x86 optimised over various recent versions of msvc.
This mostly just seems like bad codegen for coroutines under x86, although I haven't looked into some of the more recent failures to identify the root cause yet. I've been largely just ignoring x86 optimised failures lately and am hoping that the codegen bugs will be fixed in the 15.7 update.
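The two spellings of the loop bound being compared here can be sketched as follows. This is only an illustration with hypothetical function names, not the actual test, and it assumes the loop runs inside a lambda with a `[&]` capture-default, as the comment above suggests. Both variants compute the same result; whether an optimiser treats them differently is exactly the open question.

```cpp
#include <cstdint>

// Variant A (hypothetical): the loop bound is written as an integer
// literal directly in the loop condition.
inline std::int64_t sum_with_literal()
{
    auto run = [] {
        std::int64_t total = 0;
        for (int i = 0; i < 10'000; ++i)
        {
            total += i;
        }
        return total;
    };
    return run();
}

// Variant B (hypothetical): the loop bound is a named const local that is
// used inside a lambda declared with a [&] capture-default.
inline std::int64_t sum_with_named_bound()
{
    const int iterations = 10'000;
    auto run = [&] {
        std::int64_t total = 0;
        for (int i = 0; i < iterations; ++i)
        {
            total += i;
        }
        return total;
    };
    return run();
}
```

The observable behaviour of the two functions is identical, so any difference in a failing build would have to come from how the compiler generates code for them.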
No, I have no idea. I haven't had time to look at the assembler output yet.
Also, the bug disappears if the iteration count is lower (say 6'000). In that case an integer literal can be used.
Regarding "recent versions of msvc": I'm using 15.6.4 and this fix helps me but, as I said, the AppVeyor build still fails. Since AppVeyor shows none of the linker errors that I get with 15.6.4, it must have an older MSVC version (15.6.3?). So, if this is a codegen bug, something changed in the latest version.
Also, because a lower iteration count helps, I thought this might be related to the thread's stack size, but increasing the stack size from the default 1 MB to 10 MB does not help.
I can try changing the iteration count to 2'000 and building again on AppVeyor if you think the changes should be accepted.
Thank you for looking into the change :)
I have confirmed that this change also makes the test pass on MSVC 15.7 (preview 2).
It is very weird that changing the iteration count affects the failure. Perhaps reducing the iteration count reduces contention on shared data structures and thus avoids some race conditions?
The AppVeyor tests are still running with an older version of MSVC (15.6.2) but are failing on a different test (writing a file). This test only seems to fail on the AppVeyor CI machines; I haven't been able to reproduce it on a dev machine. My current working theory for that failing test is also bad x86 codegen, but I don't have any evidence other than "other things are broken under x86 optimised due to compiler bugs so it's likely this is too".
I'll put through a change to disable the failing file test under x86 optimised for now.
@lewissbaker, sorry for the late response. Just to be sure: can I help you with this somehow?
Would you be able to amend the PR to remove the change to test/single_consumer_async_auto_reset_event_tests.cpp ?
I'll put the PR through with just the fix to the io_service_fixture constructor.
I've put through a change on master that ignores test failures on x86 for now and will leave it as a known issue. There's no point in working around the compiler bug in the test - users of the library are just as likely to run into the bug in their own code.
Changes reverted. (I'm not sure it was done right: I have a few unneeded commits in the forked repository, but the pull request now contains only one file.)
@@ -25,7 +25,10 @@ struct io_service_fixture
m_ioThreads.reserve(threadCount);
try
{
m_ioThreads.emplace_back([this] { m_ioService.process_events(); });
Nice find!
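The kind of fix the PR title describes (creating the requested number of threads rather than one) can be sketched as below. This is a minimal illustration under stated assumptions, not the actual cppcoro code: `thread_pool_fixture`, `join_and_count`, and the counting worker body are hypothetical stand-ins, and the worker only increments a counter instead of running an `io_service` event loop.

```cpp
#include <atomic>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for the real fixture: it records how many worker
// threads actually ran, instead of processing io_service events.
class thread_pool_fixture
{
public:
    explicit thread_pool_fixture(std::size_t threadCount)
    {
        m_threads.reserve(threadCount);
        try
        {
            // Start one worker per requested thread. A single emplace_back
            // outside a loop would start exactly one thread regardless of
            // threadCount.
            for (std::size_t i = 0; i < threadCount; ++i)
            {
                m_threads.emplace_back([this] { ++m_started; });
            }
        }
        catch (...)
        {
            // Join whatever was started before rethrowing; destroying a
            // joinable std::thread would call std::terminate.
            join_all();
            throw;
        }
    }

    ~thread_pool_fixture() { join_all(); }

    // Waits for all workers to finish and returns how many of them ran.
    std::size_t join_and_count()
    {
        join_all();
        return m_started.load();
    }

private:
    void join_all()
    {
        for (auto& t : m_threads)
        {
            if (t.joinable())
            {
                t.join();
            }
        }
    }

    std::atomic<std::size_t> m_started{0};
    std::vector<std::thread> m_threads;
};
```

With this shape, constructing the fixture with a count of 3 starts three workers, which is what a test named "multi-threaded" presumably expects.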
… array access` exception" This reverts commit 8467a21.
Looks like a small typo: only one thread was ever created.
UPD: The failed optimized x86 build shows that the "multi-threaded" case with 3 threads fails with exceptions:
and the call stack is always the same:
This test case fails with any number of threads greater than one.
After some experimenting I found a "fix", but I have no idea why it works.
Here it is: link.
Unfortunately, this strange fix works fine for my local build but causes the tests on AppVeyor to fail in a different place, with a different exception.