Bugfix/test timeout #807

darbyjohnston
Contributor

Hi,

Running the OpenEXR.IlmImf test on one of my dev machines takes about 38 minutes, which exceeds the default CTest timeout of 1500 seconds and causes the test to be marked as failing. This change simply doubles the timeout to 3000 seconds. The machine is not terribly slow (a quad-core Xeon @ 3.7 GHz), so maybe slower file access on Windows is partially at fault?

This is on Windows 10 with Visual Studio 2019, compiled with Debug/x64.

Test project C:/dev/openexr/openexr-build
Start 1: IlmBase.Half
1/5 Test #1: IlmBase.Half .....................   Passed    7.16 sec
Start 2: IlmBase.Iex
2/5 Test #2: IlmBase.Iex ......................   Passed    0.01 sec
Start 3: IlmBase.Imath
3/5 Test #3: IlmBase.Imath ....................   Passed   31.03 sec
Start 4: OpenEXR.IlmImf
4/5 Test #4: OpenEXR.IlmImf ...................   Passed  2343.49 sec
Start 5: OpenEXR.IlmImfUtil
5/5 Test #5: OpenEXR.IlmImfUtil ...............   Passed  100.89 sec

100% tests passed, 0 tests failed out of 5

Total Test time (real) = 2482.62 sec
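
For readers unfamiliar with per-test timeouts, a minimal sketch of how a timeout like this can be raised in CMake follows. It assumes the test is registered under the name OpenEXR.IlmImf (as in the log above) and that this line sits in the same CMakeLists.txt that calls add_test for it; it is not necessarily the exact change made in this pull request.

```cmake
# Sketch only: raise the timeout for one long-running test.
# Assumes add_test(NAME OpenEXR.IlmImf ...) was already called in this directory.
# TIMEOUT is given in seconds; tests without it fall back to CTest's default (1500).
set_tests_properties(OpenEXR.IlmImf PROPERTIES TIMEOUT 3000)
```

Setting the property on the one slow test leaves the default limit in place for the others, so a genuinely hung short test is still caught quickly.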

Signed-off-by: Darby Johnston <darbyjohnston@yahoo.com>
@meshula
Contributor

meshula commented Aug 8, 2020

I wonder if there's a smart way to cut the test4 time in half....

@darbyjohnston
Contributor Author

darbyjohnston commented Aug 10, 2020

For comparison, here are the test times on the same hardware with CentOS 7 (dual boot machine):

Test project /home/darby/dev/openexr/openexr-build
Start  1: IlmBase.Half
1/11 Test  #1: IlmBase.Half .........................   Passed    1.03 sec
Start  2: IlmBase.Iex
2/11 Test  #2: IlmBase.Iex ..........................   Passed    0.00 sec
Start  3: IlmBase.Imath
3/11 Test  #3: IlmBase.Imath ........................   Passed    3.54 sec
Start  4: OpenEXR.IlmImf
4/11 Test  #4: OpenEXR.IlmImf .......................   Passed  338.07 sec
Start  5: OpenEXR.IlmImfUtil
5/11 Test  #5: OpenEXR.IlmImfUtil ...................   Passed   22.94 sec

I can't imagine the difference in CPU performance is that large, so I suspect disk access. I double-checked that disk compression is turned off on Windows 10:

> fsutil behavior query disablecompression
DisableCompression = 1  (Enabled)

Both tests were run on SATA SSDs.

@cary-ilm
Member

cary-ilm commented Aug 10, 2020 via email

@darbyjohnston
Contributor Author

Hi @cary-ilm,

I tried testing on another machine (Windows 10 Home, 8 core Ryzen, NVMe drive), which I thought would be faster, and actually got worse results:

Test project C:/dev/openexr/openexr-build
Start 1: IlmBase.Half
Test #1: IlmBase.Half .....................   Passed    9.45 sec
Start 2: IlmBase.Iex
Test #2: IlmBase.Iex ......................   Passed    0.06 sec
Start 3: IlmBase.Imath
Test #3: IlmBase.Imath ....................   Passed   39.19 sec
Start 4: OpenEXR.IlmImf
Test #4: OpenEXR.IlmImf ...................   Passed  2900.37 sec
Start 5: OpenEXR.IlmImfUtil
Test #5: OpenEXR.IlmImfUtil ...............   Passed  131.07 sec

Maybe this could also be from building the code in "Debug" mode, which seems to have a larger performance impact on Windows than Linux?

Is there a way to specify the timeout limit at execution time?

I believe you can also set the timeout on the CTest command line.
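
For reference, the command-line form is `ctest --timeout <seconds>`, which sets the default timeout for tests that do not already carry their own TIMEOUT property. As a rough sketch, the same flag can be wrapped in a convenience target so that nobody has to remember it; the target name check_long_timeout is hypothetical and not part of OpenEXR.

```cmake
# Sketch only: a convenience target that runs CTest with a larger default timeout.
# "check_long_timeout" is a made-up name used for illustration.
add_custom_target(check_long_timeout
  COMMAND ${CMAKE_CTEST_COMMAND}
          --timeout 3000        # default timeout in seconds for tests without TIMEOUT
          -C $<CONFIG>          # select the configuration on multi-config generators
          --output-on-failure
  WORKING_DIRECTORY ${CMAKE_BINARY_DIR}
  USES_TERMINAL
)
```

With this in place, `cmake --build . --config Debug --target check_long_timeout` would run the suite with the 3000 second limit without changing the project's default.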

@kdt3rd
Contributor

kdt3rd commented Nov 7, 2020

@darbyjohnston - I do not know what the cause is, but while trying to get the GitHub Actions workflows running for Windows, I also noticed that the Windows tests are extremely slow in debug mode (optimized, they seem fine). On the current master, I have split things up so that each test runs as a separate CTest test. Could I ask you to compile that in debug, narrow down which test is particularly slow, and maybe profile it a bit to see what makes it so much slower in debug? Thanks in advance...

The main branch will be in a high state of development flux for a little while as we're doing a full re-org after the Imath split, so it may look a bit different from what you have set up currently...

@meshula
Contributor

meshula commented Nov 9, 2020

I don't have a reference available at the moment, but there is a large amount of heap-sanity checking that goes on in an MSVC debug build, hence the terrible performance. I believe that under VS2019 preprocessor macros were introduced that elide that facility and bring run times close to release (at least the same order of magnitude, anyway). Given that release and debug builds under MSVC have vastly different runtime behavior (zero-initialized versus not, sentinel values for constructed objects versus destructed objects, 0xfeeefeee to indicate freed allocations, and so on), I wonder what we might gain from running the tests in an MSVC debug mode in general though?

@peterhillman
Contributor

I wonder what we might gain from running the tests in an msvc debug mode in general though?

Presumably a debug build of OpenEXRTest needs to be available to debug any issues reported by running tests in release mode. That would likely be done by running the OpenEXRTest binary directly, rather than via the ctest mechanism.

Now @kdt3rd has made it run tests separately, perhaps ctest in debug mode could default to running one brief test (e.g. testMultiPartApi) just to make sure that the OpenEXRTest binary builds and runs. Release mode ctest should still run all the tests.
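
A rough sketch of how that split could look in CMake, using the CONFIGURATIONS option of add_test. It assumes OpenEXRTest is an executable target that accepts a test name as its argument; the OpenEXR_FULL_TESTS variable and the test names registered here are hypothetical placeholders, not the project's actual list.

```cmake
# Sketch only: full suite for optimized configurations, one brief test for Debug.
# Assumes OpenEXRTest is an executable target taking a test name as its argument.
# OpenEXR_FULL_TESTS is a hypothetical list; the real list of test names is longer.
set(OpenEXR_FULL_TESTS testMultiPartApi)

foreach(test_name IN LISTS OpenEXR_FULL_TESTS)
  # Registered only for optimized configurations.
  add_test(NAME OpenEXR.${test_name}
           COMMAND OpenEXRTest ${test_name}
           CONFIGURATIONS Release RelWithDebInfo)
endforeach()

# In Debug, run a single brief test to confirm that the binary builds and runs.
add_test(NAME OpenEXR.debug_smoke
         COMMAND OpenEXRTest testMultiPartApi
         CONFIGURATIONS Debug)
```

Tests restricted this way are selected by the configuration passed to CTest, for example `ctest -C Debug` or `ctest -C Release`.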

@meshula
Contributor

meshula commented Nov 10, 2020

@peterhillman I'm not arguing against having a debug build; clearly we benefit from that for debugging. I'm arguing that running (as opposed to building) the entirety of the full suite in debug mode isn't providing useful information in and of itself. To play devil's advocate against my own suggestion, it is true that the MSVC diagnostics detect double deletes and freed memory access, but I would argue that ASAN, TSAN, and UBSAN should be our focus on that front instead.

Anyway, as you note:

Now @kdt3rd has made it run tests separately, perhaps ctest in debug mode could default to running one brief test (e.g. testMultiPartApi) just to make sure that the OpenEXRTest binary builds and runs. Release mode ctest should still run all the tests

I think that would be an excellent resolution to the issue.

@darbyjohnston
Contributor Author

Hi Nick, the message you posted to OpenEXR dev discussion yesterday, "slow tests on Windows", I assume is related to this? I tried syncing the latest from master and using:

set(CMAKE_C_FLAGS /GS-)
set(CMAKE_CXX_FLAGS /GS-)
add_definitions(-D_ITERATOR_DEBUG_LEVEL=0)
add_definitions(-D_HAS_ITERATOR_DEBUGGING=0)

This reduced the test run time in debug mode from 2877 seconds to 2266 seconds.
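
A possible variation on the settings above, as a sketch only: it scopes the same switches to MSVC Debug builds with generator expressions instead of overwriting CMAKE_C_FLAGS and CMAKE_CXX_FLAGS for every configuration. It was not used for the timings quoted here. It requires CMake 3.12 or newer for add_compile_definitions, and it omits _HAS_ITERATOR_DEBUGGING on the assumption that _ITERATOR_DEBUG_LEVEL, the newer macro, is sufficient.

```cmake
# Sketch only: disable some MSVC debug-time checks, for Debug builds only.
# All libraries linked together need to agree on _ITERATOR_DEBUG_LEVEL.
if(MSVC)
  # /GS- turns off buffer security checks.
  add_compile_options($<$<CONFIG:Debug>:/GS->)
  # Level 0 disables checked and debug iterators in the MSVC standard library.
  add_compile_definitions($<$<CONFIG:Debug>:_ITERATOR_DEBUG_LEVEL=0>)
endif()
```

Because the expressions expand to nothing outside Debug, Release flags are left untouched.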

When I initially opened this pull request I assumed that the tests should be run in debug mode, but if that's not the case then maybe this is a non-issue? Instead of special-casing the debug mode to only run a single test, maybe just don't run the tests at all for the GitHub Actions Windows/debug run? As Peter mentioned, if a test fails and needs to be debugged on Windows, a developer can run the OpenEXRTest binary directly.

@meshula
Contributor

meshula commented Nov 13, 2020

That didn't make as big a difference as I expected!

Not running the GitHub Actions tests for Windows/debug specifically sounds like the right move to me.

@darbyjohnston darbyjohnston deleted the bugfix/test_timeout branch September 2, 2022 19:20