Nightly GitHub Actions Build Fail on Mon, July 31st 2023 #4221

Closed
github-actions bot opened this issue Jul 31, 2023 · 11 comments

@github-actions
Contributor

See run for more details:
https://github.com/TileDB-Inc/TileDB/actions/runs/5710056189

@ihnorton
Member

2023-07-31T03:49:56.2333546Z     Building Custom Rule D:/a/TileDB/TileDB/test/CMakeLists.txt
2023-07-31T03:49:56.4255743Z C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(248,5): error MSB8066: Custom build for 'D:\a\TileDB\TileDB\build\tiledb\CMakeFiles\f68ba6c806224f8b7d97fcc8498aa1d2\check.rule;D:\a\TileDB\TileDB\test\CMakeLists.txt' exited with code -1. [D:\a\TileDB\TileDB\build\tiledb\test\check.vcxproj] [D:\a\TileDB\TileDB\build\check.vcxproj]
2023-07-31T03:49:56.7630494Z C:\Program Files\Microsoft Visual Studio\2022\Enterprise\MSBuild\Microsoft\VC\v170\Microsoft.CppCommon.targets(248,5): error MSB8066: Custom build for 'D:\a\TileDB\TileDB\build\CMakeFiles\07a5710490f67ffdee8655138aa22e3e\check.rule;D:\a\TileDB\TileDB\CMakeLists.txt' exited with code 1. [D:\a\TileDB\TileDB\build\check.vcxproj]
2023-07-31T03:49:57.5648752Z ##[error]Process completed with exit code 1.
2023-07-31T03:49:57.6168229Z Post job cleanup.

puzzling...

@davisp
Contributor

davisp commented Jul 31, 2023

@teo-tsirpanis
Member

After managing to reproduce it on my machine I took a closer look into this. After running CTest, the generated Visual Studio project runs this command:

setlocal
"C:\Program Files\Microsoft Visual Studio\2022\Community\Common7\IDE\CommonExtensions\Microsoft\CMake\CMake\bin\cmake.exe" -SC:/Users/teo/code/TileDB -BC:/Users/teo/code/TileDB/build/nightly/tiledb --check-stamp-file C:/Users/teo/code/TileDB/build/nightly/tiledb/test/CMakeFiles/generate.stamp
if %errorlevel% neq 0 goto :cmEnd
:cmEnd
endlocal & call :cmErrorLevel %errorlevel% & goto :cmDone
:cmErrorLevel
exit /b %1
:cmDone
if %errorlevel% neq 0 goto :VCEnd

which seems to be the one that fails with an exit code of -1.
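To check the exit code outside of MSBuild, the cmake invocation from the wrapper can be run on its own. The following is only a minimal sketch: `run_and_report` is a hypothetical helper name, and the command to pass in is whatever cmake line your own generated project contains.

```python
import subprocess
import sys


def run_and_report(cmd):
    """Run cmd (a list of arguments), echo its output, and return its exit code."""
    result = subprocess.run(cmd, capture_output=True, text=True)
    if result.stdout:
        print(result.stdout, end="")
    if result.stderr:
        print(result.stderr, end="", file=sys.stderr)
    code = result.returncode
    # On Windows an exit code of -1 may be reported as the unsigned value
    # 4294967295; fold it back into the signed 32-bit range so it can be
    # compared with the "exited with code -1" message from MSBuild.
    if code >= 2**31:
        code -= 2**32
    print(f"exit code: {code}")
    return code
```

Pass the arguments as a list, starting with the path to cmake.exe, followed by the -S, -B and --check-stamp-file arguments exactly as they appear in the generated wrapper.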

I generated an MSBuild binary log of the failed test run. For anyone interested, it can be viewed with https://msbuildlog.com/.

@teo-tsirpanis
Member

Maybe the way we run the tests with cmake --build . --target check has a problem. We don't use --target check anywhere else in CI. 🤔

@teo-tsirpanis
Member

The last successful nightly build was on 366b15b and the first failing nightly build was on c2bbf59.

Between them the biggest change was #4034, which still touched very little CMake stuff.

@davisp
Contributor

davisp commented Aug 11, 2023

I tried looking for a change as well in that time frame and couldn't find any smoking guns. I was wondering if there was a version change of something on the builders.

Does that cmake --build . --target check run successfully? I tried to download the msbuildlog app on macOS, but macOS was very insistent it was malware so I didn't pursue it any further since you've already gone through it.

I wonder if we could add debug logging around that stamp file to see if/when it exists? Perhaps one of the filesystem tests does something weird when cleaning up data and accidentally deletes the CMake directory?
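A minimal sketch of what such logging could look like, assuming the stamp file of interest is the generate.stamp path passed to --check-stamp-file. `stamp_status` is a hypothetical helper, not something that exists in the TileDB build; it could be called before and after the test run to see whether the file disappears in between.

```python
import os
import time


def stamp_status(path):
    """Return a one-line description of the stamp file: missing, or size and mtime."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return f"{path}: MISSING"
    # Format the modification time so it can be lined up with CI log timestamps.
    mtime = time.strftime("%Y-%m-%dT%H:%M:%S", time.localtime(st.st_mtime))
    return f"{path}: present, {st.st_size} bytes, modified {mtime}"
```

Printing the result at a few points in the CI job would show whether, and roughly when, the file goes away.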

@teo-tsirpanis
Member

> I was wondering if there was a version change of something on the builders.

I will try rerunning a nightly build on 366b15b.

> Does that cmake --build . --target check run successfully?

Not on my machine. These MSBuild logs are from a local build. After many tries I could not make GHA upload them as artifacts.

> I tried to download the msbuildlog app on macOS, but macOS was very insistent it was malware so I didn't pursue it any further since you've already gone through it.

The linked site has an online viewer.

@davisp
Contributor

davisp commented Aug 11, 2023

> The linked site has an online viewer.

Huh! So it does. I must've just seen the macOS installation and clicked through without noticing the big green button for the online version.

@teo-tsirpanis
Member

> I will try rerunning a nightly build on 366b15b.

It succeeds. 😩

I will now try running a nightly build for each of the commits in the range. 🤞🏻
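Running every commit in the range works; when the range is long, a bisection needs fewer builds. The following is a minimal sketch under two assumptions: the failure is monotonic (once a commit fails, every later one fails too), and `is_good` is a hypothetical callback that triggers a nightly build for a commit and reports whether it passed. The function name is illustrative as well.

```python
def first_failing(commits, is_good):
    """Return the first failing commit.

    commits is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. is_good(commit) returns True when the
    build for that commit succeeds.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[lo] good, commits[hi] bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]
```

With the endpoints from this thread, `commits` would start at 366b15b and end at c2bbf59, with the intermediate commits taken from the repository history.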

Something else I noticed is that around the time the builds started failing, GHA updated CMake from 3.26.4 to 3.27.0. I don't think this is to blame, though, because locally I am still on 3.26.4 and the failure reproduces there as well.

@teo-tsirpanis
Member

Found it. Nightlies have been failing since #4087.

The change right before it is #4034, whose nightlies succeed.

@davisp
Contributor

davisp commented Aug 14, 2023

I've confirmed on my fork that @teo-tsirpanis seems to have found the error: it is in my SSL config PR. I ran a modified version with the SSL config tests disabled, and that completes successfully. I'm running more tests to see why those tests appear to pass but apparently fail on the Windows nightly runners, and not on the normal PR runners.

One thing I did realize is that the build-services-{stop|start}.sh scripts aren't run as part of the nightlies, though those are also POSIX-specific.
