Investigate non-deterministic CI failures #19

Closed
dan-zheng opened this issue Oct 6, 2018 · 6 comments

@dan-zheng
Collaborator

Travis CI is failing non-deterministically for some reason.
It usually takes me 2-3 CI retriggers before the tests pass.

@dan-zheng dan-zheng added the bug Something isn't working label Oct 6, 2018
@feiwang3311
Owner

feiwang3311 commented Oct 6, 2018

What is the error message like when it fails?
I noticed it sometimes too. It is a problem with my test design: some of the tests need to run the generated C++ code. The testRun function always writes the C++ code to /tmp/snippet.cpp, compiles it to /tmp/snippet, then runs it. Sometimes the file is not available, and I see error messages such as: "/tmp/snippet" file is not available (probably because tests normally run in multiple threads and share that one path).
I think I can add a fix so that the file name is a random string.
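
For illustration, here is a minimal sketch of what a random file name could look like. It is not Lantern's actual code: `SnippetFiles` and `fresh` are hypothetical names, and it assumes the JDK's `Files.createTempFile` is an acceptable way to get a unique name.

```scala
import java.nio.file.{Files, Path}

// Hypothetical helper (not part of Lantern): reserves a unique source/binary
// path pair so that tests running concurrently never share /tmp/snippet.
object SnippetFiles {
  /** Returns (sourcePath, binaryPath) sharing a unique, randomly generated stem. */
  def fresh(prefix: String = "snippet-"): (Path, Path) = {
    // createTempFile creates a new empty file with a unique name in the
    // default temp directory (usually /tmp on Linux).
    val source = Files.createTempFile(prefix, ".cpp")
    // The compiled binary sits next to the source, without the extension.
    val stem = source.getFileName.toString.stripSuffix(".cpp")
    val binary = source.resolveSibling(stem)
    (source, binary)
  }
}
```

Each call returns a different pair, so one test can no longer delete or overwrite the binary that another test is about to run.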

@dan-zheng
Collaborator Author

From my observations, error messages differed from run to run. Here's one example:

[info] - add_broadcast4 *** FAILED ***
[info]   java.io.IOException: Cannot run program "/tmp/snippet": error=2, No such file or directory
[info]   at java.lang.ProcessBuilder.start(ProcessBuilder.java:1048)
[info]   at scala.sys.process.ProcessBuilderImpl$Simple.run(ProcessBuilderImpl.scala:71)
[info]   at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.lineStream(ProcessBuilderImpl.scala:143)
[info]   at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.lineStream(ProcessBuilderImpl.scala:109)
[info]   at scala.sys.process.ProcessBuilder.lines(ProcessBuilder.scala:178)
[info]   at scala.sys.process.ProcessBuilder.lines$(ProcessBuilder.scala:178)
[info]   at scala.sys.process.ProcessBuilderImpl$AbstractBuilder.lines(ProcessBuilderImpl.scala:87)
[info]   at lantern.DslDriverC.eval(dslapi.scala:501)
[info]   at lantern.BroadCastingTest.$anonfun$new$5(test_broadcast.scala:85)
[info]   at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:12)

Perhaps randomizing snippet filenames fixes this issue. I'll observe and close this issue if that is the case.

@feiwang3311
Owner

Yep. That was the same problem. My recent push should use random names now.

@dan-zheng
Collaborator Author

dan-zheng commented Oct 7, 2018

One downside to randomizing snippet filenames is that each test invocation generates a new file in /tmp. If tests are run many times, there'll be an explosion in the number of snippet files.

A better strategy may be to use the testcase name as the snippet filename, e.g. test("vector-vector-dot") -> /tmp/vector-vector-dot.cpp.

Or, if the testcase name is not accessible, a good half-measure would be to use a common prefix for all snippet filenames, e.g. /tmp/lantern-<...>.cpp. This makes it easier to identify the snippet files and delete them all at once, when desired.
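
A rough sketch of both ideas together, assuming the test name is available as a string. `NamedSnippets`, `sourceFor`, `cleanup`, and the sanitization rule are hypothetical and only illustrate the naming scheme; they are not existing Lantern functions.

```scala
import java.nio.file.{Files, Path, Paths}

// Hypothetical helper: derives a stable snippet path from the test name and
// uses a shared prefix so all snippets can be found and deleted together.
object NamedSnippets {
  val Prefix = "lantern-"

  /** Maps e.g. "vector-vector-dot" to <dir>/lantern-vector-vector-dot.cpp. */
  def sourceFor(
      testName: String,
      dir: Path = Paths.get(System.getProperty("java.io.tmpdir"))): Path = {
    // Replace anything outside [A-Za-z0-9._-] so the name is a safe file name.
    val safe = testName.replaceAll("[^A-Za-z0-9._-]", "_")
    dir.resolve(s"$Prefix$safe.cpp")
  }

  /** Deletes regular files in `dir` whose names start with the prefix; returns the count. */
  def cleanup(dir: Path): Int = {
    val stream = Files.newDirectoryStream(dir, Prefix + "*")
    try {
      var deleted = 0
      val it = stream.iterator()
      while (it.hasNext) {
        val p = it.next()
        if (Files.isRegularFile(p)) { Files.delete(p); deleted += 1 }
      }
      deleted
    } finally stream.close()
  }
}
```

Because the path depends only on the test name, rerunning a test overwrites its previous snippet instead of adding a new file. The scheme relies on test names being unique after sanitization; two names that map to the same string would collide again.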

@TiarkRompf
Collaborator

I think it's better to run tests sequentially anyway (for reproducibility), so let's just turn parallelism off (there's an sbt flag to do that, check the LMS repo).

But let's also name files according to test names, as you suggest @dan-zheng.
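
The thread does not name the flag. Assuming it refers to sbt's standard `parallelExecution` setting, the `build.sbt` change would look roughly like this:

```scala
// build.sbt (sketch): run test suites one after another instead of in parallel.
// Older sbt versions spell this as: parallelExecution in Test := false
Test / parallelExecution := false
```

With this setting sbt stops running test suites concurrently, which should also remove the contention on a shared snippet path.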

@dan-zheng dan-zheng self-assigned this Oct 7, 2018
@dan-zheng
Collaborator Author

dan-zheng commented Oct 7, 2018
