"java.lang.NumberFormatException" Race condition in coverage file #19
Comments
Oh, actually that hasn't fixed it. :(
I take it this doesn't happen on Unix? I'll look into options for atomic file writes on Windows... |
Might be Windows newlines breaking it. Try stripping \r.
|
That's not it -- there's no whitespace in the file, just numbers and ";"s. On my machine, sometimes the numbers are interleaved with the ";"s. Here's an extract:
Note the missing semicolons near the start, and the double semicolon near the end. |
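To illustrate why a double semicolon breaks the report step, here is a minimal sketch in Java. It assumes the loader splits the file on ";" and parses each token as an integer; `MeasurementParser` and `parseIds` are hypothetical names for this sketch, not scoverage's actual loader.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical loader: splits the measurement data on ';' and parses each id.
final class MeasurementParser {
    static List<Integer> parseIds(String data) {
        List<Integer> ids = new ArrayList<>();
        for (String token : data.split(";")) {
            // An empty token (from ";;") makes Integer.parseInt throw NumberFormatException.
            ids.add(Integer.parseInt(token));
        }
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(parseIds("1;2;3;"));   // well-formed input
        try {
            parseIds("12;;3;");                   // double semicolon from interleaved writes
        } catch (NumberFormatException e) {
            System.out.println("corrupt file: " + e.getMessage());
        }
    }
}
```

Missing semicolons are arguably the worse symptom: two ids fused into one token can still parse without error, yielding an id that was never invoked.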
Are you running tests in parallel?
|
Yes |
It wasn't designed to be thread safe. I guess we need to synchronize on On 11 March 2014 10:11, Richard Bradley notifications@github.com wrote:
|
I thought you might have been relying on short file appends being atomic. I think we should be able to do something like that. I don't see why they are not atomic. Probably the Java "FileWriter" isn't using the Win32 API in the right way to get atomic appends.
Yes, locking should also do it. It might kill my test throughput.
Yes (but if we got atomic file appends working, we wouldn't even need to rely on that). |
No one on Linux has had issues but I don't know anyone who runs tests in On 11 March 2014 10:46, Richard Bradley notifications@github.com wrote:
|
I was wrong. I changed the version number and did a "sbt publishLocal", but it seems it wasn't enough to get this patch to be used in my other project. I'm trying various things to get this patch to be picked up so I can test this... |
Right -- I've chased down all the different version numbers which need changing to test this patch, and I can confirm that neither the patch proposed here, nor a version using I'm going to do some stress testing of |
I can't see any way around locking at the moment.
I'll ask on StackOverflow if anyone knows how to achieve atomic file append in Java and submit a pull request with a lock. |
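For reference, a lock-based version could look roughly like the sketch below. It is written in Java and assumes a single JVM-wide lock around each append; `LockedInvoker` is a hypothetical name, and this is not the patch that was submitted.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical invoker: one JVM-wide lock around every append.
final class LockedInvoker {
    private static final Object LOCK = new Object();

    static void invoked(int id, String path) {
        synchronized (LOCK) {
            // Open in append mode, write "id;" as one string, close before releasing the lock.
            try (FileWriter writer = new FileWriter(path, true)) {
                writer.write(id + ";");
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}
```

Every instrumented statement on every thread queues up behind the same lock, which is the throughput concern raised above. The lock also only protects threads inside one JVM; separate processes appending to the same file would still race.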
We should update the docs to note that using |
Yeah might be better to do deferred writing somehow. A lock per statement
|
How about a version which spawns a Future for each write and doesn't wait for the result? This version passes my load test:
|
I suppose the main alternative is a threadsafe queue, with a worker thread pulling "invoke"s off the queue and writing them to disk. That sounds like a fair bit of functionality to write and test (and race conditions to consider etc.). I had a look at log4j to see how it deals with this sort of issue -- it has a global lock on the logger file appender. |
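A rough sketch of the queue-plus-worker idea, in Java, to show how much code it involves. `QueuedInvoker` and the `STOP` sentinel are hypothetical, and the sketch assumes statement ids are never negative.

```java
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical queue-backed invoker: callers enqueue ids, one worker thread owns the file.
final class QueuedInvoker implements AutoCloseable {
    private static final int STOP = -1; // sentinel; assumes real statement ids are non-negative

    private final BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
    private final Thread worker;

    QueuedInvoker(String path) {
        worker = new Thread(() -> drainTo(path), "coverage-writer");
        worker.setDaemon(true);
        worker.start();
    }

    // Called from instrumented code: no file I/O happens on the caller's thread.
    void invoked(int id) {
        queue.add(id);
    }

    private void drainTo(String path) {
        try (FileWriter writer = new FileWriter(path, true)) {
            for (int id = queue.take(); id != STOP; id = queue.take()) {
                writer.write(id + ";");
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    // Enqueue the sentinel and wait until everything queued before it has been written.
    @Override
    public void close() {
        queue.add(STOP);
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Only one thread ever touches the file, so writes cannot interleave, but something has to call `close()` before the report is generated, and the queue is unbounded in this sketch.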
The thing with log4j is the number of calls is a lot less. If we lock in the invoker then a statement like |
That sounds pretty sensible. I'll see if I can code it up over the next few days. Or how about a global concurrent Set[Statement] per JVM and write it out on process exit? Would that be better? What do other coverage frameworks do? |
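The "concurrent set per JVM, written out once" idea might look something like this Java sketch. `InMemoryInvoker`, `flush` and `flushOnExit` are hypothetical names, and the set holds plain integer ids rather than Statement objects.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;

// Hypothetical in-memory recorder: ids accumulate in a concurrent set and hit disk once.
final class InMemoryInvoker {
    private static final Set<Integer> INVOKED = ConcurrentHashMap.newKeySet();

    // Duplicate ids collapse in the set, so a hot loop costs one entry, not one write per pass.
    static void invoked(int id) {
        INVOKED.add(id);
    }

    // Write every recorded id as "id;" in a single call, replacing any existing file.
    static void flush(String path) {
        String data = INVOKED.stream().map(id -> id + ";").collect(Collectors.joining());
        try {
            Files.write(Paths.get(path), data.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // One way to approximate "on process exit": register flush as a JVM shutdown hook.
    static void flushOnExit(String path) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> flush(path)));
    }
}
```

This trades per-statement file I/O for memory, and it only produces a file if `flush` actually runs before whatever reads the measurements.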
SCCT is the only other Scala one, and that's not thread safe either IIRC. Things like jacoco instrument the bytecode by attaching to the JVM. |
We could do something like |
Writing out on exit should work too, although I'd have to think about how it knows when it's finished. It won't be enough to use shutdown hooks because you might run the whole thing in a single process and then the report generators will fail because the file won't have been written. |
This is what I have so far:
def invoked(id: Int, path: String) = {
And the corresponding load:
// loads all the invoked statement ids |
... but at some level I notice that Is this article still accurate? Maybe it would be easier to port statement-level coverage into |
I think since jacoco uses the agent it's not writing anything at all, so it won't need locks. It's just a big list of invoked statements. We have to write out because Maven won't keep the JVM alive between phases. Jacoco's support will never be as good as scoverage. The reason is that with Scala the mapping from source to bytecode is not bijective, whereas with Java it's closer to being bijective. If you do a pattern match in Scala, then you might have several things of interest (did you call the match, did you match case 1, did you match case 2, ...). The bytecode generated might be 30 ops or it might be 1 op. It's hard to come back from the bytecode and map into the source-level statements. That's why I believe it's better to do AST manipulation. |
Fair enough. I've had a closer look since writing the above, and jacoco isn't as full-featured or as mature as I had first thought.
Looks like the latter then ;-)
I'm still not convinced that this isn't worth looking into -- jacoco's "big list of invoked statements" is the same as |
I agree that there are probably good solutions in jacoco that we can borrow.
|
…y using one file per thread.
I had a look in Jacoco. I’m not familiar with the codebase, but it appears to me that they use a single global JVM lock on each instrumented instruction. Using one file per thread works well for me (my highly concurrent test suite now passes). It also seems likely to be more efficient and more similar to uninstrumented code than a single global lock. I think this solution is much better than a global lock. I have made a few minor tweaks in this pull request (#25): I added that concurrency test for Invoker/IOUtils, as this is a good record that the file-per-thread solution works, and a good starting point for experimenting on alternatives in case we revisit this in the future; and I clarified the comments on Invoker. |
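For readers who have not looked at the pull request, the file-per-thread approach can be sketched in Java as below. This is an illustration, not the merged code: `PerThreadInvoker`, `loadAll` and the "measurement." file-name prefix are assumptions, and the sketch assumes a single JVM writes into the directory.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.util.HashSet;
import java.util.Set;

// Hypothetical file-per-thread invoker; the file-name prefix is an assumption for this sketch.
final class PerThreadInvoker {
    private static final String PREFIX = "measurement.";
    private final File dir;
    private final ThreadLocal<FileWriter> writer;

    PerThreadInvoker(File dir) {
        this.dir = dir;
        // Each thread lazily opens its own file, named after its thread id.
        this.writer = ThreadLocal.withInitial(() -> {
            try {
                return new FileWriter(new File(dir, PREFIX + Thread.currentThread().getId()), true);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }

    // No lock needed: only the calling thread ever touches this writer.
    void invoked(int id) {
        try {
            FileWriter w = writer.get();
            w.write(id + ";");
            w.flush(); // flush so the id reaches the file even though the writer is never closed here
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Merge every per-thread file in the directory into one set of invoked ids.
    Set<Integer> loadAll() {
        Set<Integer> ids = new HashSet<>();
        File[] files = dir.listFiles((d, name) -> name.startsWith(PREFIX));
        if (files == null) return ids;
        for (File f : files) {
            try {
                String data = new String(Files.readAllBytes(f.toPath()), StandardCharsets.UTF_8);
                for (String token : data.split(";")) {
                    if (!token.isEmpty()) ids.add(Integer.parseInt(token));
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return ids;
    }
}
```

Because no two threads share a file, there is nothing to interleave and nothing to lock; the cost moves to the loader, which has to find and merge several files. The sketch leaves the writers open, so a real version would need to decide when to close them.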
I'm very surprised that my humble change is better than what jacoco have done. I think if you ping them they may come up with some scenarios we didn't think of that render my file per thread incorrect. At the very least it will validate what we are doing here. |
In addition, I guess at the end of the day it doesn't really matter if a test takes 1ms or 10ms. It's only important in production. So I think maybe the lack of efficiency in the jacoco style lock across the entire JVM is not really an issue. |
I think it's also quite likely that I have misunderstood the code and that they have a per-class coverage lock, rather than a global coverage lock. I think the "probes" get locked, for writing, which is less than global. |
Not entirely -- some of my integration tests failed with coverage turned on because the instrumentation slowed them down so much that they timed out waiting for something to happen and reported failure. Operations which take 100ms un-instrumented now take 2s+ with coverage turned on. I now have to have two sets of timeouts configured: one for normal test runs and one for instrumented test runs. |
Ok fair enough
|
…ition-file-per-thread #19 Fix the Windows race condition in the measurement file by using one file per thread.
Can you run your tests single threaded with the lock and without the lock? It doesn't matter if the output is garbled, I just want to see the difference in timings. Basically I'm curious to know: is it the locking mechanism taking the time, or is there actual contention going on? |
Is 0.98.0 working OK on Windows now? |
0.98.2 handles all this well. |
Thanks, yes. There are some performance issues which I have been working on for this thread, but I'll raise those as separate pull requests. |
FYI: it looks like Java's file writer implementation doesn't set the native "FILE_APPEND_DATA" or "O_APPEND" flag when appending to files, so it looks like it won't support atomic writes on either platform: http://stackoverflow.com/a/24620026/8261 |
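For anyone who wants to experiment with this, NIO lets you request append mode explicitly through `StandardOpenOption.APPEND`. The Java sketch below writes each "id;" record with one buffer per record; `AppendChannelInvoker` is a hypothetical name, and whether the resulting append is atomic is file-system specific according to the Javadoc, so treat this as something to test rather than a guarantee.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;

// Hypothetical invoker that opens the file in append mode through NIO.
final class AppendChannelInvoker {
    static void invoked(int id, String path) {
        // Build the whole "id;" record up front so it can go out in one write call.
        ByteBuffer record = ByteBuffer.wrap((id + ";").getBytes(StandardCharsets.UTF_8));
        try (FileChannel channel = FileChannel.open(Paths.get(path),
                StandardOpenOption.CREATE, StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
            // A short record normally completes in one call; the loop only covers partial writes.
            while (record.hasRemaining()) {
                channel.write(record);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If a write ever does complete partially, the loop finishes the record in a second call and any atomicity is lost for that record.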
Oh right. Well for our use case it's not important as each thread writes to its own file now anyway. |
I'm seeing intermittent errors like the following when running coverage reports with "scoverage" on Windows:
I think this is because of a race condition in Invoker.invoked which is causing the "id" and the ";" to be written non-atomically. This may only be a problem on Windows. Perhaps two adjacent small appends are atomic on POSIX but not on Windows?
I will submit a patch to consolidate the two writes into one. This seems to fix the issue on Windows, and it seems like a good idea on POSIX as well.