Fix for multithreaded execution #336
Hi @pouryafard75 and thanks a lot for spotting this nasty concurrency bug that prevents parallelizing diffs. I took a look at your solution, but I am not sure it's how we should proceed. Initially, I used a global type registry to get a fast equality test for types, using only a reference check instead of a string comparison. Your solution would revert to the slower equality check, and this is a very frequent operation.
Thinking more about the root of the problem, I believe the culprit is the type constructor (https://github.com/GumTreeDiff/gumtree/blob/main/core/src/main/java/com/github/gumtreediff/tree/TypeSet.java#L38-L40), which is not thread-safe. Maybe we could try to make this method
Cheers!
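To illustrate the idea of keeping the fast reference check while making type creation safe under concurrency, here is a minimal sketch of an interning registry. It is not GumTree's actual code: `InternedType` and `InternedTypeRegistry` are hypothetical names, and using `ConcurrentHashMap.computeIfAbsent` is only one possible way to get an atomic "look up or create" step.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for a node type; NOT GumTree's real Type class.
final class InternedType {
    final String name;

    InternedType(String name) {
        this.name = name;
    }
}

// Hypothetical registry that hands out one shared instance per type name.
final class InternedTypeRegistry {
    private static final ConcurrentMap<String, InternedType> TYPES =
            new ConcurrentHashMap<>();

    private InternedTypeRegistry() {
    }

    // On ConcurrentHashMap, computeIfAbsent is atomic per key, so threads that
    // race on the same name all receive the same instance.
    static InternedType type(String name) {
        return TYPES.computeIfAbsent(name, InternedType::new);
    }
}
```

Under these assumptions, `InternedTypeRegistry.type("foo") == InternedTypeRegistry.type("foo")` holds regardless of which threads make the calls, so a reference comparison remains a valid equality test.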
@jrfaller Thanks for your feedback.
Hi @pouryafard75 and thanks for trying the fix out! I completely agree with you that the test case should be located inside the core package and should be as small as possible. I think the problem appears in such a case:
So, if I am not wrong, just diffing one-node trees (or trees with several nodes of the same type) concurrently could trigger the bug. The assert predicate would be that all combinations of types should return WDYT?
@jrfaller Thanks for your comment. This is what I have replicated
I also moved it to TestTree, but the problem is that it doesn't fail frequently. I assume the previous test had a higher chance of failure because of the time it took to parse and generate the tree.
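I have this one:

@Test
public void testTypeThreading() throws InterruptedException {
    int n = 20;
    ExecutorService exec = Executors.newFixedThreadPool(n);
    // Synchronized list, so the only race under test is the one in TypeSet.type().
    List<Type> types = Collections.synchronizedList(new ArrayList<>());
    for (int i = 0; i < n; i++) {
        exec.submit(() -> {
            types.add(TypeSet.type("foo"));
        });
    }
    // Stop accepting tasks so that awaitTermination returns once they finish.
    exec.shutdown();
    exec.awaitTermination(1, java.util.concurrent.TimeUnit.SECONDS);
    // Every thread must have received the same Type instance.
    for (Type t1 : types) {
        for (Type t2 : types) {
            assertSame(t1, t2);
        }
    }
}

Seems to fail almost every time on my machine, is it the case for you too?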
@jrfaller Yes, this works. (I was fixated on taking advantage of repeated tests to reproduce the bug 😬)
Hello,
I've identified a bug affecting multithreaded execution.
This PR consists of 3 commits, and I'll break down each commit:
I have created a repeated test (executes in separate threads) (commit#1) which runs GumTree Classic matcher on a file revision, and observed discrepancies in the number of generated mappings. To enable a multithreaded environment, I relied on JUnit 5's CONCURRENT execution, requiring the addition of junit-platform.properties.
Test result in a multithreaded environment before the fix
After investigating, I realized the issue lies in the equality check for types, getType() == t.getType(), which works in a single-threaded environment but fails when multiple GumTree instances are running. In commit#2, I addressed this problem for the Classic and Simple matchers (and also ZsMatcher, since Classic depends on it).
Test result in a multithreaded environment after the fix
Additionally, I noticed the same issue across different matchers, leading to commit#3, where I replicated the fix throughout the entire project.
P.S: I'm aware that the test should ideally be located in the core module, but due to the dependency on JdtTreeGenerator, I couldn't move it. I can export the tree XML and load it to eliminate the dependency. Please let me know your thoughts.
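As a small illustration of why a reference check such as getType() == t.getType() can fail once two instances exist for the same type name, here is a sketch under stated assumptions. `DemoType` and `EqualityDemo` are hypothetical names, not GumTree classes, and the name-based `equals` only stands in for the kind of value comparison the fix relies on.

```java
import java.util.Objects;

// Hypothetical type with value-based equality on its name; NOT GumTree's Type.
final class DemoType {
    final String name;

    DemoType(String name) {
        this.name = name;
    }

    @Override
    public boolean equals(Object o) {
        return o instanceof DemoType && ((DemoType) o).name.equals(name);
    }

    @Override
    public int hashCode() {
        return name.hashCode();
    }
}

final class EqualityDemo {
    // Fast, but only correct if each name maps to exactly one instance.
    static boolean sameByReference(DemoType a, DemoType b) {
        return a == b;
    }

    // Slower, but still correct when duplicate instances exist.
    static boolean sameByName(DemoType a, DemoType b) {
        return Objects.equals(a, b);
    }

    public static void main(String[] args) {
        // Two instances for one name, as a racy registry could produce.
        DemoType a = new DemoType("foo");
        DemoType b = new DemoType("foo");
        System.out.println(sameByReference(a, b)); // prints "false"
        System.out.println(sameByName(a, b));      // prints "true"
    }
}
```

The trade-off is the one raised in the discussion: comparing by name tolerates duplicate instances but costs a string comparison on a very frequent operation, whereas the reference check is cheap but requires that instances are never duplicated.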