Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix for multithreaded execution #336

Merged
merged 2 commits into from
Feb 9, 2024
Merged

Conversation

pouryafard75
Copy link
Contributor

@pouryafard75 pouryafard75 commented Feb 5, 2024

Hello,

I've identified a bug affecting multithreaded execution.

This PR consists of 3 commits, and I'll break down each commit:

  • I have created a repeated test (executes in separate threads) (commit#1) which runs GumTree Classic matcher on a file revision, and observed discrepancies in the number of generated mappings. To enable a multithreaded environment, I relied on JUnit 5's CONCURRENT execution, requiring the addition of junit-platform.properties.
    Flaky test
    Test result in a multithreaded environment before the fix

  • After investigating, I realized the issue lies in the equality check for types getType() == t.getType(), which works in a single-threaded environment but fails when multiple GumTree instances are running. In commit#2, I addressed this problem for Classic and Simple Matchers (also ZsMatcher since Classic depends on that)Stable Test
    Test result in a multithreaded environment after the fix

  • Additionally, I noticed the same issue across different matchers, leading to commit#3, where I replicated the fix throughout the entire project.

P.S: I'm aware that the test should ideally be located in the core module, but due to the dependency on JdtTreeGenerator, I couldn't move it. I can export the tree XML and load it to eliminate the dependency. Please let me know your thoughts.

@pouryafard75 pouryafard75 marked this pull request as ready for review February 5, 2024 21:12
@jrfaller
Copy link
Member

jrfaller commented Feb 7, 2024

Hi @pouryafard75 and thanks a lot for spotting this nasty concurrency bug that prevents parallelizing diffs.

I took a look at your solution but I am not sure it's how we should proceed. Initially, I used a global type registry to have a fast equality test for types, using only a reference check instead of a string comparison, but your solution would revert to this slower equality checking (and this is a very frequent operation).

By thinking more about the root of the problem, I think the culprit here is the type constructor (https://github.com/GumTreeDiff/gumtree/blob/main/core/src/main/java/com/github/gumtreediff/tree/TypeSet.java#L38-L40) that is not thread-safe. Maybe we could try to make this method synchronized? Could you try this fix to see if it would work with your test?

Cheers!

@pouryafard75
Copy link
Contributor Author

pouryafard75 commented Feb 7, 2024

@jrfaller Thanks for your feedback.
Your suggested solution works perfectly, and I have updated the PR.
Would you recommend moving the test to core and then loading the tree from xml file? Because the test doesnt have to do anything with jdt.

@jrfaller
Copy link
Member

jrfaller commented Feb 7, 2024

Hi @pouryafard75 and thanks for trying the fix out!

I completely agree with you that the test case should be located inside the core package, and should be as small as possible.

I think the problem appears in such a case :

  • node with type A is created, registry requested to create a type for A, ref r1 returned
  • at the same time, another node with type A is created, registry concurrently requested to create a type for A, ref r2 returned (hiding ref r1 forever)
  • from now on every node with type A will be assigned r2 and the one node with r1 will be seen as having an incompatible type.

So if I am not wrong, just having one-node trees (or trees with several nodes with the same type) to diff concurrently could trigger the bug, the assert predicate would be that all combination of types should return true to ==.

WDYT?

@pouryafard75
Copy link
Contributor Author

pouryafard75 commented Feb 7, 2024

@jrfaller Thanks for your comment.
Its a bit challenging to make the test fail in case of having super small trees. Also we cannot create the Type in the setup, because that will ensure the type has been added to registry properly before any tests.

This is what I have replicated

@RepeatedTest(10)
    public void testMultiThreading() {
        Type foo1 = TypeSet.type("foo");
        Type foo2 = TypeSet.type("foo");
        assertSame(foo1,foo2);
    }

I also moved it to TestTree. but the problem is, it doesn't fail frequently. I assume the previous test due to the time it took to parse the tree and generate, had higher chance of failure.

@jrfaller
Copy link
Member

jrfaller commented Feb 8, 2024

I have this one :

    @Test
    public void testTypeThreading() throws InterruptedException {
        int n = 20;
        ExecutorService exec = Executors.newFixedThreadPool(n);
        List<Type> types = new ArrayList<>();

        for (int i = 0; i < n; i++) {
            exec.submit(() -> {
                types.add(TypeSet.type("foo"));
            });
        }

        exec.awaitTermination(1, java.util.concurrent.TimeUnit.SECONDS);

        for (Type t1 : types) {
            for (Type t2: types) {
                assertSame(t1, t2);
            }
        }
    }

Seems to fail almost every time on my machine, is it the case for you too?

@jrfaller jrfaller added the bug label Feb 8, 2024
@jrfaller jrfaller self-assigned this Feb 8, 2024
@pouryafard75
Copy link
Contributor Author

@jrfaller Yes this works. (I was fixated to take advantage of repeated tests to reproduce the bug 😬)
I have updated the PR.

@jrfaller jrfaller merged commit 0f50160 into GumTreeDiff:main Feb 9, 2024
2 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants