Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix jar file reference close race condition #43257

Merged
merged 2 commits into from
Sep 13, 2024

Conversation

franz1981
Copy link
Contributor

@franz1981 franz1981 commented Sep 12, 2024

Possible solution for #43158

This is fixing a potential race while virtual threads try to "help" and in-progress release by moving forward the shared atomic reference: we need to make sure that JarFileReferences cleanup don't mess up with a freshly jar file shared by the virtual thread.
At the same time if there's an in progress jar file ref load from a platform thread, the virtual thread doesn't have to replace the in progress task.

@franz1981 franz1981 marked this pull request as ready for review September 12, 2024 21:41
@quarkus-bot quarkus-bot bot added the area/devtools Issues/PR related to maven, gradle, platform and cli tooling/plugins label Sep 12, 2024
@franz1981
Copy link
Contributor Author

franz1981 commented Sep 12, 2024

FYI @pcasaes if you can give it a shot: I've implemented this quickly and using the simplest scheme i.e. using the flip into negative value to bound the domain of reference count values.

It has 2 strict requirements:

  1. acquire and releases should always be in pair
  2. releases can never close the file if the reference hasn't been "marked for closing"

I hope I'm not making it too strict

@franz1981
Copy link
Contributor Author

franz1981 commented Sep 12, 2024

@mariofusco PTAL

@dmlloyd too

If you prefer me to use a parity bit or any other mechanism to simplify the reasoning here, any suggestion is welcome

@franz1981
Copy link
Contributor Author

I've found another logic error in the original implementation for the virtual threading part, fixing it

@franz1981
Copy link
Contributor Author

@mariofusco @geoand

Re

var futureRef = jarFileReference.get();
if (futureRef != null) {
// The jarfile has been already used and it's going to be removed from the cache,
// so the future must be already completed
var ref = futureRef.getNow(null);
if (ref != null) {
ref.markForClosing(this);
}
}

I'm concerned re this because if markForClosing is mandatory to close the jar file, it means that if we have a race between this close and a loading which doesn't have yet completed the future, it will miss it - causing the file ref to not be closed: this is fine/expected?

@franz1981
Copy link
Contributor Author

@gsmet I have to check yet if we need some tests and counters to verify that after a reset of caches all the jar file ref are correctly nulled out or that a retained jar file ref (acquired by some thread, and not yet released) - after a reset of caches - it's correctly closed (and nulled out) after being released.

@quarkus-bot
Copy link

quarkus-bot bot commented Sep 13, 2024

Status for workflow Quarkus CI

This is the status report for running Quarkus CI on commit bd1fe70.

✅ The latest workflow run for the pull request has completed successfully.

It should be safe to merge provided you have a look at the other checks in the summary.

You can consult the Develocity build scans.


Flaky tests - Develocity

⚙️ Native Tests - HTTP

📦 integration-tests/rest-client-reactive

io.quarkus.it.rest.client.BasicTestIT.shouldCreateClientSpans - History

  • expected: <1> but was: <2> - org.opentest4j.AssertionFailedError
org.opentest4j.AssertionFailedError: expected: <1> but was: <2>
	at io.quarkus.it.rest.client.BasicTest.shouldCreateClientSpans(BasicTest.java:216)
	at java.base/java.lang.reflect.Method.invoke(Method.java:569)
	at io.quarkus.test.junit.QuarkusTestExtension.interceptTestMethod(QuarkusTestExtension.java:810)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)
	at java.base/java.util.ArrayList.forEach(ArrayList.java:1511)

/**
* Ask to close this reference.
* If there are no readers currently accessing the jarFile also close it, otherwise defer the closing when the last reader
* will leave.
*/
void close(JarResource jarResource) {
release(jarResource);
void markForClosing(JarResource jarResource) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method doesn't really close the resource, but mark it as ready to be closed, now its method better reflect this behaviour.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment should be in the code, not here :)

return;
}
// close must change the value into a negative one or zeroing
if (referenceCounter.compareAndSet(count, addCount(-count, -1))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The referenceCounter is turned into a negative value to indicate (in an idempotent way) that the resource has been marked to be closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

if (localJarFileRefFuture != null && localJarFileRefFuture.isDone()) {
JarFileReference jarFileReference = localJarFileRefFuture.join();
if (jarFileReference.acquire()) {
return consumeSharedJarFile(jarFileReference, jarResource, resource, fileConsumer);
}
closingLocalJarFileRef = true;
Copy link
Contributor

@mariofusco mariofusco Sep 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The acquire failure implies that the reference is already marked to be closed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again

Copy link
Contributor

@mariofusco mariofusco left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The former implementation was designed under the strong assumption that the close method on the JarFileReference was called once and only once. This issue demonstrates that this assumption was no longer true, so the main goal of this pull request is making the close idempotent (or more precisely the markForClosing since this is semantic of that method).

@mariofusco
Copy link
Contributor

@gsmet @geoand This fix has been also already verified by @pcasaes, so I believe that this pull request is ready to be merged.

Copy link
Member

@gsmet gsmet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's merge this to be able to backport it to 3.14.4.

@mariofusco As for the comments, please add them in a follow-up PR, that we will merge and backport later.

@gsmet gsmet merged commit e20deb4 into quarkusio:main Sep 13, 2024
52 checks passed
@quarkus-bot quarkus-bot bot added this to the 3.16 - main milestone Sep 13, 2024
@gsmet
Copy link
Member

gsmet commented Sep 13, 2024

Thanks for taking a look at this.

@gsmet gsmet modified the milestones: 3.16 - main, 3.15.0 Sep 13, 2024
@pcasaes
Copy link
Contributor

pcasaes commented Sep 13, 2024

Nice! Thanks a lot!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/devtools Issues/PR related to maven, gradle, platform and cli tooling/plugins triage/flaky-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants