Optimize loading of multirelease JAR files in a WAR #3669

Open
OndroMih opened this issue Feb 17, 2024 · 7 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@OndroMih
Contributor

OndroMih commented Feb 17, 2024

Hi, I noticed that when a WAR file contains a multirelease JAR file, it takes an extremely long time to deploy even a simple application.

Here's an example application:
myapp.war.zip (source code here: jsf-hello-world.zip)

To reproduce, just run the Piranha Web Profile distribution with the app, like this:

java -jar piranha-dist-webprofile-24.2.0.jar --war-file myapp.war

I took a few thread dumps during deployment:
threaddumps.zip

On my computer, it took 96 seconds (about a minute and a half) to deploy the reproducer app. (The deployment actually failed, but that is caused by a classloading issue in my app involving the javax/jakarta package prefixes and is not relevant here.)

@OndroMih
Contributor Author

According to the thread dumps, the source of inefficiency seems to be the MultiReleaseResource class. In the versionedEntry method, it iterates over a list of Java versions, and for each version it tries to read a versioned resource. This happens for every class in any multirelease JAR, because this action is initiated by the Annotation Scanner, which attempts to load all classes in the WAR file.

Most of the time, only one or a few classes have a versioned resource. In my reproducer, the only multirelease JAR is bytebuddy, which contains only module-info.class for Java 9+; no other class is versioned. The current behavior tries to load a class for each Java version from 9 to 21 and falls back to the usual location only if no versioned class is found. This means that for every class in a multirelease JAR, it attempts to load the class 12 times, fails each time, and then loads it on the 13th attempt. With future Java versions, this will only get worse.

I suggest that MultiReleaseResource, on the first attempt to load a resource, scans the contents of the META-INF/versions folder and keeps the list of resources for each Java version in memory. It would then attempt to load a versioned resource only if it exists in the JAR file. It would know exactly which resource to load, either for a specific version or at the default location, and would attempt to load each resource only once.
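A minimal sketch of this idea, under stated assumptions: `VersionedEntryIndex` and its `resolve` method are hypothetical names, not part of Piranha, and the list of entry names is assumed to come from whatever mechanism the underlying resource offers for enumerating its contents. The index is built once; after that, each lookup needs a single map access instead of one probe per Java version.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper (not Piranha API): indexes META-INF/versions entries once. */
class VersionedEntryIndex {

    private static final String PREFIX = "META-INF/versions/";

    // resource name -> (Java version -> full versioned entry name)
    private final Map<String, TreeMap<Integer, String>> index = new HashMap<>();

    /** Builds the index from all entry names of one JAR (assumed to be available). */
    VersionedEntryIndex(Iterable<String> entryNames) {
        for (String name : entryNames) {
            // Skip entries outside META-INF/versions and directory entries.
            if (!name.startsWith(PREFIX) || name.endsWith("/")) {
                continue;
            }
            int slash = name.indexOf('/', PREFIX.length());
            if (slash < 0) {
                continue;
            }
            int version;
            try {
                version = Integer.parseInt(name.substring(PREFIX.length(), slash));
            } catch (NumberFormatException e) {
                continue; // not a version directory
            }
            if (version < 9) {
                continue; // versioned directories below 9 are not used by multi-release JARs
            }
            String resource = name.substring(slash + 1);
            index.computeIfAbsent(resource, k -> new TreeMap<>()).put(version, name);
        }
    }

    /** Returns the entry name to load for the given runtime feature version. */
    String resolve(String resource, int runtimeVersion) {
        TreeMap<Integer, String> versions = index.get(resource);
        if (versions == null) {
            return resource; // never versioned: load from the default location
        }
        // Highest versioned entry that does not exceed the runtime version.
        Map.Entry<Integer, String> best = versions.floorEntry(runtimeVersion);
        return best == null ? resource : best.getValue();
    }
}
```

With an index like this, a call such as `index.resolve("a/B.class", Runtime.version().feature())` would return either the versioned entry name or the original name, so only one read per class is needed.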

@OndroMih
Contributor Author

OndroMih commented Feb 17, 2024

Another optimization is to unpack JAR files into a temp folder and then load resources from that folder. This would optimize time to load resources for any JAR file, even if it's not a multirelease JAR.
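As a rough sketch of what unpacking could look like, using only the JDK: `JarUnpacker` and `unpack` are hypothetical names, and how the resulting directory would be plugged into Piranha's resource handling is left open. Cleanup of the temp directory is not handled here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.Enumeration;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;

/** Hypothetical helper (not Piranha API): extracts a JAR into a fresh temp directory. */
class JarUnpacker {

    static Path unpack(Path jarPath) throws IOException {
        Path target = Files.createTempDirectory("unpacked-jar-").toAbsolutePath().normalize();
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            Enumeration<JarEntry> entries = jar.entries();
            while (entries.hasMoreElements()) {
                JarEntry entry = entries.nextElement();
                Path out = target.resolve(entry.getName()).normalize();
                // Reject entries such as "../x" that would land outside the target directory.
                if (!out.startsWith(target)) {
                    throw new IOException("Entry escapes target directory: " + entry.getName());
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(out);
                } else {
                    Files.createDirectories(out.getParent());
                    try (InputStream in = jar.getInputStream(entry)) {
                        Files.copy(in, out, StandardCopyOption.REPLACE_EXISTING);
                    }
                }
            }
        }
        return target;
    }
}
```

After unpacking, resources could be read with plain file I/O, which avoids repeated lookups and decompression inside the archive. Whether the one-time extraction cost pays off would need to be measured.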

Overall, deployment times are quite slow. Even after I converted the bytebuddy JAR file in my reproducer into a non-multirelease one, it still took about 30 seconds to deploy on my computer (again, the deployment failed because of the javax/jakarta issue in my app).

@OndroMih
Contributor Author

Another optimization, again for any JAR file, could be to use multiple (virtual) threads to read from a JAR file. Each JAR file can be read by a different thread, and even different resources from the same JAR file could be read by a different thread, if each thread opens its own JarFile pointing to the same JAR file (JarFile uses synchronized access so it doesn't help if multiple threads use the same JarFile instance, but opening multiple JarFile instances against the same file for reading is not a problem).

We could use virtual threads and create one virtual thread per classpath resource. However, I'm not sure whether this would be an optimization or whether it would actually be slower, because it would open a JarFile for each classpath resource.
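One way to avoid opening a JarFile per resource is to open one JarFile per worker and split the entries among the workers. The sketch below does that with a fixed pool of platform threads rather than virtual threads, so it also compiles on Java versions before 21; on Java 21 the executor could be swapped for a virtual-thread one. `ParallelJarReader` and `readAll` are hypothetical names, and whether this is actually faster should be measured, since the JDK may still serialize some of the low-level file reads.

```java
import java.io.InputStream;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.jar.JarFile;
import java.util.zip.ZipEntry;

/** Hypothetical helper (not Piranha API): reads entries with one JarFile per worker. */
class ParallelJarReader {

    static Map<String, byte[]> readAll(Path jarPath, List<String> names, int workers)
            throws Exception {
        Map<String, byte[]> result = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int w = 0; w < workers; w++) {
                final int offset = w;
                futures.add(pool.submit(() -> {
                    // Each worker has its own JarFile, so workers do not share one instance's lock.
                    try (JarFile jar = new JarFile(jarPath.toFile())) {
                        // Worker w handles entries w, w + workers, w + 2 * workers, ...
                        for (int i = offset; i < names.size(); i += workers) {
                            String name = names.get(i);
                            ZipEntry entry = jar.getEntry(name);
                            if (entry == null) {
                                continue; // name not present in this JAR
                            }
                            try (InputStream in = jar.getInputStream(entry)) {
                                result.put(name, in.readAllBytes());
                            }
                        }
                    }
                    return null;
                }));
            }
            for (Future<?> future : futures) {
                future.get(); // propagate any failure from a worker
            }
        } finally {
            pool.shutdown();
        }
        return result;
    }
}
```

The number of open JarFile instances is bounded by the number of workers, not by the number of classpath resources.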

mnriem added the enhancement label Feb 17, 2024
@mnriem
Contributor

mnriem commented Feb 17, 2024

@Thihup If you have bandwidth, please have a look. @OndroMih Feel free to come up with a fix also.

@Thihup
Collaborator

Thihup commented Feb 17, 2024

@OndroMih, Thank you for the investigation!

When I implemented the feature, I was already aware of the performance issue (see #1507 (comment)).

If I recall correctly, fixing it was a bit more complex because we don't use a Jar file; instead, we utilize the Resource interface. This allows us to use the MultiReleaseResource as a wrapper around other Resource implementations.

A cache would probably suffice, since, most of the time, there are only a few multi-release classes in a Jar/Resource.

Regarding the use of virtual threads, I believe a virtual thread stays pinned to its carrier thread while it reads a file, so I'm not certain it would optimize the process. However, my understanding of this topic could be incorrect.

@mnriem Currently, I'm using Windows, and I need to check if I can still compile the project. I recall encountering some failing tests. First, I need to fix my setup before fixing this issue.

mnriem added the help wanted label Feb 28, 2024

This issue is stale because it has been open 170 days with no activity. Remove stale label or comment or this will be closed in 10 days

github-actions bot added the stale label Aug 17, 2024
@OndroMih
Contributor Author

I didn't have time to investigate how to fix this. I'd like to get to this when I have some time.

github-actions bot removed the stale label Aug 18, 2024
3 participants