Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #10164 - improve MetaInfConfiguration use of Resource objects #10171

Closed
wants to merge 16 commits into from

Conversation

joakime
Copy link
Contributor

@joakime joakime commented Jul 28, 2023

Replace the use of ResourceFactory.newResource(URI) with a simple Resource.resolve(String) where appropriate.

Remove unused methods.
Align ee10 and ee9 versions of MetaInfConfiguration behaviors with regards to Resource APIs.

@joakime joakime marked this pull request as draft July 28, 2023 21:21
@joakime joakime marked this pull request as ready for review July 28, 2023 21:56
@joakime joakime marked this pull request as draft July 29, 2023 17:49
FileID.isJavaArchive(resource.getURI()) &&
!"jar".equals(resource.getURI().getScheme()))
{
return context.getResourceFactory().newJarFileResource(resource.getURI());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're only using the context to get the ResourceFactory, so you should pass the ResourceFactory as a parameter instead of the context.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do this? Why can't we just trust that the Resource will be the right type of resource, and that it will open the jar when and if it needs to? Why does the user of the Resource suddenly have to be aware of the internal implementation of Resource and JarFileResource?

@@ -767,6 +714,17 @@ protected List<Resource> findExtraClasspathDirs(WebAppContext context)
.collect(Collectors.toList());
}

private Resource ensureJarsAreOpened(WebAppContext context, Resource resource)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this method's name. At the very least, it should have a singular name as it only deals with one resource.

@BeforeEach
public void beforeEach()
{
assertThat(FileSystemPool.INSTANCE.mounts(), empty());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍
more of these == better

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this static call thread safe?
As tests from other classes are running in parallel I remember having random failures with this when I implemented parallel tests.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@olamy Tests including this assertion cannot be parallelized. Is there a way to hint that with an annotation or something?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Class level annotation @Isolated from import org.junit.jupiter.api.parallel.Isolated

@@ -387,10 +361,11 @@ public void scanJars(final WebAppContext context, Collection<Resource> jars, boo
public void scanForResources(WebAppContext context, Resource target, ConcurrentHashMap<Resource, Resource> cache)
{
// Resource target does not exist
if (target == null)
if (Resources.missing(target))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should also check that target is a directory now that you ensure jars are opened, shouldn't you? And javadoc it?

@@ -450,7 +416,7 @@ else if (LOG.isDebugEnabled())

private static boolean isEmptyResource(Resource resourcesDir)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should probably be inlined since it has an equivalent helper.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

plus the name doesn't at all match what the method actually tests. It returns true if the resource is not a directory or is not readable... so how is that empty?

* @return the list of tlds found
* @throws IOException if unable to scan the directory
*/
public Collection<URL> getTlds(Path dir) throws IOException
private Collection<URL> getTlds(Resource res) throws IOException
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The logic of this method changed: it doesn't walk the entire jar anymore but only lists the contents of META-INF. We should check the spec to make sure this is correct, and document this behavior in the javadoc.

if (Resources.isReadableDirectory(webInf))
{
Resource webInfLib = webInf.resolve("lib");
if (webInf == null || !webInf.exists() || !webInf.isDirectory())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you should keep the isReadableDirectory() check instead of exploding/slightly changing it.

@@ -777,6 +714,17 @@ protected List<Resource> findExtraClasspathDirs(WebAppContext context)
.collect(Collectors.toList());
}

private Resource ensureJarsAreOpened(WebAppContext context, Resource resource)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto EE10: I don't like this name.

* @return the list of tlds found
* @throws IOException if unable to scan the directory
*/
private Collection<URL> getTlds(Path dir) throws IOException
private Collection<URL> getTlds(Resource res) throws IOException
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto EE9: you changed the logic, so we should check if it's valid, and if so, change the javadoc accordingly.

resourcesDir = resourceFactory.newResource(URIUtil.uriJarPrefix(uri, "!/META-INF/resources"));
}

resourcesDir = target.resolve("META-INF/resources");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This requires target to be a directory, this should be asserted and javadoc'ed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still need the if (target.isDirectory()) don't you?

Copy link
Contributor

@janbartel janbartel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR seems to be now requiring more typing of Resource rather than less. Also, I thought the task here was to somehow obtain these META-INF resources/tlds/fragments from the classpath, rather than requiring us to explicitly open jars.

resourcesDir = resourceFactory.newResource(URIUtil.uriJarPrefix(uri, "!/META-INF/resources"));
}

resourcesDir = target.resolve("META-INF/resources");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You still need the if (target.isDirectory()) don't you?

webFrag = resourceFactory.newResource(URIUtil.uriJarPrefix(uri, "!/META-INF/web-fragment.xml"));
}

webFrag = jar.resolve("META-INF/web-fragment.xml");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, don't you need to test first if it is a dir or a jar?

FileID.isJavaArchive(resource.getURI()) &&
!"jar".equals(resource.getURI().getScheme()))
{
return context.getResourceFactory().newJarFileResource(resource.getURI());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to do this? Why can't we just trust that the Resource will be the right type of resource, and that it will open the jar when and if it needs to? Why does the user of the Resource suddenly have to be aware of the internal implementation of Resource and JarFileResource?

@@ -285,28 +285,33 @@ private List<Resource> getBundleAsResource(ResourceFactory resourceFactory, Bund
File file = BundleFileLocatorHelperFactory.getFactory().getHelper().getBundleInstallLocation(bundle);
if (file.isDirectory())
{
// Add directory bundle as a directory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't add the directory in which the bundles are stored as a Resource, we just want the individual bundles.

}
}
}
}
resources.add(resourceFactory.newResource(file.toPath())); //TODO really???
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I doubt we need the directory that is the bundle installation directory as a Resource.

}
else
{
resources.add(resourceFactory.newResource(file.toPath()));
// Treat bundle as jar file that needs to be opened (so that resources within it can be found)
resources.add(resourceFactory.newJarFileResource(file.toPath().toUri()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we explicitly having to use JarFileResource?

@@ -34,7 +34,7 @@ public class WebAppDefaultServletCacheTest
private Path resourcePath;

@BeforeEach
protected void setUp() throws Exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this change directly relevant to this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved this to PR #10180, once that's merged, this one will go away here.

@@ -48,7 +48,7 @@ protected void setUp() throws Exception
}

@AfterEach
protected void tearDown() throws Exception
public void tearDown() throws Exception
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See previous comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've moved this to PR #10180 (same as above), once that's merged, this one will go away here.

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the structure of much of the existing code difficult to understand... and whilst this PR is probably going in the right direction, I think it makes it even more difficult to understand.

findAndFilterContainerPaths is a good example (at being bad):

The addContainerResource variable does nothing for clarity, as it puts action before the iteration. Just inlining this variable would help with clarity, but as we are streaming anyway, then I think we'd do better to split up that consumer into individual maps and filters and embrace the stream to clearly show the steps taken to process the list

The code should be something like:

  getAllContainerJars(context).stream()
      .filter(new UriPatternPredicate(pattern, false))
      .map(resourceFactory::newResource)
      .filter(Objects::nonNull)
      .filter(Resource::exists)
      .map(MetaInfConfiguration::toDirectory) // if jar open, otherwise leave as is
      .filter(Resource::isDirectory) // probably not needed?
      .forEach(context.getMetaData()::addContentResource);

Edit: the toDirectory method is rename of ensureJarsAreOpened

@joakime
Copy link
Contributor Author

joakime commented Jul 31, 2023

To address some of the questions from @janbartel ...

A Jar file is different than an open Jar.

A Resource pointing to the file file://path/to/foo.jar is different than one where we are walking the contents of it via jar:file://path/to/foo.jar!/
This distinction has been present since the beginning.
While the majority of cases in code we need the open Jar, we do occasionally need the Jar file (eg: when we copy the WEB-INF/lib/foo.jar to ${webapp.tempdir}/foo.jar to support META-INF/resources)

Then there's the nuance of where we can be deep in a WAR (jar:file://opt/mybase/webapps/foo.war!/WEB-INF/lib/foo.jar) which is a Jar file reference inside of a war.

The purpose of this PR is to reduce the constant ResourceFactory.newResource() usages in favor of a simple Resource.resolve(String) where appropriate.
That means, in code, we work with open Jar files.
We don't have code that supported auto-opening of JAR files, but that can be a solution too. (I'll experiment, probably in a different branch).

This PR is intended to be a first-pass to cleanup our ResourceFactory usages in a sane way (so we have reference counting, and the dump is more sane too).
A future PR will cleanup our open Resource's too, if the WebApp Configuration is done and the Resource does not need to be open for the running of the webapp, then close it before starting the webapp.

@janbartel
Copy link
Contributor

I think If you can rename the 'ensureOpenJars' method to something that indicates what it is really doing, namely creating a directory resource, then things will be a lot clearer.

@gregw
Copy link
Contributor

gregw commented Aug 1, 2023

I think If you can rename the 'ensureOpenJars' method to something that indicates what it is really doing, namely creating a directory resource, then things will be a lot clearer.

I think toDirectory works if it is used with a stream map, as you map to directory (if already a directory then return it, if it can be mounted as a directory mount it, otherwise return null). In fact could probably simplify the suggestion above to:

  getAllContainerJars(context).stream()
      .filter(new UriPatternPredicate(pattern, false))
      .map(resourceFactory::newResource)
      .map(MetaInfConfiguration::toDirectory)
      .filter(Objects::nonNull)
      .forEach(context.getMetaData()::addContentResource);

@gregw
Copy link
Contributor

gregw commented Aug 1, 2023

The purpose of this PR is to reduce the constant ResourceFactory.newResource() usages in favor of a simple Resource.resolve(String) where appropriate.

That is a good goal and it is what this PR does.... but it took Jan and me half the width of Italy on a very fast train to work it out. I think the problem is, start with a confusing spec, implement in java 1.5, let go stale, then partially update to streams, then change resource abstraction a few times, then add a poorly named method in this PR, and it is all very confusing.

So I think we both like what this PR is doing, but perhaps not where it is doing it. Hence my suggestion to clean up the whole class and fully embrace the stream approach so that each phase of the stream well documents the transformations needed.

@joakime
Copy link
Contributor Author

joakime commented Aug 3, 2023

I think If you can rename the 'ensureOpenJars' method to something that indicates what it is really doing, namely creating a directory resource, then things will be a lot clearer.

I think toDirectory works if it is used with a stream map, as you map to directory (if already a directory then return it, if it can be mounted as a directory mount it, otherwise return null). In fact could probably simplify the suggestion above to:

  getAllContainerJars(context).stream()
      .filter(new UriPatternPredicate(pattern, false))
      .map(resourceFactory::newResource)
      .map(MetaInfConfiguration::toDirectory)
      .filter(Objects::nonNull)
      .forEach(context.getMetaData()::addContentResource);

I like it, and I'm working towards this approach (in a different branch ATM)

@joakime joakime closed this Oct 11, 2023
@joakime joakime deleted the fix/12.0.x/less-metainf-refcounts branch October 25, 2023 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
No open projects
Status: ✅ Done
Development

Successfully merging this pull request may close these issues.

Needless META-INF/resources and web-fragment.xml mounts
5 participants