Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue #8886 - support extensible Resource URI schemes #8888

Merged
merged 10 commits into from
Nov 23, 2022

Conversation

joakime
Copy link
Contributor

@joakime joakime commented Nov 11, 2022

Initial experiment to see what Resource would look like with extensible URI scheme support.

This is to allow the eventual BundleResource implementation in the jetty-osgi packages that respond to a uri schemes like bundleresource://62.fwk1116834242/org/eclipse/jetty/ee9/webapp/webdefault-ee9.xml

@joakime joakime requested review from sbordet, gregw and lorban and removed request for sbordet, gregw and lorban November 11, 2022 16:17
@joakime joakime self-assigned this Nov 11, 2022
@joakime joakime linked an issue Nov 11, 2022 that may be closed by this pull request
ResourceFactory.addResourceFactory("custom", new CustomResourceFactory());
Resource resource = ResourceFactory.root().newResource("custom://foo");
assertThat(resource.getURI(), is(URI.create("custom://foo")));
assertThat(resource.getName(), is("custom-impl"));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shows how the custom uri scheme could be supported.

I see the bootstrap of jetty-osgi registering their ResourceFactory for the bundlresource scheme in it's initialization phase.

RESOURCE_FACTORIES.put("jar", new MountedPathResourceFactory());
PathResourceFactory pathResourceFactory = new PathResourceFactory();
RESOURCE_FACTORIES.put("file", pathResourceFactory);
RESOURCE_FACTORIES.put("jrt", pathResourceFactory);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this extra metadata, mountIfNeeded() could make mounting more generic. Here's some pseudocode to illustrate this:

ResourceFactory rf = RESOURCE_FACTORIES.get(scheme);
if (rf instanceof MountedPathResourceFactory) // or instanceof NewInterfaceThatIsAMountingMarker maybe?
  return FileSystemPool.INSTANCE.mount(uri);
else
  return null;

@@ -176,7 +168,7 @@ public static boolean isSameName(Path pathA, Path pathB)
{
if (!uri.isAbsolute())
throw new IllegalArgumentException("not an absolute uri: " + uri);
if (!bypassAllowedSchemeCheck && !ALLOWED_SCHEMES.contains(uri.getScheme()))
if (!bypassAllowedSchemeCheck && !SUPPORTED_SCHEMES.contains(uri.getScheme()))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I should review the need for bypassAllowedSchemeCheck, it probably doesn't have much purpose left

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the general approach. This would allow us to add back MemoryResource if the need arose in the future.

However, I'm wondering about the static nature of it? Should there be a non static API so that individual resource factories can be configured? Or are they free to use a non static impl anyway already? Perhaps the static methods need to named to make it clear they are default mappings used by the default factory implementations?

I also think it would be good to have a fallback resource factory, that is tried if the scheme is not known. By default this would be a factory that just returns null, but could be replaced by a factory that tried a URL to see if a handler was installed.

Since we know we have at least one usage of URL, we should probably still add a UrlResource impl, but only use it if there is a specific registration or a fallback registration.

@joakime
Copy link
Contributor Author

joakime commented Nov 11, 2022

Since we know we have at least one usage of URL, we should probably still add a UrlResource impl, but only use it if there is a specific registration or a fallback registration.

The OSGi layer's using URL stream handlers are going to have to change with the new JDK releases.
The URL stream handling is changing, significantly enough that the existing OSGI implementations must make changes to satisfy the new JDK reality.
(The OpenJDK folks are addressing the numerous StackOverflowException that the existing URL Stream handling causes, such as changing when a URL scheme is registered, and no longer supporting lazy initialization)

Having a UrlResource serves no purpose in the general sense.
For our jetty-osgi layer, we would need 2 implementations of this new ResourceFactory.

  • ClassicJdkBundleResourceFactory - using the old school URL Stream Handling that OSGi provides on older JDKs
  • ModernBundleResourceFactory - that uses the OSGi bundle behaviors directly, not via the URL Stream Handlers. (this is one way that the OSGi implementations are researching, the other being the FileSystem implementation, both not using URL Stream Handlers).

The jetty-osgi bootstrapping would have to make a call on what ResourceFactory to register/use, presumably based on some combination on OSGi support level, and the JDK runtime version level.

@gregw
Copy link
Contributor

gregw commented Nov 11, 2022

@joakime I don't understand. You say UrlResource "serves no purpose in the general sense", but then go onto say we need a resource factory that supports "old school URL Stream Handling"? So it serves the purpose to support that needed implementation.

Note also that a UrlResource is entirely decoupled from the URL stream handling mechanism. It may well be that stream handlers change in future JVMs, but the URL class will remain in all it's uglyness for sometime. A UrlResource class will continue to work against the URL API even if the handler mechanism underneath is significantly changed.

We have a use-case for it and it is a blocker for releasing jetty-12, so we need to support a UrlResource at least as an option.

@gregw
Copy link
Contributor

gregw commented Nov 11, 2022

Note also, that because URL handling might be changing in future JVMs, then that is PRECISELY why we should be hiding URL behind the Resource abstraction. Rather than have URL specific code scattered around our code base, we should just use Resource. If/when URL mechanism changes in the JVM, then we can adapt in our resource layer and all the code using it will now be able to deal with the new URL mechanism (or something else if that scheme is now handled by something different - like a modern bundle resource factory).

This is exactly why we have our resource abstraction, so we can hide such details and migrations from our code and deal with it in only one place.

@joakime
Copy link
Contributor Author

joakime commented Nov 14, 2022

"serves no purpose in the general sense"

This means in a general URL support sense.
We have support for schemes jar, jrt, and file currently.

We don't want to support all of the schemes that have registered URL Stream Handler's for Resource, that's just dangerous.
This is opening us up to have a log4j style attack.

Here's a small list, as seen during our jetty-osgi test execution (using Java 11)

protocol owner
ftp: JVM
http: JVM
https: JVM
jmod: JVM
jrt: JVM
mailto: JVM
war: Jetty
bundleentry: Eclipse OSGi
bundleresource: Eclipse OSGi
reference: Eclipse OSGi
fake: Felix
classpath: Pax
link: Pax
mvn: Pax
wrap: Pax
activator: Pax
bytes: Plexus (now deprecated)
nested: Eclipse Sisu
bndloader: bndlib

{
try
{
newConnection();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Existence could probably be cached, i.e have a Boolean that is null if never tested and set by the first call to newConnection()... but then we should not worry too much about this class being efficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this class is used for resourcebundle: based URL protocols then we don't want existence cached.
If this class is used for https: then we also don't want the existence cached.

Besides, any kind of caching on our part would just conflict with the URLConnection.setUseCaches(boolean) behaviors.

@gregw
Copy link
Contributor

gregw commented Nov 14, 2022

We don't want to support all of the schemes that have registered URL Stream Handler's for Resource, that's just dangerous

Primarily we don't want to use a URL stream handler for any resource for which we have a better implementation.
But I'm OK with not supporting arbitrary by default, so long as it is easy to define new mappings (and easy in XML/module).

+ CompositeResourceFactory is
  doing what Resource.create(URI) did
  before.
+ CompositeResourceFactory is tracking
  mounts, and allowing the ability
  to report onMounted (useful for RF.ROOT)
+ ResourceFactory.ROOT,
  ResourceFactory.Closable, and
  ResourceFactory.LifeCycle all use
  this new CompositeResourceFactory
+ Is package private
+ Add ResourceFactory.registerResourceFactory(String)
@joakime joakime marked this pull request as ready for review November 21, 2022 20:53
@joakime joakime requested review from lorban and gregw November 21, 2022 20:53
Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, but whilst this PR is needed and holding up other work... I still do not agree with the needless protocol check. It adds no value and is just something that will need to be worked around in future if we ever want a catch-all factory.

}
}

static class CompositeResourceFactory implements ResourceFactory
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is not a public name.... but CompositeResourceFactory is not really correct.
How about DelegatingResourceFactory as it delegates to a another factory.
Or maybe BySchemeResourceFactory as it chooses another factory by scheme.
Or to be absolutely precise DelegatingBySchemeResourceFactory.

Anything to avoid the contentious and misunderstood "Composite"!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how about UriResourceFactory?

private int connectTimeout;
private boolean useCaches;

protected UrlResourceFactory(String protocol)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joakime said:

I'm leaving the check there.

Sorry, but I still don't agree with having the protocol here.

The protocol will be used ....

  • to ensure that an old ResourceFactory instance isn't being reused for a different protocol.

But that is controlled by the Index of scheme to factory. We don't need to encode checks that index works.

  • for output on toString() to indicate the supported protocol.

That is only meaningful in context. In context we'd be dumping the Index with it's full mapping of scheme to factory, so such a dump will now contain the scheme twice: once in the key and then again in the value.

  • for configuring the URLConnection via protocol based system properties (usecaches, connecttimeout, etc).

But is is not?

There's no harm in having multiple UrlResourceFactory implementations, one for each requested protocol, it's a support layer for old school behaviors anyway.

It is not "old school" yet. It is currently the only supported mechanism for several schemes. Besides, "old school" is not an excuse for not doing it right.

Having the protocol in this class prohibits it ever being used as a catch-all. I'm not sure we need a catch-all, but it is conceivable. If we wanted one, then with this needless protocol check, we'd have to iterator over all the URL handlers registered and register a factory for each. Or we could just trust that Map works and do away with a needless check.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • for configuring the URLConnection via protocol based system properties (usecaches, connecttimeout, etc).

But is is not?

Don't follow you on this one?

Having the protocol in this class prohibits it ever being used as a catch-all. I'm not sure we need a catch-all, but it is conceivable. If we wanted one, then with this needless protocol check, we'd have to iterator over all the URL handlers registered and register a factory for each. Or we could just trust that Map works and do away with a needless check.

A catch-all is dangerous, and should never be a solution we implement.
And we have an Index controlling use/access to these implementation already.

Copy link
Contributor

@gregw gregw Nov 22, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • for configuring the URLConnection via protocol based system properties (usecaches, connecttimeout, etc).

But is is not?

Don't follow you on this one?

You don't use it for configuring the "URLConnection via protocol bases system properties". The System properties used are class name based. Not sure we should be using any system properties for configuration anyway!

And we have an Index controlling use/access to these implementation already

Exactly, so we don't need a double check!

Comment on lines 40 to 41
this.connectTimeout = Integer.parseInt(System.getProperty(this.getClass().getName() + ".connectTimeout", "1000"));
this.useCaches = Boolean.parseBoolean(System.getProperty(this.getClass().getName() + ".useCaches", "true"));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not continue using System Properties for configuration. There are setters available.

Copy link
Contributor

@gregw gregw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

@joakime joakime merged commit 84208f9 into jetty-12.0.x Nov 23, 2022
@joakime joakime deleted the fix/jetty-12-extensible-resource-schemes branch November 23, 2022 00:52
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Changes to Resource no longer support custom url schemes.
3 participants