Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java 9: requiring elasticsearch from module-info.java fails with "module not found" #28984

Closed
reda-alaoui opened this issue Mar 12, 2018 · 23 comments
Labels
:Delivery/Build Build or test infrastructure jdk9 Team:Delivery Meta label for Delivery team team-discuss

Comments

@reda-alaoui
Copy link

reda-alaoui commented Mar 12, 2018

Elasticsearch version (bin/elasticsearch --version): 6.2.2

JVM version (java -version): 9.0.4

OS version (uname -a if on a Unix-like system): Ubuntu 17.10

Description of the problem including expected versus actual behavior:

We are migrating our Maven project to Java 9.
In each module of our project, we add a module-info.java.
Everything works fine for every dependencies except elasticsearch.

Requiring it from module-info fails with:

[ERROR] Failed to execute goal org.apache.maven.plugins:maven-compiler-plugin:3.7.0:compile (default-compile) on project foo: Compilation failure
[ERROR] /foo/src/main/java/module-info.java:[10,12] module not found: elasticsearch

Steps to reproduce:

  1. Create basic maven project targeting JDK 9
  2. Add elasticsearch as a dependency
  3. Add a module-info.java with "requires elasticsearch;"
  4. Try to build the project

Analysis:

I debugged maven-compiler-plugin and that lead me to jdk.internal.module.ModulePath.
It fails in jdk.internal.module.ModulePath at line 557:

// parse each service configuration file
        for (String sn : serviceNames) {
            JarEntry entry = jf.getJarEntry(SERVICES_PREFIX + sn);
            List<String> providerClasses = new ArrayList<>();
            try (InputStream in = jf.getInputStream(entry)) {
                BufferedReader reader
                    = new BufferedReader(new InputStreamReader(in, "UTF-8"));
                String cn;
                while ((cn = nextLine(reader)) != null) {
                    if (cn.length() > 0) {
                        String pn = packageName(cn);
                        if (!packages.contains(pn)) {
                            String msg = "Provider class " + cn + " not in module";
                            throw new InvalidModuleDescriptorException(msg); // <-- FAILING HERE
                        }
                        providerClasses.add(cn);
                    }
                }
            }
            if (!providerClasses.isEmpty())
                builder.provides(sn, providerClasses);
        }

The issue comes from the fact that no elasticsearch packages match the following declared Java services:

  • server/src/main/resources/META-INF/services/org.apache.lucene.codecs.Codec
  • server/src/main/resources/META-INF/services/org.apache.lucene.codecs.PostingsFormat
  • server/src/main/resources/META-INF/services/org.apache.lucene.codecs.DocValuesFormat

Those service declarations either need to be moved to another project, or a fake class should be added to org.apache.lucene.codecs.

@reda-alaoui reda-alaoui changed the title Java 9: requires elasticsearch is failing Java 9: requires elasticsearch from module-info.java fails Mar 12, 2018
@reda-alaoui reda-alaoui changed the title Java 9: requires elasticsearch from module-info.java fails Java 9: requiring elasticsearch from module-info.java fails Mar 12, 2018
@rjernst
Copy link
Member

rjernst commented Mar 13, 2018

no elasticsearch packages match the following declared Java services

Can you elaborate on this? I think Codec and DocValuesFormat are leftover from long ago when we had custom versions of these in ES. PostingsFormat contains Completion50PostingsFormat which is still used. Is the issue just that Codec and DocValuesFormat are empty (we can remove them)?

@reda-alaoui
Copy link
Author

I added a Gradle test module in the PR, but I couldn't reproduce my maven project behaviour.
Maybe it is just a Maven issue.

@rjernst,
I didn't see that both of them were empty.
Still, the issue is also caused by org.apache.lucene.codecs.PostingsFormat for which I had to add 2 fake classes:

This is really weird.
I am creating a git repository with a Maven project reproducing the issue.

@reda-alaoui
Copy link
Author

I created https://github.com/Cosium/jigsaw-elasticsearch which allows to reproduce the issue.

@reda-alaoui
Copy link
Author

And a ticket in Maven compiler tracker https://issues.apache.org/jira/browse/MCOMPILER-328

@reda-alaoui
Copy link
Author

https://issues.apache.org/jira/browse/MCOMPILER-328?focusedCommentId=16398365&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16398365 confirms that Maven get the automodule name directly from the JDK.

That would mean that Gradle computes the module name itself, which is more brittle.

@reda-alaoui
Copy link
Author

reda-alaoui commented Mar 14, 2018

I created a Gradle equivalent:
https://github.com/Cosium/jigsaw-elasticsearch-gradle
I reproduce the same issue as Maven.

Therefore:

@reda-alaoui reda-alaoui changed the title Java 9: requiring elasticsearch from module-info.java fails Java 9: requiring elasticsearch from module-info.java fails with "module not found" Mar 14, 2018
@jasontedor jasontedor added the :Delivery/Build Build or test infrastructure label Mar 14, 2018
@elasticmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra

@reda-alaoui
Copy link
Author

reda-alaoui commented Mar 15, 2018

After much debugging !

Classpath vs Modulepath in Java 9

The modulepath should contain all modular jars. That includes jars containing well defined module-info.java and jar eligible to automodule.
Elasticsearch is not currently eligible to automodule because it violates some Jigsaw encapsulation about Java services.

A jar in the module path can't access a jar in the classpath.

Maven behaviour

For each jar:

  • Maven provides the jar to JDK ModuleFinder
  • If the ModuleFinder considers the Jar as a module, Maven adds the Jar to --module-path
  • If the ModuleFinder does not consider the Jar as a module (i.e. elasticsearch), Maven adds it to -classpath

With elasticsearch as unique dependency:

[DEBUG] Command line options:
[DEBUG] -d /srv/git/cosium/jigsaw-elasticsearch/target/classes \
-classpath /srv/git/cosium/jigsaw-elasticsearch/target/classes:/home/rhousni/.m2/repository/org/elasticsearch/elasticsearch/6.2.2/elasticsearch-6.2.2.jar: \
-sourcepath /srv/git/cosium/jigsaw-elasticsearch/src/main/java:/srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations: \
-s /srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations \
-g -verbose -nowarn -target 9 -source 9

With commons-io as unique dependency:

[DEBUG] Command line options:
[DEBUG] -d /srv/git/cosium/jigsaw-elasticsearch/target/classes \
-classpath /srv/git/cosium/jigsaw-elasticsearch/target/classes: \
--module-path /home/rhousni/.m2/repository/commons-io/commons-io/2.4/commons-io-2.4.jar: \
-sourcepath /srv/git/cosium/jigsaw-elasticsearch/src/main/java:/srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations: \
-s /srv/git/cosium/jigsaw-elasticsearch/target/generated-sources/annotations \
-g -verbose -nowarn -target 9 -source 9

Gradle behaviour

Gradle doesn't make any kind of jigsaw module autodetection.
By default, even in Java 9, Gradle will put all dependencies into the classpath.
This is confirmed by the tutorial that expects us to populate the --module-path and override -classpath manually.

The experimental jigsaw plugin, seems to do the same thing:
https://github.com/gradle/gradle-java-modules/blob/master/src/main/java/org/gradle/java/JigsawPlugin.java

Conclusion

The jar built by my project being defined by a module-info.java, it must resides in --module-path. Since elasticsearch fails the JDK automodule selection phase during Maven build, elasticsearch ends in -classpath, making it unreachable by my jar.

IMO, there is no point in trying to reproduce the current issue with a dedicated Gradle module, since it entirely depends on the way the end user will configure Gradle. Besides, it is unlikely that Gradle user is aware of the current issue since he will probably never use the JDK api allowing to select automodules.

So the fix seems to be the good one to me.
The question seems to be what kind of automatic test could detect this kind of defect.

I think it would be easier to have a Maven module, since Maven has the correct behaviour.
What do you think?

@reda-alaoui
Copy link
Author

Ping @rjernst , @jasontedor

@reda-alaoui
Copy link
Author

Ping @rjernst, @jasontedor

@nipafx
Copy link

nipafx commented Jun 26, 2018

I don't understand how this is a >non-issue. On the contrary, this is a serious issue for anyone who wants to use Elasticsearch with the module system and I think it should be investigated.

@jasontedor
Copy link
Member

@nicolaiparlog The label was a mislabeling and I have removed it.

@reda-alaoui
Copy link
Author

@jasontedor the label is still on the linked PR

@tomdw
Copy link

tomdw commented Aug 10, 2018

@jasontedor any idea on when this will be fixed/included in a release? thx!

@uschindler
Copy link
Contributor

You can only fix this by making Elasticsearch a real module. Automodules don't work for Elasticsearch. There are several problems: Elasticsearch contains classes from a package it does not own (org.apache.lucene) and also defines SPIs for them. This needs to be exposed via module-info.java. But this won't still not work, as Lucene is not module conformant (for the same reasons).

@tomdw
Copy link

tomdw commented Sep 5, 2018

@uschindler as the discussion in this issue shows, there is a fix available, it only has to be merged and released. So converting to an explicit module is not needed yet. Putting an automatic module name in the manifest does no harm, on the contrary, it makes sure that elasticsearch reserves its module name.

@uschindler
Copy link
Contributor

The problem is that it won't still not work at runtime, because the SPI won't load. The good thing is that Elasticsearch does no longer implement SPIs (Codecs). And if you just use the Elasticsearch client, you won't run into trouble, because Lucene is just a dependency, but it's actually not used (only for classloading).

@uschindler
Copy link
Contributor

So in short there are multiple problems: The empty/missing META-INF/services are a bug in Elasticsearch. This violates the spec (also for older Java versions, but nothing cares). Once you have fixed this and added an automatic module name, it compiles.
But this does not mean that it works at the end. Lucene unfortunately is defining the same packages in several modules, so at runtime this would cause issues with automatic modules. There is no fix for this at the moment and there won't be one for longer time (this would require a complete restructuring of Lucene). In addition, the SPI infrastructure of Lucene needs an update to be Java 9 compliant, which is also not yet done.
If you use Elasticsearch at the moment just as a client, then it could be fine also at runtime, but that's just because Java does not fully verify the module requirements/corectness at the moment.

@tomdw
Copy link

tomdw commented Sep 6, 2018

@uschindler thx for the clarification. I do think that if this gets fixed for elasticsearch it can be deployed as automatic module and we can keep lucene on the classpath for now. (automatic modules can read from the classpath) That way at least explicit modules can use the elasticsearch automatic module, and lucene remains on the classpath until a redesign has been done.

@tomdw
Copy link

tomdw commented Jan 12, 2019

@jasontedor can someone prioritise this issue so that the elasticsearch jar is at least compatible with jdk specs? i.e. including META-INF/services files for classes and packages that don't exist in the jar is just wrong.

Then the community can start using elasticsearch on the module path of the jdk.

Thx in advance.

@alpar-t
Copy link
Contributor

alpar-t commented Feb 1, 2019

I did some tests on the current master. Since we don't support elasticsearch, I think that the us-case that triggers this is being able to write a modularized application that talks to ES via one of the clients.
Since ES doesn't produce modules yet, one can only use automatic modules.
the approach I took was to start with this modules-info.java:

module org.elasticsearch.qa { 
   requires elasticsearch.rest.high.level.client;
}

And just a class to test the communication:

package org.elasticsearch.qa;

import org.apache.http.HttpHost;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.client.RequestOptions;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

public class SmokeTestHighLevelRestClient {

    public static void main(String [] args) {
        var client = new RestHighLevelClient(
            RestClient.builder(new HttpHost("localhost", 9200, "http"))
        );
        var searchRequest = new SearchRequest();
        var searchSourceBuilder = new SearchSourceBuilder();
        searchSourceBuilder.query(QueryBuilders.matchAllQuery());
        searchRequest.source(searchSourceBuilder);
        client.search(searchRequest, RequestOptions.DEFAULT);
    }
}

Note that this now requires to have more requires in modules-info.java:

module org.elasticsearch.qa { 
   requires elasticsearch.rest.high.level.client;
   requires elasticsearch;
   requires httpcore;
}

We have to tell Gradle to put these jars on the module path and make them automatic modules.

configurations {
    modulePath
}

dependencies {
    modulePath "org.elasticsearch.client:elasticsearch-rest-client:$version"
}

configurations {
    modulePath
    compile.extendsFrom modulePath
}

dependencies {
    modulePath "org.elasticsearch.client:elasticsearch-rest-client:$version"  {
       exclude group: " org.apache.lucene"
   }
}

Like @uschindler mentions, elasticsearch has classes in lucene packages and lucene itself has packages spread across modules.
The lather wouldn't be a problem for ES at least not with automatic modules as we could keep Lucene on the classpath only as automatic modules are allowed to access it and it's used for class-loading.
Elasticsearch itself however has the same problem as lucene, it uses the same package in different modules, and fixing that would require shuffling packages around and breaking changes. E.x. both the elasticsearch and high-level-rest-client jar have org.elasticsearch.client.

Probably the best path to be able to use the high level rest client in a modular application is to remove elasticsearch(server) as a dependency, and reshuffle packages so it doesn't have any in common with the low level rest client. If the dependencies allow it we could potentially make it into a real module.

I was not able to replicate the issues with the empty service files just by requiring elasticsearch from a modules-info. I haven't checked what changed, but the linked PR that adds a reproduction is old. I tried against latest master.

module org.elasticsearch.qa {
   requires elasticsearch;
}

With this Gradle setup

onfigurations {
    modulePath {
        transitive false
    }
}

dependencies {
    modulePath "org.elasticsearch:elasticsearch:$version"
}

compileJava {
    doFirst {
        options.compilerArgs += [
            '--module-path', configurations.modulePath.asPath,
        ]
    }
}

I'm not saying we shouldn't remove the empty service files. I'll create a PR for that, no need to keep them around, but I don't see how this setup could be useful.
It doesn't pick up transitive dependencies so it can't do anything usefully, and picking up transitive dependencies on the module path breaks it.

To summarize I would like to move forward with the following:

  • open a PR just to remove empty service files
  • close this issue and the linked PR
  • submit a new issue to make the high level rest client into a module and take the discussion there for clarity. Note the the effort required to do so is significant and involves breaking changes that can only be done in major versions.

@jasontedor @rjernst thoughts ?

Right now the best bet for modular applications is to use the low level rest client, or keep all the elasticsearch jars on the classpath and have all ES specific code localized in an automated module in their application to be able to access them. This would allow the rest of the application to benefit from modularization without requiring elasticsearch to change.

@reda-alaoui
Copy link
Author

reda-alaoui commented Feb 1, 2019

I was not able to replicate the issues with the empty service files just by requiring elasticsearch from a modules-info

As mentionned in the previous messages, if you did that with Gradle it's normal.
Try with Maven.

@alpar-t
Copy link
Contributor

alpar-t commented Feb 4, 2019

Closing in favor of #38299

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Delivery/Build Build or test infrastructure jdk9 Team:Delivery Meta label for Delivery team team-discuss
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants