Issues related to boot module layer synthesizing with closed-world approach #3733

ivan-ristovic · 2021-08-27T12:02:55Z

Describe the issue

NI boot module layer currently only contains modules which contents are reachable after analysis. This behavior follows the closed-world assumption and is naturally translated to module system as well. However, in the case of modules, it can in some cases cause unintended results. This issue shows a simple example, with the goal to open a discussion about potential changes in the current closed-world approach when it comes to modules.

Describe GraalVM and your environment:

This example assumes that graal repository is even with latest master, or that it is based on a commit after the merge of [GR-30957] Implement Module/ModuleLayer substitutions #3445

Steps to reproduce the issue

A simple example can be a modularized application with two modules - one of which will be a "library" module exporting a package that will be used by the "main" module.

The "library" module can be defined as:

module core.app {
        exports core.util;
}

It contains a class core.util.WorkerUtil with the following definition:

package core.util;

public class WorkerUtil {
        public static void doSomething() {
                System.out.println("WorkerUtil working...");
        }
}

The "main" module can be defined as:

module main.app {
        requires core.app;
}

Our main method, contained in main.app module, can be defined as:

package example.app;

import java.lang.ModuleLayer;
import core.util.WorkerUtil;

public class AppMain {

        // This function can also be in another module
        public static void useLibrary() {
                WorkerUtil.doSomething();
        }
        
        public static void main(String[] args) {
                useLibrary();
                assert ModuleLayer.boot().modules()
                        .stream()
                        .anyMatch(m -> m.getName().equals("core.app"));
        }
}

We can build an image of this application (in this example all jar files that are created by javac are located in pkg directory):

$ USE_NATIVE_IMAGE_JAVA_PLATFORM_MODULE_SYSTEM=true native-image -ea -p pkg -m main.app/example.app.AppMain

We can now verify that the assertion will hold by running the image. If we do not invoke the useLibrary() method, the entire core.app module will not be included in the image, as types from that module are not used. Re-running native-image in the same way as above (however this time not invoking useLibrary() method), and invoking the generated executable will lead to an AssertionError:

Exception in thread "main" java.lang.AssertionError
	at example.app.AppMain.main(AppMain.java:14)

This way modules can unintentionally break dependant modules simply by removing dependencies.

More details

For this example, the size of an executable which only contains only reachable modules (14 in this case): 11481768 B (~11M)

For this example, the size of an executable which contains all modules on the module path (26 in this case): 11485864 B (~11M)

From here, we can see that the difference is 4096 B, which averages to ~341 B per module (this value correlates to the number of packages contained in the module).

The text was updated successfully, but these errors were encountered:

olpaw · 2021-08-27T12:51:20Z

We can now verify that the assertion will hold by running the image. If we do not invoke the useLibrary() method, the entire core.app module will not be included in the image, as types from that module are not used. Re-running native-image in the same way as above (however this time not invoking useLibrary() method), and invoking the generated executable will lead to an AssertionError

IMO, this is a feature and not a bug.

It is equally simple to construct a trivial sample application that uses reflection that passes on JVM but fails as native-image (if no configuration is provided during image build).

The question is what compromise we are more willing to present to our users:

Always show all modules in the bootlayer even if no classes are needed from any of those modules. I.e. there could be images with lots of modules in the bootlayer that are all empty. This would be hard to explain to users.
Only show modules in the bootlayer of our actually reachable classes. For this behaviour we can provide the user with the same explanation that we already have for e.g. reflection, proxies, resources, ... that this is a result of us building the image under closed world assumption and that we are only putting into the image what can can seen as reachable by the static analysis plus image configuration.

olpaw · 2021-08-27T13:08:07Z

Btw, comparing the output of

public class HelloJava {
    public static void main(String[] args) {
        System.out.println(java.util.Arrays.toString(Package.getPackages()));
    } 
}

when running on the JVM compared to running as native-image gives very different results.

Here also the output of Package.getPackages() is driven by the analysis results as it is now also implemented for the boot module-layer seen at image runtime. I.e. the current boot module layer synthesizing with closed-world approach is consistent with how we handle things elsewhere.

christianwimmer · 2021-08-27T18:01:07Z

We certainly do not want to include completely unused modules in the image heap, neither from the module path nor from the JDK. Why should a Hello World contain module information about, e.g., the java.sql module.

Now the question is: when is a module "used". Clearly, a module is used when at least one class of that module is included. But modules also have dependencies. If I understand the module system correctly, then module dependencies are an acyclic graph. So I think when a module is used, also all of its dependencies should be in the image heap. So in the example above, main.app requires core.app and therefore the module information of core.app would be in the image heap.

But if you have another module core.second that is just added to the module path on the command line, but then no class of that module is reachable, then it would not be included in the image heap.

There are also optional module dependencies (requires static) - so what to do with those? Probably not include automatically.

What to do with the "unnamed" module, which automatically depends on all named modules? Not include all dependencies of course (because that would result in all modules being always included).

vjovanov · 2021-08-30T14:31:04Z

We certainly do not want to include completely unused modules in the image heap, neither from the module path nor from the JDK.

Why not? The overhead is below one page in the image heap.

Why should a Hello World contain module information about, e.g., the java.sql module.

Because it does that when one runs "Hello, World!" with java.

I am asking because this would mean that we need an additional config file, agent changes, etc, for the code above to work as expected. Given the low overhead of including modules into the heap, I would argue that it is not worth complicating config further.

christianwimmer · 2021-08-30T17:27:53Z

As I said, the example above (where there is a direct module dependency) needs to include the "library" module. So that then works "as expected".

And for the case of JDK modules and application modules that are just passed in on the command line and then not used: here you need to define what "expected" means. Clearly, we also do not preserve the list of .jar files passed in on the class path. If you expect that every class, package, or module that you specified on the command line is available at run time, then you cannot use a closed-world assumption.

And no, we do not need an additional configuration file for modules. When you want a module to be available at runtime, register one class of it (like the module-info class) for reflection.

vjovanov · 2021-08-31T12:04:25Z

The example above would not work if we are looking for any module from the JDK. But then, for such code, one should state in the module-info.java that the given JDK module is required.

I am OK with the proposal above given that the removal of completely unused modules can be also achieved with jlink.

olpaw · 2021-08-31T12:28:54Z

given that the removal of completely unused modules can be also achieved with jlink

Good argument!

A real-world application that relies on seeing unused modules at runtime is bogus/fragile for exactly that reason.

ivan-ristovic · 2021-08-31T15:01:28Z

Thanks for the feedback! This can be assigned to me, I will enhance the boot layer module set to include required modules as well.

ivan-ristovic added bug native-image labels Aug 27, 2021

ivan-ristovic changed the title ~~Issues related to module layer synthesizing with closed-world approach~~ Issues related to boot module layer synthesizing with closed-world approach Aug 27, 2021

vjovanov assigned christianwimmer, vjovanov and olpaw Aug 27, 2021

vjovanov unassigned christianwimmer Aug 31, 2021

olpaw assigned ivan-ristovic and unassigned vjovanov and olpaw Aug 31, 2021

ivan-ristovic mentioned this issue Sep 20, 2021

[GR-34108] Extend boot module layer to include required modules #3821

Merged

olpaw linked a pull request Sep 30, 2021 that will close this issue

[GR-34108] Extend boot module layer to include required modules #3821

Merged

olpaw added this to the 22.0 milestone Sep 30, 2021

graalvmbot closed this as completed in #3821 Oct 30, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Issues related to boot module layer synthesizing with closed-world approach #3733

Issues related to boot module layer synthesizing with closed-world approach #3733

ivan-ristovic commented Aug 27, 2021 •

edited

Loading

olpaw commented Aug 27, 2021

olpaw commented Aug 27, 2021 •

edited

Loading

christianwimmer commented Aug 27, 2021

vjovanov commented Aug 30, 2021 •

edited

Loading

christianwimmer commented Aug 30, 2021

vjovanov commented Aug 31, 2021

olpaw commented Aug 31, 2021

ivan-ristovic commented Aug 31, 2021

Issues related to boot module layer synthesizing with closed-world approach #3733

Issues related to boot module layer synthesizing with closed-world approach #3733

Comments

ivan-ristovic commented Aug 27, 2021 • edited Loading

Describe the issue

Describe GraalVM and your environment:

Steps to reproduce the issue

More details

olpaw commented Aug 27, 2021

olpaw commented Aug 27, 2021 • edited Loading

christianwimmer commented Aug 27, 2021

vjovanov commented Aug 30, 2021 • edited Loading

christianwimmer commented Aug 30, 2021

vjovanov commented Aug 31, 2021

olpaw commented Aug 31, 2021

ivan-ristovic commented Aug 31, 2021

ivan-ristovic commented Aug 27, 2021 •

edited

Loading

olpaw commented Aug 27, 2021 •

edited

Loading

vjovanov commented Aug 30, 2021 •

edited

Loading