Stable Analysis Plugin API #88980

Open
15 of 25 tasks
pgomulka opened this issue Aug 1, 2022 · 1 comment
Labels
:Core/Infra/Plugins Plugin API and infrastructure >enhancement Meta Team:Core/Infra Meta label for core/infra team

Comments

@pgomulka
Contributor

pgomulka commented Aug 1, 2022

Description

This is a meta issue tracking subtasks to introduce a new api for analysis plugins.

The analysis plugin api will offer a stable api upon which a plugin developer can depend. Plugins using the new api will no longer need an elasticsearch:server dependency, and will therefore be less prone to breakage whenever ES server internals change.

The API is derived from org.elasticsearch.index.analysis and contains the minimum set of interfaces and methods needed to implement an analysis plugin.

Goals:

  1. Design a minimum set of interfaces to allow for analysis plugin development
  2. Forbid the server dependency so that plugins are not affected by changes in server internals

Phase 0 - allows implementing a plugin with no settings

  1. Design a minimum set of interfaces and methods which are a subset of the current analysis api. This task should create new gradle modules which will later be used as dependencies by plugin developers. The api will also contain annotations which allow analysis components (e.g. CharTokenFilterFactory, TokenizerFactory) to be found. The AnalysisPlugin interface will no longer be needed. Related: Stable Plugin API module and analysis interfaces #88775; Fix module and package names for stable plugin api #89772; Rename NamedComponent name parameter to value #91306
  1. Search for analysis components. This is currently done with the AnalysisPlugin interface, whose getters return a Map of names to AnalysisProvider. The new stable plugins will no longer need to implement the AnalysisPlugin interface. The analysis components annotated with @NamedComponent will be scanned upon install. As a result of the scan at install time, a cache file will be created describing which components are exported by a plugin (to be done in a separate task). This task loads the named_components.json and extensibles.json files and stores that information in StablePluginRegistry. Related: [Stable plugin API] Load plugin named components #89969
  2. Analysis component registration and creation. The analysis components implementing the new api will have to be wrapped in the old analysis interfaces to allow the existing code to work. Registration is currently done in the AnalysisModule#setup* methods. Ideally the analysis component classes are only loaded when first used. Related: Register stable plugins in ActionModule #90067
  3. An example plugin to be built, showing how the api can be used. Related: Example stable plugin #90805
  4. Plugin loading - a new stable plugin classloader to be implemented, preventing plugins from accessing server components. This classloader will load a group of unmodularized jars into a synthetic module, that is, as if all the classes in the jars were part of a single module.
  1. Plugin descriptor - the new plugin descriptor does not require an AnalysisPlugin class and contains an isModular property. Related: Add support for reading stable plugin descriptors #88731 and Add stable indicator for plugin descriptor #88823
  2. Gradle ES stable plugin - a build plugin that creates stable-plugin-descriptor.properties and named_components.json. Related: Create gradle plugin for ES stable plugins #90355
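
A rough sketch of the "only created when first used" idea from the registration step, assuming a registry keyed by extensible interface and component name. All names here (TokenFilterFactory as declared below, LowercaseDemoFilterFactory, LazyComponentRegistry) are hypothetical stand-ins written for this illustration; they are not the classes from the stable api or from StablePluginRegistry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an interface exported by the stable analysis api.
interface TokenFilterFactory {
    String describe();
}

// Hypothetical plugin component; a real one would be annotated with @NamedComponent.
class LowercaseDemoFilterFactory implements TokenFilterFactory {
    static int created = 0; // counts instantiations so the laziness is observable

    LowercaseDemoFilterFactory() {
        created++;
    }

    @Override
    public String describe() {
        return "lowercase_demo";
    }
}

// Illustrative registry: extensible interface name -> component name -> class name.
// Components are instantiated reflectively on the first lookup and then cached.
class LazyComponentRegistry {
    private final Map<String, Map<String, String>> classNames = new HashMap<>();
    private final Map<String, Object> instances = new HashMap<>();

    void register(Class<?> extensible, String componentName, String className) {
        classNames.computeIfAbsent(extensible.getName(), k -> new HashMap<>()).put(componentName, className);
    }

    <T> T get(Class<T> extensible, String componentName) {
        Map<String, String> byName = classNames.getOrDefault(extensible.getName(), Map.of());
        String className = byName.get(componentName);
        if (className == null) {
            throw new IllegalArgumentException("unknown component: " + componentName);
        }
        String key = extensible.getName() + "/" + componentName;
        Object instance = instances.get(key);
        if (instance == null) {
            try {
                // The class is resolved by name only at this point, not at registration time.
                instance = Class.forName(className).getDeclaredConstructor().newInstance();
            } catch (ReflectiveOperationException e) {
                throw new IllegalStateException("cannot create " + className, e);
            }
            instances.put(key, instance);
        }
        return extensible.cast(instance);
    }
}
```

In this sketch registration only stores strings, so a server could populate the registry from a cache file without touching plugin classes until an index actually asks for the component.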

Phase 1 - allows implementing a plugin with analysis settings

  1. Introduce @Inject and @AnalysisSettings. @Inject should be used by the plugin developer to mark the constructor of an analysis component into which settings will be injected. @AnalysisSettings should be used by the plugin developer to mark an interface with getter methods for analysis settings. Upon analysis component creation, an implementation should be generated for the annotated settings interface. Ideally we should use ASM for good performance (a dynamic proxy is another option, but might perform poorly). Because the plugin works against an interface, the plugin code should be more readable. Example from PoC https://github.com/elastic/elasticsearch/compare/main...pgomulka::annotations?expand=1#diff-b7343d12bb8b9a99d1b06deedfe600755ab3e4bae45be6c2b6d634d851eae771R37
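
To make the dynamic proxy option concrete, here is a minimal sketch using java.lang.reflect.Proxy. The @AnalysisSettings annotation below is a locally declared stand-in, DemoFilterSettings and SettingsProxyFactory are hypothetical, and the rule "setting key equals getter name" is an assumption made for this illustration only. Settings are modelled as a plain Map because the server Settings class is not available to a stable plugin.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.Map;

// Locally declared stand-in for the @AnalysisSettings annotation (illustration only).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface AnalysisSettings {}

// Hypothetical plugin-side settings interface: one getter per setting.
@AnalysisSettings
interface DemoFilterSettings {
    int maxTokenLength();

    String replacement();
}

class SettingsProxyFactory {
    // Creates an implementation of the settings interface backed by raw key/value settings.
    // Assumption for this sketch: the setting key is the getter name.
    static <T> T create(Class<T> settingsInterface, Map<String, String> rawSettings) {
        if (!settingsInterface.isAnnotationPresent(AnalysisSettings.class)) {
            throw new IllegalArgumentException(settingsInterface.getName() + " is not annotated with @AnalysisSettings");
        }
        // Object methods (toString, equals, hashCode) are not handled in this sketch.
        InvocationHandler handler = (proxy, method, args) -> {
            String raw = rawSettings.get(method.getName());
            if (raw == null) {
                throw new IllegalArgumentException("missing setting: " + method.getName());
            }
            Class<?> type = method.getReturnType();
            if (type == int.class) return Integer.parseInt(raw);
            if (type == long.class) return Long.parseLong(raw);
            if (type == double.class) return Double.parseDouble(raw);
            if (type == String.class) return raw;
            throw new UnsupportedOperationException("unsupported setting type: " + type.getName());
        };
        Object instance = Proxy.newProxyInstance(
            settingsInterface.getClassLoader(),
            new Class<?>[] { settingsInterface },
            handler
        );
        return settingsInterface.cast(instance);
    }
}
```

Every getter call goes through reflection in the handler, which is the performance concern that makes generated bytecode attractive as an alternative.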

Phase 3. [ ] An existing plugin migration

Phase x. additional features and nice to have

Validation of plugin's setting interface

  • a setting's name clashing with a node (index, cluster?) setting
  • validation upon settings creation: type validation? range validation? This will introduce the Validator interface
@pgomulka pgomulka added >enhancement Meta needs:triage Requires assignment of a team area label :Core/Infra/Plugins Plugin API and infrastructure and removed needs:triage Requires assignment of a team area label labels Aug 1, 2022
@elasticsearchmachine
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Aug 1, 2022
pgomulka added a commit that referenced this issue Aug 30, 2022
This commit adds stable analysis plugin API with analysis components interfaces and annotations.
It does not contain any usage of it yet. Separate changes to introduce example plugins or refactoring to existing ones will follow later.

It contains two gradle modules. One plugin-api with two annotations Nameable and NamedComponent, which can be reused for plugins other than analysis.
And second analysis-plugin-api which contains analysis components (TokenFilterFactory, CharFilterFactory etc)

NamedComponent - used by plugin developer - indicates that a Nameable component will be registered under a given name.
Nameable - for analysis plugins it is only used by the stable analysis api designers (ES) - indicates that a component has a name and should be declared with NamedComponent

additional tasks that will follow: #88980
pgomulka added a commit that referenced this issue Sep 5, 2022
the convention for packages and module names is:
org.elasticsearch.plugin[.analysis].api

module-info.java and package-info.java were using incorrect names
and not following the convention
relates #88980
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Sep 8, 2022
This commit adds an Extensible annotation intended to mark
things that can be loaded by a component loader.
It also marks Analyzer api components (AnalyzerFactory, CharFilterFactory,
TokenFilterFactory and TokenizerFactory) with this annotation.

relates elastic#88980
pgomulka added a commit that referenced this issue Sep 9, 2022
This commit adds an Extensible annotation intended to mark things that can be loaded by a component loader.
It also marks Analyzer api components (AnalyzerFactory, CharFilterFactory, TokenFilterFactory and TokenizerFactory) with this annotation.

relates #88980
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Sep 9, 2022
Stable plugins use the @Extensible and @NamedComponent annotations
to mark components to be loaded.
This commit loads extensible class names from extensibles.json and
named components from named_components.json

relates elastic#88980
pgomulka added a commit that referenced this issue Sep 13, 2022
Stable plugins use the @Extensible and @NamedComponent annotations
to mark components to be loaded.
This commit loads extensible class names from extensibles.json and
named components from named_components.json

The scanning mechanism that can generate these files will be done later in a gradle plugin/plugin installer

relates #88980
pgomulka added a commit that referenced this issue Sep 21, 2022
Stable plugins use the @Extensible and @NamedComponent annotations
to mark components to be loaded.
This commit loads extensible class names from extensibles.json and
named components from named_components.json

relates #88980
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Sep 26, 2022
New ES stable plugins when built should have a stable-plugin-descriptor.properties file
instead of plugin-descriptor.properties.
New plugins also do not use the classname property in the plugin descriptor.

relates elastic#88980
pgomulka added a commit that referenced this issue Oct 10, 2022
New ES stable plugins when built should have a stable-plugin-descriptor.properties file instead of plugin-descriptor.properties.
New plugins also do not use the classname property in the plugin descriptor.
The new build plugin will also scan classes and libraries for @NamedComponent and will create a named_components.json file. That file contains a map of Extensible interface (like TokenizerFactory) to a map of "component name" to "className".

This commit extracts common logic from PluginBuildPlugin into BasePluginBuildPlugin so that it can also be used by StableBuildPlugin.
The differences are:
classname - used in the old plugin, but not in the new one (stable)
the plugin descriptor file name - the new one is stable-plugin-descriptor.properties
dependencies - the new plugin does not need elasticsearch as a dependency.
We might want to consider adding a test framework dependency in the future.

relates #88980
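
For illustration, the nested map described in this commit (Extensible interface to component name to class name) could be produced along these lines. This is a reflection-based sketch over already loaded classes; the commit does not say how the real scan is implemented, and the @Extensible and @NamedComponent annotations, TokenizerFactory and DemoTokenizerFactory below are local stand-ins declared only for this example.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Local stand-ins for the plugin-api annotations (illustration only).
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface Extensible {}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface NamedComponent {
    String value();
}

// Hypothetical extensible interface and a plugin component implementing it.
@Extensible
interface TokenizerFactory {}

@NamedComponent("demo_tokenizer")
class DemoTokenizerFactory implements TokenizerFactory {}

class NamedComponentScanSketch {
    // Returns: extensible interface name -> component name -> implementing class name.
    // Only directly implemented interfaces are inspected in this sketch.
    static Map<String, Map<String, String>> scan(List<Class<?>> candidates) {
        Map<String, Map<String, String>> result = new TreeMap<>();
        for (Class<?> candidate : candidates) {
            NamedComponent named = candidate.getAnnotation(NamedComponent.class);
            if (named == null) {
                continue; // not a named component
            }
            for (Class<?> iface : candidate.getInterfaces()) {
                if (iface.isAnnotationPresent(Extensible.class)) {
                    result.computeIfAbsent(iface.getName(), k -> new TreeMap<>())
                        .put(named.value(), candidate.getName());
                }
            }
        }
        return result;
    }
}
```

Serializing the returned map as JSON would give a file of the shape the commit describes for named_components.json.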
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 12, 2022
…plugin descriptor

Stable plugins do not use className, extendedPlugins and hasNativeController properties.
They also don't necessarily have moduleName.
These properties should not be present in stable-plugin-descriptor, even if they are empty,
because the parsing will fail complaining about unused properties.

Old plugins can use these properties, but they use defaults when the property is not present.
These defaults are:
has.native.controller = false
extended.plugins = emptyList
moduleName = null
classname is required, and an exception is thrown when an old plugin does not configure this property

relates elastic#88980
pgomulka added a commit that referenced this issue Oct 12, 2022
…plugin descriptor (#90835)

Stable plugins do not use className, extendedPlugins and hasNativeController properties. They also don't necessarily have moduleName.
These properties should not be present in stable-plugin-descriptor, even if they are empty because the parsing will fail complaining about unused properties.

Old plugins can use these properties, but they use defaults when the property is not present. These defaults are:
has.native.controller = false
extended.plugins = emptyList
moduleName = null
classname is required, and an exception is thrown when an old plugin does not configure this property

relates #88980
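
The "fail on unused properties" behaviour described in this commit can be sketched with java.util.Properties. The allowed property names below (name, description, version) and the class StableDescriptorSketch are assumptions for this illustration and are not taken from the real descriptor parser.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;
import java.util.Set;
import java.util.TreeSet;

class StableDescriptorSketch {
    // Assumed set of properties a stable descriptor may contain (illustration only).
    private static final Set<String> ALLOWED = Set.of("name", "description", "version");

    // Parses descriptor content and rejects any property outside the allowed set,
    // including legacy ones such as classname, even when their value is empty.
    static Properties parse(String content) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(content));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        Set<String> unknown = new TreeSet<>(props.stringPropertyNames());
        unknown.removeAll(ALLOWED);
        if (!unknown.isEmpty()) {
            throw new IllegalArgumentException("Unknown properties for stable plugin: " + unknown);
        }
        return props;
    }
}
```

With this shape, a descriptor that still carries a legacy line such as `classname=` is rejected outright instead of being silently ignored.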
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Oct 13, 2022
Stable plugins do not extend Plugin class and are not instantiated in PluginService.
Therefore a simple StablePluginPlaceHolder has to be used in order to make pluginService
aware of the plugin and its descriptor. It is needed to return information about loaded
stable plugins in cluster state.

relates elastic#88980
pgomulka added a commit that referenced this issue Oct 14, 2022
Stable plugins do not extend Plugin class and are not instantiated in PluginService.
Therefore a simple StablePluginPlaceHolder has to be used in order to make pluginService
aware of the plugin and its descriptor. It is needed to return information about loaded
stable plugins in cluster state.

relates #88980
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Nov 4, 2022
To allow the @NamedComponent("name") syntax, the java annotation should
be a single value annotation

relates elastic#88980
pgomulka added a commit that referenced this issue Nov 4, 2022
To allow the @NamedComponent("name") syntax, the java annotation should be a single value annotation

relates #88980
pgomulka added a commit that referenced this issue Dec 13, 2022
Stable plugins do not have a dependency on server, and therefore cannot access the Settings, NodeSettings or IndexSettings classes. Plugins implementing the new stable plugin api will use a set of annotations to mark an interface that works as a facade for the settings used by the plugin.
This will allow validating the values provided against the restrictions defined in the plugin's settings interface.

This commit introduces a set of annotations in libs/plugin-api that allow annotating an interface in plugins that will later be injected into a plugin instance. These annotations could also be used by plugins other than analysis plugins in the future.
The implementation of the interface generated in server uses the dynamic proxy mechanism.

relates #88980
pgomulka added a commit that referenced this issue Jan 9, 2023
New stable plugins require a generated named_components.json file which contains all analysis components implemented by the plugin. The generation is currently done in build-tools by elasticsearch.stable-esplugin.
However, this makes the generation available only for plugins using gradle. Plugin developers using maven or other build tooling will not be able to use it.

This commit refactors the scanning logic into libs:plugin-scanner, which will allow the plugin install command to perform the scanning too.

relates #88980
pgomulka added a commit that referenced this issue Jan 11, 2023
New stable plugins example with injected settings.
The plugin developer creates an interface and annotates it with @AnalysisSettings.
The constructor in a plugin component (annotated with @NamedComponent) has to be annotated with @Inject.
Upon plugin component creation an implementation of the interface will be created - a dynamic proxy - which will delegate methods from the interface to properties in a Settings instance.
This PR introduces an example of using all currently supported types: int, long, double, string and list (of strings).
relates #88980
danielmitterdorfer pushed a commit to danielmitterdorfer/elasticsearch that referenced this issue Jan 12, 2023
New stable plugins example with injected settings.
The plugin developer creates an interface and annotates it with @AnalysisSettings.
The constructor in a plugin component (annotated with @NamedComponent) has to be annotated with @Inject.
Upon plugin component creation an implementation of the interface will be created - a dynamic proxy - which will delegate methods from the interface to properties in a Settings instance.
This PR introduces an example of using all currently supported types: int, long, double, string and list (of strings).
relates elastic#88980
pgomulka added a commit to pgomulka/elasticsearch that referenced this issue Jan 13, 2023
New stable plugins require a generated named_components.json file which contains all analysis components implemented by the plugin. The generation is currently done in build-tools by elasticsearch.stable-esplugin.
However, this makes the generation available only for plugins using gradle. Plugin developers using maven or other build tooling will not be able to use it.

This commit refactors the scanning logic into libs:plugin-scanner, which will allow the plugin install command to perform the scanning too.

relates elastic#88980
pgomulka added a commit that referenced this issue Jan 14, 2023
New stable plugins require a generated named_components.json file which contains all analysis components implemented by the plugin. The generation is currently done in build-tools by elasticsearch.stable-esplugin. However, this makes the generation available only for plugins using gradle. Plugin developers using maven or other build tooling will not be able to use it.

This commit refactors the scanning logic into libs:plugin-scanner, which will allow the plugin install command to perform the scanning too.

relates #88980
backports #92437
pgomulka added a commit that referenced this issue Jan 18, 2023
Stable plugins not built with ES's gradle plugin will not have a named_components.json file.
To allow these plugins to expose their named components, a scan can be performed upon install.

relates #88980
nik9000 referenced this issue May 11, 2023
SourceLookup mixes up several concerns - lazy loading, map access to scripts,
different access providers - and duplicates logic (such as that choosing how to
apply filtering) that is better handled directly in the Source interface.

This commit removes SourceLookup entirely and replaces it with a new
SourceProvider interface, with a simple stored fields reader implementation.
SearchLookup implements this interface directly, and the fetch phase uses
a custom implementation to provide its separately loaded source to fetch-time
scripts.