-
Notifications
You must be signed in to change notification settings - Fork 0
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
init: PluginFactory support #1
Conversation
Some plugins operate by wrapping one or more separately-maintained plugins, and do so by instantiating them directly. This can be problematic because the pipeline's execution context is not propagated to the inner plugin, which can cause features that rely on execution context such as pipeline settings to not work as expected. This support adapter introduces a PluginFactory, that can be used to instantiate inner plugins that are pre-contextualized with the outer plugin's execution context. By including this support adapter into a plugin, we ensure that the plugin has a `LogStash::Plugin#plugin_factory`, which can be used to load contextualized proxies for plugin classes, which in turn can be instantiated in the normal way, replacing the direct instantiation of the plugin class: ~~~ diff def register grok_args = {} - @inner = ::LogStash::Filters::Grok.new(grok_args) + @inner = plugin_factory.filter('grok').new(grok_args) end ~~~
When instantiating an inner plugin without an explicit id, ensure that a meaningful distinct id is used that includes the parent plugin's id and context about the inner plugin's nature.
INIT_MUTEX = Mutex.new | ||
private_constant :INIT_MUTEX | ||
|
||
## | ||
# @return [PluginFactory] | ||
def plugin_factory | ||
@_plugin_factory || INIT_MUTEX.synchronize do | ||
@_plugin_factory ||= PluginFactory.new(self) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Review note:
I'm not overly-fond of introducing a global lock here, and hadn't planned on having our PluginFactory
being permanently referenced from a plugin after being used, but I found both justified in order to provide meaningful default distinct id generation that has a unique sequence number scoped to the outer plugin.
- the cost of an outer plugin retaining a
PluginFactory
is very small, since it effectively contains only a single reference to the outer plugin itself and a singleAtomicLong
. - the scope of locking is very small (single simple object instantiation) and is avoided after an outer plugin has successfully acquired a single
PluginFactory
.
If we do not want this lock, or want to free the PluginFactory
as soon as it is used, we could use a single global sequence id generator to ensure the distinctness of ids, but the distinct number at the tail would become global and no longer related to the id of the outer plugin.
class PluginFactory
sequence_id = java.util.concurrent.atomic.AtomicLong.new(0)
SEQUENCE_GENERATOR = proc { sequence_id.increment_and_get }
private_constant :SEQUENCE_GENERATOR
# ...
private
def next_sequence_id
SEQUENCE_GENERATOR.call
end
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect plugins to define multiple inner plugins of the same type?
And if so, is it unreasonable to mandate that they provide a unique (hopefully meaningful) id to disambiguate between these inner plugins, rather than introducing/requiring a global lock to generate a sequence number?
Also, are we expecting the id
to show up anywhere other than in the debug
logs at creation? As far as I could tell, that was the only place it shows up, seeing as the inner plugin does not show up in the monitoring API..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect plugins to define multiple inner plugins of the same type?
Codecs come to mind, but I don't think we have reasonable expectations either way.
is it unreasonable to mandate that they provide a unique (hopefully meaningful) id to disambiguate between these inner plugins, rather than introducing/requiring a global lock to generate a sequence number?
I would rather do so in one testable place than require each consumer to handle their own logic. In the above comment I provided a way to have a lock-free global sequence number, which would make it very difficult for a user to use the factory incorrectly (no implicit collisions). Another way would be to have "#{outer_id}/inner-#{inner_type}-#{inner-name}@#{SecureRandom.uuid}"
.
Also, are we expecting the
id
to show up anywhere other than in thedebug
logs at creation?
Maybe? I can't remember the specifics of logging context, but I believe that the plugin id is part of the logger context (or should be?). I would hope that a plugin instantiated with an execution context would get wired up to all of the metrics at some point in the future, even if it isn't now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense.
I think I was a little surprised that there was nothing to contextualize the inner plugin with the outer plugin when supplying an id
, compared to what we see with no id
specified. Maybe some logging to state that an inner plugin is being created and to what plugin it belongs to might suffice
By avoiding mocking our outer instance's `Plugin#execution_context`, we eliminate an rspec-internal race condition that occurs when tracking calls to that mock from many threads simultaneously.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is great stuff! I have a couple of comments inline, plus a more general one - do we have any sort of plan to have inner plugins somehow visible in the monitoring API, and/or a mechanism to publish their stats?
.travis.yml
Outdated
- logstash-plugins/.ci:travis/travis.yml@1.x | ||
matrix: | ||
include: | ||
- env: ELASTIC_STACK_VERSION=7.9.3 # last release pre-introduction of core Contextualizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to the end of life matrix, 7.9.3
is no longer supported - does it make sense to add the extra complexity for an unsupported version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can simplify and require 7.17, which is the only 7.x that is not EOL.
Doing so will require users to be on 7.17 to consume fixes to any plugins that rely on this feature, but will also simplify the adapter and specs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apparently 7.13
+ is still supported through February. 🤷
I'm easy either way on this, but I tend to prefer simplification if possible
s.name = 'logstash-mixin-plugin_factory_support' | ||
s.version = '1.0.0' | ||
s.licenses = %w(Apache-2.0) | ||
s.summary = "Support for the Plugin Factory introduced in Logstash 8.3, for plugins wishing to use this API on older Logstashes" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks outdated, unless I'm missing something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. This was written when I was hoping to target a companion core-support in 8.3. Good catch. I'll reword.
INIT_MUTEX = Mutex.new | ||
private_constant :INIT_MUTEX | ||
|
||
## | ||
# @return [PluginFactory] | ||
def plugin_factory | ||
@_plugin_factory || INIT_MUTEX.synchronize do | ||
@_plugin_factory ||= PluginFactory.new(self) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we expect plugins to define multiple inner plugins of the same type?
And if so, is it unreasonable to mandate that they provide a unique (hopefully meaningful) id to disambiguate between these inner plugins, rather than introducing/requiring a global lock to generate a sequence number?
Also, are we expecting the id
to show up anywhere other than in the debug
logs at creation? As far as I could tell, that was the only place it shows up, seeing as the inner plugin does not show up in the monitoring API..
# @param params [Hash{String=>Object}] | ||
# @return [LogStash::Plugin] | ||
def new(params={}) | ||
params_with_id = params.include?('id') ? params : params.merge('id' => generate_inner_id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to scope this provided id underneath the outer id? Or provide an option to do so?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would like to avoid changing the semantics of explicitly-given parameters, and right now this PluginProxy<type.downcase,name.downcase>#new
behaves identically to its counterpart LogStash::#{type}s::#{name}::new
method for all explicit parameters.
I consider providing an implicit useful default to be non-breaking of expectations.
We could require an explicit id
parameter, but that would push the complexity to calling code where it is likely to be implemented in different ways by different callers, typically with less contextual help than we can offer by generating a default in a consistent way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This now logs a bread-crumb linking the inner plugin id with its outer plugin id, at debug-level because this really is an implementation detail of the outer plugin and I would like to avoid flooding info
-level since excessive info
-level noise tends to distract people who are digging into unrelated problems.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Release notes
Introduced PluginFactory support adapter, to allow plugins to safely instantiate inner plugins while maintaining execution context.
What does this PR do?
Some plugins operate by wrapping one or more separately-maintained plugins,
and do so by instantiating them directly. This can be problematic because
the pipeline's execution context is not propagated to the inner plugin, which
can cause features that rely on execution context such as pipeline settings
to not work as expected.
This support adapter introduces a
PluginFactory
, that can be used toinstantiate inner plugins that are pre-contextualized with the outer plugin's
execution context.
By including this support adapter into a plugin, we ensure that the plugin
has a
LogStash::Plugin#plugin_factory
, which can be used to loadcontextualized proxies for plugin classes, which in turn can be instantiated
in the normal way, replacing the direct instantiation of the plugin class:
As with our other support adapters, this reaches into Logstash internals to
effectively back-port functionality from new Logstash core to plugins when
they are run on older Logstashes, and is expected to be API-stable with the
feature as-implemented in Logstash core.
LogStash::Plugin
,LogStash::Plugin#plugin_factory
returns aPluginFactory
, which supportsPluginFactory#input(input_name)
->PluginClassProxy#new(params={})
-> input instancePluginFactory#output(output_name)
->PluginClassProxy#new(params={})
-> output instancePluginFactory#codec(codec_name)
->PluginClassProxy#new(params={})
-> codec instancePluginFactory#filter(filter_name)
->PluginClassProxy#new(params={})
-> filter instanceWhy is it important/What is the impact to the user?
Inner plugins need access to the effective execution context in order to properly respect settings like
pipeline.ecs_compatibility
.Checklist
How to test this PR locally
A separate PR will be made against the HTTP Input plugin, which embeds codecs, so that they can properly respect the pipeline-level settings with regard to ECS compatibility
Related issues
Related: elastic/logstash#13978
Closes
Closes #2