Skip to content

Conversation

@15ljindal
Copy link
Contributor

@15ljindal 15ljindal commented Feb 14, 2020

Adds config option "proxy.config.plugin.dynamic_reload_mode" which can take values 0 (disable) or 1 (enable). The default value will be 0.
This will be applicable only for remap plugins (global plugins do not yet support dynamic reload).
The assumption is that if dynamic reload is disabled, it will be disabled for all plugins specified in remap.config.
This is applicable for ATS 9 (branch 9.0.x)

@15ljindal 15ljindal changed the title Issue #6400 - Add config option to enable/disable dynamic reload feature Issue #6400 - Add config option to enable/disable dynamic plugin reload feature Feb 14, 2020
@sudheerv sudheerv added this to the 10.0.0 milestone Feb 14, 2020
@vmamidi
Copy link
Contributor

vmamidi commented Feb 14, 2020

This can't be overridden?

@15ljindal
Copy link
Contributor Author

15ljindal commented Feb 15, 2020

This can't be overridden?

Sorry, but I am not sure what exactly you are asking here. We have this config parameter which will simply turn off this feature so that we have old-style loading of all plugins mentioned in remap.config.

@vmamidi
Copy link
Contributor

vmamidi commented Feb 15, 2020

This can't be overridden?

Sorry, but I am not sure what exactly you are asking here. We have this config parameter which will simply turn off this feature so that we have old-style loading of all plugins mentioned in remap.config.

Can it be changed to turn on or off per remap rule?

@gtenev
Copy link
Contributor

gtenev commented Feb 15, 2020

@15ljindal, thank you for working on this PR! Here is the requested review.

To achieve plugin isolation ATSv9 opens with dlopen()a temporary DSO file copy. Hence adding an on-off-switch to skip the copying is obviously the right direction.

I wonder if you would agree to change your implementation after considering the following points.

Reduce the scope of the option?

If we implement the on-off-switch in PluginDso we prevent practically the isolation/reloading of all plugin types regardless. What we actually need here is a on-off-switch only for global and remap plugins (for compatibility).

ATSv9 plugin loading infrastructure that enabled the “plugin reload” feature was also meant to unify all DSO loading mechanisms in ATS. One of the design goals was to allow eventually to move global plugins over and use the same infrastructure to (re)load other non-global and non-remap plugins types, one example could be @jrushford's parent selection plugins.

Rename configuration option?

Instead of thinking in terms of switching on and off the “plugin reload” maybe we could think more in terms of switching on and off the isolation between the plugins of different types and versions. The isolation is just prerequisite for "plugin reload", not the same thing.

May be disable_global_remap_plugin_isolation or something like that?
Better names welcome? (naming is one of the 3 challenges in computer science).

isPluginDynamicReloadEnabled out of PluginDso?

My personal preference would be to move the functionality of isPluginDynamicReloadEnabled() out of PluginDso because it introduces unnecessary dependency in PluginDso on the ATS configuration infrastructure (i.e. think about writing a unit test for the new option, future refactoring, etc).

Design / implementation idea

There is probably no single best way but may be we could do something like the following.

# In plugin factory user code (caller)
bool plugin_reload_enabled =  CallerClassOrNameSpace::isPluginDynamicReloadEnabled();
RemapPluginInst *pi = PluginFactory::getRemapPlugin(…, plugin_reload_enabled); 

PluginFactory::getRemapPlugin() based on plugin_reload_enabled could then set the plugin effective and runtime paths accordingly (same or different)

PluginDso::load() could skip copying of the DSO file if plugin effective and runtime paths are the same (which would disable practically the isolation bringing back the old behavior).

In such a way the plugin loading design / implementation would not prevent us from

  • isolating / reloading other types of plugins.
  • going with any of the option types - “compile-time”, “run-time”, “overridable” / “non-overridable”, on or off by default (whatever we decide we want to do here)

Unit tests and documentation

Please make sure the existing "plugin reload" behavior still works / unit tests still pass and also add unit tests testing the isolation effect of the new option. The new plugin loading infrastructure has a pretty decent coverage (IMHO), let's keep it that way!

Thank you for adding config option documentation! It will be great to add a paragraph about the isolation problem between global and remap plugins in reloading-plugins.en.html design document (something that we missed to explained initially).

Copy link
Contributor

@gtenev gtenev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix/update the unit-tests and also see my comments on the PR about reducing the scope, alternative implementations.

@sudheerv
Copy link
Contributor

sudheerv commented Feb 16, 2020

@gtenev @vmamidi - We need something quick to unblock us that will preserve the old behavior. We have no plans of using dynamic plugin reload in the near future and Lokesh’s current patch allows us to unblock our ATS9 migration effort and move us to have it in production sooner where I’m sure we will see other issues that will need troubleshooting.

I agree with Vijay’s suggestion on making this overridable but given the time crunch we are in, this probably is a decent start. We can always extend it further in the future. Unless there are any fundamental issues you see with the patch, would be great if you can review it from the perspective of KISS at this point. We have already applied this in our internal ats9 and are moving forward with this patch at this point.

Oh, I do agree we definitely need to fix the tests and I think it’s only a matter of choosing the correct default (ON) for the new config, which sounds reasonable.

@15ljindal pls feel free to correct me if I mis-stated anything :)

std::string error;
void *s = nullptr;
CHECK(getSymbol("pluginDsoVersionTest", s, error));
int (*version)() = reinterpret_cast<int (*)()>(s);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a great place for auto.

@SolidWallOfCode
Copy link
Member

Definitely impressed by the testing code. I need to spend some more time reviewing that.

@gtenev
Copy link
Contributor

gtenev commented Feb 17, 2020

@sudheerv @15ljindal although I used a long explanation to show where I am coming from, I don't believe the idea I gave you about an alternative implementation is any more complicated or more time consuming than what you proposed with this PR.

What I proposed is of the same complexity and is just less limiting (reduces the scope of the option impact just to global/remap plugins). It will also allow you to do what @vmamidi was asking for - "overridable" runtime option and still allow us to disable the "plugin reload" feature for you (whichever we choose).

IMHO we need to be a little bit more forward looking than this.

@15ljindal
Copy link
Contributor Author

Please fix/update the unit-tests and also see my comments on the PR about reducing the scope, alternative implementations.

Thanks a lot for the feedback @gtenev .

Scope: The scope of this change is to make sure we achieve the legacy behavior of plugin loading and keep it as simple as possible (as @sudheerv mentioned). From the user point of view, the behavior would be - "Once user has started the traffic server process, user would not want to do any kind re-loading of a plugin. Once a plugin has been loaded (be it part of plugin.config or remap.config), user would not want to reload the same plugin Dso again." This is what the behavior was prior to "dynamic reload" feature and having that behavior is the goal of this PR.
If we want to do anything which adds more capabilities (e.g. per remap line option to enable/disable the dynamic reload) then they can be par of future work as @sudheerv mentioned.

Option name: In line with the above mentioned goal I feel names like "proxy.config.plugin.dynamic_reload_mode" or "proxy.config.plugin.dynamic_reload_enabled" are simpler and thus more suitable. This option would be non-reloadable and I am leaning towards using a global variable as @SolidWallOfCode suggested in the implementation. By default, the "dynamic reload" feature would be ENABLED.

Implementation: I see your point. As you suggested, one of the ideas I considered (and implemented first) was to just pass the bool "dynamicReloadEnabled" as part of function calls (like PluginFactory::getRemapPlugin(..., bool dynamicReloadEnabled) or RemapPluginInfo::load(..., bool dynamicReloadEnabled) or wherever needed. However, that does not scale well.

One of the most critical requirements to restore legacy behavior is that we do not end up deleting any user specified Dso file. Function PluginDso::clean(...) which deletes the Dso file should effectively be a NOP when dynamic reload is disabled. This function can be called on multiple code paths (from PluginDso::unload(...) or PluginDso::~PluginDso() etc. As such the most suitable way to make PluginDso::clean(...) a NOP is to either have a class member bool _dynamicReloadEnabled as in the current version of PR, or use a global variable which I can create. Same can be said about function PluginFactory::clean(...)

Testing: Already existing tests in the folder unit-tests are very comprehensive. I have made sure all those tests pass with this PR and they are practically untouched - the only change is passing the variable "bool dynamicReloadEnabled" as TRUE. So I am not exactly sure what your concern is regarding existing tests. Please let me know if I mis-read something.
Additionally, I have added some tests using the existing infra to test the legacy behavior(s) this PR is targeting.

Documentation: I am going to add more documentation to reloading-plugins.en.html to explain this change in more detail. Thanks for pointing that out.

Please let me know your thoughts so that I can work on updating this PR and get this change in at the earliest.

@15ljindal 15ljindal force-pushed the plugin_dynamic_reload_config_option branch from 83d825a to 0a6d2fe Compare February 21, 2020 06:12
@15ljindal
Copy link
Contributor Author

Please fix/update the unit-tests and also see my comments on the PR about reducing the scope, alternative implementations.

I have updated the pull request based on our discussions here. Please review.

Copy link
Contributor

@gtenev gtenev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@15ljindal: my initial comment about tests: your initial PR version seemed to have flipped the default behavior (dynamic reload = false, which is now true) and also autest tests were failing. I was simply asking to make sure the existing tests are still valid since they were built with the assumption of the "plugin isolation always on" (I used TDD and all but still blind spots are possible).

Thank you for adding tests testing the presence and lack of isolation between "mixed" and "non-mixed" mode and updating existing tests accordingly (please see my comments in the code).

I think we should also test / add unit tests to validate the behavior related to TSRemapDone, TSRemapDeleteInstance, etc with the new option (you can find examples of similar tests in the code)

}

// this is a simple class to simulate loading of Plugin Dso as done during loading of global plugins
class GlobalPluginInfo
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we could improve code reuse here - extend PluginDso and override load(). This way there would be no need to have yet another implementation of getSymbol(), you could still expose the protected member _dlh through dlOpenHandle() only for the scope of this unit-test and use the getPluginVersion() trick.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea is to keep it obvious that this class mimics loading of global plugin and is thus completely independent of all the infrastructure which is part of dynamic plugin (re)loading / remap plugins. Current implementation is simple and clear, and I would prefer to keep it that way.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please see my other comment below

*/

void *
PluginDso::dlOpenHandle() const
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To increase encapsulation _dlh was not exposed on purpose to prevent users from accessing it directly (hiding the implementation details). I still made it protected so when we need to test the behavior of PluginDso instances we could inherit from it and expose internals but only for the scope of the unit-test (you have probably already noticed this pattern in the unit-tests)

Copy link
Contributor Author

@15ljindal 15ljindal Feb 24, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had the same thing in mind originally, but here is my thought process.

First, I feel it is a reasonable assumption to assume that the dso will be opened using dlopen and thus asking for its handle seems reasonable as well.

I looked at how we can use the suggestion of having a class derived from PluginDso where this member can be accessed in unit-tests (file test_PluginFactory.cc), and there does not seem to be an easy way to do that without writing too much bolier-plate code for a small change. I will try to explain here.
There are calls like RemapPluginInst *plugin_v2 = factory2->getRemapPlugin(configName, 0, nullptr, error2, isPluginDynamicReloadEnabled()); which return an object of type RemapPluginInst which holds a pointer to an object of type RemapPluginInfo. In order to get get access to _dlh without this PluginDso::dlOpenHandle() function, I think I would have to define class RemapPluginInfoTest : public RemapPluginInfo and then define copy constructors in class RemapPluginInfoTest, RemapPluginInfo and PluginDso, so that I can do something like
RemapPluginInst *plugin_v2 = factory2->getRemapPlugin(configName, 0, nullptr, error2, isPluginDynamicReloadEnabled());
RemapPluginInfoTest test_object(plugin_v2->_plugin);
CHECK(test_object._dlh == ...)

In short, above seems like an over-kill to me and thus, having this function PluginDso::dlOpenHandle() seems fine to me. Please let me know if I missed something or if you have any further thoughts.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure why my response does not show up here. Please see - #6421 (comment)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, I feel it is a reasonable assumption to assume that the dso will be opened using dlopen and thus asking for its handle seems reasonable as well.

Encapsulation can be an obstacle to simplicity but it has its benefits - think in terms of preventing the PluginDso user from calling dlclose directly, also about possibility to implement internal PluginDso housekeeping

In short, above seems like an over-kill

I think it could be expected to have little bit more "cumbersome" unit-test code in the name of better module design. IMHO it does not mean that this is not worth pursuing, it rather means that we have not come up with a better way to do it.

Clean and simple

True, access modifiers, design patterns often go in the opposite direction to simplicity (similarly security vs usability) but it is believed that they could help long term. Also sometimes it just seem more complex than it actually is (i.e. removing the coupling in PluginDso did not really make the code more complex)

I would prefer to keep it that way.

Fair enough, I appreciate the thorough unit-tests related to the plugin isolation! I think we are in a good shape here - coverage is good and we can catch regressions in future which is the final goal here.

Copy link
Contributor

@gtenev gtenev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Disabling isolation in PluginDso by using a global variable would introduce unnecessary coupling - disabling the isolation between global/remap plugins would disable the isolation between all possible (future) plugin types.

It seems feasible through “abstraction” (separation of concerns) to separate the behavior from the implementation in such way so that it limits the scope of the option to the particular use-case and still keep the internal implementation generic enough to enable the isolation for all other use cases (basically allowing to run loadable and non-loadable plugins side-by-side).

It also seems likely to achieve more modular and generic design if we keep PluginDso as decoupled and cohesive (reusable) as possible, anticipating and not precluding future changes, hopefully handle well new (unexpected) requirements.

This review describes an implementation idea to enable/disable DSO isolation per each getRemapPlugin() call. We had a pretty good out-of-band slack discussion with @15ljindal and we seems to agree that it is worth trying out. We also discussed LinkedIn’s use case as well as various possible other use-cases that could benefit from it.

Copy link
Member

@SolidWallOfCode SolidWallOfCode left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I read through the changes again and I don't see anything that @gtenev hasn't already brought up. I'm good with this if Gancho is.

gtenev
gtenev previously approved these changes Feb 28, 2020
Copy link
Contributor

@gtenev gtenev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@15ljindal, thank you for working on this PR and also agreeing to remove the initial PluginDso coupling!

This change (remap/global plugin isolation switch option) seems a good first step. Assuming it was tested and now meets LinkedIn's need at the moment, I think we can keep building / improving on top of it as we discover more while going in this direction.

…ad feature

for plugins
=> proxy.config.plugin.dynamic_reload_mode - 1 (default) enables the
dynamic reload feature, 0 disables it
=> Adds to and refactors unit-tests for the dynamic plugin reload feature
@15ljindal 15ljindal force-pushed the plugin_dynamic_reload_config_option branch from 6cd8a21 to 5f90643 Compare February 28, 2020 19:58
Copy link
Contributor

@sudheerv sudheerv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @15ljindal !

I'll merge this once the CI completes.

@apache apache deleted a comment from sudheerv Mar 1, 2020
@sudheerv
Copy link
Contributor

sudheerv commented Mar 1, 2020

@15ljindal Can you take a look at the clang failure?

@randall
Copy link
Contributor

randall commented Mar 3, 2020

[approve ci clang-analyzer]

@randall
Copy link
Contributor

randall commented Mar 3, 2020

clang-analyzer issue was fixed by #6470.

@sudheerv sudheerv merged commit 889e0fa into apache:master Mar 3, 2020
@zwoop zwoop added the OnDocs This is for PR currently running, or will run, on the Docs ATS server label Mar 5, 2020
@zwoop zwoop changed the title Issue #6400 - Add config option to enable/disable dynamic plugin reload feature Add config option to enable/disable dynamic plugin reload feature Mar 5, 2020
@zwoop
Copy link
Contributor

zwoop commented Mar 6, 2020

Tested on docs.
Cherry-picked to v9.0.x branch.

@zwoop zwoop removed the OnDocs This is for PR currently running, or will run, on the Docs ATS server label Mar 6, 2020
@zwoop zwoop modified the milestones: 10.0.0, 9.0.0 Mar 6, 2020
@zwoop zwoop changed the title Add config option to enable/disable dynamic plugin reload feature Add config option to enable/disable dynamic plugin reload feature May 14, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Dynamic loading breaks behavior for a plugin who is both remap and global.

7 participants